Research output: Publications in books, reports, collections, conference proceedings › chapter/section › research › peer-reviewed
Extracting Software Requirements from Unstructured Documents. / Ivanov, Vladimir; Sadovykh, Andrey; Naumchev, Alexandr et al.
Extracting Software Requirements from Unstructured Documents. Springer, 2022. pp. 17-29 2 (Recent Trends in Analysis of Images, Social Networks and Texts; Vol. 1573).
TY - CHAP
T1 - Extracting Software Requirements from Unstructured Documents
AU - Ivanov, Vladimir
AU - Sadovykh, Andrey
AU - Naumchev, Alexandr
AU - Bagnato, Alessandra
AU - Yakovlev, Kirill
PY - 2022/8/30
Y1 - 2022/8/30
AB - Identifying or extracting requirements in textual documents is a tedious and error-prone task that many researchers suggest automating. We manually annotated the PURE dataset and thus created a new one containing both requirements and non-requirements. Using this dataset, we fine-tuned the BERT model and compared the results with several baselines such as fastText and ELMo. To evaluate the model on semantically more complex documents, we compared the PURE dataset results with experiments on Request For Information (RFI) documents. RFIs often include software requirements, but in a less standardized way. The fine-tuned BERT showed promising results on the PURE dataset on the binary sentence classification task. Compared with previous and recent studies dealing with constrained inputs, our approach demonstrates high performance in terms of precision and recall, while remaining agnostic to the unstructured textual input.
KW - BERT
KW - ELMo
KW - FastText
KW - Requirements elicitation
KW - Sentence classification
KW - Software requirements
UR - https://www.mendeley.com/catalogue/e7bd97ea-a8bd-3d98-b281-aeee3efbb657/
U2 - 10.1007/978-3-031-15168-2_2
DO - 10.1007/978-3-031-15168-2_2
M3 - Chapter
SN - 978-3-031-15167-5
T3 - Recent Trends in Analysis of Images, Social Networks and Texts
SP - 17
EP - 29
BT - Extracting Software Requirements from Unstructured Documents
PB - Springer
ER -
ID: 65524499