Named Entity Recognition in the Inspection and Detection Field Based on the BERT Model

Su Zhanpeng*; Li Yang*; Zhang Tingting**; Rang Ran**; Zhang Longbo**; Cai Hongzhen*; Xing Linlin**

Article Abstract

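Su Zhanpeng*, Li Yang*, Zhang Tingting**, Rang Ran**, Zhang Longbo**, Cai Hongzhen*, Xing Linlin**. Named entity recognition in inspection and detection field based on BERT model [J]. 高技术通讯 (Chinese High Technology Letters, Chinese edition), 2022, 32(7): 749-755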

Named entity recognition in inspection and detection field based on BERT model

DOI: 10.3772/j.issn.1002-0470.2022.07.009

Keywords: named entity recognition; bidirectional encoder representations from transformers (BERT); inspection and detection field; deep learning; bidirectional gated recurrent unit (BIGRU)

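Author	Affiliation
Su Zhanpeng*	(School of Agricultural Engineering and Food Science, Shandong University of Technology, Zibo 255000) (*School of Computer Science and Technology, Shandong University of Technology, Zibo 255000)
Li Yang*	(School of Agricultural Engineering and Food Science, Shandong University of Technology, Zibo 255000) (*School of Computer Science and Technology, Shandong University of Technology, Zibo 255000)
Zhang Tingting**	(School of Agricultural Engineering and Food Science, Shandong University of Technology, Zibo 255000) (*School of Computer Science and Technology, Shandong University of Technology, Zibo 255000)
Rang Ran**	(School of Agricultural Engineering and Food Science, Shandong University of Technology, Zibo 255000) (*School of Computer Science and Technology, Shandong University of Technology, Zibo 255000)
Zhang Longbo**	(School of Agricultural Engineering and Food Science, Shandong University of Technology, Zibo 255000) (*School of Computer Science and Technology, Shandong University of Technology, Zibo 255000)
Cai Hongzhen*	(School of Agricultural Engineering and Food Science, Shandong University of Technology, Zibo 255000) (*School of Computer Science and Technology, Shandong University of Technology, Zibo 255000)
Xing Linlin**	(School of Agricultural Engineering and Food Science, Shandong University of Technology, Zibo 255000) (*School of Computer Science and Technology, Shandong University of Technology, Zibo 255000)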

Abstract:

To address the scarcity of entity corpora, severe entity nesting, and the large number of heterogeneous entity types in the inspection and detection field, this paper proposes a named entity recognition method that combines a bidirectional encoder representations from transformers (BERT) pre-trained language model, a bidirectional gated recurrent unit (BIGRU) lightweight bidirectional encoder, and a conditional random field (CRF). The BERT-BIGRU-CRF (BGC) model first uses the pre-trained BERT model to produce word vectors that incorporate contextual semantics, then encodes them bidirectionally in the BIGRU layer, and finally outputs the optimal label sequence computed by the CRF layer. The model is trained on an inspection and detection field dataset containing four types of named entities: inspection organizations, inspection items, inspection standards, and inspection instruments. The results show that the accuracy, recall, and F1 value of the BGC model are all better than those of the comparison models without BERT. In addition, compared with the BERT-BILSTM-CRF model, the BGC model shortens training time by 6%.
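The final step described in the abstract, where the CRF layer selects the best label sequence, can be illustrated with a minimal Viterbi decoding sketch. This is not the authors' implementation. The tag set (`O`, `B-STD`, `I-STD` for a hypothetical "inspection standard" entity), the hand-set transition scores, and the toy emission scores are all assumptions made for illustration. In a trained model the emission scores would come from the BIGRU layer and the transition scores would be learned.

```python
import math

# Hypothetical BIO tag set for a single entity type (inspection standard).
TAGS = ["O", "B-STD", "I-STD"]
NEG_INF = -math.inf


def build_transitions(tags):
    """Toy transition scores: I-X may only follow B-X or I-X (assumed, not learned)."""
    trans = {}
    for prev in tags:
        for cur in tags:
            score = 0.0
            if cur.startswith("I-"):
                kind = cur[2:]
                if prev not in (f"B-{kind}", f"I-{kind}"):
                    score = NEG_INF  # forbid invalid BIO transitions
            trans[(prev, cur)] = score
    return trans


def viterbi(emissions, tags, trans):
    """Return the highest-scoring tag path for a list of per-token score dicts."""
    # An I-X tag cannot open a sequence.
    best = {t: (NEG_INF if t.startswith("I-") else emissions[0][t]) for t in tags}
    backptrs = []
    for scores in emissions[1:]:
        new_best, ptr = {}, {}
        for cur in tags:
            # Best previous tag for reaching `cur` at this position.
            prev = max(tags, key=lambda p: best[p] + trans[(p, cur)])
            new_best[cur] = best[prev] + trans[(prev, cur)] + scores[cur]
            ptr[cur] = prev
        best = new_best
        backptrs.append(ptr)
    # Follow back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[t])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Toy emission scores for three tokens (stand-ins for BIGRU outputs).
emissions = [
    {"O": 2.0, "B-STD": 1.0, "I-STD": 0.0},
    {"O": 0.5, "B-STD": 0.4, "I-STD": 1.0},
    {"O": 0.2, "B-STD": 0.1, "I-STD": 1.0},
]
trans = build_transitions(TAGS)

# Per-token argmax ignores transitions and can yield an invalid sequence.
greedy = [max(TAGS, key=lambda t: s[t]) for s in emissions]
decoded = viterbi(emissions, TAGS, trans)
print("greedy :", greedy)
print("viterbi:", decoded)
```

In this toy case the per-token argmax produces `I-STD` directly after `O`, which is not a valid BIO sequence, while the Viterbi path respects the transition constraints. That is the practical reason for placing a CRF layer on top of the encoder instead of classifying each token independently.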
