Article abstract
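Shen Hanji* **, Fu Xiang***, Li Jun*. Large sample log anomaly detection optimization method based on logistic regression supervised learning [J]. High Technology Letters (Chinese edition) [高技术通讯], 2022, 32(8): 789-800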
Large sample log anomaly detection optimization method based on logistic regression supervised learning
DOI:10.3772/j.issn.1002-0470.2022.08.002
Keywords: supervised learning; large sample; log processing; anomaly detection
Authors and affiliations
Shen Hanji (申罕骥)* **, Fu Xiang (付翔)***, Li Jun (李俊)*
(* Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190)
(** University of Chinese Academy of Sciences, Beijing 100049)
(*** Beijing Blue Sky Frontier Science and Technology Innovation Center [北京蓝天前沿科技创新中心], Beijing 100085)
Abstract:
      Traditional log-based anomaly detection relies on manual analysis. It suits systems with small data volumes, but for large and complex log systems its detection efficiency is often too low to meet requirements. With the development of machine learning, detection methods have changed fundamentally, and detection efficiency and performance have improved substantially. However, for a given log system there is still no unified, mature model covering the choice of log preprocessing method and machine learning algorithm, particularly for the extraction of log templates and features, so metrics such as detection accuracy and performance vary widely from one combination to another. This paper proposes an optimization method for large-sample log anomaly detection based on supervised learning: the dataset is parsed to obtain accurate log templates, the log sequences are vectorized, a logistic regression supervised learning algorithm is used for classification training and testing, and several test metrics are combined to select the best parameters and obtain the optimal model. Experimental results show that the model obtained with this method achieves comparatively good detection results.
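The pipeline summarized in the abstract (parse logs into templates, vectorize log sequences, train a logistic regression classifier, choose parameters by a test metric) can be illustrated with a minimal sketch. This is not the authors' implementation. The regex masking in `to_template` is a crude stand-in for a real log parser, the template-count vectors, the hand-written gradient descent, the use of F1 as the selection metric, and the L2 strength as the tuned parameter are all assumptions, and the log lines are invented toy data.

```python
import math
import re
from collections import Counter

def to_template(line):
    """Crude template extraction (assumption): mask hex ids and numbers."""
    line = re.sub(r"0x[0-9a-fA-F]+", "<*>", line)
    return re.sub(r"\d+", "<*>", line)

def vectorize(sessions, vocab):
    """Turn each session (a list of log lines) into a template-count vector."""
    vectors = []
    for lines in sessions:
        counts = Counter(to_template(l) for l in lines)
        vectors.append([float(counts[t]) for t in vocab])
    return vectors

def sigmoid(z):
    # Numerically stable for large positive and negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logreg(X, y, l2=0.0, lr=0.1, epochs=500):
    """Batch gradient descent on the L2-regularized logistic loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [l2 * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                gw[j] += err * xj / n
            gb += err / n
        w = [wj - lr * g for wj, g in zip(w, gw)]
        b -= lr * gb
    return w, b

def predict(model, X, threshold=0.5):
    w, b = model
    return [int(sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= threshold)
            for xi in X]

def f1_score(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Invented toy sessions; label 1 marks an anomalous session.
train_sessions = [
    ["Receiving block 1 size 64", "Verification succeeded for block 1"],
    ["Receiving block 2 size 64", "Verification succeeded for block 2"],
    ["Receiving block 3 size 32", "Exception writing block 3 to mirror 7"],
    ["Receiving block 4 size 16", "Exception writing block 4 to mirror 9"],
]
train_labels = [0, 0, 1, 1]
val_sessions = [
    ["Receiving block 5 size 64", "Verification succeeded for block 5"],
    ["Receiving block 6 size 8", "Exception writing block 6 to mirror 2"],
]
val_labels = [0, 1]

vocab = sorted({to_template(l) for s in train_sessions for l in s})
train_X, val_X = vectorize(train_sessions, vocab), vectorize(val_sessions, vocab)

# Parameter selection: keep the L2 strength with the best validation F1.
best = None
for l2 in (0.0, 0.01, 0.1):
    model = train_logreg(train_X, train_labels, l2=l2)
    score = f1_score(val_labels, predict(model, val_X))
    if best is None or score > best[0]:
        best = (score, l2, model)
best_f1, best_l2, best_model = best
print("selected l2:", best_l2, "validation F1:", best_f1)
```

On a real dataset the masking step would be replaced by a proper log parser and the selection loop would typically compare several metrics (for example precision, recall, and F1) rather than a single score.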