Research on edge big data analysis and predictive modeling method

Zhong Yunqin (钟运琴)* **; Zhu Yueqin (朱月琴)*** ****; Jiao Shoutao (焦守涛)*** ****

Article abstract

Zhong Yunqin, Zhu Yueqin, Jiao Shoutao. Research on edge big data analysis and predictive modeling method [J]. 高技术通讯 (Chinese edition), 2022, 32(10): 1067-1075

Research on edge big data analysis and predictive modeling method

DOI: 10.3772/j.issn.1002-0470.2022.10.008

Keywords: edge computing; big data analysis; edge big data; edge machine learning; edge-cloud collaboration

Funding:

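Author	Affiliation
Zhong Yunqin (钟运琴)* **	(Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Beijing 100190) (Information Center, Development Research Center of the State Council, Beijing 100010) (Development Research Center of China Geological Survey, Beijing 100037) (**Technology Innovation Center of Geological Information Engineering, Ministry of Natural Resources, Beijing 100037)
Zhu Yueqin (朱月琴)* **	(Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Beijing 100190) (Information Center, Development Research Center of the State Council, Beijing 100010) (Development Research Center of China Geological Survey, Beijing 100037) (**Technology Innovation Center of Geological Information Engineering, Ministry of Natural Resources, Beijing 100037)
Jiao Shoutao (焦守涛)* **	(Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Beijing 100190) (Information Center, Development Research Center of the State Council, Beijing 100010) (Development Research Center of China Geological Survey, Beijing 100037) (**Technology Innovation Center of Geological Information Engineering, Ministry of Natural Resources, Beijing 100037)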

Abstract views: 3057

Full-text downloads: 2120

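Abstract (translated from the Chinese):

As the real-time requirements of Internet of Things big data analysis rise, centrally controlled cloud-side big data analysis can no longer meet the demands for timeliness and accuracy: response latency is high, costs are high, and prediction accuracy is low in specific environments. This paper proposes an edge-side big data analysis and predictive modeling method for scenarios with massive real-time data, such as sensor data and streaming data. The method trains on small data samples at the edge side and, depending on the application scenario, connects multiple edge sides for distributed model learning, so that model training and inference are carried out in a divide-and-conquer manner. First, by combining big data analysis with edge computing, a theoretical paradigm framework for edge-cloud collaborative big data analysis and predictive modeling is proposed. Second, on the basis of this standard paradigm framework, a training algorithm and tuning mechanism for edge-side big data analysis and prediction are designed. Finally, a prototype training and evaluation system for edge-side big data analysis is implemented. Experimental results in a hundred-node test environment show that, in real-time big data scenarios and compared with cloud-side training, the proposed edge-side big data training improves performance efficiency by an average of 3.95 times and reduces network traffic by 88.7%; the prediction accuracy, recall, and F1 score of the collaboratively trained edge-side model improve by 3% to 9% over traditional training methods, and the response latency of prediction requests falls by 67.5%. The method can be applied effectively to scientific computing, smart finance, autonomous driving, security monitoring, data security, smart factories, smart cities, and similar fields, and has some reference value.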
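The evaluation figures quoted in the abstract (accuracy, recall, F1, percentage reductions, and a multiple of improvement) can be made concrete with a small sketch. The abstract does not define how these quantities were measured, so the standard textbook definitions are assumed here, and every function name and every number in the example is hypothetical rather than taken from the paper.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Standard confusion-matrix metrics (assumed definitions, not the paper's)."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which `improved` is smaller than `baseline`."""
    return (baseline - improved) / baseline


def speedup(baseline_time: float, improved_time: float) -> float:
    """How many times faster the improved run is than the baseline run."""
    return baseline_time / improved_time


# Hypothetical counts and timings, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=80, fp=20, fn=20, tn=80)
latency_cut = relative_reduction(baseline=200.0, improved=65.0)  # milliseconds
training_speedup = speedup(baseline_time=79.0, improved_time=20.0)  # seconds

print(metrics)
print(f"latency reduced by {latency_cut:.1%}, training {training_speedup:.2f}x faster")
```

Under these assumed definitions, a 67.5% latency reduction means the edge-side latency is 32.5% of the cloud-side latency; whether the paper's 3.95 figure is a ratio of running times, as `speedup` computes, is not stated in the abstract.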

English abstract:

As the real-time requirements for big data analysis in the Internet of Things increase, centrally controlled cloud big data analysis can no longer meet real-time and accuracy requirements because of its high response latency, high cost, and low prediction accuracy in specific environments. This paper proposes an edge-side big data analysis and predictive modeling method for scenarios with massive real-time data, such as sensor data and streaming data. The method trains on small data samples at the edge side and, according to the specific application scenario, connects multiple edge sides for distributed model learning, training models and performing inference in a divide-and-conquer manner. First, by combining big data analysis and edge computing, a theoretical paradigm framework for edge-cloud collaborative big data analysis and predictive modeling is proposed. Second, a training algorithm and tuning mechanism for edge-side big data analysis and prediction are designed. Finally, a prototype training and evaluation system for edge-side big data analysis is implemented. Experimental results in a test environment with hundreds of nodes show that, in real-time big data scenarios and compared with cloud training, the performance efficiency of the proposed edge-side big data training increases by an average of 3.95 times and network traffic is reduced by 88.7%. The prediction accuracy, recall, and F1 score of the collaboratively trained model improve by 3%-9% compared with traditional training methods, and the response latency of prediction requests is reduced by 67.5%. The method can be applied effectively to scientific computing, smart finance, autonomous driving, security monitoring, data security, smart factories, smart cities, and other fields.
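The divide-and-conquer idea described above (train on small local samples at each edge side, then collaborate with the cloud) can be illustrated with a minimal sketch. The abstract does not specify the training algorithm or how edge models are combined, so this example substitutes a simple stand-in: each edge node fits a one-variable linear model by ordinary least squares, and the cloud takes a sample-weighted average of the parameters. All names (`EdgeUpdate`, `train_on_edge`, `aggregate_in_cloud`) and all data are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Sample = Tuple[float, float]  # (feature x, target y)


@dataclass
class EdgeUpdate:
    """What an edge node sends to the cloud: parameters only, no raw data."""
    slope: float
    intercept: float
    n_samples: int


def train_on_edge(samples: List[Sample]) -> EdgeUpdate:
    """Fit y = slope * x + intercept by ordinary least squares on local data."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return EdgeUpdate(slope, mean_y - slope * mean_x, n)


def aggregate_in_cloud(updates: List[EdgeUpdate]) -> Tuple[float, float]:
    """Combine edge models by a sample-weighted average of their parameters."""
    total = sum(u.n_samples for u in updates)
    slope = sum(u.slope * u.n_samples for u in updates) / total
    intercept = sum(u.intercept * u.n_samples for u in updates) / total
    return slope, intercept


# Hypothetical sensor readings held by three edge nodes (all follow y = 2x + 1).
edge_data: List[List[Sample]] = [
    [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
    [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)],
    [(1.0, 3.0), (4.0, 9.0)],
]

updates = [train_on_edge(local) for local in edge_data]
global_slope, global_intercept = aggregate_in_cloud(updates)

# Values crossing the network: every raw (x, y) pair versus three numbers per node.
values_sent_raw = sum(2 * len(local) for local in edge_data)
values_sent_params = 3 * len(updates)

print(global_slope, global_intercept, values_sent_raw, values_sent_params)
```

In this toy setup the amount sent to the cloud depends only on the number of nodes, not on how much data each node holds, which is one plausible source of the kind of traffic reduction the abstract reports. The paper's actual algorithm, tuning mechanism, and measurements may differ substantially from this sketch.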
