A standard benchmark suite for evaluating the performance of graph neural network accelerators

Song Xinkai* ** ***; Zhi Tian* ***; Kong Weihao* ** ***; Du Zidong* ***

Article abstract

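Song Xinkai* ** ***, Zhi Tian* ***, Kong Weihao* ** ***, Du Zidong* ***. A standard benchmark suite for evaluating the performance of graph neural network accelerators [J]. 高技术通讯 (Chinese edition), 2022, 32(7): 663-673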


Benchmarking graph neural network accelerators

DOI: 10.3772/j.issn.1002-0470.2022.07.001

Chinese keywords: 图神经网络 (graph neural network, GNN); 加速器 (accelerator); 标准测试集 (benchmark suite)

English keywords: graph neural network (GNN), accelerator, benchmark

Funding:

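Author	Affiliation
Song Xinkai* ** ***	(* State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190) (** University of Chinese Academy of Sciences, Beijing 100049) (*** Cambricon Technologies Corporation Limited, Beijing 100191)
Zhi Tian* ***	(* State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190) (** University of Chinese Academy of Sciences, Beijing 100049) (*** Cambricon Technologies Corporation Limited, Beijing 100191)
Kong Weihao* ** ***	(* State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190) (** University of Chinese Academy of Sciences, Beijing 100049) (*** Cambricon Technologies Corporation Limited, Beijing 100191)
Du Zidong* ***	(* State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190) (** University of Chinese Academy of Sciences, Beijing 100049) (*** Cambricon Technologies Corporation Limited, Beijing 100191)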


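Abstract (translated from the Chinese):

Graph neural network (GNN) algorithms have achieved breakthrough success in tasks that process graph-structured data. However, research on hardware accelerator design for GNNs lacks clear design goals and a unified evaluation standard. This paper proposes a standard benchmark suite, BenchGNN, for evaluating the performance of GNN hardware accelerators. BenchGNN consists of two parts, a macro-benchmark and a micro-benchmark. The macro-benchmark contains GNN algorithms for three main task types and datasets from five typical application domains. The micro-benchmark contains two types of micro-operations and four datasets with different quantitative characteristics. BenchGNN was tested experimentally on existing computing devices: a central processing unit (CPU), a graphics processing unit (GPU), and a dedicated GNN accelerator. The experimental results show that the CPU cannot process GNN algorithms efficiently because of its low parallelism. The dedicated accelerator, which is optimized for the random memory access behavior of GNN algorithms, achieves better performance and power results than the GPU, a general-purpose parallel processor. According to the BenchGNN evaluation results, two factors deserve particular attention in GNN accelerator design: computational parallelism and random memory access optimization.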

English abstract:

Graph neural networks (GNNs) have achieved breakthroughs in processing graph-structured data. However, research on GNN accelerator design lacks a clear design objective and unified evaluation methods. This paper proposes BenchGNN, a benchmark for evaluating the performance of GNN accelerators. BenchGNN consists of a macro-benchmark and a micro-benchmark. The macro-benchmark comprises GNN algorithms for three task types and datasets from five application fields. The micro-benchmark comprises two basic GNN micro-operations and four graph datasets with different scale characteristics. BenchGNN is evaluated experimentally on a modern central processing unit (CPU), a graphics processing unit (GPU), and a GNN accelerator. The experimental results show that the CPU cannot process GNNs efficiently because it lacks parallel processing units. The specifically designed accelerator achieves better performance and lower energy consumption than the GPU because its design optimizes the random memory accesses of GNN workloads. The results suggest that GNN accelerator designs should take into account both a high degree of processor parallelism and the ability to perform random memory accesses.
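To make the random memory access point concrete, the following is a minimal sketch of one GNN layer in plain Python. It is an illustration only: the abstract does not name BenchGNN's two micro-operations, so the split into an aggregation step and a combination step used here is an assumption based on a common way of describing GNN layers, and the function names `aggregate` and `combine`, the CSR arrays, and the toy data are all hypothetical.

```python
# Hypothetical illustration: the abstract does not name BenchGNN's
# micro-operations. This assumes a common aggregation/combination split.

def aggregate(row_ptr, col_idx, features):
    """Sum each vertex's neighbor features (graph stored in CSR form)."""
    dim = len(features[0])
    out = []
    for v in range(len(row_ptr) - 1):
        acc = [0.0] * dim
        # Neighbor ids depend on the graph data, so features[u] is an
        # irregular (random) memory access rather than a sequential scan.
        for u in col_idx[row_ptr[v]:row_ptr[v + 1]]:
            for k in range(dim):
                acc[k] += features[u][k]
        out.append(acc)
    return out


def combine(aggregated, weight):
    """Dense per-vertex transform: aggregated @ weight, followed by ReLU."""
    out_dim = len(weight[0])
    return [
        [max(0.0, sum(row[k] * weight[k][j] for k in range(len(row))))
         for j in range(out_dim)]
        for row in aggregated
    ]


# Toy graph (assumed data): vertex 0 -> {1, 2}, vertex 1 -> {2}, vertex 2 -> {0}
row_ptr = [0, 2, 3, 4]
col_idx = [1, 2, 2, 0]
features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
weight = [[1.0, -1.0], [0.5, 0.5]]

hidden = combine(aggregate(row_ptr, col_idx, features), weight)
```

In this sketch the combination step is regular, dense arithmetic that benefits from parallel compute units, while the aggregation step gathers rows at data-dependent addresses. Under the stated assumption, these correspond to the two design factors the abstract highlights.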
