TY - JOUR
T1 - DRAGON
T2 - Dynamic Recurrent Accelerator for Graph Online Convolution
AU - Hung, José Romero
AU - Li, Chao
AU - Wang, Taolei
AU - Guo, Jinyang
AU - Wang, Pengyu
AU - Shao, Chuanming
AU - Wang, Jing
AU - Shi, Guoyong
AU - Liu, Xiangwen
AU - Wu, Hanqing
N1 - Publisher Copyright:
© 2023 Association for Computing Machinery.
PY - 2023/1/20
Y1 - 2023/1/20
N2 - Despite the extraordinary application potential that dynamic graph inference may entail, its practical, physical implementation has been a topic seldom explored in the literature. Although graph inference through neural networks has received plenty of algorithmic innovation, its transfer to the physical world has not seen similar development. This is understandable, since the most prominent Euclidean acceleration techniques from CNNs have little bearing on the non-Euclidean nature of relational graphs. Instead of coping with the challenges that arise from forcing naturally sparse structures into more inflexible stochastic arrangements, in DRAGON we embrace this characteristic to promote acceleration. Inspired by high-performance computing approaches such as Parallel Multi-moth Flame Optimization for Link Prediction (PMFO-LP), we propose and implement a novel, efficient architecture capable of producing speed-up and performance similar to the baseline at a fraction of its hardware requirements and power consumption. We leverage the latent parallelism of our previously developed static graph convolutional processor, ACE-GCN, and expand it with RNN structures, allowing the deployment of a multi-processing network referenced around a common pool of proximity-based centroids. Experimental results demonstrate outstanding acceleration. Compared with the fastest CPU-based software implementation available in the literature, DRAGON achieves a speed-up of roughly 191×. Under the largest configuration and dataset, DRAGON also overtakes the more power-hungry PMFO-LP by almost 1.59× in speed and by around 89.59% in power efficiency. Beyond raw acceleration, we demonstrate the unique functional qualities of our approach as a flexible and fault-tolerant solution, making it an interesting alternative for a wide range of application scenarios.
AB - Despite the extraordinary application potential that dynamic graph inference may entail, its practical, physical implementation has been a topic seldom explored in the literature. Although graph inference through neural networks has received plenty of algorithmic innovation, its transfer to the physical world has not seen similar development. This is understandable, since the most prominent Euclidean acceleration techniques from CNNs have little bearing on the non-Euclidean nature of relational graphs. Instead of coping with the challenges that arise from forcing naturally sparse structures into more inflexible stochastic arrangements, in DRAGON we embrace this characteristic to promote acceleration. Inspired by high-performance computing approaches such as Parallel Multi-moth Flame Optimization for Link Prediction (PMFO-LP), we propose and implement a novel, efficient architecture capable of producing speed-up and performance similar to the baseline at a fraction of its hardware requirements and power consumption. We leverage the latent parallelism of our previously developed static graph convolutional processor, ACE-GCN, and expand it with RNN structures, allowing the deployment of a multi-processing network referenced around a common pool of proximity-based centroids. Experimental results demonstrate outstanding acceleration. Compared with the fastest CPU-based software implementation available in the literature, DRAGON achieves a speed-up of roughly 191×. Under the largest configuration and dataset, DRAGON also overtakes the more power-hungry PMFO-LP by almost 1.59× in speed and by around 89.59% in power efficiency. Beyond raw acceleration, we demonstrate the unique functional qualities of our approach as a flexible and fault-tolerant solution, making it an interesting alternative for a wide range of application scenarios.
KW - Convolutional neural networks
KW - HW accelerator
KW - dynamic graphs
KW - embedded systems
UR - http://www.scopus.com/inward/record.url?scp=85147258409&partnerID=8YFLogxK
U2 - 10.1145/3524124
DO - 10.1145/3524124
M3 - Article
AN - SCOPUS:85147258409
SN - 1084-4309
VL - 28
JO - ACM Transactions on Design Automation of Electronic Systems
JF - ACM Transactions on Design Automation of Electronic Systems
IS - 1
M1 - 3524124
ER -