TY - JOUR
T1 - EEGUnity
T2 - Open-Source Tool in Facilitating Unified EEG Datasets Toward Large-Scale EEG Model
AU - Qin, Chengxuan
AU - Yang, Rui
AU - You, Wenlong
AU - Chen, Zhige
AU - Zhu, Longsheng
AU - Huang, Mengjie
AU - Wang, Zidong
N1 - Publisher Copyright: © 2001-2011 IEEE.
PY - 2025
Y1 - 2025
AB - The growing number of dispersed electroencephalogram (EEG) dataset publications and the advancement of large-scale EEG models have increased the demand for practical tools to manage diverse EEG datasets. However, the inherent complexity of EEG data, characterized by variability in content data, metadata, and data formats, poses challenges for integrating multiple datasets and conducting large-scale EEG model research. To address these challenges, this paper introduces EEGUnity, an open-source tool comprising four modules: "EEG Parser", "Correction", "Batch Processing", and "Large Language Model Boost". With these modules, EEGUnity supports the efficient management of multiple EEG datasets through functions such as intelligent data structure inference, data cleaning, and data unification. These capabilities ensure high data quality and consistency, providing a reliable foundation for large-scale EEG data research. EEGUnity is evaluated on 25 EEG datasets from different sources, and several typical batch processing workflows are presented. The results demonstrate the high performance and flexibility of EEGUnity in parsing and data processing.
KW - Brain-computer interface
KW - electroencephalogram data integration
KW - large-scale model
KW - open-source software
UR - http://www.scopus.com/inward/record.url?scp=105004185971&partnerID=8YFLogxK
U2 - 10.1109/TNSRE.2025.3565158
DO - 10.1109/TNSRE.2025.3565158
M3 - Article
C2 - 40293886
AN - SCOPUS:105004185971
SN - 1534-4320
VL - 33
SP - 1653
EP - 1663
JO - IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING
JF - IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING
ER -