TY - GEN
T1 - Pitch Class and Octave-Based Pitch Embedding Training Strategies for Symbolic Music Generation
AU - Li, Yuqiang
AU - Li, Shengchen
AU - Fazekas, György
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
PY - 2026
Y1 - 2026
AB - This paper presents two strategies to prevent the learned pitch representation from worsening so as to improve pitch and pitch class distributions in symbolic music generation. The first strategy is to switch the input pitch representation from the flat MIDI number representation to a hierarchical representation consisting of pitch class (chroma) and octave, which forces musically similar pitches to share part of the embedding vectors. The second strategy freezes the pitch embeddings during training according to the proposed evaluation metric of the pitch embedding space, maintaining the robustness of the embedding obtained in the first strategy. The experiments show that, when both strategies were applied to training an auto-regressive neural network for melody generation, the generated samples exhibited significant improvement in pitch class entropy (from 19% to 34% overlapping with the test dataset), and a modest but still significant improvement on pitch entropy (from 24% to 28%).
KW - music feature representation
KW - pitch representation
KW - symbolic music generation
UR - https://www.scopus.com/pages/publications/105020009315
U2 - 10.1007/978-3-032-02042-0_11
DO - 10.1007/978-3-032-02042-0_11
M3 - Conference Proceeding
AN - SCOPUS:105020009315
SN - 9783032020413
T3 - Lecture Notes in Computer Science
SP - 149
EP - 167
BT - Music and Sound Generation in the AI Era - 16th International Symposium, CMMR 2023, Revised Selected Papers
A2 - Ystad, Sølvi
A2 - Kronland-Martinet, Richard
A2 - Aramaki, Mitsuko
A2 - Kitahara, Tetsuro
A2 - Hirata, Keiji
PB - Springer Science and Business Media Deutschland GmbH
T2 - 16th International Symposium on Computer Music Multidisciplinary Research, CMMR 2023
Y2 - 13 November 2023 through 17 November 2023
ER -