Timbre Style Transfer for Musical Instruments Acoustic Guitar and Piano using the Generator-Discriminator Model
References
L. Gatys, A. Ecker, and M. Bethge, “A Neural Algorithm of Artistic Style,” J. Vis., vol. 16, no. 12, p. 326, Sep. 2016.
G. Brunner, Y. Wang, R. Wattenhofer, and S. Zhao, “Symbolic Music Genre Transfer with CycleGAN,” in 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI), IEEE, Nov. 2018, pp. 786–793.
C.-Y. Lu, M.-X. Xue, C.-C. Chang, C.-R. Lee, and L. Su, “Play as You Like: Timbre-Enhanced Multi-modal Music Style Transfer,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2019, pp. 1061–1068.
H.-W. Dong, W.-Y. Hsiao, L.-C. Yang, and Y.-H. Yang, “MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment,” Proc. AAAI Conf. Artif. Intell., vol. 32, no. 1, Apr. 2018.
L.-C. Yang, S.-Y. Chou, and Y.-H. Yang, “MidiNet: A Convolutional Generative Adversarial Network for Symbolic-Domain Music Generation,” arXiv preprint arXiv:1703.10847, 2017.
Z. Ding, X. Liu, G. Zhong, and D. Wang, “SteelyGAN: Semantic Unsupervised Symbolic Music Genre Transfer,” 2022, pp. 305–317.
G. Brunner, A. Konrad, Y. Wang, and R. Wattenhofer, “MIDI-VAE: Modeling Dynamics and Instrumentation of Music with Applications to Style Transfer.” Sep. 20, 2018.
O. Cifka, A. Ozerov, U. Simsekli, and G. Richard, “Self-Supervised VQ-VAE for One-Shot Music Style Transfer,” in ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, Jun. 2021, pp. 96–100.
S.-L. Wu and Y.-H. Yang, “MuseMorphose: Full-Song and Fine-Grained Piano Music Style Transfer with One Transformer VAE.” May 09, 2021.
O. Cifka, U. Simsekli, and G. Richard, “Groove2Groove: One-Shot Music Style Transfer With Supervision From Synthetic Data,” IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 28, pp. 2638–2650, 2020.
Y.-N. Hung, I.-T. Chiang, Y.-A. Chen, and Y.-H. Yang, “Musical Composition Style Transfer via Disentangled Timbre Representations.” May 30, 2019.
S. Deepaisarn, S. Chokphantavee, S. Chokphantavee, P. Prathipasen, S. Buaruk, and V. Sornlertlamvanich, “NLP-based music processing for composer classification,” Sci. Rep., vol. 13, no. 1, p. 13228, Aug. 2023.
X. Xue and Z. Jia, “The Piano-Assisted Teaching System Based on an Artificial Intelligent Wireless Network,” Wirel. Commun. Mob. Comput., vol. 2022, pp. 1–9, Jan. 2022.
P. J. Donnelly and V. Ebert, “Transcription of audio to midi using deep learning,” in 2022 7th International Conference on Frontiers of Signal Processing (ICFSP), IEEE, 2022, pp. 130–135.
I. Goodfellow et al., “Generative adversarial networks,” Commun. ACM, vol. 63, no. 11, pp. 139–144, Oct. 2020.
B. Di Giorgi, M. Levy, and R. Sharp, “Mel Spectrogram Inversion with Stable Pitch.” Aug. 26, 2022.
S. Hochreiter and J. Schmidhuber, “Long Short-Term Memory,” Neural Comput., vol. 9, no. 8, pp. 1735–1780, Nov. 1997.
A. Vaswani et al., “Attention Is All You Need,” arXiv preprint arXiv:1706.03762, Jun. 12, 2017.
Z. Guo, J. Kang, and D. Herremans, “A domain-knowledge-inspired music embedding space and a novel attention mechanism for symbolic music modelling,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2023, pp. 5070–5077.
R. Hadsell, S. Chopra, and Y. LeCun, “Dimensionality Reduction by Learning an Invariant Mapping,” in 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2 (CVPR’06), IEEE, pp. 1735–1742.
D. Yao et al., “Contrastive Learning with Positive-Negative Frame Mask for Music Representation,” in Proceedings of the ACM Web Conference 2022, New York, NY, USA: ACM, Apr. 2022, pp. 2906–2915.
I. Manco, E. Benetos, E. Quinton, and G. Fazekas, “Contrastive Audio-Language Learning for Music.” Aug. 25, 2022.
J. Koo, M. A. Martínez-Ramírez, W.-H. Liao, S. Uhlich, K. Lee, and Y. Mitsufuji, “Music Mixing Style Transfer: A Contrastive Learning Approach to Disentangle Audio Effects.” Nov. 03, 2022.
A. W. Rix, J. G. Beerends, M. P. Hollier, and A. P. Hekstra, “Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs,” in 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221), IEEE, pp. 749–752.
J. Zhao and G. Xia, “AccoMontage: Accompaniment Arrangement via Phrase Selection and Style Transfer.” Aug. 25, 2021.
K. Radzikowski, L. Wang, O. Yoshie, and R. Nowak, “Accent modification for speech recognition of non-native speakers using neural style transfer,” EURASIP J. Audio, Speech, Music Process., vol. 2021, no. 1, p. 11, Dec. 2021.
S. Yuan, P. Cheng, R. Zhang, W. Hao, Z. Gan, and L. Carin, “Improving Zero-shot Voice Style Transfer via Disentangled Representation Learning.” Mar. 16, 2021.
Y. Zhang et al., “StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2024, pp. 19597–19605.
M. Koutsogiannaki, S. M. Dowall, and I. Agiomyrgiannakis, “Gender-ambiguous voice generation through feminine speaking style transfer in male voices.” Mar. 12, 2024.
M. Pasini, “MelGAN-VC: Voice Conversion and Audio Style Transfer on arbitrarily long samples using Spectrograms.” Oct. 08, 2019.
DOI: http://dx.doi.org/10.17977/um018v7i12024p101-116
Copyright (c) 2024 Knowledge Engineering and Data Science
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.