Singing Voice Identification and Lyrics Transcription for Music Information Retrieval (2013)
AUTHORS:
Mesaros, Annamaria
BOOKTITLE:
Proceedings, 7th Conference on Speech Technology and Human-Computer Dialogue (SpeD2013)
PAGES:
10
INTERNALPDF:
internalpdf/mesaros_sped_2013.pdf
@inproceedings{mesaros_sped2013,
  author = "Mesaros, Annamaria",
  juforank = "NA",
  isbn = "978-1-4799-1065-6",
  language = "eng",
  title = "Singing Voice Identification and Lyrics Transcription for Music Information Retrieval",
  country = "Romania",
  abstract = "This paper presents an overview of methods and applications dealing with analysis of singing voice audio signals, related to singer identity and lyrics content of the singing. Singer identification in polyphonic music is based on general audio classification methods. The presence of instruments is detrimental to voice identification performance, and eliminating the effect of instrumental accompaniment is an important aspect of the problem. The results show that classification of singing voices can be done robustly in polyphonic music when using source separation. Lyrics transcription is approached as a speech recognition problem, with specific elements for dealing with singing voice. The variability of phonation in singing poses a significant challenge to the speech recognition approach. The word recognition accuracy of the lyrics transcription from singing is quite low, but it is shown to be useful in a query-by-singing application, for performing a textual search based on the words recognized from the query. A system for automatic alignment of lyrics and audio is also presented, with sufficient performance for facilitating applications such as automatic karaoke annotation or song browsing.",
  pdf = "mesaros_sped_2013.pdf",
  pages = "10",
  responsibleauthor = "Annamaria Mesaros",
  flags = "COIN",
  il = "no",
  eventdetails = "7th Conference on Speech Technology and Human-Computer Dialogue (SpeD2013), Cluj-Napoca, Romania, October 16-19, 2013",
  year = "2013",
  unitcode = "T306-50, T405-50",
  kay = "NA",
  impactfactor = "A4",
  booktitle = "Proceedings, 7th Conference on Speech Technology and Human-Computer Dialogue (SpeD2013)"
}