Discovering multi-level structure in music
Musical structure analysis broadly refers to the task of automatically inferring one or more temporal segmentations of a piece of recorded music, given only the audio signal as input. Applications of music structure analysis range from commercial (e.g., automatically extracting preview excerpts) to musicological (e.g., inferring form, or excerpting solos by a known performer) and creative (e.g., automatic remixing). While structure analysis has long been studied within the music information retrieval community, it has commonly been cast as a simplified binary classification (boundary/non-boundary) or clustering problem, which discards a great deal of domain-specific knowledge in both the design and evaluation of algorithms.
In this talk, I will present an overview of a series of projects investigating practical and theoretical aspects of the music structure analysis task, including: graph-theoretic methods for recovering structure, multi-modal feature fusion, representing and comparing hierarchical and multi-level structure, quantifying and coping with inter-annotator disagreement, and inferring latent multi-level structure from flat annotations.
Brian McFee is an Assistant Professor of Music Technology and Data Science at New York University. He received the B.S. degree (2003) in Computer Science from the University of California, Santa Cruz, and the M.S. (2008) and Ph.D. (2012) degrees in Computer Science and Engineering from the University of California, San Diego. His work lies at the intersection of machine learning and audio analysis. He is an active open-source software developer, and the principal maintainer of the librosa package for audio analysis.