This repository was archived by the owner on Apr 25, 2019. It is now read-only.

jonasmargraf/msc-thesis


Self-Organizing Maps for Sound Corpus Organization

This is the repository for the master's thesis I wrote as a graduate student with the Audio Communication Group (Fachgebiet Audiokommunikation und -technologie) at Technische Universität Berlin. My advisors for this thesis were Prof. Stefan Weinzierl (TU Berlin) and Dr. Diemo Schwarz (IRCAM, Paris). A summary of the work can be found in the abstract below.

Abstract

Large collections of audio files, known as sound corpora, have never been more readily available. Sample libraries are easily accessible online, and cheap storage media effectively eliminate concerns about storage capacity for contemporary music producers. At the same time, tools for navigating, searching and organizing these increasingly unmanageable audio file collections have not kept pace. At present, arguably the most common tool with which producers search their sample libraries is the file browser, which simply presents a list of file names in alphabetical order.

The present thesis approaches this problem from a practical perspective. We implement the Self-Organizing Map (SOM), an established machine learning algorithm for dimensionality reduction and data visualization, and apply it to sound corpus organization. We present SOM Browser, a fast, visual interface for sample library exploration. It offers an alternative to the established music production workflow by incorporating the SOM algorithm, which to our knowledge is not available in any commercial audio software. It is a standalone application that organizes a collection of sound files completely unsupervised and presents a two-dimensional map of the sounds. The map forms an interactive grid interface with which the user can audition files in rapid succession, providing a quick overview of the analyzed sound corpus. To optimize the space allotted to the map interface, we extend the SOM algorithm with a new method, which we call Forced Node Population (FNP). FNP reduces unpopulated (“empty”) areas of the map at the cost of some additional map distortion. Using a representative sample library of drum sounds, we search for a set of optimal algorithm parameters according to objective measures of map quality and produce a map for the chosen sound corpus. We then conduct a series of qualitative interviews with audio professionals to gain some understanding of the complex situation that is sample library interaction in a music production environment and to gauge initial reactions to the alternative software we developed. Participants' responses allow us to identify a prevalent method of working with sample libraries, which we codify into a generalized model of the established workflow. The results confirm the need for and interest in alternative interfaces. Although participants do not find the organization of sounds in the map interface easily comprehensible, their responses also indicate interest in our software.
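For readers unfamiliar with the algorithm, the following is a minimal sketch of a generic Self-Organizing Map in plain Python. It is not the code from this thesis or from SOM Browser: the function names (`train_som`, `best_matching_unit`, `map_items`), the linear decay schedules and all default parameter values are assumptions chosen for illustration, and the short numeric vectors stand in for audio features extracted from sound files. Forced Node Population is not included.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid node whose weight vector is closest to x (squared Euclidean)."""
    return min(weights, key=lambda n: sum((a - b) ** 2 for a, b in zip(weights[n], x)))


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM on a list of equal-length feature vectors.

    Hypothetical helper for illustration; schedules and defaults are assumptions.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    # One weight vector per grid node, initialised from randomly chosen data points.
    weights = {(r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                   # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)   # neighbourhood radius shrinks to a floor
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))  # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])    # pull node towards the sample
            step += 1
    return weights


def map_items(weights, data):
    """Assign every item to its best matching node; returns {node: [item indices]}."""
    placement = {}
    for idx, x in enumerate(data):
        placement.setdefault(best_matching_unit(weights, x), []).append(idx)
    return placement


if __name__ == "__main__":
    # Two artificial clusters standing in for two groups of similar sounds.
    cluster_a = [[0.1 * i, 0.0] for i in range(5)]
    cluster_b = [[10.0 + 0.1 * i, 10.0] for i in range(5)]
    som = train_som(cluster_a + cluster_b, rows=3, cols=3)
    for node, items in sorted(map_items(som, cluster_a + cluster_b).items()):
        print(node, items)
```

Nodes that receive no items in `map_items` correspond to the unpopulated ("empty") map areas mentioned in the abstract.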

This work thus presents a functioning proof of principle for the use of SOMs for sound corpus organization. It demonstrates that there is strong interest in such methods. Despite interview participants' criticism of details, overall feedback is positive. Further development of the presented work in close exchange with users therefore appears sensible.

About

Thesis on Self-Organizing Maps for Sound Corpus Organization
