Evan X. Merz

musician/technologist/human being

Composing with All Sound

I sometimes get asked about my dissertation, so I wanted to write a blog post to explain it. In this post, I describe the work I did for my dissertation, which brought together web APIs and a network model of creativity.

Composing with all sound

When I was in graduate school I was somewhat obsessed with the idea of composing with all sound. When I say "composing with all sound" I don't mean composing with a lot of sounds. I mean literally all sound that exists. Composing with all sounds that have ever existed is literally impossible, but the internet gives us a treasure trove of sound.

So I looked at the largest collections of sound on the internet, and the best that I found in 2011 was the website freesound.org/. Freesound is great for several reasons.

  • The library is absolutely massive and constantly growing
  • They have a free to use API
  • Users can upload tags and descriptions of the sounds
  • Freesound analyzes the sounds and gives access to those descriptors via the API

A model of creativity

Once I had a source for sounds, all I needed was a way to connect them. Neural network research was blossoming in 2011, so I tried to find a neural model that I could connect to the data on Freesound. That's when I found Melissa Schilling's network model of cognitive insight.

Schilling's theory essentially sees ideas as networks in the brain. Ideas are connected by various relationships. Ideas can look alike, sound alike, share a similar space, be described by similar words, and so on. In Schilling's model, cognitive insight, or creativity, occurs when two formerly disparate networks of ideas are connected using a bridge.

So to compose with all the sounds on Freesound, all I needed to do was to organize sounds into networks, then find new ways to connect them. But how could I organize sounds into networks, and what would a new connection look like?

The wordnik API

I realized that I could make lexical networks of sounds. The tags on sounds on Freesound give us a way to connect sounds. For instance, we could find all sounds that have the tag "scream" and form them into one network.

To make creative jumps, I had to bring in a new data source. After all, the sounds that share a tag are already connected.

That's when I incorporated The Wordnik API. Wordnik is an incredibly complete dictionary, thesaurus, and encyclopedia all wrapped into one. And best of all, they expost it using a fast and affordable API.

Composing with all sound using Freesound, a network model of creativity, and lexical relationships

So the final algorithm looks something like this, although there are many ways to vary it.

  1. Start with a search term provided by a user
  2. Build a network of sounds with that search term
  3. Use Wordnik to find related words
  4. Build a network around the related words
  5. Connect the original network to the new one
  6. Repeat

To sonify the resulting networks, I used a simple model of artificial intelligence that is sort of like a cellular automaton. I released a swarm of simple automata on the network and turned on sounds whenever the number of bots on a sound reached a critical mass.


Here are the results of my dissertation. You can read a paper I presented at a conference, and the dissertation itself. Then you can listen to the music written by this program. It's not Beethoven, but I guarantee that you will find it interesting.

Composing with All Sound Using the FreeSound and Wordnik APIs

Method for Simulating Creativity to Generate Sound Collages from Documents on the Web

Disconnected: Algorithmic music composed using all sounds and a network model of creativity

Avatar for Evan X. Merz

Evan X. Merz

Evan X. Merz holds degrees in computer science and music from The University of Rochester, Northern Illinois University, and University of California at Santa Cruz. He is a programmer, musician, and author of Empress blog software. He makes his online home at evanxmerz.com.