Sonic Gestalt
The Sonic Gestalt
Gestalt theory describes how we perceive the world: our brains tend to group many little things into one larger thing when they are somehow related to one another. Music is no different. Acoustical cues in sound work together to create a distinct perceptual event called a "Sonic Gestalt".
What is a Musical Note?
Notes are just that... notes. Instructions for a performer to interpret and turn into musical sound. A human performer takes these notes and turns them into physical gestures applied to an instrument (including the voice). Sound emanates from the instrument and travels through the air. But are these the same notes that were written down? And what happens when these "notes" flying through the air reach the eardrums of an audience member listening to the music? When their brain processes that sound, is it still a "note"?
Notes are notation, not sounds
It's very important to realize that there's a big difference between a written "note" and a performed "note". In my high school English class, when we studied things like poetry or Shakespeare, the teachers always emphasized speaking the words aloud, because there was a world of difference between the text and the performance. The same holds true for music.
When I say this, I am of course speaking to my fellow computer musicians who make use of "notes" in their work. Many tools in our music software ecosystems tend to be quite literal in their use of notes to represent musical sound. The musical structures that result are often highly discrete in nature, with each note lacking any context for what comes before or after it. While decent results can be achieved with these structures for percussive instruments like piano and drums, they produce very stiff results for lyrical instruments such as the human voice or violin.
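To make the "highly discrete" point concrete, here is a minimal sketch (in Python, with hypothetical field names) of the kind of note-event data many sequencing tools pass around. Each record stands alone, which is exactly the problem for lyrical instruments:

```python
# A sketch of a typical discrete note representation. The field names
# are hypothetical, but the shape is common: each note is an isolated
# record with no knowledge of its neighbors.
from dataclasses import dataclass

@dataclass
class NoteEvent:
    start: float     # onset time in beats
    duration: float  # length in beats
    pitch: int       # MIDI note number
    velocity: int    # loudness, 0-127

phrase = [
    NoteEvent(0.0, 1.0, 60, 90),
    NoteEvent(1.0, 1.0, 62, 85),
    NoteEvent(2.0, 2.0, 64, 100),
]
# Nothing in this list says how one note connects to the next:
# no slurs, no crescendo across the phrase, no shaping between onsets.
```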
Notes are a convenient way to structure music, and they come in handy when we want to analyze certain kinds of music. But notes by themselves are not a full representation of music. You cannot take the rules from a music theory class, throw them into some magic music box program, and expect compelling music to come out the other end. These structures are missing important elements of performance, such as interpretation and phrasing. A generative system that doesn't take these into account is going to produce dead music.
At the end of the day, the particular notes you choose to perform don't matter much; it's how you perform them that matters.
Implicit Notes with Sonic Gestalts and Gesture
Let's bring this back to Sonic Gestalts, and the musician interpreting notes and making musical sound.
A musical performance is continuous in nature. Once it starts, there are no more discrete notes, just movements with intention (gesture), and pretty sound waves propagating through the air. But the notes are still there, in some way. That's how we can do things like transcribe a Miles Davis solo from "Kind of Blue".
Music perception has often been compared to speech and language. Somehow, something in our brain is able to take in musical sound and break it down into chunks, much as we break speech up into words. It does some kind of feature extraction using distinct acoustical cues embedded in the sound (things like pitch, amplitude, timbre, etc.). Something gets registered as a "note" when a combination of these cues moves in the right way. We can say that a note is implicitly constructed from a combination of these cues. This is what I will call a Sonic Gestalt.
Sonic Gestalts add a level of detail to the musical "note" structure, and they come in handy when creating procedurally generated performances of lyrical sound. The core idea is to use Gestalts to shape a sound that can be perceived as a note. Only once that system is in place can a note abstraction emerge.
Modular synthesizers think in a Gestalt-y kind of way all the time. For example, an oscillator whose amplitude is modulated by an envelope generator will produce a distinct perceptual sound event that can be interpreted as a note. Pitch can be modulated with another signal, such as one from a sequencer. Amplitude and pitch are gestalts, distinct perceptual motions in the sound, that can work together to shape a note.
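Here is a rough sketch of that patch in code (Python/NumPy; the function names, envelope shape, and pitch sequence are my own assumptions, not any particular module). It renders a short phrase in which each perceived note is nothing but a coordinated motion of pitch and amplitude:

```python
# A sine oscillator whose amplitude is shaped by an envelope and whose
# pitch is stepped by a simple sequencer.
import numpy as np

SR = 44100  # sample rate

def envelope(dur, attack=0.01, release=0.2):
    """Linear attack/release amplitude envelope, one value per sample."""
    n = int(dur * SR)
    a = int(attack * SR)
    r = int(release * SR)
    env = np.ones(n)
    env[:a] = np.linspace(0.0, 1.0, a)
    env[n - r:] = np.linspace(1.0, 0.0, r)
    return env

def note(freq, dur):
    """One perceptual event: pitch (freq) plus amplitude motion (envelope)."""
    t = np.arange(int(dur * SR)) / SR
    return np.sin(2 * np.pi * freq * t) * envelope(dur)

# The "sequencer": a list of frequencies the pitch signal steps through.
sequence = [261.63, 293.66, 329.63, 392.00]  # C4, D4, E4, G4
audio = np.concatenate([note(f, 0.5) for f in sequence])
# Each segment is heard as a note, yet no note object exists anywhere:
# the events emerge from coordinated pitch and amplitude motion.
```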
Using many coordinated signals to shape a sound object and produce musical sonic gestalts serves as the foundation for working with gesture.
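As a small, speculative extension of the previous sketch, the fragment below hints at what gesture adds: instead of stepping between pitches, the pitch signal glides continuously from one target to the next, while a single amplitude arc shapes the whole phrase. The specific curve shapes are assumptions for illustration only.

```python
# Gesture as continuous control: a pitch glide and one amplitude arc
# shaping an entire phrase, rather than per-note events.
import numpy as np

SR = 44100

def line(start, end, dur):
    """A control signal moving linearly from start to end over dur seconds."""
    return np.linspace(start, end, int(dur * SR))

# Pitch gesture: glide C4 -> E4 -> D4 as one continuous line (in Hz).
pitch = np.concatenate([
    line(261.63, 329.63, 0.6),
    line(329.63, 293.66, 0.6),
])

# Amplitude gesture: one arc over the whole phrase, not per-note envelopes.
amp = np.concatenate([line(0.0, 1.0, 0.3), line(1.0, 0.0, 0.9)])

# Integrate frequency into phase so the glide stays smooth, then render.
phase = 2 * np.pi * np.cumsum(pitch) / SR
audio = np.sin(phase) * amp
```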