Vocal Synthesis | Zachary Behrman

In my Physics Of Music class, we spent a considerable amount of time talking through the mechanics of human speech. While synthesizing fully lifelike human speech is very difficult (due to the complex modeling required for certain speech sounds), replicating vowels is relatively straightforward.

Essentially, the positions of our tongue, jaw and lips emphasize different resonances of the vocal tract (called formants). Each vowel has its own pattern of formants, which is what makes it sound distinct (see the graphic below for a visual example).

A visual example of mouth shape corresponding to frequency spectra (Credit: Georgia State University).

Using MaxMSP, I created a program to model this phenomenon and re-create the vowels outlined by the IPA (International Phonetic Alphabet).
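For readers without Max, the same idea can be sketched in a few lines of Python. This is not the patch itself, only a minimal illustration under some assumptions: a sawtooth wave stands in for the vocal folds, each formant is modeled by a simple two-pole resonator, and the formant frequencies are approximate textbook averages for an adult male voice. The names FORMANTS, resonator and synthesize_vowel, along with the 80 Hz bandwidth, are made up for this example.

```python
import math

# Approximate first three formant frequencies in Hz. These are illustrative
# textbook averages (adult male voice), not values taken from the Max patch.
FORMANTS = {
    "i": (270.0, 2290.0, 3010.0),
    "a": (730.0, 1090.0, 2440.0),
    "u": (300.0, 870.0, 2240.0),
}


def resonator(signal, freq, bandwidth, sample_rate):
    """Two-pole resonant filter that emphasizes energy near `freq`."""
    r = math.exp(-math.pi * bandwidth / sample_rate)  # pole radius, below 1
    a1 = 2.0 * r * math.cos(2.0 * math.pi * freq / sample_rate)
    a2 = -r * r
    gain = 1.0 - r  # rough level scaling, not an exact normalization
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = gain * x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize_vowel(vowel, f0=110.0, duration=0.5, sample_rate=16000):
    """Return a list of samples in [-1, 1] approximating the given vowel."""
    n = int(duration * sample_rate)
    # Source: a sawtooth at pitch f0, rich in harmonics (assumed stand-in
    # for the buzz produced by the vocal folds).
    source = [2.0 * ((i * f0 / sample_rate) % 1.0) - 1.0 for i in range(n)]
    # Filter: sum the outputs of one resonator per formant (parallel form).
    out = [0.0] * n
    for freq in FORMANTS[vowel]:
        band = resonator(source, freq, bandwidth=80.0, sample_rate=sample_rate)
        out = [a + b for a, b in zip(out, band)]
    # Normalize so the loudest sample has magnitude 1.
    peak = max(abs(s) for s in out) or 1.0
    return [s / peak for s in out]


samples = synthesize_vowel("a")
```

Changing only the formant frequencies (for example from "a" to "i") changes the vowel while the pitch stays the same, which is the point of the source-filter picture. The samples could be written to a WAV file with Python's built-in wave module to listen to the result.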

A demo and walkthrough are in my video below.

Additionally, the .maxpat file can be found here.