Designing a Music Synthesizer Emulator

Early in 1991 I wrote an emulator of the Yamaha DX-7 synthesizer (1983 vintage). The goal was simple: I wanted to model frequency modulation (FM) synthesis as implemented on the DX-7 so that I could experiment with tone generation. Now, some people absolutely hated the DX-7; they found the way you created sounds with FM synthesis totally counter-intuitive and not at all “organic.” I liked it, and thought it was very ingenious.

Of course, once the heart of the DX-7 synthesizer was done, I needed to give it files to play. So I ported a MIDI library and used it to “drive” the synthesizer — the MIDI library would effectively tell the synthesizer “key 85 has been pressed with velocity 100.” Some time later, the MIDI library would say “key 85 has been released.”

It was up to the DX-7 synthesizer to respond accordingly.
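
In code terms the contract was tiny. Here's a minimal sketch of the kind of message involved; the names are mine, not ROSE's:

struct NoteEvent {
    enum Type { NoteOn, NoteOff };
    Type type;      // key pressed or key released
    int  key;       // MIDI key number, e.g. 85
    int  velocity;  // how hard the key was struck, e.g. 100 (NoteOn only)
};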

Building the overall infrastructure was a non-trivial task: parsing MIDI files, sequencing them through various software synthesizers, combining the outputs in a spatially aware manner, postprocessing with final effects (room simulation, echo, and the like), and writing the result to a WAVE file or directly to the sound card.

Thus was born ROSE: Rob's Own Synthesizer Emulator. (I based the name on JOVE, “Jonathan's Own Version of EMACS,” IIRC; if Jonathan could have his own version of EMACS, then surely I could have my own synthesizer emulator.)

ROSE's job was to coordinate all of the above tasks. Inputs were decoupled: whether you gave ROSE a MIDI file to play or a text file containing notes and timing, it didn't care; both reduced to the same internal representation. ROSE would then take that internal representation and sequence it through the installed software synthesizers. Here again there was a nice decoupling. ROSE didn't care what a software synthesizer did, so long as it responded to “note on” and “note off” events and rendered audio samples (a sketch of that contract follows below). Several different software synthesizers were present; the biggest was the DX-7, while the rest were more “researchy” synths (for example, one “synth” played with filtered white noise to get a kind of tuned whistling sound).
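
In today's C++, that synthesizer contract might look something like the following. This is a sketch with names of my own invention, not ROSE's actual code:

#include <cstddef>

class SoftSynth {
public:
    virtual ~SoftSynth() = default;
    // "key 85 has been pressed with velocity 100"
    virtual void noteOn(int key, int velocity) = 0;
    // "key 85 has been released"
    virtual void noteOff(int key) = 0;
    // Produce the next 'frames' mono samples into 'out'.
    virtual void render(float* out, std::size_t frames) = 0;
};

The DX-7 emulator and the filtered-noise “whistler” would both sit behind the same interface, which is exactly the decoupling described above.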

As with the input files and the software synthesizer interface, the output interface in ROSE was decoupled. ROSE would gather the outputs it had received from the software synths and mix them down to a final stereo audio track. Whether that stereo track went to a sound card, a WAVE file, or some other output device didn't matter to ROSE, so long as the output handler could accept a stream of 16-bit stereo samples.
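
The output side reduces to an equally small contract. Another sketch, again with hypothetical names:

#include <cstddef>
#include <cstdint>

class StereoSink {
public:
    virtual ~StereoSink() = default;
    // 'samples' holds interleaved left/right 16-bit values;
    // 'frames' counts stereo frames, not individual samples.
    virtual void write(const std::int16_t* samples, std::size_t frames) = 0;
};

A WAVE-file writer and a sound-card driver would each implement write(), and the mixdown code wouldn't know the difference.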

ROSE could even short-circuit the entire software synthesis path and simply copy the input MIDI file to the hardware MIDI port, so it could drive real synthesizers as well.

Here's a demo of ROSE playing Toccata and Fugue in D Minor:

[audio demo]

Of course, today this is all “obvious” in terms of design. Object-oriented methodologies have taught us that decoupling and abstraction are the order of the day. In 1991, the exact “how” of doing that wasn't as apparent as it is now.

ROSE2 Architecture

So this is where I'm at now. Over the last 6 months, I've been intensively (and I mean intensively; my wife can tell you I've ignored her!) learning C++ (I started by writing a vi-like editor from scratch). I have a reasonable handle on it now, and it struck me that I could do a much better job on ROSE than I did 24 years ago.

My first coding attempt at ROSE2 yielded the following master chain:

wave (mixdown (render (midi (argv [i]).events())), extension (argv [i], ".wav"));

Cute, but horribly inefficient and completely inflexible. Still, it did demonstrate that the individual pieces were all working as expected.
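
Unrolled into named steps (the intermediate types, and what each function returns, are my guesses based purely on how the one-liner reads), the chain is:

auto events = midi(argv[i]).events();      // parse the MIDI file into note events
auto tracks = render(events);              // sequence the events through the soft synths
auto stereo = mixdown(tracks);             // combine the synth outputs into one stereo track
wave(stereo, extension(argv[i], ".wav"));  // write the result to a .wav named after the input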