As discussed in the previous paragraphs, the decoder is in charge of translating the input metadata into a synthesis language. Once the transmitted metadata has been translated into a language or format the synthesizer understands, the synthesis step reduces to a traditional synthesis process. The synthesizer therefore only needs to be prepared to respond to the different decoded parameters.
Therefore, in the synthesis step the key issue is the choice of language. Many languages have been developed for the purpose of controlling a synthesizer. Among them, the most widespread is MIDI [MMA, 1998], although its limitations make it clearly insufficient for the system proposed in this paper. Another synthesis language that deserves consideration at this point is MPEG-4's SAOL (Structured Audio Orchestra Language) [Scheirer, 1999b]. See section 5.3.2 for a more in-depth explanation of Structured Audio and its relation to the OOCTM.
However, as will be explained in section 5.3.2, Structured Audio and SAOL present several limitations and are not well suited to our purposes. In the next chapter, an object-oriented synthesis language is presented. The MetriXML language is proposed as a link between analysis, encoding, and synthesis specification; it presents a model of music that is a particular instance of DSPOOM and closely related to the OOCTM. A further advantage of MetriXML is that it is an XML-based language, just like the one proposed as the result of the encoding process. A transformation from an XML document containing analysis results to another XML document containing synthesis parameters can therefore be as simple as defining an XSLT stylesheet (see section 5.4).
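To make this idea concrete, the following minimal sketch shows what such an XSLT mapping might look like. The element and attribute names used here (AnalysisResult, Note, SynthesisScore, PlayNote) are hypothetical placeholders chosen for illustration; they are not the actual MetriXML schema or the schema produced by the encoding process.

    <?xml version="1.0" encoding="UTF-8"?>
    <!-- Hypothetical mapping from an analysis-results document to a
         synthesis-parameters document. All element and attribute names
         are illustrative placeholders, not the real MetriXML schema. -->
    <xsl:stylesheet version="1.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <!-- Wrap the translated events in a synthesis-score root element -->
      <xsl:template match="/AnalysisResult">
        <SynthesisScore>
          <xsl:apply-templates select="Note"/>
        </SynthesisScore>
      </xsl:template>

      <!-- Rewrite each analyzed note as a synthesis event, carrying its
           timing and pitch attributes across unchanged -->
      <xsl:template match="Note">
        <PlayNote onset="{@onset}" duration="{@duration}" pitch="{@pitch}"/>
      </xsl:template>

    </xsl:stylesheet>

Applied to an analysis document, each Note element would be rewritten as a corresponding synthesis event; richer mappings (for instance, renaming parameters or filtering events) would simply add further templates to the same stylesheet.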