
Marsyas (Music Analysis, Retrieval and Synthesis for Audio Signals) is an open source software framework for audio processing with specific emphasis on Music Information Retrieval applications. It has been designed and written by George Tzanetakis (gtzan@cs.uvic.ca) with help from students and researchers from around the world. Marsyas has been used for a variety of projects in both academia and industry.
Currently I'm collaborating with the Marsyas project and trying to make Marsyas a tool for multimodal analysis, including audio and video signals, as well as MIDI data, images and sensor information. The codename for this side project is MarsyasX. It is a new open-source cross-modal analysis framework that aims at a broader scope of applications. It follows a dataflow architecture in which complex networks of processing objects can be assembled to form systems that handle multiple, different types of multimedia flows with expressiveness and efficiency.
A Cross-modal framework

MarsyasX stands for Marsyas “cross-modal” and borrows most of its concepts from Marsyas 0.2, namely the hierarchical composition paradigm and the implicit patching of modules. Similarly, data is processed in defined chunks by calling a tick() function, and each module has a set of controls that give access to its internal parameters. The main conceptual difference is in the way data is exchanged between processing modules. Instead of using shared matrices of real values, MarsyasX exchanges data through a payload mechanism. Whenever a module produces data at a tick, a payload is created. This payload, “carrying” the data, is then sent to the output channel.
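To make the payload idea more concrete, here is a minimal Python sketch of two modules exchanging payloads at each tick. It is not the MarsyasX API: the names `Payload`, `Channel`, `Counter` and `Doubler` are hypothetical, and the first-in, first-out channel is an assumption made for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Payload:
    """Hypothetical container that carries one chunk of data of any type."""
    data: Any
    tick: int  # index of the tick that produced this payload


class Channel:
    """Hypothetical connection between two adjacent modules (assumed FIFO)."""

    def __init__(self) -> None:
        self._pending = deque()

    def send(self, payload: Payload) -> None:
        self._pending.append(payload)

    def receive(self) -> Optional[Payload]:
        # Return the oldest waiting payload, or None if the channel is empty.
        return self._pending.popleft() if self._pending else None


class Counter:
    """Source module: creates one payload per tick and sends it downstream."""

    def __init__(self, output: Channel) -> None:
        self.output = output
        self.count = 0

    def tick(self) -> None:
        self.output.send(Payload(data=self.count, tick=self.count))
        self.count += 1


class Doubler:
    """Processing module: consumes one payload per tick and emits a new one."""

    def __init__(self, input: Channel, output: Channel) -> None:
        self.input = input
        self.output = output

    def tick(self) -> None:
        payload = self.input.receive()
        if payload is not None:
            self.output.send(Payload(data=payload.data * 2, tick=payload.tick))


source_to_doubler = Channel()
doubler_out = Channel()
source = Counter(source_to_doubler)
doubler = Doubler(source_to_doubler, doubler_out)

results = []
for _ in range(3):
    source.tick()
    doubler.tick()
    results.append(doubler_out.receive().data)

print(results)  # [0, 2, 4]
```

Because a payload wraps an arbitrary object, the same two classes would work unchanged if `data` were an image or a list of MIDI events rather than an integer.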

A channel is a connection between adjacent modules where payloads are stacked while waiting to be processed. It is important to note that channels are established implicitly, according to the type of composite being used. This data exchange mechanism is highly generic and flexible, supporting any type of data (e.g. images, audio frames, MIDI, XML, lists of points, etc.). However, it raises issues of its own, such as timing and synchronization, which are addressed later in this text. In addition, payloads can be hierarchically grouped into flows to enable typing and naming of time series of payloads. Alongside this fundamental difference from the previous version, MarsyasX includes additional improvements, like an integrated implementation of events associated with controls that can (1) trigger predefined actions, (2) connect controls with expressions (similarly to Marsyas 0.2), (3) synchronize with GUIs and (4) support distributed networks.
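The grouping of payloads into typed, named flows could be pictured roughly as follows. This is only a sketch under my own assumptions: the `Flow` class, its fields and the `find` helper are hypothetical names, not part of MarsyasX.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Flow:
    """Hypothetical typed, named time series of payloads with optional sub-flows."""
    kind: str   # flow type, e.g. "audio" or "video"
    name: str   # flow name, unique among its siblings
    payloads: List[Any] = field(default_factory=list)
    children: Dict[str, "Flow"] = field(default_factory=dict)

    def add_child(self, flow: "Flow") -> "Flow":
        self.children[flow.name] = flow
        return flow

    def find(self, kind: str) -> List["Flow"]:
        """Return this flow and every descendant of the given type, depth first."""
        found = [self] if self.kind == kind else []
        for child in self.children.values():
            found.extend(child.find(kind))
        return found


# A container flow grouping one video flow and two audio flows.
movie = Flow(kind="container", name="movie")
camera = movie.add_child(Flow(kind="video", name="camera"))
left = movie.add_child(Flow(kind="audio", name="left"))
right = movie.add_child(Flow(kind="audio", name="right"))

# Each flow accumulates its own time series of payloads.
left.payloads.append([0.0, 0.1])
right.payloads.append([0.0, -0.1])

audio_names = [flow.name for flow in movie.find("audio")]
print(audio_names)  # ['left', 'right']
```

Selecting flows by type and name like this is what would let a module pick only the data it understands out of a mixed multimedia stream.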
Main differences to Marsyas-0.2:
- generic payload-based flows, identified by type and name, instead of
matrix-based processing
- independent updates for each control
- use of extensions to add new MarSystems to MarSystemManager -
commonly known as plugins
- tentative flow-driven propagation of controls
- tentative event management that can support at the same time Neil's
event system, GUI sync and networking
- tight integration with Python (optional, but very useful IMO)
- unit testing
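As a rough illustration of independent control updates and of events attached to controls, here is a small Python sketch. All names (`Control`, `on_change`, `link_to`) are hypothetical and chosen for this example; the real MarsyasX event management may look quite different.

```python
from typing import Any, Callable, List


class Control:
    """Hypothetical control: a named parameter that fires an event when it changes."""

    def __init__(self, name: str, value: Any) -> None:
        self.name = name
        self._value = value
        self._listeners: List[Callable[[Any], None]] = []

    @property
    def value(self) -> Any:
        return self._value

    def on_change(self, listener: Callable[[Any], None]) -> None:
        """Register an action to trigger whenever the value changes."""
        self._listeners.append(listener)

    def set(self, value: Any) -> None:
        # Only this control is updated; unchanged values fire no event.
        if value == self._value:
            return
        self._value = value
        for listener in self._listeners:
            listener(value)

    def link_to(self, target: "Control", expression: Callable[[Any], Any]) -> None:
        """Keep `target` equal to `expression(self.value)` after every change."""
        self.on_change(lambda new_value: target.set(expression(new_value)))


gain_db = Control("gain_db", 0.0)
gain_linear = Control("gain_linear", 1.0)

# Connect two controls with an expression (decibels to linear gain).
gain_db.link_to(gain_linear, lambda db: 10 ** (db / 20))

# A listener standing in for a GUI widget or a network peer.
log = []
gain_linear.on_change(lambda new_value: log.append(new_value))

gain_db.set(20.0)
print(gain_linear.value, log)  # 10.0 [10.0]
```

The same listener hook could serve predefined actions, GUI synchronization or networking, which is why a single event mechanism is attractive here.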
Short-term todo list:
- improve current base visual classes
- add base audio classes
- create AVSource - probably based on FFmpeg - and AVSink
- clean up and create a lot more tests
Not so short-term todo list:
- create utility classes for flow handling (filtering, delaying, etc)
- improve and finish event management
- port Marsyas-0.2 audio classes
- create new visual processing classes
- optimizations
Long-term todo list:
- distributed networks of MarSystems
- add processing support for MIDI flows
- add processing support for XML-based flows (e.g. MPEG-7, MPEG-4 XMT,
X3D, etc)
Relevant publications:
Compile Notes (only interesting for developers)
