Monday, December 26, 2022

Fancy text clips and text animations with camala and moviepy

Problem?

When making videos, sometimes you want to add some text animations. And since not everyone has access to commercial tools like After Effects, it would be nice if a free alternative existed.

Meet Camala

Camala stands for caption markup language. You describe text animations in a toml specification, and let the system synthesize high-quality text video clips from it.

Example

More information?

For more information, please visit https://github.com/shimpe/camala

Sunday, November 7, 2021

Driving the GODOT game engine with OSC messages - part 2

Using a thread instead of a busy loop

As a quick follow-up to part I, where I basically used a while (1) { ... } loop which tends to make the CPU run at 100% (unless you use the OS.set_low_processor_usage_mode which is not portable across platforms, and which may actually cause the rest of your game/sketch to run badly, here's an alternative that parses the OSC messages whenever a timer expires.

In the rootnode script _ready function, a timer is instantiated that calls a function _on_timeout and reschedules itself after it timed out:

timer = Timer.new()
timer.autostart = true
timer.one_shot = false
timer.wait_time = 0.1
timer.connect("timeout", self, "_on_timeout")
add_child(timer)

Then also in the rootnode script, you can implement the _on_timeout function to perform the parsing of the osc information and distribution to the observers.

Instead of trying to explain everything in this post, it's probably much easier for you to clone my github example and play with it: https://github.com/shimpe/example_godot_gdscript_osc

The game listens for OSC messages like "/$index/set/color min max val" where $index is a number between 0 and 15, and min represents a value for which the sensor will color black, max represents a value for which the sensor will color maximally red and val represents a value between min and max. Val will be used to adapt the sensor color between black and maximally red, based on where it is located between min and max. So if you send /5/set/color 0 100 50 to the sketch, then the color of sensor number 5 will be changed to be halfway between black and bright red.

For testing, I used a simple supercollider program to send OSC messages to the game and which lights up the sensors randomly like some Christmas tree, but you can use whatever program you have that supports sending OSC messages.

(
b = NetAddr.new("127.0.0.1", 4242); // create the NetAddr
fork {
       1000.do {
               |idx|
               b.sendMsg("/"++15.rrand(0)++"/set/color",0,100,100.rrand(0));
               0.1.wait;
       }
};
)

Monday, February 24, 2020

Driving the GODOT game engine with OSC messages

Problem?

Note: I've tested this code with a very recent version of GODOT compiled from source code. Your mileage may vary.

In interactive art installations, an often used protocol to communicate between computer processes is the open sound control (OSC) protocol. Despite the name "sound" in OSC, the protocol is actually completely general purpose. In my opinion a game engine like the godot engine would benefit tremendously from built-in support for the OSC protocol because it opens up the engine to creative uses outside the gaming domain. On of my own use cases is to visualize in real-time musical events that are algorithmically generated (also in real-time) in supercollider.

The GODOT development team, despite several requests from users, chose not to support OSC directly and provide a high-level networking interface which they feel is more appropriate for their game engine (but which only works with their game engine).

As a result, some users have attempted to write add-on modules using the GdNative C++ add-on mechanism to extend the GODOT engine. Such modules often have the drawback of limited portability and requiring users to set up a C++ building environment. In addition, the API that needs to be used by the add-ons could be changing drastically between GODOT versions.

Therefore, I was hoping to find a more portable solution. Luckily GODOT also provides some lower level networking classes, which allow third-party developers to add some support for unsupported protocols.

OSC is typically a thin layer on top of UDP networking (sometimes TCP is used, but most often UDP is chosen for reasons of performance), and so it should be possible to add a simple OSC implementation in GDScript directly.

Note: to make sense of this blog post, you need to have a little bit of working knowledge about the GODOT game engine, at the minimum, you need to know what nodes are.

Will it perform super fast?

Probably not. But often you don't really need to send tons of data over the network to drive your visualization. Think about optimizing your data streams.

Will it be easy to use?

I think it is extremely easy to use. But feel free to try it out and judge for yourself.

How can I use it?

Glad you ask. Here's the big picture:

To the root note of our GODOT sketch (usually a Node2D for 2d games, or a Spatial for 3d games), I attach a script that monitors the network for incoming OSC messages. The OSC messages are decoded and automatically dispatched to child nodes who requested to be kept informed about certain incoming messages.

Consider an example of a Node2D rootnode (which I renamed to RootNode in the IDE) with a Sprite child node. In the script attached to the Sprite, I only have to express my desire to be kept up-to-date about incoming OSC messages, e.g. here's the complete script for a sprite that can change position by receiving a /set/xy OSC message.

extends Sprite

func _ready():
   get_tree().get_root().get_node("RootNode").register_dual_arg_callback("/set/xy", get_node("."), "set_xy")

func set_xy(x, y):
   self.position.x = x
   self.position.y = y

This means that when the Sprite is instantiated, it registers itself with the root node ("register_dual_arg_callback"). It tells the root note that it wants its function "set_xy" to be called whenever a "/set/xy" OSC message is received over the network with two arguments (x and y) which in this example then are used to set the sprite position. There's also a register_single_arg_callback for functions that expect only one value (and nothing prevents you from adding more variants).

This simple code suffices to ensure that from now on, this particular Sprite will automatically react to incoming OSC /set/xy messages by updating its position.

E.g. in an environment that supports sending OSC message like supercollider, I can run the following snippet to update the sprite position:

b = NetAddr.new("127.0.0.1", 4242); // create the NetAddr
b.sendMsg("/set/xy", 650, 200);

The godot sketch automatically reacts to the OSC message by updating the sprite xy position to 650, 200, exactly what we requested the system to do by registering a callback. Needless to say the OSC message could perfectly come from another computer, tablet or phone (e.g. from the TouchOsc app or from Open Stage Control).

The value 4242 is the port number on which the root node is listening for incoming OSC messages. In the proof of concept code it's hardcoded to 4242, but you are of course free to change it or to make it configurable.

Ahm, ok. Show me this magic code that goes into the root node then?

Note: the code in this section is obsolete, and replaced by the code in part 2 of this article: https://technogems.blogspot.com/2021/11/driving-godot-game-engine-with-osc.html

Well... it's not the prettiest code (it's a proof of concept - things like IP address and port are hardcoded, but should be simple to change). It works well enough for me at the moment (but I've only done some basic experiments so far). In case I run into trouble, I may need to revisit the details. If you run into trouble, please let me know what happened and how you solved it (or just explain the problem and we can discuss to see if we can find a solution).

Here's the code I put in the root node. For now it only supports parsing integer, float, string and blob OSC messages. In the future maybe also OSC bundles and some other data types could be added, but even with only this simple subset of OSC supported, the possibilities are already endless.

Note that the call to OS.set_low_processor_usage_mode may not be portable across all platforms, but it's probably not strictly needed to make the system work (but don't take my word for it, I'm not at all experienced with GODOT on platforms other than linux).

The code for registering callbacks and dispatching probably can be made a bit more general, and there could be support for pausing, resuming and stopping reacting to OSC notifications but for demonstration purposes what I have here should suffice.

extends Node2D

var IP_CLIENT
var PORT_CLIENT
var PORT_SERVER = 4242 # change me if you like!
var IP_SERVER = "127.0.0.1" # change me if you like!
var socketUDP = PacketPeerUDP.new()
var observers = Dictionary()

func register_single_arg_callback(oscaddress, node, functionname):
 observers[oscaddress] = [1, node, functionname]
 
func register_dual_arg_callback(oscaddress, node, functionname):
 observers[oscaddress] = [2, node, functionname]
 
func _ready():
 OS.set_low_processor_usage_mode(true)
 start_server()
 
func all_zeros(lst):
 if lst == []:
  return true
 for el in lst:
  if el != 0:
   return false
 return true

func _process(_delta):
 if socketUDP.get_available_packet_count() > 0:
  var array_bytes = socketUDP.get_packet()
  #var IP_CLIENT = socketUDP.get_packet_ip()
  #var PORT_CLIENT = socketUDP.get_packet_port()
  var stream = StreamPeerBuffer.new()
  stream.set_data_array(array_bytes)
  stream.set_big_endian(true)
  var address_finished = false
  var type_finished = false
  var address = ""
  var type = ""

                # parse osc address
  while not address_finished:
   for _i in range(4):
    var addrpart = stream.get_u8()
    if addrpart != 0:
     address += char(addrpart)
    if addrpart == 0:
     address_finished = true

                # parse osc type list 
  while not type_finished:
   for _i in range(4):
    var c = stream.get_u8()
    if c != 0 and char(c) != ",":
     type += char(c)
    if c == 0:
     type_finished = true

  # decode values from the stream
  var values = []
  for type_id in type:
   if type_id == "i":
    var intval = stream.get_32()
    values.append(intval)
   elif type_id == "f":
    var floatval = stream.get_float()
    values.append(floatval)
   elif type_id == "s":
    var stringval = ""
    var string_finished = false
    while not string_finished:
     for _i in range(4):
      var ch = stream.get_u8()
      if ch != 0:
       stringval += char(ch)
      else:
       string_finished = true
    values.append(stringval)
   elif type_id == "b":
    var data = []
    var count = stream.get_u32()
    var idx = 0
    var blob_finished = false
    while not blob_finished:
     for _i in range(4):
      var ch = stream.get_u8()
      if idx < count:
       data.append(ch)
      idx += 1
      if idx >= count:
       blob_finished = true
    values.append(data)
   else:
    printt("type " + type_id +" not yet supported")

  if observers.has(address):
   var observer = observers[address]
   var number_args = observer[0]
   var nodepath = observer[1]
   var funcname = observer[2]
   if number_args == 1:
    nodepath.call(funcname, values[0])
   elif number_args == 2:
    nodepath.call(funcname, values[0], values[1])
    
func start_server():
 if (socketUDP.listen(PORT_SERVER) != OK):
  printt("Error listening on port: " + str(PORT_SERVER))
 else:
  printt("Listening on port: " + str(PORT_SERVER))

func _exit_tree():
 socketUDP.close()

Compared to the misery of adding, compiling, maintaining, porting, ... a GdNative module, I think this is pretty acceptable (at least for my use cases).

Thursday, January 2, 2020

Making an arpeggiator in supercollider with patterns

Problem

Given some notes as input, generate a pattern making use of those notes. When the input changes, the generated pattern should also change. Most keyboards and synthesizers provide simple arpeggiators, but we'll be using supercollider which allows for generating the most complex patterns imaginable including generation of chords, polyphony, introducing random variations, etc
Code for the final piece of code in this post can be found on sccode.org: http://sccode.org/1-5cr

Approach

How can we convert a list of notes to an interesting arpeggio? Different possibilities exist, but we'll be using one of the more powerful abstractions available to supercollider users: the pattern system.

Patterns act as a kind of template for generating events, and events can be played to create sounds. This sounds exactly like what the doctor prescribed.

A simple example to get started

First let's assign some midi notes to a variable ~n:

~n = [60, 64, 67]; // c major chord

If you, like me, prefer to reason in note names instead, you can install the Panola quark:

Quarks.install("https://github.com/shimpe/panola");

Then you can write the following instead (it's a bit longer but, hey, at least with readable note names):

~n = Panola("c4 e4 g4").midinotePattern.asStream.all;

Our task is to define a pattern that uses the notes in ~n and builds from them a simple arpeggio. So, given the input notes [c4, e4, g4] instead of playing simply the notes [c4, e4, g4] we'll generate a more interesting arpeggio [c4, g4, e4, g4]. When the input notes change to e.g. [c4, f4, a4] the arpeggio that is generated should change to [c4, a4, f4, a4].

(
s.waitForBoot({
    var arp = Pbind(
        \instrument, \default,
        \midinote, Plazy {
            var n0, n1, n2;
            ~n = ~n ?? [Rest(1)];
            n0 = ~n[0] ?? Rest(1);
            n1 = ~n[1] ?? ~n[0];
            n2 = ~n[2] ?? ~n[0];
            Pseq([n0, n2, n1, n2])
        },
        \dur, Pseq([1,1,1,1].normalizeSum*2)
    );
    if (~player.notNil) { ~player.stop; };
    ~player = Pn(arp).play;
});
)

This code requires some explanation:

s.waitForBoot is a construct I use in pretty much every supercollider sketch I make. It will start the supercollider sound server if it wasn't started yet.
Once the server is booted, the function that is passed to waitForBoot is executed. This function defines a pattern (also known as Pbind) "arp" and plays it.
The full power of Pbind is available (meaning that you could e.g. generate midi events and send them to a hardware instrument), but for demo purposes we just instantiate the supercollider default instrument. This should generate some sound even if you don't own any fancy hardware (after all, supercollider is also a sound synthesis language and therefore very capable of generating its own sounds). Instantiating the default instrument is accomplished by specifying the key-value pair \instrument, \default in the Pbind.

Plazy is a filter pattern that allows to calculate a new supercollider pattern using a function.

In addition to selecting an instrument, we also need to generate the notes to be played. Remember that ~n is the input note list. First we try to extract the first 3 notes from the input note list. These are the notes that we will use to rearrange into our arpeggio.

The line ~n = ~n ?? [Rest(1)] checks if variable ~n is defined already (actually it checks if ~n is nil). If it is not defined, it is assigned a list of input notes consisting of a single Rest. Then I introduce some variables n0, n1, n2 to denote the first, second and third note in the input list respectively. It may happen that the input list contains fewer than 3 notes (e.g. if you initialize the ~n variable from midi input from a hardware device, someone might play a 2-note chord instead of a 3-note chord). In that case we don't want our code to crash. If the first note is missing, I replace it with a Rest. If the second or third notes are missing, they are replaced with the first note.

The function passed to Plazy returns a Pseq that generates our arpeggio consisting of the first, third, second and third input note: n0, n2, n1, n2. Pseq is a pattern that generates successive notes. By default the complete list of notes will be repeated once and then the pattern stops.

To generate an arpeggio we are not limited to only generating notes. We can also generate durations, volumes, legato-staccato, and a bunch of other properties. All these can be derived from the input note list, or they can be completely independent from them if so desired. In this first example, let's just give all notes equal duration and keep all other properties to their default values. Specifying durations in a pattern is done by using the \dur key.

I want the complete arpeggio to be finished in 2 beats, so I specify 4 (because there are 4 notes) equal relative durations of 1 and then normalize the values to make their sum == 2.

Finally, we need to stop any previous instances of the pattern that may be playing, and make our pattern start. This happens in the lines

if (~player.notNil) { ~player.stop; }; // call stop if not stopped already
~player = Pn(arp).play;

Note the use of the Pn pattern, to make our arpeggiator repeat indefinitely.

As soon as you redefine the ~n variable to a new value, the pattern (but only after the previous instance was completely finished) will use the new values in ~n and generate a new arpeggio built from the new notes, so e.g. while the pattern is playing, try evaluating the following lines one by one, listening how it changes the arpeggio.

~n = [60, 64, 67 ];

~n =[ 60, 65, 69 ];

~n =[ 59, 65, 67 ];

Exercise: adapt the code to make an arpeggio based on the first four notes of a list of input notes.

Exercise: adapt the code to use different durations for different notes

Fancier patterns

The simple arpeggio we generated above is already more complex than what many synthesizers and keyboards can do, but supercollider being supercollider this is just the tip of the iceberg available to us.

We can generate multiple patterns from a single list of input notes and play them together with Ppar. In addition, not all durations and amplitudes need to be the same. You could generate a complete 16 track auto-accompaniment from a simple list of input notes using this technique. Here's an example of a melody pattern with a bass line generated from the input notes. Note that I do some arithmetic on the notes (+12) to add an octave. In general you are not limited to using only the input notes given by the user. You can add any other note you desire, which can (but needn't) be derived from one of the notes in the input list.

(
s.waitForBoot({
    var right, left;
    ~n = ~n ?? [Rest(1)];
    right = Pbind(
        \instrument, \default,
        \midinote, Plazy {
            var n0, n1, n2;
            ~n = ~n ?? [Rest(1)];
            n0 = ~n[0] ?? Rest(1);
            n1 = ~n[1] ?? ~n[0];
            n2 = ~n[2] ?? ~n[0];
            Pseq([ n0, n2, n1, n2 ] ++  (([ n0, n2, n1, n2 ] + 12)!2).flatten)
        },
        \dur, Pseq([1, 1, 1, 1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5 ].normalizeSum*2)
    );
    left = Pbind(
        \instrument, \default,
        \midinote, Plazy {
            var n0, n1, n2;
            ~n = ~n ?? [Rest(1)];
            n0 = ~n[0] ?? Rest(1);
            n1 = ~n[1] ?? ~n[0];
            n2 = ~n[2] ?? ~n[0];
            Pseq([ n0, n2, n0, n2, n0, n2, n0, n2] - 12)
        },
        \dur, Pseq([1, 1, 1, 1, 1, 1, 1, 1].normalizeSum*2)
    );
    if (~player.notNil) { ~player.stop; };
    ~player = Pn(Ppar([right,left])).play;
});
)

Exercise: define some percussion instruments and add some percussion to the fragment.

Listening to midi input

Until now we've defined the ~n variable manually. But we can just well listen to a midi device and react to the incoming notes (or control change messages).
Before we can use midi devices, we need to initialize supercollider's midi system. To do so, evaluate the following code:

(
MIDIdef.freeAll;
MIDIClient.init;
MIDIIn.connectAll;
)

MIDIdef.freeAll will remove any midi handlers that may still be running. MIDIClient.init will initialize midi communication in supercollider and MIDIIn.connectAll ensures that we react to incoming midi msgs of all midi devices connected to the system.

Now we can install a midi handler that reacts to note on and note off messages. We will maintain an array of notes, and for each note in the array whether it's on or off. This array forms the basis from which we derive our list of input notes.

Note that in the following, the arpeggio keeps playing until we press another chord. If you want the arpeggio to stop when you release the midi keys, you can add

~n = ~note_table.selectIndices({|item, i| item != 0});
if (~n == []) { ~n = nil; };

in the note off handler.

(
MIDIdef.freeAll;
MIDIClient.init;
MIDIIn.connectAll;
)

(
s.waitForBoot({
    var right, left;

    ~note_table = 0!127;
    
    MIDIdef.noteOn(
        \mynoteonhandler, // just a name for this handler
        {
            |val, num, chan, src|
            num.debug("num");
            ~note_table[num] = 1; // update note table and update ~n
            ~n = ~note_table.selectIndices({|item, i| item != 0}).postln;
        }
    );

    MIDIdef.noteOff(
        \mynoteoffhandler, // just a name for this handler
        {
            |val, num, chan, src|
            num.debug("num");
            ~note_table[num] = 0; // update note table and update ~n
        }
    );

    right = Pbind(
        \instrument, \default,
        \midinote, Plazy {
            var n0, n1, n2;
            ~n = ~n ?? [Rest(1)];
            n0 = ~n[0] ?? Rest(1);
            n1 = ~n[1] ?? ~n[0];
            n2 = ~n[2] ?? ~n[0];
            Pxrand([ n0, n2, n1, n2 ] ++  (([ n0, n2, n1, n2 ] + 12)!2).flatten)
        },
        \dur, Pseq([1, 1, 1, 1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5 ].normalizeSum*2)
    );
    left = Pbind(
        \instrument, \default,
        \midinote, Plazy {
            var n0, n1, n2;
            ~n = ~n ?? [Rest(1)];
            n0 = ~n[0] ?? Rest(1);
            n1 = ~n[1] ?? ~n[0];
            n2 = ~n[2] ?? ~n[0];
            Pseq([ n0, n2, n0, n2, n0, n2, n0, n2] - 12)
        },
        \dur, Pseq([1, 1, 1, 1, 1, 1, 1, 1].normalizeSum*2)
    );
    if (~player.notNil) { ~player.stop; };
    ~player = Pn(Ppar([right,left])).play;
});
)

Note that now always a complete "iteration" of the pattern has to be finished before changes in the input note list create a new arpeggio. You cannot switch the arpeggio to a different chord in the middle of the arpeggio. How can we change this behaviour?

Change chords in the middle of the arpeggio

What if we want changes in the input notes to have immediate effect? Can we adapt the code to make the system react faster to changes? Well... it's supercollider so of course we can. Let's see how it can be done.
If we want immediate reaction of the system to chord changes, one approach is to replace n0, n1, n2 with patterns that reevaluate a function every time they are called. This function then performs a lookup of a note in our ~n variable, which is updated as soon as new midi notes are received.
Also I moved the midi initialization code inside the system because I don't really like having to evaluate multiple code blocks successively.

(
s.waitForBoot({
 var right, left;
 var n0, n1, n2;

 MIDIdef.freeAll;
 if (~midi_initialized.isNil) {
  MIDIClient.init;
  MIDIIn.connectAll;
  ~midi_initialized = 1;
 };

 ~note_table = 0!127;
 ~n = nil;

 MIDIdef.noteOn(
  \mynoteonhandler, // just a name for this handler
  {
   |val, num, chan, src|
   ~note_table[num] = 1; // update note table and update ~n
   ~n = ~note_table.selectIndices({|item, i| item != 0});
  }
 );

 MIDIdef.noteOff(
  \mynoteoffhandler, // just a name for this handler
  {
   |val, num, chan, src|
   ~note_table[num] = 0; // update note table and update ~n
   /*
   // enable next two lines only if you want arpeggios to stop playing
   // when you release the midi keys
   ~n = ~note_table.selectIndices({|item, i| item != 0});
   if (~n == []) { ~n = nil; };
   */
  }
 );

 n0 = Plazy {
  if (~n.isNil) {
   Pseq([Rest(1)]);
  } {
   ~n[0] ?? Pseq([Rest(1)]);
  };
 };

 n1 = Plazy {
  if (~n.isNil) {
   Pseq([Rest(1)]);
  } {
   Pseq([~n[1] ?? ~n[0]]);
  };
 };

 n2 = Plazy {
  if (~n.isNil) {
   Pseq([Rest(1)]);
  } {
   Pseq([~n[2] ?? ~n[0]]);
  };
 };

 right = Pbind(
  \instrument, \default,
  \midinote, Pseq([ n0, n2, n1, n2] ++ (([ n0, n2, n1, n2] + 12)!2).flatten),
  \dur, Pseq([1, 1, 1, 1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5 ].normalizeSum*2)
 );
 left = Pbind(
  \instrument, \default,
  \midinote, Pseq([ n0, n2, n0, n2, n0, n2, n0, n2] - 12),
  \dur, Pseq([1, 1, 1, 1, 1, 1, 1, 1].normalizeSum*2)
 );
 if (~player.notNil) { ~player.stop; };
 ~player = Pn(Ppar([right,left])).play;
});
)

Final cleanup

As a final cleanup we can remove some code duplication

(
s.waitForBoot({
 var right, left;
 var n0, n1, n2;
 var note_getter;

 MIDIdef.freeAll;
 if (~midi_initilized.isNil) {
  MIDIClient.init;
  MIDIIn.connectAll;
  ~midi_initialized = 1;
 };

 ~note_table = 0!127;
 ~n = nil;

 MIDIdef.noteOn(
  \mynoteonhandler, // just a name for this handler
  {
   |val, num, chan, src|
   ~note_table[num] = 1; // update note table and update ~n
   ~n = ~note_table.selectIndices({|item, i| item != 0});
  }
 );

 MIDIdef.noteOff(
  \mynoteoffhandler, // just a name for this handler
  {
   |val, num, chan, src|
   ~note_table[num] = 0; // update note table and update ~n
   /*
   // only enable the following lines if you want the arpeggio to stop as soon as you release the keys
   ~n = ~note_table.selectIndices({|item, i| item != 0});
   if (~n == []) { ~n = nil; };
   */
  }
 );

 note_getter = {
  | index |
  Plazy {
   if (~n.isNil) {
    Pseq([Rest(1)]);
   } {
    ~n[index] ?? (~n[0] ?? Pseq([Rest(1)]));
   };
  };
 };

 n0 = note_getter.(0);
 n1 = note_getter.(1);
 n2 = note_getter.(2);

 right = Pbind(
  \instrument, \default,
  \midinote, Pseq([ n0, n2, n1, n2] ++ (([ n0, n2, n1, n2] + 12)!2).flatten),
  \dur, Pseq([1, 1, 1, 1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5 ].normalizeSum*2)
 );
 left = Pbind(
  \instrument, \default,
  \midinote, Pseq([ n0, n2, n0, n2, n0, n2, n0, n2] - 12),
  \dur, Pseq([1, 1, 1, 1, 1, 1, 1, 1].normalizeSum*2)
 );
 if (~player.notNil) { ~player.stop; };
 ~player = Pn(Ppar([right,left])).play;
});
)
)

Let me know if have ideas to enhance the system!

Sunday, May 26, 2019

Jan De Cock: Laissez Faire - Laissez Passer

Jan Who?

Jan De Cock is a Belgian artist. After exhibiting in the Tate Modern (2005), he became the first living Belgian artist to have received a solo exhibition in the New York MoMA (2008). In 2012 he also presented his work in the Staatliche Kunsthalle Baden-Baden. Jan De Cock is the founder of the "Brussel's Art Institute", an organization aimed at bringing students and other people in contact with the best Belgian and international contemporary works and ideas.

Because of his appearance in the satirical TV-show "De Ideale Wereld", I was reminded of how he hit the Belgian news around february 2016 because of starting a lawsuit against some of Belgium's biggest media companies for not paying enough attention to art related topics.

Ok... so what?

In an attempt to clarify his problem with these media companies Jan De Cock wrote a 7-page manifest explaining the problems he sees with these media companies. The manifest is difficult to read, in part due to using complicated (in some cases bordering on the nonsensical) sentences. It also contains a fair share of mistakes against grammar, spelling and punctuation rules.

If this way of writing was done on purpose, it was done in a strike of genius (and if not, it was a lucky strike) as it accomplished two things at once:

It ensured a lot of media coverage of an art related topic by putting a kind of "form" (a far from flawless text) over "content".
It caused said media to massively ridicule the man's writing skills and not paying much - if any - serious attention to the underlying message.

I don't get it...what's so brilliant about writing a wacky text and being ridiculed for it? Isn't that just dumb instead?

Well, to understand why this is brilliant, you need to read the text for its message first, rather than for its form. In what follows I've tried to condense the 7 pages into points. Note that this is my own interpretation of the manifest, which you can read for yourself in its entirety (in Dutch!) on the web site of the "Brussel's Art Institute".

The manifest in a few points...

Before we dive into the manifest, I'd like to make it clear that I'm in no way affiliated with Jan De Cock or the Brussel's Art Institute. I just read the text, and found it remarkable enough to write this blog entry. I didn't ask permission or get consent from the artist to explain his views as if I totally understand them, and I'm adding words never used by the author to make my point - reader beware!

The following summarizes and paraphrases the manifest - or at least what I made of it:

Never before did the media so fiercely try to keep up the appearance that they are doing a wonderful job serving everyone with good information.
Reality is very different. Media are being restyled all the time with one sole purpose: selling more. Content is less important than form.
Physical media are being replaced with digital media very quickly. Digital media are volatile, whereas physical artifacts of the past are still with us today.
Media maintain that they serve their audience a panoramic world view, but this is nothing but marketing speak. Consumers of the media believe everything that is said in those media, and what doesn't appear in the media doesn't exist.
We, as a society, should go back to transfer the knowledge we gained through hard labor, from the moment we started back when we were still amateur-students, until the moment we mastered our field. Media are hindering us there by ignoring art related or - more generally - all difficult subjects like the plague.
Media do not pay enough attention to artists. These artists typically are completely ignored until they become of economical value (e.g. because they gather some international attention) at which point during a few weeks everyone wants them in their superficial talk show.
There's a systematic discrimination and ridiculing of artists that supposedly do "not contribute economical value", and on the contrary supposedly "live of state funding" in the form of grants. But can you economically quantify the value of such an artist? Is economy the only factor in judging if something is useful? Is a narrow-minded neo-liberal world view even capable of correctly assessing the value of art? Because monetary gain needs to be maximized, everything reduces to a kind of uniform sludge, aimed at attracting the largest possible audience.
As a consequence we live in sad times when it comes to art. Representation, fame, exposure are more important than insight and seeing past superficial form.
Artists asking attention for art in general are dismissed as merely seeking personal attention.
If the media continues like they do today, growing dumber in every new restyling, in an attempt to lure a bigger audience, the readers and viewers will keep degenerating with it, and their numbers will keep on decreasing until eventually no one's left. The media are committing suicide for short-term profit.
Just like art, media should instead establish a tradition of passing on knowledge and insights, independent of politics and economy. Not merely put up a show made up by marketeers. The message is more important than the form. Media should contribute to creating a nation of critical, analytical minds capable of distinguishing banalities from real content. The current way media works makes and keeps readers/viewers dumb.

My thoughts on this...

Now independent of whether you do or don't agree with any or all of his points, I want to point out why the manifest itself, and the way the media reacted to it, actually perfectly illustrate what the artist is complaining about:

Exactly as he predicted, the unusual form he chose to write the manifest (using weird language and writing mistakes) caused a lot more media attention that what he would have got if it were written in perfect Dutch, thereby illustrating that indeed form prevails over content, and
just like he predicted, the media, in an attempt to demonstrate their superiority, mostly ignored the actual contents of the manifest, instead concentrating on finding all the spelling and grammar errors and ridiculing the writer, and accusing him of attention seeking, in their most creative wordings instead.

One has to love the sweet irony of the media falling into the artist's trap with open eyes. In my eyes, the last laugh is on the artist here...

The tone of the manifest has been compared to that of Marxist pamphlets in the media, and to some extent I can agree to that characterization - things are stated in extremes and without proof, but to dismiss the message as nothing but attention seeking seems shallow to me.

One cannot deny that just about everything we see on television and in magazines tends to become dumb and dumber. Compare today's television quiz questions to those of 30 years ago and weep... Compare today's election debates with those of 20 years ago and weep even harder... Compare today's "reality shows" to... yeah.. to what? Weeping doesn't even begin to address the stupidity we're being force-fed in some of those tv-shows. Everyone is free to choose not to watch the crap, but the availability of less-than-crap to watch is rapidly decreasing and being replaced by super-crap in a fast tempo. It seems as every year adds a new level of stupidity to TV shows.

You may argue that creating abstract art, or writing obscure books that "no one" reads or difficult TV programs that "no one watches" are not likely to contribute to this Utopian nation of analytical and critical people either. I'd say it's good if they exist for those few people who wish to think and seek. You never know who they might inspire next. Preferably these works exist in a physical form that can surive the volatility of whatever new digital format happens to be the hype of the day.

Government grants for art nowadays appear to be given to those artists that lure most people... citing reasons like "public money should be useful for the widest possible public". But paradoxically these may exactly be the people who need the grants the least, since they already are popular and have ways to survive through performances or sales of their creative output.

Near the end of the 19-eighties e.g. the Belgian television had a daily tv-show program (only 5 minutes long) called "Kunstzaken" ("Art matters") that every day would talk about an exposition or a play or something cultural happening at that moment in time. Nowadays nothing comparable exists anymore on Belgian TV because 5 minutes a day of not attracting 1.000.000 viewers is an economic disaster, isn't it?

Having said all this.... maybe it's time for artists to help the audiences understand their art a bit better. Just putting an installation on the street, and expecting the holy spirit to explain it to a random passer-by by now has proved itself not to work for abstract or experimental art. If you always do what you always did you will always get what you always got. If artists somehow found a way to better explain what they do and why it matters, they might also be able to gain back some of the audience they seem to have lost.

Get up, work ahead!

So artists who might read this... blaming media is a bit easy. don't just sit there and weep over the state of matters, but use your fantastic creativity and imagination to find ways to explain what you do and why it matters. Ernest Rutherford supposedly said: "If you can't explain your research to the cleaning lady, it's not worth doing". Postmodern descriptions of art written in difficult language in art catalogues do not strike me as the right format for this particular task (although they can happily coexist with more mundane explanations). While it may be true that some of history's geniuses were not understood during their life time, not all ignored artists automatically are geniuses. I do not believe for a second that explaining art to a wider audience somehow kills it unless it was crap to begin with.

While it may be true that too much explanation can create prejudice or preconception in an art viewer's mind, thereby possibly closing some paths to multiple or individual interpretations, people who still remember or (re)learn how to think for themselves will not be stopped by some explanation, and might even be inspired to explore the works of art more and deeper and look at them in ways they hadn't thought of before. Art is a journey and you cannot expect a single work of art to convey the whole of history you as an artist went through to arrive there, but you can provide some hooks for people to connect it to their own world.

And people of the media... admitting you have a problem is the first and hardest step to start resolving it. You too can help artists, and humanity in general, by using your creativity to explain difficult subjects to wider audiences. Crap produced by the Rupert Murdoch's of this world has done nothing to improve the state of world affairs (it has done a lot to make the Rupert Murdoch's of this world richer and more influential though) - use your power to do some good instead.

Walsh-Hadamart transformations in supercollider

Problem

Practically the whole world is using Fourier Transforms to decompose sounds into sums of sine waves. The Fourier transform then can be edited, and transformed back to the time domain to hear the effects of the editing. One question that naturally arises is if perhaps ways exist to decompose sounds as sums of something other than sine waves.

Approach

Well, as it turns out there are infinitely many ways to decompose signals into sums of other signals, and one that personally intrigues me is the Walsh-Hadamard transform which decomposes signals into sums of pulses (square waves). Walsh functions were already known and used around 1890, but it took until 1923 for Walsh to formally examine them in a mathematical context. Most of the work on Walsh functions in the context of audio processing was done in the nineteen-seventies. The fact that it is no longer popular may indicate that the results were nothing spectacular, but that shouldn't stop us from experiencing it first-hand.

Walsh functions

With the help of mathematics it can be established that we need only a subset of all possible square waves to decompose and reconstruct any signal. These waves are now known as "Walsh functions" and they can be ordered by sequency. (Note: not frequency!). Sequency is a number that corresponds the number of zero-crossings the pulse makes in the time base. Here's a representation of some Walsh functions ordered by their sequency. Sequency is not expressed in cps (cycles per second, also known as Hz), but in zps (zero-crossings per second).

If the signal is only 2 samples long, 2 Walsh functions suffice to perfectly reconstruct any such signal. If the signal is 4 samples long, 4 Walsh functions suffice to perfectly reconstruct any such signal. In general, for a signal of 2^L long, you need to combine up to 2^L Walsh functions to perfectly reconstruct any such signal. This is similar to the discrete Fourier transform where the original signal and the transformed signal both have the same length.

Walsh functions always start with +1 as their first component. For completeness I should probably mention that the even sequencies are sometimes called CAL functions, whereas the odd sequencies are called SAL functions.

WAL(2n, t) = CAL(n, t) n = 1, 2, ...
WAL(2n-l, t) = SAL(n, t) n = 1, 2, ...

As with sines and cosines, CAL and SAL are in essence time-shifted versions of each other.

Here's an example of decomposing a 4 sample signal into a linear combination of Walsh functions:

The signal [-1, 1, 0, -2] can be decomposed using level 2 sequencies:

[-1,1,0,-2] = -0.5*[1,1,1,1] + 0.5*[1,1,-1,-1] -1*[1,-1,-1,1] + 0*[1,-1,1,-1]

or in words:

[-1,1,0,-2] = -0.5*sequency0 + 0.5*sequency1 - 1*sequency2 + 0*sequency3

Note that this formula shows how to convert from Walsh spectrum back to time domain using the (known) sequencies.

Ok, so how did you find that combination? Can you always do this? Is there always only one possible combination?

I'll skip the mathematics, but yes, it can always be done, and there's always exactly one possible decomposition. Some very smart people invented an efficient way to find this decomposition, and the efficient way is known as the "Fast Walsh Transform". Explaining the transform in detail, however, is way out of bounds for this article. Please refer to some external reference for the juicy details.

Looking at the example above, the fast Walsh transform of [ -1, 1, 0, -2] should give [ -0.5, 0.5, -1, 0].

And we can go back from [ -0.5, 0.5, -1, 0] to [ -1, 1, 0, -2] by using the sequencies as we did in the example above.While we did the calculations by hand in the example above, the fast Walsh transform actually has a wonderful property: it is its own inverse (except for some constant factor).

This means that if you apply the Walsh transform to a signal, you get the Walsh spectrum, and if you then apply the Walsh transform to that Walsh spectrum again, you get the original signal again (apart from some constant factor).

What does any of this have to do with audio or supercollider?

Well, in itself nothing really. But we can propose some experiments with this transform.

Remember that what we are really doing here is decomposing signals (think: sounds) into weighted sums of Walsh functions (think: square waves). Square waves happen to be a basic waveform used in subtractive synthesis (think analog synths!) so now it starts to sound kind of interesting doesn't it?

We already have a way to decompose any sound into a sum of square waves, and to go back from these square waves to the original signal (time domain). What if, just before we go back to the time domain, we modify the Walsh spectrum first?

What is the effect on a sound of removing all the fast square waves (= high sequencies)? (For lack of a better word, you could call it a Walsh-Low-Pass-Filter). What is the effect on a sound of removing the slow square waves (= low sequencies)? (kind of Walsh-High-Pass-Filter). What is the effect on a sound of setting all Welsh spectrum values to 0 if they happen to be smaller than some threshold? (This could be the core algorithm of some lossy data compression scheme). What is the effect on a sound of shifting the Welsh spectrum values to the left/right (a kind of Walsh-Pitch-Shifting). Can we synthesize interesting sounds by making up a new Walsh spectrum (kind of additive synthesis with square waves)? What does a walsh filter sweep sound like? What do you get if you reinterpret the Walsh spectrum as a Fourier spectrum or vice versa? Can useful/beautiful visualizations be derived from the Walsh spectrum?

It is to be expected that the auditory results will be wildly different from what we are used to hearing in transformations based on the Fourier transform (classical high-pass and low-pass filters e.g.), but that should be all the more reason to try it out, shouldn't it? Maybe you can think of really cool new applications made possible by using the Walsh transform in audio context? If so, be sure to comment :)

Walsh transform in supercollider

Before we can play with audio, we need a way to calculate the fast Walsh transform in supercollider. Since I don't know how to write UGEN's yet, I will do some calculations in the language for now.

Here's a pretty straightforward translation of this c implementation:
This code is also available on https://sccode.org/1-5bD.

If we evaluate the following code in scide:

~walsh_transform.(values:[-1,1,0,2]);

we get back the expected result:

[0.5,-0.5,0,-1]

And to check that the inverse transform works as expected:

~walsh_transform.(values:~walsh_transform.(values:[-1,1,0,2]), rescale:false);

gives the original signal:

[-1,1,0,2].

Bring on the sounds!

This walsh, this walsh, this walsh, this walsh...

My first interest is in hearing the timbres of the Walsh functions. by themselves. So let's listen to some of those. I'll take the 256 Walsh functions from level 8 (calculate them by applying the inverse walsh transform on a spectrum containing a single "1"), and concatenate 100 copies of each into a (stereo) buffer and then play the buffer.

The resulting tones pretty much sound like pulsewidth modulated pulse waves because, obviously, they *are* pulse waves.

This article has been more than long enough for now. If there's any interest in the subject I may prepare a follow-up article in which we experiment with Walsh-transform based filters.

Sunday, July 22, 2018

Using the midi tuning specification (MTS) standard in supercollider

Problem?

I have a hardware synthesizer Dave Smith Instruments Rev2 ( (aka DSI Rev2)with support for MTS (=midi tuning specification) and I'm intrigued by its possibilities. How can I reprogram my synth's frequencies from supercollider? In a next post I may or may not (depending on how fast I can solve some annoying bugs :) ) explain how to download scala tunings and keyboard mappings with the Rev2 from supercollider, but for now let's concentrate on the low-level MTS messages. Most of the information in the article explain calculations, so it should also be usable for anyone trying to use MTS from any programming language.

Approach?

It will be explained how to assign random frequencies to all midi notes by composing a valid sysex message containing bulk tuning information. The explanation is somewhat math heavy, but every step is completely detailed which should help you implement something in your favorite programming language.

MTS?

Midi tuning specification is a protocol that allows users of synthesizers to assign any frequency to any midi note. You can reverse your keyboard if you like, or you can assign random frequencies to all keys, or you can explore the wonderful world of microtonality and xenharmonics. The way to assign any frequency to any midi note is by sending midi sysex messages.

Sysex?

Sysex is a hook in the midi protocol that allows for sending anything over midi. "Anything" is to be interpreted as follows: "anything that fullfills the demands made on bytes sent over MIDI, i.e. the highest bit is always 0". This constraint means that in some cases data has to be encoded in a form that is suitable to be sent over midi, and this is also the case for MTS messages. We'll go over the calculations in excruciating detail in the rest of the article.

With some exceptions, sysex messages are not standardized. Any manufacturer can implement its own sysex messages. DSI, e.g. uses sysex to perform OS updates on their instruments. Such sysex messages typically are not publicly documented.

The sysex messages used for sending tuning information, on the other hand, are standardized. They come in a few flavors. There's a message to request an synth to send over its tuning to a computer, but the DSI instruments appear not to respond to those messages, so I guess they are not supported. There's another message to send tuning information from the computer to a synth, and that works well. Still other messages exist to retune a few notes in real-time, but I haven't tried those yet.

MTS message structure

MTS message structure is not very complicated, but it takes a bit of explanation. The full explanation can be read in the midi tuning standard spec, but it might contain some mistakes (currently under investigation).

First of all let's explain the general message layout.

F0 F7 7E 7F 08 01 IDX C1 C2 .. C16 TA1_0 TA2_0 TA3_0 .. TA1_127 TA2_127 TA3_127 CS F7

F0 F7 appears at the start of any sysex message
7E specifies that it's a non-real-time message.
08 specifies that we're about to send tuning information
01 specifies that we're about to send a bulk request, i.e. tuning information for all midi notes
IDX specifies a 0-based tuning index, this corresponds to the tuning index selected in DSI's global parameters menu, alternative tunings menu entry
C1 .. C16 are 7-bit ascii codes for the characters in the name of the custom tuning
TA1_0 TA2_0 TA3_0...TA1_127 TA2_127 TA3_127 are the actual tuning information. Tuning information for each midi note is described using three bytes, to be explained below.
CS is a checksum, also explained below
F7 signals the end of the sysex message

TA bytes tuning information structure

How do the three bytes TA1, TA2, TA3 encode tuning information for a midi note? Well, the first of the three bytes, TA1 is a number that says which lower midi note number is closest to the desired frequency, if the instrument would be tuned in 12-Tone equal temperament (i.e. freq = 2^{(midinote-69)/12}*440 Hz or vice versa midinote = 12*log₂(freq/440 Hz)) + 69, where midinote is a number 0-127, and freq is a float.

The bytes TA2 and TA3 then encode the difference in "cents" between the frequency indicated by the midi note number in TA1 and the desired frequency. The difference in cents between two frequencies f1 and f2 can be calculated as n_c = 1200*log₂(f2/f1)), but supercollider provides some built-in functions to hide these formulas.

In the MTS message, 128 such triples TA1,TA2,TA3 are sent, one for each midi note.

So if you wanted to retune the 50th midi note to the value of 670 Hz, you'd have to put as 50th triplet in the message the following bytes:

the closest, lower midi number corresponding to 670 would be 76 (midi note 76, on a 12-TET tuned instrument, corresponds to a frequency of 2^(76-69)/12*440 = 659.26 Hz ). The way to calculate this 76 is by taking the floor of the formula that transforms frequency to midi note number: floor(12*log2(670/440 Hz) + 69) = 76. In hexadecimal, 76 is 4C.

TA1_50 = 76 (or hex: 0x4C)

the difference in cents between 659.26Hz and 670Hz would then be 1200*log₂(670/659.26) = 27.98908618457 cents. This difference in cents will always be < 100, because the reference frequency is the closest lower midi note (and the difference between two successive midi notes in 12 TET, is always exactly 100 cents).
We have 2 bytes TA2 and TA3 to encode this number 27.98908618457. Because in midi bytes, the high bit is reserved to indicate commands instead of data, that gives us a total of 14 bits (= 0 to 2^14-1) to encode a number in the interval [0, 100] (I include 100 because the difference in cents might e.g. be 99.995). In other words, you get a resolution of 100 cents / (2^14) = 0.0061 cents. That is what we call super high precision :)
If you map number 27.98908618457 from the interval [0,100] to the interval [0, (2^14-1)], by means of the formula 27.98908618457 * (2^14-1)/100 ,you get 4585.4519896181, or after rounding 4585. This 4585 is the number that must be encoded in TA2 and TA3. This happens with bit masking operations:

TA2_50 = 4585 >> 7 = 35 (or hex: 0x23)

TA3_50 = 4585 & 127 = 105 (or hex: 0x69)

Checksum calculation

The checksum is calculated as a XOR between all the bytes in the message, except for the sysex header (F0 F7), the checksum itself (obviously) and the end of sysex byte at the end (7F).
Gotcha: at the end of the checksum calculation, be sure to "and" the result with 127 (=0x7F) to mask out the highest bit, otherwise you end up with an invalid MIDI message. This is not explicitly mentioned in the MTS standard, but it's implicitly assumed for all MIDI data bytes.

Gotchas

Beware of limited precision of floating point operations. In some cases, it may lead to unexpected results and invalid sysex messages as a result. I encountered such problems in my supercollider implementation of the tuning message and put in a workaround.
Be sure to check and double check that midi note numbers used as TA1 do not become negative or >127. Failing to do so will cause you hours of debugging (I should know by now!) to find out why sometimes - seemingly out of the blue - the synthesizer does unexpected and weird things when sending a new tuning (like randomly switching to a different program and changing some LFO settings) or why sending a new tuning doesn't seem to have any effect.
The MIDI MTS spec states that sending a tuning value with TA1, TA2, TA3 = 0x7F 0x7F 0x7F should be interpreted by the synth as "do not change tuning", but it seems that DSI instruments interpret it as "tune to the highest possible frequency" instead. For this reason the code currently just puts the standard note where no note should be present.

Supercollider code

Note that the following screenshot is part of a supercollider class, part of a larger system to communicate with the DSI prophet rev2 hardware synthesizer (the code can be found here: https://github.com/shimpe/sc-prophet-rev2 ), so you'd have to change some minor syntax details to get it working as a regular function (click the image to enlarge, or - even better - copy paste it from github (which may have a more recent version): https://github.com/shimpe/sc-prophet-rev2/blob/master/Classes/ScalaCalculator.sc).