Problem
Practically the whole world is using Fourier Transforms to decompose sounds into sums of sine waves. The Fourier transform then can be edited, and transformed back to the time domain to hear the effects of the editing. One question that naturally arises is if perhaps ways exist to decompose sounds as sums of something other than sine waves.
Approach
Well, as it turns out there are infinitely many ways to decompose signals into sums of other signals, and one that personally intrigues me is the Walsh-Hadamard transform which decomposes signals into sums of pulses (square waves). Walsh functions were already known and used around 1890, but it took until 1923 for Walsh to formally examine them in a mathematical context. Most of the work on Walsh functions in the context of audio processing was done in the nineteen-seventies. The fact that it is no longer popular may indicate that the results were nothing spectacular, but that shouldn't stop us from experiencing it first-hand.
Walsh functions
With the help of mathematics it can be established that we need only a subset of all possible square waves to decompose and reconstruct any signal. These waves are now known as "Walsh functions" and they can be ordered by sequency. (Note: not frequency!). Sequency is a number that corresponds the number of zero-crossings the pulse makes in the time base. Here's a representation of some Walsh functions ordered by their sequency. Sequency is not expressed in cps (cycles per second, also known as Hz), but in zps (zero-crossings per second).
If the signal is only 2 samples long, 2 Walsh functions suffice to perfectly reconstruct any such signal. If the signal is 4 samples long, 4 Walsh functions suffice to perfectly reconstruct any such signal. In general, for a signal of 2^L long, you need to combine up to 2^L Walsh functions to perfectly reconstruct any such signal. This is similar to the discrete Fourier transform where the original signal and the transformed signal both have the same length.
Walsh functions always start with +1 as their first component. For completeness I should probably mention that the even sequencies are sometimes called CAL functions, whereas the odd sequencies are called SAL functions.
WAL(2n, t) = CAL(n, t) n = 1, 2, ...
WAL(2n-l, t) = SAL(n, t) n = 1, 2, ...
As with sines and cosines, CAL and SAL are in essence time-shifted versions of each other.
Here's an example of decomposing a 4 sample signal into a linear combination of Walsh functions:
If the signal is only 2 samples long, 2 Walsh functions suffice to perfectly reconstruct any such signal. If the signal is 4 samples long, 4 Walsh functions suffice to perfectly reconstruct any such signal. In general, for a signal of 2^L long, you need to combine up to 2^L Walsh functions to perfectly reconstruct any such signal. This is similar to the discrete Fourier transform where the original signal and the transformed signal both have the same length.
Walsh functions always start with +1 as their first component. For completeness I should probably mention that the even sequencies are sometimes called CAL functions, whereas the odd sequencies are called SAL functions.
WAL(2n, t) = CAL(n, t) n = 1, 2, ...
WAL(2n-l, t) = SAL(n, t) n = 1, 2, ...
As with sines and cosines, CAL and SAL are in essence time-shifted versions of each other.
Here's an example of decomposing a 4 sample signal into a linear combination of Walsh functions:
The signal [-1, 1, 0, -2] can be decomposed using level 2 sequencies:
[-1,1,0,-2] = -0.5*[1,1,1,1] + 0.5*[1,1,-1,-1] -1*[1,-1,-1,1] + 0*[1,-1,1,-1]
or in words:
[-1,1,0,-2] = -0.5*sequency0 + 0.5*sequency1 - 1*sequency2 + 0*sequency3
Note that this formula shows how to convert from Walsh spectrum back to time domain using the (known) sequencies.
[-1,1,0,-2] = -0.5*[1,1,1,1] + 0.5*[1,1,-1,-1] -1*[1,-1,-1,1] + 0*[1,-1,1,-1]
or in words:
[-1,1,0,-2] = -0.5*sequency0 + 0.5*sequency1 - 1*sequency2 + 0*sequency3
Note that this formula shows how to convert from Walsh spectrum back to time domain using the (known) sequencies.
Ok, so how did you find that combination? Can you always do this? Is there always only one possible combination?
I'll skip the mathematics, but yes, it can always be done, and there's always exactly one possible decomposition. Some very smart people invented an efficient way to find this decomposition, and the efficient way is known as the "Fast Walsh Transform". Explaining the transform in detail, however, is way out of bounds for this article. Please refer to some external reference for the juicy details.
Looking at the example above, the fast Walsh transform of [ -1, 1, 0, -2] should give [ -0.5, 0.5, -1, 0].
And we can go back from [ -0.5, 0.5, -1, 0] to [ -1, 1, 0, -2] by using the sequencies as we did in the example above.While we did the calculations by hand in the example above, the fast Walsh transform actually has a wonderful property: it is its own inverse (except for some constant factor).
This means that if you apply the Walsh transform to a signal, you get the Walsh spectrum, and if you then apply the Walsh transform to that Walsh spectrum again, you get the original signal again (apart from some constant factor).
What does any of this have to do with audio or supercollider?
Well, in itself nothing really. But we can propose some experiments with this transform.
Remember that what we are really doing here is decomposing signals (think: sounds) into weighted sums of Walsh functions (think: square waves). Square waves happen to be a basic waveform used in subtractive synthesis (think analog synths!) so now it starts to sound kind of interesting doesn't it?
We already have a way to decompose any sound into a sum of square waves, and to go back from these square waves to the original signal (time domain). What if, just before we go back to the time domain, we modify the Walsh spectrum first?
What is the effect on a sound of removing all the fast square waves (= high sequencies)? (For lack of a better word, you could call it a Walsh-Low-Pass-Filter). What is the effect on a sound of removing the slow square waves (= low sequencies)? (kind of Walsh-High-Pass-Filter). What is the effect on a sound of setting all Welsh spectrum values to 0 if they happen to be smaller than some threshold? (This could be the core algorithm of some lossy data compression scheme). What is the effect on a sound of shifting the Welsh spectrum values to the left/right (a kind of Walsh-Pitch-Shifting). Can we synthesize interesting sounds by making up a new Walsh spectrum (kind of additive synthesis with square waves)? What does a walsh filter sweep sound like? What do you get if you reinterpret the Walsh spectrum as a Fourier spectrum or vice versa? Can useful/beautiful visualizations be derived from the Walsh spectrum?
It is to be expected that the auditory results will be wildly different from what we are used to hearing in transformations based on the Fourier transform (classical high-pass and low-pass filters e.g.), but that should be all the more reason to try it out, shouldn't it? Maybe you can think of really cool new applications made possible by using the Walsh transform in audio context? If so, be sure to comment :)
Here's a pretty straightforward translation of this c implementation:
This code is also available on https://sccode.org/1-5bD.
If we evaluate the following code in scide:
Remember that what we are really doing here is decomposing signals (think: sounds) into weighted sums of Walsh functions (think: square waves). Square waves happen to be a basic waveform used in subtractive synthesis (think analog synths!) so now it starts to sound kind of interesting doesn't it?
We already have a way to decompose any sound into a sum of square waves, and to go back from these square waves to the original signal (time domain). What if, just before we go back to the time domain, we modify the Walsh spectrum first?
What is the effect on a sound of removing all the fast square waves (= high sequencies)? (For lack of a better word, you could call it a Walsh-Low-Pass-Filter). What is the effect on a sound of removing the slow square waves (= low sequencies)? (kind of Walsh-High-Pass-Filter). What is the effect on a sound of setting all Welsh spectrum values to 0 if they happen to be smaller than some threshold? (This could be the core algorithm of some lossy data compression scheme). What is the effect on a sound of shifting the Welsh spectrum values to the left/right (a kind of Walsh-Pitch-Shifting). Can we synthesize interesting sounds by making up a new Walsh spectrum (kind of additive synthesis with square waves)? What does a walsh filter sweep sound like? What do you get if you reinterpret the Walsh spectrum as a Fourier spectrum or vice versa? Can useful/beautiful visualizations be derived from the Walsh spectrum?
It is to be expected that the auditory results will be wildly different from what we are used to hearing in transformations based on the Fourier transform (classical high-pass and low-pass filters e.g.), but that should be all the more reason to try it out, shouldn't it? Maybe you can think of really cool new applications made possible by using the Walsh transform in audio context? If so, be sure to comment :)
Walsh transform in supercollider
Before we can play with audio, we need a way to calculate the fast Walsh transform in supercollider. Since I don't know how to write UGEN's yet, I will do some calculations in the language for now.Here's a pretty straightforward translation of this c implementation:
This code is also available on https://sccode.org/1-5bD.
If we evaluate the following code in scide:
~walsh_transform.(values:[-1,1,0,2]);we get back the expected result:
[0.5,-0.5,0,-1]And to check that the inverse transform works as expected:
~walsh_transform.(values:~walsh_transform.(values:[-1,1,0,2]), rescale:false);gives the original signal:
[-1,1,0,2].
Bring on the sounds!
This walsh, this walsh, this walsh, this walsh...
My first interest is in hearing the timbres of the Walsh functions. by themselves. So let's listen to some of those. I'll take the 256 Walsh functions from level 8 (calculate them by applying the inverse walsh transform on a spectrum containing a single "1"), and concatenate 100 copies of each into a (stereo) buffer and then play the buffer.
The resulting tones pretty much sound like pulsewidth modulated pulse waves because, obviously, they *are* pulse waves.
This article has been more than long enough for now. If there's any interest in the subject I may prepare a follow-up article in which we experiment with Walsh-transform based filters.
The resulting tones pretty much sound like pulsewidth modulated pulse waves because, obviously, they *are* pulse waves.
This article has been more than long enough for now. If there's any interest in the subject I may prepare a follow-up article in which we experiment with Walsh-transform based filters.
No comments:
Post a Comment