Steven Tattersall's personal blog
Created on: 2017-07-15
Posted in programming projects atari 68000 tim-follin chiptune
I can't remember the reason why now, but I've ended up looking at Tim Follin's Atari ST sound driver and reverse-engineering it. I'm convinced many people have been down this path before, but I did it just because it interested me.
If you don't know about Tim Follin, have a quick Google and try listening to this or any of his other ST tunes here. Tim Follin wrote this tune in about 1988 (when he was about 20).
To do the analysis, I've been using the version of the driver from LED Storm, originally ripped for the SNDH archive by Grazey/PHF. Many thanks to him!
The disassembled code was quite easy to work out and annotate. You can find my documented and buildable version of the driver at https://github.com/tattlemuss/folly.
If you don't know what the YM-2149 sound chip of the Atari ST can do... it's basically nothing. It boasts:
- 3 square-wave tone channels
- a single noise generator, shared between the channels
- a single hardware envelope generator (the "buzzer"), also shared
- 16 volume levels per channel
There are a few interesting things to say about this driver, but what strikes me the most is how primitive it is in some ways, and sophisticated in others. The use of envelopes and pitch modifications is very simple, for example. There is no use of the buzzer. The drum sounds are very basic. If you compare it to a modern YM driver tune by someone like Tao, for example, it's clearly from a different era. But I love its melodic feel.
In terms of YM "features", this driver has the following:
- simple attack/decay/hold volume envelopes per note
- arpeggios
- glissando slides between notes
- vibrato
- transpose and detune
- direct register writes, mainly used for noise-based drum sounds
That's more or less it. It's very simple.
This is not a criticism; the state-of-the-art in terms of pushing the hardware was still in its infancy. Indeed, it's amazing what can be produced by so little.
The routine updates the YM chip at a rate of 50Hz (once per PAL vblank). That was the standard interval for ST music at the time. Modern routines often update at rates of up to 200Hz, taking more processor time. This routine usually runs in about 10 scanlines, or 5,000 cycles -- roughly 1/30 of the available processor time in a vblank.
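To put that budget in concrete terms, here is a quick back-of-the-envelope calculation in Python. The 8MHz CPU clock is the standard ST figure; the 5,000-cycle player cost is the estimate from the paragraph above.

```python
# Rough CPU-budget arithmetic for a 50Hz player on an 8MHz 68000 (PAL ST).
CPU_HZ = 8_000_000                        # Atari ST CPU clock
UPDATE_HZ = 50                            # one update per PAL vblank
CYCLES_PER_FRAME = CPU_HZ // UPDATE_HZ    # 160,000 cycles available per frame
PLAYER_CYCLES = 5_000                     # estimated cost of one driver update

fraction = PLAYER_CYCLES / CYCLES_PER_FRAME
print(f"{CYCLES_PER_FRAME} cycles/frame, player uses ~{fraction:.1%}")
# prints: 160000 cycles/frame, player uses ~3.1%
```

That works out to about 1/32 of a frame, consistent with the "roughly 1/30" figure.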
What's more interesting is the sequencing and programmatic arrangement of the tracks. A lot of chiptune music is done in the "tracker" paradigm: you get a palette of "instruments" you can define, then you can design "patterns" of say 64 steps, inserting notes into the pattern's grid at given steps. Then there are fixed commands you can have on each step to modify the sound. For example you can change volume or set an arpeggio per step.
This driver isn't like that; each channel is a standalone sequence of "commands", where a command can be a note, or any number of modifiers or sequencing steps. There is no concept of "instrument"; envelope, arpeggio, slide etc can be set independently of each other, and several can be changed at the same step.
More powerfully there is flow control: you can loop a sequence, and even have a stack of "subroutines" to reuse common sequences and return back after completion. Since all channels run independently, they can be running overlapping stacks of sequences. You'll see a couple of examples of this below.
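To make the flow control concrete, here is a minimal Python sketch of a channel interpreter with a loop and a gosub/return stack. The opcode values, the count operand after start_loop, and the subroutine table are all invented for the example; the real driver's byte format is described in the command list below.

```python
# Hypothetical opcode values for illustration only.
NOTE, START_LOOP, END_LOOP, GOSUB, RETURN, STOP = range(6)

def run_channel(data, subs):
    """Interpret one channel's command stream.

    data: main command stream (list of ints)
    subs: dict mapping a subroutine id to its command stream
    """
    pc, stream = 0, data
    stack = []                    # (stream, return pc) pairs, like the driver's gosub stack
    loop_pc, loop_count = 0, 0    # loops are not stacked, matching the driver
    played = []
    while True:
        op = stream[pc]; pc += 1
        if op == NOTE:
            played.append(stream[pc]); pc += 1
        elif op == START_LOOP:
            loop_count = stream[pc]; pc += 1   # assumed: repeat count follows the opcode
            loop_pc = pc
        elif op == END_LOOP:
            loop_count -= 1
            if loop_count > 0:
                pc = loop_pc
        elif op == GOSUB:
            target = stream[pc]
            stack.append((stream, pc + 1))     # push return address
            stream, pc = subs[target], 0
        elif op == RETURN:
            stream, pc = stack.pop()
        elif op == STOP:
            return played

drum = [NOTE, 60, RETURN]
main = [START_LOOP, 2, NOTE, 36, GOSUB, 0, END_LOOP, STOP]
print(run_channel(main, {0: drum}))   # [36, 60, 36, 60]
```

The bassline-plus-drum example later in this post follows exactly this shape: notes interleaved with gosub calls into a shared drum subroutine.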
The command list is better documented in the driver code itself, but here is a quick list of all the commands the player supports. Everything else in the channel data is a list of note values, potentially followed by a number of frames to wait before the next update (see set_default_note_time below for more details.)
So the usual flow is:
- read and execute commands for a channel until a note is reached
- play the note, waiting the given number of frames
- carry on reading commands from where we left off
If this is a bit too involved, skip down to the "Visualising" part.
0 - start_loop: Start a loop from this point. Loops are not stacked.
1 - end_loop: Decrement the counter and, if not zero, go back to the loop point.
2 - set_default_note_time: If not zero, all following notes take this value as their duration. If zero, all following notes carry an extra byte with the note's duration.
3 - stop: Stop playback of the channel.
4 - gosub: Push the current command address on a stack and start processing from a new address. Next 2 bytes: offset of the subroutine from the start of the tune data (little-endian).
5 - return: Pop the return address off the stack and continue processing from the popped address.
6 - set_transpose: Next byte: number of semitones to transpose all following notes. Signed 8-bit value.
7 - set_raw_detune: Next byte: raw value to add to the final note period in YM register space. Unsigned 8-bit value.
8 - direct_write: Write a value directly to the YM registers. This is often used to write noise pitch. The "mixer" register, register 7, is treated differently: it combines settings for all 3 channels A, B, C to determine whether they use the square or noise channel, so the driver ensures that only the bits relevant to the active channel are set and cleared.
9 - set_adsr: Sets the note envelope. This takes 3 bytes and contains the attack and decay speeds, the minimum and maximum volume levels after attack or decay, and which stage to start in (attack, decay, or hold).
10 - set_adsr_reset: Next byte: if zero, moving to a new note does not reset the ADSR; otherwise the ADSR is reset. (A zero value is usually used to define complex arpeggio sequences.)
11 - set_arpeggio: Sets the semitone note offsets of the arpeggio and the times they are held for.
12 - set_slide: Set the number of semitones to jump per update when applying glissando between notes. Usually set to 1.
13 - set_vibrato: Set the delay, size, speed and starting direction of the vibrato effect.
14 - skip_transpose: For the next note only, don't apply transpose. Usually used for drums mixed in with bassline notes.
15 - set_fixfreq: Force using a fixed frequency, defined in the next 3 bytes (mixer and period low/high), or turn it off by using a single zero byte.
16 - jump: Jump to a new offset. Used for the infinite looping of tunes.
17 - set_mute_time: Turn the channel off after N more updates.
18 - set_nomute: If set to non-zero, suppress the automatic muting. The starting commands of a channel usually set this to 0xff.
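The envelope behaviour that set_adsr describes can be sketched as a simple state machine stepping once per 50Hz update. This is an illustration only: the real driver packs the parameters into 3 bytes, and the exact stepping rules here are simplified guesses.

```python
# Simplified sketch of an attack/decay/hold volume envelope, stepped
# once per update. Parameter names and stepping rules are illustrative,
# not the driver's real 3-byte format.
def adsr(peak, floor, attack_spd, decay_spd, frames):
    vol, stage, out = 0, "attack", []
    for _ in range(frames):
        if stage == "attack":
            vol = min(vol + attack_spd, peak)   # rise towards peak volume
            if vol == peak:
                stage = "decay"
        elif stage == "decay":
            vol = max(vol - decay_spd, floor)   # fall towards floor volume
            if vol == floor:
                stage = "hold"                  # then hold until the next note
        out.append(vol)
    return out

print(adsr(peak=15, floor=0, attack_spd=5, decay_spd=1, frames=8))
# [5, 10, 15, 14, 13, 12, 11, 10]
```

A drum sound like the one in the log below would use a fast decay from full volume, giving the sharp percussive fade.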
Once we've worked out the format of the data, we can do some more interesting things with it.
I wrote a very hacky Python script called "reader.py", included in the GitHub repo, which loads the tune data from LED Storm then parses the commands in each tune in a similar way to the original player. It then outputs a text log of what happens, and a .png file of the notes played over time. We can use it to see what's going on more clearly (and to test that our analysis is correct.)
Here's an example: the command stream for a section where a bassline and drum sounds play on the same channel. My annotations are the lines beginning with "#".
####################################################
#
# Play a couple of bass notes, with different delays in between
#
time:0000 $1497 Note: G#-1 {0x14} time:6
time:0006 $1499 Note: G#-2 {0x20} time:12
time:0018 $149b Note: D#-2 {0x1b} time:6
# Call a subroutine to play the drum only....
time:0024 $149d {4} -> gosub($c4, $14) Gosub 14c4
####################################################
# Play the drum.
#
# Turn on both square wave and noise for this channel (direct to YM)
time:0024 $14c4 {8} -> direct_write($7, $0) Write $0 to reg 'mixer'
# Don't apply transpose to the next note
time:0024 $14c7 {e} -> skip_transpose()
# Set a sharp decay (drum waveform)
time:0024 $14c8 {9} -> set_adsr($f0, $1, $1) Volume 15 -> 0 Attackspd: 0 Decayspd: 1 start step; 1
# Set the noise frequency (direct to YM)
time:0024 $14cc {8} -> direct_write($6, $b) Write $b to reg 'noise_freq'
# Add a sharp downwards vibrato
time:0024 $14cf {d} -> set_vibrato($1, $14, $0, $1)
# Play the note!
time:0024 $14d4 Note: D -3 {0x26} time:12
# ... then reset all these settings back to what they were
time:0036 $14d6 {8} -> direct_write($6, $5) Write $5 to reg 'noise_freq'
time:0036 $14d9 {d} -> set_vibrato($2, $3, $3, $0)
time:0036 $14de {8} -> direct_write($7, $10) Write $10 to reg 'mixer'
time:0036 $14e1 {9} -> set_adsr($e0, $1, $1) Volume 14 -> 0 Attackspd: 0 Decayspd: 1 start step; 1
time:0036 $14e5 {5} -> return() Return to 14a0
####################################################
#
# Play some more bassline
#
time:0036 $14a0 Note: G#-2 {0x20} time:6
time:0042 $14a2 Note: G#-1 {0x14} time:12
time:0054 $14a4 Note: D#-2 {0x1b} time:6
time:0060 $14a6 Note: G#-1 {0x14} time:12
# Another drum...
time:0072 $14a8 {4} -> gosub($c4, $14) Gosub 14c4
Ultimately, all of the tunes are constructed in this way, using repeated building blocks and modifiers placed around them. It's a very programmatic approach. I am assuming the tunes were written in a code editor, since creating a UI for this would be very time-consuming.
Often this is much clearer to understand using the graphical visualisation.
The image below is a representation of LED Storm's main tune (the one I linked to above). Each channel (A/B/C) in the tune is coloured (red/green/yellow). Read the lines across in rows to follow time.
The green channel ends up as the main melody, yellow the bass and drums, red the "fill" arpeggios. Darker colouration is a simple attempt to represent the envelope volume.
In this example the "up and down" scales at the start of this tune are the same sequenced subroutine on each channel, but delayed and with several different note transpositions and envelopes applied. In fact almost anything that looks repetitive in the image is a shared subroutine.
See if you can follow the notes while listening to the MP3 recording again.
The code itself is pretty straightforward and is contained in LEDSTOR2.S. I have documented it fairly thoroughly and used defines to make the data offsets more readable.
There are a couple of routines to initialise either a tune or a sound effect, and the main update function is labelled "follin_update". Its job is to run through each channel, updating the shape of the active note and potentially processing commands, before looping to the next channel.
One very strange thing is that all data offsets from the start of the tune data are little-endian, even though the 68000 is big-endian. I suspect that is because the tunes were originally composed on a different machine (possibly the Spectrum) and it was more convenient to keep that format.
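You can see this in the log above: the gosub operand bytes $c4, $14 decode little-endian to offset $14c4. A one-liner with Python's struct module confirms it:

```python
import struct

# The gosub operand bytes $c4, $14 from the log decode little-endian
# ('<H' = unsigned 16-bit, little-endian) to offset $14c4.
lo, hi = 0xc4, 0x14
offset = struct.unpack('<H', bytes([lo, hi]))[0]
print(hex(offset))   # 0x14c4
```

On a big-endian format the same two bytes would have meant $c414 instead.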
The other unusual point is how the per-channel data is stored. Data for the channels is interleaved; for example, the volume for the three channels is stored at byte offsets -20, -19 and -18 relative to the "a6" register, which points to the middle of the block. This means that for single-byte values you see a lot of instructions like
ADDQ.B #1,o_volume(A6,D6.W)
where "A6" points to the storage area and "D6" is an index incremented per channel. This type of addressing is quite slow, and a different index register is needed depending on whether the data item is 1, 2 or 4 bytes in size (D6, D5 and D4 are used, respectively).
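The interleaved layout can be mimicked in Python to make the addressing clearer. The field offsets and block size here are invented for illustration; they mirror the o_volume(A6,D6.W) pattern rather than the driver's real offsets.

```python
# Interleaved per-channel state: one block holds all three channels,
# and a field is addressed as base_offset + channel_index, mirroring
# the 68000's o_volume(A6,D6.W) indexed addressing. Offsets are made up.
O_VOLUME = 0          # base offset of the volume field within the block
O_NOTE = 3            # next field starts 3 bytes later: one slot per channel

state = bytearray(6)  # [volA, volB, volC, noteA, noteB, noteC]

def bump_volume(chan):
    # Python equivalent of: ADDQ.B #1,o_volume(A6,D6.W)
    state[O_VOLUME + chan] += 1

for chan in range(3):
    bump_volume(chan)
print(list(state[:3]))   # [1, 1, 1]
```

With a de-interleaved layout each channel would instead get its own contiguous block, and the channel index would be folded into the base pointer up front rather than into every access.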
It would be simpler to de-interleave the data, and do something like:
ADDQ.B #1,o_volume(A6)
Again, I suspect this is because the code is a port, and it's simpler to maintain code across multiple platforms if you keep them similar. Also, the musicians of the 1980s were knocking out a lot of stuff very fast; they didn't have time for multiple rewrites!
Anyway, feel free to use this information however you want. Theoretically it's now possible to write your own tunes with Tim's driver, or optimise the player. Have fun!