Let's take a simple example: the ST's screen is always 32K in size, assuming no borders are removed. At 8MHz, each frame of the screen "lasts" for exactly 160,256 processor cycles (each scanline lasting 512 cycles, with 313 scanlines in total; see last issue's article for more info about scanlines). To copy one area of memory 32K in size to another as fast as possible, using only the processor, takes almost exactly this amount of time. In fact, to do this we need to access 64K of data: 32K for the source and 32K for the destination.
So simply copying a whole screen locks up the system completely. If
we want other effects or music, or want to fiddle about with what we are
copying, then it would be impossible to do it smoothly.
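
To put rough numbers on this, here is a small Python sketch of the arithmetic. The figure of 20 cycles per long word is an assumption (the usually quoted timing for a plain move.l (a0)+,(a1)+ on a 68000); cleverer copy loops would do somewhat better, so treat the result as a ballpark only.

```python
# Rough frame-time budget for copying one ST screen with the processor.
CYCLES_PER_SCANLINE = 512      # from the text
SCANLINES_PER_FRAME = 313      # from the text
CYCLES_PER_FRAME = CYCLES_PER_SCANLINE * SCANLINES_PER_FRAME

SCREEN_BYTES = 32 * 1024       # "32K" as in the text
# ASSUMPTION: 20 cycles per long word moved, as for move.l (a0)+,(a1)+
CYCLES_PER_LONG_COPY = 20

copy_cycles = (SCREEN_BYTES // 4) * CYCLES_PER_LONG_COPY
frames_needed = copy_cycles / CYCLES_PER_FRAME

print(CYCLES_PER_FRAME)          # 160256
print(copy_cycles)               # 163840
print(round(frames_needed, 2))   # 1.02
```

Under that assumption the copy eats the whole frame and a little more, which is why nothing else can happen alongside it.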
(1) There are 200 rows of pixels and 320 pixels in each row.
(2) Each pixel takes a value from 0 to 15 (hence 16 colours) and, since this is a computer, it is stored in binary format, i.e. 0000 for zero, 0001 for one, 0010 for two and so on, up to 1111 for fifteen. So to represent 16 colours we need 4 computer "bits" per pixel, 4 bits being half a computer "byte".
(3) The screen is stored in horizontal lines. If there are 320 pixels, this means there are 320 * 1/2 = 160 bytes for each line.
(4) Now the strange bit: in the memory itself we break each line up into groups of 16 pixels. Let's take the top left 16 pixels of the first row. The values of the pixels (top left first) we shall take to be 0, 1, 2, 3... 15. Or in binary:
0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111
What we do NOT do is take the 4 bits of the first pixel and the 4 bits of the second pixel and make a byte by glueing them together.
(5) Instead, we take the lowest (i.e. last) "bit" of all 16 values in the chunk and make 2 bytes by glueing them all together. If we look at the line of bit data above, then this will make 01010101-01010101. This makes up the first two bytes of the screen as it is stored in memory.
(6) We repeat this for the third, second and first bits of data in turn. In the end we have the following bit patterns:
Bytes   Bit pattern
-----   -----------------
0,1     01010101 01010101
2,3     00110011 00110011
4,5     00001111 00001111
6,7     00000000 11111111
In this way the pairs of bytes are separated from one another. Each chunk is called a "bitplane" since it refers to all the same bits of a group of many pixels.
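
Steps (4) to (6) can be modelled in a few lines of Python. This is only a sketch of the layout described above, not ST code; the function names are made up for the illustration. It assumes the leftmost pixel goes in the top bit of each word, which is what the bit patterns in the table imply.

```python
def pixels_to_bitplanes(pixels):
    """Turn 16 pixel values (0-15) into four 16-bit words, one per bitplane."""
    assert len(pixels) == 16
    planes = []
    for bit in range(4):                 # bit 0 is the lowest bit of each colour
        word = 0
        for value in pixels:             # leftmost pixel ends up in the top bit
            word = (word << 1) | ((value >> bit) & 1)
        planes.append(word)
    return planes

def bitplanes_to_pixels(planes):
    """Inverse: rebuild the 16 pixel values from the four bitplane words."""
    pixels = []
    for position in range(15, -1, -1):   # bit 15 is the leftmost pixel
        value = 0
        for bit, word in enumerate(planes):
            value |= ((word >> position) & 1) << bit
        pixels.append(value)
    return pixels

chunk = list(range(16))                  # the 0, 1, 2 ... 15 example from the text
planes = pixels_to_bitplanes(chunk)
for number, word in enumerate(planes):
    # byte numbers, then the two bytes of the word in binary
    print(number * 2, number * 2 + 1,
          format(word >> 8, "08b"), format(word & 0xFF, "08b"))
```

Run on the example chunk, this prints the same four byte pairs as the table, and bitplanes_to_pixels() turns them back into 0 to 15.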
Second, there is the cunning advantage that we can draw 16 pixels all
at once by simply altering two consecutive bytes of memory. If we take
the above example, let us alter bytes 0 and 1 to 11111111-11111111. This
will alter pixels 0, 2, 4 ... 14 all in one go. If we used the easy way to
store a screen, we would need to access all 8 bytes of data, so the bitplane
technique is quicker.
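
As a quick check of that claim, here is a Python sketch (again just a model, with a made-up decode() helper) that rewrites the first word of the example chunk and lists which pixels changed colour.

```python
def decode(planes):
    """Rebuild 16 pixel values from four bitplane words (leftmost pixel = bit 15)."""
    return [sum(((word >> (15 - x)) & 1) << bit for bit, word in enumerate(planes))
            for x in range(16)]

planes = [0x5555, 0x3333, 0x0F0F, 0x00FF]   # the 0, 1, 2 ... 15 chunk from the text
before = decode(planes)

planes[0] = 0xFFFF                          # one word write: bytes 0 and 1
after = decode(planes)

changed = [x for x in range(16) if before[x] != after[x]]
print(changed)   # [0, 2, 4, 6, 8, 10, 12, 14]
print(after)     # [1, 1, 3, 3, 5, 5, 7, 7, 9, 9, 11, 11, 13, 13, 15, 15]
```

One word written, eight pixels changed, and the other three bitplanes were never touched.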
        move.w  (a0)+,d0        ;get a word of data, stick it in d0
        move.w  d0,(a1)         ;copy to row 0
        move.w  d0,160(a1)      ;copy to row 1
        move.w  d0,320(a1)      ;copy to row 2
        move.w  d0,480(a1)      ;copy to row 3
        move.w  d0,640(a1)      ;copy to row 4
and so on, where a0 pointed to the source graphic and a1 the destination.
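
The same idea, modelled in Python rather than assembler: one word is fetched once and written at the same column of several rows, 160 bytes apart. The screen is just a bytearray here, and the function name and source word are invented for the sketch.

```python
BYTES_PER_LINE = 160                      # from the text: 320 pixels * 1/2 byte

def copy_word_to_rows(screen, offset, word, rows):
    """Write the same 16-bit word at the same column of several consecutive rows."""
    for row in range(rows):
        address = offset + row * BYTES_PER_LINE
        screen[address:address + 2] = word.to_bytes(2, "big")   # the 68000 is big-endian

screen = bytearray(BYTES_PER_LINE * 200)  # one blank screen
source = [0x5555]                         # hypothetical one-word source graphic
copy_word_to_rows(screen, 0, source[0], rows=5)
```

The point is the ratio: one read from the source for every five writes to the screen, so the graphic looks five times taller than the data behind it.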
The other big trick was to only use one bitplane (see the above explanation).
In this way, to cover a whole screen you only need to copy 8K of data onto
the screen (i.e. one quarter) and if the repeated "chunky" trick is used,
perhaps only fetch 1K of data to copy on. This is opposed to reading a
full screen of 32K from a background and writing another 32K. Hence we
have only had to copy about 15% of the data of our first example. Most demos
desperately tried to hide their 1-colour shabbiness by using gaudy rasters
(see last issue) or other cunning bitplane effects.
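
For anyone who wants to check the sums, a tiny Python sketch of the figures quoted above (the 1K fetch is the "perhaps" figure from the text, not a measurement):

```python
K = 1024
full_copy = 32 * K + 32 * K        # read a whole screen, write a whole screen
one_plane_write = 32 * K // 4      # one bitplane out of four = 8K
chunky_fetch = 1 * K               # the "perhaps only 1K" fetched from the source

cheap_copy = one_plane_write + chunky_fetch
print(cheap_copy // K)                         # 9
print(round(100 * cheap_copy / full_copy))     # 14
```

So 9K moved instead of 64K, which is where the "about 15%" comes from.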
This is all that there is to the "ROXL" scroller. Here it is in its assembler glory:
+136      +144      +152      X reg  Assembler code
--------  --------  --------  -----  --------------
00101010  10101110  10101110    0    roxl.w 152(a0)  ;do the last chunk of 16 on a row
00101010  10101110  01011100    1    roxl.w 144(a0)  ;do the chunk to its left
00101010  01011101  01011100    1    roxl.w 136(a0)  ;and left again...
01010101  01011101  01011100    0    ..etc..
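
The table can be replayed in Python to see the X flag carrying each bit from one chunk to the next. This is a model of the instruction's behaviour, not ST code; the function names are invented, and width=8 is used only to match the 8-bit patterns shown in the table (a real roxl.w works on 16 bits).

```python
def roxl(value, x_flag, width=16):
    """Rotate left by one bit through the X flag, like roxl on a memory word."""
    top_bit = (value >> (width - 1)) & 1
    result = ((value << 1) | x_flag) & ((1 << width) - 1)
    return result, top_bit            # the bit pushed out becomes the new X

def scroll_row_left(chunks, x_flag=0, width=16):
    """Scroll one bitplane row left by a pixel, working from the rightmost chunk."""
    for index in range(len(chunks) - 1, -1, -1):
        chunks[index], x_flag = roxl(chunks[index], x_flag, width)
    return x_flag

# The three chunks from the first row of the table
row = [0b00101010, 0b10101110, 0b10101110]
scroll_row_left(row, x_flag=0, width=8)
print([format(chunk, "08b") for chunk in row])
# ['01010101', '01011101', '01011100']
```

The result matches the last row of the table: every chunk has moved one pixel left, with the bit that fell off each chunk arriving at the right-hand end of its neighbour.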
Despite its simplicity, the roxl was still quite slow. What happens if your scroller moves 4 or 8 pixels left at a time? Doing the "roxl" trick 4 times meant you usually ran out of processor time, unless your scroller was slow. We needed to find a way round this, and the answer was buffering.
To demonstrate, here's an example. Each "character" in the example represents 2 pixels of the buffers we use. The splits for each 16 pixels are denoted by the spaces:
Buffer 0:  AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD EEEEEEEE
Buffer 1:  AAAAAAAB BBBBBBBC CCCCCCCD DDDDDDDE EEEEEEEF
Buffer 2:  AAAAAABB BBBBBBCC CCCCCCDD DDDDDDEE EEEEEEFF
Buffer 3:  AAAAABBB BBBBBCCC CCCCCDDD DDDDDEEE EEEEEFFF
Buffer 4:  AAAABBBB BBBBCCCC CCCCDDDD DDDDEEEE EEEEFFFF
Buffer 5:  AAABBBBB BBBCCCCC CCCDDDDD DDDEEEEE EEEFFFFF
Buffer 6:  AABBBBBB BBCCCCCC CCDDDDDD DDEEEEEE EEFFFFFF
Buffer 7:  ABBBBBBB BCCCCCCC CDDDDDDD DEEEEEEE EFFFFFFF
If in frame one we copy buffer 0 to the screen, then in frame two copy buffer 1, and so on, the scroller moves along at two pixels per frame. When we get past buffer 7, we go back to buffer 0, only this time we copy from 8*2 pixels = 16 pixels = one data word after its start, and tag a new slice of graphics on to the end of the buffer. Voila! A much faster method of scrolling. In this way, scrollers using more bitplanes or larger areas of the screen can be achieved.
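
Here is the buffering idea as a Python sketch, using the same letters-for-pixels notation as the example. It is a simplification: the whole (hypothetical) scroller strip is pre-shifted up front, rather than new slices being tagged on to the end of each buffer as the scroll goes along.

```python
CHUNK = 8                                   # characters per 16-pixel word (2 pixels each)
strip = "".join(letter * CHUNK for letter in "ABCDEFGH")   # hypothetical scroller graphic

# Buffer k holds the strip pre-shifted left by k characters (k * 2 pixels)
buffers = [strip[k:] for k in range(CHUNK)]

def visible(frame, width=5 * CHUNK):
    """What a plain word-aligned copy puts on screen for a given frame."""
    buffer = buffers[frame % CHUNK]         # which pre-shifted copy to use
    start = (frame // CHUNK) * CHUNK        # whole words to skip from its start
    return buffer[start:start + width]

for frame in (0, 1, 7, 8):
    print(frame, visible(frame)[:CHUNK * 2])
```

No shifting happens at display time at all: each frame is a straight copy from a word boundary, and the choice of buffer supplies the sub-word scroll position.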
You may well have noticed that this takes up a lot of memory, and this
is one of the main principles of demos: more memory --> faster demos. The
other main idea here is that you always appear to be doing more work than
is really the case!
The scroller goes in the 4th bitplane, that is it governs the most significant
bit of the colour of each pixel, of value %1000 or 8. If a pixel of this
bitplane is set, then the colour of that pixel MUST lie in the range %1000
to %1111 (8 to 15) no matter what lies in the other 3 bitplanes. Look at
the example below:
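Bitplane 0   0101010101010101   Picture 1
Bitplane 1   0011001100110011   Picture 2
Bitplane 2   0000111100001111   Picture 3
Bitplane 3   0000111111110000   Scroller
Resulting        1111  11
colour       0123234589014567
(Read each column of the last two rows downwards: the pixel colours are 0, 1, 2, 3, 12, 13, 14, 15, 8, 9, 10, 11, 4, 5, 6, 7.)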
Now, if we use a random palette all the colours will look ugly and you won't be able to read the scroller. But if we set all the colours 8 to 15 to the same value (let's say white) then the scroller will appear to run over the top of the picture without them interfering with one another.
Similarly, if we want the scroller to run behind the picture, colours 1 to 7 should be copied to colours 9 to 15, and colour 8 (i.e. where the picture is empty) should be set to the scroller colour.
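
Both palette set-ups can be sketched in Python. The picture palette below is invented for the illustration, and the colour values assume the plain ST format of 3 bits each for red, green and blue (so $777 is white); the function names are made up too.

```python
WHITE = 0x777                                # ASSUMPTION: plain ST palette, 3 bits per gun
picture = [0x000, 0x700, 0x070, 0x007,       # hypothetical 8-colour picture palette
           0x770, 0x707, 0x077, 0x444]

def scroller_in_front(picture, scroller_colour):
    """Colours 8 to 15 all show the scroller, whatever the picture holds."""
    return picture + [scroller_colour] * 8

def scroller_behind(picture, scroller_colour):
    """Colours 9 to 15 repeat 1 to 7; colour 8 (empty picture) shows the scroller."""
    return picture + [scroller_colour] + picture[1:]

def colour_index(picture_value, scroller_bit):
    """Bitplanes 0-2 carry the picture (0-7), bitplane 3 carries the scroller."""
    return picture_value | (scroller_bit << 3)

front = scroller_in_front(picture, WHITE)
behind = scroller_behind(picture, WHITE)

print(front[colour_index(5, 1)] == WHITE)        # scroller hides picture colour 5
print(behind[colour_index(5, 1)] == picture[5])  # picture colour 5 hides the scroller
print(behind[colour_index(0, 1)] == WHITE)       # scroller shows through the gaps
```

Either way the picture and the scroller never have to know about each other; only the palette decides which one wins.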
That's about it for bitplanes; these tricks are used all over the place
in demos, especially those "one million bitplane scroller" screens in fashion
a few years ago.