Home » Personal collection » Acorn ADFS disks » Electron_User_Group » EUG_53.ADF » V/+COMPR1
V/+COMPR1
This website contains an archive of files for the Acorn Electron, BBC Micro, Acorn Archimedes, Commodore 16 and Commodore 64 computers, which Dominic Ford has rescued from his private collection of floppy disks and cassettes.
Some of these files were originally commercial releases in the 1980s and 1990s, but they are now widely available online. I assume that copyright over them is no longer being asserted. If you own the copyright and would like files to be removed, please contact me.
Tape/disk: | Home » Personal collection » Acorn ADFS disks » Electron_User_Group » EUG_53.ADF |
Filename: | V/+COMPR1 |
Read OK: | ✔ |
File size: | 0F83 bytes |
Load address: | 56204556 |
Exec address: | 4F432B2E |
Duplicates
There are 2 duplicate copies of this file in the archive:
- AEW website » eug » eug_3_5_discs_Eug-53_A-EUG53.adf » V/+COMPR1
- AEW website » eug » eug_5_25_discs_Eug-53_D-EUG53.dsd » V.+COMPR1
- Personal collection » Acorn ADFS disks » Electron_User_Group » EUG_53.ADF » V/+COMPR1
File contents
THE BIG SQUEEZE GIVEN the memory constraints of the BBC, text compression can often come in handy when writing programs. If you were designing an adventure game, imagine how many more locations you could fit in if you could reduce the space taken up by the description of each. In fact, using the techniques discussed below, it's possible to achieve size reductions of 50 to 60 percent on reasonably large amounts of text. What we do is to convert the text into four-bit blocks, or nibbles; each of which is a commonly used character, or a code giving information about the nibble that follows: Nibble value Meaning ------------ ------- 0 Next nibble is a less commonly used character. 1 Next nibble is a token*. 2 End of text. 3-15 The 13 most commonly used characters. *See the following documentation. When we speak of "most" or "less commonly used characters", we are referring to character frequences. The order of frequency of alpha- betic characters in the English language is: e t a o n r i s h d l f c m u g y p w b v k x j q z Including the space, we would say that the 13 most commonly used characters are: e t a o n r i s h d l f [space] and the remaining fourteen letters, plus a couple of punctuation marks, make up 16 less common characters: c m u g y p w b v k x i q z , . Now, if we had a string like "hello you", before compaction it would look like this: byte 1 2 3 4 5 6 7 8 9 10 h e l l o y o u . Applying the rules above, it would be compressed to: byte 1 2 3 4 5 6 7 value &B3 DD 6F 04 60 20 F2 he ll o y o u . The six nibbles in bytes 1-3 index into a table of most commonly used letters, spelling "hello ". Remember that 0, 1 and 2 have special meanings, which is why the letter e has the index value of 3. The 0 in byte 4 indicates that the next nibble indexes into a table of less commonly used letters. And so on, until the 2 at the end flags the end of the text. Now we'll put this into practice with program COMPRESS. After the machine code has been assembled, you'll be asked to type some text. At the moment, only lower-case characters, spaces, commas and full stops are allowed. If you try anything else you'll get an error message: Illegal character The input string is held at 'intext%' and the resultant compressed string is placed at 'outtext%'. These locations are indirectly addressed via 'input' and 'output'. The machine code routine 'compress' compares each character of the input string to those in 'commtab' and 'raretab', tables of more and less commonly-used characters. If a match is found, the index value in the Y register is written to the output string, preceeded by a zero nibble if the matched character was a less commonly used one. Note how 'mask' is used by 'wrtnib' to determine the position at which the nibble is to be written. If 'mask' is 0, the nibble is multiplied by 16 and stored in the output string at the current offset 'outptr'. If it's &FF, the nibble is ORed into the right hand nibble of the byte at offset 'outptr', and 'offset' advanced by one to point to the next byte in the the output string. If the character in the input string is a carriage return, a corresponding two nibble is stored in the output string to signify the end of the text. 'expand' is just the reverse of the compression process. The input string must be a previously compressed one. The nibbles are inspected in turn to see if a more or less common character should be printed on the screen. Try typing in "she sells sea shells on the sea shore". The program works out the compression factor, and in this case it's about 48%! In the next article however we'll be looking at how tokens can achieve a factor of up to 60%! Christopher Dewhurst, EUG #53
00000000 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 | | 00000010 20 20 20 20 20 20 20 20 20 20 20 54 48 45 20 42 | THE B| 00000020 49 47 20 53 51 55 45 45 5a 45 0d 0d 47 49 56 45 |IG SQUEEZE..GIVE| 00000030 4e 20 74 68 65 20 6d 65 6d 6f 72 79 20 63 6f 6e |N the memory con| 00000040 73 74 72 61 69 6e 74 73 20 6f 66 20 74 68 65 20 |straints of the | 00000050 42 42 43 2c 20 74 65 78 74 20 63 6f 6d 70 72 65 |BBC, text compre| 00000060 73 73 69 6f 6e 20 63 61 6e 20 6f 66 74 65 6e 20 |ssion can often | 00000070 63 6f 6d 65 0d 69 6e 20 68 61 6e 64 79 20 77 68 |come.in handy wh| 00000080 65 6e 20 77 72 69 74 69 6e 67 20 70 72 6f 67 72 |en writing progr| 00000090 61 6d 73 2e 20 49 66 20 79 6f 75 20 77 65 72 65 |ams. If you were| 000000a0 20 64 65 73 69 67 6e 69 6e 67 20 61 6e 20 61 64 | designing an ad| 000000b0 76 65 6e 74 75 72 65 20 67 61 6d 65 2c 0d 69 6d |venture game,.im| 000000c0 61 67 69 6e 65 20 68 6f 77 20 6d 61 6e 79 20 6d |agine how many m| 000000d0 6f 72 65 20 6c 6f 63 61 74 69 6f 6e 73 20 79 6f |ore locations yo| 000000e0 75 20 63 6f 75 6c 64 20 66 69 74 20 69 6e 20 69 |u could fit in i| 000000f0 66 20 79 6f 75 20 63 6f 75 6c 64 20 72 65 64 75 |f you could redu| 00000100 63 65 20 74 68 65 0d 73 70 61 63 65 20 74 61 6b |ce the.space tak| 00000110 65 6e 20 75 70 20 62 79 20 74 68 65 20 64 65 73 |en up by the des| 00000120 63 72 69 70 74 69 6f 6e 20 6f 66 20 65 61 63 68 |cription of each| 00000130 2e 20 49 6e 20 66 61 63 74 2c 20 75 73 69 6e 67 |. In fact, using| 00000140 20 74 68 65 20 74 65 63 68 6e 69 71 75 65 73 0d | the techniques.| 00000150 64 69 73 63 75 73 73 65 64 20 62 65 6c 6f 77 2c |discussed below,| 00000160 20 69 74 27 73 20 70 6f 73 73 69 62 6c 65 20 74 | it's possible t| 00000170 6f 20 61 63 68 69 65 76 65 20 73 69 7a 65 20 72 |o achieve size r| 00000180 65 64 75 63 74 69 6f 6e 73 20 6f 66 20 35 30 20 |eductions of 50 | 00000190 74 6f 20 36 30 0d 70 65 72 63 65 6e 74 20 6f 6e |to 60.percent on| 000001a0 20 72 65 61 73 6f 6e 61 62 6c 79 20 6c 61 72 67 | reasonably larg| 000001b0 65 20 61 6d 6f 75 6e 74 73 20 6f 66 20 74 65 78 |e amounts of tex| 000001c0 74 2e 0d 20 20 20 20 20 20 20 57 68 61 74 20 77 |t.. What w| 000001d0 65 20 64 6f 20 69 73 20 74 6f 20 63 6f 6e 76 65 |e do is to conve| 000001e0 72 74 20 74 68 65 20 74 65 78 74 20 69 6e 74 6f |rt the text into| 000001f0 20 66 6f 75 72 2d 62 69 74 20 62 6c 6f 63 6b 73 | four-bit blocks| 00000200 2c 20 6f 72 20 0d 6e 69 62 62 6c 65 73 3b 20 65 |, or .nibbles; e| 00000210 61 63 68 20 6f 66 20 77 68 69 63 68 20 69 73 20 |ach of which is | 00000220 61 20 63 6f 6d 6d 6f 6e 6c 79 20 75 73 65 64 20 |a commonly used | 00000230 63 68 61 72 61 63 74 65 72 2c 20 6f 72 20 61 20 |character, or a | 00000240 63 6f 64 65 20 67 69 76 69 6e 67 0d 69 6e 66 6f |code giving.info| 00000250 72 6d 61 74 69 6f 6e 20 61 62 6f 75 74 20 74 68 |rmation about th| 00000260 65 20 6e 69 62 62 6c 65 20 74 68 61 74 20 66 6f |e nibble that fo| 00000270 6c 6c 6f 77 73 3a 0d 0d 4e 69 62 62 6c 65 20 76 |llows:..Nibble v| 00000280 61 6c 75 65 20 20 20 20 20 20 4d 65 61 6e 69 6e |alue Meanin| 00000290 67 0d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 20 20 |g.------------ | 000002a0 20 20 20 20 2d 2d 2d 2d 2d 2d 2d 0d 30 20 20 20 | -------.0 | 000002b0 20 20 20 20 20 20 20 20 20 20 20 20 20 20 4e 65 | Ne| 000002c0 78 74 20 6e 69 62 62 6c 65 20 69 73 20 61 20 6c |xt nibble is a l| 000002d0 65 73 73 20 63 6f 6d 6d 6f 6e 6c 79 20 75 73 65 |ess commonly use| 000002e0 64 20 63 68 61 72 61 63 74 65 72 2e 0d 31 20 20 |d character..1 | 000002f0 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 4e | N| 00000300 65 78 74 20 6e 69 62 62 6c 65 20 69 73 20 61 20 |ext nibble is a | 00000310 74 6f 6b 65 6e 2a 2e 0d 32 20 20 20 20 20 20 20 |token*..2 | 00000320 20 20 20 20 20 20 20 20 20 20 45 6e 64 20 6f 66 | End of| 00000330 20 74 65 78 74 2e 0d 33 2d 31 35 20 20 20 20 20 | text..3-15 | 00000340 20 20 20 20 20 20 20 20 20 54 68 65 20 31 33 20 | The 13 | 00000350 6d 6f 73 74 20 63 6f 6d 6d 6f 6e 6c 79 20 75 73 |most commonly us| 00000360 65 64 20 63 68 61 72 61 63 74 65 72 73 2e 0d 2a |ed characters..*| 00000370 53 65 65 20 74 68 65 20 66 6f 6c 6c 6f 77 69 6e |See the followin| 00000380 67 20 64 6f 63 75 6d 65 6e 74 61 74 69 6f 6e 2e |g documentation.| 00000390 0d 0d 57 68 65 6e 20 77 65 20 73 70 65 61 6b 20 |..When we speak | 000003a0 6f 66 20 22 6d 6f 73 74 22 20 6f 72 20 22 6c 65 |of "most" or "le| 000003b0 73 73 20 63 6f 6d 6d 6f 6e 6c 79 20 75 73 65 64 |ss commonly used| 000003c0 20 63 68 61 72 61 63 74 65 72 73 22 2c 20 77 65 | characters", we| 000003d0 20 61 72 65 0d 72 65 66 65 72 72 69 6e 67 20 74 | are.referring t| 000003e0 6f 20 63 68 61 72 61 63 74 65 72 20 66 72 65 71 |o character freq| 000003f0 75 65 6e 63 65 73 2e 20 54 68 65 20 6f 72 64 65 |uences. The orde| 00000400 72 20 6f 66 20 66 72 65 71 75 65 6e 63 79 20 6f |r of frequency o| 00000410 66 20 61 6c 70 68 61 2d 0d 62 65 74 69 63 20 63 |f alpha-.betic c| 00000420 68 61 72 61 63 74 65 72 73 20 69 6e 20 74 68 65 |haracters in the| 00000430 20 45 6e 67 6c 69 73 68 20 6c 61 6e 67 75 61 67 | English languag| 00000440 65 20 69 73 3a 0d 0d 65 20 74 20 61 20 6f 20 6e |e is:..e t a o n| 00000450 20 72 20 69 20 73 20 68 20 64 20 6c 20 66 20 63 | r i s h d l f c| 00000460 20 6d 20 75 20 67 20 79 20 70 20 77 20 62 20 76 | m u g y p w b v| 00000470 20 6b 20 78 20 6a 20 71 20 7a 0d 0d 49 6e 63 6c | k x j q z..Incl| 00000480 75 64 69 6e 67 20 74 68 65 20 73 70 61 63 65 2c |uding the space,| 00000490 20 77 65 20 77 6f 75 6c 64 20 73 61 79 20 74 68 | we would say th| 000004a0 61 74 20 74 68 65 20 31 33 20 6d 6f 73 74 20 63 |at the 13 most c| 000004b0 6f 6d 6d 6f 6e 6c 79 20 75 73 65 64 0d 63 68 61 |ommonly used.cha| 000004c0 72 61 63 74 65 72 73 20 61 72 65 3a 0d 20 20 20 |racters are:. | 000004d0 20 20 20 65 20 74 20 61 20 6f 20 6e 20 72 20 69 | e t a o n r i| 000004e0 20 73 20 68 20 64 20 6c 20 66 20 5b 73 70 61 63 | s h d l f [spac| 000004f0 65 5d 0d 0d 61 6e 64 20 74 68 65 20 72 65 6d 61 |e]..and the rema| 00000500 69 6e 69 6e 67 20 66 6f 75 72 74 65 65 6e 20 6c |ining fourteen l| 00000510 65 74 74 65 72 73 2c 20 70 6c 75 73 20 61 20 63 |etters, plus a c| 00000520 6f 75 70 6c 65 20 6f 66 20 70 75 6e 63 74 75 61 |ouple of punctua| 00000530 74 69 6f 6e 20 6d 61 72 6b 73 2c 0d 6d 61 6b 65 |tion marks,.make| 00000540 20 75 70 20 31 36 20 6c 65 73 73 20 63 6f 6d 6d | up 16 less comm| 00000550 6f 6e 20 63 68 61 72 61 63 74 65 72 73 3a 0d 20 |on characters:. | 00000560 20 20 20 20 20 63 20 6d 20 75 20 67 20 79 20 70 | c m u g y p| 00000570 20 77 20 62 20 76 20 6b 20 78 20 69 20 71 20 7a | w b v k x i q z| 00000580 20 2c 20 2e 0d 0d 4e 6f 77 2c 20 69 66 20 77 65 | , ...Now, if we| 00000590 20 68 61 64 20 61 20 73 74 72 69 6e 67 20 6c 69 | had a string li| 000005a0 6b 65 20 22 68 65 6c 6c 6f 20 79 6f 75 22 2c 20 |ke "hello you", | 000005b0 62 65 66 6f 72 65 20 63 6f 6d 70 61 63 74 69 6f |before compactio| 000005c0 6e 20 69 74 20 77 6f 75 6c 64 0d 6c 6f 6f 6b 20 |n it would.look | 000005d0 6c 69 6b 65 20 74 68 69 73 3a 0d 20 20 20 20 20 |like this:. | 000005e0 20 62 79 74 65 20 20 20 31 20 20 32 20 20 33 20 | byte 1 2 3 | 000005f0 20 34 20 20 35 20 20 36 20 20 37 20 20 38 20 20 | 4 5 6 7 8 | 00000600 39 20 20 31 30 0d 20 20 20 20 20 20 20 20 20 20 |9 10. | 00000610 20 20 20 68 20 20 65 20 20 6c 20 20 6c 20 20 6f | h e l l o| 00000620 20 20 20 20 20 79 20 20 6f 20 20 75 20 20 2e 0d | y o u ..| 00000630 0d 41 70 70 6c 79 69 6e 67 20 74 68 65 20 72 75 |.Applying the ru| 00000640 6c 65 73 20 61 62 6f 76 65 2c 20 69 74 20 77 6f |les above, it wo| 00000650 75 6c 64 20 62 65 20 63 6f 6d 70 72 65 73 73 65 |uld be compresse| 00000660 64 20 74 6f 3a 0d 20 20 20 20 20 20 62 79 74 65 |d to:. byte| 00000670 20 20 20 31 20 20 32 20 20 33 20 20 34 20 20 35 | 1 2 3 4 5| 00000680 20 20 36 20 20 37 0d 20 20 20 20 20 20 76 61 6c | 6 7. val| 00000690 75 65 20 26 42 33 20 44 44 20 36 46 20 30 34 20 |ue &B3 DD 6F 04 | 000006a0 36 30 20 32 30 20 46 32 0d 20 20 20 20 20 20 20 |60 20 F2. | 000006b0 20 20 20 20 20 20 68 65 20 6c 6c 20 6f 20 20 20 | he ll o | 000006c0 79 20 6f 20 20 75 20 20 2e 0d 0d 54 68 65 20 73 |y o u ...The s| 000006d0 69 78 20 6e 69 62 62 6c 65 73 20 69 6e 20 62 79 |ix nibbles in by| 000006e0 74 65 73 20 31 2d 33 20 69 6e 64 65 78 20 69 6e |tes 1-3 index in| 000006f0 74 6f 20 61 20 74 61 62 6c 65 20 6f 66 20 6d 6f |to a table of mo| 00000700 73 74 20 63 6f 6d 6d 6f 6e 6c 79 20 75 73 65 64 |st commonly used| 00000710 0d 6c 65 74 74 65 72 73 2c 20 73 70 65 6c 6c 69 |.letters, spelli| 00000720 6e 67 20 22 68 65 6c 6c 6f 20 22 2e 20 52 65 6d |ng "hello ". Rem| 00000730 65 6d 62 65 72 20 74 68 61 74 20 30 2c 20 31 20 |ember that 0, 1 | 00000740 61 6e 64 20 32 20 68 61 76 65 20 73 70 65 63 69 |and 2 have speci| 00000750 61 6c 0d 6d 65 61 6e 69 6e 67 73 2c 20 77 68 69 |al.meanings, whi| 00000760 63 68 20 69 73 20 77 68 79 20 74 68 65 20 6c 65 |ch is why the le| 00000770 74 74 65 72 20 65 20 68 61 73 20 74 68 65 20 69 |tter e has the i| 00000780 6e 64 65 78 20 76 61 6c 75 65 20 6f 66 20 33 2e |ndex value of 3.| 00000790 20 54 68 65 20 30 20 69 6e 0d 62 79 74 65 20 34 | The 0 in.byte 4| 000007a0 20 69 6e 64 69 63 61 74 65 73 20 74 68 61 74 20 | indicates that | 000007b0 74 68 65 20 6e 65 78 74 20 6e 69 62 62 6c 65 20 |the next nibble | 000007c0 69 6e 64 65 78 65 73 20 69 6e 74 6f 20 61 20 74 |indexes into a t| 000007d0 61 62 6c 65 20 6f 66 20 6c 65 73 73 0d 63 6f 6d |able of less.com| 000007e0 6d 6f 6e 6c 79 20 75 73 65 64 20 6c 65 74 74 65 |monly used lette| 000007f0 72 73 2e 20 41 6e 64 20 73 6f 20 6f 6e 2c 20 75 |rs. And so on, u| 00000800 6e 74 69 6c 20 74 68 65 20 32 20 61 74 20 74 68 |ntil the 2 at th| 00000810 65 20 65 6e 64 20 66 6c 61 67 73 20 74 68 65 20 |e end flags the | 00000820 65 6e 64 0d 6f 66 20 74 68 65 20 74 65 78 74 2e |end.of the text.| 00000830 0d 20 20 20 4e 6f 77 20 77 65 27 6c 6c 20 70 75 |. Now we'll pu| 00000840 74 20 74 68 69 73 20 69 6e 74 6f 20 70 72 61 63 |t this into prac| 00000850 74 69 63 65 20 77 69 74 68 20 70 72 6f 67 72 61 |tice with progra| 00000860 6d 20 43 4f 4d 50 52 45 53 53 2e 20 41 66 74 65 |m COMPRESS. Afte| 00000870 72 20 74 68 65 0d 6d 61 63 68 69 6e 65 20 63 6f |r the.machine co| 00000880 64 65 20 68 61 73 20 62 65 65 6e 20 61 73 73 65 |de has been asse| 00000890 6d 62 6c 65 64 2c 20 79 6f 75 27 6c 6c 20 62 65 |mbled, you'll be| 000008a0 20 61 73 6b 65 64 20 74 6f 20 74 79 70 65 20 73 | asked to type s| 000008b0 6f 6d 65 20 74 65 78 74 2e 20 41 74 0d 74 68 65 |ome text. At.the| 000008c0 20 6d 6f 6d 65 6e 74 2c 20 6f 6e 6c 79 20 6c 6f | moment, only lo| 000008d0 77 65 72 2d 63 61 73 65 20 63 68 61 72 61 63 74 |wer-case charact| 000008e0 65 72 73 2c 20 73 70 61 63 65 73 2c 20 63 6f 6d |ers, spaces, com| 000008f0 6d 61 73 20 61 6e 64 20 66 75 6c 6c 20 73 74 6f |mas and full sto| 00000900 70 73 0d 61 72 65 20 61 6c 6c 6f 77 65 64 2e 20 |ps.are allowed. | 00000910 49 66 20 79 6f 75 20 74 72 79 20 61 6e 79 74 68 |If you try anyth| 00000920 69 6e 67 20 65 6c 73 65 20 79 6f 75 27 6c 6c 20 |ing else you'll | 00000930 67 65 74 20 61 6e 20 65 72 72 6f 72 20 6d 65 73 |get an error mes| 00000940 73 61 67 65 3a 0d 20 20 20 20 20 20 20 49 6c 6c |sage:. Ill| 00000950 65 67 61 6c 20 63 68 61 72 61 63 74 65 72 0d 0d |egal character..| 00000960 54 68 65 20 69 6e 70 75 74 20 73 74 72 69 6e 67 |The input string| 00000970 20 69 73 20 68 65 6c 64 20 61 74 20 27 69 6e 74 | is held at 'int| 00000980 65 78 74 25 27 20 61 6e 64 20 74 68 65 20 72 65 |ext%' and the re| 00000990 73 75 6c 74 61 6e 74 20 63 6f 6d 70 72 65 73 73 |sultant compress| 000009a0 65 64 0d 73 74 72 69 6e 67 20 69 73 20 70 6c 61 |ed.string is pla| 000009b0 63 65 64 20 61 74 20 27 6f 75 74 74 65 78 74 25 |ced at 'outtext%| 000009c0 27 2e 20 54 68 65 73 65 20 6c 6f 63 61 74 69 6f |'. These locatio| 000009d0 6e 73 20 61 72 65 20 69 6e 64 69 72 65 63 74 6c |ns are indirectl| 000009e0 79 20 61 64 64 72 65 73 73 65 64 0d 76 69 61 20 |y addressed.via | 000009f0 27 69 6e 70 75 74 27 20 61 6e 64 20 27 6f 75 74 |'input' and 'out| 00000a00 70 75 74 27 2e 0d 20 20 20 54 68 65 20 6d 61 63 |put'.. The mac| 00000a10 68 69 6e 65 20 63 6f 64 65 20 72 6f 75 74 69 6e |hine code routin| 00000a20 65 20 27 63 6f 6d 70 72 65 73 73 27 20 63 6f 6d |e 'compress' com| 00000a30 70 61 72 65 73 20 65 61 63 68 20 63 68 61 72 61 |pares each chara| 00000a40 63 74 65 72 20 6f 66 20 74 68 65 0d 69 6e 70 75 |cter of the.inpu| 00000a50 74 20 73 74 72 69 6e 67 20 74 6f 20 74 68 6f 73 |t string to thos| 00000a60 65 20 69 6e 20 27 63 6f 6d 6d 74 61 62 27 20 61 |e in 'commtab' a| 00000a70 6e 64 20 27 72 61 72 65 74 61 62 27 2c 20 74 61 |nd 'raretab', ta| 00000a80 62 6c 65 73 20 6f 66 20 6d 6f 72 65 20 61 6e 64 |bles of more and| 00000a90 0d 6c 65 73 73 20 63 6f 6d 6d 6f 6e 6c 79 2d 75 |.less commonly-u| 00000aa0 73 65 64 20 63 68 61 72 61 63 74 65 72 73 2e 20 |sed characters. | 00000ab0 49 66 20 61 20 6d 61 74 63 68 20 69 73 20 66 6f |If a match is fo| 00000ac0 75 6e 64 2c 20 74 68 65 20 69 6e 64 65 78 20 76 |und, the index v| 00000ad0 61 6c 75 65 20 69 6e 0d 74 68 65 20 59 20 72 65 |alue in.the Y re| 00000ae0 67 69 73 74 65 72 20 69 73 20 77 72 69 74 74 65 |gister is writte| 00000af0 6e 20 74 6f 20 74 68 65 20 6f 75 74 70 75 74 20 |n to the output | 00000b00 73 74 72 69 6e 67 2c 20 70 72 65 63 65 65 64 65 |string, preceede| 00000b10 64 20 62 79 20 61 20 7a 65 72 6f 0d 6e 69 62 62 |d by a zero.nibb| 00000b20 6c 65 20 69 66 20 74 68 65 20 6d 61 74 63 68 65 |le if the matche| 00000b30 64 20 63 68 61 72 61 63 74 65 72 20 77 61 73 20 |d character was | 00000b40 61 20 6c 65 73 73 20 63 6f 6d 6d 6f 6e 6c 79 20 |a less commonly | 00000b50 75 73 65 64 20 6f 6e 65 2e 0d 20 20 20 4e 6f 74 |used one.. Not| 00000b60 65 20 68 6f 77 20 27 6d 61 73 6b 27 20 69 73 20 |e how 'mask' is | 00000b70 75 73 65 64 20 62 79 20 27 77 72 74 6e 69 62 27 |used by 'wrtnib'| 00000b80 20 74 6f 20 64 65 74 65 72 6d 69 6e 65 20 74 68 | to determine th| 00000b90 65 20 70 6f 73 69 74 69 6f 6e 20 61 74 0d 77 68 |e position at.wh| 00000ba0 69 63 68 20 74 68 65 20 6e 69 62 62 6c 65 20 69 |ich the nibble i| 00000bb0 73 20 74 6f 20 62 65 20 77 72 69 74 74 65 6e 2e |s to be written.| 00000bc0 20 49 66 20 27 6d 61 73 6b 27 20 69 73 20 30 2c | If 'mask' is 0,| 00000bd0 20 74 68 65 20 6e 69 62 62 6c 65 20 69 73 0d 6d | the nibble is.m| 00000be0 75 6c 74 69 70 6c 69 65 64 20 62 79 20 31 36 20 |ultiplied by 16 | 00000bf0 61 6e 64 20 73 74 6f 72 65 64 20 69 6e 20 74 68 |and stored in th| 00000c00 65 20 6f 75 74 70 75 74 20 73 74 72 69 6e 67 20 |e output string | 00000c10 61 74 20 74 68 65 20 63 75 72 72 65 6e 74 20 6f |at the current o| 00000c20 66 66 73 65 74 0d 27 6f 75 74 70 74 72 27 2e 20 |ffset.'outptr'. | 00000c30 49 66 20 69 74 27 73 20 26 46 46 2c 20 74 68 65 |If it's &FF, the| 00000c40 20 6e 69 62 62 6c 65 20 69 73 20 4f 52 65 64 20 | nibble is ORed | 00000c50 69 6e 74 6f 20 74 68 65 20 72 69 67 68 74 20 68 |into the right h| 00000c60 61 6e 64 20 6e 69 62 62 6c 65 20 6f 66 0d 74 68 |and nibble of.th| 00000c70 65 20 62 79 74 65 20 61 74 20 6f 66 66 73 65 74 |e byte at offset| 00000c80 20 27 6f 75 74 70 74 72 27 2c 20 61 6e 64 20 27 | 'outptr', and '| 00000c90 6f 66 66 73 65 74 27 20 61 64 76 61 6e 63 65 64 |offset' advanced| 00000ca0 20 62 79 20 6f 6e 65 20 74 6f 20 70 6f 69 6e 74 | by one to point| 00000cb0 20 74 6f 0d 74 68 65 20 6e 65 78 74 20 62 79 74 | to.the next byt| 00000cc0 65 20 69 6e 20 74 68 65 20 74 68 65 20 6f 75 74 |e in the the out| 00000cd0 70 75 74 20 73 74 72 69 6e 67 2e 0d 20 20 20 49 |put string.. I| 00000ce0 66 20 74 68 65 20 63 68 61 72 61 63 74 65 72 20 |f the character | 00000cf0 69 6e 20 74 68 65 20 69 6e 70 75 74 20 73 74 72 |in the input str| 00000d00 69 6e 67 20 69 73 20 61 20 63 61 72 72 69 61 67 |ing is a carriag| 00000d10 65 20 72 65 74 75 72 6e 2c 20 61 0d 63 6f 72 72 |e return, a.corr| 00000d20 65 73 70 6f 6e 64 69 6e 67 20 74 77 6f 20 6e 69 |esponding two ni| 00000d30 62 62 6c 65 20 69 73 20 73 74 6f 72 65 64 20 69 |bble is stored i| 00000d40 6e 20 74 68 65 20 6f 75 74 70 75 74 20 73 74 72 |n the output str| 00000d50 69 6e 67 20 74 6f 20 73 69 67 6e 69 66 79 20 74 |ing to signify t| 00000d60 68 65 0d 65 6e 64 20 6f 66 20 74 68 65 20 74 65 |he.end of the te| 00000d70 78 74 2e 0d 20 20 20 27 65 78 70 61 6e 64 27 20 |xt.. 'expand' | 00000d80 69 73 20 6a 75 73 74 20 74 68 65 20 72 65 76 65 |is just the reve| 00000d90 72 73 65 20 6f 66 20 74 68 65 20 63 6f 6d 70 72 |rse of the compr| 00000da0 65 73 73 69 6f 6e 20 70 72 6f 63 65 73 73 2e 20 |ession process. | 00000db0 54 68 65 20 69 6e 70 75 74 0d 73 74 72 69 6e 67 |The input.string| 00000dc0 20 6d 75 73 74 20 62 65 20 61 20 70 72 65 76 69 | must be a previ| 00000dd0 6f 75 73 6c 79 20 63 6f 6d 70 72 65 73 73 65 64 |ously compressed| 00000de0 20 6f 6e 65 2e 20 54 68 65 20 6e 69 62 62 6c 65 | one. The nibble| 00000df0 73 20 61 72 65 20 69 6e 73 70 65 63 74 65 64 20 |s are inspected | 00000e00 69 6e 0d 74 75 72 6e 20 74 6f 20 73 65 65 20 69 |in.turn to see i| 00000e10 66 20 61 20 6d 6f 72 65 20 6f 72 20 6c 65 73 73 |f a more or less| 00000e20 20 63 6f 6d 6d 6f 6e 20 63 68 61 72 61 63 74 65 | common characte| 00000e30 72 20 73 68 6f 75 6c 64 20 62 65 20 70 72 69 6e |r should be prin| 00000e40 74 65 64 20 6f 6e 20 74 68 65 0d 73 63 72 65 65 |ted on the.scree| 00000e50 6e 2e 0d 20 20 20 54 72 79 20 74 79 70 69 6e 67 |n.. Try typing| 00000e60 20 69 6e 20 22 73 68 65 20 73 65 6c 6c 73 20 73 | in "she sells s| 00000e70 65 61 20 73 68 65 6c 6c 73 20 6f 6e 20 74 68 65 |ea shells on the| 00000e80 20 73 65 61 20 73 68 6f 72 65 22 2e 20 54 68 65 | sea shore". The| 00000e90 20 70 72 6f 67 72 61 6d 0d 77 6f 72 6b 73 20 6f | program.works o| 00000ea0 75 74 20 74 68 65 20 63 6f 6d 70 72 65 73 73 69 |ut the compressi| 00000eb0 6f 6e 20 66 61 63 74 6f 72 2c 20 61 6e 64 20 69 |on factor, and i| 00000ec0 6e 20 74 68 69 73 20 63 61 73 65 20 69 74 27 73 |n this case it's| 00000ed0 20 61 62 6f 75 74 20 34 38 25 21 20 49 6e 0d 74 | about 48%! In.t| 00000ee0 68 65 20 6e 65 78 74 20 61 72 74 69 63 6c 65 20 |he next article | 00000ef0 68 6f 77 65 76 65 72 20 77 65 27 6c 6c 20 62 65 |however we'll be| 00000f00 20 6c 6f 6f 6b 69 6e 67 20 61 74 20 68 6f 77 20 | looking at how | 00000f10 74 6f 6b 65 6e 73 20 63 61 6e 20 61 63 68 69 65 |tokens can achie| 00000f20 76 65 20 61 0d 66 61 63 74 6f 72 20 6f 66 20 75 |ve a.factor of u| 00000f30 70 20 74 6f 20 36 30 25 21 0d 20 20 20 20 20 20 |p to 60%!. | 00000f40 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 | | * 00000f60 20 20 20 20 20 43 68 72 69 73 74 6f 70 68 65 72 | Christopher| 00000f70 20 44 65 77 68 75 72 73 74 2c 20 45 55 47 20 23 | Dewhurst, EUG #| 00000f80 35 33 0d |53.| 00000f83