Compression for CDs and DVDs

Written by Donald R.J. White
March 24, 2008

Warning: This article is fairly technical. If you're not looking to understand compression algorithms, then the following will probably hurt your brain!

Different compression techniques can be used on plain text in languages such as English, Slovak, and other Latin-alphabet languages. Two examples are (1) RLE, for run-length encoding, and (2) LZ encoding, used here to cover both entropy encoding and Lempel-Ziv compression. While impressive compression ratios are available for video, as described in the next section, the best that can be expected for plain text is about 3:1.
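As a rough illustration of the first technique, run-length encoding can be sketched in a few lines of Python. The function names rle_encode and rle_decode are made up for this example and do not come from any particular library; this is a minimal sketch, not a production encoder.

```python
from itertools import groupby


def rle_encode(text):
    """Collapse each run of identical characters into a (char, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def rle_decode(pairs):
    """Expand (char, count) pairs back into the original string."""
    return "".join(ch * count for ch, count in pairs)


# Data with long runs shrinks: 10 characters become 4 pairs.
runs = rle_encode("AAAABBBCCD")
print(runs)

# Ordinary prose has almost no runs: 11 characters become 10 pairs.
word = rle_encode("compression")
print(len("compression"), "characters ->", len(word), "pairs")

# Decoding restores the input exactly.
assert rle_decode(runs) == "AAAABBBCCD"
assert rle_decode(word) == "compression"
```

The second case hints at why run-length encoding on its own does little for written language: ordinary words rarely repeat the same character more than twice in a row.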

Thus, a strong argument can be made for not bothering to compress alphanumerics at all, especially if there is embedded visual content such as graphics, sketches, drawings, and photos. For example, assume a report is 80% plain text and 20% graphics by page area. Further, suppose the graphics, which need on the order of 1,000 times the storage per unit of surface area, can be compressed 100:1. The sizes of the uncompressed and compressed documents per page are then:

Uncompressed, per 6.5" x 9" page (59 sq in), for 10-point type and a resolution of 200 lines/inch:

Text = 80% x 7 x 85 cpi x 59 sq in x (6 b/char / 8 b/B) = 26 kB for the text

Images = 20% x 59 sq in x 200 x 200 x 8 b/gray level = 3.78 MB for grayscale (no color) images

Total = 0.025 MB + 3.78 MB = 3.805 MB

Compressed, assuming 100:1 compression of the images only:

Total = 0.025 MB + 0.01 x 3.78 MB = 63 kB

Or, assuming that the text is also treated as an image: 59 sq in x 200 x 200 x 8 x 0.01 = 189 kB
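The per-page totals above can be replayed in a short Python sketch. The per-page sizes (0.025 MB of text, 3.78 MB of images) and the ratios (100:1 for images, 3:1 for text) are taken from this article as given and are not re-derived here. The helper page_size_mb and the constant names are invented for illustration, and 1 MB is treated as 1,000 kB when printing.

```python
# Per-page figures taken as given from the article (assumptions, not re-derived).
TEXT_MB = 0.025      # uncompressed text per page
IMAGE_MB = 3.78      # uncompressed grayscale images per page
IMAGE_RATIO = 100    # assumed image compression ratio (100:1)
TEXT_RATIO = 3       # best-case text compression ratio (3:1)


def page_size_mb(text_ratio=1, image_ratio=1):
    """Page size in MB after applying the given compression ratios."""
    return TEXT_MB / text_ratio + IMAGE_MB / image_ratio


uncompressed = page_size_mb()
text_only = page_size_mb(text_ratio=TEXT_RATIO)
images_only = page_size_mb(image_ratio=IMAGE_RATIO)
both = page_size_mb(text_ratio=TEXT_RATIO, image_ratio=IMAGE_RATIO)

print(f"uncompressed:         {uncompressed:.3f} MB")
print(f"text compressed only: {text_only:.3f} MB")
print(f"images compressed:    {images_only * 1000:.0f} kB")
print(f"both compressed:      {both * 1000:.0f} kB")

# Share of the uncompressed page saved by compressing the text alone.
text_saving = (uncompressed - text_only) / uncompressed
print(f"saving from text compression alone: {text_saving:.1%}")
```

Under these assumed figures, compressing only the text trims well under 1% from the uncompressed page, while compressing only the images accounts for nearly all of the reduction.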

The lesson from the above is that text, with its complement of assigned font, point size, leading, typeface, location, etc., takes up very little room on a disc relative to images. Expect 3:1 at best for text compression. Thus, consider skipping text compression if the document contains more than 2% images. Never store text by scanning it in as an image, as this will actually multiply the storage requirements.