
Chapter 7 Binary Codes

7.1 Binary Codes: Introduction


Acknowledgment: Richard Hamming's book, Coding and Information Theory, Prentice-Hall, New York (1985), and C. T. Mullis's unpublished notes have influenced our treatment of binary codes. The numerical experiment was developed by Mullis.

We use this chapter to introduce students to the communication paradigm and to show how arbitrary symbols may be represented by binary codes. These symbols and their corresponding binary codes may be computer instructions, integer data, approximations to real data, and so on.

We develop some ad hoc tree codes for representing information and then develop Huffman codes for optimizing the use of bits. Hamming codes add check bits to a binary word so that errors may be detected and corrected. The numerical experiment has the students design a Huffman code for coding Lincoln's Gettysburg Address.

Introduction

It would be stretching our imagination to suggest that Sir Francis had digital audio on his minde (sic) when he wrote the prophetic words

...a man may express and signifie the intentions of his minde, at any distance... by... objects... capable of a twofold difference onely.

Sir Francis Bacon, 1623

Nonetheless, this basic idea forms the basis of everything we do in digital computing, digital communications, and digital audio/video. In 1832, Samuel F. B. Morse used the very same idea to propose that telegram words be coded into binary addresses or binary codes that could be transmitted over telegraph lines and decoded at the receiving end to unravel the telegram. Morse abandoned his scheme, illustrated in Figure 7.1, as too complicated and, in 1838, proposed his fabled Morse code for coding letters (instead of words) into objects (dots, dashes, spaces) capable of a threefold difference onely (sic).

Figure 7.1
Generalized Coder-Decoder

The basic idea of Figure 7.1 is used today in cryptographic systems, where the “address a_i” is an encyphered version of a message w_i; in vector quantizers, where the “address a_i” is the address of a close approximation to data w_i; in coded satellite transmissions, where the “address a_i” is a data word w_i plus parity check bits for detecting and correcting errors; in digital audio systems, where the “address a_i” is a stretch of digitized and coded music; and in computer memories, where a_i is an address (a coded version of a word of memory) and w_i is a word in memory.

In this chapter we study three fundamental questions in the construction of binary addresses or binary codes. First, what are plausible schemes for mapping symbols (such as words, letters, computer instructions, voltages, pressures, etc.) into binary codes? Second, what are plausible schemes for coding likely symbols with short binary words and unlikely symbols with long words in order to minimize the number of binary digits (bits) required to represent a message? Third, what are plausible schemes for “coding” binary words into longer binary words that contain “redundant bits” that may be used to detect and correct errors? These are not new questions. They have occupied the minds of many great thinkers. Sir Francis recognized that arbitrary messages had binary representations. Alan Turing, Alonzo Church, and Kurt Goedel studied binary codes for computations in their study of computable numbers and algorithms. Claude Shannon, R. C. Bose, Irving Reed, Richard Hamming, and many others have studied error control codes. Shannon, David Huffman, and many others have studied the problem of efficiently coding information.

In this chapter we outline the main ideas in binary coding and illustrate the role that binary coding plays in digital communications. In your subsequent courses in electrical and computer engineering you will study integrated circuits for building coders and decoders and mathematical models for designing good codes.

7.2 Binary Codes: The Communication Paradigm


A paradigm is a pattern of ideas that form the foundation for a body of knowledge. A paradigm for (tele-) communication theory is a pattern of basic building blocks that may be applied to the dual problems of (i) reliably transmitting information from source to receiver at high speed or (ii) reliably storing information from source to memory at high density. High-speed communication permits us to accommodate many low-rate sources (such as audio) or one high-rate source (such as video). High-density storage permits us to store a large amount of information in a small space. For example, a typical 1.2 Mbyte floppy disc stores 9.6 × 10^6 bits of information, whereas a typical CD stores about 2 × 10^9 bits, enough for one hour's worth of high-quality sound.

Figure 7.2
Basic Building Blocks in a (Tele-) Communication System

Figure 7.2 illustrates the basic building blocks that apply to any problem in the theory of (tele-) communication. The source is an arbitrary source of information. It can be the time-varying voltage at the output of a vibration sensor (such as an integrating accelerometer for measuring motion or a microphone for measuring sound pressure); it can be the charges stored in the CCD array of a solid-state camera; it can be the addresses generated from a sequence of keystrokes at a computer terminal; it can be a sequence of instructions in a computer program.

The source coder is a device for turning primitive source outputs into more efficient representations. For example, in a recording studio, the source coder would convert analog voltages into digital approximations using an A/D converter; a fancy source coder would use a fancy A/D converter that finely quantized likely analog values and crudely quantized unlikely values. If the source is a source of discrete symbols like letters and numbers, then a fancy source coder would assign short binary sequences to likely symbols (such as e) and long binary sequences to unlikely symbols (such as z).

The channel coder adds “redundant bits” to the binary output of the source coder so that errors of transmission or storage may be detected and corrected. In the simplest example, a binary string of the form 01001001 would have an extra bit of 1 added to give even parity (an even number of 1's) to the string; the string 10110111 would have an extra bit of 0 added to preserve the even parity. If one bit error is introduced in the channel, then the parity is odd and the receiver knows that an error has occurred.

The modulator takes outputs of the channel coder, a stream of binary digits, and constructs an analog waveform that represents a block of bits. For example, in a 9600-baud modem, five bits are used to determine one of 2^5 = 32 phases that are used to modulate the signal A cos(ωt + φ). Each possible string of five bits has its own personalized phase, φ, and this phase can be determined at the receiver. The signal A cos(ωt + φ) is an analog signal that may be transmitted over a channel (such as a telephone line, a microwave link, or a fiber-optic cable).

The channel has a finite bandwidth, meaning that it distorts signals, and it is subject to noise or interference from other electromagnetic radiation. Therefore transmitted information arrives at the demodulator in imperfect form. The demodulator uses filters matched to the modulated signals to demodulate the phase and look up the corresponding bit stream. The channel decoder converts the coded bit stream into the information bit stream, and the source decoder looks up the corresponding symbol. This sequence of steps is illustrated symbolically in Figure 7.3.

Figure 7.3
Symbolic Representation of Communication
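
The even-parity rule described above for the channel coder is easy to state in a few lines of code. The sketch below is ours, not the book's: it appends a parity bit at the coder and checks parity at the receiver, flagging a single bit error; the function names are chosen for illustration.

```python
# Sketch of the even-parity channel code described above (our illustration,
# not code from the text). The coder appends one bit so that the total number
# of 1's is even; the checker flags any word whose parity has become odd.

def add_even_parity(bits: str) -> str:
    """Append a parity bit so the coded word has an even number of 1's."""
    parity_bit = bits.count("1") % 2      # 1 if the word has an odd number of 1's
    return bits + str(parity_bit)

def parity_ok(coded: str) -> bool:
    """True if the received word still has even parity."""
    return coded.count("1") % 2 == 0

sent = add_even_parity("01001001")        # "010010011": four 1's, even parity
received = "110010011"                    # first bit flipped by the channel
print(parity_ok(sent), parity_ok(received))   # True False
```

A single flipped bit always changes the parity, so it is detected; note that this simple scheme cannot say which bit was flipped, which is why the Hamming codes later in the chapter add more than one check bit.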

In your subsequent courses on communication theory you will study each block of Figure 7.2 in detail. You will find that every source of information has a characteristic complexity, called entropy, that determines the minimum rate at which bits must be generated in order to represent the source. You will also find that every communication channel has a characteristic tolerance for bits, called channel capacity. This capacity depends on signal-to-noise ratio and bandwidth. When the channel capacity exceeds the source entropy, then you can transmit information reliably; if it does not, then you cannot.
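
As a small preview of the entropy idea, the sketch below (ours, not the book's) computes the entropy H = -Σ p_i log_2 p_i of a discrete source; the symbol probabilities are made up for illustration.

```python
# Illustrative sketch (not from the text): entropy of a discrete source,
# H = -sum(p_i * log2(p_i)), measured in bits per symbol.
from math import log2

def entropy(probs):
    """Entropy in bits/symbol of a discrete probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

print(entropy([0.7, 0.1, 0.1, 0.1]))       # skewed source: about 1.36 bits/symbol
print(entropy([0.25, 0.25, 0.25, 0.25]))   # uniform source: 2.0 bits/symbol
```

The skewed source needs fewer bits per symbol on average than the uniform one, which is exactly the opening a Huffman code exploits.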

7.3 Binary Codes: From Symbols to Binary Codes


Perhaps the most fundamental idea in communication theory is that arbitrary symbols may be represented by strings of binary digits. These strings are called binary words, binary addresses, or binary codes. In the simplest of cases, a finite alphabet consisting of the letters or symbols s_0, s_1, ..., s_(M-1) is represented by binary codes. The obvious way to implement the representation is to let the ith binary code be the binary representation for the subscript i:

(7.1)
s_0 ↔ 0 0 ... 0 0
s_1 ↔ 0 0 ... 0 1
s_2 ↔ 0 0 ... 1 0
...
s_(M-1) ↔ the N-bit binary representation of M-1

The number of bits required for the binary code is N where

(7.2) 2^(N-1) < M ≤ 2^N.

We say, roughly, that N = log_2 M.
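
A minimal sketch of this indexing scheme, assuming the symbols are simply listed in order (the helper function below is ours, for illustration):

```python
# Sketch of the scheme above (our illustration): the i-th symbol gets the
# N-bit binary representation of its subscript i, with N = ceil(log2 M).
from math import ceil, log2

def binary_codes(symbols):
    """Map each symbol to an N-bit binary code, with N chosen so that M <= 2^N."""
    M = len(symbols)
    N = max(1, ceil(log2(M)))
    return {s: format(i, f"0{N}b") for i, s in enumerate(symbols)}

print(binary_codes(["s0", "s1", "s2", "s3", "s4"]))
# five symbols -> N = 3 bits:
# {'s0': '000', 's1': '001', 's2': '010', 's3': '011', 's4': '100'}
```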

Octal Codes. When the number of symbols is large and the corresponding binary codes contain many bits, then we typically group the bits into groups of three and replace the binary code by its corresponding octal code. For example, a seven-bit binary code maps into a three-digit octal code as follows:

(7.3)
b_6 b_5 b_4 b_3 b_2 b_1 b_0 → (b_6) (b_5 b_4 b_3) (b_2 b_1 b_0) → three octal digits;
for example, the seven-bit code 1001101 groups as 1 001 101 and maps to the octal code '115.
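
Reading the seven bits in groups of three from the right (the leading octal digit uses the single leftmost bit) is the same as printing the word's value with three octal digits. Here is a small sketch, with a function name of our own choosing:

```python
# Sketch (our illustration) of the binary-to-octal grouping in (7.3):
# a seven-bit word maps to a three-digit octal code.

def seven_bit_to_octal(bits: str) -> str:
    """Convert a seven-bit binary string to its three-digit octal code."""
    assert len(bits) == 7 and set(bits) <= {"0", "1"}
    return format(int(bits, 2), "03o")

print(seven_bit_to_octal("1001101"))   # '115', the octal ASCII code for the letter M
```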

The octal ASCII codes for representing letters, numbers, and special characters are tabulated in Table 7.1.

Exercise 1.

Write out the seven-bit ASCII codes for A, q, 7, and {.

Table 7.1. Octal ASCII Codes (from Donald E. Knuth, The TeXbook, ©1986 by the American Mathematical Society, Providence, Rhode Island, p. 367, published by Addison-Wesley Publishing Co.)

         '0   '1   '2   '3   '4   '5   '6   '7
 '00x    (control characters)
 '01x    (control characters)
 '02x    (control characters)
 '03x    (control characters)
 '04x    sp   !    "    #    $    %    &    '
 '05x    (    )    *    +    ,    -    .    /
 '06x    0    1    2    3    4    5    6    7
 '07x    8    9    :    ;    <    =    >    ?
 '10x    @    A    B    C    D    E    F    G
 '11x    H    I    J    K    L    M    N    O
 '12x    P    Q    R    S    T    U    V    W