A Mathematical Theory of Communication by C. E. Shannon


PART IV: THE CONTINUOUS CHANNEL

24. THE CAPACITY OF A CONTINUOUS CHANNEL

In a continuous channel the input or transmitted signals will be continuous functions of time f(t) belonging to a certain set, and the output or received signals will be perturbed versions of these. We will consider only the case where both transmitted and received signals are limited to a certain band W. They can then be specified, for a time T, by 2TW numbers, and their statistical structure by finite dimensional distribution functions. Thus the statistics of the transmitted signal will be determined by

$$P(x_1, \ldots, x_n) = P(x)$$

and those of the noise by the conditional probability distribution

$$P_{x_1, \ldots, x_n}(y_1, \ldots, y_n) = P_x(y).$$

The rate of transmission of information for a continuous channel is defined in a way analogous to that for a discrete channel, namely

$$R = H(x) - H_y(x)$$

where H(x) is the entropy of the input and H_y(x) the equivocation. The channel capacity C is defined as the maximum of R when we vary the input over all possible ensembles. This means that in a finite dimensional approximation we must vary P(x) = P(x_1, \ldots, x_n) and maximize

$$-\int P(x) \log P(x)\, dx + \iint P(x, y) \log \frac{P(x, y)}{P(y)}\, dx\, dy.$$

This can be written

$$\iint P(x, y) \log \frac{P(x, y)}{P(x)P(y)}\, dx\, dy$$

using the fact that $\iint P(x, y) \log P(x)\, dx\, dy = \int P(x) \log P(x)\, dx$. The channel capacity is thus expressed as follows:

$$C = \lim_{T \to \infty} \max_{P(x)} \frac{1}{T} \iint P(x, y) \log \frac{P(x, y)}{P(x)P(y)}\, dx\, dy.$$

It is obvious in this form that R and C are independent of the coordinate system, since the numerator and denominator in $\log \frac{P(x, y)}{P(x)P(y)}$ will be multiplied by the same factors when x and y are transformed in any one-to-one way. This integral expression for C is more general than H(x) - H_y(x). Properly interpreted (see Appendix 7) it will always exist, while H(x) - H_y(x) may assume an indeterminate form $\infty - \infty$ in some cases. This occurs, for example, if x is limited to a surface of fewer dimensions than n in its n dimensional approximation.

If the logarithmic base used in computing H(x) and H_y(x) is two, then C is the maximum number of binary digits that can be sent per second over the channel with arbitrarily small equivocation, just as in the discrete case. This can be seen physically by dividing the space of signals into a large number of small cells, sufficiently small so that the probability density P_x(y) of signal x being perturbed to point y is substantially constant over a cell (either of x or y). If the cells are considered as distinct points the situation is essentially the same as a discrete channel and the proofs used there will apply. But it is clear physically that this quantizing of the volume into individual points cannot in any practical situation alter the final answer significantly, provided the regions are sufficiently small. Thus the capacity will be the limit of the capacities for the discrete subdivisions and this is just the continuous capacity defined above.
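The cell-division argument can be made concrete numerically. The sketch below (an illustration added here, not part of the original derivation) evaluates the rate integral on a fine grid of cells, assuming a one-dimensional Gaussian signal of power P perturbed by independent Gaussian noise of power N; the powers and the grid are arbitrary choices.

```python
import numpy as np

# Illustrative sketch: estimate the integral
#   R = integral of P(x,y) * log[ P(x,y) / (P(x)P(y)) ] dx dy
# by dividing the signal space into small cells, as in the argument above.
# A one-dimensional Gaussian signal of power P and independent Gaussian
# noise of power N are assumed purely for illustration.

P, N = 4.0, 1.0                       # assumed signal and noise powers
grid = np.linspace(-12.0, 12.0, 1201) # cell centers
dx = grid[1] - grid[0]

def gauss(z, var):
    return np.exp(-z**2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

Px = gauss(grid, P)                                           # P(x)
Pxy = Px[:, None] * gauss(grid[None, :] - grid[:, None], N)   # P(x) Q(y - x)
Py = Pxy.sum(axis=0) * dx                                     # P(y) by marginalizing

mask = Pxy > 0
ratio = Pxy[mask] / (Px[:, None] * Py[None, :])[mask]
R = np.sum(Pxy[mask] * np.log2(ratio)) * dx * dx              # bits per sample

print(f"cell-sum estimate of R = {R:.4f} bits per sample")
print(f"0.5*log2(1 + P/N)      = {0.5 * np.log2(1 + P / N):.4f} bits per sample")
```

With a sufficiently fine grid the cell sum approaches 0.5 log2(1 + P/N) bits per sample, the value a Gaussian input achieves for this channel, illustrating that the quantized sum converges to the continuous integral.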

On the mathematical side it can be shown first (see Appendix 7) that if u is the message, x is the signal, y is the received signal (perturbed by noise) and v is the recovered message then

$$H(x) - H_y(x) \ge H(u) - H_v(u)$$

regardless of what operations are performed on u to obtain x or on y to obtain v. Thus no matter how we encode the binary digits to obtain the signal, or how we decode the received signal to recover the message, the discrete rate for the binary digits does not exceed the channel capacity we have defined. On the other hand, it is possible under very general conditions to find a coding system for transmitting binary digits at the rate C with as small an equivocation or frequency of errors as desired. This is true, for example, if, when we take a finite dimensional approximating space for the signal functions, P(x, y) is continuous in both x and y except at a set of points of probability zero.

An important special case occurs when the noise is added to the signal and is independent of it (in the probability sense). Then P_x(y) is a function only of the difference n = y - x,

$$P_x(y) = Q(y - x)$$


and we can assign a definite entropy to the noise (independent of the statistics of the signal), namely the entropy of the distribution Q(n). This entropy will be denoted by H(n).

Theorem 16: If the signal and noise are independent and the received signal is the sum of the transmitted signal and the noise then the rate of transmission is

$$R = H(y) - H(n);$$

i.e., the entropy of the received signal less the entropy of the noise. The channel capacity is

$$C = \max_{P(x)} \bigl[ H(y) - H(n) \bigr].$$

We have, since y = x + n:

$$H(x, y) = H(x, n).$$

Expanding the left side and using the fact that x and n are independent

$$H(y) + H_y(x) = H(x) + H(n).$$

Hence

$$R = H(x) - H_y(x) = H(y) - H(n).$$

Since H(n) is independent of P(x), maximizing R requires maximizing H(y), the entropy of the received signal. If there are certain constraints on the ensemble of transmitted signals, the entropy of the received signal must be maximized subject to these constraints.
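A short numerical check of the relation R = H(y) - H(n) (an illustration added here, not in the original) uses the per-second entropy of a white Gaussian ensemble, W log 2πeS, which is the formula quoted in the next section; the values of P, N and W are assumed.

```python
import math

# Illustrative check that R = H(y) - H(n) equals W*log((P+N)/N) when signal
# and noise are independent white Gaussian ensembles.  The per-second
# entropy W*log(2*pi*e*S) of a white ensemble of power S in band W is the
# formula used in the next section; P, N, W below are assumed values.

P, N, W = 10.0, 2.0, 3000.0          # watts, watts, cycles per second

def entropy_per_second(power, band):
    # Entropy per second of a white Gaussian ensemble (natural-log units).
    return band * math.log(2.0 * math.pi * math.e * power)

R = entropy_per_second(P + N, W) - entropy_per_second(N, W)   # H(y) - H(n)
print(R)                                   # nats per second
print(W * math.log((P + N) / N))           # the same value
```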

25. CHANNEL CAPACITY WITH AN AVERAGE POWER LIMITATION

A simple application of Theorem 16 is the case when the noise is a white thermal noise and the transmitted signals are limited to a certain average power P. Then the received signals have an average power P + N where N is the average noise power. The maximum entropy for the received signals occurs when they also form a white noise ensemble, since this is the greatest possible entropy for a power P + N and can be obtained by a suitable choice of transmitted signals, namely if they form a white noise ensemble of power P. The entropy (per second) of the received ensemble is then

$$H(y) = W \log 2\pi e (P + N),$$

and the noise entropy is

$$H(n) = W \log 2\pi e N.$$

The channel capacity is

$$C = H(y) - H(n) = W \log \frac{P + N}{N}.$$

Summarizing we have the following:

Theorem 17: The capacity of a channel of band W perturbed by white thermal noise power N when the average transmitter power is limited to P is given by

$$C = W \log \frac{P + N}{N}.$$

This means that by sufficiently involved encoding systems we can transmit binary digits at the rate $W \log_2 \frac{P + N}{N}$ bits per second, with arbitrarily small frequency of errors. It is not possible to transmit at a higher rate by any encoding system without a definite positive frequency of errors.
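Written out as a function (an illustrative sketch; the 3000-cycle band and 30 dB signal-to-noise ratio in the example call are assumed values, not taken from the text):

```python
import math

# Theorem 17 as a function: with base-2 logarithms the capacity
# W*log2((P+N)/N) is in binary digits per second.  The example figures
# (a 3000-cycle band at a 30 dB signal-to-noise ratio) are assumptions
# chosen only to exercise the formula.

def capacity_white_noise(W, P, N):
    """Capacity in bits per second of a band-W channel with white noise."""
    return W * math.log2((P + N) / N)

W = 3000.0                        # band in cycles per second
P_over_N = 10.0 ** (30.0 / 10.0)  # 30 dB signal-to-noise ratio
print(capacity_white_noise(W, P_over_N, 1.0))   # roughly 29,900 bits per second
```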

To approximate this limiting rate of transmission the transmitted signals must approximate, in statistical properties, a white noise.6 A system which approaches the ideal rate may be described as follows: Let M = 2^s samples of white noise be constructed, each of duration T. These are assigned binary numbers from 0 to M - 1. At the transmitter the message sequences are broken up into groups of s and for each group the corresponding noise sample is transmitted as the signal. At the receiver the M samples are known and the actual received signal (perturbed by noise) is compared with each of them. The sample which has the least R.M.S. discrepancy from the received signal is chosen as the transmitted signal and the corresponding binary number reconstructed. This process amounts to choosing the most probable (a posteriori) signal.

The number M of noise samples used will depend on the tolerable frequency ε of errors, but for almost all selections of samples we have

$$\lim_{\epsilon \to 0} \lim_{T \to \infty} \frac{\log M(\epsilon, T)}{T} = W \log \frac{P + N}{N},$$

so that no matter how small ε is chosen, we can, by taking T sufficiently large, transmit as near as we wish to $TW \log \frac{P + N}{N}$ binary digits in the time T.

6This and other properties of the white noise case are discussed from the geometrical point of view in “Communication in the Presence of Noise,” loc. cit.
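The scheme just described is straightforward to simulate. The sketch below is an illustration only: the block length, number of message groups, powers and trial count are assumed values, and Gaussian samples stand in for both the white noise codebook and the channel noise.

```python
import numpy as np

# Simulation sketch of the scheme described above: M = 2**s white noise
# samples serve as signals and the receiver picks the one with least
# R.M.S. discrepancy from what it received.  The block length, powers,
# and trial count are illustrative assumptions.

rng = np.random.default_rng(0)

s, n_dims = 8, 200               # s binary digits per group, 2TW coordinates
P, N = 4.0, 1.0                  # assumed signal and noise powers
M = 2 ** s

# codebook of M white noise samples, each of power P
codebook = rng.normal(0.0, np.sqrt(P), size=(M, n_dims))

trials, errors = 500, 0
for _ in range(trials):
    k = rng.integers(M)                                   # message group 0..M-1
    received = codebook[k] + rng.normal(0.0, np.sqrt(N), n_dims)
    # least R.M.S. discrepancy = nearest codeword in Euclidean distance
    guess = int(np.argmin(np.sum((codebook - received) ** 2, axis=1)))
    errors += guess != k

rate = s / n_dims                                          # bits per coordinate
print(f"rate used        = {rate:.3f} bits per coordinate")
print(f"white-noise capacity = {0.5 * np.log2(1 + P / N):.3f} bits per coordinate")
print(f"error frequency  = {errors / trials:.3f}")
```

Since the rate used here is well below 0.5 log2(1 + P/N) bits per coordinate, the observed error frequency is essentially zero; pushing the rate toward capacity requires much longer blocks, in line with the double limit above.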

Formulas similar to $C = W \log \frac{P + N}{N}$ for the white noise case have been developed independently by several other writers, although with somewhat different interpretations. We may mention the work of N. Wiener,7 W. G. Tuller,8 and H. Sullivan in this connection.

In the case of an arbitrary perturbing noise (not necessarily white thermal noise) it does not appear that the maximizing problem involved in determining the channel capacity C can be solved explicitly. However, upper and lower bounds can be set for C in terms of the average noise power N and the noise entropy power N_1. These bounds are sufficiently close together in most practical cases to furnish a satisfactory solution to the problem.

Theorem 18: The capacity of a channel of band W perturbed by an arbitrary noise is bounded by the inequalities

$$W \log \frac{P + N_1}{N_1} \le C \le W \log \frac{P + N}{N_1}$$

where

P = average transmitter power
N = average noise power
N_1 = entropy power of the noise.
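Once P, N and N_1 are known, both bounds are immediate to compute; a small sketch follows (the numerical values in the example call are assumptions, with N_1 at most N as always holds).

```python
import math

# The two bounds of Theorem 18 as a function of band W, transmitter power P,
# average noise power N, and noise entropy power N1.  The numbers in the
# example call are assumptions for illustration (N1 <= N always holds).

def capacity_bounds(W, P, N, N1):
    """Return (lower, upper) bounds on the capacity, in bits per second."""
    lower = W * math.log2((P + N1) / N1)
    upper = W * math.log2((P + N) / N1)
    return lower, upper

low, high = capacity_bounds(W=3000.0, P=10.0, N=2.0, N1=1.5)
print(low, high)    # the two bounds draw together as P grows relative to N
```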

Here again the average power of the perturbed signals will be P + N. The maximum entropy for this power would occur if the received signal were white noise and would be $W \log 2\pi e (P + N)$. It may not be possible to achieve this; i.e., there may not be any ensemble of transmitted signals which, added to the perturbing noise, produce a white thermal noise at the receiver, but at least this sets an upper bound to H(y).

We have, therefore

$$C = \max \bigl[ H(y) - H(n) \bigr] \le W \log 2\pi e (P + N) - W \log 2\pi e N_1.$$

This is the upper limit given in the theorem. The lower limit can be obtained by considering the rate if we make the transmitted signal a white noise of power P. In this case the entropy power of the received signal must be at least as great as that of a white noise of power P + N_1, since we have shown in a previous theorem that the entropy power of the sum of two ensembles is greater than or equal to the sum of the individual entropy powers. Hence

$$\max H(y) \ge W \log 2\pi e (P + N_1)$$

7 Cybernetics, loc. cit.

8“Theoretical Limitations on the Rate of Transmission of Information,” Proceedings of the Institute of Radio Engineers, v. 37, No. 5, May, 1949, pp. 468–78.


and

$$C \ge W \log 2\pi e (P + N_1) - W \log 2\pi e N_1 = W \log \frac{P + N_1}{N_1}.$$

As P increases, the upper and lower bounds approach each other, so we have as an asymptotic rate

$$W \log \frac{P + N}{N_1}.$$

If the noise is itself white, N = N_1 and the result reduces to the formula proved previously:

$$C = W \log \left( 1 + \frac{P}{N} \right).$$

If the noise is Gaussian but with a spectrum which is not necessarily flat, N_1 is the geometric mean of the noise power over the various frequencies in the band W. Thus

$$N_1 = \exp \frac{1}{W} \int_W \log N(f)\, df$$

where N(f) is the noise power at frequency f.
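The geometric mean is easy to evaluate numerically for a given spectrum. In the sketch below the sloping spectrum N(f) is an assumed example, not taken from the text.

```python
import numpy as np

# Entropy power of a Gaussian noise with a non-flat spectrum N(f):
#   N1 = exp( (1/W) * integral over the band of log N(f) df ),
# i.e. the geometric mean of the noise power over the band.  The sloping
# spectrum below is an assumed example.

W = 3000.0                               # band in cycles per second
f = np.linspace(0.0, W, 10001)
N_f = 1.0 + 2.0 * f / W                  # example spectrum, rising from 1 to 3

N_avg = N_f.mean()                       # approximates (1/W) * integral of N(f) df
N1 = np.exp(np.log(N_f).mean())          # geometric mean = entropy power

print(f"average noise power N  = {N_avg:.3f}")
print(f"entropy power       N1 = {N1:.3f}")   # N1 <= N, consistent with Theorem 18
```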

Theorem 19: If we set the capacity for a given transmitter power P equal to

$$C = W \log \frac{P + N - \eta}{N_1}$$