Fast Fourier Transforms (6x9 Version) by C. Sidney Burrus - HTML preview

PLEASE NOTE: This is an HTML preview only and some elements such as links or page numbers may be incorrect.
Download the book in PDF, ePub for a complete version.

Chapter 5

The DFT as Convolution or

Filtering1

A major application of the FFT is fast convolution or fast filtering

where the DFT of the signal is multiplied term-by-term by the DFT

of the impulse (helps to be doing finite impulse response (FIR) fil-

tering) and the time-domain output is obtained by taking the in-

verse DFT of that product. What is less well-known is the DFT

can be calculated by convolution. There are several different ap-

proaches to this, each with different application.

5.1 Rader’s Conversion of the DFT into

Convolution

In this section a method quite different from the index mapping or

polynomial evaluation is developed. Rather than dealing with the

DFT directly, it is converted into a cyclic convolution which must

then be carried out by some efficient means. Those means will

be covered later, but here the conversion will be explained. This

method requires use of some number theory, which can be found

1This content is available online at <http://cnx.org/content/m16328/1.9/>.

Available for free at Connexions

<http://cnx.org/content/col10683/1.5>

35

CHAPTER 5. THE DFT AS

36

CONVOLUTION OR FILTERING

in an accessible form in [234] or [262] and is easy enough to verify

on one’s own. A good general reference on number theory is [259].

The DFT and cyclic convolution are defined by

N−1

C (k) = ∑ x(n) W nk

(5.1)

n=0

N−1

y (k) = ∑ x(n) h(k − n)

(5.2)

n=0

For both, the indices are evaluated modulo N. In order to convert

the DFT in (5.1) into the cyclic convolution of (5.2), the nk prod-

uct must be changed to the k − n difference. With real numbers,

this can be done with logarithms, but it is more complicated when

working in a finite set of integers modulo N. From number theory

[28], [234], [262], [259], it can be shown that if the modulus is a

prime number, a base (called a primitive root) exists such that a

form of integer logarithm can be defined. This is stated in the fol-

lowing way. If N is a prime number, a number r called a primitive

roots exists such that the integer equation

n = ((rm))

(5.3)

N

creates a unique, one-to-one map of the N − 1 member set m =

{0, ..., N − 2} and the N − 1 member set n = {1, ..., N − 1}. This is

because the multiplicative group of integers modulo a prime, p, is

isomorphic to the additive group of integers modulo (p − 1) and is

illustrated for N = 5 below.

Available for free at Connexions

<http://cnx.org/content/col10683/1.5>

37

r

m=

0

1

2

3

4

5

6

7

1

1

1

1

1

1

1

1

1

2

1

2

4

3

1

2

4

3

3

1

3

4

2

1

3

4

2

4

1

4

1

4

1

4

1

4

5

*

0

0

0

*

0

0

0

6

1

1

1

1

1

1

1

1

Table 5.1: Table of Integers n = ((rm)) modulo 5, [* not defined]

Table 5.1 is an array of values of rm modulo N and it is easy to

see that there are two primitive roots, 2 and 3, and (5.3) defines

a permutation of the integers n from the integers m (except for

zero). (5.3) and a primitive root (usually chosen to be the smallest

of those that exist) can be used to convert the DFT in (5.1) to the

convolution in (5.2). Since (5.3) cannot give a zero, a new length-

(N-1) data sequence is defined from x (n) by removing the term

with index zero. Let

n = r−m

(5.4)

and

k = rs

(5.5)

where the term with the negative exponent (the inverse) is defined

as the integer that satisfies

r−mrm

= 1

(5.6)

N

If N is a prime number, r−m always exists.

For example,

Available for free at Connexions

<http://cnx.org/content/col10683/1.5>

CHAPTER 5. THE DFT AS

38

CONVOLUTION OR FILTERING

2−1

= 3. (5.1) now becomes

5

N−2

C (rs) = ∑ x r−m W r−mrs + x(0),

(5.7)

m=0

for s = 0, 1, .., N − 2, and

N−1

C (0) = ∑ x(n)

(5.8)

n=0

New functions are defined, which are simply a permutation in the

order of the original functions, as

x’ (m) = x r−m , C’ (s) = C (rs) , W ’ (n) = W rn

(5.9)

(5.7) then becomes

N−2

C’ (s) = ∑ x’ (m) W ’ (s − m) + x(0)

(5.10)

m=0

which is cyclic convolution of length N-1 (plus x (0)) and is de-

noted as

C’ (k) = x’ (k) ∗ W ’ (k) + x (0)

(5.11)

Applying this change of variables (use of logarithms) to the DFT

can best be illustrated from the matrix formulation of the DFT.

(5.1) is written for a length-5 DFT as

 

C (0)

0 0 0 0 0

x (0)

 

 C (1) 

 0

1 2 3 4   x (1) 

 

 

 C (2)  =  0

2 4 1 3   x (2) 

(5.12)

 

 

 C (3) 

 0

3 1 4 2   x (3) 

 

C (4)

0 4 3 2 1

x (4)

Available for free at Connexions

<http://cnx.org/content/col10683/1.5>

39

where the square matrix should contain the terms of W nk but for

clarity, only the exponents nk are shown. Separating the x (0) term,

applying the mapping of (5.9), and using the primitive roots r = 2

(and r−1 = 3) gives

 

C (1)

1 3 4 2

x (1)

x (0)

 

 C (2) 

 2

1 3 4   x (3) 

 x (0) 

 = 

 

 + 

(5.13)

 

 C (4) 

 4

2 1 3   x (4) 

 x (0) 

 

C (3)

3 4 2 1

x (2)

x (0)

and

C (0) = x (0) + x (1) + x (2) + x (3) + x (4)

(5.14)

which can be seen to be a reordering of the structure in (5.12).

This is in the form of cyclic convolution as indicated in (5.10).

Rader first showed this in 1968 [234], stating that a prime length-

N DFT could be converted into a length-(N-1) cyclic convolution

of a permutation of the data with a permutation of the W’s. He

also stated that a slightly more complicated version of the same

idea would work for a DFT with a length equal to an odd prime to

a power. The details of that theory can be found in [234], [169].

Until 1976, this conversion approach received little attention since

it seemed to offer few advantages. It has specialized applications in

calculating the DFT if the cyclic convolution is done by distributed

arithmetic table look-up [77] or by use of number theoretic trans-

forms [28], [234], [262]. It and the Goertzel algorithm [273], [62]

are efficient when only a few DFT values need to be calculated.

It may also have advantages when used with pipelined or vector

hardware designed for fast inner products. One example is the

TMS320 signal processing microprocessor which is pipelined for

inner products. The general use of this scheme emerged when new

fast cyclic convolution algorithms were developed by Winograd

[405].

Available for free at Connexions

<http://cnx.org/content/col10683/1.5>

CHAPTER 5. THE DFT AS

40

CONVOLUTION OR FILTERING

5.2 The Chirp Z-Transform (or Bluestein’s

Algorithm)

The DFT of x (n) evaluates the Z-transform of x (n) on N equally

spaced points on the unit circle in the z plane. Using a nonlinear

change of variables, one can create a structure which is equivalent

to modulation and filtering x (n) by a “chirp" signal. [34], [306],

[298], [273], [304], [62].

The mathematical identity (k − n)2 = k2 − 2kn + n2 gives

nk = n2 − (k − n)2 + k2 /2

(5.15)

which substituted into the definition of the DFT in Multidimen-

sional Index Mapping: Equation 1 (3.1) gives

N−1

C (k) = { ∑ x(n) W n2/2 W −(k−n)2/2} W k2/2

(5.16)

n=0

This equation can be interpreted as first multiplying (modulating)

the data x (n) by a chirp sequence (W n2/2, then convolving (filter-

ing) it, then finally multiplying the filter output by the chirp se-

quence to give the DFT.

Define the chirp sequence or signal as h (n) = W n2/2 which is called

a chirp because the squared exponent gives a sinusoid with chang-

ing frequency. Using this definition, (5.16) becomes

C (n) = {[x (n) h (n)] ∗ h−1} h (n)

(5.17)

We know that convolution can be carried out by multiplying the

DFTs of the signals, here we see that evaluation of the DFT can

be carried out by convolution. Indeed, the convolution represented

by ∗ in (5.17) can be carried out by DFTs (actually FFTs) of a

larger length. This allows a prime length DFT to be calculated by

a very efficient length-2M FFT. This becomes practical for large N

Available for free at Connexions

<http://cnx.org/content/col10683/1.5>

41

when a particular non-composite (or N with few factors) length is

required.

As developed here, the chirp z-transform evaluates the z-transform

at equally spaced points on the unit circle. A slight modification

allows evaluation on a spiral and in segments [298], [273] and al-

lows savings with only some input values are nonzero or when only

some output values are needed. The story of the development of

this transform is given in [304].

Two Matlab programs to calculate an arbitrary length DFT using

the chirp z-transform is shown in p. ??.

Available for free at Connexions

<http://cnx.org/content/col10683/1.5>

CHAPTER 5. THE DFT AS

42

CONVOLUTION OR FILTERING

❢✉♥❝t✐♦♥ ② ❂ ❝❤✐r♣❝✭①✮❀

✪ ❢✉♥❝t✐♦♥ ② ❂ ❝❤✐r♣❝✭①✮

✪ ❝♦♠♣✉t❡s ❛♥ ❛r❜✐tr❛r②✲❧❡♥❣t❤ ❉❋❚ ✇✐t❤ t❤❡

✪ ❝❤✐r♣ ③✲tr❛♥s❢♦r♠ ❛❧❣♦r✐t❤♠✳ ❝s❜✳ ✻✴✶✷✴✾✶

✪◆ ❂ ❧❡♥❣t❤✭①✮❀ ♥ ❂ ✵✿◆✲✶❀

✪❙❡q✉❡♥❝❡ ❧❡♥❣t❤

❲ ❂ ❡①♣✭✲❥✯♣✐✯♥✳✯♥✴◆✮❀

✪❈❤✐r♣ s✐❣♥❛❧

①✇ ❂ ①✳✯❲❀

✪▼♦❞✉❧❛t❡ ✇✐t❤ ❝❤✐r♣

❲❲ ❂ ❬❝♦♥❥✭❲✭◆✿✲✶✿✷✮✮✱❝♦♥❥✭❲✮❪❀ ✪❈♦♥str✉❝t ❢✐❧t❡r

② ❂ ❝♦♥✈✭❲❲✱①✇✮❀

✪❈♦♥✈♦❧✈❡ ✇ ❢✐❧t❡r

② ❂ ②✭◆✿✷✯◆✲✶✮✳✯❲❀

✪❉❡♠♦❞✉❧❛t❡ ✇ ❝❤✐r♣

❢✉♥❝t✐♦♥ ② ❂ ❝❤✐r♣✭①✮❀

✪ ❢✉♥❝t✐♦♥ ② ❂ ❝❤✐r♣✭①✮

✪ ❝♦♠♣✉t❡s ❛♥ ❛r❜✐tr❛r②✲❧❡♥❣t❤ ❉✐s❝r❡t❡ ❋♦✉r✐❡r ❚r❛♥s❢♦r♠ ✭❉❋❚✮

✪ ✇✐t❤ t❤❡ ❝❤✐r♣ ③ tr❛♥s❢♦r♠ ❛❧❣♦r✐t❤♠✳ ❚❤❡ ❧✐♥❡❛r ❝♦♥✈♦❧✉t✐♦♥

✪ t❤❡♥ r❡q✉✐r❡❞ ✐s ❞♦♥❡ ✇✐t❤ ❋❋❚s✳

✪ ✶✾✽✽✿ ▲✳ ❆r❡✈❛❧♦❀ ✶✶✳✵✻✳✾✶ ❑✳ ❙❝❤✇❛r③✱ ▲◆❚ ❊r❧❛♥❣❡♥❀ ✻✴✶✷✴✾✶ ❝s❜✳

✪◆ ❂ ❧❡♥❣t❤✭①✮❀

✪❙❡q✉❡♥❝❡ ❧❡♥❣t❤

❂ ✷❫❝❡✐❧✭❧♦❣✭✭✷✯◆✲✶✮✮✴❧♦❣✭✷✮✮❀ ✪❋❋❚ ❧❡♥❣t❤

❂ ✵✿◆✲✶❀

❂ ❡①♣✭✲❥✯♣✐✯♥✳✯♥✴◆✮❀

✪❈❤✐r♣ s✐❣♥❛❧

❋❲ ❂ ❢❢t✭❬❝♦♥❥✭❲✮✱ ③❡r♦s✭✶✱▲✲✷✯◆✰✶✮✱ ❝♦♥❥✭❲✭◆✿✲✶✿✷✮✮❪✱▲✮❀

❂ ✐❢❢t✭❋❲✳✯❢❢t✭①✳✬✳✯❲✱▲✮✮❀ ✪❈♦♥✈♦❧✈❡ ✉s✐♥❣ ❋❋❚

❂ ②✭✶✿◆✮✳✯❲❀

✪❉❡♠♦❞✉❧❛t❡

Figure 5.1

Available for free at Connexions

<http://cnx.org/content/col10683/1.5>

43

5.3 Goertzel’s Algorithm (or A Better DFT

Algorithm)

Goertzel’s algorithm [144], [62], [269] is another methods that cal-

culates the DFT by converting it into a digital filtering problem.

The method looks at the calculation of the DFT as the evaluation of

a polynomial on the unit circle in the complex plane. This evalua-

tion is done by Horner’s method which is implemented recursively

by an IIR filter.

5.3.1 The First-Order Goertzel Algorithm

The polynomial whose values on the unit circle are the DFT is a

slightly modified z-transform of x(n) given by

N−1

X (z) = ∑ x(n) z−n

(5.18)

n=0

which for clarity in this development uses a positive exponent .

This is illustrated for a length-4 sequence as a third-order polyno-

mial by

X (z) = x (3) z3 + x (2) z2 + x (1) z + x (0)

(5.19)

The DFT is found by evaluating (5.18) at z = W k, which can be

written as

C (k) = X (z) |z=Wk = DFT {x (n)}

(5.20)

where

W = e− j2π/N

(5.21)

The most efficient way of evaluating a general polynomial without

any pre-processing is by “Horner’s rule" [208] which is a nested

Available for free at Connexions

<http://cnx.org/content/col10683/1.5>

CHAPTER 5. THE DFT AS

44

CONVOLUTION OR FILTERING

evaluation. This is illustrated for the polynomial in (5.19) by

X (z) = {[x (3) z + x (2)] z + x (1)}z + x (0)

(5.22)

This nested sequence of operations can be written as a linear dif-

ference equation in the form of

y (m) = z y (m − 1) + x (N − m)

(5.23)

with initial condition y (0) = 0, and the desired result being the

solution at m = N. The value of the polynomial is given by

X (z) = y (N) .

(5.24)

(5.23) can be viewed as a first-order IIR filter with the input being

the data sequence in reverse order and the value of the polynomial

at z being the filter output sampled at m = N. Applying this to the

DFT gives the Goertzel algorithm [283], [269] which is

y (m) = W ky (m − 1) + x (N − m)

(5.25)

with y (0) = 0 and

C (k) = y (N)

(5.26)

where

N−1

C (k) = ∑ x(n) W nk.

(5.27)

n=0

The flowgraph of the algorithm can be found in [62], [269] and a

simple FORTRAN program is given in the appendix.

When comparing this program with the direct calculation of (5.27),

it is seen that the number of floating-point multiplications and ad-

ditions are the same. In fact, the structures of the two algorithms

look similar, but close examination shows that the way the sines

and cosines enter the calculations is different. In (5.27), new sine

Available for free at Connexions

<http://cnx.org/content/col10683/1.5>

45

and cosine values are calculated for each frequency and for each

data value, while for the Goertzel algorithm in (5.25), they are cal-

culated only for each frequency in the outer loop. Because of the

recursive or feedback nature of the algorithm, the sine and cosine

values are “updated" each loop rather than recalculated. This re-

sults in 2N trigonometric evaluations rather than 2N2. It also re-

sults in an increase in accumulated quantization error.

It is possible to modify this algorithm to allow entering the data

in forward order rather than reverse order. The difference (5.23)

becomes

y (m) = z−1y (m − 1) + x (m − 1)

(5.28)

if (5.24) becomes

C (k) = zN−1 y (N)

(5.29)

for y (0) = 0. This is the algorithm programmed later.

5.3.2 The Second-Order Goertzel Algorithm

One of the reasons the first-order Goertzel algorithm does not im-

prove efficiency is that the constant in the feedback or recursive

path is complex and, therefore, requires four real multiplications

and two real additions. A modification of the scheme to make it

second-order removes the complex multiplications and reduces the

number of required multiplications by two.

Define the variable q (m) so that

y (m) = q (m) − z−1 q (m − 1) .

(5.30)

This substituted into the right-hand side of (5.23) gives

y (m) = z q (m − 1) − q (m − 2) + x (N − m) .

(5.31)

Available for free at Connexions

<http://cnx.org/content/col10683/1.5>

CHAPTER 5. THE DFT AS

46

CONVOLUTION OR FILTERING

Combining (5.30) and (5.31) gives the second order difference

equation

q (m) = z + z−1 q (m − 1) − q (m − 2) + x (N − m)

(5.32)

which together with the output (5.30), comprise the second-order

Goertzel algorithm where

X (z) = y (N)

(5.33)

for initial conditions q (0) = q (−1) = 0.

A similar development starting with (5.28) gives a second-order

algorithm with forward ordered input as

q (m) = z + z−1 q (m − 1) − q (m − 2) + x (m − 1)

(5.34)

y (m) = q (m) − z q (−1)

(5.35)

with

X (z) = zN−1 y (N)

(5.36)

and for q (0) = q (−1) = 0.

Note that both difference (5.32) and (5.34) are not changed if z is

replaced with z−1, only the output (5.30) and (5.35) are different.

This means that the polynomial X (z) may be evaluated at a partic-

ular z and its inverse z−1 from one solution of the difference (5.32)

or (5.34) using the output equations

X (z) = q (N) − z−1 q (N − 1)

(5.37)

and

X (1/z) = zN−1 (q (N) − z q (N − 1)) .

(5.38)

Clearly, this allows the DFT of a sequence to be calculated with

half the arithmetic since the outputs are calculated two at a time.

Available for free at Connexions

<http://cnx.org/content/col10683/1.5>

47

The second-order DE actually produces a solution q (m) that con-

tains two first-order components. The output equations are, in ef-

fect, zeros that cancel one or the other pole of the second-order

solution to give the desired first-order solution. In addition to al-

lowing the calculating of two outputs at a time, the second-order

DE requires half the number of real multiplications as the first-

order form. This is because the coefficient of the q (m − 2) is unity

and the coefficient of the q (m − 1) is real if z and z−1 are complex

conjugates of each other which is true for the DFT.

5.3.3 Analysis of Arithmetic Complexity and Tim-

ings

Analysis of the various forms of the Goertzel algorithm from their

programs gives the following operation count for real multiplica-

tions and real additions assuming real data.

Algorithm

Real Mults.

Real Adds

Trig Eval.

Direct DFT

4 N2

4 N2

2 N2

First-Order

4 N2

4 N2 − 2N

2 N

Second-Order

2 N2 + 2N

4 N2

2 N

Second-Order 2

N2 + N

2 N2 + N

N

Table 5.2

Timings of the algorithms on a PC in milliseconds are given in the

following table.

Available for free at Connexions

<http://cnx.org/content/col10683/1.5>

CHAPTER 5. THE DFT AS

48

CONVOLUTION OR FILTERING

Algorithm

N = 125

N = 257

Direct DFT

4.90

19.83

First-Order

4.01

16.70

Second-Order

2.64

11.04

Second-Order 2

1.32

5.55

Table 5.3

These timings track the floating point operation counts fairly well.

5.3.4 Conclusions

Goertzel’s algorithm in its first-order form is not particularly in-

teresting, but the two-at-a-time second-order form is significantly

faster than a direct DFT. It can also be used for any polynomial

evaluation or for the DTFT at unequally spaced values or for eval-

uating a few DFT terms. A very interesting observation is that the