Filtering1
A major application of the FFT is fast convolution or fast filtering
where the DFT of the signal is multiplied term-by-term by the DFT
of the impulse (helps to be doing finite impulse response (FIR) fil-
tering) and the time-domain output is obtained by taking the in-
verse DFT of that product. What is less well-known is the DFT
can be calculated by convolution. There are several different ap-
proaches to this, each with different application.
5.1 Rader’s Conversion of the DFT into
Convolution
In this section a method quite different from the index mapping or
polynomial evaluation is developed. Rather than dealing with the
DFT directly, it is converted into a cyclic convolution which must
then be carried out by some efficient means. Those means will
be covered later, but here the conversion will be explained. This
method requires use of some number theory, which can be found
1This content is available online at <http://cnx.org/content/m16328/1.9/>.
Available for free at Connexions
<http://cnx.org/content/col10683/1.5>
35
CHAPTER 5. THE DFT AS
36
CONVOLUTION OR FILTERING
in an accessible form in [234] or [262] and is easy enough to verify
on one’s own. A good general reference on number theory is [259].
The DFT and cyclic convolution are defined by
N−1
C (k) = ∑ x(n) W nk
(5.1)
n=0
N−1
y (k) = ∑ x(n) h(k − n)
(5.2)
n=0
For both, the indices are evaluated modulo N. In order to convert
the DFT in (5.1) into the cyclic convolution of (5.2), the nk prod-
uct must be changed to the k − n difference. With real numbers,
this can be done with logarithms, but it is more complicated when
working in a finite set of integers modulo N. From number theory
[28], [234], [262], [259], it can be shown that if the modulus is a
prime number, a base (called a primitive root) exists such that a
form of integer logarithm can be defined. This is stated in the fol-
lowing way. If N is a prime number, a number r called a primitive
roots exists such that the integer equation
n = ((rm))
(5.3)
N
creates a unique, one-to-one map of the N − 1 member set m =
{0, ..., N − 2} and the N − 1 member set n = {1, ..., N − 1}. This is
because the multiplicative group of integers modulo a prime, p, is
isomorphic to the additive group of integers modulo (p − 1) and is
illustrated for N = 5 below.
Available for free at Connexions
<http://cnx.org/content/col10683/1.5>
37
r
m=
0
1
2
3
4
5
6
7
1
1
1
1
1
1
1
1
1
2
1
2
4
3
1
2
4
3
3
1
3
4
2
1
3
4
2
4
1
4
1
4
1
4
1
4
5
*
0
0
0
*
0
0
0
6
1
1
1
1
1
1
1
1
Table 5.1: Table of Integers n = ((rm)) modulo 5, [* not defined]
Table 5.1 is an array of values of rm modulo N and it is easy to
see that there are two primitive roots, 2 and 3, and (5.3) defines
a permutation of the integers n from the integers m (except for
zero). (5.3) and a primitive root (usually chosen to be the smallest
of those that exist) can be used to convert the DFT in (5.1) to the
convolution in (5.2). Since (5.3) cannot give a zero, a new length-
(N-1) data sequence is defined from x (n) by removing the term
with index zero. Let
n = r−m
(5.4)
and
k = rs
(5.5)
where the term with the negative exponent (the inverse) is defined
as the integer that satisfies
r−mrm
= 1
(5.6)
N
If N is a prime number, r−m always exists.
For example,
Available for free at Connexions
<http://cnx.org/content/col10683/1.5>
CHAPTER 5. THE DFT AS
38
CONVOLUTION OR FILTERING
2−1
= 3. (5.1) now becomes
5
N−2
C (rs) = ∑ x r−m W r−mrs + x(0),
(5.7)
m=0
for s = 0, 1, .., N − 2, and
N−1
C (0) = ∑ x(n)
(5.8)
n=0
New functions are defined, which are simply a permutation in the
order of the original functions, as
x’ (m) = x r−m , C’ (s) = C (rs) , W ’ (n) = W rn
(5.9)
(5.7) then becomes
N−2
C’ (s) = ∑ x’ (m) W ’ (s − m) + x(0)
(5.10)
m=0
which is cyclic convolution of length N-1 (plus x (0)) and is de-
noted as
C’ (k) = x’ (k) ∗ W ’ (k) + x (0)
(5.11)
Applying this change of variables (use of logarithms) to the DFT
can best be illustrated from the matrix formulation of the DFT.
(5.1) is written for a length-5 DFT as
C (0)
0 0 0 0 0
x (0)
C (1)
0
1 2 3 4 x (1)
C (2) = 0
2 4 1 3 x (2)
(5.12)
C (3)
0
3 1 4 2 x (3)
C (4)
0 4 3 2 1
x (4)
Available for free at Connexions
<http://cnx.org/content/col10683/1.5>
39
where the square matrix should contain the terms of W nk but for
clarity, only the exponents nk are shown. Separating the x (0) term,
applying the mapping of (5.9), and using the primitive roots r = 2
(and r−1 = 3) gives
C (1)
1 3 4 2
x (1)
x (0)
C (2)
2
1 3 4 x (3)
x (0)
=
+
(5.13)
C (4)
4
2 1 3 x (4)
x (0)
C (3)
3 4 2 1
x (2)
x (0)
and
C (0) = x (0) + x (1) + x (2) + x (3) + x (4)
(5.14)
which can be seen to be a reordering of the structure in (5.12).
This is in the form of cyclic convolution as indicated in (5.10).
Rader first showed this in 1968 [234], stating that a prime length-
N DFT could be converted into a length-(N-1) cyclic convolution
of a permutation of the data with a permutation of the W’s. He
also stated that a slightly more complicated version of the same
idea would work for a DFT with a length equal to an odd prime to
a power. The details of that theory can be found in [234], [169].
Until 1976, this conversion approach received little attention since
it seemed to offer few advantages. It has specialized applications in
calculating the DFT if the cyclic convolution is done by distributed
arithmetic table look-up [77] or by use of number theoretic trans-
forms [28], [234], [262]. It and the Goertzel algorithm [273], [62]
are efficient when only a few DFT values need to be calculated.
It may also have advantages when used with pipelined or vector
hardware designed for fast inner products. One example is the
TMS320 signal processing microprocessor which is pipelined for
inner products. The general use of this scheme emerged when new
fast cyclic convolution algorithms were developed by Winograd
[405].
Available for free at Connexions
<http://cnx.org/content/col10683/1.5>
CHAPTER 5. THE DFT AS
40
CONVOLUTION OR FILTERING
5.2 The Chirp Z-Transform (or Bluestein’s
Algorithm)
The DFT of x (n) evaluates the Z-transform of x (n) on N equally
spaced points on the unit circle in the z plane. Using a nonlinear
change of variables, one can create a structure which is equivalent
to modulation and filtering x (n) by a “chirp" signal. [34], [306],
[298], [273], [304], [62].
The mathematical identity (k − n)2 = k2 − 2kn + n2 gives
nk = n2 − (k − n)2 + k2 /2
(5.15)
which substituted into the definition of the DFT in Multidimen-
sional Index Mapping: Equation 1 (3.1) gives
N−1
C (k) = { ∑ x(n) W n2/2 W −(k−n)2/2} W k2/2
(5.16)
n=0
This equation can be interpreted as first multiplying (modulating)
the data x (n) by a chirp sequence (W n2/2, then convolving (filter-
ing) it, then finally multiplying the filter output by the chirp se-
quence to give the DFT.
Define the chirp sequence or signal as h (n) = W n2/2 which is called
a chirp because the squared exponent gives a sinusoid with chang-
ing frequency. Using this definition, (5.16) becomes
C (n) = {[x (n) h (n)] ∗ h−1} h (n)
(5.17)
We know that convolution can be carried out by multiplying the
DFTs of the signals, here we see that evaluation of the DFT can
be carried out by convolution. Indeed, the convolution represented
by ∗ in (5.17) can be carried out by DFTs (actually FFTs) of a
larger length. This allows a prime length DFT to be calculated by
a very efficient length-2M FFT. This becomes practical for large N
Available for free at Connexions
<http://cnx.org/content/col10683/1.5>
41
when a particular non-composite (or N with few factors) length is
required.
As developed here, the chirp z-transform evaluates the z-transform
at equally spaced points on the unit circle. A slight modification
allows evaluation on a spiral and in segments [298], [273] and al-
lows savings with only some input values are nonzero or when only
some output values are needed. The story of the development of
this transform is given in [304].
Two Matlab programs to calculate an arbitrary length DFT using
the chirp z-transform is shown in p. ??.
Available for free at Connexions
<http://cnx.org/content/col10683/1.5>
CHAPTER 5. THE DFT AS
42
CONVOLUTION OR FILTERING
❢✉♥❝t✐♦♥ ② ❂ ❝❤✐r♣❝✭①✮❀
✪ ❢✉♥❝t✐♦♥ ② ❂ ❝❤✐r♣❝✭①✮
✪ ❝♦♠♣✉t❡s ❛♥ ❛r❜✐tr❛r②✲❧❡♥❣t❤ ❉❋❚ ✇✐t❤ t❤❡
✪ ❝❤✐r♣ ③✲tr❛♥s❢♦r♠ ❛❧❣♦r✐t❤♠✳ ❝s❜✳ ✻✴✶✷✴✾✶
✪◆ ❂ ❧❡♥❣t❤✭①✮❀ ♥ ❂ ✵✿◆✲✶❀
✪❙❡q✉❡♥❝❡ ❧❡♥❣t❤
❲ ❂ ❡①♣✭✲❥✯♣✐✯♥✳✯♥✴◆✮❀
✪❈❤✐r♣ s✐❣♥❛❧
①✇ ❂ ①✳✯❲❀
✪▼♦❞✉❧❛t❡ ✇✐t❤ ❝❤✐r♣
❲❲ ❂ ❬❝♦♥❥✭❲✭◆✿✲✶✿✷✮✮✱❝♦♥❥✭❲✮❪❀ ✪❈♦♥str✉❝t ❢✐❧t❡r
② ❂ ❝♦♥✈✭❲❲✱①✇✮❀
✪❈♦♥✈♦❧✈❡ ✇ ❢✐❧t❡r
② ❂ ②✭◆✿✷✯◆✲✶✮✳✯❲❀
✪❉❡♠♦❞✉❧❛t❡ ✇ ❝❤✐r♣
❢✉♥❝t✐♦♥ ② ❂ ❝❤✐r♣✭①✮❀
✪ ❢✉♥❝t✐♦♥ ② ❂ ❝❤✐r♣✭①✮
✪ ❝♦♠♣✉t❡s ❛♥ ❛r❜✐tr❛r②✲❧❡♥❣t❤ ❉✐s❝r❡t❡ ❋♦✉r✐❡r ❚r❛♥s❢♦r♠ ✭❉❋❚✮
✪ ✇✐t❤ t❤❡ ❝❤✐r♣ ③ tr❛♥s❢♦r♠ ❛❧❣♦r✐t❤♠✳ ❚❤❡ ❧✐♥❡❛r ❝♦♥✈♦❧✉t✐♦♥
✪ t❤❡♥ r❡q✉✐r❡❞ ✐s ❞♦♥❡ ✇✐t❤ ❋❋❚s✳
✪ ✶✾✽✽✿ ▲✳ ❆r❡✈❛❧♦❀ ✶✶✳✵✻✳✾✶ ❑✳ ❙❝❤✇❛r③✱ ▲◆❚ ❊r❧❛♥❣❡♥❀ ✻✴✶✷✴✾✶ ❝s❜✳
✪◆ ❂ ❧❡♥❣t❤✭①✮❀
✪❙❡q✉❡♥❝❡ ❧❡♥❣t❤
▲
❂ ✷❫❝❡✐❧✭❧♦❣✭✭✷✯◆✲✶✮✮✴❧♦❣✭✷✮✮❀ ✪❋❋❚ ❧❡♥❣t❤
♥
❂ ✵✿◆✲✶❀
❲
❂ ❡①♣✭✲❥✯♣✐✯♥✳✯♥✴◆✮❀
✪❈❤✐r♣ s✐❣♥❛❧
❋❲ ❂ ❢❢t✭❬❝♦♥❥✭❲✮✱ ③❡r♦s✭✶✱▲✲✷✯◆✰✶✮✱ ❝♦♥❥✭❲✭◆✿✲✶✿✷✮✮❪✱▲✮❀
②
❂ ✐❢❢t✭❋❲✳✯❢❢t✭①✳✬✳✯❲✱▲✮✮❀ ✪❈♦♥✈♦❧✈❡ ✉s✐♥❣ ❋❋❚
②
❂ ②✭✶✿◆✮✳✯❲❀
✪❉❡♠♦❞✉❧❛t❡
Figure 5.1
Available for free at Connexions
<http://cnx.org/content/col10683/1.5>
43
5.3 Goertzel’s Algorithm (or A Better DFT
Algorithm)
Goertzel’s algorithm [144], [62], [269] is another methods that cal-
culates the DFT by converting it into a digital filtering problem.
The method looks at the calculation of the DFT as the evaluation of
a polynomial on the unit circle in the complex plane. This evalua-
tion is done by Horner’s method which is implemented recursively
by an IIR filter.
5.3.1 The First-Order Goertzel Algorithm
The polynomial whose values on the unit circle are the DFT is a
slightly modified z-transform of x(n) given by
N−1
X (z) = ∑ x(n) z−n
(5.18)
n=0
which for clarity in this development uses a positive exponent .
This is illustrated for a length-4 sequence as a third-order polyno-
mial by
X (z) = x (3) z3 + x (2) z2 + x (1) z + x (0)
(5.19)
The DFT is found by evaluating (5.18) at z = W k, which can be
written as
C (k) = X (z) |z=Wk = DFT {x (n)}
(5.20)
where
W = e− j2π/N
(5.21)
The most efficient way of evaluating a general polynomial without
any pre-processing is by “Horner’s rule" [208] which is a nested
Available for free at Connexions
<http://cnx.org/content/col10683/1.5>
CHAPTER 5. THE DFT AS
44
CONVOLUTION OR FILTERING
evaluation. This is illustrated for the polynomial in (5.19) by
X (z) = {[x (3) z + x (2)] z + x (1)}z + x (0)
(5.22)
This nested sequence of operations can be written as a linear dif-
ference equation in the form of
y (m) = z y (m − 1) + x (N − m)
(5.23)
with initial condition y (0) = 0, and the desired result being the
solution at m = N. The value of the polynomial is given by
X (z) = y (N) .
(5.24)
(5.23) can be viewed as a first-order IIR filter with the input being
the data sequence in reverse order and the value of the polynomial
at z being the filter output sampled at m = N. Applying this to the
DFT gives the Goertzel algorithm [283], [269] which is
y (m) = W ky (m − 1) + x (N − m)
(5.25)
with y (0) = 0 and
C (k) = y (N)
(5.26)
where
N−1
C (k) = ∑ x(n) W nk.
(5.27)
n=0
The flowgraph of the algorithm can be found in [62], [269] and a
simple FORTRAN program is given in the appendix.
When comparing this program with the direct calculation of (5.27),
it is seen that the number of floating-point multiplications and ad-
ditions are the same. In fact, the structures of the two algorithms
look similar, but close examination shows that the way the sines
and cosines enter the calculations is different. In (5.27), new sine
Available for free at Connexions
<http://cnx.org/content/col10683/1.5>
45
and cosine values are calculated for each frequency and for each
data value, while for the Goertzel algorithm in (5.25), they are cal-
culated only for each frequency in the outer loop. Because of the
recursive or feedback nature of the algorithm, the sine and cosine
values are “updated" each loop rather than recalculated. This re-
sults in 2N trigonometric evaluations rather than 2N2. It also re-
sults in an increase in accumulated quantization error.
It is possible to modify this algorithm to allow entering the data
in forward order rather than reverse order. The difference (5.23)
becomes
y (m) = z−1y (m − 1) + x (m − 1)
(5.28)
if (5.24) becomes
C (k) = zN−1 y (N)
(5.29)
for y (0) = 0. This is the algorithm programmed later.
5.3.2 The Second-Order Goertzel Algorithm
One of the reasons the first-order Goertzel algorithm does not im-
prove efficiency is that the constant in the feedback or recursive
path is complex and, therefore, requires four real multiplications
and two real additions. A modification of the scheme to make it
second-order removes the complex multiplications and reduces the
number of required multiplications by two.
Define the variable q (m) so that
y (m) = q (m) − z−1 q (m − 1) .
(5.30)
This substituted into the right-hand side of (5.23) gives
y (m) = z q (m − 1) − q (m − 2) + x (N − m) .
(5.31)
Available for free at Connexions
<http://cnx.org/content/col10683/1.5>
CHAPTER 5. THE DFT AS
46
CONVOLUTION OR FILTERING
Combining (5.30) and (5.31) gives the second order difference
equation
q (m) = z + z−1 q (m − 1) − q (m − 2) + x (N − m)
(5.32)
which together with the output (5.30), comprise the second-order
Goertzel algorithm where
X (z) = y (N)
(5.33)
for initial conditions q (0) = q (−1) = 0.
A similar development starting with (5.28) gives a second-order
algorithm with forward ordered input as
q (m) = z + z−1 q (m − 1) − q (m − 2) + x (m − 1)
(5.34)
y (m) = q (m) − z q (−1)
(5.35)
with
X (z) = zN−1 y (N)
(5.36)
and for q (0) = q (−1) = 0.
Note that both difference (5.32) and (5.34) are not changed if z is
replaced with z−1, only the output (5.30) and (5.35) are different.
This means that the polynomial X (z) may be evaluated at a partic-
ular z and its inverse z−1 from one solution of the difference (5.32)
or (5.34) using the output equations
X (z) = q (N) − z−1 q (N − 1)
(5.37)
and
X (1/z) = zN−1 (q (N) − z q (N − 1)) .
(5.38)
Clearly, this allows the DFT of a sequence to be calculated with
half the arithmetic since the outputs are calculated two at a time.
Available for free at Connexions
<http://cnx.org/content/col10683/1.5>
47
The second-order DE actually produces a solution q (m) that con-
tains two first-order components. The output equations are, in ef-
fect, zeros that cancel one or the other pole of the second-order
solution to give the desired first-order solution. In addition to al-
lowing the calculating of two outputs at a time, the second-order
DE requires half the number of real multiplications as the first-
order form. This is because the coefficient of the q (m − 2) is unity
and the coefficient of the q (m − 1) is real if z and z−1 are complex
conjugates of each other which is true for the DFT.
5.3.3 Analysis of Arithmetic Complexity and Tim-
ings
Analysis of the various forms of the Goertzel algorithm from their
programs gives the following operation count for real multiplica-
tions and real additions assuming real data.
Algorithm
Real Mults.
Real Adds
Trig Eval.
Direct DFT
4 N2
4 N2
2 N2
First-Order
4 N2
4 N2 − 2N
2 N
Second-Order
2 N2 + 2N
4 N2
2 N
Second-Order 2
N2 + N
2 N2 + N
N
Table 5.2
Timings of the algorithms on a PC in milliseconds are given in the
following table.
Available for free at Connexions
<http://cnx.org/content/col10683/1.5>
CHAPTER 5. THE DFT AS
48
CONVOLUTION OR FILTERING
Algorithm
N = 125
N = 257
Direct DFT
4.90
19.83
First-Order
4.01
16.70
Second-Order
2.64
11.04
Second-Order 2
1.32
5.55
Table 5.3
These timings track the floating point operation counts fairly well.
5.3.4 Conclusions
Goertzel’s algorithm in its first-order form is not particularly in-
teresting, but the two-at-a-time second-order form is significantly
faster than a direct DFT. It can also be used for any polynomial
evaluation or for the DTFT at unequally spaced values or for eval-
uating a few DFT terms. A very interesting observation is that the