Automatic Generation of Prime Length FFT Programs by C. Sidney Burrus - HTML preview

PLEASE NOTE: This is an HTML preview only and some elements such as links or page numbers may be incorrect.
Download the book in PDF, ePub, Kindle for a complete version.

Chapter 2Preliminaries

Preliminaries

Because we compute prime point DFTs by converting them in to circular convolutions, most of this and the next section is devoted to an explanation of the split nesting convolution algorithm. In this section we introduce the various operations needed to carry out the split nesting algorithm. In particular, we describe the prime factor permutation that is used to convert a one-dimensional circular convolution into a multi-dimensional one. We also discuss the reduction operations needed when the Chinese Remainder Theorem for polynomials is used in the computation of convolution. The reduction operations needed for the split nesting algorithm are particularly well organized. We give an explicit matrix description of the reduction operations and give a program that implements the action of these reduction operations.

The presentation relies upon the notions of similarity transformations, companion matrices and Kronecker products. With them, we describe the split nesting algorithm in a manner that brings out its structure. We find that when companion matrices are used to describe convolution, the reduction operations block diagonalizes the circular shift matrix.

The companion matrix of a monic polynomial, M(s)=m0+m1s+⋯+mn–1sn–1+sn is given by

(2.1)
_autogen-svg2png-0002.png

Its usefulness in the following discussion comes from the following relation which permits a matrix formulation of convolution. Let

(2.2)
_autogen-svg2png-0003.png

Then

(2.3)
_autogen-svg2png-0004.png

where _autogen-svg2png-0005.png, _autogen-svg2png-0006.png, and CM is the companion matrix of M(s). In Equation 2.3, we say y is the convolution of x and h with respect to M(s). In the case of circular convolution, M(s)=sn–1 and Csn–1 is the circular shift matrix denoted by Sn,

(2.4)
_autogen-svg2png-0016.png

Notice that any circulant matrix can be written as khkSkn.

Similarity transformations can be used to interpret the action of some convolution algorithms. If CM=T–1AT for some matrix T (CM and A are similar, denoted CMA), then Equation 2.3 becomes

(2.5)
_autogen-svg2png-0023.png

That is, by employing the similarity transformation given by T in this way, the action of Skn is replaced by that of Ak. Many circular convolution algorithms can be understood, in part, by understanding the manipulations made to Sn and the resulting new matrix A. If the transformation T is to be useful, it must satisfy two requirements: (1) Tx must be simple to compute, and (2) A must have some advantageous structure. For example, by the convolution property of the DFT, the DFT matrix F diagonalizes Sn,

(2.6)
_autogen-svg2png-0034.png

so that it diagonalizes every circulant matrix. In this case, Tx can be computed by using an FFT and the structure of A is the simplest possible. So the two above mentioned conditions are met.

The Winograd Structure can be described in this manner also. Suppose M(s) can be factored as M(s)=M1(s)M2(s) where M1 and M2 have no common roots, then CM∼(CM1CM2) where denotes the matrix direct sum. Using this similarity and recalling Equation 2.3, the original convolution is decomposed into disjoint convolutions. This is, in fact, a statement of the Chinese Remainder Theorem for polynomials expressed in matrix notation. In the case of circular convolution, sn–1=∏d|nΦd(s), so that Sn can be transformed to a block diagonal matrix,

(2.7)
_autogen-svg2png-0045.png

where Φd(s) is the dth cyclotomic polynomial. In this case, each block represents a convolution with respect to a cyclotomic polynomial, or a `cyclotomic convolution'. Winograd's approach carries out these cyclotomic convolutions using the Toom-Cook algorithm. Note that for each divisor, d, of n there is a corresponding block on the diagonal of size φ(d), for the degree of Φd(s) is φ(d) where φ(·) is the Euler totient function. This method is good for short lengths, but as n increases the cyclotomic convolutions become cumbersome, for as the number of distinct prime divisors of d increases, the operation described by khk(CΦd)k becomes more difficult to implement.

The Agarwal-Cooley Algorithm utilizes the fact that

(2.8) Sn = Pt (Sn1Sn2) P

where n=n1n2, _autogen-svg2png-0059.png and P is an appropriate permutation 1. This converts the one dimensional circular convolution of length n to a two dimensional one of length n1 along one dimension and length n2 along the second. Then an n1-point and an n2-point circular convolution algorithm can be combined to obtain an n-point algorithm. In polynomial notation, the mapping accomplished by this permutation P can be informally indicated by

(2.9)
_autogen-svg2png-0068.png

It should be noted that Equation 2.8 implies that a circulant matrix of size n1n2 can be written as a block circulant matrix with circulant blocks.

The Split-Nesting algorithm 3 combines the structures of the Winograd and Agarwal-Cooley methods, so that Sn is transformed to a block diagonal matrix as in Equation 2.7,

(2.10)
_autogen-svg2png-0071.png

Here Ψ(d)=⊗p|d,p∈PCΦHd(p) where Hd(p) is the highest power of p dividing d, and P is the set of primes.

Example 2.1
(2.11)
_autogen-svg2png-0077.png

In this structure a multidimensional cyclotomic convolution, represented by Ψ(d), replaces each cyclotomic convolution in Winograd's algorithm (represented by CΦd in Equation 2.7. Indeed, if the product of b1,⋯,bk is d and they are pairwise relatively prime, then CΦdCΦb1⊗⋯⊗CΦbk. This gives a method for combining cyclotomic convolutions to compute a longer circular convolution. It is like the Agarwal-Cooley method but requires fewer additions 3.

Prime Factor Permutations

One can obtain Sn1Sn2 from Sn1n2 when _autogen-svg2png-0085.png, for in this case, Sn is similar to Sn1Sn2, n=n1n2. Moreover, they are related by a permutation. This permutation is that of the prime factor FFT algorithms and is employed in nesting algorithms for circular convolution 1, 2. The permutation is described by Zalcstein 7, among others, and it is his description we draw on in the following.

Let n=n1n2 where _autogen-svg2png-0090.png. Define ek, (0≤kn–1), to be the standard basis vector, (0,⋯,0,1,0,⋯,0)t, where the 1 is in the kth position. Then, the circular shift matrix, Sn, can be described by

(2.12)
_autogen-svg2png-0096.png

Note that, by inspection,

(2.13)
_autogen-svg2png-0097.png

where 0≤an1–1 and 0≤bn2–1. Because n1 and n2 are relatively prime a permutation matrix P can be defined by

(2.14)
_autogen-svg2png-0103.png

With this P,

(2.15)
_autogen-svg2png-0105.png

and

(2.16)
_autogen-svg2png-0106.png

Since _autogen-svg2png-0107.png and P–1=Pt, one gets, in the multi-factor case, the following.

Lemma

If n=n1nk and n1,...,nk are pairwise relatively prime, then _autogen-svg2png-0111.png where P is the permutation matrix given by _autogen-svg2png-0113.png.

This useful permutation will be denoted here as _autogen-svg2png-0114.png. If n=p1e1p2e2pkek then this permutation yields the matrix, Sp1e1⊗⋯⊗Spkek. This product can be written simply as _autogen-svg2png-0117.png, so that one has _autogen-svg2png-0118.png.

It is quite simple to show that

(2.17)
_autogen-svg2png-0119.png

In general, one has

(2.18)
_autogen-svg2png-0120.png

A Matlab function for Pa,bIs is pfp2I() in one of the appendices. This program is a direct implementation of the definition. In a paper by Templeton 5, another method for implementing Pa,b, without `if' statements, is given. That method requires some precalculations, however. A function for Pn1,⋯,nk is pfp(). It uses Equation 2.18