Automatic Generation of Prime Length FFT Programs by C. Sidney Burrus - HTML preview

PLEASE NOTE: This is an HTML preview only and some elements such as links or page numbers may be incorrect.
Download the book in PDF, ePub, Kindle for a complete version.

Chapter 7Programs for Prime Length FFTs

Programs for Prime Length FFTs

Using the circular convolution algorithms described above, we can easily design algorithms for prime length FFTs. The only modifications that needs to be made involve the permutation of Rader 5 and the correct calculation of the DC term (y(0)). These modifications are easily made to the above described approach. It simply requires a few extra commands in the programs. Note that the multiplicative constants are computed directly, since we have programs for all the relevant operations.

In the version we have currently implemented and verified for correctness, we precompute the multiplicative constants, the input permutation and the output permutation. From Equation 8 from A Bilinear Form for the DFT, the multiplicative constants are given by _autogen-svg2png-0002.png, the input permutation is given by 1⊕PQr, and the output permutation is given by 1⊕QstJPt. The multiplicative constants, the input and output permutation are each stored as vectors. These vectors are then passed to the prime length FFT program, which consists of the appropriate function calls, see the appendix. In previous prime length FFT modules, the input and output permutations can be completely absorbed in to the computational instructions. This is possible because they are written as straight line code. It is simple to modify the code generating program we have implemented so that it produces straight line code and absorbs the permutations in to the computational program instructions.

In an in-place in-order prime factor algorithm for the DFT 1, 6, the necessary permuted forms of the DFT can be obtained by modifying the multiplicative constants. This can be easily done by permuting the roots of unity, w, in the expression for the multiplicative constants 1, 3, nothing else in the structure of the algorithm needs to be changed. By changing the multiplicative constants, it is not possible, however, to omit the permutation required for Rader's conversion of the prime length DFT in to circular convolution.

Operation Counts

Table 7.1 lists the arithmetic operations incurred by the FFT programs we have generated and verified for correctness. Note that the number of additions and multiplications incurred by the programs we have generated are the same as previously existing programs for prime lengths up to and including 13. For p=17 a program with 70 multiplications and 314 additions has been written, and for p=19 a program with 76 multiplications and 372 additions has been written 2. Thus for the length p=17, the program we have generated requires fewer total arithmetic operations, while for p=19, ours uses more.

There are several table of operation counts in 4, each table corresponding to a different variation of the algorithms used in that paper. For each variation, the algorithms we have described use fewer additions and fewer multiplications. The focus of 4, however, is the implementation of prime point FFT on various computer architectures and the advantage that can be gained from matching algorithms with architectures. It should be noted that the highest prime in 4 for which an FFT was designed is 29. Although we have not executed the programs described in this paper on these computers, they are, as mentioned above, written to be easily adapted to parallel/vector computers.

Table 7.1. Operation counts for prime length FFTs
Pmulsadds Pmulsadds Pmulsadds
3412 412801140 241328013020
51034 432561440 271376018152
71672 614001908 281448019036
1140168 716403112 337524822268
1340188 735322504 379601632880
1782274 1099405096 421640029412
1976404 11313125516 433770832864
29160836 12712166760 541940043020
31160776 18119008936 6311216056056
37190990 211256012368 7571504076292
Operation Counts
Figure 7.1
Plot of additions and multiplications incurred by prime length FFTs.
Operation Counts
Figure 7.2
Plot of additions and multiplications incurred by prime length FFTs.

References

  1. Burrus, C. S. and Eschenbacher, P. W. (1981, August). An In-place, In-order prime factor FFT algorithm. IEEE Trans. Acoust., Speech, Signal Proc., 29(4), 806-817.

  2. Johnson, H. W. and Burrus, S. (1981). Large DFT Modules: 11, 13, 17, 19, 25. (8105). Technical report. Rice University.

  3. Johnson, H. W. and Burrus, C. S. (1985, February). On the Structure of Efficient DFT Algorithms. IEEE Trans. Acoust., Speech, Signal Proc., 33(1), 248-254.

  4. Lu, C. and Cooley, J. W. and Tolimieri, R. (1993, February). FFT Algorithms for Prime Transform Sizes and their Implementations of VAX, IBM3090VF, and IBM RS/6000. IEEE Trans. Acoust., Speech, Signal Proc., 41(2), 638-648.

  5. Rader, C. M. (1968, June). Discrete Fourier Transform When the Number of Data Samples is Prime. Proc. IEEE, 56(6), 1107-1108.

  6. Temperton, C. (1985). Implementation of a Self-Sorting In-Place Prime Factor FFT Algorithm. Journal of Computational Physics, 58, 283-299.

Solutions