The development of fast algorithms usually consists of using special properties of the algorithm of interest to remove redundant or unnecessary operations of a direct implementation. Because of the periodicity, symmetries, and orthogonality of the basis functions and the special relationship with convolution, the discrete Fourier transform (DFT) has enormous capacity for improvement of its arithmetic efficiency.
There are four main approaches to formulating efficient DFT algorithms (Burrus, 1988). The first two break a DFT into multiple shorter ones: Multidimensional Index Mapping does this with an index map, and Polynomial Description of Signals does it by polynomial reduction. The third, in Factoring the Signal Processing Operators, factors the DFT operator (matrix) into sparse factors. The fourth, in The DFT as Convolution or Filtering, develops a method which converts a prime-length DFT into a cyclic convolution. Still another interesting approach applies in certain cases, where the evaluation of the DFT can be posed recursively: a DFT is evaluated in terms of two half-length DFTs, each of which is in turn evaluated in terms of two quarter-length DFTs, and so on.
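The recursive formulation mentioned last can be illustrated with a minimal Python sketch. It is not one of the book's Matlab or Fortran programs. It assumes the common forward-transform convention with kernel exp(-j2πnk/N) and a power-of-two length, and the function names `direct_dft` and `recursive_dft` are hypothetical, chosen only for this illustration. The direct routine evaluates the definition with on the order of N² operations; the recursive one splits the input into even- and odd-indexed samples and combines two half-length DFTs.

```python
import cmath


def direct_dft(x):
    """Evaluate the DFT definition directly (order N^2 operations)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def recursive_dft(x):
    """Evaluate a DFT as two half-length DFTs, applied recursively.

    Assumes len(x) is a power of two (an assumption of this sketch).
    """
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]

    even = recursive_dft(x[0::2])  # half-length DFT of even-indexed samples
    odd = recursive_dft(x[1::2])   # half-length DFT of odd-indexed samples

    half = n // 2
    out = [0j] * n
    for k in range(half):
        # Rotate the odd-part output by the factor exp(-j*2*pi*k/N).
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out
```

Both routines should agree to within rounding error on any power-of-two-length input; the recursive one simply reuses the half-length results instead of recomputing them for every output index.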
The very important computational complexity theorems of Winograd are stated and briefly discussed in Winograd's Short DFT Algorithms. The specific details and evaluations of the Cooley-Tukey FFT and Split-Radix FFT are given in The Cooley-Tukey Fast Fourier Transform Algorithm, and the PFA and WFTA are covered in The Prime Factor and Winograd Fourier Transform Algorithms. A short discussion of high-speed convolution is given in Convolution Algorithms, both for its own importance and for its theoretical connection to the DFT. We also present the chirp, Goertzel, QFT, NTT, SR-FFT, Approx FFT, and Autogen methods, along with programs to implement some of these.
In Winograd's Short DFT Algorithms, Ivan Selesnick gives a short introduction to using Winograd's techniques to give a highly structured development of short prime-length FFTs and describes a program that will automatically write these programs. In DFT and FFT: An Algebraic View, Markus Pueschel presents his "Algebraic Signal Processing" approach to describing the various FFT algorithms. And Steven Johnson describes the FFTW (Fastest Fourier Transform in the West) in Implementing FFTs in Practice.
The organization of the book represents the various approaches to understanding the FFT and to obtaining efficient computer programs. It also shows the intimate relationship between theory and implementation that can be used to real advantage. The disparity in material devoted to the various approaches represents the tastes of this author, not any intrinsic differences in value.
A fairly long list of references is given, but it is impossible to be truly complete. I have referenced the work that I have used and that I am aware of. The collection of computer programs is also somewhat idiosyncratic. They are in Matlab and Fortran because that is what I have used over the years. They are also written primarily for their educational value, although some are quite efficient. There is excellent content in the Connexions book by Doug Jones (2007).
Burrus, C. S. (1988). Efficient Fourier Transform and Convolution Algorithms. In Lim, J. S. and Oppenheim, A. V. (Eds.), Advanced Topics in Signal Processing (pp. 199–245). Englewood Cliffs, NJ: Prentice-Hall.
Jones, Douglas L. (2007, February). The DFT, FFT, and Practical Spectral Analysis. [http://cnx.org/content/col10281/1.2/]. Connexions.