In many applications requiring filtering, the necessary frequency response may not be known beforehand, or it may vary with time. (Example: suppression of engine harmonics in a car stereo.) In such applications, an adaptive filter which can automatically design itself and which can track system variations in time is extremely useful. Adaptive filters are used extensively in a wide variety of applications, particularly in telecommunications.
Wiener Filters: L2 optimal (FIR) filter design in a statistical context
LMS algorithm: simplest and by far the most commonly used adaptive filter algorithm
Stability and performance of the LMS algorithm: When and how well it works
Applications of adaptive filters: Overview of important applications
Introduction to advanced adaptive filter algorithms: Techniques for special situations or faster convergence
Stochastic L2 optimal (least squares) FIR filter design problem: Given a wide-sense stationary (WSS) input signal $x_k$ and desired signal $d_k$ (WSS ⇔ $E[y_k] = E[y_{k+d}]$, $r_{yz}(l) = E[y_k z_{k+l}]$, $r_{yy}(0) < \infty$)
The Wiener filter is the linear, time-invariant filter minimizing $E[\epsilon^2]$, the variance of the error $\epsilon_k = d_k - y_k$, where $y_k$ is the filter output.
As posed, this problem seems slightly silly, since $d_k$ is already available! However, this idea is useful in a wide variety of applications.
System identification (radar, non-destructive testing, adaptive control systems)
Usually one desires that the input signal $x_k$ be "persistently exciting," which, among other things, implies non-zero energy in all frequency bands. Why is this desirable? (If $x_k$ has no energy in some band, the data cannot determine the filter's response in that band; see the sketch below.)
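To make this concrete, here is a small illustrative sketch (not from the notes; the filters and frequency are my own choices) showing that a single sinusoid is not persistently exciting: two filters that differ only away from the input frequency produce identical outputs, so no algorithm could tell them apart from this data.

```python
import numpy as np

w0 = 0.3 * np.pi                               # input frequency (arbitrary choice)
k = np.arange(200)
x = np.cos(w0 * k)                             # energy at only one frequency

h1 = np.array([1.0, 0.0, 0.0])                 # candidate filter 1
notch = np.array([1.0, -2 * np.cos(w0), 1.0])  # zero frequency response at +/- w0
h2 = h1 + notch                                # candidate filter 2, clearly different

y1 = np.convolve(x, h1)[:len(k)]
y2 = np.convolve(x, h2)[:len(k)]
# After the 2-sample start-up transient, the outputs agree to machine precision:
print(np.max(np.abs(y1[2:] - y2[2:])))         # ~1e-15
```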
For convenience, we will analyze only the causal, real-data case; extensions are straightforward.
Expanding $E[\epsilon^2] = E[(d_k - \sum_{l=0}^{M-1} w_l x_{k-l})^2]$ gives

$E[\epsilon^2] = r_{dd}(0) - 2\sum_{l=0}^{M-1} w_l r_{dx}(l) + \sum_{l=0}^{M-1}\sum_{m=0}^{M-1} w_l w_m r_{xx}(l-m)$

where $r_{dd}(0) = E[d_k^2]$, $r_{dx}(l) = E[d_k x_{k-l}]$, and $r_{xx}(l-m) = E[x_k x_{k+l-m}]$. This can be written in matrix form as

$E[\epsilon^2] = r_{dd}(0) - 2P^T W + W^T R W$

where $W = (w_0, w_1, \ldots, w_{M-1})^T$, $P = (r_{dx}(0), r_{dx}(1), \ldots, r_{dx}(M-1))^T$, and $R$ is the $M \times M$ matrix with entries $R_{lm} = r_{xx}(l-m)$. To solve for the optimum filter, compute the gradient with respect to the tap weight vector $W$:

$\nabla = \frac{\partial E[\epsilon^2]}{\partial W} = -2P + 2RW$

(recall $\frac{\partial}{\partial W}(P^T W) = P$ and $\frac{\partial}{\partial W}(W^T A W) = 2AW$ for symmetric $A$). Setting the gradient equal to zero gives

$W_{opt} = R^{-1} P$

Since $R$ is a correlation matrix, it must be non-negative definite, so this is a minimizer. For $R$ positive definite, the minimizer is unique.
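As a quick numerical check of $W_{opt} = R^{-1}P$, the following sketch (all names and values are illustrative assumptions) builds $R$ from an assumed autocorrelation $r_{xx}(l) = a^{|l|}$ and a desired signal generated by a known FIR system; solving the normal equations recovers that system exactly.

```python
import numpy as np
from scipy.linalg import toeplitz

a = 0.8                                       # assumed correlation decay of the input
M = 4
h_true = np.array([0.5, -0.3, 0.2, 0.1])      # system generating d from x

r_xx = a ** np.abs(np.arange(M))              # r_xx(l) = a^|l|, lags l = 0..M-1
R = toeplitz(r_xx)                            # M x M input correlation matrix
# d_k = sum_m h[m] x_{k-m}  =>  P[l] = r_dx(l) = sum_m h[m] r_xx(l-m) = (R h)[l]
P = R @ h_true

W_opt = np.linalg.solve(R, P)                 # solve R W = P; avoid explicit inverse
print(W_opt)                                  # recovers h_true exactly
```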
The Wiener filter, $W_{opt} = R^{-1}P$, is ideal for many applications. But several issues must be addressed to use it in practice.
In practice one usually won't know the exact statistics of $x_k$ and $d_k$ (i.e., $R$ and $P$) needed to compute the Wiener filter.
How do we surmount this problem? Estimate the statistics from the data itself, for example $\hat r_{xx}(l) \approx \frac{1}{N}\sum_{k=0}^{N-1-l} x_k x_{k+l}$ and $\hat r_{dx}(l) \approx \frac{1}{N}\sum_{k=l}^{N-1} d_k x_{k-l}$, then solve $\hat W_{opt} = \hat R^{-1} \hat P$.
How can $W_{opt} = R^{-1}P$ be computed efficiently?
How does one choose N?
Larger N → more accurate estimates of the correlation values → better $\hat W_{opt}$. However, larger N leads to slower adaptation. A sketch of this block-based estimate-then-solve procedure appears below.
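The following sketch (function and variable names are my own) implements the block-based approach: estimate $\hat r_{xx}$ and $\hat r_{dx}$ by time averages over $N$ samples, then solve $\hat R \hat W = \hat P$.

```python
import numpy as np
from scipy.linalg import toeplitz

def wiener_from_data(x, d, M):
    """Estimate an M-tap Wiener filter from N samples of x and d."""
    N = len(x)
    # Biased sample autocorrelation estimates for lags 0..M-1.
    r_xx = np.array([np.dot(x[:N - l], x[l:]) / N for l in range(M)])
    # Cross-correlation estimates r_dx(l) = E[d_k x_{k-l}].
    r_dx = np.array([np.dot(d[l:], x[:N - l]) / N for l in range(M)])
    return np.linalg.solve(toeplitz(r_xx), r_dx)

rng = np.random.default_rng(1)
h_true = np.array([0.5, -0.3, 0.2, 0.1])      # "unknown" system to identify
x = rng.standard_normal(5000)                  # persistently exciting white input
d = np.convolve(x, h_true)[:len(x)]            # desired signal
print(wiener_from_data(x, d, M=4))             # close to h_true; improves with N
```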
The success of adaptive systems depends on $x_k$, $d_k$ being roughly stationary over at least N samples, N > M. That is, all adaptive filtering algorithms require that the underlying system vary slowly with respect to the sampling rate and the filter length (although they can tolerate occasional step discontinuities in the underlying system).
As presented here, an adaptive filter requires computing a matrix inverse at each sample. Actually, since the matrix $R$ is Toeplitz, the linear system of equations can be solved with $O(M^2)$ computations using Levinson's algorithm, where $M$ is the filter length. However, in many applications this may be too expensive, especially since computing the filter output itself requires $O(M)$ computations. There are two main approaches to resolving this computational problem:
Take advantage of the fact that $R_{k+1}$ is only slightly changed from $R_k$ to reduce the computation to $O(M)$; these algorithms are called Fast Recursive Least Squares algorithms. All methods proposed so far have stability problems and are dangerous to use.
Find a different approach to solving the optimization problem that doesn't require explicit inversion of the correlation matrix.
Adaptive algorithms involving the correlation matrix are called Recursive Least Squares (RLS) algorithms. Historically, they were developed after the LMS algorithm, which is the simplest and most widely used approach and requires only $O(M)$ computations per sample. $O(M^2)$ RLS algorithms are used in applications requiring very fast adaptation.
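As a sketch of the $O(M^2)$ Toeplitz exploitation mentioned above, SciPy's solve_toeplitz (a Levinson-type solver) needs only the first column of $R$; the autocorrelation and right-hand side below are illustrative assumptions, and the comparison against a general $O(M^3)$ solve is just a sanity check.

```python
import numpy as np
from scipy.linalg import solve_toeplitz, toeplitz

M = 512
r_xx = 0.9 ** np.arange(M)                       # assumed autocorrelation sequence
P = np.random.default_rng(2).standard_normal(M)  # illustrative cross-correlation

W_fast = solve_toeplitz(r_xx, P)                 # O(M^2) Levinson-type solve
W_ref = np.linalg.solve(toeplitz(r_xx), P)       # O(M^3) general-purpose solve
print(np.allclose(W_fast, W_ref))                # True
```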
The least squares optimal filter design problem is quadratic in the filter coefficients:

$E[\epsilon^2] = r_{dd}(0) - 2P^T W + W^T R W$

If $R$ is positive definite, the error surface $E[\epsilon^2](w_0, w_1, \ldots, w_{M-1})$ is a unimodal "bowl" in $\mathbb{R}^M$.
The problem is to find the bottom of the bowl. In an adaptive filter context, the shape and bottom of the bowl may drift slowly with time; hopefully slowly enough that the adaptive algorithm can track it.
For a quadratic error surface, the bottom of the bowl can be found in one step by computing $R^{-1}P$. Most modern nonlinear optimization methods (which are used, for example, to solve the $L_p$ optimal IIR filter design problem!) locally approximate a nonlinear function with a second-order (quadratic) Taylor series approximation and step to the bottom of this quadratic approximation on each iteration. However, an older and simpler approach to nonlinear optimization exists, based on gradient descent.
The idea is to iteratively find the minimizer by computing the gradient of the error function: $\nabla_i = \frac{\partial}{\partial W} E[\epsilon^2]\big|_{W = W_i}$. The gradient is a vector in $\mathbb{R}^M$ pointing in the steepest uphill direction on the error surface at a given point $W_i$, with a magnitude proportional to the slope of the error surface in this steepest direction.
By updating the coefficient vector with a step in the direction opposite the gradient, $W_{i+1} = W_i - \mu \nabla_i$, we go (locally) "downhill" in the steepest direction, which seems to be a sensible way to iteratively solve a nonlinear optimization problem. The performance obviously depends on $\mu$: if $\mu$ is too large, the iterations could bounce back and forth or even climb up out of the bowl; if $\mu$ is too small, it could take many iterations to approach the bottom. We will determine criteria for choosing $\mu$ later.
In summary, the gradient descent algorithm for solving the Wiener filter problem is:

Guess $W_0$
do i = 0, 1, 2, …
    $\nabla_i = -2P + 2RW_i$
    $W_{i+1} = W_i - \mu \nabla_i$
repeat
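A minimal sketch of this loop (the step size, iteration count, and test problem are my own choices, not from the notes):

```python
import numpy as np
from scipy.linalg import toeplitz

r_xx = 0.8 ** np.abs(np.arange(4))
R = toeplitz(r_xx)                            # input correlation matrix
h_true = np.array([0.5, -0.3, 0.2, 0.1])
P = R @ h_true                                # so that W_opt = R^{-1} P = h_true

mu = 0.1       # step size; stability requires 0 < mu < 1/lambda_max(R)
W = np.zeros(4)                               # initial guess W_0
for i in range(500):
    grad = -2 * P + 2 * R @ W                 # gradient of E[eps^2] at W_i
    W = W - mu * grad                         # W_{i+1} = W_i - mu * grad_i
print(W)                                      # converges to R^{-1} P = h_true
```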