Purdue Digital Signal Processing Labs by Charles A. Bouman

Chapter 10: Lab 7a - Discrete-Time Random Processes (part 1)

Questions or comments concerning this laboratory should be directed to Prof. Charles A. Bouman, School of Electrical and Computer Engineering, Purdue University, West Lafayette IN 47907; (765) 494-0340; bouman@ecn.purdue.edu

Introduction

Many of the phenomena that occur in nature have uncertainty and are best characterized statistically as random processes. For example, the thermal noise in electronic circuits, radar detection, and games of chance are best modeled and analyzed in terms of statistical averages.

This lab will cover some basic methods of analyzing random processes. "Random Variables" reviews some basic definitions and terminology associated with random variables, observations, and estimation. "Estimating the Cumulative Distribution Function" investigates a common estimate of the cumulative distribution function. "Generating Samples from a Given Distribution" discusses the problem of transforming a random variable so that it has a given distribution, and lastly, "Estimating the Probability Density Function" illustrates how the histogram may be used to estimate the probability density function.

Note that this lab assumes an introductory background in probability theory. Some review is provided, but it is infeasible to develop the theory in detail. A secondary reference such as [1] is strongly encouraged.

Random Variables

The following section contains an abbreviated review of some of the basic definitions associated with random variables. Then we will discuss the concept of an observation of a random event, and introduce the notion of an estimator.

Basic Definitions

A random variable is a function that maps a set of possible outcomes of a random experiment into a set of real numbers. The probability of an event can then be interpreted as the probability that the random variable will take on a value in a corresponding subset of the real line. This allows a fully numerical approach to modeling probabilistic behavior.

A very important function used to characterize a random variable is the cumulative distribution function (CDF), defined as

(10.1) FX(x) = P(X ≤ x)

Here, X is the random variable, and FX(x) is the probability that X will take on a value in the interval (–∞,x]. It is important to realize that x is simply a dummy variable for the function FX(x), and is therefore not random at all.

The derivative of the cumulative distribution function, if it exists, is known as the probability density function, denoted as fX(x). By the fundamental theorem of calculus, the probability density has the following property:

(10.2) P(a < X ≤ b) = ∫_a^b fX(x) dx = FX(b) – FX(a)

Since the probability that X lies in the interval (–∞,∞) equals one, the entire area under the density function must also equal one.

Expectations are fundamental quantities associated with random variables. The expected value of some function of X, call it g(X), is defined by the following.

(10.3) E[g(X)] = ∫_{–∞}^{∞} g(x) fX(x) dx

Note that the expected value of g(X) is a deterministic number. Note also that, due to the linearity of integration, expectation is a linear operator: E[a g1(X) + b g2(X)] = a E[g1(X)] + b E[g2(X)].

The two most common expectations are the mean μX and variance σX², defined by

(10.4) μX = E[X] = ∫_{–∞}^{∞} x fX(x) dx
(10.5) σX² = E[(X – μX)²] = ∫_{–∞}^{∞} (x – μX)² fX(x) dx

A very important type of random variable is the Gaussian or normal random variable. A Gaussian random variable has a density function of the following form:

(10.6) fX(x) = (1/√(2πσ²)) exp( –(x – μ)² / (2σ²) )

Note that a Gaussian random variable is completely characterized by its mean and variance. This is not necessarily the case for other types of distributions. Sometimes, the notation X ~ N(μ, σ²) is used to identify X as being Gaussian with mean μ and variance σ².
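As a quick illustration of Equation 10.6, the density can be evaluated numerically. The sketch below (plain Matlab/Octave, no toolboxes assumed; the grid and parameter values are arbitrary choices) plots the standard normal density.

```matlab
% Sketch: evaluate the Gaussian density of Equation 10.6 on a grid.
% mu and sigma2 are chosen arbitrarily here (standard normal).
mu = 0; sigma2 = 1;
x = -4:0.01:4;
fx = exp(-(x - mu).^2 / (2*sigma2)) / sqrt(2*pi*sigma2);
plot(x, fx)
xlabel('x'); ylabel('f_X(x)')
```

Numerically integrating the result, e.g. sum(fx)*0.01, should come out close to one, consistent with the area property of density functions noted above.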

Samples of a Random Variable

Suppose some random experiment may be characterized by a random variable X whose distribution is unknown. For example, suppose we are measuring a deterministic quantity v, but our measurement is subject to a random measurement error ε. We can then characterize the observed value, X, as a random variable, X=v+ε.

If the distribution of X does not change over time, we may gain further insight into X by making several independent observations X1, X2, …, XN. These observations Xi, also known as samples, will be independent random variables with the same distribution FX(x). In this situation, the Xi's are referred to as i.i.d., for independent and identically distributed. We also sometimes refer to X1, X2, …, XN collectively as a sample, or observation, of size N.

Suppose we want to use our observation X1, X2, …, XN to estimate the mean and variance of X. Two estimators which should already be familiar to you are the sample mean and sample variance, defined by

(10.7) μ̂X = (1/N) Σ_{i=1}^{N} Xi
(10.8) σ̂X² = (1/N) Σ_{i=1}^{N} (Xi – μ̂X)²

It is important to realize that these sample estimates are functions of random variables, and are therefore themselves random variables, so we can also talk about their statistical properties. For example, we can compute the mean and variance of the sample mean μ̂X.

(10.9) E[μ̂X] = (1/N) Σ_{i=1}^{N} E[Xi] = μX
(10.10) Var(μ̂X) = (1/N²) Σ_{i=1}^{N} Var(Xi) = σX²/N

In both Equation 10.9 and Equation 10.10 we have used the i.i.d. assumption. We can also show that E[σ̂X²] = ((N – 1)/N) σX², so the sample variance of Equation 10.8 is a biased estimate.

An estimate â of some parameter a which has the property E[â] = a is said to be an unbiased estimate. An unbiased estimator such that Var(â) → 0 as N → ∞ is said to be consistent. These two properties are highly desirable because they imply that if a large number of samples are used, the estimate will be close to the true parameter.
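Consistency of the sample mean can be seen empirically: by Equation 10.10 its variance is σX²/N, which shrinks as N grows. The sketch below (Matlab/Octave; the trial count and sample sizes are arbitrary choices) estimates Var(μ̂X) over repeated trials and compares it with the theoretical value 1/N for N(0,1) samples.

```matlab
% Sketch: the variance of the sample mean shrinks like sigma^2/N (Eq. 10.10),
% illustrating consistency. 500 trials per sample size, chosen arbitrarily.
for N = [10 100 1000]
  trials = zeros(1, 500);
  for k = 1:500
    trials(k) = mean(randn(1, N));   % sample mean of N i.i.d. N(0,1) samples
  end
  fprintf('N = %4d   observed var of sample mean = %.4f   (theory: %.4f)\n', ...
          N, var(trials), 1/N);
end
```

The observed variances should track 1/N closely, up to simulation noise.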

Suppose X is a Gaussian random variable with mean 0 and variance 1. Use the Matlab function random or randn to generate 1000 samples of X, denoted as X1, X2, ..., X1000. See the online help for the random function. Plot them using the Matlab function plot. We will assume our generated samples are i.i.d.

Write Matlab functions to compute the sample mean and sample variance of Equation 10.7 and Equation 10.8 without using the predefined mean and var functions. Use these functions to compute the sample mean and sample variance of the samples you just generated.
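One possible sketch of the two estimator functions follows (the function names are our own choice; in Matlab each function would normally go in its own .m file). Per Equation 10.8, the normalization is 1/N, not the 1/(N–1) used by the built-in var.

```matlab
function m = sample_mean(X)
% Sample mean of Equation 10.7: sum the samples and divide by N,
% without using the built-in mean function.
m = sum(X) / length(X);
end

function v = sample_var(X)
% Sample variance of Equation 10.8: average squared deviation from the
% sample mean, without using the built-in var function.
N = length(X);
m = sum(X) / N;
v = sum((X - m).^2) / N;
end
```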

INLAB REPORT

  1. Submit the plot of samples of X.

  2. Submit the sample mean and the sample variance that you calculated. Why are they not equal to the true mean and true variance?

Linear Transformation of a Random Variable

A linear transformation of a random variable X has the following form

(10.11) Y = a X + b

where a and b are real numbers, and a≠0. A very important property of linear transformations is that they are distribution-preserving, meaning that Y will be a random variable with a distribution of the same form as X. For example, in Equation 10.11, if X is Gaussian then Y will also be Gaussian, but not necessarily with the same mean and variance.

Using the linearity property of expectation, find the mean μY and variance σY² of Y in terms of a, b, μX, and σX². Show your derivation in detail.

Hint

First find the mean, then substitute the result when finding the variance.

Consider a linear transformation of a Gaussian random variable X with mean 0 and variance 1. Calculate the constants a and b which make the mean and the variance of Y 3 and 9, respectively. Using Equation 10.6, find the probability density function for Y.

Generate 1000 samples of X, and then calculate 1000 samples of Y by applying the linear transformation in Equation 10.11, using the a and b that you just determined. Plot the resulting samples of Y, and use your functions to calculate the sample mean and sample variance of the samples of Y.
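The generation step above can be sketched as follows (Matlab/Octave); here a and b stand for the constants determined in the exercise, which are not filled in.

```matlab
% Sketch: samples of Y via the linear transformation of Equation 10.11.
% a and b are placeholders for the constants found in the exercise.
X = randn(1, 1000);   % 1000 i.i.d. N(0,1) samples of X
Y = a*X + b;          % Equation 10.11
plot(Y)
```

The sample mean and sample variance of Y can then be computed with the functions written in the previous section.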

INLAB REPORT

  1. Submit your derivation of the mean and variance of Y.

  2. Submit the transformation you used, and the probability density function for Y.

  3. Submit the plot of samples of Y and the Matlab code used to generate Y. Include the calculated sample mean and sample variance for Y.

Estimating the Cumulative Distribution Function

Suppose we want to model some phenomenon as a random variable X with distribution FX(x). How can we assess whether or not this is an accurate model? One method would be to make many observations and estimate the distribution function based on the observed values. If the distribution estimate is “close” to our proposed model FX(x), we have evidence that our model is a good characterization of the phenomenon. This section will introduce a common estimate of the cumulative distribution function.

Given a set of i.i.d. random variables X1, X2, …, XN with CDF FX(x), the empirical cumulative distribution function F̂X(x) is defined as the following.

(10.12) F̂X(x) = (1/N) Σ_{i=1}^{N} 1{Xi ≤ x}, where 1{A} equals 1 if the event A occurs and 0 otherwise

In words, F̂X(x) is the fraction of the Xi's which are less than or equal to x.

To get insight into the estimate F̂X(x), let's compute its mean and variance. To do so, it is easiest to first define Nx as the number of Xi's which are less than or equal to x.

(10.13) Nx = Σ_{i=1}^{N} 1{Xi ≤ x}

Notice that P(Xi ≤ x) = FX(x) for each i, so Nx is a binomial random variable with parameters N and FX(x):

(10.14) P(Nx = j) = (N choose j) [FX(x)]^j [1 – FX(x)]^{N – j},  j = 0, 1, …, N

Now we can compute the mean of F̂X(x) as follows,

(10.15) E[F̂X(x)] = E[Nx/N] = (1/N) E[Nx] = (1/N) N FX(x) = FX(x)

This shows that F̂X(x) is an unbiased estimate of FX(x). By a similar approach, we can show that

(10.16) Var(F̂X(x)) = (1/N²) Var(Nx) = FX(x)(1 – FX(x))/N

Therefore the empirical CDF F̂X(x) is both an unbiased and consistent estimate of the true CDF.

Exercise

Write a function F=empcdf(X,t) to compute the empirical CDF F̂X(t) from the sample vector X at the points specified in the vector t.

Hint

The expression sum(X<=s) will return the number of elements in the vector X which are less than or equal to s.

To test your function, generate a sample of Uniform[0,1] random variables using the function X=rand(1,N). Plot two CDF estimates: one using a sample size N=20, and one using N=200. Plot these functions in the range t=[-1:0.001:2] , and on each plot superimpose the true distribution for a Uniform[0,1] random variable.
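One possible sketch of empcdf, built directly on the hint above, is:

```matlab
function F = empcdf(X, t)
% Empirical CDF of Equation 10.12: F(k) is the fraction of the samples
% in X that are less than or equal to t(k).
N = length(X);
F = zeros(size(t));
for k = 1:length(t)
  F(k) = sum(X <= t(k)) / N;
end
end
```

For the Uniform[0,1] test, the true CDF to superimpose is 0 for t < 0, equal to t on [0,1], and 1 for t > 1.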

INLAB REPORT

Hand in your empcdf function and the two plots.

Generating Samples from a Given Distribution

It is oftentimes necessary to generate samples from a particular distribution. For example, we might want to run simulations to test how an algorithm performs on noisy inputs. In this section we will address the problem of generating random numbers from a given distribution FX(x).

Suppose we have a continuous random variable