THIS IS DOCUMENTATION FOR AN OBSOLETE PRODUCT.
SEE THE DOCUMENTATION CENTER FOR THE LATEST INFORMATION.

Statistics`MultinormalDistribution`

The most commonly used probability distributions for multivariate data analysis are those derived from the multinormal (multivariate Gaussian) distribution. This package contains multinormal, multivariate Student $t$, Wishart, Hotelling $T^2$, and quadratic form distributions.

Distributions are usually represented in the symbolic form name[param1, param2, ...]. When there are many parameters, they may be organized into lists, as in the case of QuadraticFormDistribution. Functions such as Mean, which give properties of statistical distributions, take the symbolic representation of the distribution as an argument.

Standard probability distributions derived from the multivariate Gaussian distribution.

A $p$-variate multinormal distribution with mean vector $\mu$ and covariance matrix $\Sigma$ is denoted $N_p(\mu, \Sigma)$. If $X_i$, $i = 1, \ldots, m$, is distributed $N_p(0, \Sigma)$ (where $0$ is the zero vector), and $X$ denotes the $m \times p$ data matrix composed of the $m$ row vectors $X_i$, then the $p \times p$ matrix $X^T X$ has a Wishart distribution with scale matrix $\Sigma$ and degrees of freedom parameter $m$, denoted $W_p(\Sigma, m)$. The Wishart distribution is most typically used when describing the covariance matrix of multinormal samples.
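The construction of a Wishart matrix as $X^T X$ can be checked numerically. The following is a minimal sketch in plain Python, not code from the package: it assumes $p = 2$, and the helper names `cholesky_2x2` and `wishart_draw` are hypothetical. It averages many draws of $X^T X$ and compares the result with $m\Sigma$, the known mean of $W_p(\Sigma, m)$.

```python
import random

def cholesky_2x2(sigma):
    """Lower-triangular factor (l11, l21, l22) with L L^T = sigma, for a 2x2 positive definite matrix."""
    (a, b), (_, d) = sigma
    l11 = a ** 0.5
    l21 = b / l11
    l22 = (d - l21 * l21) ** 0.5
    return l11, l21, l22

def wishart_draw(sigma, m, rng):
    """One draw of X^T X, where X has m rows, each distributed N_2(0, sigma)."""
    l11, l21, l22 = cholesky_2x2(sigma)
    w = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(m):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = (l11 * z1, l21 * z1 + l22 * z2)  # one row of the data matrix
        for i in range(2):
            for j in range(2):
                w[i][j] += x[i] * x[j]
    return w

rng = random.Random(0)  # fixed seed so the run is repeatable
sigma = [[2.0, 0.6], [0.6, 1.0]]
m, n_draws = 5, 4000
avg = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(n_draws):
    w = wishart_draw(sigma, m, rng)
    for i in range(2):
        for j in range(2):
            avg[i][j] += w[i][j] / n_draws
print(avg)  # should be close to m * sigma
```

The averaged matrix approaches $m\Sigma$ only in the Monte Carlo sense, so small deviations are expected.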

A vector that has a multivariate Student $t$ distribution can also be written as a function of a multinormal random vector. Let $X$ be a standardized multinormal vector with covariance matrix $R$ and let $S^2$ be an independent chi-square variable with $m$ degrees of freedom. (Note that since $X$ is standardized, $0$ is the mean vector of $X$ and $R$ is also the correlation matrix of $X$.) Then $X / (S/\sqrt{m})$ has a multivariate $t$ distribution with correlation matrix $R$ and $m$ degrees of freedom, denoted $t(R, m)$. The multivariate Student $t$ distribution is elliptically contoured like the multinormal distribution, and characterizes the ratio of a multinormal vector to the standard deviation common to each variate. When $m = 1$ and $R = I$, the multivariate $t$ distribution is the same as the multivariate Cauchy distribution (here $I$ denotes the identity matrix).
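The ratio construction can be simulated directly. This is a hedged sketch in plain Python, assuming the bivariate case with a single correlation parameter `rho`; the function name `multivariate_t_draw` is hypothetical and is not part of the package. The chi-square variable is drawn as a gamma variate with shape $m/2$ and scale 2.

```python
import math
import random

def multivariate_t_draw(rho, m, rng):
    """Draw a bivariate t vector X / (S / sqrt(m)) with correlation rho and m degrees of freedom."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    # Standardized binormal vector with correlation rho
    x = (z1, rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    s2 = rng.gammavariate(m / 2.0, 2.0)  # chi-square with m degrees of freedom
    scale = math.sqrt(s2 / m)            # S / sqrt(m)
    return (x[0] / scale, x[1] / scale)

rng = random.Random(1)
m, rho = 10, 0.5
draws = [multivariate_t_draw(rho, m, rng) for _ in range(20000)]
var1 = sum(t[0] ** 2 for t in draws) / len(draws)
cov12 = sum(t[0] * t[1] for t in draws) / len(draws)
print(var1, cov12)  # theory: m / (m - 2) and rho * m / (m - 2)
```

For $m > 2$ each component has variance $m/(m-2)$, so the sample values should land near 1.25 and 0.625 here, up to sampling error.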

The Hotelling $T^2$ distribution is a univariate distribution proportional to the F-ratio distribution. If vector $d$ and matrix $M$ are independently distributed $N_p(0, I)$ and $W_p(I, m)$, then $m\, d^T M^{-1} d$ has the Hotelling $T^2$ distribution with parameters $p$ and $m$, denoted $T^2(p, m)$. This distribution is commonly used to describe the sample Mahalanobis distance between two populations.
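A small simulation makes the definition concrete. The sketch below is plain Python under the assumption $p = 2$, with a hypothetical helper `hotelling_t2_draw`; it is not the package's implementation. It uses the standard relation $T^2(p, m) \sim \frac{p\,m}{m - p + 1} F_{p,\,m-p+1}$, which implies a mean of $p\,m/(m - p - 1)$ when $m > p + 1$.

```python
import random

def hotelling_t2_draw(m, rng):
    """One draw of m * d^T M^{-1} d with d ~ N_2(0, I) and M ~ W_2(I, m)."""
    d1, d2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    a = b = c = 0.0  # M = [[a, b], [b, c]], accumulated from m standard normal rows
    for _ in range(m):
        x1, x2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        a += x1 * x1
        b += x1 * x2
        c += x2 * x2
    det = a * c - b * b
    # d^T M^{-1} d using the explicit inverse of a symmetric 2x2 matrix
    quad = (c * d1 * d1 - 2.0 * b * d1 * d2 + a * d2 * d2) / det
    return m * quad

rng = random.Random(2)
p, m, n_draws = 2, 10, 20000
sample_mean = sum(hotelling_t2_draw(m, rng) for _ in range(n_draws)) / n_draws
expected_mean = p * m / (m - p - 1)  # mean implied by the F relation
print(sample_mean, expected_mean)
```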

A quadratic form in a multinormal vector $X$ distributed $N_p(\mu, \Sigma)$ is given by $X^T A X + b^T X + c$, where $A$ is a symmetric $p \times p$ matrix, $b$ is a $p$-vector, and $c$ is a scalar. This univariate distribution can be useful in discriminant analysis of multinormal samples.
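As an illustration of how the parameters $A$, $b$, $c$, $\mu$, and $\Sigma$ enter, the following plain-Python sketch evaluates the quadratic form and its mean, using the identity $E[X^T A X + b^T X + c] = \operatorname{tr}(A\Sigma) + \mu^T A \mu + b^T \mu + c$. It assumes $p = 2$; the function names are hypothetical and do not come from the package.

```python
def quadratic_form(x, a, b, c):
    """Evaluate x^T A x + b^T x + c for a 2-vector x."""
    xax = sum(x[i] * a[i][j] * x[j] for i in range(2) for j in range(2))
    return xax + sum(b[i] * x[i] for i in range(2)) + c

def quadratic_form_mean(a, b, c, mu, sigma):
    """Mean of the quadratic form: tr(A Sigma) + mu^T A mu + b^T mu + c."""
    trace_a_sigma = sum(a[i][j] * sigma[j][i] for i in range(2) for j in range(2))
    return trace_a_sigma + quadratic_form(mu, a, b, c)

a = [[1.0, 0.0], [0.0, 2.0]]      # symmetric 2x2 matrix
b = [1.0, -1.0]
c = 3.0
mu = [1.0, 2.0]
sigma = [[1.0, 0.5], [0.5, 2.0]]
print(quadratic_form_mean(a, b, c, mu, sigma))  # 5 + 9 - 1 + 3 = 16
```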

Functions of univariate statistical distributions applicable to multivariate distributions.

In this package distributions are represented in symbolic form. Generally, PDF[dist, x] evaluates the density at $x$ if $x$ is a numerical value, vector, or matrix, and otherwise leaves the function in symbolic form. Similarly, CDF[dist, x] gives the cumulative distribution function and CharacteristicFunction[dist, t] gives the characteristic function of the specified distribution.
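For a numerical vector argument, evaluating the density amounts to plugging into the closed-form expression. The plain-Python sketch below writes out the bivariate normal density $\exp\!\left(-\tfrac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right) / \left(2\pi\sqrt{\det\Sigma}\right)$; the function name `binormal_pdf` is hypothetical, and this is only an illustration of the formula, not the package's PDF code.

```python
import math

def binormal_pdf(x, mu, sigma):
    """Density of N_2(mu, sigma) at the point x, for a 2x2 positive definite sigma."""
    (a, b), (_, d) = sigma
    det = a * d - b * b
    u, v = x[0] - mu[0], x[1] - mu[1]
    # (x - mu)^T Sigma^{-1} (x - mu), using the explicit 2x2 inverse
    q = (d * u * u - 2.0 * b * u * v + a * v * v) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))

identity = [[1.0, 0.0], [0.0, 1.0]]
print(binormal_pdf([0.0, 0.0], [0.0, 0.0], identity))  # equals 1 / (2 pi)
```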

In some cases explicit forms of these expressions are not available. For example, PDF[QuadraticFormDistribution[{A, b, c}, {mu, sigma}], x] does not evaluate, but a Series expansion of the PDF about the lower support point of the domain (for a positive definite quadratic form) does evaluate. The CDFs of MultinormalDistribution and MultivariateTDistribution are available for numerical vector arguments, but not for symbolic vector arguments. In the case of MultivariateTDistribution, the CharacteristicFunction is expressed in terms of an integral.

If $\Sigma$ is a diagonal matrix, the closed-form result for CDF[MultinormalDistribution[mu, sigma], x] is computed directly. If $\Sigma$ is not diagonal and has the form $\sigma_{ij} = \sigma_i^2$ for $i = j$ and $\sigma_{ij} = \lambda_i \lambda_j \sigma_i \sigma_j$ for $i \neq j$, where $\lambda_i \in [-1, 1]$, a method for multivariate normal distributions with product-covariance structures described in Y. L. Tong, The Multivariate Normal Distribution, Springer-Verlag, 1990 is used. If $\Sigma$ does not have either of these special forms, a general method described in Alan Genz, "Numerical Computation of Multivariate Normal Probabilities," Journal of Computational and Graphical Statistics 1 (1992), pp. 141-149 is used.
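The diagonal case has a closed form because the components are independent, so the joint CDF is a product of univariate normal CDFs. A minimal plain-Python sketch of that product follows; the function names are hypothetical, and it is offered as an illustration of the closed form rather than as the package's code.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def diagonal_multinormal_cdf(x, mu, variances):
    """P(X_1 <= x_1, ..., X_p <= x_p) when the covariance matrix is diagonal."""
    prob = 1.0
    for xi, mi, vi in zip(x, mu, variances):
        prob *= normal_cdf((xi - mi) / math.sqrt(vi))
    return prob

# At the mean vector, each factor is 1/2, so the result is (1/2)^p.
print(diagonal_multinormal_cdf([1.0, -2.0, 0.5], [1.0, -2.0, 0.5], [4.0, 1.0, 9.0]))
```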

If the correlation matrix $R$ in CDF[MultivariateTDistribution[r, m], x] is diagonal, a single numeric integration is performed using the closed-form result for the multivariate normal CDF and the relationship between multivariate normal and multivariate $t$ distributions. Otherwise, general methods based on separation-of-variables techniques in Alan Genz and Frank Bretz, "Comparison of Methods for the Computation of Multivariate t-Probabilities," Journal of Computational and Graphical Statistics 11 (2002), pp. 950-971 are used.
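One way to picture the single integration for a diagonal correlation matrix: conditional on $S = s$, the event $T_i \le x_i$ is the event $X_i \le x_i s/\sqrt{m}$, so the CDF is the average of a product of normal CDFs over the density of $S$. The plain-Python sketch below applies Simpson's rule to that one-dimensional integral. It is an assumption-laden illustration (fixed grid, hypothetical function names, small integer $m$), not the algorithm the package actually runs.

```python
import math

def normal_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def chi_pdf(s, m):
    """Density of S, where S^2 is chi-square with m degrees of freedom."""
    return 2.0 ** (1.0 - m / 2.0) * s ** (m - 1) * math.exp(-0.5 * s * s) / math.gamma(m / 2.0)

def independent_t_cdf(x, m, intervals=4000):
    """P(T_1 <= x_1, ..., T_p <= x_p) for an identity correlation matrix, by Simpson's rule in s."""
    upper = math.sqrt(m) + 12.0  # the density of S is negligible beyond this point
    h = upper / intervals        # intervals must be even for Simpson's rule
    total = 0.0
    for k in range(intervals + 1):
        s = k * h
        integrand = chi_pdf(s, m)
        for xi in x:
            integrand *= normal_cdf(xi * s / math.sqrt(m))
        weight = 1 if k in (0, intervals) else (4 if k % 2 else 2)
        total += weight * integrand
    return total * h / 3.0

print(independent_t_cdf([0.0, 0.0], 5))  # symmetry gives 1/4
print(independent_t_cdf([1.0], 1))       # univariate Cauchy CDF at 1, which is 3/4
```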

This loads the package.

Here is a symbolic representation of a standardized binormal distribution. A standardized random vector has a zero mean vector and a covariance matrix equal to its correlation matrix.

This gives its probability density function.

You can make a plot of the density to observe its distribution.

Here is the probability of the distribution in the region  .

This gives the domain of the quadratic form distribution qdist.

The series expansion of the PDF of the quadratic form distribution can be plotted. A 20-term expansion is clearly poor for  .

CDF[MultinormalDistribution[mu, sigma], x] and CDF[MultivariateTDistribution[r, m], x] are computed as multidimensional numeric integrals with the same default options as NIntegrate. If fewer digits of precision are required, quicker results can be obtained by setting a lower value for PrecisionGoal. For large values of m, a change of variable is performed by CDF to provide accurate results. The change of variable is made for m > 400 if the correlation matrix R is diagonal and for m > 550 if R is not diagonal. For values of m above these thresholds, computations will generally be slower. Also, sharp features near the edges of the integration region may pose additional problems for convergence as precision and accuracy goals are increased. Increasing the value of SingularityDepth or MaxRecursion will often overcome these problems.
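The trade-off between requested precision and work can be seen with any adaptive integrator. The sketch below is a generic adaptive Simpson rule in plain Python, used here only as a stand-in: it is not NIntegrate, its `tol` argument is an absolute tolerance rather than a PrecisionGoal, and the depth cap of 40 is an arbitrary choice. It counts integrand evaluations for a loose and a tight tolerance on a one-dimensional normal probability.

```python
import math

def adaptive_simpson(f, a, b, tol):
    """Integrate f on [a, b]; return (estimate, number of integrand evaluations)."""
    count = 0

    def g(x):
        nonlocal count
        count += 1
        return f(x)

    def simpson(fa, fm, fb, lo, hi):
        return (hi - lo) / 6.0 * (fa + 4.0 * fm + fb)

    def recurse(lo, hi, fa, fm, fb, whole, tol, depth):
        mid = 0.5 * (lo + hi)
        flm, frm = g(0.5 * (lo + mid)), g(0.5 * (mid + hi))
        left = simpson(fa, flm, fm, lo, mid)
        right = simpson(fm, frm, fb, mid, hi)
        # Accept when the refined estimate agrees with the coarse one
        if depth == 0 or abs(left + right - whole) <= 15.0 * tol:
            return left + right + (left + right - whole) / 15.0
        return (recurse(lo, mid, fa, flm, fm, left, tol / 2.0, depth - 1)
                + recurse(mid, hi, fm, frm, fb, right, tol / 2.0, depth - 1))

    fa, fm, fb = g(a), g(0.5 * (a + b)), g(b)
    value = recurse(a, b, fa, fm, fb, simpson(fa, fm, fb, a, b), tol, 40)
    return value, count

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

loose_value, loose_count = adaptive_simpson(normal_pdf, -8.0, 1.0, 1e-3)
tight_value, tight_count = adaptive_simpson(normal_pdf, -8.0, 1.0, 1e-10)
print(loose_value, loose_count)
print(tight_value, tight_count)
```

The tighter tolerance forces many more subdivisions, which is the same qualitative effect as raising PrecisionGoal in a multidimensional integral.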

The following CDF may take several seconds to integrate.

If only 3 digits of precision are required, good results can be obtained in a fraction of a second by use of the PrecisionGoal option.

The change of variables used for large m may slow down the computations with default settings.

Furthermore, convergence problems may exist, unless options are chosen to improve the integration.

Many of the multivariate distributions have hidden arguments that are evaluated when the distribution is first entered. Random variate generation will be more efficient if these arguments are evaluated only once.

This is an inefficient means of computing 1000 multinormal variates because the Cholesky decomposition of the covariance matrix is computed for each variate.

This method of generating 1000 variates is more efficient because the Cholesky decomposition is computed once.
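The efficiency point is independent of the language: the covariance matrix should be factored once and the factor reused for every variate. A plain-Python sketch of that pattern follows, assuming a bivariate distribution; `cholesky_2x2` and `multinormal_variates` are hypothetical names, not package functions.

```python
import random

def cholesky_2x2(sigma):
    """Lower-triangular factor (l11, l21, l22) with L L^T = sigma."""
    (a, b), (_, d) = sigma
    l11 = a ** 0.5
    l21 = b / l11
    l22 = (d - l21 * l21) ** 0.5
    return l11, l21, l22

def multinormal_variates(mu, sigma, n, rng):
    """Generate n variates from N_2(mu, sigma), factoring sigma only once."""
    l11, l21, l22 = cholesky_2x2(sigma)  # computed a single time, outside the loop
    out = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        out.append((mu[0] + l11 * z1, mu[1] + l21 * z1 + l22 * z2))
    return out

rng = random.Random(3)
mu = [1.0, -1.0]
sigma = [[4.0, 1.2], [1.2, 1.0]]
variates = multinormal_variates(mu, sigma, 1000, rng)
mean1 = sum(v[0] for v in variates) / len(variates)
mean2 = sum(v[1] for v in variates) / len(variates)
print(mean1, mean2)  # should be near the mean vector
```

Calling `cholesky_2x2` inside the loop would give the same variates at a higher cost, which mirrors the inefficient approach described above.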

Functions of univariate statistical distributions not applicable to multivariate distributions.

In the multivariate case, it is difficult to define Quantile as the inverse of the CDF, since many values of the random vector (or random matrix) correspond to a single probability value. This package defines Quantile only for the univariate distribution HotellingTSquareDistribution and some minor degenerate cases of the other distributions. The elliptically contoured distributions MultinormalDistribution and MultivariateTDistribution support EllipsoidQuantile and its inverse RegionProbability.

Functions of vector-valued multivariate statistical distributions.

This gives the ellipse centered on the mean that encloses 50% of the ndist distribution.

This gives the probability of the distribution within the ellipse. Note that the ellipse must correspond to a constant-probability contour of the prescribed distribution.
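For a bivariate normal distribution the relation between an ellipse and its enclosed probability has a closed form, because the squared Mahalanobis distance is chi-square with 2 degrees of freedom. The plain-Python sketch below assumes a standardized binormal with correlation `rho`; the function names are hypothetical and only mimic the roles of EllipsoidQuantile and RegionProbability.

```python
import math

def ellipse_radius_squared(q):
    """Squared Mahalanobis radius c with P(X^T Sigma^{-1} X <= c) = q for a binormal X."""
    return -2.0 * math.log(1.0 - q)

def region_probability(c):
    """Inverse map: probability enclosed by the contour with squared radius c."""
    return 1.0 - math.exp(-0.5 * c)

def semi_axes(rho, q):
    """Semi-axis lengths of the q-ellipse for a standardized binormal with correlation rho."""
    c = ellipse_radius_squared(q)
    # The eigenvalues of [[1, rho], [rho, 1]] are 1 + rho and 1 - rho
    return math.sqrt(c * (1.0 + rho)), math.sqrt(c * (1.0 - rho))

c_half = ellipse_radius_squared(0.5)
print(c_half)                      # 2 ln 2
print(region_probability(c_half))  # recovers 0.5
print(semi_axes(0.5, 0.5))
```

The round trip works only because the ellipse is a constant-density contour; an arbitrary ellipse has no such closed-form probability.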

As $m \to \infty$, each elliptical quantile contour of MultivariateTDistribution[r, m] approaches the corresponding elliptical contour of a multinormal distribution with zero mean vector and covariance matrix $R$.
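The convergence can be checked in closed form for the bivariate case. For $T \sim t(R, m)$ with $p = 2$, $T^T R^{-1} T / 2$ has an $F_{2,m}$ distribution, whose CDF is $1 - (1 + 2f/m)^{-m/2}$, so the squared radius of the contour enclosing probability $q$ is $m\left((1-q)^{-2/m} - 1\right)$. The plain-Python sketch below compares this with the normal value $-2\ln(1-q)$; the function names are hypothetical.

```python
import math

def t_radius_squared(q, m):
    """Squared radius c with P(T^T R^{-1} T <= c) = q for a bivariate t with m degrees of freedom."""
    return m * ((1.0 - q) ** (-2.0 / m) - 1.0)

def normal_radius_squared(q):
    """Corresponding squared radius for a bivariate normal with covariance matrix R."""
    return -2.0 * math.log(1.0 - q)

q = 0.5
for m in (1, 10, 100, 10000):
    print(m, t_radius_squared(q, m))
print("limit", normal_radius_squared(q))
```

For $m = 1$ (the Cauchy case) the 50% contour has squared radius 3, and the value decreases toward $2\ln 2$ as $m$ grows.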


© 2009 Wolfram Research, Inc.