Covariance Matrices and Data Distributions
Correlations between variables in a K-dimensional dataset are often summarized by a K-by-K covariance matrix. To get a better understanding of how covariance matrices characterize correlations between the dimensions of a dataset, we plot data points drawn from three different 2-dimensional Gaussian distributions, each of which is defined by a different covariance matrix.
The left plots below display the covariance matrix for each Gaussian distribution. The values along the diagonal represent the variance of the data along each dimension, and the off-diagonal values represent the covariances between the dimensions. Thus the (i,j)-th entry of each matrix represents the covariance between the i-th and j-th dimensions. The right plots show data drawn from the corresponding 2D Gaussian.
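To make the reading of the matrix entries concrete, here is a minimal sketch on toy data (the data, seed, and variable names are illustrative assumptions, not part of the original post). It checks that the diagonal of a sample covariance matrix holds the per-dimension variances, and that an off-diagonal entry becomes a correlation only after dividing by both standard deviations.

```matlab
% Sketch on made-up data: how to read the entries of a covariance matrix.
rng(0);                            % fix the seed so the run is repeatable
X = randn(500, 2);                 % 500 observations (rows) of 2 variables (columns)
X(:, 2) = 2 * X(:, 2) + X(:, 1);   % make the second variable depend on the first

C = cov(X);                        % 2-by-2 sample covariance matrix

% Diagonal entries are the variances of the individual dimensions.
varDiff = abs(diag(C)' - var(X));

% Entry (1,2) is the covariance between dimensions 1 and 2, recomputed here
% from the mean-centred columns (implicit expansion needs MATLAB R2016b+).
Xc  = X - mean(X, 1);
c12 = sum(Xc(:, 1) .* Xc(:, 2)) / (size(X, 1) - 1);

% Dividing by both standard deviations turns the covariance into a correlation.
r12 = C(1, 2) / sqrt(C(1, 1) * C(2, 2));
R   = corrcoef(X);                 % built-in correlation matrix for comparison
```

Here `r12` should agree with `R(1,2)`, which is one way to see that covariance and correlation differ only by a normalization.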
The top row displays a covariance matrix equal to the identity matrix, together with points drawn from the corresponding Gaussian distribution. The diagonal values are 1, indicating that the data have a variance of 1 along both dimensions. The off-diagonal elements are zero, meaning that the two dimensions are uncorrelated. We can see this in the data drawn from the distribution as well: the points form a circularly symmetric (spherical) cloud about the origin. For such a distribution, no regression line predicts the second dimension from the first, or vice versa, any better than a flat line through the mean. Thus, for Gaussian data, an identity covariance matrix is equivalent to having independent dimensions, each of which has unit (i.e. 1) variance. Such a dataset is often called “white” (this naming convention comes from the notion that white noise signals, which can be sampled from independent Gaussian distributions, have equal power at all frequencies in the Fourier domain).
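The claim about regression lines can be checked numerically. The sketch below is an illustration under assumed settings (seed, sample size, and variable names are not from the original post): it draws white data with `randn`, estimates its covariance, and fits a least-squares line of the second dimension on the first.

```matlab
% Sketch: white data has near-identity covariance and a near-zero regression slope.
rng(1);                            % repeatable run
nSamples = 10000;
Z = randn(nSamples, 2);            % independent standard normal columns ("white" data)

C = cov(Z);                        % sample covariance, expected to be close to eye(2)
maxDev = max(max(abs(C - eye(2))));

% Least-squares line predicting dimension 2 from dimension 1.
p = polyfit(Z(:, 1), Z(:, 2), 1);  % p(1) is the slope, p(2) the intercept

fprintf('max |C - I| = %.3f, slope = %.3f\n', maxDev, p(1));
```

Both printed values should be small and shrink further as `nSamples` grows, since they differ from zero only through sampling noise.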
The middle row plots the points that result from a diagonal, but not identity, covariance matrix. The off-diagonal elements are still zero, indicating that the dimensions are uncorrelated. However, the variances are no longer both equal to one, and they differ from each other. This is demonstrated by the elongated distribution in red. The elongation is along the second dimension, as indicated by the larger value in the bottom-right entry (entry (2,2)) of the covariance matrix.
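One way to see where the elongation comes from is to build such data by stretching white data along each axis. This is a sketch under assumptions (the seed, sample size, and the scaling construction are illustrative; the post itself samples with `mvnrnd`): each column is multiplied by the standard deviation implied by the diagonal covariance.

```matlab
% Sketch: a diagonal covariance corresponds to axis-aligned stretching of white data.
rng(2);                            % repeatable run
nSamples = 10000;
SDiag = [1 0; 0 3];                % same diagonal covariance as in the post

Z = randn(nSamples, 2);            % white data, covariance close to eye(2)
scale = diag(sqrt(diag(SDiag)));   % per-dimension standard deviations on the diagonal
X = Z * scale;                     % column k is multiplied by sqrt(SDiag(k,k))

C = cov(X);                        % expected to be close to SDiag
```

The second column is stretched by sqrt(3), so its variance is about 3, while the covariance between the two columns stays near zero.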
The bottom row plots points that result from a non-diagonal covariance matrix. Here the off-diagonal elements of the covariance matrix have non-zero values, indicating a correlation between the dimensions. This correlation is reflected in the distribution of drawn data points (in blue). We can see that the primary axis along which the points are distributed does not lie along either of the dimensions, but along a linear combination of them.
The MATLAB code used to create the above plots is as follows:
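The primary axis mentioned above can be recovered from the covariance matrix itself as the eigenvector with the largest eigenvalue. The following is a minimal sketch, assuming the non-diagonal matrix [1 .9; .9 3]; the Cholesky-based sampling is a toolbox-free stand-in chosen here for illustration, not the method used in the post.

```matlab
% Sketch: principal axis and implied correlation of a non-diagonal covariance.
S = [1 .9; .9 3];                  % non-diagonal covariance matrix

[V, D] = eig(S);                   % columns of V are eigenvectors, diag(D) eigenvalues
[lambda, order] = sort(diag(D), 'descend');
V = V(:, order);
principalAxis = V(:, 1);           % direction of greatest variance (sign is arbitrary)

% Correlation coefficient implied by S: covariance over both standard deviations.
rho = S(1, 2) / sqrt(S(1, 1) * S(2, 2));

% Draw correlated samples by colouring white noise with the Cholesky factor.
rng(3);                            % repeatable run
R = chol(S);                       % upper triangular, R' * R equals S
X = randn(10000, 2) * R;           % rows have covariance close to S
C = cov(X);
```

Both components of `principalAxis` are non-zero, which is the numerical counterpart of the point cloud being tilted relative to the coordinate axes.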
% INITIALIZE SOME CONSTANTS
mu = [0 0];            % ZERO MEAN
S = [1 .9; .9 3];      % NON-DIAGONAL COV.
SDiag = [1 0; 0 3];    % DIAGONAL COV.
SId = eye(2);          % IDENTITY COV.

% SAMPLE SOME DATAPOINTS
nSamples = 1000;
samples = mvnrnd(mu,S,nSamples)';
samplesId = mvnrnd(mu,SId,nSamples)';
samplesDiag = mvnrnd(mu,SDiag,nSamples)';

% DISPLAY
subplot(321);
imagesc(SId); axis image,
caxis([0 1]), colormap hot, colorbar
title('Identity Covariance')

subplot(322)
plot(samplesId(1,:),samplesId(2,:),'ko'); axis square
xlim([-5 5]), ylim([-5 5])
grid
title('White Data')

subplot(323);
imagesc(SDiag); axis image,
caxis([0 3]), colormap hot, colorbar
title('Diagonal Covariance')

subplot(324)
plot(samplesDiag(1,:),samplesDiag(2,:),'r.'); axis square
xlim([-5 5]), ylim([-5 5])
grid
title('Uncorrelated Data')

subplot(325);
imagesc(S); axis image,
caxis([0 3]), colormap hot, colorbar
title('Non-diagonal Covariance')

subplot(326)
plot(samples(1,:),samples(2,:),'b.'); axis square
xlim([-5 5]), ylim([-5 5])
grid
title('Correlated Data')