Visualization of High-Dimensional Data
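Visualization Problem

Consider a data set of $n$ points $x_1, \ldots, x_n$ in $\mathbf{R}^m$. We can represent this data set as an $m \times n$ matrix $X = [x_1, \ldots, x_n]$, in which each column $x_j$ is one data point.

Example: Raw data matrix for the US Senate, 2004-2006.

We can try to visualize the data set by projecting each data point (each row or column of the matrix) onto, say, a one-, two- or three-dimensional space. Each "view" corresponds to a particular projection, that is, a particular one-, two- or three-dimensional subspace onto which we choose to project the data. The visualization problem consists of choosing an appropriate projection. There are many ways to formulate the visualization problem, and none dominates the others. Here, we focus on the basics of that problem.

Projecting on a line

To simplify, let us first consider the problem of representing the high-dimensional data set on a line. Specifically, we would like to assign a single number, or "score", to each column of the matrix. We choose a direction $u \in \mathbf{R}^m$ and a scalar $v \in \mathbf{R}$. This corresponds to the affine "scoring" function $f : \mathbf{R}^m \to \mathbf{R}$, which assigns to a generic column $x \in \mathbf{R}^m$ of the data matrix the value

$$f(x) = u^T x + v.$$

We thus obtain a vector of values $f \in \mathbf{R}^n$, with

$$f_j = u^T x_j + v, \quad j = 1, \ldots, n.$$

It is often useful to center these values around zero. This can be done by choosing $v$ such that

$$0 = \sum_{j=1}^n (u^T x_j + v) = u^T \Big( \sum_{j=1}^n x_j \Big) + n v,$$

that is: $v = -u^T \hat{x}$, where

$$\hat{x} := \frac{1}{n} \sum_{j=1}^n x_j \in \mathbf{R}^m$$

is the vector of sample averages across the columns of the matrix (that is, data points). The values of our scoring function can now be expressed as

$$f(x) = u^T (x - \hat{x}).$$

In order to be able to compare the relative merits of different directions, we can assume, without loss of generality, that the vector $u$ is normalized, so that $\|u\|_2 = 1$.

It is convenient to work with the "centered" data matrix, which is

$$X_{\rm cent} = [x_1 - \hat{x}, \ldots, x_n - \hat{x}] = X - \hat{x} \mathbf{1}^T,$$

where $\mathbf{1}$ is the vector of ones in $\mathbf{R}^n$. In Matlab, we can compute the centered data matrix as follows.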
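```matlab
xhat = mean(X,2);            % m-vector of averages across the columns (data points)
[m,n] = size(X);             % m rows, n data points
Xcent = X - xhat*ones(1,n);  % subtract the average from every column
```

We can compute the (row) vector of scores using the simple matrix-vector product:

$$f = u^T X_{\rm cent} \in \mathbf{R}^{1 \times n}.$$

We can check that the average of the above row vector is zero:

$$f \mathbf{1} = u^T X_{\rm cent} \mathbf{1} = u^T (X - \hat{x} \mathbf{1}^T) \mathbf{1} = u^T (X \mathbf{1} - n \hat{x}) = 0.$$

Example: Senator scores on average bill.

Projection on a plane

We can also try to project the data on a plane, which involves assigning two scores to each data point. This corresponds to the affine "scoring" map $f : \mathbf{R}^m \to \mathbf{R}^2$, which assigns to a generic column $x$ of the data matrix the two-dimensional value

$$f(x) = \begin{pmatrix} u_1^T x + v_1 \\ u_2^T x + v_2 \end{pmatrix} = U^T x + v,$$

where $u_1, u_2 \in \mathbf{R}^m$ are two vectors and $v_1, v_2$ are two scalars, while $U = [u_1, u_2] \in \mathbf{R}^{m \times 2}$ and $v = (v_1, v_2) \in \mathbf{R}^2$.

The affine map $f$ generates $n$ two-dimensional data points $f_j = U^T x_j + v$, $j = 1, \ldots, n$. As in the line case, we can center them, that is, enforce

$$0 = \sum_{j=1}^n f_j = \sum_{j=1}^n (U^T x_j + v) = U^T \Big( \sum_{j=1}^n x_j \Big) + n v,$$

by choosing the vector $v$ such that

$$v = -U^T \hat{x}.$$

We can encapsulate the scores in the $2 \times n$ matrix $F = [f_1, \ldots, f_n]$, which can be written as the matrix-matrix product

$$F = U^T X_{\rm cent} = U^T (X - \hat{x} \mathbf{1}^T),$$

with $X_{\rm cent}$ the centered data matrix.

Example: Visualizing Senate voting on a plane.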
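The plane projection can be sketched in Matlab in the same way as the line case. The snippet below is only a minimal illustration under stated assumptions: the data matrix `X` is random placeholder data rather than the Senate data, and the two directions in `U` are an arbitrary orthonormal pair obtained with `orth`. It builds the centered matrix, forms the two-row score matrix, and checks numerically that the scores average to zero along each direction.

```matlab
% Illustrative sketch: X and U are placeholder choices, not data from the text.
m = 5; n = 8;
X = randn(m,n);               % hypothetical data: n points in R^m, one per column
xhat = mean(X,2);             % average data point (m-vector)
Xcent = X - xhat*ones(1,n);   % centered data matrix

U = orth(randn(m,2));         % two orthonormal directions (arbitrary choice)
F = U'*Xcent;                 % 2-by-n matrix of scores, one column per data point

avg = F*ones(n,1)/n;          % average score along each of the two directions
disp(norm(avg));              % zero up to rounding error
```

Each column of `F` gives the two coordinates of one data point in the chosen plane, so a scatter plot such as `plot(F(1,:), F(2,:), 'o')` displays the corresponding view. A different choice of `U` yields a different view of the same data.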