Relatively complex matrix indexing question
2 visualizzazioni (ultimi 30 giorni)
Mostra commenti meno recenti
Hi,
I have a reasonably complex matrix indexing question that I'm struggling to solve. I have an nxn matrix of float values. The matrix is fairly large (n~30000). The matrix is upper (or lower) triangular with the other values being NaN. Each of the rows represented has a vector associated with it, so the matrix rows has a secondary matrix associated with it that is nxk, where each row has k characteristics. Row 1 has the same characteristics as column 1, row 2 has the same characteristics as column2, etc.
I want to plot a histogram of all of the elements in the nxn matrix whose rows/columns match certain characteristics.
To illustrate, let's say I had the following 4x4 value matrix
Values = [...
NaN 4.0 3.1 2.3
NaN NaN 1.2 5.2
NaN NaN NaN 7.1
NaN NaN NaN NaN]
Each row/column has three characteristics and this characteristics matrix has the values
Charac = [...
4 2 3
1 4 5
-3 5 1
10 6 2]
I only want to get histogram elements for rows/columns that have the second characteristic greater than 1 and the third characteristic less than 4. In this case only the first, third and fourth rows would meet the criterion and the values that I would take from the matrix Values are 3.1, 2.3 and 7.1.
Is there a compact and computationally efficient way to logically index this?
Thanks in advance!
8 Commenti
Risposta accettata
Cedric
il 10 Giu 2014
Modificato: Cedric
il 10 Giu 2014
You are correct about sparse matrices, and I had completely misunderstood your setup! So the lower triangular part is NaN, but the upper part has a density of 1(?) If so, there is no point in using sparse matrices. Reading your last explanation, I think that what you are looking for is along the line of what follows.
EDIT - I provide a second approach (B), which could work if you had only non-NaN elements in a vector. It is just a quick test though and it would have to be tested/profiled.
values = [NaN 4.0 3.1 2.3; ...
NaN NaN 1.2 5.2; ...
NaN NaN NaN 7.1; ...
NaN NaN NaN NaN] ;
charac = [4 2 3; ...
1 4 5; ...
-3 5 1; ...
10 6 2] ;
%%%A. Values is square num. array with NaNs.
% - Find valid row IDs.
isValid = charac(:,2) > 1 & charac(:,3) < 4 ;
validId = find( isValid ) ;
% - Build all possible row/col pairs.
validRC = nchoosek( validId, 2 ) ;
isValid = validRC(:,2) > validRC(:,1) ;
validRC = validRC(isValid,:) ;
% - Extract relevant values.
validId = sub2ind( size(values), validRC(:,1), validRC(:,2)) ;
extract = values(validId) ;
%%%B. Values is lin num array with non-NaNs only.
% - Build what could be values if you were not storing NaNs. Makes little
% sense to build it this way (by creating the large, square array and
% reducing it), but maybe you can build this vector directly instead
% of the array.
valuesLin = values(~isnan(values)) ;
% - Build sub2linInd function.
cSum = [0, cumsum( 1 : (size(values,2)-1) )] ;
sub2linInd = @(row, col) row + cSum(col-1).' ;
% - Extract relevant values.
validId = sub2linInd( validRC(:,1), validRC(:,2) ) ;
extractLin = valuesLin(validId) ;
7 Commenti
Cedric
il 11 Giu 2014
Modificato: Cedric
il 11 Giu 2014
Yes, you are absolutely right, I thought that it was "with replacement" and it is not (sorry, I should have checked!).
How are you building the first large matrix (?), would it be difficult to build directly a vector of non-NaN values ( double or even uintx) and avoid building the large array?
If you must build an array, could you use single elements instead of double? The single type/class supports NaN.
If you need e.g. uint8 because you are short on RAM, do you really need the 256 possible values; couldn't you, as mentioned above, use e.g. 255 to code for NaN?
All solutions have advantages and bottlenecks: memory for the double/NaN approach, practicality and speed for the nchoosek/non-NaN approach (nchoosek won't work if you have too many rows matching conditions, even with a k as small as 2), etc. I guess that at this stage, we could proceed by elimination: how much RAM do you have available, and how many rows matching conditions could you have?
I'd say that your short term best bet (reasonable memory usage, precision, computation time, and development time) is likely to be Image Analyst's approach, with your modification, applied to a class single Values array.
Più risposte (1)
Image Analyst
il 9 Giu 2014
Try this and see if it's what you want:
Values = [...
NaN 4.0 3.1 2.3
NaN NaN 1.2 5.2
NaN NaN NaN 7.1
NaN NaN NaN NaN]
Charac = [...
4 2 3
1 4 5
-3 5 1
10 6 2]
% Find rows where column 2 > 1 and column 3 < 4
rowsToExtract = Charac(:, 2) > 1 & Charac(:, 3) < 4
% Extract only those rows and no others.
extractedRows = Values(rowsToExtract,:)
% Get the values from that which are not NaN.
extractedValues = extractedRows(~isnan(extractedRows(:)))
3 Commenti
Image Analyst
il 10 Giu 2014
I don't see why my way would not work. Your suggestion will end up eliminating some columns also, and you didn't say anything about eliminating columns.
Vedere anche
Categorie
Scopri di più su Matrix Indexing in Help Center e File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!