Improving efficiency of a char array function
1 visualizzazione (ultimi 30 giorni)
Mostra commenti meno recenti
Paolo Binetti
il 8 Gen 2017
Modificato: Paolo Binetti
il 9 Gen 2017
I have built a function to be run 100000 times in a loop. Its input, "motifs", is a full 20x15 char array, containing only four types of characters, 'A', 'C', 'G', 'T'. I am wondering if my implementation, below, can be improved of if it is pretty much as fast as it gets:
% Find consensus string and score of a motifs array
function [score, count, consensus] = scoremotifs_2(motifs)
count = [ sum(motifs == 'A',1) ;
sum(motifs == 'C',1) ;
sum(motifs == 'G',1) ;
sum(motifs == 'T',1) ];
[count_max, consensus_num] = max(count);
consensus(consensus_num == 1) = 'A';
consensus(consensus_num == 2) = 'C';
consensus(consensus_num == 3) = 'G';
consensus(consensus_num == 4) = 'T';
score = sum(sum(motifs ~= consensus));
3 Commenti
Risposta accettata
the cyclist
il 8 Gen 2017
Modificato: the cyclist
il 8 Gen 2017
This is starting to get a little obfuscated, but it's significantly faster:
list ='ACGT';
count = sum(motifs == reshape(list,[1 1 4]));
[count_max, consensus_num] = max(count,[],3);
consensus = list(consensus_num);
score = sum(sum(motifs ~= consensus));
The key algorithmic difference here is that I am able to compare against all four elements 'ACGT' in parallel, by permuting them into a third dimension.
You need R2016b or later in order for MATLAB to make the implicit dimension expansion to compare the 20x15x1 array against the 1x1x4 array. If you have an older version, you will explicitly need to use repmat.
2 Commenti
Più risposte (1)
the cyclist
il 8 Gen 2017
It's somewhat faster to calculate consensus like this:
list = 'ACGT';
consensus = list(consensus_num);
0 Commenti
Vedere anche
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!