Given a string s and a number n, find the most frequently occurring n-gram in the string, where the n-grams can begin at any point in the string. This comes up in DNA analysis, where the 3-base reading frame for a codon can begin at any point in the sequence.
So for
s = 'AACTGAACG'
and
n = 3
we get the following n-grams (trigrams):
AAC, ACT, CTG, TGA, GAA, AAC, ACG
Since AAC appears twice and every other trigram appears once, the answer, hifreq, is AAC. There will always be exactly one highest-frequency n-gram.
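The counting step described above can be sketched as a sliding window over the string. This is a minimal Python illustration rather than a MATLAB submission; the function name `hifreq` follows the problem statement, and the use of `collections.Counter` is an assumption about how one might tally the n-grams.

```python
from collections import Counter


def hifreq(s, n):
    """Return the most frequent n-gram of s, with n-grams starting at every index."""
    # Slide a window of width n across the string: one n-gram per start position.
    ngrams = [s[i:i + n] for i in range(len(s) - n + 1)]
    counts = Counter(ngrams)
    # The problem guarantees a unique maximum, so tie-breaking is not a concern.
    best, _ = counts.most_common(1)[0]
    return best


print(hifreq("AACTGAACG", 3))  # AAC
```

For the example string, the window yields the seven trigrams listed above, and AAC is the only one counted twice.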
Note that spaces in s must be ignored; otherwise test suites 3 and 5 fail.
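One way to read the note about spaces is to strip them from the string before forming any n-grams. The sketch below assumes that interpretation; the alternative (skipping only those n-grams that contain a space) is not covered here, and the name `hifreq_ignoring_spaces` is hypothetical.

```python
from collections import Counter


def hifreq_ignoring_spaces(s, n):
    """Most frequent n-gram of s after removing spaces (assumed interpretation)."""
    # Assumption: "ignoring spaces" means deleting them before windowing.
    cleaned = s.replace(" ", "")
    counts = Counter(cleaned[i:i + n] for i in range(len(cleaned) - n + 1))
    # Raises IndexError if the cleaned string is shorter than n.
    return counts.most_common(1)[0][0]


print(hifreq_ignoring_spaces("AAC TGA ACG", 3))  # AAC
```

With the spaces removed, the input above reduces to the example string AACTGAACG, so the result matches the worked example.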