
Using fitcknn in MATLAB

MiauMiau on 6 Dec 2014
Edited: Star Strider on 6 Dec 2014
Hi
I want to use fitcknn with a custom distance metric that I implemented myself, in my case Levenshtein:
mdl = fitcknn(citynames,citycodes,'NumNeighbors', 50, 'exhaustive','Distance',@levenshtein);
This doesn't work, although the documentation says: "Distance metric, specified as the comma-separated pair consisting of 'Distance' and a valid distance metric string or function handle."
The error I get is:
Error using internal.stats.parseArgs (line 42)
Wrong number of arguments.
Error in classreg.learning.generator.Partitioner.processArgs (line 65)
    [cvpart,crossval,kfold,holdout,leaveout,~,otherArgs] = ...
Error in ClassificationKNN.fit (line 728)
    Nfold = classreg.learning.generator.Partitioner.processArgs(varargin{:});
Error in fitcknn (line 263)
    this = ClassificationKNN.fit(X,Y,varargin{:});
Error in NNlevenshtein (line 8)
    mdl = fitcknn(citynames,citycodes,'NumNeighbors', 50, 'exhaustive','Distance',@levenshtein);

Answers (1)

Star Strider on 6 Dec 2014
We need to see your code for levenshtein.
According to the documentation, your levenshtein function has to have the form:
function D2 = DISTFUN(ZI,ZJ)
% calculation of distance
...
where
  • ZI is a 1-by-N vector containing one row of X or Y.
  • ZJ is an M2-by-N matrix containing multiple rows of X or Y.
  • D2 is an M2-by-1 vector of distances, and D2(k) is the distance between observations ZI and ZJ(k,:).
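To make the required shape concrete, here is a minimal sketch of a distance function in that form. It is not a Levenshtein distance: it simply counts the positions where two numeric rows differ, and the sample values are made up for illustration. The point is only that one row goes in as ZI, several rows go in as ZJ, and a column vector with one distance per row of ZJ comes out.

```matlab
% Illustrative distance function in the DISTFUN form described above.
% It counts mismatching positions between ZI (1-by-N) and each row of
% ZJ (M2-by-N), returning an M2-by-1 column vector.
distfun = @(ZI, ZJ) sum(bsxfun(@ne, ZJ, ZI), 2);

% Made-up sample data: one observation and three candidate rows.
ZI = [1 2 3];
ZJ = [1 2 3; 1 0 0; 9 9 9];

D2 = distfun(ZI, ZJ);   % D2(k) is the distance between ZI and ZJ(k,:)
disp(D2)
```

A function that takes two strings and returns a single score does not fit this form directly, so it would need a wrapper that loops over the rows of ZJ.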
2 Comments
MiauMiau on 6 Dec 2014
Edited: Star Strider on 6 Dec 2014
Oh, I see. I used some code published on GitHub (see below), where the inputs are strings, but I think I can first convert my data to ASCII character codes.
function score = levenshtein(s1, s2)
% score = levenshtein(s1, s2)
%
% Calculates the Levenshtein (edit) distance between two strings:
% the minimum number of single-character insertions, deletions,
% and substitutions needed to turn one string into the other.
%
% s1: string
% s2: string
% score: levenshtein distance
%
% Author: Ben Hamner (ben@benhamner.com)
if length(s1) < length(s2)
    % Make s1 the longer string so the rows have length(s2)+1 entries
    score = levenshtein(s2, s1);
elseif isempty(s2)
    % Turning s1 into an empty string takes length(s1) deletions
    score = length(s1);
else
    % previous_row(j+1) is the distance from an empty prefix of s1 to s2(1:j)
    previous_row = 0:length(s2);
    for i=1:length(s1)
        current_row = 0*previous_row;
        current_row(1) = i;
        for j=1:length(s2)
            insertions = previous_row(j+1) + 1;
            deletions = current_row(j) + 1;
            substitutions = previous_row(j) + (s1(i) ~= s2(j));
            current_row(j+1) = min([insertions, deletions, substitutions]);
        end
        previous_row = current_row;
    end
    score = current_row(end);
end
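The conversion to character codes could look roughly like the sketch below. It assumes the names are in a cell array of strings, pads each row with zeros to a common width, and wraps levenshtein in a handle that takes one row and a matrix of rows. The sample names and the helper names X, decode, and levdist are hypothetical, and levdist only works if the levenshtein function above is on the path.

```matlab
% Hypothetical sample data standing in for the real city names.
citynames = {'Bern'; 'Basel'; 'Zurich'};

% Encode each name as a row of character codes, zero-padded on the right.
maxLen = max(cellfun(@length, citynames));
X = zeros(numel(citynames), maxLen);
for k = 1:numel(citynames)
    name = citynames{k};
    X(k, 1:length(name)) = double(name);
end

% Turn a padded row back into a string by dropping the zero padding.
decode = @(row) char(row(row > 0));

% Wrapper with one row in (ZI), many rows in (ZJ), and a column vector of
% distances out. It calls the levenshtein function shown above once per
% row of ZJ.
levdist = @(ZI, ZJ) arrayfun( ...
    @(k) levenshtein(decode(ZI), decode(ZJ(k, :))), (1:size(ZJ, 1))');
```

Zero works as padding here only because no real character has code 0. The handle levdist, rather than @levenshtein itself, is what would be passed as the 'Distance' value.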
Star Strider on 6 Dec 2014
Edited: Star Strider on 6 Dec 2014
I had to look up Levenshtein distance. It measures the minimum number of single-character edits (insertions, deletions, or substitutions) needed to convert one string into another. I don’t see any reason for it not to work in a knn classifier.
I had to review the documentation on fitcknn since I’ve not used it in a while. I’ve also never encountered a problem such as yours.
You likely don’t have to specify 'exhaustive' since according to the documentation, the routine will do that by default. If you do specify it, you have to precede it with 'NSMethod'. Its presence in the argument list without that is likely throwing the error.
See if:
mdl = fitcknn(citynames,citycodes, 'NumNeighbors',50, 'NSMethod','exhaustive', 'Distance',@levenshtein);
or
mdl = fitcknn(citynames,citycodes,'NumNeighbors', 50,'Distance',@levenshtein);
(without 'exhaustive') works.
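For reference, here is a small self-contained sketch of the name-value form with a function-handle distance. It uses made-up numeric data and a simple city-block distance instead of Levenshtein, so it only illustrates the calling pattern. It assumes the Statistics Toolbox is installed and that the predictors are passed as a numeric matrix.

```matlab
% Made-up numeric training data: two well-separated classes.
X = [0 0; 0 1; 5 5; 5 6];
Y = [1; 1; 2; 2];

% Custom distance in the DISTFUN form: city-block distance from the row ZI
% to each row of ZJ, returned as a column vector.
distfun = @(ZI, ZJ) sum(abs(bsxfun(@minus, ZJ, ZI)), 2);

% Every option is given as a name-value pair; 'exhaustive' is the value
% of 'NSMethod', not a standalone argument.
mdl = fitcknn(X, Y, 'NumNeighbors', 1, ...
    'NSMethod', 'exhaustive', 'Distance', distfun);

% Classify a new point that lies closest to the second class.
label = predict(mdl, [5 5.4]);
disp(label)
```

The same pattern should carry over to a Levenshtein-based handle, provided that handle follows the ZI/ZJ form and the strings are encoded as numeric rows.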
