# Graph analysis question

I have this data:

X = 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21

With

Y = 0.5 1 1.5 2 2.5 3 3.5 4 1 1.5 3 3.5 4.5 5.5 6 2 4 4.5 5.5 6.5 8

This may be more of a math question than a MATLAB question. If you plot the data above, you will see three distinct sets of data. Is there any way to get MATLAB to split this data automatically into three separate variables? The number of sets may change depending on the data set. The data I have provided is indicative of the real data, where the X axis is actually dates.

In short, I need MATLAB to detect how many separate data sets there are and split the data into different variables.

I've tried using a max and min point script, but I didn't get very far with it.

Thanks in advance

##### 3 Comments

Robert Cumming
on 3 Aug 2011

Yes, I see now that I have actually plotted it... I misinterpreted the question.

### Accepted Answer

the cyclist
on 3 Aug 2011

After I posted my answer about cluster analysis, I noticed the following:

Are your (X,Y) data always sorted as in your example, or can the pairs be jumbled? If they are always sorted, you can just look for negative jumps in Y, using the find() and diff() commands, and separate the data wherever there is such a jump downward.

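    X = [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21];
    Y = [0.5 1 1.5 2 2.5 3 3.5 4 1 1.5 3 3.5 4.5 5.5 6 2 4 4.5 5.5 6.5 8];
    N = numel(Y);

    % Indices where Y drops below the previous point; each one starts a new data set
    negativeJumpIndex = find(diff([0,Y])<0);
    numberNegativeJumps = numel(negativeJumpIndex);
    numberClusters = numberNegativeJumps + 1;

    % Start index of every data set, plus a sentinel one past the end
    indexToNewLine = [1 negativeJumpIndex N+1];

    % Assign a cluster label to each point
    IDX = zeros(N,1);
    for nc=1:numberClusters
        IDX(indexToNewLine(nc):indexToNewLine(nc+1)-1) = nc;
    end

    % Plot the points, grouped by cluster label
    figure
    gscatter(X,Y,IDX)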
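The question asks for separate variables rather than a label vector, so one possible follow-up is to collect each group in its own cell. The following is only a minimal sketch, assuming the data are sorted as in the example. The names Xsets and Ysets are illustrative and not part of the original answer, and cumsum is used here as a loop-free alternative for building the labels.

```matlab
X = 1:21;
Y = [0.5 1 1.5 2 2.5 3 3.5 4 1 1.5 3 3.5 4.5 5.5 6 2 4 4.5 5.5 6.5 8];

% Label each point: the label increases by one at every downward jump in Y.
IDX = cumsum([1, diff(Y) < 0]);

% Number of detected data sets (the last label).
numberClusters = IDX(end);

% Collect each data set in its own cell (hypothetical names Xsets, Ysets).
Xsets = cell(1, numberClusters);
Ysets = cell(1, numberClusters);
for k = 1:numberClusters
    Xsets{k} = X(IDX == k);
    Ysets{k} = Y(IDX == k);
end
```

Cell arrays are used because the number of data sets is not known in advance. Xsets{2} and Ysets{2}, for example, hold the second data set. The same logical indexing should work if X holds dates instead of plain numbers.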

### More Answers (2)

the cyclist
on 3 Aug 2011

##### 1 Comment

the cyclist
on 3 Aug 2011

Wolfgang Schwanghart
on 3 Aug 2011

I just tried your example. While the results of k-means clustering don't look too promising, the function clusterdata works quite well. Note that you need to know the number of clusters in your data a priori.

    x = [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21]';
    y = [0.5 1 1.5 2 2.5 3 3.5 4 1 1.5 3 3.5 4.5 5.5 6 2 4 4.5 5.5 6.5 8]';
    IDX = clusterdata([x y],'distance','chebychev','maxclust',3);
    gscatter(x,y,IDX)

Regards, W.
