A question about checking the significant difference between data sets
I have three columns of data (C1, C2, C3) that represent modeled precipitation values (360 rows) from three different models. I want to find out whether the differences between these three models are significant. I searched a lot and found that MATLAB has a Student's t-test function that can do this.
Since there are both ttest and ttest2 functions, I want to ask: is it correct to perform ttest2 between C1 and C2, then between C2 and C3, and finally between C1 and C3?
Is there an alternative way to do this in MATLAB?
Any help or advice is highly appreciated.
Thanks a lot
Abdolkarim Mohammadi on 29 May 2020
Edited: Abdolkarim Mohammadi on 29 May 2020
These tests differ considerably, and choosing one of them depends on your dataset and research design. Both of the functions you mentioned are t-tests, which compare means.
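ttest performs a one-sample t-test when called with a single sample. That form is suitable when you need to know whether your sample's mean equals a specified value, called the test value, and it is not suitable for your work. When called with two samples of equal length, as in ttest(x,y), it performs a paired-samples t-test on the differences x - y, which may fit your case if each row of C1, C2, and C3 refers to the same time step. The t-test is a parametric statistical test, meaning that your data (for the paired form, the pairwise differences) should have certain properties such as approximate normality. chi2gof, lillietest, or kstest can be used to check normality; note that kstest compares against the standard normal distribution by default, so the data must be standardized first. The non-parametric equivalent for the one-sample and paired cases is the Wilcoxon signed rank test (signrank).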
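The normality checks mentioned above can be sketched as follows. This is only an illustration: the data are synthetic placeholders (the 50 and 20 are arbitrary, assumed values), and all three functions require the Statistics and Machine Learning Toolbox.

```matlab
% Synthetic placeholder data; replace with your own precipitation column.
rng(0);                          % fixed seed so the run is repeatable
C1 = 50 + 20*randn(360, 1);      % hypothetical stand-in for one model's output

% kstest compares against the standard normal by default, so standardize first.
% With mean and std estimated from the same data, this check tends to be conservative.
z = (C1 - mean(C1)) / std(C1);
[hKS, pKS] = kstest(z);

% lillietest accounts for the mean and variance being estimated from the sample.
[hLF, pLF] = lillietest(C1);

% chi2gof fits a normal distribution to the data by default.
[hChi, pChi] = chi2gof(C1);

% h = 0 means normality is not rejected at the 5% level.
fprintf('kstest p = %.3f, lillietest p = %.3f, chi2gof p = %.3f\n', pKS, pLF, pChi);
```

For a paired comparison, the same checks would be applied to the differences, for example C1 - C2, rather than to each column separately.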
ttest2 performs a two-sample t-test for independent samples. It is suitable when you want to know whether two unrelated samples have equal means; for paired samples, use ttest(x,y) instead. This is also a parametric test, so each sample should have certain properties such as approximate normality. Again, chi2gof or kstest can be used to check the normality of the variables. The non-parametric equivalents are the Wilcoxon rank sum test (ranksum) for independent samples and the signed rank test (signrank) for paired samples.
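As a rough sketch of the three pairwise comparisons asked about in the question: in MATLAB, ttest2(x,y) treats the two samples as independent, while ttest(x,y) is the paired test. The data below are synthetic placeholders, and the Bonferroni-adjusted threshold is just one common way to account for running three tests, not the only option.

```matlab
% Synthetic placeholder data: three "models" sharing the same underlying signal.
rng(0);
base = 50 + 20*randn(360, 1);            % hypothetical shared precipitation signal
C1 = base + 5*randn(360, 1);
C2 = base + 5*randn(360, 1) + 1;
C3 = base + 5*randn(360, 1) + 3;
data = [C1 C2 C3];

pairs = nchoosek(1:3, 2);                % rows: [1 2], [1 3], [2 3]
nPairs = size(pairs, 1);
alphaAdj = 0.05 / nPairs;                % Bonferroni-adjusted significance level

pInd  = zeros(nPairs, 1);                % independent two-sample t-test
pPair = zeros(nPairs, 1);                % paired t-test
pSign = zeros(nPairs, 1);                % Wilcoxon signed rank (paired, non-parametric)
pRank = zeros(nPairs, 1);                % Wilcoxon rank sum (independent, non-parametric)

for k = 1:nPairs
    x = data(:, pairs(k, 1));
    y = data(:, pairs(k, 2));
    [~, pInd(k)]  = ttest2(x, y);
    [~, pPair(k)] = ttest(x, y);
    pSign(k) = signrank(x, y);
    pRank(k) = ranksum(x, y);
    fprintf('C%d vs C%d: ttest2 p = %.4f, paired ttest p = %.4f\n', ...
        pairs(k, 1), pairs(k, 2), pInd(k), pPair(k));
end

significantPaired = pPair < alphaAdj;    % logical flag per pair
```

When the rows are matched, the paired tests usually have more power than the independent ones, because the shared row-to-row variation cancels in the differences.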
There is also another test which I think is the best choice for you. ANOVA compares the means of two or more groups in just one test. It can be thought of as the extension of ttest2 to k groups. You have only one variable (precipitation) and your samples come from different models, which are the categories (groups) of that variable. So you need to perform the one-way ANOVA test (anova1). ANOVA is a parametric test and requires your dataset to have properties like normality, similar variances across groups, and independent groups. If your data do not satisfy its assumptions, you need to use a non-parametric equivalent: the Kruskal-Wallis test (kruskalwallis) for independent groups, or the Friedman test (friedman) when the rows are matched across groups, for example when each row is the same time step in all three models.
If the null hypothesis is rejected in ANOVA, Kruskal-Wallis, or Friedman, then you have to perform a post-hoc analysis to find out which categories (samples) differ significantly. You pass the stats output of the test to multcompare, and it performs a suitable post-hoc comparison for your original test.
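A minimal sketch of this workflow, assuming the three columns are stored as 360-by-1 vectors. The data here are synthetic placeholders, and the 'off' arguments only suppress the figure windows.

```matlab
% Synthetic placeholder data; replace with your own C1, C2, C3.
rng(0);
base = 50 + 20*randn(360, 1);            % hypothetical shared precipitation signal
C1 = base + 5*randn(360, 1);
C2 = base + 5*randn(360, 1) + 1;
C3 = base + 5*randn(360, 1) + 3;
X = [C1 C2 C3];                          % each column is one group (model)

% One-way ANOVA: treats the columns as independent groups.
[pAnova, ~, statsAnova] = anova1(X, [], 'off');

% Kruskal-Wallis: non-parametric, also for independent groups.
[pKW, ~, statsKW] = kruskalwallis(X, [], 'off');

% Friedman: non-parametric, rows are matched blocks (1 observation per cell).
[pFried, ~, statsFried] = friedman(X, 1, 'off');

fprintf('anova1 p = %.4g, kruskalwallis p = %.4g, friedman p = %.4g\n', ...
    pAnova, pKW, pFried);

% Post-hoc comparison; pass the stats struct of whichever test you rely on.
% Each row of c is one pair: columns 1 and 2 hold the group indices.
% In recent releases the last column holds the p-value for that pair.
c = multcompare(statsFried, 'Display', 'off');
disp(c);
```

Swapping statsFried for statsAnova or statsKW in the multcompare call gives the post-hoc comparison for the corresponding test.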