Friday, 4 March 2016

Inferential and Descriptive Statistics, Data Levels, Significance and p-Values

If you're unsure about the types of hypothesis and what they predict, I'd suggest reading this post: http://samsa2psych.blogspot.co.uk/2016/02/hypotheses-sampling-and-design.html before trying to get your head around inferential statistics!


Descriptive Statistics



Descriptive statistics summarise the results of a set of data, giving a typical score and how spread out the scores are. They are divided into measures of central tendency and measures of dispersion.


Measures of central tendency give a typical score. They include the mean (add all the values up and divide by the number of values), the median (arrange the values in numerical order and take the one in the middle) and the mode (the most common value).


Measures of dispersion describe the spread of results, indicating how consistent the values that make up our results are. These include the range (difference between highest and lowest value), the interquartile range, and the standard deviation.
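As a minimal sketch (the scores here are hypothetical, not from any study mentioned in this post), these descriptive statistics can all be computed with Python's standard library:

```python
# Descriptive statistics for a hypothetical set of scores.
from statistics import mean, median, mode, stdev

scores = [4, 7, 7, 8, 10, 12, 15]

print("mean:", mean(scores))                 # add up, divide by count -> 9
print("median:", median(scores))             # middle value in order -> 8
print("mode:", mode(scores))                 # most common value -> 7
print("range:", max(scores) - min(scores))   # highest minus lowest -> 11
print("standard deviation:", round(stdev(scores), 2))  # -> 3.65
```

Note that `stdev` gives the sample standard deviation; `pstdev` would give the population version.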


Inferential Statistics



Inferential statistics tell us the probability that differences in our results are due to direct IV manipulation (the alternative hypothesis = a causal link between the IV and the DV) rather than to chance or extraneous variables. The inferential test used depends on the experimental design and the level of data collected - data can be classified at one of three levels.


Nominal level data - the most basic form: named categories. We know little about a result other than which category it fits into. For example, measuring height at nominal level could be done through two categories: "above 6ft" and "below 6ft".


Ordinal level data (rank measurement) - data is ordered from highest to lowest value. We don't know the gaps and boundaries between rankings (parameters), but we do know what order results come in. For example, measuring height at ordinal level would be done by arranging results from highest to lowest.


Likert scales are an example of ordinal data: a participant is given a statement and asked how much they agree or disagree with it. A response of "4" means that they agree more than a response of "2", but we do not know how much more - the scale lacks parameters.


In Milgram's study of obedience, he measured the maximum voltage of electric shock participants would give to a confederate - but this data is only "at least ordinal": administering 200V did not necessarily make participants twice as obedient as administering 100V, or half as obedient as administering 400V.


Interval level data - we know the gaps and boundaries between data points, and the parameters are scientifically equivalent. For example, measuring height in metres and centimetres. Ratio level data is a type of interval data with a true zero point - for example, height.
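To make the three levels concrete, here is the same hypothetical height data (illustrative values, not from the post) expressed at each level:

```python
# Interval/ratio level: the actual measurements in centimetres.
heights_cm = [180, 195, 170, 186, 165]
SIX_FEET_CM = 182.88  # 6ft expressed in cm

# Nominal level: named categories only - we lose everything except membership.
nominal = ["above 6ft" if h > SIX_FEET_CM else "below 6ft" for h in heights_cm]

# Ordinal level: order from highest to lowest - we lose the gaps between values.
ordinal = sorted(heights_cm, reverse=True)

print(nominal)  # ['below 6ft', 'above 6ft', 'below 6ft', 'above 6ft', 'below 6ft']
print(ordinal)  # [195, 186, 180, 170, 165]
```

Moving down the levels only ever discards information: you can always convert interval data to ordinal or nominal, but never the other way round.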


Metrics such as the number of words remembered from a list are also not necessarily interval level data - it is unclear how much better a memory each extra word represents - someone who can remember 10 words does not necessarily have twice as good a memory as someone who can remember 5.


In cases such as the above, where it is ambiguous whether the data is ordinal or interval, we say the data is at least ordinal level and treat it accordingly.


Types of Inferential Statistical Test



Once we have established the level of the data collected, we select an appropriate test to calculate the statistical significance of the results.


                Repeated Measures /       Independent         Correlational
                Matched Pairs             Groups

Nominal         Binomial Sign Test        Chi-Squared         Chi-Squared

Ordinal         Wilcoxon Signed Ranks     Mann-Whitney U      Spearman's Rho

Interval        Related t-Test            Unrelated t-Test    Pearson's
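As a sketch of one test from the table, here is the binomial sign test (repeated measures, treating each difference as nominal: improved or not). The before/after scores and variable names are hypothetical, purely for illustration:

```python
# Binomial sign test on hypothetical before/after scores (repeated measures).
from math import comb

before = [5, 7, 6, 8, 4, 6, 7, 5]
after  = [6, 9, 6, 9, 6, 8, 8, 7]

# Keep only the sign of each difference, ignoring ties (zero differences).
diffs = [a - b for a, b in zip(after, before) if a != b]
n = len(diffs)

# S is the count of the less frequent sign.
s = min(sum(1 for d in diffs if d > 0), sum(1 for d in diffs if d < 0))

# Two-tailed p-value: probability of S or fewer of the rarer sign under the
# null hypothesis that each sign is equally likely (p = 0.5).
p = 2 * sum(comb(n, k) for k in range(s + 1)) / 2 ** n
print(f"S = {s}, n = {n}, p = {p:.4f}")  # S = 0, n = 7, p = 0.0156
```

Here the calculated p of about 0.016 is below the usual 0.05 critical value, so the result would be reported as significant at the p ≤ 0.05 level.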










These tests give us a p-value: a measure of the probability that our results are due to chance/EVs rather than IV manipulation (validating the alternative hypothesis). The calculated p-value is compared to a chosen cut-off between 0 and 1.0 (the "critical value") to judge significance.


p ≤ 0.10 means that the probability that the results are due to chance (p) is less than or equal to 0.1 (10%), so the probability that they are due to IV manipulation is greater than or equal to 90% - if our p-value is at or below 0.10, we can be 90% certain that the alternative hypothesis is valid.


Generally, in psychology, we measure significance at p ≤ 0.05 - the probability of the results being due to chance must be 5% or less, meaning that for results to be statistically significant we must be 95% sure they are due to IV manipulation, not chance. The value that p must be at or below is our "critical value" - the level required for significance. This is a fairly stringent measure of significance, although sometimes the stricter p ≤ 0.01 (99% certainty) is used.
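The decision rule is simple enough to sketch in a few lines (the p-values below are hypothetical examples, not real results):

```python
# Judging significance: compare a calculated p-value to the critical value.
def is_significant(p_value, critical_value=0.05):
    """Significant when p is less than or equal to the critical value."""
    return p_value <= critical_value

print(is_significant(0.03))        # True  - significant at p <= 0.05
print(is_significant(0.07))        # False - not significant at p <= 0.05
print(is_significant(0.03, 0.01))  # False - fails the stricter p <= 0.01 level
```

The same p-value of 0.03 is significant at the 0.05 level but not at the 0.01 level, which is exactly why an answer must always state which critical value was used.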


A type 1 error occurs when we incorrectly conclude that there is a significant difference and accept our alternative hypothesis when in fact there is no significant difference and our hypothesis is incorrect. We incorrectly reject a true null hypothesis. This becomes more likely when the critical value is set too high, i.e. too lenient (e.g. at p ≤ 0.2, a 20% chance that the results are due to chance is still counted as statistically significant) and our p-value falls below that level. The more stringent our level of significance, the lower the chance of making a type 1 error - at p ≤ 0.01, the chance of a type 1 error is 1% or less.


A type 2 error occurs when we incorrectly conclude that there is no significant difference when one does actually exist, rejecting our alternative hypothesis despite it being valid and correct. We incorrectly accept a false null hypothesis. This becomes more likely when the critical value is set too low, i.e. too strict (e.g. at p ≤ 0.01, only results with a 1% or less chance of being due to chance are counted as significant - a result giving a 98.9% chance that the alternative hypothesis is correct is rejected as insignificant, because 99% is required) and our p-value falls above that level. The more stringent our level of significance, the greater the chance of making a type 2 error.
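A small simulation (illustrative only, not from the post) shows how the critical value controls type 1 errors. Every "experiment" below has a true null hypothesis, so any result counted as significant is by definition a type 1 error:

```python
# Simulating type 1 errors: 10,000 experiments where the null hypothesis is TRUE.
import random

random.seed(42)  # fixed seed so the run is reproducible

# Under a true null hypothesis, p-values are (roughly) uniformly distributed.
p_values = [random.random() for _ in range(10_000)]

for critical_value in (0.2, 0.05, 0.01):
    errors = sum(1 for p in p_values if p <= critical_value)
    print(f"critical value {critical_value}: {errors} type 1 errors out of 10,000")
```

With the lenient 0.2 critical value roughly 20% of these null experiments come out "significant"; at 0.01 it drops to roughly 1% - but, as above, the stricter level makes type 2 errors more likely instead.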


When answering questions on this, refer to three points in your answer: the calculated p-value, the value required to indicate significance (the critical value), and the degree of significance. If the critical value for significance is 0.05, then we have a fairly high degree of significance when p ≤ 0.05 - the probability that our results are due to IV manipulation rather than chance or EVs is greater than or equal to 95%. This would be phrased by writing "this result is significant at the p ≤ 0.05 level".


Some inferential statistical tests work in the opposite way, with the calculated value needing to be greater than the critical value for significance. If this is the case, the question will tell you.












