Friday 26 February 2016

Reliability and Validity

Only description really here - Research Methods doesn't have the clear AO1/AO2 division that the other units have.


Validity


Validity is the extent to which research measures what it claims to measure – a test of emotional intelligence should measure emotional intelligence rather than something else, like memory.

Internal validity refers to how much the effects on the dependent variable (DV) observed in a study are due to manipulation of the independent variable (IV), rather than something else – whether there is a causal relationship between the IV and the DV.

Internal validity can be improved by:
  • Extraneous variable (EV) control
  • Standardisation of instructions
  • Counterbalancing
  • Controlling for individual differences
  • Reducing demand characteristics and investigator effects.

Ecological validity refers to the extent to which the results of a study can be generalised to other settings and situations. A study carried out entirely within a university might have very poor ecological validity, because the specific culture or practices of that setting make it difficult to generalise the results accurately to other situations.

Population validity – The extent to which the results of a study can be generalised to people other than the studied sample. Studies are often Eurocentric or Americanocentric, reflecting a westernised worldview and an individualist philosophy, so their findings are often inapplicable or invalid when applied to eastern, collectivist cultures.

Validity over time – The extent to which the results of a study can be generalised to other historical or future societies and situations. Results from Solomon Asch's study on conformity, carried out in America during the period of the Cold War known as McCarthyism, when individuality was highly discouraged, may not apply to America's more liberal and tolerant society nowadays.


To test for validity we can use:
  • Concurrent validity – Assessing how well a study’s results correlate with those of another, already validated study or measure carried out at a similar time – if the results and conclusions are similar, the study we are assessing is more likely to be valid.
  • Content validity – The extent to which a measure represents all elements of the system or construct being studied. For example, a measure of intelligence is more valid if it takes into account emotional, verbal, spatial and mathematical intelligence, rather than just a single measure.
  • Face validity – Whether the test appears, at face value, to measure what it claims to. 

Reliability

Reliability is the consistency of a research study or measuring test – the extent to which its results match the results of similar studies, or how far the study can be replicated with similar results.


Internal reliability is the degree to which a measure is consistent within itself – checking that all parts of a study are testing for the same thing.



To assess internal reliability, we can use the split-half method for any metric or study that uses a numerical system such as a numbered rating or Likert scale. Each participant's responses to the questions are split into two halves and each half is totalled. If the metric has high internal reliability, the two totals should be similar – indicating that the questions have similar degrees of relevance and importance. Scores on one half are then correlated with scores on the other half across participants – a stronger positive correlation indicates higher internal reliability of the metric.
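The split-half calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are made up, the function names `pearson_r` and `split_half_r` are hypothetical, and splitting into odd- and even-numbered questions is just one common way to form the two halves.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def split_half_r(responses):
    """responses: one list of question scores per participant.

    Assumption: questions are split into odd- and even-numbered halves.
    Each half is totalled per participant, then the two sets of totals
    are correlated across participants.
    """
    half_a = [sum(r[0::2]) for r in responses]  # questions 1, 3, 5, ...
    half_b = [sum(r[1::2]) for r in responses]  # questions 2, 4, 6, ...
    return pearson_r(half_a, half_b)

# Hypothetical data: 5 participants, 6 questions rated on a 1-5 Likert scale
responses = [
    [5, 4, 5, 5, 4, 5],
    [2, 1, 2, 2, 1, 1],
    [3, 3, 4, 3, 3, 3],
    [4, 4, 4, 5, 4, 4],
    [1, 2, 1, 1, 2, 2],
]

r = split_half_r(responses)
print(f"Split-half correlation: {r:.2f}")
```

A value close to +1 would suggest the two halves are measuring the same thing; a value near 0 would suggest poor internal reliability.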


External reliability is the extent to which a measure or metric stays consistent from one use to another – whether over time or across the people carrying out the research. This can be assessed using:

Test-retest method – Carrying out the same test at different points in time to measure the stability of the metric over time. The test is given to the same participants again at a later point, and the two sets of scores are correlated. A stronger positive correlation indicates higher external reliability.
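As a rough sketch of the test-retest check, the scores from the two sittings can be correlated like this. The scores are invented for illustration and the helper name `pearson_r` is hypothetical; each position in the two lists is assumed to belong to the same participant.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same 5 participants, tested twice
first_sitting = [12, 18, 25, 30, 22]
second_sitting = [14, 17, 27, 29, 21]

retest_r = pearson_r(first_sitting, second_sitting)
print(f"Test-retest correlation: {retest_r:.2f}")
```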

Inter-rater reliability – Have multiple raters carry out an assessment using the same metric or test with the same rating scale and check that they have similar results.
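One way to check that two raters have similar results is sketched below. Note the assumptions: the ratings are made up, the function names are hypothetical, and Cohen's kappa is not mentioned in these notes; it is one commonly used statistic that adjusts simple agreement for the agreement expected by chance.

```python
from collections import Counter

def percent_agreement(ratings_a, ratings_b):
    """Proportion of cases where both raters gave the same rating."""
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)

def cohens_kappa(ratings_a, ratings_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(ratings_a)
    p_observed = percent_agreement(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    categories = set(ratings_a) | set(ratings_b)
    # Chance agreement: probability both raters pick the same category
    # if each rated independently at their own observed rates.
    p_chance = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical data: two raters categorising the same 8 observed behaviours
rater_1 = ["aggressive", "passive", "aggressive", "neutral",
           "passive", "aggressive", "neutral", "passive"]
rater_2 = ["aggressive", "passive", "neutral", "neutral",
           "passive", "aggressive", "neutral", "aggressive"]

agreement = percent_agreement(rater_1, rater_2)
kappa = cohens_kappa(rater_1, rater_2)
print(f"Agreement: {agreement:.2f}, kappa: {kappa:.2f}")
```

If the raters give numerical scores rather than categories, correlating their scores (as in the test-retest method) would be the more natural check.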

