

Reliability is concerned with the consistency or dependability of a question, and is analogous to the repeated use of a scale. If a scale is reliable, it will report the same weight for the same item measured successively (assuming the weight of the item has not changed). As is the case with validity, perfect reliability can be difficult, if not impossible, to achieve. Even so, increasing the reliability of a question or questionnaire has important implications for our ability to use the results. As with the example of the scale, a questionnaire which is unable to provide consistent results has little to no useful purpose. To that end, there are several methods that can be used to increase the reliability of a question or questionnaire.

Test-retest reliability: Identical to the process described in the scale example above, test-retest reliability relies upon administering the question to a subject multiple times. Assuming no changes in the sample subject, the question/questionnaire/indicator should return consistent results over multiple administrations.
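In practice, test-retest consistency is often summarized as a correlation between the two administrations. Below is a minimal sketch, assuming numeric scores and using SciPy's pearsonr; the respondent scores are hypothetical, invented purely for illustration.

```python
from scipy.stats import pearsonr

# Hypothetical scores from the same ten respondents on two
# administrations of the same question, two weeks apart.
time_1 = [3, 5, 2, 4, 4, 1, 5, 3, 2, 4]
time_2 = [3, 4, 2, 4, 5, 1, 5, 3, 2, 4]

r, p_value = pearsonr(time_1, time_2)
print(f"test-retest correlation: r = {r:.2f} (p = {p_value:.3f})")
```

A correlation near 1 suggests the question is returning consistent results across administrations; the same comparison applies to the parallel-form approach described next.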

Parallel form and inter-item reliability: Oftentimes there is more than one way to measure the phenomenon in which we are interested. Both parallel form and inter-item reliability depend upon such duplication to lend support to the consistency of our indicators. With parallel form, the researcher purposefully creates at least two versions of a questionnaire (similar to using two versions of a test in a course). If the same individuals score the same on both versions, the indicators can be said to have parallel form reliability. Alternatively, inter-item reliability retains the duplicate indicators within a single survey instrument. The researcher then examines the results to determine if the two indicators, each phrased differently, are producing similar results. If so, the indicators can be said to have inter-item reliability.

Split-half reliability: This procedure is meant to be used on indicator constructs (or indexes), where a series of questions are thought to collectively measure a phenomenon. In these cases, a researcher can split the construct in half and compare the results of the two halves to each other. If the construct halves are in agreement, the construct can be said to have split-half reliability. In addition, a Cronbach's Alpha test can quantify the internal consistency of the construct.
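The sketch below shows both checks on a small indicator construct: an odd/even split-half correlation (with the standard Spearman-Brown correction for halving the test length) and Cronbach's Alpha. It assumes numeric item scores; the Likert-style responses are hypothetical.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's Alpha for an n_respondents x k_items score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)        # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)    # variance of summed scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def split_half(items):
    """Correlate odd- vs. even-numbered item halves, then apply the
    Spearman-Brown correction to estimate full-length reliability."""
    items = np.asarray(items, dtype=float)
    odd = items[:, 0::2].sum(axis=1)
    even = items[:, 1::2].sum(axis=1)
    r = np.corrcoef(odd, even)[0, 1]
    return 2 * r / (1 + r)

# Hypothetical 5-point Likert responses: 6 respondents x 4 items.
scores = [[4, 5, 4, 5],
          [2, 1, 2, 2],
          [3, 3, 4, 3],
          [5, 5, 5, 4],
          [1, 2, 1, 2],
          [4, 4, 3, 4]]
print(f"alpha = {cronbach_alpha(scores):.2f}")
print(f"split-half (Spearman-Brown) = {split_half(scores):.2f}")
```

A commonly cited rule of thumb treats Alpha values of roughly 0.7 or higher as acceptable internal consistency, though the appropriate threshold depends on the stakes of the research.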

Interrater reliability: Applicable to open-ended questions, or the results of interviews or focus groups, interrater reliability is concerned with how those charged with reading the results are interpreting them. To increase the consistency of how responses are categorized, this technique relies upon multiple individuals reviewing the same results.
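Agreement between two raters is often quantified with Cohen's kappa, which corrects raw percent agreement for the agreement expected by chance. Below is a minimal, self-contained sketch; the category labels and codings are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    n = len(rater_a)
    # Observed agreement: share of items both raters coded identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement: product of each label's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(freq_a) | set(freq_b)
    p_e = sum((freq_a[lab] / n) * (freq_b[lab] / n) for lab in labels)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical codes two raters assigned to ten open-ended responses.
rater_a = ["pos", "pos", "neg", "neu", "pos", "neg", "neg", "pos", "neu", "pos"]
rater_b = ["pos", "neg", "neg", "neu", "pos", "neg", "pos", "pos", "neu", "pos"]
print(f"kappa = {cohens_kappa(rater_a, rater_b):.2f}")
```

Kappa equals 1.0 for perfect agreement and falls to about 0 when agreement is no better than chance; one widely used benchmark reads values above roughly 0.6 as substantial agreement.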
