What forms of reliability and validity do Grogan et al. (2000) establish for their patient satisfaction questionnaire (PSQ) measure?
- Briefly (one paragraph each) review the forms established.
1) Grogan et al. (2000) assessed internal reliability to check the consistency of their subscales. They used Cronbach’s alpha to measure the correlation of each subscale with the general satisfaction subscale. The results showed high coefficients, ranging from .74 to .95, indicating that the subscales are internally reliable. Cronbach’s alpha is efficient and the most widely used measure of internal reliability; however, its result can only be expressed in terms of consistency or inconsistency. Moreover, Cronbach’s alpha allows items to be discarded from the analysis simply to obtain a better alpha value (Vehkalahti, 2000). Furthermore, the doctor subscale had a very high alpha value (.95); it is therefore reasonable to exclude the ‘doctor’ subscale from the five-factor domain, since it is very similar to the general satisfaction subscale.
2) Grogan et al. (2000) chose an internal validity approach to assessment (focusing on inferences about the cause and effect of one variable on another). They tested construct validity using confirmatory factor analysis (CFA), Pearson’s product-moment correlation (PPMC), and analysis of variance (ANOVA). The CFA was used to assess how closely the 40 items fitted the appropriate factor of the five-factor model (doctors, nurses, access, appointments, and facilities), which was proposed to embody patient satisfaction. The result showed a low value of the fit measure and a high non-normed fit index (NNFI), indicating that the items fitted the five-factor model well. The PPMC was used to examine the correlation of each subscale with the general satisfaction subscale, and the result showed a significant positive correlation. ANOVA was used to compare five subgroups (patients divided according to age) on the 46-item satisfaction scores. The results showed a significant difference between age groups, with older patients more satisfied with the service provision than younger patients. Construct validity is widely used because of its relevant and clear measurements, but the present study did not report low correlation scores that would identify irrelevant items. Moreover, construct validity is also subjective (in judging the items, the researchers rely on their belief that the items measure what they are supposed to measure).
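The internal reliability check in point 1 can be illustrated with a short computation. The following is a minimal sketch of the standard Cronbach’s alpha formula, not the analysis Grogan et al. actually ran; the function name cronbach_alpha and the response data are hypothetical and serve only as an illustration.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Return Cronbach's alpha for a table of item scores.

    scores: list of respondents, each a list of item scores
    (one row per respondent, one column per item).
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")

    # Sample variance of each item (column) across respondents.
    item_columns = list(zip(*scores))
    sum_item_variances = sum(variance(col) for col in item_columns)

    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])

    return (n_items / (n_items - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical responses: five patients rating four items on a 1-5 scale.
responses = [
    [4, 5, 4, 5],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.2f}")
```

By the convention mentioned in this essay, a value above .70 would be read as acceptable internal consistency, while values very close to 1 may suggest that some items are redundant.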
- Also briefly review other forms that might be established and how this might be achieved.
1) Test-retest reliability is another good way to test the reliability of the item measurements. This can be done by giving the questionnaire to the same respondents (patients) on two occasions, preferably with the second three months after the initial test (Kline, 1993). The scores from the two tests can then be correlated. If the correlation coefficient is high, the questionnaire (the subscale items) is reliable and consistent. This can be done by analysing the Cronbach’s alpha value, which needs to be greater than .70 to be reliable but not greater than 1 (preferably not greater than .95). Moreover, the test-retest should be carried out not only with the patients who responded fully but also with those who responded only partially (17% of the patients), to check whether any specific group of patients found the questionnaire unreliable for them.
2) Alternate-form reliability can be used to assess the reliability of the items. This can be done by rewording each item (while keeping the same meaning) to measure patients’ satisfaction on the five-factor dimensions. According to Litwin (1995), the items created should be similar but not identical to each other, and the test should be given to the same patients at different times. The correlation between their scores will show the reliability of the questionnaire’s measurement. Judged by Cronbach’s alpha, a high correlation between the items indicates high consistency of measurement.
3) Interobserver reliability is a method that can be used to find out how well the subscales agree (Litwin, 1995). It measures how the five-factor domains agree with the 46-item questionnaire by allowing the professionals from each domain (such as the doctors, the nurses, and the people responsible for the ‘environment’, ‘access’ and ‘facilities’ factors) to answer the questionnaire and assess their own satisfaction with the service provision. The data can be analysed using Pearson’s correlation to find the correlation coefficient between the items and satisfaction. A high correlation indicates higher reliability of the subscale.
4) External ways of assessing validity are also worth mentioning, that is, whether the subscales can be generalized across different patients, places and times. This can be achieved with the sampling model and proximal similarity model approaches, in which the questionnaire is distributed first to a sample population, then to a nearby population, and lastly to an outside population. The scores from these populations can be analysed using ANOVA to reveal any significant correlation. If the correlation is significant (p < .05), this indicates that the subscales are externally valid.
5) Criterion validity is a good way of analysing the validity of the research, as poor criterion tests would lead to an inefficient measurement technique. It has two major forms: predictive validity and concurrent validity. Predictive validity can be applied to find out how well the service of general practitioners could predict patients’ satisfaction in the future. This can be done by asking the people in the five domain factors (doctors, nurses, etc.) and the patients to fill in the questionnaire separately. The scores would then be analysed by factor analysis (CFA) to see whether the domain factors fit the five-factor model, and by the PPMC to see their correlation with satisfaction. If the domains’ scores fit the model and have a high correlation coefficient, this could predict that the patients’ scores would be similar. In contrast, concurrent validity cannot be applied because there is no ‘gold-standard’ questionnaire of patient satisfaction with which the PSQ could be compared.
6) Content validity can be addressed by finding out how adequately the items reflect their domain. This can be examined using CFA, which gives an approximate indication of each item’s adequacy (i.e. which of the five factor domains the item belongs to).
7) Method bias can be assessed to detect the presence of any biased items in the questionnaire. This can be done using logistic regression. Items are considered biased if their characteristics allow the respondent to give only certain answers that favour the aim of the study.
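The test-retest comparison described in point 1 above rests on correlating the scores from the two administrations. The following is a minimal sketch of that step using Pearson’s product-moment correlation computed by hand; note that it uses Pearson’s r rather than Cronbach’s alpha, and that the function name pearson_r and the two sets of total scores are hypothetical, not data from Grogan et al. (2000).

```python
from math import sqrt


def pearson_r(x, y):
    """Return Pearson's product-moment correlation between two score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long lists with at least two scores")

    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sum of cross-products and sums of squared deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)

    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one list has no variation")

    return cross / sqrt(ss_x * ss_y)


# Hypothetical total satisfaction scores for the same five patients,
# at the first administration and again three months later.
time_1 = [150, 132, 170, 118, 160]
time_2 = [148, 135, 165, 120, 158]

test_retest_r = pearson_r(time_1, time_2)
print(f"Test-retest correlation: {test_retest_r:.2f}")
```

A coefficient close to 1 would suggest that patients’ scores are stable across the two occasions; the same function could be applied to the alternate-form comparison in point 2 by correlating scores on the two versions of the questionnaire.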
REFERENCES
Carmines, E. G., & Zeller, R. A. (1979). Reliability and validity assessment. London: Sage.
Grogan, S., Conner, M., Norman, P., & Porter, I. (2000). Validation of a questionnaire measuring patient satisfaction with general practitioner services. Quality in Health Care, 9, 210-215.
Kane, M. T. (2001). Current concerns in validity theory. Journal of Educational Measurement, 38(4), 319-342.
Kerlinger, F. N. (1986). Foundations of behavioural research. London: Holt, Rinehart and Winston.
Kline, P. (1986). A handbook of test construction. New York: Methuen.
Kline, P. (1993). The handbook of psychological testing. New York: Routledge.
Litwin, M. S. (1995). How to measure survey reliability and validity. London: Sage.
Loewenthal, K. M. (2001). An introduction to psychological tests and scales. Hove: Psychology Press.
Rubin, H. R., Gandek, B., & Rogers, W. H. (1993). Patients’ ratings of outpatient visits in different practice settings: Results from the medical outcomes study. Journal of the American Medical Association, 270, 835-840.
Vehkalahti, K. (2000). Reliability of measurement scales. Retrieved November 18, 2009, from http://ethesis.helsinki.fi/julkaisut/val/tilas/vk/vehkalahti/