
Measuring pain catastrophizing in the clinical setting




This post is a shortened version of one that can be found on http://www.whipresearch.com/blog-home.

This post concerns our recent publication in the Clinical Journal of Pain, in which Drs. Tim Wideman, Michael Sullivan and I performed a Rasch analysis on the commonly used Pain Catastrophizing Scale. For those interested in the true nitty-gritty of the methods, analyses and results, I’ll refer you to the paper itself. However, a cursory summary of the findings is warranted, after which I’ll offer up some opinions regarding the clinical or real-world implications.

We used a database of 235 subjects with work-related injuries (93% low back pain) from Quebec for this analysis. Rasch analysis is considered one of the ‘new’ techniques for evaluating the properties of a scale (although it’s been around since the early ’60s). Some readers may also be familiar with the term Item Response Theory (IRT), which is another ‘new’ approach to scale evaluation that has also been around since the ’60s. These are in contrast to approaches drawn from Classical Test Theory (CTT), with which most readers will likely be familiar; CTT gives us such characteristics as reliability, minimum detectable change, and construct validity, to name a few. Rasch and IRT purists will argue that CTT is a ‘weak’ approach to scale evaluation, in that it necessarily involves the non-confirmable notion of random error, which can never be either accepted or refuted. Rasch and IRT do not include random error in their mathematical models, and hence are largely considered more conceptually sound (although, as with anything in statistics, alternative opinions abound).

Anyhoo, the value of Rasch analysis is that it allows for some very deep exploration of the properties of a scale, right down to the level of the individual item and the individual response options. The end result of a scale that ‘fits’ the Rasch model is that the scale can be confidently considered an interval-level measurement tool. Why is this important? For starters, most statistical tests (e.g. t-tests, ANOVA, Pearson correlation, linear regression) are only purely appropriate if the data are normally distributed and are interval-level. In fact, most mathematical procedures require interval-level data, which in a simple nutshell means that the conceptual distance between a 1 and a 2 is the same as the distance between a 10 and an 11. If you want to perform any kind of mathematical operation (addition, subtraction, multiplication or division) then this property has to be true in order for the result to make any sense. This cannot be assumed for most ordinal-level scales, which form the vast majority of scale types in most health-related fields. For example, we can’t assume that the distance between ‘strongly disagree’ and ‘disagree’ is the same as the distance between ‘disagree’ and ‘agree’, the latter actually requiring a conceptual transition across the threshold from general disagreement to general agreement. So basically, in order for almost all of our current knowledge on pain catastrophizing and its effects on things like treatment effectiveness or long-term outcomes to be valid, the Pain Catastrophizing Scale (PCS) had better act like an interval-level measure when tested. To keep this story at least somewhat short, the results of our analysis indicate that it does, but with some potentially important caveats.

The first is that the response options for two items (items 8 and 12) were somewhat disordered; that is, respondents appeared to have some difficulty in reliably answering those questions with the response options given. This doesn’t mean the PCS is in any way a poor scale; in fact, we’ve shown rather convincingly (in my opinion) that it actually functions quite well, so arguably this may be irrelevant. But I highlight them because I believe it’s important for clinicians to look closely at not just the statistical properties of a scale as reported in the scientific literature, but also at the qualitative properties of a scale to help them truly interpret what the scale can and what it cannot tell them.

The end result of this is that we’ve suggested a rescoring of items 8 and 12 so that they’re both now out of 3 rather than 4, and the overall scale is out of 50 rather than 52, the nice round number being a pleasant side effect.
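For readers who like to see the arithmetic, here is a minimal Python sketch of that rescoring. It assumes the usual PCS layout of 13 items, each answered on a 0 to 4 scale (which is where the original maximum of 52 comes from). This post doesn't say which response options get merged for items 8 and 12, so the `COLLAPSE` mapping below is a purely hypothetical placeholder; the paper itself is the place to look for the actual scheme.

```python
# Hypothetical collapsing scheme: merges original options 1 and 2.
# This is an illustrative assumption, NOT the scheme from the paper.
COLLAPSE = {0: 0, 1: 1, 2: 1, 3: 2, 4: 3}

# Items rescored out of 3 instead of 4 (1-based item numbers, from the post).
RESCORED_ITEMS = (8, 12)


def rescored_total(responses):
    """Return the PCS total out of 50 from 13 raw item responses (each 0-4)."""
    if len(responses) != 13:
        raise ValueError("expected 13 item responses")
    total = 0
    for item_number, score in enumerate(responses, start=1):
        if score not in range(5):
            raise ValueError(f"item {item_number}: response must be 0-4")
        # Collapse only the two flagged items; all others keep their raw score.
        total += COLLAPSE[score] if item_number in RESCORED_ITEMS else score
    return total
```

Whatever the real mapping turns out to be, the structure is the same: eleven items still contribute up to 4 points each and two items contribute up to 3, giving the maximum of 50.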

Another consideration is the notion of dimensionality in a measurement tool. It is a very basic axiom of quantitative measurement that any scale meant to be subject to mathematical manipulation should only measure a single construct. I will put forth an opinion here that most common scales in use in rehabilitation have not been adequately evaluated for this important trait, which renders summative scores highly susceptible to inaccuracies or bias. The Rasch model allows this type of analysis, and to keep what is already a very long story at least somewhat short, the PCS was adequately unidimensional for use as a summative score. There’s a pile of deep-level philosophical stuff we could talk about here, the very nature of catastrophizing being one example, but for now we’ll just leave it at that.

The last thing I’ll talk about here is how Rasch analysis gives us the ability to see how a scale performs across the range of possible scores. The statistical method allows us to make use of a transformation matrix which tells us how to transform raw ordinal-level scores out of 50 to interval-level scores, again out of 50. As has been the case with every such analysis I’ve seen so far, the results of this transformation suggest that change at the extreme ends of the scale is more meaningful than change in the middle of it. A 1-point difference at the extreme bottom of the PCS represents a 10-fold greater interval-level change than the same 1-point difference in the middle of the scale. This probably has highly important implications for establishing things such as minimum detectable change or minimum clinically important difference, but given that the latter are drawn from classical test theory, there currently isn’t a good analog that can be drawn from the newer Rasch or IRT approaches.
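To give a feel for why a raw point is ‘worth’ more at the extremes, here is a small Python sketch of a raw-score-to-logit conversion. It is a simplification under stated assumptions, not our analysis: it uses the dichotomous Rasch model (the PCS items are polytomous), and the item difficulties in `DIFFICULTIES` are made up for illustration. Under the Rasch model the raw score is a sufficient statistic, so the estimated person location for a given raw score is the value of theta at which the expected raw score equals the observed one; the sketch finds that value by bisection.

```python
import math

# Hypothetical item difficulties in logits (NOT estimates from the PCS).
DIFFICULTIES = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.0, 0.5, 1.0, 1.5, 2.0]


def expected_raw_score(theta, difficulties=DIFFICULTIES):
    """Expected raw score at person location theta (dichotomous Rasch model)."""
    return sum(1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties)


def raw_to_logit(raw, difficulties=DIFFICULTIES, lo=-10.0, hi=10.0):
    """Solve expected_raw_score(theta) == raw for theta by bisection.

    Only defined for non-extreme scores: 0 < raw < number of items.
    """
    if not 0 < raw < len(difficulties):
        raise ValueError("extreme raw scores have no finite logit estimate")
    for _ in range(100):
        mid = (lo + hi) / 2.0
        # The expected score rises with theta, so narrow toward the target.
        if expected_raw_score(mid, difficulties) < raw:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Conversion table for every non-extreme raw score.
table = {raw: raw_to_logit(raw) for raw in range(1, len(DIFFICULTIES))}

# Size, in logits, of the same 1-point raw change at two places on the scale.
step_at_bottom = table[2] - table[1]
step_in_middle = table[6] - table[5]
```

With these made-up difficulties, `step_at_bottom` comes out larger than `step_in_middle`, which is the same qualitative pattern described above. The exact ratio depends entirely on the item parameters, so the sketch should not be expected to reproduce the 10-fold figure.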

So in the end, the results of our analysis indicate that the PCS is by and large a reasonably good scale from a mathematical perspective, especially when scored out of 50 rather than 52.  It doesn’t of course tell you when or how you should use it in clinical practice and what to do about a high score if you encounter one.  For that type of information, readers are directed to http://sullivan-painresearch.mcgill.ca/pcs.php.

Dave Walton

Dave Walton is Assistant Professor with the School of Physical Therapy at the University of Western Ontario in Ontario, Canada. He is a Fellow of the Canadian Academy of Manipulative Therapy and a co-founder of the Pain Science Division of the Canadian Physiotherapy Association. A little while back he completed a 2-month tour of different research settings in Australia in an effort to absorb as much knowledge as possible before his brain fills up. This may require longer than two months because it has become clear that he not only knows what a Rasch analysis is, he knows how to do one. (Rasch, not Rash – we can all analyse them). The BiM team liked Dave so much that they are contemplating getting Dave Dolls for the Lab. Clearly, he did not write this bio.


Walton DM, Wideman TH, & Sullivan MJ (2013). A Rasch analysis of the pain catastrophizing scale supports its use as an interval-level measure. The Clinical Journal of Pain, 29(6), 499-506. PMID: 23328327
