PHYS THER
Vol. 89, No. 8, August 2009, pp. 770-785
DOI: 10.2522/ptj.20080227

Research Reports

Screening for Elevated Levels of Fear-Avoidance Beliefs Regarding Work or Physical Activities in People Receiving Outpatient Therapy

Dennis L. Hart, Mark W. Werneke, Steven Z. George, James W. Matheson, Ying-Chih Wang, Karon F. Cook, Jerome E. Mioduski and Seung W. Choi

D.L. Hart, PT, PhD, is Director of Consulting and Research, Focus On Therapeutic Outcomes, Inc, PO Box 11444, Knoxville, TN 37939 (USA).
M.W. Werneke, PT, MS, Dip MDT, is Physical Therapist, Spine Rehabilitation at CentraState Medical Center, Freehold, New Jersey.
S.Z. George, PT, PhD, is Associate Professor, Department of Physical Therapy, Center for Pain Research and Behavioral Health, Brooks Center for Rehabilitation Studies, University of Florida, Gainesville, Florida.
J.W. Matheson, PT, DPT, MS, SCS, OCS, CSCS, is Physical Therapist, Minnesota Sport and Spine Rehabilitation, Burnsville, Minnesota.
Y.-C. Wang, OT, PhD, is Research Assistant, Focus On Therapeutic Outcomes, Inc, Knoxville, Tennessee, and Postdoctoral Fellow, Rehabilitation Institute of Chicago, Chicago, Illinois.
K.F. Cook, PhD, is Research Associate Professor, Department of Rehabilitation Medicine, University of Washington, Seattle, Washington.
J.E. Mioduski, MS, is Programmer, Focus On Therapeutic Outcomes, Inc, Knoxville, Tennessee.
S.W. Choi, PhD, is Research Assistant Professor, Department of Medical Social Sciences and Center on Outcomes, Research and Education, Feinberg School of Medicine, Northwestern University, Chicago, Illinois.

Address all correspondence to Dr Hart at: hart@fotoinc.com


Submitted July 24, 2008; Accepted April 10, 2009


Abstract
 
Background: Screening people for elevated levels of fear-avoidance beliefs is uncommon, but elevated levels of fear could worsen outcomes. Developing short screening tools might reduce the data collection burden and facilitate screening, which could prompt further testing or management strategy modifications to improve outcomes.

Objective: The purpose of this study was to develop efficient yet accurate screening methods for identifying elevated levels of fear-avoidance beliefs regarding work or physical activities in people receiving outpatient rehabilitation.

Design: A secondary analysis of data collected prospectively from people with a variety of common neuromusculoskeletal diagnoses was conducted.

Methods: Intake Fear-Avoidance Beliefs Questionnaire (FABQ) data were collected from 17,804 people who had common neuromusculoskeletal conditions and were receiving outpatient rehabilitation in 121 clinics in 26 states (in the United States). Item response theory (IRT) methods were used to analyze the FABQ data, with particular emphasis on differential item functioning among clinically logical groups of subjects, and to identify screening items. The accuracy of screening items for identifying subjects with elevated levels of fear was assessed with receiver operating characteristic analyses.

Results: Three items for fear of physical activities and 10 items for fear of work activities represented unidimensional scales with adequate IRT model fit. Differential item functioning was negligible for variables known to affect functional status outcomes: sex, age, symptom acuity, surgical history, pain intensity, condition severity, and impairment. Items that provided maximum information at the median for the FABQ scales were selected as screening items to dichotomize subjects by high versus low levels of fear. The accuracy of the screening items was supported for both scales.

Limitations: This study represents a retrospective analysis, which should be replicated using prospective designs. Future prospective studies should assess the reliability and validity of using one FABQ item to screen people for high levels of fear-avoidance beliefs.

Conclusions: The lack of differential item functioning in the FABQ scales in the sample tested in this study suggested that FABQ screening could be useful in routine clinical practice and allowed the development of single-item screening for fear-avoidance beliefs that accurately identified subjects with elevated levels of fear. Because screening was accurate and efficient, single IRT-based FABQ screening items are recommended to facilitate improved evaluation and care of heterogeneous populations of people receiving outpatient rehabilitation.


Introduction
 
Clinicians and researchers have recognized the role that psychosocial factors play in the development of chronic disability in people with low back pain.1–3 Among the psychosocial risk factors are fear-avoidance beliefs,4 which are embodied in the fear-avoidance model of musculoskeletal pain.5 The model posits that an individual's response to an episode of pain falls along a continuum ranging from avoidance (maladaptive) to confrontation (adaptive) and provides one explanation for why some people with acute low back pain syndromes develop chronic disability.6–9

On the basis of theories of fear and avoidance of activities, Waddell et al4 developed the Fear-Avoidance Beliefs Questionnaire (FABQ) to assess the association between fear-avoidance beliefs and work disability for people with chronic low back pain syndromes. The FABQ is a self-report questionnaire with 2 scales: 1 assessing fear-avoidance beliefs regarding work activities (FABQ-W) and 1 assessing fear-avoidance beliefs regarding physical activities (FABQ-PA).4 Evidence supported an association between fear-avoidance beliefs regarding work and absence from work because of low back pain.4 Thus, Waddell et al4 recommended that clinicians consider screening for fear-avoidance beliefs when managing low back pain. Subsequent studies indicated that elevated levels of fear were associated with10,11 and were predictive of12,13 disability and absence from work in people with low back and cervical spine pain syndromes. There is evidence that identifying people with elevated levels of fear-avoidance beliefs and managing those beliefs accordingly may reduce fear and predict or improve outcomes.1,4,5,10,12–22

Fear-avoidance beliefs may affect people with conditions other than low back pain. Evidence5,23 supported the possible existence of fear-avoidance beliefs or pain-related fear in people who have other impairments or who may not have pain, perhaps because of learned behavior after previous painful episodes or misconceptions about pain.24 Pain-related fear scales, including the FABQ scales, have been used to assess the levels of fear in people with acute16 and chronic4,10,11 low back pain syndromes, cervical spine pain syndromes,11,25–27 cervical spine and shoulder28 pain syndromes, hip impairments,29 knee impairments,29–31 chronic headache,32 fibromyalgia,33 and chronic fatigue syndrome.33 It is reasonable to believe that pain-related fear would be applicable to people with other conditions including, but not limited to, osteoarthritis,34 knee impairments,30,31 and neuropathic pain.35 These studies suggested that pain-related fear is not uncommon in people with a wide variety of neuromusculoskeletal conditions, with and without pain, and another study reported the prevalence of elevated levels of fear-avoidance beliefs to be more than 40% in specific samples.36

George17 described several screening methods designed to identify people with elevated levels of fear, including the FABQ. Despite the availability of these methods, therapists do not routinely screen for elevated levels of fear, a fact that may be attributable partly to the burden of collecting data or the difficulty in interpreting measures. In response to these concerns, George17 challenged clinicians and researchers to refine screening techniques by making them more efficient and accurate to try to improve acceptance and clinical use. Developing efficient and accurate screening methods is particularly important for therapists assuming first-contact roles in patient care,37 who need to identify confounding conditions that could reduce the effectiveness of their management strategies in diverse patient populations.38 Screening results indicating elevated levels of fear would alert therapists to the likelihood that patients might be fearful of activities that might be part of their therapeutic interventions; such a situation might portend worse outcomes.39 Because short tests commonly are associated with increased measurement error,40 definitive testing often is recommended to confirm the presence of the condition.41 Given that there is preliminary evidence of effective interventions for people with elevated levels of pain-related fear,19,42 the challenge appears to be relevant to improved patient care and outcomes.

In an effort to minimize the measurement error related to short tests, some authors have recommended modern psychometric techniques, such as item response theory (IRT) methods.43 Such methods are useful for assessing patient-report screening surveys because they facilitate both the evaluation of whether items mean the same thing to different respondents (ie, differential item functioning [DIF], described in the Method section)44 and the identification of screening items by use of item information functions (described in the Method section).45 The absence of DIF is important if FABQ scales are to be used to screen diverse populations for elevated levels of fear-avoidance beliefs. The use of item information functions facilitates the selection of single screening items associated with the lowest measurement error related to a given level of fear.45

The overall purpose of this study was to develop an efficient yet accurate screening method for identifying people who have elevated levels of fear-avoidance beliefs regarding work or physical activities and who are receiving outpatient rehabilitation. The specific purposes were: (1) to use IRT methods to analyze FABQ items, with particular emphasis on DIF among clinically logical groups of people and identify screening items for each FABQ scale and (2) to assess the accuracy of screening items for identifying people with elevated levels of fear-avoidance beliefs. If the results suggest that screening items can identify people with elevated levels of fear accurately, then more-precise fear-avoidance testing could be initiated or management strategies could be used to reduce fear and improve outcomes. In addition, accurate screening would reduce costly testing of people not likely to be at risk of having elevated levels of fear.


Method
 
Design

We conducted a secondary analysis of data collected prospectively from people with a variety of common neuromusculoskeletal diagnoses.

Setting and Participants

We analyzed data from 17,804 people (a sample of convenience) treated for common neuromusculoskeletal conditions in 121 outpatient rehabilitation clinics in 26 states (in the United States) between May 2002 and December 2006 (Tab. 1). Clinics were participating with Focus On Therapeutic Outcomes, Inc (Knoxville, Tennessee), an international medical rehabilitation database management company.46,47 People were selected from the database of Focus On Therapeutic Outcomes because they had answered the FABQ for physical or work activities (see below): 16,243 people had answered the FABQ for physical activities, 5,517 people had answered the FABQ for work activities, and 3,956 people had answered both the work activity and the physical activity surveys. Although diagnostic information was available for only 68% of the people, the most prevalent groupings of ICD-9-CM codes48 were related to soft-tissue disorders of muscle, synovium, tendon, or bursa (ICD-9-CM codes 725–729; 25% of people) and pathologies of the spine (ICD-9-CM codes 720–724; 18% of people). Most people were receiving payment benefits from health maintenance organizations (17%), preferred provider organizations (11%), workers’ compensation (10%), and Medicare Part B (9%). Data on payers were missing for 38% of the people.


Table 1. Characteristics of Subjects at Rehabilitation Intake (N=17,804)

Data Collection

As described previously,46,49–52 data were collected by use of Patient Inquiry computer software,* which participating clinics used for routine collection of data as part of their patient care strategies. People seeking rehabilitation provided demographic data before the initial evaluation and functional status information through the use of condition-specific computerized adaptive tests49–55 at the initial evaluation (intake) and at the end of rehabilitation (discharge). Therapists also could elect to ask patients to complete the FABQ physical subscale, the FABQ work subscale, or both; when selected, these surveys were administered via computer at intake and at discharge (but were not computerized adaptive tests). Clinical staff entered demographic data at intake and at discharge. Only intake data were analyzed.

Fear-Avoidance Beliefs Items

The items in the FABQ describe the relationship between pain and physical activities or work activities; for example: "Physical activity might harm my back" or "I cannot do my normal work with my present pain." For each item, a scale with ratings of 0 to 6 (0="completely disagree," 3="unsure," and 6="completely agree") is used. There are no word descriptors for responses 1, 2, 4, and 5. Responses from 4 items are summed to produce a score representative of the level of fear of physical activities, and responses from 7 items are summed to produce a score representative of the level of fear of work activities.4 Research findings have supported good item internal consistency and reliability and the presence of 2 factors in the FABQ (fear of work activities and fear of physical activities),4 FABQ measure test-retest reliability,4 and an association of fear-avoidance beliefs with absence from work and disability.8,10,11,56 Because of interest in assessing the fear-avoidance beliefs of people receiving outpatient rehabilitation regardless of impairment, 2 items were reworded to eliminate references to the back (Appendix). We believed that the resulting scale was appropriate for anyone with pain or fear of pain, such as the people seeking outpatient rehabilitation.

Data Analyses

Distribution of response choices.
The frequency distribution of responses to each item was evaluated.

IRT analyses.
We used unidimensional IRT methods to analyze the data43,57–59 to determine how well the IRT model fit the data and how well IRT assumptions were met.58 For unidimensional IRT models to be appropriate for analyzing FABQ items, the items must measure only one construct; that is, the scale must be unidimensional.45,59 In addition, the items must be locally independent; that is, any 2 items must not be correlated when the latent trait is fixed.45 We used modern factor analytic methods50–52,60 to investigate unidimensionality and local independence assumptions. The presence of a dominant factor in the FABQ items was assessed with exploratory factor analysis (EFA) and then confirmatory factor analysis (CFA),61 eliminating items with factor loadings of less than 0.40.62 Pairs of items with absolute residual correlations of greater than 0.25 were considered locally dependent.62 All 16 FABQ items were used for the initial IRT analyses because we wanted to test the factor structure of all FABQ items.4 The EFA was used to explore the general structure of the FABQ items without the imposition of a preconceived structure to determine whether 1 or more factors were present in the data. The CFA was used to verify the factor structure once the factors were identified with the EFA.63 The CFA model fit was evaluated with the comparative fit index (CFI),64 the Tucker-Lewis index (TLI),65 and the root-mean-square error of approximation (RMSEA).63,66 The TLI and the CFI range from 0 (poor fit) to 1 (good fit). Values for the CFI and the TLI of greater than 0.90 are indicative of good model fit. Values for the RMSEA of less than 0.08 suggest adequate fit.64
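
The fit criteria stated above can be expressed as a small check. The following is only an illustrative sketch: the function names are hypothetical and are not part of any software used in the study; only the thresholds (CFI and TLI greater than 0.90, RMSEA less than 0.08) come from the text.

```python
# Thresholds quoted in the text; function names are illustrative assumptions.
CFI_TLI_MIN = 0.90   # CFI and TLI greater than 0.90 indicate good fit
RMSEA_MAX = 0.08     # RMSEA less than 0.08 suggests adequate fit


def fit_summary(cfi, tli, rmsea):
    """Report which of the three fit statistics meet the stated thresholds."""
    return {
        "CFI": cfi > CFI_TLI_MIN,
        "TLI": tli > CFI_TLI_MIN,
        "RMSEA": rmsea < RMSEA_MAX,
    }


def count_supporting(cfi, tli, rmsea):
    """Count how many of the three statistics support the model."""
    return sum(fit_summary(cfi, tli, rmsea).values())
```

Under these thresholds, a model with a CFI of 0.94, a TLI of 0.97, and an RMSEA of 0.25 would be supported by 2 of the 3 statistics.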

IRT model selection, item information function analysis, and item fit.
We fitted items remaining after unidimensionality and local independence testing to the graded-response IRT model (GRM)67,68 by using PARSCALE software (version 4.1).†,69 The GRM was chosen because it is appropriate for ordered responses (such as FABQ items), it allows item discrimination parameters to vary, and it can be used to estimate ability parameters (theta values) that represent a subject's level of fear. We used PARSCALE software to fit the data to the GRM and to estimate discrimination parameters and category response functions for each item.68,70 Category response functions represent the probability that an examinee will successfully complete a particular response category. The category characteristic curve for each item in a response category is used to estimate the operating characteristic curve for each item, which represents the probability of endorsing a response category for the item at a given subject's ability (theta value).70 The category response functions are resolved into an item location or difficulty parameter and a set of category parameters. Therefore, PARSCALE produces an item discrimination parameter, an item difficulty parameter, and a set of category parameters for each item. PARSCALE estimates a subject's level of fear (theta value), category characteristic curves and parameters, and item difficulty parameters, all of which are placed on the same normal (mean=0, SD=1) fear metric in logits.67
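
To make the role of the discrimination and category parameters concrete, the sketch below computes GRM category probabilities for one item. It is a minimal illustration, not the PARSCALE implementation: it assumes the plain logistic form of Samejima's model without a scaling constant, it folds the item location and category parameters into a single list of increasing thresholds, and the function name is hypothetical.

```python
import math


def grm_category_probabilities(theta, discrimination, thresholds):
    """Graded-response model: probability of each ordered response category.

    thresholds must be increasing; k thresholds define k + 1 categories.
    """
    # Cumulative curves: probability of responding in category j or higher.
    # The lowest category is always reached (1.0); nothing lies above the top (0.0).
    cumulative = [1.0]
    for b in thresholds:
        cumulative.append(1.0 / (1.0 + math.exp(-discrimination * (theta - b))))
    cumulative.append(0.0)
    # Each category probability is the gap between adjacent cumulative curves.
    return [cumulative[j] - cumulative[j + 1] for j in range(len(thresholds) + 1)]
```

A larger discrimination value makes the cumulative curves steeper, so the item separates subjects more sharply near its thresholds.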

PARSCALE (with the GRM) also estimates item information functions, which quantify the capability of a given item to adequately estimate a subject's ability across the fear-avoidance scale range.45,70 An item information function describes each item's contribution to overall test precision. The sum of the item information functions defines the ideal precision of the test (ie, test information function) at a given ability, facilitating evaluation of the expected standard error. The standard error of a subject's ability estimate is inversely proportional to the square root of the test information function: SE = 1/√(test information function). For samples with an observed variance of 1, a standard error of less than 0.23 is comparable to a reliability of greater than 0.95 (reliability = 1 – SE²).71,72
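
The relationship among test information, standard error, and reliability can be checked with a few lines of arithmetic. This is a sketch under the stated assumption of an observed variance of 1; the function names are illustrative.

```python
import math


def standard_error(test_information):
    """SE = 1 / sqrt(test information) at a given ability level."""
    return 1.0 / math.sqrt(test_information)


def reliability(se):
    """Reliability = 1 - SE^2, assuming an observed variance of 1."""
    return 1.0 - se ** 2


def information_needed(se):
    """Test information required to reach a target standard error."""
    return 1.0 / se ** 2
```

With these definitions, a standard error of 0.23 corresponds to a reliability of about 0.947, and reaching that standard error requires a test information value of roughly 18.9.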

Item discrimination parameters and operating characteristic curves were assessed to determine how well the items were modeled with the GRM. Because there is no recognized best way to assess the fit of data to the GRM, particularly for samples exceeding 1,500,71 we used 3 basic approaches to assess the fit of our data to the GRM. First, we assessed empirical operating characteristic curves to ensure that they progressed from less difficult to more difficult along the fear-avoidance axis and that each curve reached a maximum at a unique interval of the scale.67,70 Second, we assessed item discrimination parameters (ie, slopes) for an estimation of the discrimination power for each item. Items with larger discrimination parameters (higher slopes) differentiate subjects with fear levels varying over the range of theta values appropriate for the item better than do items with lower slopes; therefore, items with slopes of greater than 0.70 are preferred.62 Third, we assessed theoretical versus empirical operating characteristic curves for a qualitative determination of the fit of items to the GRM. Visual inspection of empirical operating characteristic curves can reveal the extent and nature of item fit or misfit.

DIF.
The remaining items were assessed for DIF by selection of clinically logical groups of subjects: sex (male/female), surgical history (yes/no), acuity of symptoms (number of calendar days between date of onset of symptoms and date of initial evaluation: acute=21 days or less, subacute=22–<90 days, and chronic=90 days or more), age group (18–<45, 45–<65, 65–<75, and 75 years or older), number of comorbidities (0, 1, 2, or 3 or more), pain intensity (below median, median, or above median, as indicated with a numeric rating of 0 ["no pain"] to 10 ["pain as bad as it can be"]), and impairment grouping (Tab. 1). Differential item functioning is present when the relationship between item responses and the trait measured by the test differs systematically between groups of subjects after the subjects’ underlying abilities are controlled for.44 The variables selected for testing have been shown to affect functional status outcomes.53,54 Associations between fear and these independent variables have only begun to be investigated, and preliminary results suggest the need for further investigation.39
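
The acuity and age groupings above amount to simple cut points. The sketch below encodes them as stated; the function names and group labels are hypothetical, and rejecting ages under 18 is an assumption made here because the youngest listed group starts at 18 years.

```python
def acuity_group(days_since_onset):
    """Classify symptom acuity from days between onset and initial evaluation."""
    if days_since_onset <= 21:
        return "acute"
    if days_since_onset < 90:
        return "subacute"
    return "chronic"


def age_group(age_years):
    """Classify age into the four groups used for DIF testing."""
    if age_years < 18:
        # Assumption: ages below the youngest listed group are not classified.
        raise ValueError("age below 18 is outside the listed groups")
    if age_years < 45:
        return "18-<45"
    if age_years < 65:
        return "45-<65"
    if age_years < 75:
        return "65-<75"
    return "75+"
```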

For FABQ measures to be used as screening tools regardless of a subject's impairment, DIF must be absent or negligible in as many independent variables as possible—but most importantly, in impairment. Because confirmable diagnoses for many subjects receiving outpatient rehabilitation often are not available,73 we elected to group subjects by impairment (ie, the problem directing patient management). Differential item functioning testing for impairment grouping was performed in 3 ways. First, all subjects were grouped by general impairment (ie, a medical, neurological, or orthopedic problem). Second, subjects with an orthopedic impairment were grouped by area treated (ie, upper extremity, spine, or lower extremity). Third, subjects with an orthopedic impairment were grouped by specific body part treated within each area treated (upper extremity: shoulder, elbow, and wrist or hand; spine: cervical and lumbar; lower extremity: hip, knee, and foot or ankle).

Each item was assessed for DIF with difwithpar software (version 1.0),‡,74–79 which combines IRT calibration (estimated with the GRM67 in PARSCALE software69) with multiple ordinal logistic regression models for each item and demographic category by use of Stata software (version 9.2).§,80 Using methods described by Crane et al,75 we evaluated items for the presence of uniform DIF (ie, the interference related to demographic groups between ability and item responses is the same across the entire range measured by the test) by examining the relative difference between beta coefficients in the regression models and nonuniform DIF (ie, the interference varies at different levels of the trait being measured) by comparing the –2 log likelihoods of 2 of the regression models.79 For nonuniform DIF, we used Bonferroni adjustment for α values on the basis of the number of items in the scale. The process is sequential (ie, it starts with one independent variable and progresses to subsequent variables) and iterative (ie, decisions are made at each step during the difwithpar process). For example, when an item was identified with DIF, the software created a new item. Thus, items found to have DIF related to an independent variable, such as sex, were split into 2 new items. For the first new item, responses for women were coded as in the original data set, whereas for men, all responses were set to missing. For the second new item, responses for men were coded as in the original data set, whereas for women, all responses were set to missing. We thus calibrated item parameters independently in the 2 groups for items identified with DIF. Items free of DIF served as anchor items, ensuring that ability estimates (ie, levels of fear) were calibrated on the same metric for the 2 sexes. The presence of possible false-positive or false-negative DIF results was assessed.75
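
The item-splitting step and the Bonferroni adjustment described above can be sketched as follows. This is not the difwithpar code; the function names, the use of None for missing responses, and the nominal α passed to the adjustment are all illustrative assumptions (the text does not state the nominal α).

```python
def split_item_by_group(responses, groups, group_a, group_b):
    """Split one item into two group-specific items.

    Each new item keeps the original responses for its own group and
    sets responses for everyone else to missing (None).
    """
    item_a = [r if g == group_a else None for r, g in zip(responses, groups)]
    item_b = [r if g == group_b else None for r, g in zip(responses, groups)]
    return item_a, item_b


def bonferroni_alpha(nominal_alpha, n_items):
    """Per-item alpha after Bonferroni adjustment for the number of items."""
    return nominal_alpha / n_items
```

Because each subject contributes to only one of the two new items, the item parameters are calibrated separately by group, while the unsplit items keep both groups on a common metric.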

In some samples, particularly large samples, DIF might be detected (significant) but might be of little practical importance.52,78 Therefore, before progressing sequentially to the next variable for DIF assessment, we assessed the correlation between unadjusted ability estimates and DIF-adjusted ability estimates, and we assessed the magnitude of the difference between unadjusted and DIF-adjusted ability estimates. We repeated the entire procedure for surgical history, severity, age group, and impairment grouping.

Screening item selection and accuracy of the screening items.
We wanted to select screening items that provided the most information for the center of the fear continuum. We expected FABQ measures not to be normally distributed14; therefore, we used the median for each fear scale as the measure of central tendency.

For each scale, we examined item information functions and selected 2 screening items that provided the most information (ie, the lowest measurement error) at the median fear level. Using the median, we dichotomized subjects by low versus high levels of fear of physical activities and fear of work activities with the IRT-based theta values estimated from all items for each scale.

We used nonparametric receiver operating characteristic (ROC) curve analyses to quantify the accuracy of the responses to the screening item or items (ie, 1 or 2 screening items per scale) for discriminating subjects with fear levels below the median (low) or above the median (high).81 Such analyses produce plots of sensitivity versus (1 – specificity) for the diagnostic test (ie, the screening items). For each ROC, a diagnostic cut score was identified by selecting the item response (or sum of 2 item responses) with the largest average of sensitivity and specificity. Positive likelihood ratios (+LRs) and negative likelihood ratios (–LRs)82 and the percentages of subjects correctly identified were produced for each cut score. Positive likelihood ratios were calculated as sensitivity/(1 – specificity), and negative likelihood ratios were calculated as (1 – sensitivity)/specificity.83 Likelihood ratios are summary measures of diagnostic test performance (ie, classification) that indicate how much a given classification will raise or lower the pretest probability of the target disorder of interest (ie, level of fear).83–85 Acceptable +LRs are 2 or higher, and acceptable –LRs are 0.5 or lower because they generate at least small but possibly important changes in the predictive value of the test.85 Areas under the ROC curves, standard errors, and 95% confidence intervals were used to describe the ROC results. To determine whether using 1 versus using 2 screening items was more accurate for discriminating subjects with low versus high levels of fear, we assessed the equality of the area under the curves by using an algorithm suggested by DeLong et al.86
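
The cut-score selection and likelihood ratio calculations can be illustrated with a short sketch. It assumes that higher item responses indicate more fear and that a response at or above the cut is a positive screen; the function names and the toy data are hypothetical and are not study data.

```python
def screen_accuracy(scores, high_fear, cut):
    """Accuracy of calling 'score >= cut' a positive screen for high fear."""
    pairs = list(zip(scores, high_fear))
    tp = sum(1 for s, h in pairs if h and s >= cut)
    fn = sum(1 for s, h in pairs if h and s < cut)
    fp = sum(1 for s, h in pairs if not h and s >= cut)
    tn = sum(1 for s, h in pairs if not h and s < cut)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # +LR = sensitivity / (1 - specificity); reported as infinity if specificity is 1.
        "pos_lr": sensitivity / (1 - specificity) if specificity < 1 else float("inf"),
        # -LR = (1 - sensitivity) / specificity; reported as infinity if specificity is 0.
        "neg_lr": (1 - sensitivity) / specificity if specificity > 0 else float("inf"),
        "percent_correct": 100 * (tp + tn) / len(pairs),
    }


def best_cut(scores, high_fear, candidate_cuts):
    """Cut score with the largest average of sensitivity and specificity."""
    def average_accuracy(cut):
        result = screen_accuracy(scores, high_fear, cut)
        return (result["sensitivity"] + result["specificity"]) / 2
    return max(candidate_cuts, key=average_accuracy)


# Toy data (hypothetical): 8 subjects above the median, then 8 below it.
toy_scores = [1, 3, 3, 4, 4, 4, 3, 4, 0, 0, 1, 1, 2, 2, 3, 3]
toy_high_fear = [True] * 8 + [False] * 8
```

For the toy data, a cut of 3 gives the largest average of sensitivity and specificity, with a +LR above 2 and a –LR below 0.5, which would meet the acceptability criteria stated above.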

Mapping IRT-based measures to original summative scores.
To assist clinicians in relating new IRT-based FABQ measures to original FABQ summative scores,4 we mapped the new IRT-based FABQ measures to the original FABQ summative scores4 by aggregating the original summative scores by each tenth of a logit of the IRT-based measures. Using the original 0 to 6 item responses,4 we summed the responses for items 2 through 5 (Appendix) to produce a summative score (0–24) for fear of physical activities, and we summed the responses for items 6, 7, 9, 10, 11, 12, and 15 to produce a summative score (0–42) for fear of work activities. At each tenth of a logit for each FABQ-PA and FABQ-W IRT-based theta value, the mean and 95% confidence interval for the original summative scores were calculated.
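
The summation and mapping steps can be sketched as below. The item numbers come from the text; everything else is an assumption made for illustration: the function names, binning by rounding theta to the nearest tenth of a logit, and a normal-approximation 95% confidence interval (mean ± 1.96 × SD/√n).

```python
import math
import statistics
from collections import defaultdict

PA_ITEMS = (2, 3, 4, 5)                  # summed to a 0-24 score
WORK_ITEMS = (6, 7, 9, 10, 11, 12, 15)   # summed to a 0-42 score


def summative_score(responses, items):
    """Sum the 0-6 ratings for the listed items (responses: item number -> rating)."""
    return sum(responses[i] for i in items)


def map_theta_to_summative(thetas, scores):
    """Mean and 95% CI of summative scores for each tenth of a logit."""
    bins = defaultdict(list)
    for theta, score in zip(thetas, scores):
        bins[round(theta, 1)].append(score)  # assumed binning rule
    table = {}
    for key in sorted(bins):
        values = bins[key]
        mean = statistics.mean(values)
        if len(values) > 1:
            half_width = 1.96 * statistics.stdev(values) / math.sqrt(len(values))
            ci = (mean - half_width, mean + half_width)
        else:
            ci = None  # interval is undefined for a single observation
        table[key] = (mean, ci)
    return table
```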


Results
 
Distribution of Response Choices

No item had greater than 95% of responses in any one category. Examination of proportions of responses per item for the 11 items in the original FABQ-W scale and the 5 items in the original FABQ-PA scale demonstrated that subjects selected responses with word descriptors more often (proportions of between .16 and .47 for response choices 0, 3, and 6) than responses without word descriptors (proportions of between .02 and .05 for response choices 1, 2, 4, and 5), regardless of the FABQ scale.

IRT Analyses

The EFA results indicated that a 2-factor solution for the 16 FABQ items (n=3,956, CFI=0.92, TLI=0.97, RMSEA=0.19) fit the data well. Items were loaded on the FABQ-PA and FABQ-W scales originally described by Waddell et al,4 and the 2-factor solution accounted for 69.1% of the variance in the data. The 2-factor CFA results supported the presence of 2 factors (CFI=0.93, TLI=0.97, RMSEA=0.18). Items were separated into respective scales (for the 11-item FABQ-W, n=5,517; for the 5-item FABQ-PA, n=16,243), and separate 1-factor CFAs were run.

The CFA results for the 11-item FABQ-W suggested that 2 of the 3 fit statistics supported the fit of the 1-factor solution (CFI=0.94, TLI=0.97, RMSEA=0.25), all items were loaded on 1 factor (loadings of >.75), but the items "My pain was caused by my work or by an accident at work" and "I do not think that I will ever be able to go back to that work" had a residual correlation of –0.26. The former item was deleted because it was associated with the most pairs of items with higher absolute residual correlations, and another CFA was run on the 10 remaining items. The CFA results for the 10-item FABQ-W supported a slightly improved model fit (CFI=0.95, TLI=0.98, RMSEA=0.23), all items were loaded on 1 factor (loadings of 0.68), there was no absolute residual correlation of greater than 0.25, there was 1 residual correlation of less than –0.20, and there was a reduction in absolute residual correlations of greater than 0.10, from 36.4% (11-item scale) to 28.9% (10-item scale). With the exception of the RMSEA, the results supported a unidimensional scale with good local independence.

The CFA results for the 5-item FABQ-PA suggested that 2 of the 3 fit statistics supported the fit of the 1-factor solution (CFI=0.95, TLI=0.93, RMSEA=0.23), all items but 1 were loaded on 1 factor (for the item "My pain was caused by physical activity," the loading was 0.34), and all absolute residual correlations were less than 0.25. This item was deleted, and another CFA was run on the 4 remaining items. The CFA results for the 4-item FABQ-PA suggested a questionably improved model fit (CFI=0.96, TLI=0.95, RMSEA=0.26), all items were loaded on 1 factor (loadings of >0.60), there was no absolute residual correlation of greater than 0.25, there was 1 residual correlation of greater than 0.20, and there was a reduction in absolute residual correlations of greater than .10, from 15.0% (5-item scale) to 8.3% (4-item scale). With the exception of the RMSEA, the results supported a unidimensional scale with good local independence.

IRT Modeling, Item Information Function Analysis, and Item Fit

We fitted items from both FABQ scales separately to the GRM. Initial inspection of the operating characteristic curves demonstrated that there were no distinct maximum values of the item response curves for the second and third as well as the fifth and sixth response categories (ie, responses without word descriptors) for both scales. Therefore, these response categories (ie, second and third as well as fifth and sixth) were collapsed, and the data were refit to the GRM. Subsequent inspection of operating characteristic curves supported an improved shape for each curve, with clear maximum values. One item ("Physical activity makes my pain worse") had a discrimination parameter of less than 0.7 (actual value=0.67) and was deleted. Examination of empirical operating characteristic curve plots suggested that all items fit the GRM. The 3-item FABQ-PA data were refit to the GRM (Tab. 2).
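
The category collapsing described above can be written as a simple recoding. This sketch assumes that the second and third response categories correspond to ratings 1 and 2 and the fifth and sixth to ratings 4 and 5 (the ratings without word descriptors on the 0 to 6 scale); the new codes 0 to 4 and the names used here are illustrative.

```python
# Original 0-6 rating -> collapsed 0-4 category (assumed coding).
COLLAPSE = {0: 0, 1: 1, 2: 1, 3: 2, 4: 3, 5: 3, 6: 4}


def collapse_response(rating):
    """Recode a 0-6 FABQ rating into the collapsed 5-category scheme."""
    return COLLAPSE[rating]
```

After recoding, each item has 5 ordered categories instead of 7, which is the form in which the data would be refit to the GRM.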


Table 2. Fear-Avoidance Belief Item Banks and Item Parameter Estimates

Test information functions with standard errors for both FABQ scales are displayed in Figure 1. If a measure of fear of physical activities is estimated with the 3-item scale, then the plot of standard errors (SEs of < 0.23 represent a measure reliability of >.95) demonstrates that the measure can be estimated with high precision between –1.1 and 1.5, or 29.5% of the range for the FABQ-PA trait in our sample. Similarly, with the 10-item bank for fear of work activities, the measure of fear can be estimated with high precision (reliability of >.95) between –0.2 and 0.4, or 11.5% of the range for the FABQ-W trait in our sample.
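The link between standard error, test information, and reliability used above can be sketched as follows. This is an illustrative approximation that assumes the trait is scaled to unit variance, so that reliability is roughly 1 minus the squared SE; the function names are hypothetical.

```python
import math


def se_from_information(information):
    """Standard error of the trait estimate from test information at a given theta."""
    return 1.0 / math.sqrt(information)


def reliability_from_se(se, trait_variance=1.0):
    """Approximate reliability, assuming the stated trait variance (1.0 by default)."""
    return 1.0 - (se ** 2) / trait_variance


# Threshold quoted in the text: SE of 0.23.
reliability_at_cutoff = reliability_from_se(0.23)
information_needed = 1.0 / 0.23 ** 2
```

Under this approximation an SE of 0.23 corresponds to a reliability of about .947, which rounds to the .95 quoted in the text, and to a test information of roughly 19.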


Figure 1. (A) Test information function for fear-avoidance beliefs regarding physical activities. (B) Test information function for fear-avoidance beliefs regarding work activities. Information=test information function.

DIF

The DIF results for fear of physical activities (3-item scale) indicated that there were no items with DIF for the variables sex, symptom acuity, surgical history, number of comorbidities, level of pain, and any impairment grouping. The item "I should not do physical activities which (might) make my pain worse" was significant (P<.001) for nonuniform DIF for age, but the unadjusted and adjusted levels of fear were highly correlated (r>.99), and the average difference between the unadjusted and adjusted measures was <.001 (SD=.02, range=–.14 to .07), a value that was <.001 standard deviation from the full 3-item scale. Therefore, the DIF was considered to be of no practical importance.

The DIF results for fear of work activities (10-item scale) provided similar results. No items with DIF were identified for the variables sex, age, number of comorbidities, level of pain, overall impairment grouping (medical, neurological, and orthopedic), and orthopedic impairments of the upper or lower extremity. Several items were shown to have nonuniform DIF for the variables symptom severity; surgical history; orthopedic impairment grouping by upper extremity, lower extremity, or spine; and orthopedic impairment grouping by cervical or lumbar spine. However, the unadjusted and adjusted levels of fear were highly correlated (r>.99), and the average differences between the unadjusted and adjusted measures ranged from –.12 to .02—values that represented a range of standard deviations of .01 to .11. Therefore, the identified DIF was considered to be of little practical importance.

Screening Item Selection and Accuracy of the Screening Items

Both the FABQ-PA and the FABQ-W were distributed nonnormally (Shapiro-Wilk W statistics, P<.05). The median for the FABQ-PA was –.07, and the median for the FABQ-W was –.02; these values were used to dichotomize subjects by level of fear.

We identified the 2 items with the highest slopes (Tab. 2) as the first 2 screening items per scale: screening item 1 [SHLDNOT—"I should not do physical activities which (might) make my pain worse"] and screening item 2 [CANNOT—"I cannot do physical activities which (might) make my pain worse"] for fear of physical activities and screening item 1 (WRKCANT—"I cannot do my normal work with my present pain") and screening item 2 (WRKSHNT—"I should not do my normal work with my present pain") for fear of work activities. Although WRKSHNT had a slightly higher slope (α=2.53) than WRKCANT (α=2.52), the WRKCANT item information function provided more information at the median theta value and therefore was selected as the most informative for the work scale at the cut score for high levels of fear.
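Why an item with a marginally lower slope can still be the more informative one at the cut score can be sketched with the graded response model item information function. In this illustration the slopes (2.53 and 2.52) and the median theta (–.02) come from the text, but the threshold values are hypothetical and chosen only to show the effect; they are not the Table 2 estimates.

```python
import math


def grm_item_information(theta, a, thresholds):
    """Item information for a graded response model item at trait level theta."""
    # Cumulative probabilities P(X >= k), padded with 1 and 0 at the ends.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum.append(0.0)
    info = 0.0
    for k in range(len(cum) - 1):
        p = cum[k] - cum[k + 1]  # category probability
        # Derivative of the category probability with respect to theta.
        dp = a * (cum[k] * (1.0 - cum[k]) - cum[k + 1] * (1.0 - cum[k + 1]))
        if p > 0.0:
            info += dp * dp / p
    return info


MEDIAN_THETA = -0.02  # FABQ-W median reported in the text

# Hypothetical thresholds: one item's thresholds sit well above the median,
# the other item's thresholds straddle it.
info_higher_slope = grm_item_information(MEDIAN_THETA, 2.53, [0.8, 1.2, 1.6, 2.0])
info_lower_slope = grm_item_information(MEDIAN_THETA, 2.52, [-0.6, -0.2, 0.2, 0.6])
```

With these made-up thresholds the item with the lower slope carries more information at the median, because information peaks near an item's thresholds rather than depending on the slope alone.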

The ROC results describing the accuracy of using 1 or 2 items to predict subjects with high levels of fear of physical or work activities are shown in Table 3. Although the areas under the ROCs were similar when 1 or 2 screening items were used to identify high levels of fear, the use of 2 items produced larger areas (χ²=402.6, df=1, P<.001, for the FABQ-PA; χ²=139.6, df=1, P<.001, for the FABQ-W) (Fig. 2). However, because the use of 1 screening item produced strong values for areas under the curves, sensitivity, specificity, +LR, –LR, and percentages of subjects correctly classified and because the addition of a second screening item did not substantially improve all of these values over those obtained with 1 screening item, we decided to use only 1 item (ie, the most informative at the median theta value) as the screening item to identify subjects with high levels of fear for both scales.
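The area under the ROC curve for a single ordinal screening item can be sketched as the probability that a randomly chosen subject with high fear gives a higher response than a randomly chosen subject with low fear, counting ties as one half. The function name and the toy data below are hypothetical and are not drawn from the study database; this is also not the statistical package used for the reported analyses.

```python
def roc_auc(item_responses, high_fear_flags):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    positives = [r for r, y in zip(item_responses, high_fear_flags) if y == 1]
    negatives = [r for r, y in zip(item_responses, high_fear_flags) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data: collapsed item responses (0-4) and a median-split fear flag.
responses = [0, 1, 1, 2, 3, 4, 4, 2]
high_fear = [0, 0, 0, 0, 1, 1, 1, 1]
auc = roc_auc(responses, high_fear)
```

An area of 0.5 would mean the item does no better than chance, and 1.0 would mean every subject with high fear responds higher than every subject with low fear.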


Table 3. Diagnostic Accuracy of Using 1 or 2 Screening Items to Identify Subjects With High Levels of Fear-Avoidance Beliefs Regarding Physical Activities (PA) or Work Activities (WA)


Figure 2. (A) Receiver operating characteristic (ROC) curves for use of 1 and 2 screening items to identify subjects with high levels of fear of physical activities. (B) ROC curves for use of 1 and 2 screening items to identify subjects with high levels of fear of work activities. One ROC area=1 screening item was used to estimate the area under the ROC curve; Two ROC area=2 screening items were used to estimate the area under the ROC curve.

Mapping IRT-Based Measures to Original Summative Scores

A cross-walk table for both fear scales is shown in Table 4. With the table, once an IRT-based measure is known, the original FABQ summative score can be identified.


Table 4. Cross-Walk Table for Scoring Fear-Avoidance Scales With Item Response Theory (IRT) and Original Summative Methodsa


Discussion
 
The 2 most important findings of these analyses were that the items in the scales for fear-avoidance beliefs regarding physical and work activities had negligible DIF across many variables describing people commonly seen in outpatient rehabilitation and that, because no practically important DIF was present, we could identify for the IRT-based FABQ scales single screening items that efficiently and accurately classify people with elevated levels of fear-avoidance beliefs regardless of the impairment being treated. These findings are consistent with reports that fear influences outcomes for people with hip,29 knee,29–31 cervical spine and shoulder,28 and neck11,25–27 pain, as well as for people with lumbar spine impairments,4,10,11,16 for whom the FABQ was designed.4 Therefore, when appropriate, clinicians could use the single screening items identified in these analyses to identify people with elevated levels of fear across a wide variety of impairments.1,12,15,16,18,87 If elevated levels of fear were detected, people could be tested further to estimate more-precise measures of fear-avoidance beliefs. There is evidence to suggest that management strategies can be used to reduce fear19–21 and improve outcomes for people with low back pain.14,19–21,35,88–90 In addition, there is evidence that management strategies are evolving for other musculoskeletal conditions, such as shoulder impairments.42

The present study was performed in direct response to a challenge to refine current screening techniques for elevated levels of fear-avoidance beliefs, so that people can be accurately and efficiently classified and their conditions can be managed accordingly.17 Because our IRT-based screening requires only 1 item to accurately classify people, the method is efficient. Improved efficiency, that is, a reduced burden of collecting data, may be the catalyst for more widespread screening for fear-avoidance beliefs in routine outpatient therapy and may facilitate concurrent screenings of multiple psychosocial prognostic indicators, such as depression38 and pain-related fear.4 As more therapists assume a first-contact role,37 efficient yet accurate screening of multiple constructs will be developed to meet therapists’ needs, which will allow rapid identification of people who may require certain types of help as early as possible. The use of IRT methods can facilitate such development because IRT methods are well suited to the development of new scales and the reassessment of existing scales, including the identification of single screening items and the assessment of measurement precision.

The results of the present study indicated that subjects selected the original FABQ item responses4 with word descriptors more than responses without word descriptors. In addition, approximately 4% (floor) and 9% (ceiling) of the subjects selected "completely disagree" and "completely agree" responses for all items of the FABQ-PA scale, respectively; the corresponding values for the FABQ-W scale were 20% (floor) and 3% (ceiling). Therefore, FABQ scores tended not to be normally distributed, and subjects might cluster at the 2 scale extremes; these findings support the results of previous studies14,91 in which medians were used as measures of central tendencies for both FABQ scales to dichotomize subjects.

The results of the factor analyses indicated that the FABQ items were unidimensional, with good local independence, in the original item format4 once the item CAUSED was deleted from the work scale and the item PHYSACTV was deleted from the physical activity scale because they were not loaded strongly on the respective scales. The loss of the item PHYSACTV because of low factor loading is consistent with the reports of Waddell et al4 and Staerkle et al.13 However, beyond the loss of the item PHYSACTV, our results cannot be compared directly with those of Waddell et al4 and Staerkle et al13 because we used factor analyses designed for categorical data, the samples differed in size and diversity, and we edited 2 FABQ items to eliminate references to the back (Appendix).

To our knowledge, no other research group has analyzed FABQ data by using IRT methods, which allowed subjects’ FABQ responses to be described in probabilistic terms.43,57–59,92 Specifically, operating characteristic curves, which graphically depict the correspondence between the predicted responses to an item and the latent trait,59 demonstrated that subjects did not differentiate the original FABQ responses well for responses with no word descriptors; this finding supports the frequency distribution results and calls into question the use of response categories without word descriptors. When we collapsed the 7 responses to 5, the monotonic nature of the operating characteristic curves was restored; this finding implies that the original response anchoring adds error to the measurement of levels of fear-avoidance beliefs and that subjects may be able to better differentiate among 5 responses, thus supporting recent recommendations.93,94 A good fit of items to the 2-parameter IRT model was obtained with the 10-item work activity scale and the collapsed response choices, but a good item fit with the physical activity scale and the collapsed response choices was obtained only after the deletion of 1 more item for fear of physical activities: WORSE. Therefore, the final IRT-based FABQ scales contained 3 items for physical activities and 10 items for work activities. These results suggest that the FABQ, in its original format of 7 response categories for 4 items in the physical activity scale and 7 items in the work activity scale, could be improved through the use of IRT methods, which some authors suggest are more exacting than the classical test theory method43,58,92 originally used to analyze FABQ data and develop the original FABQ scales.

Once the FABQ data were analyzed with IRT methods, screening items that provided maximum information45 at the median for the FABQ scales could be easily selected. Item response theory methods are ideally suited to this task because plots of item information functions allow identification of the amount of information or discriminating ability of each item at any level of fear. We wanted to develop a test (ie, the screening items) that accurately dichotomized subjects into groups with low versus high levels of fear (ie, high levels of fear are disease positive),83 and selecting screening items that provided maximum information at the median fear level produced strong diagnostic test results.

According to Sackett et al,83 when a test with high specificity is positive, the result effectively rules in the diagnosis. The specificities for the physical activity and work activity scales with 1 screening item were strong, 0.98 and 0.93, respectively. In addition, the +LRs, which can be interpreted as the ratio of the true-positive rate to the false-positive rate,84 also were strong. Although the use of 2 screening items dramatically improved the +LR for the physical activity scale, the already strong specificity improved little; in addition, the work activity scale specificity and +LR did not improve appreciably with 2 screening items. The +LR can be interpreted as a cost-to-benefit ratio, in which the rate of true-positive results represents a benefit criterion and the rate of false-positive results represents a cost criterion.84 To minimize unnecessary testing and inappropriate treatment, +LR should be high. Here, we elected to use one screening item per scale, and this method was accurate and efficient and produced high +LRs. Therefore, with the IRT-based fear-avoidance belief scales, a subject who selected the unlabeled response "unsure" or higher for the SHLDNOT screening item on the FABQ-PA scale was about 35 times more likely to have high levels of fear; a subject who selected "unsure" or higher on the WRKCANT screening item on the FABQ-W scale was about 13 times more likely to have high levels of fear. High levels of fear represented FABQ scores higher than the median fear level, which has been associated with poorer functional status outcomes.14,19–21,35,88–90
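The likelihood ratio arithmetic behind these statements can be sketched as follows. The formulas are the standard definitions; the function names are hypothetical, and the 50% pretest probability is an assumption that follows from dichotomizing subjects at the median, not a value reported by the authors. Strictly, a likelihood ratio multiplies the odds, not the probability, of high fear.

```python
def positive_lr(sensitivity, specificity):
    """+LR: true-positive rate divided by false-positive rate."""
    return sensitivity / (1.0 - specificity)


def negative_lr(sensitivity, specificity):
    """-LR: false-negative rate divided by true-negative rate."""
    return (1.0 - sensitivity) / specificity


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pretest probability to a posttest probability through the odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# +LRs of about 35 (FABQ-PA) and 13 (FABQ-W) are quoted in the text;
# the 0.5 pretest probability is an assumption based on the median split.
post_pa = post_test_probability(0.5, 35.0)
post_w = post_test_probability(0.5, 13.0)
```

Under that assumption, a positive response on the screening item raises the probability of high fear from 50% to about 97% for the physical activity scale (35/36) and to about 93% for the work scale (13/14).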

The original FABQ scales scored with summative methods as described by Waddell et al4 are common. However, summative scoring of categorical data typically produces nonlinear scores, whereas IRT-based measures produce linear scores, as evidenced by the data in Table 4. Summative scores are easy to obtain in clinics, but scores from IRT-based measures require computer technology to obtain. The validity of using parametric statistical techniques for nonlinear summative measures has been questioned.95,96 Clinicians who wish to transform new IRT-based measures to original FABQ summative scores4 can use the cross-walk table. For example, if the IRT-based measure of fear of physical activities were 0.6 logit, then the original summative score, estimated from the cross-walk table, would be 17.5 (95% confidence interval=17.4–17.7)—a value considered to be elevated.

Finally, predictions of elevated levels of fear relate to intake fear. Median intake FABQ scores have been used to classify subjects into groups with high versus low levels of fear.14,19 Findings from these randomized controlled trials14,19 suggested that modifications of management strategies designed to reduce the effects of fear-avoidance beliefs for subjects with elevated levels of fear tend to decrease disability (ie, improve functional status). However, as described by George et al,1 dichotomizing subjects on the basis of a median cut score at intake does not necessarily represent an increased probability of developing chronic symptoms. In addition, FABQ items demonstrated no DIF by level of pain intensity, a finding that could facilitate future studies examining the relationship between pain intensity and activity-related fear. Further studies with longitudinal designs and external criteria are recommended to test the predictive power of the cut scores identified with our data, as well as the use of screening items to assess improvements in functional status or quality of life associated with changes in fear-avoidance beliefs or even improvements in fear-avoidance beliefs as a treatment outcome.

Limitations and Future Studies

The present study is not without limitations. The RMSEA values were higher than desired for assessing the fit of the data to the CFAs. All other CFA fit indexes were strong, as were assessments of the fit of the data to the GRM. High RMSEA values imply that the data do not fit the CFA; therefore, further testing to validate the notion that FABQ items represent unidimensional scales worthy of IRT analyses is recommended. Subject grouping by impairment to assess fear-avoidance belief screening may not be as discriminating as other methods of grouping subjects, including grouping by diagnosis; however, the impairment data appeared to be clinically logical, and obtaining confirmable, reliable, and valid diagnoses for many people seeking outpatient rehabilitation is difficult. Other methods of grouping subjects should be explored.

The present study represents a retrospective analysis of an existing effectiveness database. The researchers had no control over which subjects were asked to complete the FABQ surveys; therefore, the potential for biased results exists. However, because the sample was large, it can be argued that the results represented adequate estimates of the FABQ scores. Nevertheless, it would be prudent to investigate the effect of not all subjects answering the FABQ surveys. Future prospective studies related to the reliability and validity of screening FABQ measures are encouraged. Future studies should consider screening for levels of fear with FABQ cut scores that are not based on a median split and should explore potentially informative associations between clinical variables and other psychosocial factors, including levels of fear, false-positive results, and floor and ceiling effects. The use of screening information by clinicians to modify management and interventions to improve outcomes is encouraged. Because IRT-based screening for elevated levels of fear-avoidance beliefs was accurate, the use of elevated levels of fear-avoidance beliefs as a risk adjustment variable in longitudinal studies of changes in functional status should be explored. Finally, efficient collection of data is facilitated by the use of computers. Exploration of the efficiency and accuracy of combining IRT-based FABQ screening with computerized adaptive testing of functional status is encouraged.


Conclusion
 
Using IRT methods, we analyzed scales commonly used for fear-avoidance beliefs regarding physical and work activities with a large sample of people being treated for common neuromusculoskeletal impairments in outpatient rehabilitation. The results indicated that IRT methods can improve assessments with FABQ scales and can be used to determine single screening items that can accurately identify people with high levels of fear at rehabilitation intake. Because the items in the IRT-based FABQ scales had negligible DIF, particularly for a subject's impairment, the results support the use of the screening items to identify people with high levels of fear in routine practice in a variety of subgroups of people seeking outpatient rehabilitation. The use of IRT-based FABQ scales might prove beneficial by alerting therapists to the likelihood of elevated levels of fear, which could prompt further testing or modifications of management strategies designed to produce improved outcomes.


Appendix.
 


Appendix. Modified Fear-Avoidance Beliefs Questionnairea

a Modified and reprinted with permission of the International Association for the Study of Pain from: Waddell G, Newton M, Henderson I, et al. A Fear-Avoidance Beliefs Questionnaire (FABQ) and the role of fear-avoidance beliefs in chronic low back pain and disability. Pain. 1993;52:157–168.

b Item was modified from the original wording by eliminating references to the back.


Footnotes
 
Dr Hart and Mr Werneke provided concept/idea/research design. Dr Hart, Mr Werneke, Dr George, Dr Matheson, and Dr Cook provided writing. Dr Hart, Mr Werneke, Dr Matheson, and Mr Mioduski provided data collection. Dr Hart and Dr Cook provided data analysis. Dr Hart provided project management. Dr George, Dr Matheson, Dr Wang, Dr Cook, and Dr Choi provided consultation (including review of manuscript before submission). Dr Hart is an employee of and investor in Focus On Therapeutic Outcomes, Inc (FOTO), the database management company that manages the data analyzed in the study. Mr Mioduski wrote the software used to collect the data and manage the aggregated database from which the data were drawn for the analyses. Mr Werneke and Dr Matheson work for clinical facilities that use the FOTO data collection system for routine data collection and case management. The authors thank Dr Paul Crane and Dr Laura Gibbons for their insightful comments regarding differential item functioning analyses.

The Institutional Review Board for the Protection of Human Subjects, Focus On Therapeutic Outcomes, Inc, approved the project.

Part of this research was presented at the International Conference on Outcomes Measurement; September 11–13, 2008; Bethesda, Maryland; and at the Combined Sections Meeting of the American Physical Therapy Association; February 9–12, 2009; Las Vegas, Nevada.

* Focus On Therapeutic Outcomes, Inc, PO Box 11444, Knoxville, TN 37939-1444 (Web site: www.fotoinc.com).

† Scientific Software International Inc, 7383 N Lincoln Ave, Suite 100, Lincolnwood, IL 60712-1747.

‡ Crane P, Gibbons LE, Jolley L, van Belle G, University of Washington, Seattle, WA, 2005.

§ StataCorp LP, 4905 Lakeway Dr, College Station, TX 77845.


References
 
  1. George SZ, Fritz JM, Childs JD. Investigation of elevated fear-avoidance beliefs for patients with low back pain: a secondary analysis involving patients enrolled in physical therapy clinical trials. J Orthop Sports Phys Ther. 2008;38:50–58.
  2. Linton SJ. A review of psychological risk factors in back and neck pain. Spine. 2000;25:1148–1156.
  3. Pincus T, Burton AK, Vogel S, Field AP. A systematic review of psychological factors as predictors of chronicity/disability in prospective cohorts of low back pain. Spine. 2002;27:E109–E120.
  4. Waddell G, Newton M, Henderson I, et al. A Fear-Avoidance Beliefs Questionnaire (FABQ) and the role of fear-avoidance beliefs in chronic low back pain and disability. Pain. 1993;52:157–168.
  5. Leeuw M, Goossens ME, Linton SJ, et al. The fear-avoidance model of musculoskeletal pain: current state of scientific evidence. J Behav Med. 2007;30:77–94.
  6. Lethem J, Slade PD, Troup JD, Bentley G. Outline of a Fear-Avoidance Model of exaggerated pain perception—I. Behav Res Ther. 1983;21:401–408.
  7. Vlaeyen JW, Kole-Snijders AMJ, Rotteveel AM, et al. The role of fear of movement/(re)injury in pain disability. J Occup Rehabil. 1995;5:235–252.
  8. Vlaeyen JW, Kole-Snijders AM, Boeren RG, van Eek H. Fear of movement/(re)injury in chronic low back pain and its relation to behavioral performance. Pain. 1995;62:363–372.
  9. Vlaeyen JW, Linton SJ. Fear-avoidance and its consequences in chronic musculoskeletal pain: a state of the art. Pain. 2000;85:317–332.
  10. Crombez G, Vlaeyen JW, Heuts PH, Lysens R. Pain-related fear is more disabling than pain itself: evidence on the role of pain-related fear in chronic back pain disability. Pain. 1999;80:329–339.
  11. George SZ, Fritz JM, Erhard RE. A comparison of fear-avoidance beliefs in patients with lumbar spine pain and cervical spine pain. Spine. 2001;26:2139–2145.
  12. Fritz JM, George SZ, Delitto A. The role of fear-avoidance beliefs in acute low back pain: relationships with current and future disability and work status. Pain. 2001;94:7–15.
  13. Staerkle R, Mannion AF, Elfering A, et al. Longitudinal validation of the Fear-Avoidance Beliefs Questionnaire (FABQ) in a Swiss-German sample of low back pain patients. Eur Spine J. 2004;13:332–340.
  14. Burton AK, Waddell G, Tillotson KM, Summerton N. Information and advice to patients with back pain can have a positive effect: a randomized controlled trial of a novel educational booklet in primary care. Spine. 1999;24:2484–2491.
  15. Cleland JA, Fritz JM, Brennan GP. Predictive validity of initial fear avoidance beliefs in patients with low back pain receiving physical therapy: is the FABQ a useful screening tool for identifying patients at risk for a poor recovery? Eur Spine J. 2008;17:70–79.
  16. Fritz JM, George SZ. Identifying psychosocial variables in patients with acute work-related low back pain: the importance of fear-avoidance beliefs. Phys Ther. 2002;82:973–983.
  17. George SZ. Fear: a factor to consider in musculoskeletal rehabilitation. J Orthop Sports Phys Ther. 2006;36:264–266.
  18. George SZ, Bialosky JE, Donald DA. The centralization phenomenon and fear-avoidance beliefs as prognostic factors for acute low back pain: a preliminary investigation involving patients classified for specific exercise. J Orthop Sports Phys Ther. 2005;35:580–588.
  19. George SZ, Fritz JM, Bialosky JE, Donald DA. The effect of a fear-avoidance-based physical therapy intervention for patients with acute low back pain: results of a randomized clinical trial. Spine. 2003;28:2551–2560.
  20. George SZ, Fritz JM, McNeil DW. Fear-avoidance beliefs as measured by the Fear-Avoidance Beliefs Questionnaire: change in Fear-Avoidance Beliefs Questionnaire is predictive of change in self-report of disability and pain intensity for patients with acute low back pain. Clin J Pain. 2006;22:197–203.
  21. Linton SJ, Boersma K, Jansson M, et al. The effects of cognitive-behavioral and physical therapy preventive interventions on pain-related sick leave: a randomized controlled trial. Clin J Pain. 2005;21:109–119.
  22. Linton SJ, Buer N, Vlaeyen J, Hellsing AL. Are fear-avoidance beliefs related to the inception of an episode of back pain? A prospective study. Psychol Health. 2000;14:1051–1059.
  23. Houben RM, Leeuw M, Vlaeyen JW, et al. Fear of movement/injury in the general population: factor structure and psychometric properties of an adapted version of the Tampa Scale for Kinesiophobia. J Behav Med. 2005;28:415–424.
  24. Goubert L, Crombez G, De Bourdeaudhuij I. Low back pain, disability and back pain myths in a community sample: prevalence and interrelationships. Eur J Pain. 2004;8:385–394.
  25. Landers MR, Creger RV, Baker CV, Stutelberg KS. The use of fear-avoidance beliefs and nonorganic signs in predicting prolonged disability in patients with neck pain. Man Ther. 2008;13:239–248.
  26. Linton SJ, Ryberg M. A cognitive-behavioral group intervention as prevention for persistent neck and back pain in a non-patient population: a randomized controlled trial. Pain. 2001;90:83–90.
  27. Nederhand MJ, Ijzerman MJ, Hermens HJ, et al. Predictive value of fear avoidance in developing chronic neck pain disability: consequences for clinical decision making. Arch Phys Med Rehabil. 2004;85:496–501.
  28. Huis ’t Veld RM, Vollenbroek-Hutten MM, Groothuis-Oudshoorn KC, Hermens HJ. The role of the fear-avoidance model in female workers with neck-shoulder pain related to computer work. Clin J Pain. 2007;23:28–34.
  29. van Baar ME, Dekker J, Oostendorp RA, et al. The effectiveness of exercise therapy in patients with osteoarthritis of the hip or knee: a randomized clinical trial. J Rheumatol. 1998;25:2432–2439.
  30. Chmielewski TL, Jones D, Day T, et al. The association of pain and fear of movement/reinjury with function during anterior cruciate ligament reconstruction rehabilitation. J Orthop Sports Phys Ther. 2008;38:746–753.
  31. Kvist J, Ek A, Sporrstedt K, Good L. Fear of re-injury: a hindrance for returning to sports after anterior cruciate ligament reconstruction. Knee Surg Sports Traumatol Arthrosc. 2005;13:393–397.
  32. Nash JM, Williams DM, Nicholson R, Trask PC. The contribution of pain-related anxiety to disability from headache. J Behav Med. 2006;29:61–67.
  33. Turk DC, Robinson JP, Burwinkle T. Prevalence of fear of pain and activity in patients with fibromyalgia syndrome. J Pain. 2004;5:483–490.
  34. Heuts PH, Vlaeyen JW, Roelofs J, et al. Pain-related fear and daily functioning in patients with osteoarthritis. Pain. 2004;110:228–235.
  35. de Jong JR, Vlaeyen JW, Onghena P, et al. Reduction of pain-related fear in complex regional pain syndrome type I: the application of graded exposure in vivo. Pain. 2005;116:264–275.
  36. Werneke MW, Hart DL. Centralization: association between repeated end-range pain responses and behavioral signs in patients with acute non-specific low back pain. J Rehabil Med. 2005;37:286–290.
  37. Leemrijse CJ, Swinkels IC, Veenhof C. Direct access to physical therapy in the Netherlands: results from the first year in community-based physical therapy. Phys Ther. 2008;88:936–946.
  38. Haggman S, Maher CG, Refshauge KM. Screening for symptoms of depression by physical therapists managing low back pain. Phys Ther. 2004;84:1157–1166.
  39. Werneke MW, Hart DL, George SZ, et al. Clinical outcomes for patients classified by fear-avoidance beliefs and centralization phenomenon. Arch Phys Med Rehabil. 2009;90:768–777.[CrossRef][Web of Science][Medline]
  40. DeSalvo KB, Fisher WP, Tran K, et al. Assessing measurement properties of two single-item general health measures. Qual Life Res. 2006;15:191–201.[CrossRef][Web of Science][Medline]
  41. Rost K, Burnam MA, Smith GR. Development of screeners for depressive disorders and substance disorder history. Med Care. 1993;31:189–200.[Web of Science][Medline]
  42. Geraets JJ, Goossens ME, de Groot IJ, et al. Effectiveness of a graded exercise therapy program for patients with chronic shoulder complaints. Aust J Physiother. 2005;51:87–94.[Web of Science][Medline]
  43. van der Linden WJ, Hambleton RK, eds. Handbook of Modern Item Response Theory. New York, NY: Springer-Verlag; 1997.
  44. Millsap RE, Everson HT. Methodology review: statistical approaches for assessing measurement bias. Appl Psychol Meas. 1993;17:287–334.
  45. Lord FM. Applications of Item Response Theory to Practical Testing Problems. Hillsdale, NJ: Lawrence Erlbaum Associates; 1980.
  46. Deutscher D, Hart DL, Dickstein R, et al. Implementing an integrated electronic outcomes and electronic health record process to create a foundation for clinical practice improvement. Phys Ther. 2008;88:270–285.
  47. Swinkels ICS, van den Ende CHM, de Bakker D, et al. Clinical databases in physical therapy. Physiother Theory Pract. 2007;23:153–167.
  48. Hart AC, Stegman MS. ICD-9-CM 2008 Expert. 6th ed. Salt Lake City, UT: Ingenix; 2007.
  49. Hart DL, Connolly JB. Pay-for-Performance for Physical Therapy and Occupational Therapy: Medicare Part B Services. Final report. Grant #18-P-93066/9–01. Baltimore, MD: Centers for Medicare and Medicaid Services, Department of Health and Human Services; 2006. Available at: http://www.cms.hhs.gov/TherapyServices/downloads/P4PFinalReport06-01-06.pdf. Accessed May 18, 2009.
  50. Hart DL, Cook KF, Mioduski JE, et al. Simulated computerized adaptive test for patients with shoulder impairments was efficient and produced valid measures of function. J Clin Epidemiol. 2006;59:290–298.
  51. Hart DL, Mioduski JE, Stratford PW. Simulated computerized adaptive tests for measuring functional status were efficient with good discriminant validity in patients with hip, knee, or foot/ankle impairments. J Clin Epidemiol. 2005;58:629–638.
  52. Hart DL, Mioduski JE, Werneke MW, Stratford PW. Simulated computerized adaptive test for patients with lumbar spine impairments was efficient and produced valid measures of function. J Clin Epidemiol. 2006;59:947–956.
  53. Hart DL, Wang YC, Stratford PW, Mioduski JE. Computerized adaptive test for patients with foot or ankle impairments produced valid and responsive measures of function. Qual Life Res. 2008;17:1081–1091.
  54. Hart DL, Wang YC, Stratford PW, Mioduski JE. A computerized adaptive test for patients with hip impairments produced valid and responsive measures of function. Arch Phys Med Rehabil. 2008;89:2129–2139.
  55. Hart DL, Wang YC, Stratford PW, Mioduski JE. Computerized adaptive test for patients with knee impairments produced valid and responsive measures of function. J Clin Epidemiol. 2008;61:1113–1124.
  56. Asmundson GJ, Norton GR, Allerdings MD. Fear and avoidance in dysfunctional chronic back pain patients. Pain. 1997;69:231–236.
  57. Hambleton RK. Emergence of item response modeling in instrument development and data analysis. Med Care. 2000;38(9 suppl):II60–II65.
  58. Hambleton RK, Swaminathan H, Rogers HJ. Fundamentals of Item Response Theory. Newbury Park, CA: Sage; 1991.
  59. Hays RD, Morales LS, Reise SP. Item response theory and health outcomes measurement in the 21st century. Med Care. 2000;38(9 suppl):II28–II42.
  60. Muthén LK, Muthén BO. Mplus User's Guide. 4th ed. Los Angeles, CA: Muthén & Muthén; 2006.
  61. Bjorner JB, Kosinski M, Ware JE Jr. The feasibility of applying item response theory to measures of migraine impact: a re-analysis of three clinical studies. Qual Life Res. 2003;12:887–902.
  62. Fliege H, Becker J, Walter OB, et al. Development of a computer-adaptive test for depression (D-CAT). Qual Life Res. 2005;14:2277–2291.
  63. McDonald RP. Test Theory: A Unified Treatment. Mahwah, NJ: Lawrence Erlbaum Associates; 1999.
  64. Hu LT, Bentler P. Cutoff criteria for fit indices in covariance structure analysis: conventional criteria versus new alternatives. Structural Equation Modeling. 1999;6:1–55.
  65. Tucker L, Lewis C. A reliability coefficient for maximum likelihood factor analysis. Psychometrika. 1973;38:1–10.
  66. Browne MW, Cudeck R. Alternative ways of assessing model fit. In: Bollen KA, Long JS, eds. Testing Structural Equation Models. Newbury Park, CA: Sage Publications; 1993:136–172.
  67. Samejima F. Estimation of ability using a response pattern of graded responses. Psychometrika. 1969. Monograph 17.
  68. Samejima F. Graded Response Model. In: van der Linden WJ, Hambleton RK, eds. Handbook of Modern Item Response Theory. New York, NY: Springer-Verlag; 1997:85–100.
  69. PARSCALE for Windows, version 4.1. Lincolnwood, IL: Scientific Software International, Inc; 2003.
  70. Dodd BG, Koch WR, De Ayala RJ. Operational characteristics of adaptive testing procedures using the Graded Response Model. Appl Psychol Meas. 1989;13:129–143.
  71. Rose M, Bjorner JB, Becker J, et al. Evaluation of a preliminary physical function item bank supported the expected advantages of the Patient-Reported Outcomes Measurement Information System (PROMIS). J Clin Epidemiol. 2008;61:17–33.
  72. Thissen D. Reliability and measurement precision. In: Wainer H, ed. Computerized Adaptive Testing: A Primer. 2nd ed. Mahwah, NJ: Lawrence Erlbaum Associates; 2000:159–184.
  73. Delitto A, Erhard RE, Bowling RW. A treatment-based classification approach to low back syndrome: identifying and staging patients for conservative treatment. Phys Ther. 1995;75:470–489.
  74. Crane PK, Cetin K, Cook KF, et al. Differential item functioning impact in a modified version of the Roland-Morris Disability Questionnaire. Qual Life Res. 2007;16:981–990.
  75. Crane PK, Gibbons LE, Jolley L, van Belle G. Differential item functioning analysis with ordinal logistic regression techniques: DIFdetect and difwithpar. Med Care. 2006;44(11 suppl 3):S115–S123.
  76. Crane PK, Gibbons LE, Narasimhalu K, et al. Rapid detection of differential item functioning in assessments of health-related quality of life: the Functional Assessment of Cancer Therapy. Qual Life Res. 2007;16:101–114.
  77. Crane PK, Gibbons LE, Ocepek-Welikson K, et al. A comparison of three sets of criteria for determining the presence of differential item functioning using ordinal logistic regression. Qual Life Res. 2007;16(suppl 1):69–84.
  78. Crane PK, Hart DL, Gibbons LE, Cook KF. A 37-item shoulder functional status item pool had negligible differential item functioning. J Clin Epidemiol. 2006;59:478–484.
  79. Crane PK, van Belle G, Larson EB. Test bias in a cognitive test: differential item functioning in the CASI. Stat Med. 2004;23:241–256.
  80. Stata Statistical Software, release 9.2. College Station, TX: StataCorp LP; 2007.
  81. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143:29–36.
  82. Choi BC. Slopes of a receiver operating characteristic curve and likelihood ratios for a diagnostic test. Am J Epidemiol. 1998;148:1127–1132.
  83. Sackett DL, Straus SE, Richardson WS, et al. Evidence-Based Medicine: How to Practice and Teach EBM. 2nd ed. New York, NY: Churchill Livingstone Inc; 2000.
  84. Dujardin B, Van den Ende J, Van Gompel A, et al. Likelihood ratios: a real improvement for clinical decision making? Eur J Epidemiol. 1994;10:29–36.
  85. Jaeschke R, Guyatt G, Sackett DL; Evidence-Based Medicine Working Group. Users’ guides to the medical literature, III: how to use an article about a diagnostic test, A: Are the results of the study valid? JAMA. 1994;271:389–391.
  86. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44:837–845.
  87. Woby SR, Watson PJ, Roach NK, Urmston M. Are changes in fear-avoidance beliefs, catastrophizing, and appraisals of control, predictive of changes in chronic low back pain and disability? Eur J Pain. 2004;8:201–210.
  88. de Jong JR, Vlaeyen JW, Onghena P, et al. Fear of movement/(re)injury in chronic low back pain: education or exposure in vivo as mediator to fear reduction? Clin J Pain. 2005;21:9–17; discussion 69–72.
  89. Godges JJ, Anger MA, Zimmerman G, Delitto A. Effects of education on return-to-work status for people with fear-avoidance beliefs and acute low back pain. Phys Ther. 2008;88:231–239.
  90. Klaber Moffett JA, Carr J, Howarth E. High fear-avoiders of physical activity benefit from an exercise program for patients with back pain. Spine. 2004;29:1167–1172.
  91. Werneke MW, Hart DL. Categorizing patients with occupational low back pain by use of the Quebec Task Force Classification system versus pain pattern classification procedures: discriminant and predictive validity. Phys Ther. 2004;84:243–254.
  92. Embretson SE, Reise SP. Item Response Theory for Psychologists. Mahwah, NJ: Lawrence Erlbaum Associates; 2000.
  93. Bode RK, Lai JS, Cella D, Heinemann AW. Issues in the development of an item bank. Arch Phys Med Rehabil. 2003;84(4 suppl 2):S52–S60.
  94. DeWalt DA, Rothrock N, Yount S, Stone AA. Evaluation of item candidates: the PROMIS qualitative item review. Med Care. 2007;45(5 suppl 1):S12–S21.
  95. Tennant A, Penta M, Tesio L, et al. Assessing and adjusting for cross-cultural validity of impairment and activity limitation scales through differential item functioning within the framework of the Rasch model: the PRO-ESOR project. Med Care. 2004;42(1 suppl):I37–I48.
  96. Wright BD, Linacre JM. Observations are always ordinal; measurements, however, must be interval. Arch Phys Med Rehabil. 1989;70:857–860.
