The validity of the ULTT is unclear, due to heterogeneity of test procedures and variability in the definition of a positive test.
Objective
To evaluate test procedures and positive diagnostic criteria for the upper limb tension test (ULTT) in diagnostic test accuracy studies.
Methods
A systematic review of diagnostic accuracy studies was performed. We conducted a search of the DiTA (Diagnostic Test Accuracy) database and selected primary studies evaluating the diagnostic accuracy of the ULTT. We assessed risk of bias, performed data extraction on study characteristics, test procedures, and positive diagnostic criteria, and performed a descriptive analysis.
Results
We included nine studies (681 participants): four diagnosing people with cervical radiculopathy (CR), four diagnosing people with carpal tunnel syndrome (CTS), and one including both CR and CTS. The risk of bias varied between 2 and 6 out of 6 positive items. Eight studies reported on the ULTT1 (median nerve).
Overall, all studies clearly described their test procedures and positive diagnostic criteria, although the order of movements and the diagnostic criteria varied between studies. We suggest a more standardised test procedure for the ULTT1, consisting of: 1) stabilising the shoulder in abduction, 2) extending the wrist/fingers, 3) supinating the forearm, 4) externally rotating the shoulder, 5) extending the elbow, and finally 6) performing structural differentiation by side bending (lateral flexion) of the neck. This proposed test procedure aims to reproduce the symptoms and enables the clinician to evaluate whether symptoms increase or decrease when stressing or relaxing the nerves.
Conclusion
Based on our findings, we propose a more standardised test procedure for the ULTT1 with accompanying positive diagnostic criteria to facilitate homogeneity in future diagnostic accuracy studies of the ULTT.
Upper limb tension tests (ULTT), also called upper limb neurodynamic tests (ULNT), are a neurodynamic technique commonly used by clinicians to evaluate nerve gliding and neural tension in patients.1 Neurodynamic tests aim to investigate, by moving and stretching the peripheral nerves, whether a peripheral nerve is contributing to the patient's pain (peripheral neuropathic pain).2,3 When performing a neurodynamic test, the assessor moves the limb into various positions with the aim of increasing tension on peripheral nerves.4,5 Each joint position is added to provoke the pain or reproduce the symptoms.6
Variations of the ULTT aim to selectively differentiate between nerves of the upper limb by altering the type of movement and movement sequence. The ULTT1 and 2a aim to test the median nerve, the ULTT2b the radial nerve, and ULTT3 the ulnar nerve. The ability of the ULTT to selectively test specific nerve roots is however questionable.3,7,8
The diagnostic work-up for nerve compression includes, apart from clinical signs and symptoms and tests such as the ULTT, electromyography, nerve conduction studies, and magnetic resonance imaging (MRI).9 Clinicians often use the ULTT to help diagnose whether a patient has peripheral neuropathic pain, such as cervical radiculopathy (CR), carpal tunnel syndrome (CTS), or cubital tunnel syndrome.2,3,5
Before any test can be endorsed for use in clinical practice, its validity needs to be determined. Systematic reviews state that the validity of the ULTT is unclear, due to heterogeneity of test procedures and variability in the definition of a positive test.10,11 Apart from textbooks, several websites and videos that claim to be evidence-based exist to help clinicians perform these tests accurately.2,12,13 Unfortunately, test procedures of the ULTT vary in the literature as well as on these major websites.
Currently, no uniform set of positive diagnostic criteria exists for the ULTT, creating uncertainty about what constitutes a positive test.10,11 Positive test criteria (positivity thresholds) used in the literature and in clinical practice include, amongst others, reproduction of patient symptoms with differences between sides, relief or exacerbation of symptoms on structural differentiation, and reproduction of neurological pain associated with the nerve distribution.14,15
Current systematic reviews focus on the accuracy data, irrespective of the test procedures and variation in positive diagnostic criteria.10,11 Given the lack of test procedure standardization and of clear cut-offs for a positive test, the usefulness of these tests in clinical practice and the validity of the accompanying accuracy data are questionable. A standardized procedure and clear positive diagnostic criteria need to be established before the diagnostic accuracy of these tests is evaluated. Once the diagnostic accuracy is established and the ULTT can rule in or rule out a possible CR or CTS, these tests might have greater clinical application and improve patient outcomes. Additionally, this may lead to a reduction in patient waiting time for nerve conduction studies or expensive diagnostic imaging. Therefore, this study aimed to evaluate test procedures and positive diagnostic criteria for the ULTT in diagnostic test accuracy studies and to construct a proposed set of recommended test procedures and positivity thresholds.
Methods
Study design
A systematic review using the DiTA (Diagnostic Test Accuracy) database was performed. The DiTA database is a comprehensive index of diagnostic test accuracy studies, developed specifically for the discipline of physical therapy and updated monthly using automated searches.16,17 It is a sister database of PEDro (Physiotherapy Evidence Database), which includes intervention studies within the physical therapy domain. We registered our protocol on the UTS (University of Technology Sydney) data repository, and it can be requested from the corresponding author.
Search strategy
The search was conducted on 22 November 2021 using the search terms Upper Limb Tension Test and associated synonyms, such as ULTT, ULNT, Brachial Plexus Tension, Elvey Test, Upper limb nerve tension test, Peripheral neuropathic pain, Cervical radiculopathy, Carpal tunnel syndrome, and Cubital tunnel syndrome.
Study selection
The inclusion criteria were: a) adult participants (18 years or over) presenting with signs and symptoms that indicate upper limb pathology; b) evaluation of the sensitivity and/or specificity of the ULTT in diagnosing upper limb pathology by comparing the ULTT to any reference test; and c) primary diagnostic test accuracy studies. We excluded studies using non-human subjects or cadavers and studies in which the ULTT was used in combination with other special tests for diagnosis. The eligibility of each study, identified by the search or included in (systematic) reviews, was determined by two out of five independent assessors (GB, CB, CA, HD, SF). Discrepancies were resolved by consensus or by a third independent investigator (AV).
Risk of bias assessment
The QUADAS-2 is the recommended tool to assess risk of bias (RoB) in diagnostic test accuracy studies.18 As this tool is rather difficult for clinicians and researchers to administer, we simplified it to six criteria (Table 1).19 Each criterion can be scored yes/no/unclear. Two out of five assessors (GB, CB, CA, HD, SF) independently assessed each study, and any discrepancies were discussed and resolved by consensus. If a consensus could not be reached, a third independent investigator (AV) made a final decision.
Criteria for risk of bias in a diagnostic test accuracy study.31
Two of five assessors (GB, CB, CA, HD, SF) extracted the following data from each of the studies: author(s) and year of publication, participant details (number, mean age and range, sex, clinical characteristics), examination details (clinical setting, examiner profession and expertise), reference standard, test procedures, and criteria for a positive test result, e.g. pain or range of motion (either actively or passively performed). Two review authors (HB, AV) resolved discrepancies and checked the extracted data.
Analysis
We assessed whether a specificity or sensitivity value was high enough to be clinically useful in ruling a condition in or out, according to the SpPIn (specificity high and positive test rules in a condition) and SnNOut (sensitivity high and negative test rules out a condition) rules. As most diagnostic test accuracy studies are performed in highly specific populations, generalising their accuracy to the public requires sensitivity and specificity to be sufficiently high, with recommended cut-offs between 0.85 and 0.90–0.95 being used.20,21 We conducted a frequency analysis of the number of papers outlining the test protocol used for the ULTT to determine the common denominator of those procedures. Similarly, a descriptive analysis was completed to assess the positive diagnostic criteria.
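The SpPIn and SnNOut rules can be sketched as a simple threshold check. This is a minimal illustration rather than the analysis code used in the review: the function name `clinical_usefulness` is hypothetical, and the default cut-off of 0.85 is an assumption that takes the lowest value of the cited range.

```python
def clinical_usefulness(sensitivity, specificity, cutoff=0.85):
    """Apply the SpPIn / SnNOut rules of thumb at a given cut-off.

    The default of 0.85 is an assumption (the lowest cut-off mentioned
    in the text); stricter values such as 0.90 or 0.95 can be passed in.
    """
    return {
        # SpPIn: specificity is high, so a positive test helps rule IN.
        "rule_in": specificity >= cutoff,
        # SnNOut: sensitivity is high, so a negative test helps rule OUT.
        "rule_out": sensitivity >= cutoff,
    }
```

For example, `clinical_usefulness(0.92, 0.15)` flags a test as potentially useful for ruling a condition out but not for ruling it in, and the same values fail both rules once the cut-off is raised to 0.95.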
Results
Search results
From the initial search in the DiTA database, 22 original studies were retrieved, in addition to two systematic reviews10,11 and three narrative reviews.3,22,23 After reviewing the primary research papers included in these reviews and removing duplicates, we retrieved an additional 5 references, resulting in 27 papers being available for full-text review (Fig. 1). After the application of the selection criteria, we excluded three papers as these were narrative reviews,3,22,23 three papers did not evaluate the ULTT,24-26 and one was in a population with low back pain.27 Finally, we included 10 papers reporting on 9 studies.
Description of included studies
Participants. Four studies aimed at diagnosing CR (386 participants),14,28-30 and four studies aimed at diagnosing CTS (295 participants).31-34 One study (in two publications) included both patients with CR and CTS (Table 2).35,36 Individuals with CR were all referred to a specialized clinic or a neurosurgery department. Three out of the four studies on patients with CR specifically stated they included consecutive patients. The average age of the participants varied between 43.2 and 54.3 years and most of the participants were female (varying between 49 and 83%).
Characteristics of included studies.
Study | Participants | Index test | Reference test |
---|---|---|---|
Cervical radiculopathy | |||
Apelby-Albrecht 2013, Sweden | Cervical Radiculopathy; n = 51, consecutive, referred to neurosurgeon; age: 51 (25–67); female: 27 (53%) | ULTT all combined and 1, 2A-B, 3 separately (manual therapists) | Combination of history, clinical examination, and MRI (neurosurgeons) |
Ghasemi 2013, Iran | Cervical Radiculopathy; n = 97, referred to specialised diagnostic centre; age: 46.3; female: 72 (74.2%) | ULTT (trained examiners) | Nerve conduction studies (EDX) (neurologist) |
Grondin 2021, France | Cervical Radiculopathy; n = 85, consecutive, referred to neurosurgery department; age: 44; female: not reported | ULTT 1, 2A-B, 3 (one experienced physical therapist) | Combination of history, clinical examination, and MRI (one experienced neurosurgeon) |
Sleijser-Koehorst 2020, Netherlands | Cervical Radiculopathy; n = 134, consecutive, referred to specialised clinic; age: 49.9 (10.7); female: 66 (49%) | ULTT1 (physical therapist) | Clinical presentation + MRI (neurosurgeon) |
Carpal Tunnel Syndrome | |||
Bueno-Garcia 2016, Spain | Carpal Tunnel Syndrome; n = 58; age: 54.3 (14.5); female: 44 (75.8%) | ULTT1 (physical therapist) | NCS (neurophysiologist) |
Trillos 2018, Colombia | Carpal Tunnel Syndrome; n = 118 (230 wrists); age: 50.5 (18–86); female: 98 (83.1%) | ULTT1 (physical therapist) | NCS (physiatrist) |
Vanti 2011 | Carpal Tunnel Syndrome; n = 44; age: 46.3 (10.8); female: 33 (75%) | ULTT1 (physical therapist) | NCS (experienced tester) |
Vanti 2012 | Carpal Tunnel Syndrome; n = 47 (84 limbs); age: 45.9 (10.7); female: 35 (74.5%) | ULTT1 (physical therapist) | NCS (experienced tester) |
Mixed population | |||
Wainner 2003* / 2005 | N = 82; Cervical Radiculopathy: n = 19, age: 43.2 (11.7); Carpal Tunnel Syndrome: n = 28, age: 48.4 (11.5); female: 41 (50%) | ULTT-A, ULTT-B | NCS (physiatrist or neurologist); needle EMG |
Data are mean (standard deviation), mean (range), or frequency (proportion).
CTS, carpal tunnel syndrome; EDX, electrodiagnostic studies; EMG, electromyography; MRI, magnetic resonance imaging; NCS, nerve conduction study; ULTT, upper limb tension test (ULTT1 and 2A aim to test the median nerve, the ULTT2B the radial nerve and ULTT3 the ulnar nerve).
Index test. The most evaluated test was ULTT1; five studies only evaluated the ULTT1, two studies evaluated all four ULTT tests and provided accuracy data for all tests separately,14,29 one study did not specify which ULTT was evaluated,28 and one study evaluated the ULTT-A (which was similar to the ULTT1) and ULTT-B (an alternative to ULTT1).35
Reference test. Nerve conduction study was the most common reference test mainly in the studies investigating CTS,31-35 followed by clinical presentation and magnetic resonance imaging (MRI), in the studies investigating CR.14,29,30 The CR studies reported history taking and physical examination to be part of the reference test, and in two (out of four) studies the reference test was performed by one specialist, while none of the CTS studies did so.
Accuracy data. All studies provided data on the accuracy of the ULTT, and eight studies reported data for the ULTT1. Based on the sparsity of data related to other ULTT tests, we only report outcomes for the ULTT1. Sensitivity of the four test procedures for the ULTT1 for CR varied between 0.35 and 0.83, and the specificity varied between 0.40 and 0.76, meaning none met the lowest cut-off mentioned in literature of 0.85. For CTS studies, which reported on eight test procedures, reported sensitivity was between 0.06 and 0.93 and specificity between 0.10 and 0.93. In three procedures the accuracy data met at least the 0.85 cut-off, but not the 0.95.
Risk of bias. The RoB varied from 2 to 6 criteria scored positive (Table 3).
Risk of bias assessment.
We found wide variation, with a total of 13 different ULTT1 procedures (Table 4). One study did not report the procedure and just provided a reference.31,37 Three other studies described the procedure and cited a reference for its origin.29,31,34 All papers mentioned between two and seven different steps in the procedure. Unfortunately, no two studies performed exactly the same procedure, except those of Vanti et al.33,34 For one study we present the procedure for the ULTT-A, as this is most similar to the ULTT1.35
Test procedure (order of movements) of ULTT1.
In six studies, shoulder depression (which includes shoulder girdle/scapula depression/fixation/stabilisation) is listed as the starting position. One study did not mention shoulder depression and started with the shoulder in abduction,28 one study started with a contralateral flexion of the neck,31 and one study combined most procedures that are separate steps in all other studies into one starting position and added just one extra step (elbow extension).32 Six studies ended the procedure with the last step being ‘contralateral/ipsilateral cervical lateral flexion (bending)’, while one study started with this procedure and released it at the last step.31 Two studies did not mention cervical lateral flexion at all.28,32
Positive diagnostic criteria
All studies clearly stated their positive diagnostic criteria, although the criteria differed between studies (Table 5). In most studies, multiple symptoms were required for a test to be regarded as positive, either in combination or as an ‘either/or’. Three studies evaluated more than one set of positive diagnostic criteria.31,33,34 ‘Reproducible patient symptoms’ and ‘symptoms decrease when relaxing the stress on the nerves’ were the most used criteria (n = 9 and n = 6, respectively). By ‘symptoms decrease when relaxing the stress on the nerves’ (or increased symptoms with increased stretching), the authors mean relaxing the stretch on the arm; ‘structural differentiation’ means that symptoms were evaluated to either increase or decrease when the cervical spine was laterally flexed. Three studies used a combination of criteria (Wainner's criteria: “reproducible patient symptoms OR limited range of motion of the elbow OR decrease of symptoms when relaxing”) as defined by Wainner et al.32,34,35 The criterion ‘symptoms in the first 3 digits’ was only mentioned in three studies including patients with CTS.31,33,34
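The difference between an ‘either/or’ rule and criteria applied in combination can be made concrete with a small sketch. The `Findings` fields and both function names are hypothetical labels introduced for this illustration. Only the logical structure comes from the text: Wainner's criteria as an OR rule, and, as an assumption about what ‘in combination’ means, a rule requiring symptom reproduction together with a change on structural differentiation.

```python
from dataclasses import dataclass


@dataclass
class Findings:
    """Hypothetical record of what the examiner observed during a ULTT1."""
    reproduces_symptoms: bool         # patient's familiar symptoms reproduced
    limited_elbow_rom: bool           # limited range of motion of the elbow
    symptoms_ease_on_release: bool    # symptoms decrease when relaxing
    changes_with_neck_flexion: bool   # structural differentiation


def positive_wainner(f: Findings) -> bool:
    # Either/or rule: any single criterion makes the test positive.
    return f.reproduces_symptoms or f.limited_elbow_rom or f.symptoms_ease_on_release


def positive_combined(f: Findings) -> bool:
    # Combined rule (assumed AND): both criteria must be present.
    return f.reproduces_symptoms and f.changes_with_neck_flexion
```

An examination in which only elbow range of motion is limited, `Findings(False, True, False, False)`, is scored positive by `positive_wainner` and negative by `positive_combined`, so the same examination can be classified differently depending on the rule applied.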
Number of studies that identified each criterion for a positive test for ULTT1.
Study | Crit 1 | Crit 2 | Crit 3 | Crit 4 | Sensitivity/Specificity (95%CI) |
---|---|---|---|---|---|
Cervical radiculopathy | |||||
Apelby-Albrecht 2013 | Reproducible patient symptoms | Side differences | Symptoms increase during stretching | | Sens: 0.83 (0.66, 0.93); Spec: 0.75 (0.48, 0.93) |
Ghasemi 2013 | Reproducible patient symptoms | | | | Sens: 0.35–0.60$; Spec: 0.40 |
Grondin 2021 (Nee et al. 2012) | Reproducible patient symptoms | Structural differentiation* | | | Sens: 0.59 (0.39, 0.78); Spec: 0.76 (0.63, 0.86) |
Sleijser-Koehorst 2020 | Reproducible patient symptoms | Structural differentiation* | | | Sens: 0.67 (0.54, 0.78); Spec: 0.67 (0.54, 0.78) |
Carpal tunnel syndrome | |||||
Bueno-Garcia 2016 (Shacklock 2005) | Reproducible patient symptoms | Structural differentiation* | | | Sens: 0.58 (0.45, 0.71); Spec: 0.84 (0.72, 0.96) |
Bueno-Garcia 2016 (continued) | Symptoms in first three digits | Structural differentiation* | | | Sens: 0.74 (0.61, 0.83); Spec: 0.50 (0.35, 0.65) |
Trillos 2018 | Reproducible patient symptoms | Limited Range of Motion | Symptoms increase during stretching | | Sens: 0.93 (0.88, 0.98); Spec: 0.10 (0.00, 0.34) |
Vanti 2011 (Butler 2000) | Reproducible patient symptoms | Limited Range of Motion | Symptoms increase during stretching | | Sens: 0.92 (0.74, 0.97); Spec: 0.15 (0.05, 0.36) |
Vanti 2011 (continued) | Symptoms in first three digits | | | | Sens: 0.54 (0.35, 0.71); Spec: 0.70 (0.48, 0.85) |
Vanti 2012 | Symptoms in first three digits | | | | Sens: 0.40 (0.26, 0.89); Spec: 0.78 (0.66, 0.89) |
Vanti 2012 (continued) | Symptoms in first three digits | Structural differentiation* | | | Sens: 0.29 (0.16, 0.45); Spec: 0.82 (0.69, 0.91) |
Vanti 2012 (continued) | Symptoms in first three digits | Structural differentiation* | | | Sens: 0.06 (0.02, 0.19); Spec: 0.93 (0.82, 0.98) |
Mixed population | |||||
Wainner 2003 / 2005 (ULTT-A) | Reproducible patient symptoms | Limited Range of Motion | Symptoms increase during stretching | | Sens: 0.97 (0.90, 1.00); Spec: 0.22 (0.12, 0.33) |
CI, confidence interval; Crit, criterion; Sens, sensitivity; Spec, specificity.
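The comparison of these point estimates with the cut-offs can be reproduced with a short sketch. The values are transcribed from the table above; the row labels "set 1", "set 2", and "set 3" are introduced here to distinguish multiple criteria sets from one study in table order, the mixed-population study is left out, and the helper name `meeting_cutoff` is hypothetical. Confidence intervals are ignored, which is a simplification.

```python
# (study, condition, sensitivity, specificity): point estimates only.
TABLE_5 = [
    ("Apelby-Albrecht 2013", "CR", 0.83, 0.75),
    ("Ghasemi 2013", "CR", 0.60, 0.40),  # upper end of the reported 0.35-0.60 range
    ("Grondin 2021", "CR", 0.59, 0.76),
    ("Sleijser-Koehorst 2020", "CR", 0.67, 0.67),
    ("Bueno-Garcia 2016, set 1", "CTS", 0.58, 0.84),
    ("Bueno-Garcia 2016, set 2", "CTS", 0.74, 0.50),
    ("Trillos 2018", "CTS", 0.93, 0.10),
    ("Vanti 2011, set 1", "CTS", 0.92, 0.15),
    ("Vanti 2011, set 2", "CTS", 0.54, 0.70),
    ("Vanti 2012, set 1", "CTS", 0.40, 0.78),
    ("Vanti 2012, set 2", "CTS", 0.29, 0.82),
    ("Vanti 2012, set 3", "CTS", 0.06, 0.93),
]


def meeting_cutoff(rows, condition, cutoff):
    """List procedures whose sensitivity OR specificity reaches the cut-off."""
    return [
        study
        for study, cond, sens, spec in rows
        if cond == condition and max(sens, spec) >= cutoff
    ]


cr_hits = meeting_cutoff(TABLE_5, "CR", 0.85)
cts_hits = meeting_cutoff(TABLE_5, "CTS", 0.85)
cts_strict = meeting_cutoff(TABLE_5, "CTS", 0.95)
```

Under these assumptions, no CR procedure reaches 0.85, three CTS procedures do, and none of the CTS procedures reaches 0.95, in line with the accuracy data reported in the text.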
We found nine diagnostic test accuracy studies, of which four included patients with CR, four included patients with CTS, and one included both patient groups. The most evaluated test was the ULTT1, with a total of 13 test procedures described in eight studies. We found a wide variety of test procedures and criteria for a positive test. Based on our findings, we propose a more standardised test procedure for the ULTT1 in patients with CR as well as CTS. The associated positive diagnostic criteria are reproduction of the symptoms and an increase or decrease of symptoms when stressing or relaxing the nerves.
Comparison with existing literature
Previously, several reviews evaluating the ULTT concluded that there is a lack of a clear definition of terms (‘investigator's definition of a positive test’) and procedures for the ULTT1.3,10,23 The authors all concluded that the validity of the ULTT1 is probably hampered by the diversity in the procedure and interpretation of the index test. The major online support tools also use a variety of procedures in their videos, which reflects the variability in test procedures found in the literature and hampers clear implementation.2,12,13 By evaluating the test procedures of the ULTT1, we found that the order of movements varied between studies. Based on the included studies, we suggest a more standardised test procedure and positive diagnostic criteria. We hope that future research can assess the diagnostic accuracy of the ULTT1 more accurately by using a standardised procedure and positive diagnostic criteria.
Proposed set of test procedures and positive diagnostic criteria
Overall, all studies stabilised the shoulder in abduction, extended the elbow, supinated the forearm, extended the wrist, and performed structural differentiation by side bending of the neck, although the order of movements varied. We propose a more standardised test procedure consisting of these movements. The suggested order is similar to the one performed in the studies with the lowest risk of bias (score of 6): 1) stabilising the shoulder in abduction, 2) extending the wrist/fingers, 3) supinating the forearm, 4) externally rotating the shoulder, 5) extending the elbow, and 6) structural differentiation by laterally flexing (i.e., side bending) the neck. This proposed test procedure aims to assess whether the patient's symptoms are reproduced and whether symptoms increase or decrease when increasing or decreasing the tension on the nerves. When symptoms are present in the first three digits during the ULTT, this acts as an extra diagnostic criterion for patients with CTS and can be regarded as ‘reproduction of symptoms’, as can ‘limited range of motion’.
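As a rough sketch, the proposed order can be written down as an ordered list against which a described procedure is compared. The step labels, the function name `compare_with_proposed`, and the example procedure are all hypothetical and serve only to illustrate how missing steps and order differences might be identified; they are not taken from any included study.

```python
# Proposed ULTT1 order, using shorthand labels chosen for this example.
PROPOSED_SEQUENCE = [
    "shoulder abduction",
    "wrist/finger extension",
    "forearm supination",
    "shoulder external rotation",
    "elbow extension",
    "cervical lateral flexion",
]


def compare_with_proposed(described):
    """Return (missing_steps, same_order) for a described procedure.

    missing_steps: proposed steps absent from the description.
    same_order: True if the shared steps appear in the proposed order.
    """
    missing = [step for step in PROPOSED_SEQUENCE if step not in described]
    shared = [step for step in described if step in PROPOSED_SEQUENCE]
    expected = [step for step in PROPOSED_SEQUENCE if step in shared]
    return missing, shared == expected


# Hypothetical description: four steps, with supination before wrist extension.
example = [
    "shoulder abduction",
    "forearm supination",
    "wrist/finger extension",
    "elbow extension",
]
missing_steps, same_order = compare_with_proposed(example)
```

For this made-up description, the comparison reports that shoulder external rotation and cervical lateral flexion are missing and that the shared steps are not in the proposed order.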
Strengths and limitations
To our knowledge, this is the first systematic review evaluating the test procedures and positive diagnostic criteria of the ULTT1 in patients with neck or arm pain. A possible limitation of our review lies in the search strategy, which was conducted solely in the DiTA database. DiTA is updated monthly, drawing on automated optimised searches of MEDLINE, EMBASE, CINAHL, and the Cochrane Database of Systematic Reviews.38 Therefore, we consider the DiTA database up to date, and when we performed a search in PubMed (January 2022) we did not find any additional or missed studies. Another limitation may be present in the study selection, as the patients in the included studies are specific patients, referred to specialised clinics. Although this probably does not influence test procedures and positive diagnostic criteria, it might influence the accuracy data, as these are dependent on the prevalence of the condition.39 Concerning the reference test, in three of the four CR studies it included physical examination, potentially leading to a risk of confirmation bias that might result in higher accuracy data. In two CR studies it was also unclear whether the index and reference test were performed independently. Our findings, however, did not show higher accuracy data in the CR studies; therefore, we believe that confirmation bias might not have played an important role. Another limitation could be our risk of bias tool. We used a simpler 6-item ‘checklist’ for risk of bias assessment, to allow clinicians and students to conduct risk of bias assessments more easily. We are confident that the outcomes of the risk of bias assessment will not differ much from those of the QUADAS-2. We registered our protocol on the UTS data repository only, as we were unable to register the protocol in PROSPERO, which focusses on systematic reviews with a patient-related/relevant outcome.
Lastly, due to the limited data, we were only able to make recommendations about the ULTT1, and not the ULTT2 (a and b) and ULTT3.
Implications
Clinical implications. We proposed a standardised test procedure and positive diagnostic criteria for use in clinical practice. A more standardised set of test procedures will help clinicians by providing a consistent message to patients. Reproduction of symptoms and a decrease of symptoms when reducing the tension are the recommended criteria for a positive test.
Research implications. We suggest that future research should validate our proposed set of test procedures as well as our proposed set of positive diagnostic criteria. This may lead to greater consistency and standardisation between studies, reducing heterogeneity in treatment effectiveness studies, systematic reviews, and meta-analyses of test accuracy studies.
Conclusion
Although the ULTT continues to be used by clinicians, its diagnostic accuracy remains uncertain. Given that the ULTT is likely to remain in use, we have provided a recommendation on the test protocol, based on the nine studies we found, which includes: 1) stabilising the shoulder in abduction, 2) extending the wrist/fingers, 3) supinating the forearm, 4) externally rotating the shoulder, 5) extending the elbow, and 6) performing structural differentiation by lateral flexion (side bending) of the neck. The protocol aims to reproduce the symptoms and enables the clinician to evaluate whether symptoms increase or decrease when stressing or relaxing the nerves.
Author contributions
Conceptualization: APV; Data curation: APV; Formal analysis: all; Funding acquisition: NA; Investigation: all; Methodology: APV, DA; Project administration: APV, DA, HB; Resources: APV; Software: APV, DA; Supervision: APV, DA; Validation: APV, DA, MH; Writing: all; Review & editing: APV, DA, MH.
We thank Georgia Bisset (GB), Christian Blanda (CB), Cassandra Armenio (CA), Helen Dickson (HD), and Sarah Fensom (SF) for their assistance in the data extraction.
Declarations: This work has not been published previously and is not under consideration for publication elsewhere.