Systematic reviews provide the best evidence about the effectiveness of healthcare interventions. Although systematic reviews are conducted with explicit and transparent methods, discrepancies might occur between the protocol and the publication.
Objectives
To estimate the proportion of systematic reviews of physical therapy interventions that are registered, to assess the methodological quality of registered and unregistered systematic reviews, and to determine the prevalence of outcome reporting bias in registered systematic reviews.
Methods
A random sample of 150 systematic reviews published in 2015 and indexed on the PEDro database was selected. We included systematic reviews written in English, Italian, Portuguese and Spanish. The AMSTAR checklist was used to assess the methodological quality of the systematic reviews. Relative risk was calculated to explore the association between meta-analysis results and the changes in the outcomes.
Results
Twenty-nine (19%) systematic reviews were registered. Funding and publication in a journal with an impact factor higher than 5.0 were associated with registration. Registered systematic reviews demonstrated significantly higher methodological quality (median=8) than unregistered systematic reviews (median=5). Nine (31%) registered systematic reviews demonstrated discrepancies between protocol and publication, with no evidence that such discrepancies were applied to favor the statistical significance of the intervention (RR=1.16; 95% CI: 0.63–2.12).
Conclusion
A low proportion of systematic reviews in the physical therapy field are registered. The registered systematic reviews showed high methodological quality without evidence of outcome reporting bias. Further strategies should be implemented to encourage registration.
Systematic reviews (SRs) provide the best evidence to contribute to decision-making about the implementation of healthcare interventions.1 Although these studies are conducted with explicit and transparent methods, discrepancies might occur between the protocol and the publication. For example, authors might adapt the methods so that the SR generates more positive and statistically significant results, especially because some scientific journals tend to preferentially publish manuscripts with statistically significant results.2 This may affect the validity of the results by introducing bias, such as outcome reporting bias.3–5 Outcome reporting bias is defined as the selective reporting of a subset of the original outcomes, based on the results.3 One of the strategies suggested to reduce this bias is the prospective registration of protocols for SRs.6
Protocol registration has been increasingly recommended for clinical trials7 and SRs.8 A registry for protocols of SRs was first proposed by the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) statement in 2009,8 which resulted in subsequent development and implementation of the International Prospective Register of Systematic Reviews (PROSPERO). A protocol provides transparency and makes explicit the hypotheses, methods and analysis of the SR that is to be conducted. According to the Cochrane Handbook for Systematic Reviews of Interventions,1 a prospectively registered protocol reduces authors’ biases by publicly documenting the a priori planned methods. When necessary, changes may occur between protocol and publication. However, any changes should be: decided upon without calculating their effect on the results, applied as an amendment to the registered protocol at the time of the decision, and reported with explanation in the manuscript.1
A previous study has demonstrated that nearly one-third of a sample of the SRs registered on PROSPERO show discrepancies between the primary outcomes registered in the protocol and the primary outcome reported in the publication.9 Other studies in several specific fields have also revealed discrepancies between protocols and published SRs.10–12 As the prevalence of discrepancies differs between different fields of research, it is important to assess this issue in other disciplines, such as physical therapy. Discipline-specific data may also indicate which strategies might be most beneficial to control these discrepancies.
Launched in 1999, the Physiotherapy Evidence Database (PEDro) indexes published practice guidelines, SRs and randomized controlled trials to support an evidence-based approach in physical therapy. PEDro is one of the most complete databases for physical therapy publications.13,14 Pinto et al.15 analyzed 200 randomized controlled trials sampled from PEDro and identified that many: were not prospectively registered; were not registered at all; and/or had discrepancies between the registered protocol and the published report. However, a search of Google Scholar using the search terms “regist-”, “systematic review”, and “physiotherapy” or “physical therapy” did not identify any studies addressing the extent of registration of SRs in physical therapy. Therefore, the primary aims of the study were: (a) to estimate the proportion of SRs of physical therapy interventions that are registered, (b) to assess the methodological quality of registered and unregistered SRs of physical therapy interventions, and (c) to investigate whether outcome reporting bias is present in those SRs that have a registered protocol. As a secondary aim, we explored whether registration is associated with characteristics of SRs, including the geographical location of the authors, the impact factor of the journal, funding, and spin.
Methods
This study was a survey of SRs of physical therapy interventions. PEDro was used as the source of the SRs because it is considered one of the most complete databases of SRs of physical therapy interventions.13 From the total sample of SRs indexed in 2015, we randomly selected 150 reports using a random number function in Microsoft Excel software. The full texts were restricted to publications written in English, Italian, Portuguese and Spanish. The full-text published report of each SR was checked for a statement regarding registration or a registration number. If neither was identified, PROSPERO and the Cochrane Database of Systematic Reviews were searched using key terms contained in the review. Within each of these registers, the investigator searched for the citation details of the published report, including the title of the published report, any funding sources, and the first, second, and last authors. Registry entries were confirmed as being related to the published report by matching author, experimental and control interventions, review name, and country of origin. SRs for which evidence of registration was not found with this procedure were considered unregistered.
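The random selection step can be illustrated with a short Python sketch. This is not the Microsoft Excel procedure used in the study; the record identifiers, the pool size of 646 (taken from the Results), the function name and the fixed seed are all assumptions made for the illustration.

```python
import random


def select_random_sample(record_ids, sample_size, seed=None):
    """Draw a simple random sample without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return sorted(rng.sample(list(record_ids), sample_size))


# Hypothetical identifiers standing in for the SRs indexed on PEDro.
pool = range(1, 647)
sample = select_random_sample(pool, 150, seed=2015)
```

Recording the seed alongside the list of selected records would allow the same sample to be redrawn later.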
The sample size of 150 was selected because it provides adequate statistical power for the main estimates generated by the study. To estimate the proportion of registered SRs, a sample size of 150 ensures that the margin of error around the estimate will be 8% or less. Furthermore, to compare the methodological quality of registered and unregistered SRs, a sample size of 150 provides 80% power to detect a difference of 1.4 points as statistically significant (α=0.05), assuming a standard deviation of 3.0 points on the checklist for assessing the methodological quality of systematic reviews (AMSTAR) (i.e. from 0 to 11) among the reviews.16 No sample size calculation was performed for the analyses of the association between registration and the other characteristics of the SRs, because these analyses were secondary to the estimate of the proportion of registered SRs.
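The two sample size statements can be checked approximately with the sketch below. It assumes a 95% confidence level, the worst-case proportion of 0.5, a normal approximation, and two groups of equal size; none of these assumptions is stated in the text, and the function names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(n, p=0.5, confidence=0.95):
    """Half-width of a normal-approximation confidence interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p * (1 - p) / n)


def two_group_power(n_total, difference, sd, alpha=0.05):
    """Approximate power of a two-sided comparison of two equal-sized groups.

    Normal approximation; the negligible contribution of the opposite tail is ignored.
    """
    n_per_group = n_total / 2  # assumption: equal group sizes
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    noncentrality = (difference / sd) * sqrt(n_per_group / 2)
    return NormalDist().cdf(noncentrality - z_alpha)


moe = margin_of_error(150)
power = two_group_power(150, difference=1.4, sd=3.0)
```

Under these assumptions the margin of error is about 8 percentage points and the power is slightly above 80%. With unequal groups, such as the 121 and 29 reviews observed in this survey, the power for the same difference would be lower.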
Data extraction
Data extracted from all included SRs
Two independent assessors (D.O.S. and I.R.L.) performed the data extraction. For each included SR, the data extracted included: the geographical location of the first authors (i.e. categorized by continent), funding, the 2015 impact factor of the journal17 and the presence of spin. SRs were categorized as funded or not funded, including any financial support given to the authors. The assessment of spin or misrepresentation of study findings was performed in all 150 SRs included in this survey. Spin is defined as specific reporting strategies used by authors to convince readers that the beneficial effect of the treatment of interest is greater than shown by the results.18 To check for the presence of spin, two assessors independently judged whether the review's conclusion reported in the abstract or in the main text of the review was consistent or inconsistent with the findings reported in the results section. The review was considered to have spin when the information presented was judged to be inconsistent. If needed, a third assessor arbitrated disagreements. The subdiscipline of physical therapy available in the PEDro database for each record was also extracted. The 11 available subdisciplines are index terms for searching the database (further details for each category are available at http://www.pedro.org.au/english/downloads/codes/). Given that each review can be assigned to multiple subdisciplines, the assessors reached consensus on the most appropriate subdiscipline for those SRs with more than one subdiscipline.
Data extraction from registered SRs and their protocols
For each registered SR and its protocol, two independent assessors extracted the primary outcomes; meta-analysis results for each primary outcome (p-value); discrepancies in primary outcomes between the registered protocol and the published report; and, if reported, the reasons for any change in the primary outcomes.
An outcome was considered to be ‘primary’ if the SR described the outcome as a “primary”, “key”, or “main” outcome in the publication. Where these words were not used to identify the primary outcome, we used a decision-tree approach based on three criteria9,19: outcome listed in the title; outcome listed in the objective; and the most serious outcome. Given that PROSPERO, for instance, allows authors to make changes to the protocol after registration, we extracted data from the most recent version of the protocol.
The meta-analysis results were extracted for each primary outcome listed in the protocol and publication. The results of the meta-analysis were classified as favorable and statistically significant when the effect favored the intervention and p<0.05. When the SR reported multiple intervention comparisons, we used the hierarchy reported by Kirkham et al.20 We selected the primary review comparison using the following criteria: (1) the intervention comparison described in the protocol as primary review comparison; (2) the first intervention comparison reported in the objectives of the protocol; (3) the intervention comparison described in the publication as the primary review comparison; (4) the first intervention comparison reported in the objectives of the publication; and (5) the first intervention comparison reported in the review. The meta-analysis results were extracted for each time-point assessment (i.e. short-term, intermediate or long-term) of the primary review comparison.
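The five-step hierarchy for choosing the primary review comparison amounts to taking the first criterion that yields a comparison. A minimal sketch follows; the dictionary keys, the function name and the example comparisons are hypothetical labels introduced here, not terms from Kirkham et al.

```python
# Criteria in the order of priority described above (labels are illustrative).
HIERARCHY = (
    "protocol_primary",             # (1) primary comparison stated in the protocol
    "protocol_first_objective",     # (2) first comparison in the protocol objectives
    "publication_primary",          # (3) primary comparison stated in the publication
    "publication_first_objective",  # (4) first comparison in the publication objectives
    "first_reported",               # (5) first comparison reported in the review
)


def select_primary_comparison(candidates):
    """Return (criterion, comparison) for the first criterion that is available."""
    for criterion in HIERARCHY:
        comparison = candidates.get(criterion)
        if comparison:
            return criterion, comparison
    return None


# Hypothetical review: the protocol does not name a primary comparison.
example = {
    "protocol_primary": None,
    "protocol_first_objective": "exercise vs. usual care",
    "first_reported": "exercise vs. placebo",
}
selected = select_primary_comparison(example)
```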
The primary outcomes from the published reports of the registered SRs were compared with those listed in the corresponding protocols to identify changes. To classify the discrepancies in the primary outcomes, we used a classification system published elsewhere.9,20 The discrepancies in the primary outcomes were classified as: new (inclusion of a new primary outcome in the published review not listed in the protocol); exclusion (i.e. exclusion of a primary outcome from the published review that was listed in the protocol); upgrade (i.e. a secondary outcome in the protocol was considered primary outcome in the published review); or downgrade (i.e. primary outcome in the protocol was considered a secondary outcome in the published review). The SRs that reported multiple primary outcomes were each classified according to the criteria above. When the SR had a discrepancy in the primary outcomes, we checked in the published review whether the authors clearly stated the reasons for making changes to the primary outcomes.
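The four discrepancy categories can be expressed as set comparisons between the protocol and the published review. The sketch below assumes that outcomes have already been matched by name, which in practice requires assessor judgment; the function name and the example outcomes are hypothetical.

```python
def classify_discrepancies(protocol_primary, protocol_secondary,
                           published_primary, published_secondary):
    """Label each primary outcome that differs between protocol and publication."""
    protocol_primary = set(protocol_primary)
    protocol_secondary = set(protocol_secondary)
    published_primary = set(published_primary)
    published_secondary = set(published_secondary)

    labels = {}
    # Primary in the publication but not primary in the protocol.
    for outcome in published_primary - protocol_primary:
        labels[outcome] = "upgrade" if outcome in protocol_secondary else "new"
    # Primary in the protocol but not primary in the publication.
    for outcome in protocol_primary - published_primary:
        labels[outcome] = "downgrade" if outcome in published_secondary else "exclusion"
    return labels


# Hypothetical review with multiple discrepancies.
labels = classify_discrepancies(
    protocol_primary=["pain", "disability"],
    protocol_secondary=["quality of life"],
    published_primary=["pain", "quality of life", "fatigue"],
    published_secondary=["disability"],
)
```

An empty result corresponds to a review with no discrepancies in its primary outcomes.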
Methodological quality
The AMSTAR tool was used to assess the methodological quality of the included systematic reviews by two independent assessors (C.B.O. and R.V.B.), with a third assessor available to be consulted in case of disagreement. AMSTAR is a reliable and valid tool21 comprising 11 yes-or-no items, with the final score ranging from 0 to 11. The 11 items are: a priori design, duplicate selection, literature search, publication status, list of studies, study characteristics, quality assessed, quality used, methods appropriate, publication bias assessed, and conflicts stated. We calculated the inter-rater reliability of this tool for the two assessors who evaluated the methodological quality of all included SRs. The intraclass correlation coefficient (ICC2,1) was 0.81 (95% confidence interval [CI]: 0.74–0.85), which was interpreted as good reliability according to the benchmarks of Fleiss.22
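Because each AMSTAR item is answered yes or no, the total score is simply the number of items met. A minimal sketch, using the item labels listed above; the function name and the example ratings are hypothetical.

```python
AMSTAR_ITEMS = (
    "a priori design", "duplicate selection", "literature search",
    "publication status", "list of studies", "study characteristics",
    "quality assessed", "quality used", "methods appropriate",
    "publication bias assessed", "conflicts stated",
)


def amstar_score(answers):
    """Count the items rated 'yes' (True); every item must have been rated."""
    missing = [item for item in AMSTAR_ITEMS if item not in answers]
    if missing:
        raise ValueError(f"Unrated AMSTAR items: {missing}")
    return sum(1 for item in AMSTAR_ITEMS if answers[item])


# Hypothetical ratings: every item met except the three least frequently met ones.
ratings = {item: True for item in AMSTAR_ITEMS}
for item in ("conflicts stated", "list of studies", "publication status"):
    ratings[item] = False
score = amstar_score(ratings)
```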
Data analysis
Descriptive statistics were used to summarize the characteristics of the SRs and the proportion that were registered. Depending on the data distribution, an independent t-test or a Mann–Whitney U test was performed to compare the methodological quality of registered and unregistered SRs. The association between each characteristic of the SRs and registration was assessed using the relative risk (RR) and its 95% CI. The discrepancies in the primary outcome were categorized by their influence on the statistical significance of the primary outcome(s). For instance, if we identified a greater proportion of significant results among the primary outcomes in the publication than among the primary outcomes in the protocol, we judged this result as changing from non-significant to significant. Conversely, if we observed a greater proportion of significant results among the primary outcomes in the protocol than among those in the publication, we judged this result as changing from significant to non-significant. In addition, RR and 95% CI were calculated to explore the association between meta-analysis results and the changes in the outcomes.9 For this analysis, the changes in the primary outcome were dichotomized into ‘no discrepancies’ vs. ‘changes in the primary outcome’ and the meta-analysis results were dichotomized into ‘favorable and statistically significant’ vs. ‘other categories’. The formula of the RR is described in Fig. 1. An RR greater than 1 indicates that favorable and statistically significant results were more common among outcomes with discrepancies, that is, that discrepancies tended to favor the statistical significance of the intervention.
Relative risk (RR) formula. Legend: a – outcomes discrepancy and favorable meta-analysis results, b – outcomes without discrepancy and favorable meta-analysis results, c – outcomes discrepancy and non-favorable meta-analysis results, and d – outcomes without discrepancy and non-favorable meta-analysis results.
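The figure itself is not reproduced in this text, so the sketch below reconstructs the calculation from the legend as RR = [a/(a+c)] / [b/(b+d)], the usual relative risk for a 2×2 table. The confidence interval uses the common log-transformation method, which is an assumption because the text does not state how the interval was obtained. The counts in the example are hypothetical and are not the study data.

```python
from math import exp, log, sqrt


def relative_risk(a, b, c, d, z=1.96):
    """Relative risk of a favorable result, with vs. without a discrepancy.

    a: discrepancy, favorable        b: no discrepancy, favorable
    c: discrepancy, non-favorable    d: no discrepancy, non-favorable
    All four counts must be non-zero for this interval to be defined.
    """
    risk_with = a / (a + c)
    risk_without = b / (b + d)
    rr = risk_with / risk_without
    # Standard error of log(RR); assumed method, not stated in the text.
    se_log_rr = sqrt(1 / a - 1 / (a + c) + 1 / b - 1 / (b + d))
    lower = exp(log(rr) - z * se_log_rr)
    upper = exp(log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts, chosen only to illustrate the calculation.
rr, lower, upper = relative_risk(a=10, b=20, c=10, d=20)
```

When the interval includes 1, as in this hypothetical example, the data do not show that favorable results are more frequent among outcomes with discrepancies.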
At the time this study was conducted (September, 2016), PEDro indexed 646 SRs that had been published in 2015. We selected a random sample of 150 published SRs, which equated to 23% of the SRs. Cochrane reviews accounted for 15 of the 150 reviews. The prevalence of registration among the SRs was 19% (29 SRs). The characteristics of the whole sample and of the registered and unregistered SRs specifically are presented in Table 1.
Characteristics of the total cohort of selected systematic reviews, and of the subgroups of registered and unregistered systematic reviews.
Characteristics | Total reviews, n (%) | Unregistered reviews, n (%) | Registered reviews, n (%) |
---|---|---|---|
n | 150 | 121 | 29 |
Continent | | | |
Asia | 32 (21) | 28 (23) | 4 (14) |
Europe | 56 (37) | 42 (35) | 14 (48) |
North America | 35 (23) | 31 (25) | 4 (14) |
Oceania | 19 (13) | 14 (12) | 5 (17) |
South America | 8 (6) | 6 (5) | 2 (7) |
Subdiscipline | | | |
Cardiothoracics | 23 (15) | 18 (15) | 5 (17) |
Continence and women's health | 13 (8) | 10 (8) | 3 (10) |
Ergonomics and occupational health | 1 (1) | 0 (0) | 1 (4) |
Gerontology | 11 (7) | 8 (7) | 3 (10) |
Musculoskeletal | 39 (26) | 31 (25) | 8 (28) |
Neurology | 20 (13) | 17 (14) | 3 (10) |
Oncology | 7 (5) | 7 (6) | 0 (0) |
Orthopedics | 7 (5) | 7 (6) | 0 (0) |
Pediatrics | 9 (6) | 7 (6) | 2 (7) |
Sports | 1 (1) | 1 (1) | 0 (0) |
No applicable subdiscipline | 19 (13) | 15 (12) | 4 (14) |
Journal's impact factor | | | |
≤2 | 64 (42) | 57 (47) | 7 (24) |
>2 to ≤5 | 58 (39) | 51 (42) | 7 (24) |
>5 | 28 (19) | 13 (11) | 15 (52) |
Fig. 2 describes the association between the SRs’ characteristics (i.e. continents, funding, spin and impact factor of the journal) and registration. A significant association between registration and funding was found. There was also a significant association between registration and publication of the SR in a journal with an impact factor higher than 5.0.
The median (interquartile range [IQR]) of methodological quality for the published SRs was 6.0 (4.0–7.0). Regarding subdisciplines, the median (IQR) of methodological quality ranged from 3.0 (2.0–4.0) for orthopedics to 7.0 (5.5–8.0) for continence and women's health (Fig. 3). Although the subdiscipline Sports had the highest methodological quality of 9, this was based on only one SR. The criteria that were met by the fewest SRs were “conflicts stated” (12%), “list of studies” (19%) and “publication status” (22%). In addition, the registered SRs had significantly higher methodological quality (median=8, IQR: 7–10) than the unregistered SRs (median=5, IQR: 4–6) (p<0.01).
Among the 29 SRs that had been registered, 16 (55%) were registered in PROSPERO and 13 (45%) in the Cochrane Database of Systematic Reviews. Twenty-four of the registered SRs (83%) specified the primary outcome in the publication. The remaining primary outcomes were derived from the title (n=3), objectives (n=1) and most serious outcomes (n=1), through the decision-tree approach. Seventeen registered SRs reported results of 52 meta-analyses. Of these, 23 (44%) meta-analyses estimated favorable and significant effects. Twenty (69%) SRs showed no discrepancies between protocol and publication in terms of primary outcomes. Among the 9 SRs (2 Cochrane reviews and 7 non-Cochrane reviews) that changed the primary outcomes between the protocol and the publication, 3 (34%) upgraded an outcome, 2 (22%) included a new outcome, 2 (22%) excluded a primary outcome, and 2 (22%) showed multiple discrepancies (one upgraded an outcome and included a new primary outcome; the other upgraded and downgraded outcomes). Of these 9 SRs, only the two Cochrane SRs (22%) reported the reasons for changes from the protocol to the publication. The reasons provided in the publication for the changes were “(…) to facilitate standardization of outcomes between reviews on fibromyalgia (…)”23 and “(…) we judged this outcome to be important to the understanding of (…)”.24
Considering the 9 SRs with discrepancies regarding the primary outcomes, four (44%) SRs changed from non-significant to significant, one SR (12%) changed from significant to non-significant and 4 (44%) remained with similar significance. When these data were analyzed statistically, no evidence was found that discrepancies in the primary outcomes were made to favor the statistical significance of the intervention (RR=1.16; 95% CI: 0.63–2.12).
Discussion
This survey found a low prevalence of registration of SRs in the physical therapy field, with only one fifth being registered. Registration was more likely among funded SRs and among SRs published in journals with a high impact factor. Importantly, registered SRs had significantly higher quality compared to unregistered SRs. Our findings also indicate that although a third of the registered SRs demonstrated discrepancies between protocol and publication, no evidence of outcome reporting bias was found.
The proportion of registered SRs in our survey was 19%, which is within the range of 3% to 34% reported by other studies that estimated the prevalence in specific disciplines.11,25,26 Although Cochrane reviews only accounted for 10% of the reviews in the cohort of 150, they contributed nearly half (13 of 29) of the registered reviews, because the Cochrane Collaboration's methodology involves publishing an a priori protocol. For instance, when considering just Cochrane reviews, only two protocols (13%) were not found in the protocol section of the Cochrane Library. This finding is in line with other studies including Cochrane reviews, which also reported a high proportion of registration.6,11,20 On the other hand, when considering non-Cochrane reviews there was a low proportion of registration (11%), which is also consistent with the literature.27 Scientific journals should change editorial policies to encourage the prospective registration of non-Cochrane SRs on the PROSPERO database.
Our results revealed that funded SRs were more likely to be registered. The process of awarding competitive funding often involves the consideration of the methodological quality of the project as well as the research team's experience in the field. This may have led to funding being awarded to authors who are aware of the importance of adhering to the principles of transparency in scientific research, including the prospective registration of SRs. In addition, we found that registration was more likely among journals with a higher impact factor. This finding might be attributed to the substantial proportion of Cochrane reviews in our survey, because these reviews are published in a journal with an impact factor of 6.24 (i.e. according to the Journal Citation Reports 201517) which often requires prospective publication of SR protocols. However, similar results were demonstrated in previous studies considering registered SRs28 as well as randomized controlled trials.29 Pinto et al.15 also found a higher prevalence of registered randomized trials in the physical therapy field published in high-impact factor journals compared to those published elsewhere. In our survey, 50% of registered SRs (n=14) were published in journals with an impact factor higher than 5.0. Interestingly, only one of these journals does not clearly recommend that authors prospectively register SR protocols. Researchers in physical therapy who hope to publish in a journal with a high impact factor should prospectively register their SR's protocol and aim to keep their review consistent with that protocol.
We also identified discrepancies between protocol and publication in 31% (n=9) of the registered SRs. The proportion of discrepancies found in this survey is in agreement with previous studies, with estimates ranging from 32% to 64%9–11,20 for SRs and 14% to 40%29–31 for RCTs. Regarding discrepancies, we found that 34% upgraded an outcome, 22% included a new outcome, 22% excluded a primary outcome and 22% had multiple discrepancies. Among previous studies in this area, the most prevalent types of discrepancy were reported as inclusion of a new primary outcome11 and downgrade of a primary outcome.9 Although our results showed that the discrepancies were not associated with a favorable and significant meta-analysis result, two caveats need to be recognized. First, our analysis was limited by the low number of registered SRs. Second, the prevalence of discrepancies among unregistered SRs might be anticipated to be higher, because there is no protocol to expose selective outcome reporting. Therefore, strategies to discourage such discrepancies should be adopted by editorial boards. Prospective registration and consistency between protocol and publication should be encouraged or, better still, demanded by author guidelines, reviewers, and editors.
A limitation of the current study is that, given that this survey included a random sample of 150 SRs published in 2015, we cannot generalize our results to all SRs indexed on the PEDro database. Another possible limitation is that the sample used in this survey was retrieved from the PEDro database only. However, the proportion of SRs of physical therapy interventions that are not indexed on PEDro is approximately 8%,13 so any influence on generalizability is likely to be small.
Initiatives promoted in recent years, including the PROSPERO32 and PRISMA-P33 developments, might increase the registration of SRs in the physical therapy field. Furthermore, the member journals of the International Society of Physiotherapy Journal Editors could require that authors demonstrate a prospectively registered protocol before submission of an SR. Considering that the proportion of registered SRs might increase in a few years, further studies should investigate the factors associated with discrepancies and selective reporting bias.
The findings of this study indicate that a low proportion of SRs in the physical therapy field are registered. Registration was more common among funded SRs and SRs that were published in journals with a high impact factor. The methodological quality of SRs in physical therapy is typically moderate. Nearly one third of registered SRs showed discrepancies, and there was no significant evidence of outcome reporting bias, although few SRs were eligible for this analysis. Further strategies should be incorporated to stimulate the prospective registration of a pre-planned protocol, reducing the occurrence of discrepancies and specific biases.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Conflicts of interest
The authors declare no conflicts of interest.
C.B.O., I.R.L., D.O.S. and R.V.B. are funded by São Paulo Research Foundation (FAPESP) [grant numbers 2016/03826-5, 2015/17777-3, 2015/11534-1, 2015/00406-2].