The initial database search returned 5443 hits, from which 277 studies were identified as potentially relevant on the basis of their titles and abstracts.
The reference lists of these 130 papers were then scanned for additional relevant studies not identified by the initial search. This, along with assessment of four existing systematic reviews, provided an additional 23 studies for inclusion. The total number of included studies was therefore 153. Fig 2 provides a flow diagram of this process, and a full list of included studies with references is available as a web appendix and on MHIN.
Data were extracted on the disorders screened for, screening tools, gold standards, tool administrators, study settings, study populations, sample sizes and psychometric properties of the screening tools. Data extraction forms were piloted and finalised in March.
The key measures of screening tool performance were area under the receiver operating characteristic (ROC) curve, sensitivity and specificity. Where available, data were also extracted on predictive values, correct classification rate, internal consistency and test-retest reliability. Where data were not reported in the published paper, authors were contacted by email. Information obtained in this manner was added to the data table available as a web appendix, on figshare and on MHIN.
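For readers less familiar with these measures, all of them derive from a two-by-two table of screening results against gold standard diagnosis. The following is a minimal sketch in Python; the function name and the counts are hypothetical illustrations, not values from any included study.

```python
def screening_metrics(tp, fn, fp, tn):
    """Compute core screening-performance measures from a 2x2 table.

    tp: screen-positive and gold-standard positive (true positives)
    fn: screen-negative but gold-standard positive (false negatives)
    fp: screen-positive but gold-standard negative (false positives)
    tn: screen-negative and gold-standard negative (true negatives)
    """
    return {
        "sensitivity": tp / (tp + fn),   # proportion of cases screened positive
        "specificity": tn / (tn + fp),   # proportion of non-cases screened negative
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "correct_classification": (tp + tn) / (tp + fn + fp + tn),
    }

# Hypothetical example: 200 respondents screened against a diagnostic interview.
m = screening_metrics(tp=80, fn=20, fp=10, tn=90)
```

Note that, unlike sensitivity and specificity, the predictive values depend on the prevalence of the disorder in the screened population, which is one reason validation in the target setting matters.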
The search terms listed in Fig 1 were used to identify relevant papers from the EMBASE, Global Health, MEDLINE, PsycEXTRA and PsycINFO databases.
The search was run on 11th December 2013 and results were not restricted by publication date or language.
This review employed broad inclusion criteria and relaxed quality criteria, identifying 153 studies for inclusion.
Despite the large total number of studies, there are significant biases in the existing evidence. While there is plentiful evidence for screening tool validity for particular disorders in particular settings (for example, depression in Brazil), huge gaps remain. There are over 100 LMIC for which no CMD screening tool validation study was identified for inclusion in this review. Further research is required to test screening tool validity in most LMIC settings and populations, particularly in those countries for which no validation studies have been conducted. There is a particular shortage of studies validating tools for use in community populations, and this review highlights a lack of attention to child and adolescent mental health. Screening tools for depression were much more widely validated than those for anxiety and PTSD, or for common mental disorders more broadly. These gaps should be made priorities for future research. We also recommend a shift in focus away from screening tools for narrowly defined disorders, and encourage the development of better transdiagnostic screens.
Studies that met all the quality criteria were considered to be of 'very good' quality.
Those that met criteria 1, 2 and 3 and at least one of criteria 4, 5 and 6 were classed as 'good' quality. Studies were classified as 'fair' quality if they failed to avoid work-up bias, or if they avoided work-up bias but did not meet any of criteria 4 to 6. Those that did not perform receiver operating characteristic curve analysis to identify the most appropriate cut-off point were classified as 'acceptable' quality. Studies in which the screening tool administrators and diagnostic interviewers were not blinded to each other's results, or for which we were unable to ascertain whether this was the case, were recorded as unblinded.
Most of the top performing tools included in this review are those that were locally adapted.
Where possible, a screening tool's validity should therefore be improved through local adaptation. Focus group discussions may be conducted with representatives of the population in which the screening tool is to be implemented, with two key aims. The first is to ensure that all questions are correctly understood and that none cause discomfort to either interviewers or respondents. The second is to better understand local experience and expression of mental illness, allowing local idioms of distress to be incorporated into the questionnaire.
Cross-cultural application of a screening tool requires that its validity be assessed against a gold standard diagnostic interview. To date there has been no review of screening tools for all CMDs across all LMIC populations, and validation studies of brief CMD screening tools have been conducted in only a few LMIC. By conducting a comprehensive systematic review of studies validating brief CMD screening tools for use in LMIC, we provide researchers, policy makers and health care providers with a comprehensive, evidence-based summary of the most appropriate CMD screening tools for particular settings and populations. We aim to provide a 'one-stop shop' at which researchers can compare the performance of screening tools in settings and populations similar to those in which they work, and identify appropriate cut-off points for probable diagnosis of the disorders of interest. All results of the validation studies identified by this review are presented on the Mental Health Innovation Network (MHIN).
CMD screening tools validated for use in LMIC populations.
For each validation, the diagnostic odds ratio (DOR) was calculated as an easily comparable measure of screening tool validity. Average DOR results weighted by sample size were calculated for each screening tool, enabling us to make broad recommendations about the best performing screening tools. We identified validation studies for 25 different tools screening for any CMD. Of these, the GHQ-12 and SRQ-20 demonstrate the strongest psychometric properties. The SRQ-20's binomial response format makes it particularly valuable for CMD screening in LMIC: it can be effectively administered by lay interviewers with only minimal training, and is easily understood and completed by respondents with low literacy. Unlike the GHQ-30, the GHQ-12 and HADS are particularly appropriate for detecting psychiatric morbidity in physically ill patients because they do not include questions about somatic symptoms.
We included screening tool validation studies for any CMD included in the World Health Organisation's International Classification of Diseases, version 10. The EPDS consistently performs very well as a screen for postnatal depression. Two qualities that make it particularly suitable for use in LMIC are its brevity and its avoidance of the word 'depression'. Comprising just ten items, the EPDS is relatively quick to complete and can be easily incorporated into existing postnatal services. Although it performs much less well in other populations, we strongly recommend the EPDS to screen for depression in postnatal women.
Although this review provides a comprehensive summary of the existing literature, and can therefore recommend screening tools that should perform well in a given setting, local validation should still be conducted wherever possible.
Where the resources exist to do so, a pilot study should always be carried out to validate the chosen screening tool against a gold standard diagnostic interview, confirming its validity for the study population. A few additional studies were identified through hand searching, suggesting a limitation in the search strategy. Although the strategy was developed in consultation with a qualified librarian, future reviews may need to adapt the search terms to improve sensitivity. Possible issues to consider include the varied terminology used to describe common mental disorders.
This review finds that 11 different anxiety disorder screening tools have been validated for use in LMIC. Of these, the HADS-A performs notably better than the others. The HADS is unusual in its ability to detect specific mood states, and we particularly recommend its anxiety subscale as a screen for anxiety disorders. To maximise the review's breadth, all studies which met the above inclusion criteria were included irrespective of their methodological quality. All study settings and population groups were included, and details of each study are fully reported in the data extraction tables. This approach was designed to maximise the utility of the study's findings. It enables researchers and health care providers to consider the full range of circumstances for which tools are validated, and to identify which tools perform best in the settings and populations that best reflect their own research or clinical context.
Brief screening tools are essential for improving mental health care in low and middle income countries.
Many health workers have neither the time nor the training to administer complex diagnostic interviews to all individuals at risk of psychiatric illness. Adopting appropriate screening instruments is therefore an important first step in integrating care for CMDs into existing primary health care services, particularly those attended by high risk populations, such as HIV or maternity clinics. Although it would be inappropriate to recommend 'best' screening tools for use in all LMIC, we can make broad statements about the tools' psychometric properties and their overall relative performance. Table 5 presents the weighted diagnostic odds ratio (DOR) of screening tools for which more than one study examined the tool's ability to screen for a particular diagnosis. Table 5 categorises the validity of these screening tools according to their diagnostic odds ratio: DOR≥50 for very strong validity, 50>DOR≥20 for strong, 20>DOR≥10 for fair and 10>DOR for weak.
This study employed robust methods, including 10% double screening at every search stage and standardised use of quality assessment criteria. Five databases were systematically searched and all potentially relevant papers were read in full. The decision to work with relatively limited exclusion criteria has maximised the scope of the evidence covered here for reference by researchers, policy makers and health workers. All abstracts returned by the database search were reviewed for possible inclusion. Full texts were retrieved for those identified as potentially relevant, and these were assessed for inclusion using the criteria below. The reference lists of all studies that met these criteria were then used to identify additional studies for inclusion, as were all systematic reviews identified by the initial search. The second author repeated 10% of the study selection process at every stage, in order to reduce bias caused by human error. Rates of agreement between the two reviewers were consistently high, with any discrepancies resolved through discussion.
To be eligible for inclusion, a study must have compared the screening tool's performance with that of a recognised gold standard.
The preferred gold standard was diagnostic assessment by a mental health professional. Where the gold standard diagnosis was made by a lay interviewer or general medical professional, the study was deemed acceptable only if a well structured diagnostic interview suitable for delivery by a non-mental health professional was employed. The 153 included studies correspond to 273 screening tool validations, as a few studies validate multiple tools. Of these, 61 validate tools for any CMD, 175 for depressive disorders, 24 for anxiety disorders, and 13 for PTSD. Table 2 presents the CMDs for which screening tool validation studies were identified for this review.
It is difficult to draw reliable conclusions from such a heterogeneous set of studies. To facilitate future reviews comparing screening tool validity, a set of methodological standards should be agreed upon for validation studies. We suggest that the screening tool be administered by a lay interviewer or general health worker, and the gold standard diagnostic interview by a mental health professional. To address concerns about the validity of so-called 'gold standards', we also recommend validation of diagnostic interviews in LMIC populations. Most of the top performing tools were developed or adapted for specific populations; however, as these tools were only validated in one study, it is not possible to draw broader conclusions about their applicability in other contexts. Of the tools that have been validated in multiple settings, the authors broadly recommend using the SRQ-20 to screen for general CMDs, the GHQ-12 for CMDs in people with physical illness, the HADS-D for depressive disorders, the PHQ-9 for depressive disorders in populations with good literacy levels, the EPDS for perinatal depressive disorders, and the HADS-A for anxiety disorders.
We aimed to conduct a high quality systematic review of studies validating brief CMD screening tools for use in LMIC populations, using criterion validity as the outcome measure and gold standard diagnostic assessment as the comparison.
The validity of the conclusions drawn from this review is limited by the quality of the included studies. We were not able to calculate confidence intervals for weighted diagnostic odds ratios, as very few included studies reported confidence intervals for sensitivity and specificity. That some studies failed to guard against expectation or work-up bias is also cause for concern. Although sample size was integrated into the average DOR calculations, overall quality scores were not taken into account, which we recognise as a limitation. We were also unable to conduct sensitivity analyses to explore the effect of study quality, given the low number of studies for each tool and the large amount of between-study heterogeneity. Full details of each study's quality are provided as an online appendix and on MHIN, allowing readers to consider the quality of individual studies conducted in the contexts that best reflect their own. As well as the risk of bias within studies, there is a risk of publication bias across studies owing to the tendency to over-report positive findings, though this bias is likely to be similar for all screening tools and should therefore have little effect on relative validity.
Our results reinforce the importance of validating brief CMD screening tools for the particular populations and settings in which they are being applied.
They demonstrate that a screening tool's ability to accurately detect CMDs can vary significantly according to the population in which it is administered, as can the most appropriate cut-off point for positive/negative classification. Our primary recommendation is that, wherever possible, a chosen screening tool should be validated against a gold standard diagnostic assessment in the specific context in which it will be employed. Where this is not possible, health care professionals, researchers and policy makers can refer to the database of validation studies provided in the web appendix, on figshare and on MHIN, to identify previous studies conducted in the region, country, population group and research setting of interest.
Inclusion was restricted to studies that validated the screening tool against a gold standard diagnostic interview. As with the brief screening tools we are interested in validating, these gold standards were developed for use in high income country populations. It is therefore important to consider whether it is appropriate to treat them as a true gold standard for diagnosing CMDs in LMIC, and the extent to which this might limit our findings. Brief CMD screening tools can also be used to enhance research and training in LMIC. The availability of short, simple tools will encourage researchers to screen for CMDs in their study populations, facilitating research into the effects of untreated mental illness on priority health, economic and social issues. Screening tools can also be used as part of a mental health training package for PHC workers. By providing a succinct overview of symptoms, they teach health workers what to look for and thus improve their ability to detect mental health problems.
Study quality was assessed against a modified version of Greenhalgh's ten-item checklist for papers reporting validations of diagnostic or screening tests.
The final quality criteria employed for this review are presented in Table 1 below. Screening tool validity is the extent to which an instrument measures what it claims to measure. For cross-cultural application of a screening tool, it is most important to assess criterion validity. This involves comparing the results of a screening tool to those of a recognised gold standard, defined as 'a relatively irrefutable standard that constitutes recognized and accepted evidence that a certain disease exists'. The most reliable gold standards employed in cross-cultural mental health research are diagnostic interviews conducted by qualified mental health professionals.
Significant gaps remain, and validation studies of brief CMD screening tools have been conducted in only a few LMIC.
Until now there has been no pooled resource from which researchers or implementers can identify the best performing tools for their needs. A few smaller reviews have been conducted of screening tool validation studies for particular disorders, settings or populations, such as perinatal depression in Africa, depression in Spanish-speaking populations, and depression in Chinese older adults, but to date there has been no review of screening tools for all CMDs across all LMIC populations. The project presented here updates and builds upon a 2012 systematic review of depression screening tools validated in LMIC, which identified 19 studies.
The DOR is a measure of screening tool effectiveness.
It is defined as the odds of screening positive among true cases relative to the odds of screening positive among true non-cases. Possible results range from 0 to infinity, with higher ratios indicating a better performing test; the DOR increases very steeply as sensitivity and specificity tend towards 100%. The following cut-offs were applied to rate screening tool validity: DOR≥50 for very strong validity, 50>DOR≥20 for strong, 20>DOR≥10 for fair and 10>DOR for weak. Although identical DORs can correspond to different combinations of sensitivity and specificity, the DOR provides an acceptable comparison of the tools included in this review, because for each validation we have reported the psychometric properties of the tool at the cut-off point that best balances sensitivity and specificity, as determined by ROC analysis.
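This definition can be expressed directly in terms of sensitivity and specificity, since DOR = [sens/(1-sens)] / [(1-spec)/spec]. The following sketch in Python illustrates the calculation and the validity bands above; the function names and the example values are illustrative, not drawn from any included study.

```python
def diagnostic_odds_ratio(sensitivity, specificity):
    """DOR: odds of screening positive among true cases divided by
    the odds of screening positive among true non-cases."""
    return (sensitivity / (1 - sensitivity)) / ((1 - specificity) / specificity)

def rate_validity(dor):
    """Apply the review's bands: >=50 very strong, >=20 strong, >=10 fair, else weak."""
    if dor >= 50:
        return "very strong"
    if dor >= 20:
        return "strong"
    if dor >= 10:
        return "fair"
    return "weak"

# A tool with 80% sensitivity and 90% specificity:
dor = diagnostic_odds_ratio(0.8, 0.9)  # approximately 36, i.e. 'strong' validity
```

The steep growth near the top of the scale is visible here: raising both sensitivity and specificity from 0.8/0.9 to 0.95/0.95 multiplies the DOR roughly tenfold.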
Few screening tools have been specifically developed for LMIC populations.
There is concern that using tools developed for high-income country populations will miss cases in LMIC. In sub-Saharan Africa, for example, distress is believed to be more commonly expressed through somatic symptoms and local idioms. Although CMDs are prevalent in all regions worldwide, clinical presentation does differ between settings. Previous validations of the Edinburgh Postnatal Depression Scale in LMIC have generally found lower optimum cut-off scores than those recommended for the populations in which the tools were developed. This may be due to cross-cultural differences in somatization of symptoms and expression of emotional distress, leading to under-recognition or misidentification of psychiatric morbidity.
The evidence on screening tool validity for PTSD in LMIC is very scarce. This is particularly concerning given the research and clinical interest in the traumatic effects of humanitarian crises. Studying these issues requires measurement methods that have been appropriately validated for these intensely vulnerable populations. We identified validity studies for ten PTSD screening tools in LMIC. In the absence of sufficient evidence, we only provisionally recommend continued use of what is currently the most widely validated tool in LMIC. Average DOR results weighted by sample size were calculated for each screening tool to compare their effectiveness. Where a study reported a significantly higher DOR than the norm, a second weighted average was calculated excluding the outlier, to produce a more reliable estimate of the screening tool's validity in most LMIC settings and populations.
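The sample-size weighting and outlier handling described above can be sketched as follows; the study sizes and DOR values here are hypothetical illustrations, not results from the review.

```python
def weighted_average_dor(studies):
    """Average DOR across validation studies, weighted by sample size.

    studies: list of (sample_size, dor) pairs for one screening tool.
    """
    total_n = sum(n for n, _ in studies)
    return sum(n * dor for n, dor in studies) / total_n

# Hypothetical validations of one tool in three settings;
# the third study reports a DOR far above the norm.
studies = [(250, 40.0), (120, 25.0), (80, 300.0)]
overall = weighted_average_dor(studies)
# A second average excluding the outlier gives a more conservative estimate:
trimmed = weighted_average_dor(studies[:2])
```

Because the weighted mean is dominated by large studies but still sensitive to extreme values, reporting both the full and the outlier-excluded averages, as the review does, guards against a single anomalous study inflating a tool's apparent validity.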