Validation of the Malay Version of the Self-rated Strengths and Difficulties Questionnaire (SDQ)

The Strengths and Difficulties Questionnaire (SDQ) is a screening questionnaire that measures children’s problems. The psychometric properties of the Malay self-rated version have not been investigated. Objective: The present study aims to translate the English self-report version of the Strengths and Difficulties Questionnaire (SDQ) into the Malay language and investigate its psychometric properties. Methods: Two forward and backward translations were done. Face validation was carried out, followed by reliability and validation analyses. A total of 300 secondary school adolescents participated in the validation study. Preliminary data analyses were performed using descriptive statistics, while the theoretical structure of the SDQ was assessed using exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). Model fit was assessed using the Chi-square test and other fit indices at the 5% significance level. Results: A total of 281 fully completed questionnaires were assessed. The mean age of participants was 14.8±1.45 years. Cronbach’s Alpha was acceptable, with a value of 0.70. Results of the EFA and CFA revealed a model using the original 5-factor structure with a total of 17 items as the best model (χ²/df = 1.34), with all fit indices yielding the best results.


Introduction
Mental health problems are reported to be among the major burdens of disease in children and adolescents (Global Burden of Disease Pediatrics Collab, 2016). Three-quarters of people with mental health problems live in low- and middle-income countries, where an estimated 75 to 85% receive little or no treatment, mostly because of a lack of detection and referral (Mendenhall, et al., 2014; World Health Organization, 2013). However, Mendenhall, et al. (2004) found that only 10% of all child and adolescent mental health (CAMH) research has been done in low- and middle-income countries such as Malaysia.
The identification, evaluation and use of a simple, short and freely available screening tool for CAMH difficulties may offer a powerful strategy to close the treatment gap, as it enables the identification of children and adolescents in need of next-step evaluation and treatment. Among the screening tools used worldwide is the Strengths and Difficulties Questionnaire (SDQ), originally developed by Goodman in 1997. The SDQ has since been translated into more than 60 languages and validated in different populations. It has been used as a screening instrument for child and adolescent mental health and behavioural problems in clinical and community settings (Giannakopoulos et al., 2009; Singh, 2018).
The SDQ has 25 items, which are used to generate scores for five domains of psychological adjustment among children and adolescents, namely hyperactivity, emotional problems, prosocial scale (PS), conduct problems and peer problems (Goodman, 2009; Mellor, & Stokes, 2007). It is a multi-informant instrument consisting of parent and teacher forms for children aged 3 to 16 years and a self-report form for those aged 11 to 16 years. The scoring is documented and readily available at www.sdqinfo.com. Items are scored on a 3-point Likert-type scale indicating how well each attribute applies to the respondent (0=not true, 1=somewhat true, 2=certainly true). A high score on the PS reflects strength, while high scores on the other four subscales reflect difficulties. The subscale scores, except that of the PS, can be summed to generate the Total Difficulties (TD) score, or each subscale score can be used on its own. Possible scores range from 0 to 10 for each subscale and from 0 to 40 for the TD scale. Scores can be interpreted using cut-offs that distinguish normal, borderline and abnormal symptoms, or used as continuous scores.
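As a minimal sketch of the scoring just described, the Python function below sums item responses into the five subscale scores and the Total Difficulties score. The item-to-subscale mapping and the set of reverse-scored items are assumptions based on the publicly documented SDQ scoring instructions and should be checked against www.sdqinfo.com before use; the function and variable names are illustrative only.

```python
# Assumed item-to-subscale mapping (1-based item numbers); verify against
# the official scoring instructions before use.
SUBSCALES = {
    "emotional": [3, 8, 13, 16, 24],
    "conduct": [5, 7, 12, 18, 22],
    "hyperactivity": [2, 10, 15, 21, 25],
    "peer": [6, 11, 14, 19, 23],
    "prosocial": [1, 4, 9, 17, 20],
}
# Assumed positively worded difficulty items, scored in reverse (2, 1, 0).
REVERSED = {7, 11, 14, 21, 25}


def score_sdq(responses):
    """Return subscale scores and Total Difficulties for one respondent.

    `responses` maps item number (1-25) to a raw answer:
    0 = not true, 1 = somewhat true, 2 = certainly true.
    """
    if set(responses) != set(range(1, 26)):
        raise ValueError("expected answers for items 1 to 25")
    if any(v not in (0, 1, 2) for v in responses.values()):
        raise ValueError("answers must be 0, 1 or 2")

    def item_score(item):
        raw = responses[item]
        return 2 - raw if item in REVERSED else raw

    scores = {name: sum(item_score(i) for i in items)
              for name, items in SUBSCALES.items()}
    # Total Difficulties excludes the prosocial scale, so it ranges 0-40.
    scores["total_difficulties"] = sum(
        v for name, v in scores.items() if name != "prosocial")
    return scores


# Hypothetical respondent who answers "somewhat true" to every item.
example = score_sdq({item: 1 for item in range(1, 26)})
```

Each subscale is bounded by 0 and 10 because it has five items scored 0 to 2, and the Total Difficulties score is bounded by 0 and 40 because it sums four such subscales.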
The advantages of the SDQ include its brief format, ease of administration as a one-page assessment and free access to its manual version. Other advantages over competing instruments, as stated by Bjornsdotter et al. (2013) and Dickey and Blumberg (2004), are its focus not only on difficulties but also on strengths and its acceptability to parents, health professionals of various disciplines and epidemiologists. It has also been adopted for use in surveys such as the National Health Interview Survey in the United States, which is a good indication of its potential (Goodman, & Scott, 1999).
The SDQ has generally been considered an instrument with good psychometric properties, with its construct validity supported in previous studies (Bjornsdotter, et al., 2013; Goodman, & Scott, 1999; Hadi, Abidin, Othman, & Nor, 2018). SDQ scores have been shown to correlate highly with those of the Child Behavior Check List (CBCL), and the SDQ was significantly better than the CBCL at detecting inattention and hyperactivity when both were compared with a semi-structured interview (Goodman, & Scott, 1999). The SDQ self-report questionnaire has also been found to have significant relationships with the Youth Self-Report (Dickey, & Blumberg, 2004). Multi-informant SDQs have been found useful for screening for conduct, hyperactivity, depressive and anxiety disorders in the community in certain population settings (Goodman et al., 2000; Mullick, & Goodman, 2001). Exploratory and confirmatory factor analyses have previously been used to assess the structure of the SDQ (Giannakopoulos, et al., 2009; Bjornsdotter, et al., 2013; Dickey, & Blumberg, 2004).
Although the SDQ has been translated into many languages, its theoretical structure has not been fully and appropriately investigated in some of the population settings where it has been applied. The Malay SDQ for parents and teachers is available and has been investigated for its theoretical structure and other psychometric properties, with somewhat conflicting results. One study investigated the parent and teacher versions and found a five-factor solution and psychometric properties similar to those of SDQ versions in other languages (Idris, et al., 2019). That study found that Cronbach's Alpha was acceptable for both the parent and teacher versions, with values of 0.74 and 0.77 respectively. It also administered a self-report version to 150 students but did not examine the psychometric properties of that version. However, another study investigating the same parent-report version found most support for a two-factor oblique model, with a positive construal factor and a psychopathology factor (Gomez, & Stavropoulos, 2018). The self-report Malay version of the SDQ has not been formally translated and validated.

Objectives
The present study aimed to translate the English self-report version of the SDQ into the Malay language and to study the theoretical structure of the self-report version in a population of adolescents attending secondary schools in Terengganu, Malaysia. To ensure a more comprehensive report, both EFA and CFA were conducted and other relevant psychometric properties of the SDQ were investigated. The intention was to provide a scientific basis, as well as relevant cautions, for the application of the Malay SDQ.

Methods
This was a cross-sectional study conducted from December 2018 to May 2019. The inclusion criterion was being a secondary school student who was able to read and write; students who were illiterate or had any mental illness were excluded.
Consent to translate and validate the self-report version of the English SDQ into the Malay language was obtained from the original author of the questionnaire via the organization 'Youth-in-mind'. Approval was obtained from the Education Ministry of Malaysia and the UniSZA Human Research Ethics Committee. The English self-rated version was obtained online, as it is readily accessible at www.sdqinfo.com. Two forward translations and backward translations were done by a team of different language experts consisting of Teaching English as a Second Language (TESL) graduates. Those conducting the backward translations were blinded to the original English questionnaire. The translated Malay questionnaire was then compared with the original English questionnaire by language experts and content experts, the latter consisting of a psychiatrist and a family physician. The team corrected the final wording of the questionnaire and ensured that the content could be easily understood by respondents.
A pre-test was then carried out to identify problems before the actual research commenced. The questionnaire was administered to 40 adolescents. This number was adequate based on the study by Perneger et al. (2015), which stated that 32 participants are required to achieve a power of 0.8 for detecting a problem with a prevalence of 0.05. A column was provided for the adolescents to comment on individual items that they found difficult to answer. Randomly selected adolescents were also interviewed for feedback on sentences that they found ambiguous or difficult to comprehend. Based on the results of the pre-test, minor modifications were made to some of the wording of the questionnaire through discussion among a team of language experts and content experts (a psychiatrist and a family physician).
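The figure of 32 participants can be reproduced under one simple assumption that is not spelled out in the text: each participant independently encounters a given problem with probability equal to its prevalence p, so the chance that at least one of n participants reveals it is 1 - (1 - p)^n. The sketch below solves for the smallest such n; the function name is illustrative.

```python
import math


def pretest_sample_size(prevalence, power):
    """Smallest n with P(at least one participant hits the problem) >= power.

    Assumes independent participants, so the detection probability is
    1 - (1 - prevalence) ** n.
    """
    return math.ceil(math.log(1 - power) / math.log(1 - prevalence))


n_required = pretest_sample_size(prevalence=0.05, power=0.8)
print(n_required)  # 32
```

Under the same assumption, the 40 adolescents actually recruited give a detection probability of about 0.87 for a problem with a prevalence of 0.05.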
A sample size of 300 secondary school adolescents in Terengganu state was estimated for the final validation study. The sample size was based on an estimate of 10 respondents per item and took into account a non-response rate of 20%. Three public schools and one private school were randomly selected from a list of secondary schools in Terengganu state, Malaysia. A pre-visit was made to each school to brief the principal about the study and to distribute the research information to selected classes of Form One, Two and Four students. A total of 12 classes were included in the study. The classes were selected randomly from a list of classes provided by the school administration offices. All students in the selected classes were given research information sheets together with study consent forms to be taken home and signed by their parents. During the data collection visit, self-administered questionnaires were distributed to students whose parents had consented.
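The sample-size arithmetic can be written out as follows. This sketch assumes that the 20% allowance was added on top of the base requirement (10 respondents for each of the 25 SDQ items), which reproduces the stated target of 300; the variable names are illustrative.

```python
ITEMS = 25                 # number of SDQ items
RESPONDENTS_PER_ITEM = 10  # rule of thumb used in the study
NON_RESPONSE_RATE = 0.20   # anticipated non-response

base_n = ITEMS * RESPONDENTS_PER_ITEM               # 250
target_n = round(base_n * (1 + NON_RESPONSE_RATE))  # 300
```

The 281 complete questionnaires retained for analysis still exceed the base requirement of 250.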
Data from the questionnaires were entered into SPSS version 22 and then checked for completeness. Questionnaires with any missing data were excluded, leaving 281 complete questionnaires for analysis.

Statistical Analysis
Preliminary data analyses were performed using descriptive statistics, while the theoretical structure of the SDQ was assessed using EFA and CFA. Reliability was assessed through internal consistency using Cronbach's Alpha, followed by item analysis (item-total and inter-item correlations). Cronbach's Alpha is considered acceptable if the value is between 0.70 and 0.80 and good if it is above 0.80 (Tavakol, & Dennick, 2011). Inter-item correlations below 0.30 are considered not adequately correlated (Ferketich, 1991), and for item-total correlations the recommended value is above 0.30 (Nunnally, & Bernstein, 1994). Exploratory factor analysis was carried out using Principal Component Analysis. These analyses were conducted using SPSS version 22. Confirmatory Factor Analysis (CFA) was performed in R using the MLR (robust maximum likelihood) estimation method, based on the five-factor structure demonstrated in the original English version of the questionnaire. Model fit was assessed using the Chi-square test and other fit indices at the 5% significance level, as shown in Table 1 (Brown, 2015; Hair, et al., 2010).
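For readers unfamiliar with the statistic, the following is a minimal Python sketch of Cronbach's Alpha using the standard variance-based formula, together with the interpretation thresholds quoted above. It is not the SPSS procedure used in the study; the function names and the toy data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's Alpha for a list of respondents' item scores.

    `rows` holds one list per respondent, each with one score per item.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("need at least two items and equal-length rows")
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def interpret_alpha(alpha):
    """Label alpha using the thresholds quoted in the text."""
    if alpha > 0.80:
        return "good"
    if alpha >= 0.70:
        return "acceptable"
    return "below acceptable"


# Toy data: four hypothetical respondents answering three 0-2 items.
toy = [[0, 0, 1], [1, 1, 1], [2, 1, 2], [2, 2, 2]]
alpha = cronbach_alpha(toy)
```

Alpha rises when items vary together, because the variance of the total score then exceeds the sum of the individual item variances.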

Reliability Analysis: Internal Consistency
Data from 281 completed questionnaires were analysed. The participants were students aged 13 to 17 years, with a mean age of 14.8±1.45 years. The majority were female (65%) and from public schools (81%). Cronbach's Alpha was used to test the internal consistency of the SDQ. The results showed that the Malay self-rated version of the SDQ has acceptable internal consistency, with a Cronbach's Alpha coefficient of 0.70 for the overall questionnaire, 0.71 for the 20 items of the Total Difficulties score and 0.70 for prosocial behaviour. However, the Cronbach's Alpha for the peer problems scale was only 0.20, while the values for conduct, hyperactivity and emotional problems were 0.54, 0.45 and 0.62 respectively. Table 2 shows the item-total correlations using the Pearson correlation coefficient. The table shows that all items correlated well with the corresponding items in their own subscales. All items included in the Total Difficulties scale showed poor correlation with items in the prosocial subscale. Item correlation was highest for the emotional problems subscale (r = 0.55-0.70) and lowest for the peer relations subscale (r = 0.30-0.60). The highest correlation between item and total scores was found for 'Many worries, often seemed worried' (r = 0.70) in the emotional problems subscale. Most of the item-total correlations were above 0.4, signifying moderate correlation. In terms of inter-item correlations, all correlations between items within each subscale were above the recommended value of r = 0.3. Items in the prosocial subscale did not correlate highly with the other subscales (r < 0.3).
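The item analysis can be illustrated with a short Python sketch. It uses the "corrected" item-total correlation, in which each item is correlated with the sum of the remaining items; this is an assumption, since the text does not state which variant was computed. The toy data and function names are hypothetical, and the 0.30 threshold is the recommended value quoted in the Statistical Analysis section.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def corrected_item_total(rows):
    """Correlate each item with the sum of the remaining items."""
    k = len(rows[0])
    result = []
    for j in range(k):
        item = [r[j] for r in rows]
        rest = [sum(r) - r[j] for r in rows]
        result.append(pearson(item, rest))
    return result


# Toy data: five hypothetical respondents, three items scored 0-2.
toy = [[0, 0, 2], [1, 1, 0], [2, 1, 1], [2, 2, 1], [0, 1, 2]]
correlations = corrected_item_total(toy)
# 1-based positions of items falling below the recommended 0.30.
weak_items = [j + 1 for j, r in enumerate(correlations) if r < 0.30]
```

In this toy example items 1 and 3 fall below the threshold, which is the kind of signal that would prompt a closer look at an item's wording or scoring direction.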

Exploratory Factor Analysis
The data from the self-rated SDQ questionnaire were analysed by means of a Principal Component Analysis with Varimax rotation. The KMO (Kaiser-Meyer-Olkin) index was 0.73 and Bartlett's Test of Sphericity was significant (P < 0.001). Table 3 shows the result of the Principal Component Analysis, which revealed that loadings on the predicted factors were high for 19 of the 25 items in the questionnaire, ranging between 0.32 and 0.70. A value of 0.32 was taken as the cut-off point for the minimum loading of an item (Tabachnick, & Fidell, 2001). The only subscale whose items loaded exclusively on their own factor was the prosocial behaviour subscale. The hyperactivity, emotional symptoms, conduct problems and peer problems subscales did not load on a single factor but on two or more factors. For example, two items in the hyperactivity subscale loaded highly on other subscales: 'I am restless and cannot stay still for a long time' loaded highly on the peer problems subscale (factor loading of 0.48) and 'I am constantly fidgeting or squirming' loaded highly on the emotional symptoms subscale (factor loading of 0.51). One of the items in the conduct problems subscale loaded highly on the hyperactivity subscale (factor loading of 0.57), while items of the peer problems subscale loaded on three other subscales: the hyperactivity subscale (item 14, factor loading of 0.34), the conduct problems subscale (factor loading of 0.62) and the emotional problems subscale (item 6, factor loading of 0.57).
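To make the cut-off rule concrete, the sketch below flags items whose loadings reach 0.32 on more than one factor. The loading values are hypothetical toy numbers, not those of Table 3, and the function names are illustrative; the sketch only shows how the cut-off is applied to an already rotated loading matrix.

```python
LOADING_CUTOFF = 0.32  # minimum loading quoted in the text


def salient_factors(loadings, cutoff=LOADING_CUTOFF):
    """Map each item to the factors on which |loading| reaches the cutoff."""
    return {
        item: [f for f, value in enumerate(row, start=1)
               if abs(value) >= cutoff]
        for item, row in loadings.items()
    }


def cross_loading_items(loadings, cutoff=LOADING_CUTOFF):
    """Items that load on two or more factors at or above the cutoff."""
    salient = salient_factors(loadings, cutoff)
    return sorted(item for item, factors in salient.items()
                  if len(factors) >= 2)


# Hypothetical rotated loadings (item label -> loadings on factors 1-3).
toy_loadings = {
    "item_a": [0.70, 0.10, 0.05],
    "item_b": [0.48, 0.51, 0.12],
    "item_c": [0.20, 0.15, 0.31],
}
flagged = cross_loading_items(toy_loadings)
```

Here `item_a` loads cleanly on one factor, `item_b` cross-loads on two factors, and `item_c` fails to reach the cut-off on any factor.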

Confirmatory Factor Analysis
After a step-by-step procedure to achieve the best fit, Figure 1 shows a model that provided a good fit to the data only after the deletion of two poorly loading items from each of the Total Difficulties subscales (refer to Table 2 for the short form of each item). The final model indices showed that the structural relationship among the domains of the Malay self-rated version of the SDQ had a good fit after the deletion (χ² = 148.84, df = 109, P = 0.007, χ²/df = 1.37; CFI = 0.92; RMSEA = 0.04; TLI = 0.90; SRMR = 0.06). The deleted items were items 21 and 25 from the hyperactivity scale, items 3 and 24 from the emotional symptoms scale, items 7 and 22 from the conduct problems scale and items 11 and 14 from the peer problems scale. All models showed acceptable construct reliability, with Raykov's Rho of 0.7 and above (Table 4).
Abbreviations: GFI = Goodness-of-fit index; NFI = Normed fit index; IFI = Incremental fit index; TLI = Tucker-Lewis index; CFI = Comparative fit index; BIC = Bayesian information criterion; AIC = Akaike information criterion; SRMR = Standardized root mean square residual; RMSEA = Root mean square error of approximation.
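Two of the reported indices can be checked directly from the chi-square statistic. The sketch below uses the common unscaled point-estimate formula for RMSEA and takes the 281 complete questionnaires as the sample size; both are assumptions, since software using robust (MLR) estimation may apply a scaling correction and some programs use n rather than n - 1. The function names are illustrative.

```python
import math


def chi_square_ratio(chi_square, df):
    """Relative chi-square (chi-square divided by degrees of freedom)."""
    return chi_square / df


def rmsea(chi_square, df, n):
    """Point estimate of RMSEA from the model chi-square.

    Uses the common formula sqrt(max(chi2 - df, 0) / (df * (n - 1))).
    """
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Values reported for the final model, with n = 281 complete questionnaires.
ratio = chi_square_ratio(148.84, 109)
approx_rmsea = rmsea(148.84, 109, 281)
print(round(ratio, 2), round(approx_rmsea, 2))  # 1.37 0.04
```

Both values agree with the reported χ²/df of 1.37 and RMSEA of 0.04 for the final model.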

Discussion
The results of this study suggest that the internal reliability of the Malay self-report version of the SDQ was satisfactory. Reliability was assessed using internal consistency and inter-item correlation. All items included in the Total Difficulties scale showed poor correlation with items in the prosocial subscale, as inter-item correlations below 0.30 are considered not adequately correlated (Hair, Black, Babin, & Anderson, 2010). This is in keeping with the inter-item results, which showed that items in the prosocial subscale did not correlate highly with the other subscales (r < 0.3). This contrasts with the parent version of the Malay SDQ, where one item in the prosocial subscale correlated highly with conduct problems (Idris, Barlow, Dolan, & Surat, 2019; Kannapiran, Kob, Rus, & Sulaiman, 2018). It is considered a good feature of the translated self-rated SDQ, as the prosocial items describe positive qualities, which should not correlate highly with the qualities included in the Total Difficulties scale. The difference from the parent version may arise because carers notice prosocial qualities less than they notice the difficulties of the child or adolescent, which may reflect the local culture. This study also found that each item correlated moderately with the corresponding items in the same subscale. In addition, internal consistency was acceptable for the overall questionnaire and for the Total Difficulties score, with Cronbach's Alpha values of 0.70 and 0.71 respectively. Therefore, the overall reliability of the questionnaire is considered acceptable.
However, in comparison with the parent-report and teacher-report versions of the Malay SDQ, the teacher report had the highest internal consistency (Idris, Barlow, Dolan, & Surat, 2019). The results are similar to those obtained by Goodman (2001), although the values in the current study are lower. The values were also much lower for the conduct and peer problems subscales. Other previous studies (Widenfelt et al., 2003; Muris et al., 2003; Koskelainen et al., 2000) have also shown lower values for both the conduct problems and peer relations subscales in parents' and children's reports. Low values for these subscales could be due to the presence of both positive and negative items (Hair et al., 2010). Widenfelt et al. (2003) suggested that the low internal consistencies of these subscales were due to the presence of items that do not fit within the domain.
This study found that each item correlated moderately with the corresponding items in the same subscale. The emotional problems subscale had the highest correlation among its items, which suggests that emotional problems overlapped less with the other behaviours measured. However, three items in the conduct subscale had moderate correlations with the emotional problems subscale, and some items in the emotional problems subscale had moderate correlations with hyperactivity. These results suggest that there were some moderate correlations between the internalising and externalising domains in this research. This finding is similar to that for the parent-report version of the questionnaire, where there was a moderate correlation between items in the emotional and hyperactivity subscales (0.356-0.391) (Mullick, & Goodman, 2001). It is also in agreement with the study by Goodman (2001), in which there was a higher correlation between externalising/externalising domains than between internalising/externalising domains. Externalising and internalising problems may coexist, although they have been described as separate entities (Chase, & Eyberg, 2008).
For exploratory factor analysis, most previous studies confirm the factor structure (Matsuishi, et al., 2008; Rodriguez-Hernandez, et al., 2012). Nevertheless, in the present factor analysis certain items loaded highly on other subscales. For example, hyperactivity items loaded on the emotional and peer relations subscales, while an item in the conduct problems subscale loaded on the hyperactivity subscale. Items in the peer relations subscale also loaded on multiple other subscales. One study investigating the Malay SDQ for parents found a five-factor solution and psychometric properties similar to those of SDQ versions in other languages (Idris, et al., 2019). However, another study of the same parent-report version used exploratory factor analysis (EFA) to determine the best model for parent ratings of the SDQ, followed by multiple-group confirmatory factor analysis (MCFA) to confirm it, and found most support for a two-factor oblique model with a positive construal factor and a psychopathology factor (Gomez, & Stavropoulos, 2018).
This study revealed that the original five-factor model of the SDQ fitted moderately well and fulfilled two out of three criteria of an ideal-fit model, while a model with three latent variables (externalising behaviour, internalising behaviour and prosocial behaviour) did not show better fit indices. Another study, among parents and teachers of a community sample of young children in Flanders, Belgium, showed similar results (Leeuwen, et al., 2006). However, other work has found most support for a 3-factor oblique model as the best model for the sample of adolescents studied (χ²/df = 2.20), with all fit indices yielding better results (Akpa, et al., 2016).
Unfortunately, when all requirements for a good fit in confirmatory factor analysis were applied, the 5-factor model did not show an ideal fit in the present study. Competing models, namely a three-factor model (prosocial, internalising and externalising factors) and a two-factor model (the prosocial scale as the first factor and the TD score as the second), did not provide a better fit (Table 4). A few previous studies found that a three-factor model (Dickey, & Blumberg, 2004) or a two-factor model (Gomez, & Stavropoulos, 2018) provided a better fit than the original five-factor model. Following the suggestion by Akpa, et al. (2016) to improve fit by deleting poorly loading items one at a time, a best-fit model was reached after the deletion of a minimum of two items from each TD subscale. The deleted items were items 21 and 25 from the hyperactivity scale; item 3 'I get a lot of headaches, stomach aches or sickness' and item 24 'I have many fears, I am easily scared' from the emotional symptoms scale; item 7 'I usually do as I am told' and item 22 'I take things that are not mine from home, school or elsewhere' from the conduct problems scale; and items 11 and 14 from the peer problems subscale. The poor performance of these items may be due to differences in Malay culture, where children are likely to be more submissive, appear more obedient and be more likely to hide their ailments or feelings.
Although the CFA showed that the original 5-factor model did not have an ideal fit, it may be hasty to conclude that these findings invalidate the use of the SDQ in the present setting in Malaysia. Firstly, as the instrument was originally designed as a screening tool rather than a diagnostic test, its results need to be interpreted with caution and followed by further assessment. Secondly, its validity has been documented in several study settings with different populations. Thirdly, construct validity was acceptable even in the original 5-factor model, and an ideal fit was achieved without changing the 5-factor structure. Consequently, rather than suggesting modifications, the researchers suggest that the SDQ be used in its original 5-factor form and interpreted cautiously, within the confines of its intent.

Limitations
There were a few limitations in this study. The scope of the research was limited to adolescents in secondary schools, which might affect generalizability. Adolescents who did not use Malay as their first language might not have fully comprehended the questionnaire. However, the inclusion of both public and private schools and the adequate sample size contributed to the strength of this psychometric study. More studies are necessary to confirm other psychometric properties of the translated version, such as test-retest reliability.

Conclusion
The reliability of the Malay self-report version of the Strengths and Difficulties Questionnaire is considered acceptable, and the original five-factor structure is preferable to the alternative three- or two-factor structures. Although an ideal fit could not be achieved using all items of the questionnaire, the best fit was achieved after the removal of weak items. Therefore, this version of the SDQ should be used cautiously, within the confines of its intent as a screening instrument.