The Reliability Study of Personal Wellness Questionnaire (PWQ) to Measure Self-Changes among Malaysian Low-Performing Public Service Officers

This pilot study aimed to identify the reliability of the Personal Wellness Questionnaire (PWQ), which is used as an instrument to measure self-changes among Malaysian low-performing public service officers. The instrument consisted of 75 items divided into six sections: section A covered demographic data, whereas sections B, C, D, E and F covered the five sub-constructs of self-change, namely emotional, psycho-spiritual, social, cognitive, and behavioural adjustment. A total of 30 low-performing public service officers at a particular ministry in Putrajaya were involved in this pilot study. The Rasch Model version 3.72.3 was used to analyse the PWQ items; a value of 0.89 was obtained for item reliability and a value of 0.95 for respondent reliability. These findings indicated that the PWQ items were very good and effective, with a high level of consistency, and could be used in actual research. Several items were dropped because they did not match the correct constructs and did not comply with the criteria set by the researchers. The final instrument comprised 51 items appropriate for measuring the five self-change sub-constructs in the research target population.


Introduction
Civil servants in Malaysia face various issues in terms of human development, which lead to a low level of commitment among some of them. Current changes in society, such as higher income and living rates, highly educated societies, and diverse customer demands, are urging the public sector to provide better quality services in terms of broader options and flexibility (Marsidi & Abdul, 2007). Therefore, it is important for counseling services to be established in the workplace. Bakar (2014) stated that the core goals of counseling services include encouraging changes in client behavior, helping clients make decisions, building clients' coping skills, rationalizing clients' thinking, and helping clients improve their relationships with others. The Malaysian Public Service Department issued Circular Letter No. 4/1998, which stated that psychological and counseling intervention services were to be highly emphasized in order to improve the service quality of public service officers. Implementing this intervention in the workplace therefore requires the support and involvement of management at all levels.
Emotional stability, psycho-spiritual, social skills, cognitive and behavioral adjustment, if unbalanced, could affect an employee's quality of service. These elements therefore needed to be improved to enhance work performance (Bokti & Talib, 2010; Tenney, Poole & Diener, 2016; Milliman et al., 2000; Querstret et al., 2015). In the Malaysian Public Service Department (PSD), the self-changes of low-performing civil servants in these five elements were measured using the Personal Wellbeing Questionnaire (PWQ). This adapted instrument, however, had never been tested for its reliability in the local context. For that reason, the main objective of this study was to test the reliability of the questionnaire in order to assess its suitability and to detect any weaknesses in the items used. Through this validation study, the researchers checked the functionality of the items as a whole and of each individual item from the aspect of reliability.

Methodology
This pilot study aimed to establish the reliability of the instrument. The instrument contained 75 items divided into six sections: section A, for demographic data, contained nine items, and sections B, C, D, E and F covered the five sub-constructs of self-change, namely emotional stability, psycho-spiritual, social skills, cognitive and behavioral adjustments. The instrument was a questionnaire adapted by researchers from the Psychology Management Division, Public Service Department. The thirty respondents were participants in the Personal Wellbeing Program organized by a ministry in Putrajaya. They had the same characteristics as the actual respondents chosen by the researchers, namely an Annual Performance Score Report of 60% and below.
The Rasch Model approach is used to determine the reliability of an instrument. In this pilot study, the researchers used the Rasch Measurement Model to test the reliability of items and respondents and to remove inappropriate items. For this paper, the Rasch measurement approach was also used to examine the reliability of the questionnaire instrument developed through quantitative data collection in the pilot study. Normally, the reliability of items is assessed only through the Cronbach's alpha (α) value for the entire instrument.
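As a point of comparison for the whole-instrument statistic mentioned above, the following is a minimal sketch of how Cronbach's alpha can be computed from a respondents-by-items score matrix using the standard formula α = k/(k-1) × (1 - Σ item variances / variance of total scores). The function name and the small response matrix are hypothetical illustrations and are not the PWQ pilot data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondent rows (one score per item)."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 5-respondent, 4-item Likert responses (illustration only).
responses = [
    [4, 4, 5, 4],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [2, 3, 2, 2],
    [4, 5, 4, 4],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Unlike this single summary value, the Rasch approach described in this paper reports reliability separately for items and for respondents.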

Results and Discussion
A total of 30 respondents answered this questionnaire. They were involved in the Personal Wellbeing Program, which was conducted over three days and two nights, similar to the actual program, which would be conducted for 20 hours. After the data were collected, they were analysed descriptively. Using the Rasch Measurement Model approach, the researchers performed item functionality checks in terms of reliability, separation (differentiation) of items and respondents, and removal of items. The interpretation used for the item functionality checks is given in Table 1.

Table 1: Interpretation of Cronbach's Alpha (α) Scores (Bond & Fox, 2007)

Cronbach's alpha (α) score    Reliability
0.9 - 1.0                     Very good and effective with a high degree of consistency
0.7 - 0.8                     Good and acceptable
0.6 - 0.7                     Acceptable
< 0.6                         Items need to be repaired
< 0.5                         Items need to be removed

To determine item reliability for the instrument, the Rasch measurement model approach was used, referring to the reliability and separation of items. The analysis showed that the reliability value based on Cronbach's alpha (α) was 0.95, as shown in Table 2. This demonstrated that the instrument was very good and effective, with a high level of consistency, and thus could be used in the actual research. The instrument was also analysed as a whole by looking at the reliability and separation of items and respondents. Table 3 showed the reliability and separation of items: the item reliability value was 0.89 and the item separation value was 2.78, which rounds to 3.0. An item reliability value of 0.87 indicated that the items were in good and acceptable condition (Bond & Fox, 2007), with an item separation value of 2.62, which also rounds to 3.0. According to Linacre (2005), a good separation index is greater than 2.0.
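Reliability and separation in Rasch measurement are two expressions of the same ratio of true spread to measurement error, linked by the standard relation R = G² / (1 + G²), where G is the separation index and R the reliability. The sketch below is an illustration under that relation only; the function names are hypothetical and it does not reproduce the software output behind Table 3.

```python
import math


def reliability_from_separation(g):
    """Convert a separation index G into a reliability coefficient R."""
    if g < 0:
        raise ValueError("separation must be non-negative")
    return g * g / (1 + g * g)


def separation_from_reliability(r):
    """Convert a reliability coefficient R (0 <= R < 1) into separation G."""
    if not 0 <= r < 1:
        raise ValueError("reliability must be in [0, 1)")
    return math.sqrt(r / (1 - r))


# Item separation values quoted in the text above.
for g in (2.78, 2.62):
    print(g, round(reliability_from_separation(g), 2))
```

Applied to the two item separation values quoted above, the relation gives reliabilities that round to 0.89 and 0.87 respectively, which matches the two item reliability figures in the text.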
Meanwhile, based on Table 4, the reliability value of the respondents was 0.95 and the respondent separation value was 4.15. This showed that the reliability of the respondents was very high, which was good according to the interpretation of Bond and Fox (2007).
The Point Measure Correlation (PTMEA CORR) value is used to detect the polarity of an item, that is, to test the extent to which the item contributes to measuring the intended construct. If the PTMEA CORR value was positive (+), the item measured the construct as intended (Bond & Fox, 2007). Conversely, if the value was negative (-), the item did not measure the construct as intended. Such an item needed to be removed or revised because it did not address the question or was difficult for the respondents to answer. Based on Table 5, three items had negative values: B1, E58 and F65. For the remaining items, the PTMEA CORR value was positive, showing that they measured the intended constructs. Thus, three of the 75 items in the questionnaire (PWQ) needed to be removed. Among the items with positive PTMEA CORR values, the five lowest were B2 (0.05), B10 (0.05), D33 (0.06), F62 (0.04) and F69 (0.05). These values should also be noted because the items were likely to be difficult for the respondents to answer (Azman, 2011). Therefore, these items needed to be revised. The findings showed that the positive items in the questionnaire moved in the same direction as the constructs, were able to measure them, and did not conflict with the constructs to be measured. If the PTMEA CORR value was high, the item was able to differentiate between the abilities of the respondents who answered the questionnaire. In addition, the suitability (fit) of items in measuring the constructs could also be seen through the MNSQ infit and MNSQ outfit values.
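The polarity check described above can be sketched as follows, on the understanding that a point-measure correlation is the Pearson correlation between the responses to one item and the respondents' Rasch measures. The function names, the example data and the 0.1 cut-off for a "low positive" value are assumptions made for illustration; the paper itself lists the lowest positive values but does not state a cut-off.

```python
import math


def point_measure_correlation(responses, measures):
    """Pearson correlation between item responses and person measures."""
    n = len(responses)
    if n != len(measures) or n < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    mean_x = sum(responses) / n
    mean_y = sum(measures) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(responses, measures))
    sxx = sum((x - mean_x) ** 2 for x in responses)
    syy = sum((y - mean_y) ** 2 for y in measures)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


def polarity_flag(corr, low=0.1):
    """Classify an item by its PTMEA CORR value (low cut-off is assumed)."""
    if corr < 0:
        return "remove or revise"
    if corr < low:
        return "review"
    return "keep"


# Hypothetical person measures (logits) and two hypothetical items.
measures = [-1.2, -0.4, 0.1, 0.8, 1.5]
aligned = [1, 2, 3, 4, 5]    # higher measure, higher response
reversed_ = [5, 4, 3, 2, 1]  # higher measure, lower response

r_aligned = point_measure_correlation(aligned, measures)
r_reversed = point_measure_correlation(reversed_, measures)
print(polarity_flag(r_aligned), polarity_flag(r_reversed))
```

Under the assumed cut-off, the five low positive values reported above (0.04 to 0.06) would all fall in the "review" category.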
The MNSQ outfit and infit values should be within the range of 0.6 to 1.4 to ensure that the constructed items are suitable for measuring the constructs. If the MNSQ value exceeded 1.4, the item was misleading and needed to be reviewed. If the MNSQ value was less than 0.6, the item was too easily predicted by the respondents (Linacre, 2014). In addition, the ZSTD outfit and infit values should be between -2 and +2 (Bond & Fox, 2007); however, if the MNSQ outfit and infit values were acceptable, the ZSTD index might be ignored (Linacre, 2014; Abazeed, 2018). Table 6 showed the misfit order, which displayed the items with the highest and lowest MNSQ values from the statistical item analysis.
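The fit criteria above amount to a simple screening rule, sketched below under the stated range of 0.6 to 1.4 for both infit and outfit MNSQ. The function names, item labels and fit values are hypothetical placeholders and are not taken from Table 6. ZSTD is left out of the rule because, as stated above, it may be ignored when the MNSQ values are acceptable.

```python
def classify_mnsq(value, low=0.6, high=1.4):
    """Classify one MNSQ value against the 0.6 to 1.4 range."""
    if value > high:
        return "misfit"    # misleading item, needs to be reviewed
    if value < low:
        return "overfit"   # responses too easily predicted
    return "acceptable"


def item_needs_review(infit_mnsq, outfit_mnsq):
    """An item is flagged when either MNSQ value is outside the range."""
    return (classify_mnsq(infit_mnsq) != "acceptable"
            or classify_mnsq(outfit_mnsq) != "acceptable")


# Hypothetical (infit MNSQ, outfit MNSQ) pairs; labels are placeholders.
fit_table = {
    "X1": (1.02, 0.98),
    "X2": (1.10, 1.73),
    "X3": (0.52, 0.48),
}
flagged = [name for name, (infit, outfit) in fit_table.items()
           if item_needs_review(infit, outfit)]
print(flagged)
```

Values exactly at 0.6 or 1.4 are treated as acceptable here, following the wording "between 0.6 and 1.4".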
Based on Table 6, there were 27 items that were not within the prescribed range and needed to be revised or removed. Items exceeding the value of 1.40 in the MNSQ outfit were A10 (3. Therefore, with reference to Table 6, a total of 38 items needed to be revised or removed. There were eight items that were not within the PTMEA CORR range. Sixteen items were removed because they did not accurately measure the constructs. In addition, 14 items were revised in light of the researchers' needs and expert views. After the analysis, 51 items fulfilled the purpose of the constructs to be investigated by the researchers. Once the data had been analysed, all items and the instrument underwent revision in order to meet the validity and reliability standards based on the Rasch Measurement Model. Although all the items were analysed using SPSS version 23, the instrument was supported and strengthened by using the Rasch Measurement Model to check item reliability, respondent reliability, respondent and item separation, and item removal. Based on the data analysis conducted, 24 items did not meet the predetermined requirements of the analysis and needed to be rejected.
When using the Rasch analysis application, the rating scale works to form categories. These categories can be used for multiple-choice questions or Likert scales. In this questionnaire, a 5-point Likert scale was used, running from 1 (strongly disagree) and 2 (disagree) up to 5 (strongly agree). Table 7 showed the 5-point Likert scale categories in the sequence 1 to 5, which were 1, 8, 63 and 28. According to that table, the difference in structure calibration between adjacent scale categories should fall within the range 1.4 < y < 5. For example, 2 to 3 = none, 3 to 4 = 1.89, and 4 to 5 = 2.82. This means that the scale in this questionnaire was understood by the respondents and the 5-point Likert scale could be maintained.
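The range check on structure calibration differences can be sketched as below. It uses only the two differences reported in the text (1.89 and 2.82) and the stated range 1.4 < y < 5. The function names are hypothetical, and the threshold list passed to `adjacent_gaps` is an invented illustration rather than the calibrations from Table 7.

```python
def adjacent_gaps(thresholds):
    """Differences between consecutive structure calibrations (logits)."""
    return [b - a for a, b in zip(thresholds, thresholds[1:])]


def gap_within_range(gap, low=1.4, high=5.0):
    """True when a difference lies strictly inside 1.4 < y < 5."""
    return low < gap < high


# Differences reported in the text for adjacent category steps.
reported_gaps = {"3 to 4": 1.89, "4 to 5": 2.82}
results = {step: gap_within_range(gap) for step, gap in reported_gaps.items()}
print(results)

# Invented calibrations, used only to show how gaps would be derived.
example_gaps = adjacent_gaps([-2.0, 0.0, 3.0])
print(example_gaps)
```

Both reported differences fall inside the range, which is consistent with the conclusion that the 5-point scale can be retained.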

Conclusion
The Rasch technique has greatly influenced the manner in which social science research makes use of tests and surveys. The Rasch Model framework offers procedures for constructing and revising social science measurement instruments and for documenting the measurement properties of instruments (e.g., reliability, construct validity). The Rasch technique also enables researchers to make critical corrections when using raw test score data or survey data. Specifically, it allows nonlinear raw data to be converted to a linear scale, which can then be evaluated using parametric statistical tests. In addition to the examples provided earlier, there are Rasch steps that can be used to investigate other important instrumentation issues such as step ordering and disordering, item reliability, person reliability, differential item functioning, and differential test functioning (Boone, 2016; Sadoughi & Hesampour, 2017).
In a nutshell, this study helped to validate the Personal Wellbeing Questionnaire (PWQ), which is used by the Malaysian Public Service Department (PSD) as one of its counselling psychology measurement tools. It can be concluded that validity and reliability are important aspects that should be emphasized when evaluating an instrument, whether new or adapted, before it is used in actual research. Based on the analysis in this validation study, the instrument was of good quality and appropriate for use by psychology officers in ministries, departments or the private sector to measure self-change among low-performing civil service officers through five sub-constructs, namely emotional stability, psycho-spiritual, social skills, cognitive and behavioural adjustments.