Developing E-Muhadathat Kit Instrument (I-KEM) for Non-Arabic Speakers: A Rasch Model Analysis

More attention should be given to establishing the validity and reliability of measurement instruments in order to reduce error and produce a quality instrument. Hence, this study aims to determine the validity and reliability of the 21-item E-Muhadathat kit instrument (I-KEM) using the Rasch Measurement Model. This model is believed to provide accurate estimates of construct validity and instrument reliability. The instrument is intended to determine the development requirements for the E-Muhadathat kit targeted at non-Arabic speakers in Malaysian public higher learning institutions. Three experts in Malay, Arabic, and general knowledge were consulted to determine the instrument's face and content validity. I-KEM pilot questionnaires were distributed online using Google Forms to 50 respondents from three public Malaysian universities: MARA University of Technology (UiTM), University of Malaya (UM), and University Putra Malaysia (UPM). Subsequently, the validity and reliability of I-KEM were measured based on the Rasch Model using Ministeps software version 5.1.4.0. The findings show that I-KEM needed improvement in terms of face validity and content validity, as recommended by the expert evaluators. The Rasch Model measurement, in turn, shows that I-KEM has good construct validity, with a noise level of 10% that stays within the maximum value of 15%. In addition, the item reliability index of 0.96 and the person reliability index of 0.90 indicate a high degree of instrument reliability. This study presents significant findings in determining the developmental needs for the E-Muhadathat


Introduction
Interactive multimedia plays an important role in revolutionizing education with technology and innovation, cultivating active learning practices to boost the quality of teaching and learning. In Arabic education, adopting Information and Communication Technology (ICT) has helped empower Arabic learning and produced professionals skilled in Arabic communication. As Marpuah (2015) described, Muhadathat entails communication or speech in Arabic and reflects an individual's capability to speak and practice Arabic fluently and spontaneously. In addition, these skills also require a psychological readiness in a person to speak a second or foreign language spontaneously and fluently (Hashimoto, 2002).
The concept of Muhadathat requires an emphasis on Arabic language communication activities among non-Arabic speakers (Tu'aimat, 2006). This is supported by Ashour and al-Hawamidat (2014), who stated that speaking skills reflect the individual mind because information, ideas, and emotions can be conveyed through those skills. According to Ramli et al. (2017), one alternative available to Arabic language teachers for creating interactive and communicative classrooms is a teaching approach that uses media kits. Furthermore, as asserted in previous studies (Hat et al., 2013; Mahmuda, 2018), multimedia technologies can provide an interactive and student-centered learning environment. As a result, students will have a higher interest and motivation in Arabic learning. This, ultimately, improves the effectiveness of Arabic teaching and learning activities.
Moreover, the traditional learning method is predominantly one-way and teacher-centered, which can negatively impact students' conceptual understanding and learning motivation (Hassan et al., 2017). In addition, conventional learning prevents students from adopting the student-centered, interactive learning environment that multimedia technologies can provide. Such an environment can ensure the effectiveness of teaching and learning through higher student motivation and engagement. Self-directed learning or heutagogy has become synonymous with today's modern era (Samin et al., 2020). Hence, in line with today's technology, this study plans to develop an interactive multimedia kit, the E-Muhadathat kit, which can simulate dialogues with native Arabic speakers in real-life settings. This kit is an Android smartphone app based on Natural Language Processing (NLP). It uses Artificial Intelligence (AI) technology to recognize the voices of native Arabic speakers and can be used to help students practice Arabic correctly in a real context.

Problem Statement
Conventional teaching methods that are one-sided and passive lead to Arabic language teaching being seen as difficult, boring, and uninteresting (Zaini et al., 2017). According to Mazlan (2018), one-way teaching is still predominantly used in university Arabic language classrooms, and lecturers tend to use traditional teaching approaches such as lectures and discussions supplemented with less interactive visual and auditory teaching aids. The internet allows students to access online resources regardless of time and place and to adopt self-directed learning or heutagogy (Blaschke, 2012). However, the reliance on conventional learning has prevented students from adopting a culture of heutagogy.
Studies have found that heutagogy is only moderately practiced in Arabic lessons. This highlights the need to extend this practice, specifically in higher education institutions in Malaysia (Yahya et al., 2013; Hamzah et al., 2019; Rahman & Ahmad, 2020; Hai et al., 2020; Yahaya et al., 2021; Ghani et al., 2022; Fuad & Al-Yahya, 2022). The need to develop this media kit also arises from the issues of speech, communication, and willingness to communicate (WTC) faced by Arabic language students in Malaysian universities (Haron, 2011; Ibrahim, 2013; Mohamad, 2009; Arshad & Bakar, 2012; Noor et al., 2014; Daud & Pisal, 2014; Tohlong, 2015; Borham, 2018; Rahman & Ahmad, 2020). Thus, before conducting the E-Muhadathat kit development needs analysis, the I-KEM instrument should be tested with Rasch Model measurements to ensure that its validity and reliability are at a high level.

Research Problem
This study's main aim is to use the Rasch Model to determine the validity and reliability of the I-KEM instrument. The measurement encompasses the item reliability index, person reliability index, item polarity, unidimensionality, standardized residual correlation, and item fit.

Methodology
This pilot study involved 50 Arabic language students in three Malaysian public universities (UiTM, UM, and UPM). A pilot study is administered to a small portion of the original sample to identify deficiencies and weaknesses of the instrument (Fraenkel and Wallen, 1996). This study used the E-Muhadathat kit development questionnaire (I-KEM) for non-Arabic speakers in higher education in Malaysia. This instrument was adapted from Sahrir et al. (2017), Rahman et al. (2015), MacIntyre et al. (1998), and Tohlong (2015). Once the I-KEM instrument was developed, it was distributed to three experts in Malay, Arabic, and quantitative studies to test the face and content validity. Then, respondents were asked for their consent to participate in data collection.
Several enumerators helped to distribute informed consent forms and questionnaires online through the Google Forms platform. Measurements based on the Rasch Model were used to identify the construct validity and instrument reliability. This measurement model handles dichotomous data (right or wrong responses, as in tests) and polytomous data (Likert-scale responses, as in questionnaires) by relating respondents' ability to item difficulty (Rasch, 1980). According to Wright and Linacre (1992), the Rasch Model meets the criteria of scientific measurement: it produces linear measures, copes with missing data, provides accuracy estimates, detects misfitting or outlying items, and yields measurements that do not depend on the parameters of the object being observed. This model is believed to provide accurate estimates of construct validity and instrument reliability (Aziz et al., 2015). Therefore, the data of this pilot study were analyzed using Ministeps software version 5.1.4.0.
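As a rough illustration of how the Rasch Model relates respondent ability to item difficulty, the sketch below computes response probabilities for the dichotomous model and for the Andrich rating scale form often applied to Likert data. The function names, the ability and difficulty values, and the thresholds are illustrative assumptions and are not taken from the I-KEM analysis; whether the software run in this study used the rating scale parameterization is likewise an assumption.

```python
import math


def rasch_probability(ability, difficulty):
    """Dichotomous Rasch model: probability of a correct/endorsed response.

    Both arguments are expressed in logits.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def rating_scale_probabilities(ability, difficulty, thresholds):
    """Andrich rating scale model: probability of each response category.

    thresholds holds the step parameters between adjacent categories,
    so len(thresholds) + 1 probabilities are returned.
    """
    # Cumulative sums of (ability - difficulty - threshold); category 0 has sum 0.
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + (ability - difficulty - tau))
    weights = [math.exp(v) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical values in logits, chosen only for illustration.
p_equal = rasch_probability(ability=0.5, difficulty=0.5)
p_easy = rasch_probability(ability=1.5, difficulty=0.5)
likert = rating_scale_probabilities(0.5, 0.0, thresholds=[-1.5, -0.5, 0.5, 1.5])

print(round(p_equal, 2), round(p_easy, 2))
print([round(p, 3) for p in likert])
```

When ability equals difficulty, the dichotomous probability is exactly 0.5, and it rises as ability exceeds difficulty. The five category probabilities of the polytomous case always sum to 1.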

Validity of Instrument
Punch (1998) described validity as how much the measurement accurately represents a concept. Therefore, face validity and content validity should be assessed by field experts to check aspects of language, structure, and sentence order used in the questionnaire items and to see the suitability of the items with the components in the measurement (Darusalam & Hussin, 2016). The I-KEM instrument went through a process of face validity and content validity assessment by three experts in the fields of Arabic linguistics, Malay education, and quantitative research from the Arabic department and the Malay department, Faculty of Languages and Communication, Universiti Pendidikan Sultan Idris (UPSI).
Then, the researcher measured face validity and content validity using the Content Validity Index (CVI) to determine the average validity value that the experts assigned to the instrument. According to Davis (1992), the CVI value that meets the requirements is ≥ 0.80 for new or modified instruments. The CVI value can be measured for each item (I-CVI) or for the entire instrument (S-CVI). For the I-CVI value, the number of experts who assigned a rating of 3 or 4 is divided by the total number of experts who completed the assessment. The S-CVI value, in turn, is determined by dividing the sum of the I-CVI values of all items by the total number of items. A summary of the formula for measuring CVI is as follows (Polit et al., 2007):

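I-CVI = (Number of experts who rated the item 3 or 4) / (Total number of experts)
S-CVI = (Sum of the I-CVI values of all items) / (Number of items)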
Based on the formula, the researcher measured the face validity of the I-KEM instrument, and the CVI value obtained is 1.00, which indicates that face validity is high and accepted by the expert panel (Davis, 1992). Table 1 below shows the face validity as measured by the CVI value obtained for the I-KEM instrument.

Table 1 Face Validity of I-KEM Instrument
Table 2 shows the content validity with the CVI value obtained for the I-KEM instrument. Based on the CVI formula, the CVI value is 1.00, which indicates that the content validity of the I-KEM instrument is high and accepted by the expert panel.
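The CVI calculation described above can be sketched in a few lines. The ratings below are hypothetical (the experts' actual ratings are not reproduced here), the function names are illustrative, and a four-point relevance scale is assumed because ratings of 3 or 4 count as agreement.

```python
def item_cvi(ratings):
    """I-CVI: share of experts who rated the item 3 or 4."""
    relevant = sum(1 for r in ratings if r in (3, 4))
    return relevant / len(ratings)


def scale_cvi(ratings_by_item):
    """S-CVI: sum of the I-CVI values divided by the number of items."""
    i_cvis = [item_cvi(r) for r in ratings_by_item]
    return sum(i_cvis) / len(i_cvis)


# Hypothetical ratings from three experts for 21 items; every rating is 3 or 4.
ratings_by_item = [[4, 4, 3] for _ in range(21)]
s_cvi = scale_cvi(ratings_by_item)
print(f"S-CVI = {s_cvi:.2f}")

# A single low rating from one of three experts lowers that item's I-CVI to 2/3.
mixed = item_cvi([4, 2, 3])
print(f"I-CVI with one low rating = {mixed:.2f}")
```

With only three experts, one rating below 3 already drops an item below the 0.80 requirement, so a CVI of 1.00 implies that every expert rated every item 3 or 4.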

The experts agreed that the I-KEM instrument can measure the aspects of content that need to be measured. They also recommended some improvements, such as correcting spelling errors, rewording sentences that carry double meanings so as not to confuse the respondents, adding questionnaire items, and detailing the reasons why respondents learned Arabic so that these can serve as a reference for the development of the kit. Therefore, the researcher refined the I-KEM instrument based on the expert recommendations.

Pilot Study
This study conducted a pilot study to evaluate the consistency, stability, and repeatability of the findings. This process entails administering the questionnaire to a small group of respondents before it is distributed to the actual respondents (Asbulah et al., 2018). Reliability means the extent to which an instrument is free from measurement errors (Fraenkel & Wallen, 2003). According to Linacre (1994), the minimum stable and adequate pilot study sample is 30 people at a 95% confidence level, which can produce a stable and meaningful statistical analysis. The pilot study involved 50 respondents who were Bachelor of Arabic students from UiTM, UM, and UPM, similar to the target respondents for the actual questionnaire. For this purpose, the researcher appointed several enumerators to distribute informed consent forms and questionnaires online through Google Forms to the respondents. In this study, data collection from respondents was conducted based on the guidelines of the UPSI Research Ethics Committee (JKEPU, UPSI) (Ethical Reference Number: 2021-0249-01).

Results and Discussion
The results are aligned with the study's objectives: (1) identifying the reliability of the instrument, (2) identifying the polarity of the items measuring each construct, (3) examining the unidimensionality of items measuring a single construct, (4) measuring the standardized residual correlation values to avoid items overlapping or confusing each other, and (5) analyzing the suitability of the items measuring each construct. The instrument contained 21 items with a five-point Likert scale: 1 (strongly disagree), 2 (disagree), 3 (neutral), 4 (agree), and 5 (strongly agree).

1) Reliability Index
In the Rasch Model measurement, the ideal values for Cronbach's Alpha are between 0.71 and 0.99 (Bond & Fox, 2015). The interpretation of Cronbach's Alpha (α) values is shown in Table 3.

Table 3 Interpretation of Cronbach's Alpha Scores
Cronbach's Alpha score    Interpretation
0.9-1.0                   Very good and effective, with a high level of consistency
0.7-0.8                   Good and acceptable
0.6-0.7                   Acceptable
<0.6                      The item should be refined
<0.5                      The item should be removed

This study determined the instrument's reliability using statistical analyses grounded on the Rasch Model, as shown in Figure 1. Based on Figure 1, the Cronbach's Alpha (α) value for the I-KEM instrument is 0.94. Furthermore, the study found a person reliability index of 0.90 and a person separation index of 3.07. The results further indicate that if the 21 items were administered to a different group of respondents of similar abilities, there is a high expectation that the respondents' feedback in the questionnaire would be repeated (Bond & Fox, 2015). In addition, the reliability and separation values for items are listed in Figure 2 (Fox and Jones, 2005). Therefore, the I-KEM instrument has generally good consistency, with a value close to 1.0 (Bond & Fox, 2015). It also shows that the instrument could be used for the actual study.
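The person reliability and person separation values reported above can be cross-checked with the standard Rasch relation between the two indices, G = sqrt(R / (1 - R)), and its inverse R = G^2 / (1 + G^2). The sketch below is only a consistency check on the two reported figures; the function names are illustrative.

```python
import math


def separation_from_reliability(reliability):
    """Separation index G implied by a reliability R: G = sqrt(R / (1 - R))."""
    return math.sqrt(reliability / (1.0 - reliability))


def reliability_from_separation(separation):
    """Inverse relation: R = G^2 / (1 + G^2)."""
    return separation ** 2 / (1.0 + separation ** 2)


person_g = separation_from_reliability(0.90)  # reliability reported in the text
person_r = reliability_from_separation(3.07)  # separation reported in the text

print(f"G implied by R = 0.90: {person_g:.2f}")
print(f"R implied by G = 3.07: {person_r:.3f}")
```

A reliability of 0.90 implies a separation of about 3.00, and a separation of 3.07 implies a reliability of about 0.904, so the small gap between the two reported values is consistent with rounding the reliability to two decimal places.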

2) Polarity Item
Item polarity, also known as the Point-Measure Correlation Coefficient (PTME Corr.), refers to the correlation coefficient between a person's ability and item difficulty. This analysis seeks to evaluate whether a predetermined construct can accomplish its objectives. A positive PTME Corr. value (+) indicates that each item can measure the construct that is being assessed. On the other hand, a negative PTME Corr. value shows that the item needs to be examined to determine whether it should be improved or removed (Bond & Fox, 2007).
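The polarity check can be illustrated with a small sketch. It assumes the common way of computing the point-measure correlation, namely the Pearson correlation between the observed responses to an item and the respondents' Rasch measures. The person measures, item names, and responses below are hypothetical, and the helper functions are illustrative rather than part of any Rasch software.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical person measures (logits) and Likert responses to two items.
person_measures = [-1.2, -0.4, 0.1, 0.8, 1.5, 2.1]
responses = {
    "item_A": [2, 2, 3, 4, 4, 5],  # rises with the person measure
    "item_B": [5, 4, 4, 3, 2, 1],  # falls with the person measure
}

polarity = {name: pearson(scores, person_measures)
            for name, scores in responses.items()}
# Items with zero or negative correlation work against the construct.
flagged = [name for name, r in polarity.items() if r <= 0]

print({name: round(r, 2) for name, r in polarity.items()})
print("Items to review:", flagged)
```

In this invented data set, only the item whose scores fall as the person measure rises is flagged for review.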
Figure 3 shows that the PTME Corr. values for the I-KEM instrument are all positive, with no values below 0.20. This indicates that the items work in the same positive direction to measure the constructs (Linacre, 2002). A high PTME Corr. value indicates that items can differentiate individual abilities (Bond and Fox, 2007). The PTME Corr. value should be positive (> 0), meaning that the item measures in the same direction as the construct being measured. An item polarity of 0 or below contradicts the measured variable or construct (Linacre, 2007).

3) Unidimensionality
The benchmark for the PCA value (raw variance explained by measures), according to Linacre (2002), is > 60%. Furthermore, Runnels (2012) asserted that ideal PCA values range from 20% to 40%. Aziz et al. (2015) mentioned that the maximum unexplained variance in the first contrast should be 15%.
Figure 4 shows the unidimensionality of the I-KEM instrument, with a PCA (raw variance explained by measures) value of 60.5%, compared to the Rasch Model expectation of 61.5%. This is a very good value because it exceeds both the minimum level of 20% and the > 60% level (Linacre, 2002). The level of interference in the measured items (noise), that is, the unexplained variance in the first contrast, is 10%, which is within the maximum value of 15% (Aziz et al., 2015).

4) Standardized Residual Correlation
Overlapping and redundant items can be identified through the standardized residual correlation value. This is intended to avoid any confusion or misplaced objectives in instrument development. The standardized residual correlation value should not exceed 0.70. If two items exceed that value, only one item is retained, and the other needs to be dropped or refined so that the items are no longer correlated. Such a high correlation arises because the two items have similar characteristics or share another dimension (McNamara, 1996; Asbulah et al., 2018). Figure 5 shows the standardized residual correlation values for the I-KEM instrument.

Figure 5. The Standardized Residual Correlation Values
As shown in Figure 5, the I-KEM instrument contains two items with high standardized residual correlation values (exceeding 0.70), namely items B2 and B3. Both items were screened by checking whether their MNSQ values approach 1.00 and their ZSTD values approach 0.00 (Huei et al., 2020). After this screening, items B2 and B3 were retained following discussion between the researcher and supervisor as well as expert recommendations.
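The screening rule for locally dependent items can be sketched as follows: correlate the standardized residuals of every item pair and flag pairs above 0.70. The item labels mirror those in the text, but the residual values are invented for illustration, and the helper names are assumptions rather than functions of any Rasch package.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def flag_dependent_pairs(residuals, threshold=0.70):
    """Return (item, item, r) for pairs whose residuals correlate above threshold."""
    flagged = []
    for a, b in combinations(sorted(residuals), 2):
        r = pearson(residuals[a], residuals[b])
        if r > threshold:
            flagged.append((a, b, round(r, 2)))
    return flagged


# Hypothetical standardized residuals for three items across six respondents.
residuals = {
    "B1": [0.3, 0.8, -1.1, -0.2, 0.5, -0.3],
    "B2": [1.0, -0.5, 0.2, -1.1, 0.8, -0.4],
    "B3": [0.9, -0.6, 0.3, -1.0, 0.9, -0.3],
}

pairs = flag_dependent_pairs(residuals)
print(pairs)
```

A flagged pair is a candidate for review rather than automatic removal; as described above, the final decision also draws on fit statistics and expert judgment.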

5) Item Fit
The MNSQ infit value refers to the match between the response pattern and the measured item. In the Rasch Model analysis, item fit values are used to measure a latent variable, and Boone et al. (2014) stated that the infit and outfit Mean Square (MNSQ) values should range between 0.5 and 1.5. Items with MNSQ values above this range typically have high ZSTD values and fall outside the acceptable range of -2.0 ≤ ZSTD ≤ +2.0. MNSQ outfit values exceeding 1.5 indicate that the responses given by the sample are too random, such as erratic responses from low-ability respondents, easy questions left unanswered, or careless answers from high-ability respondents. The MNSQ infit and outfit values should be between 0.70 and 1.33 to indicate the suitability of items measuring latent variables or constructs (Bond & Fox, 2015). Conversely, an MNSQ outfit value of less than 0.5 indicates that the respondents' responses are highly predictable. The MNSQ outfit value must be considered before the MNSQ infit value when measuring a construct (Sumintono, 2017). Figure 6 shows the item fit for the I-KEM instrument. Figure 6 illustrates that all items are within the acceptable range of infit and outfit MNSQ values of 0.5-1.5. This indicates that these items are in a sufficient range for measurement (Linacre, 2002; 2006). The range of MNSQ is explained in Table 4 below.

MNSQ Values    Measurement Implications
>2.0           Distorting or weakening the measurement system. Probably due to only one or two observations.
1.5-2.0        Less effective for measurement construction but not weakening.
0.5-1.5        Effective enough for measurement.
<0.5           Less effective for measurement but not debilitating. Likely to result in confusing reliability and separation coefficients.
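The MNSQ ranges in Table 4 and the ZSTD range quoted above can be expressed as a simple screening routine. The sketch below is illustrative only: the item names and fit statistics are hypothetical, the interpretation strings are shortened from Table 4, and the handling of the exact boundary values (0.5, 1.5, and 2.0) is an assumption because the table does not specify it.

```python
def interpret_mnsq(mnsq):
    """Map an infit or outfit MNSQ value to a shortened Table 4 interpretation."""
    if mnsq > 2.0:
        return "Distorts or weakens the measurement system"
    if mnsq > 1.5:
        return "Less effective for measurement construction but not weakening"
    if mnsq >= 0.5:
        return "Effective enough for measurement"
    return "Less effective for measurement but not debilitating"


def within_zstd_range(zstd, limit=2.0):
    """True when -limit <= ZSTD <= +limit."""
    return -limit <= zstd <= limit


# Hypothetical (MNSQ, ZSTD) pairs for three items.
items = {"C1": (0.92, -0.4), "C2": (1.63, 2.3), "C3": (0.41, -2.6)}

report = {name: (interpret_mnsq(mnsq), within_zstd_range(zstd))
          for name, (mnsq, zstd) in items.items()}

for name, (meaning, zstd_ok) in report.items():
    print(name, "|", meaning, "| ZSTD within range:", zstd_ok)
```

An item such as the hypothetical C1, with MNSQ near 1.00 and ZSTD near 0.00, is the pattern the analysis looks for when deciding whether an item can be retained.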

Conclusion
The pilot study results indicate that validity and reliability tests should be performed as the initial step of instrument development to ensure that the instrument is reliable and will yield precise data.Each item in the instrument should be tested according to standard indices and conditions of the Rasch Measurement Model.Items exceeding the range of fit items should either be refined or removed according to experts' views and consensus.In this regard, the refined instruments showed better reliability.Overall, the validity and reliability tests based on the Rasch Model showed that the I-KEM instrument has good validity and high reliability.This indicates that all 21 items can measure the constructs.Therefore, these findings explain that the I-KEM instrument is suitable for use by university-level students.The findings from the analysis can guide researchers to create highly valid and reliable instruments to ensure that the measurements can meet the study's goals.Accordingly, this study proposes the development of pedagogical applications such as media kits related to Arabic communication skills that are relevant and appropriate to help students master the Arabic language.In addition, this study also aims to guide Arabic language teachers to develop innovative and student-centered teaching tools to further enhance the communication competence of non-Arabic speakers in Malaysian tertiary institutions.

Figure 1. Reliability and Person Separation Values for the I-KEM Instrument

Figure 2. Reliability and Item Separation Values for the I-KEM Instrument

Figure 3. Polarity of the I-KEM Instrument Items

Figure 4. Unidimensionality of the I-KEM Instrument

Figure 6. Item Fit for the I-KEM Instrument

Table 4 Description of Infit and Outfit MNSQ