Assessment of Higher-Order Thinking Skills: Is it Simply Determined by Verbs?

The assessment of higher-order thinking skills (HOTS) is an important practice in the teaching and learning process. However, various issues arise in the process of developing assessment tasks, namely i) a vague and diverse understanding of the HOTS concept, and ii) reliance on the verb as a determinant to categorize the thinking ability level in the assessment task. Educators should master the fundamental principles for developing good-quality assessment tasks with a high degree of validity. Besides, educators should hold a clear concept of HOTS, which can be viewed in terms of three main categories, namely transfer, critical thinking and problem solving. Various strategies are presented for assessing problem-solving, creative and innovative, and decision-making abilities. In the scoring process, apart from the accuracy of facts and answers, the assessor should also emphasize the response structure in terms of the ability to apply, analyze or synthesize the concept or content learned, as well as the ability to understand and use the information provided in the scenario. The detailed discussion of the main principles for assessing HOTS, the concept of HOTS, the various strategies to assess HOTS and the scoring of the response structure for HOTS tasks provides a thorough and comprehensive overview of the HOTS task development process. This information is vital for educators and teachers, who should play the role of professional assessors.


Introduction
The assessment of higher-order thinking skills (HOTS) is an important practice in the teaching and learning process. Nowadays, it is vital in producing a young generation able to compete in the 21st-century world (Saido et al., 2018). Educators from the school level to higher institutions shoulder a heavy responsibility in this initiative. They play the main role not only as instructors but also as assessors who continuously identify their students' HOTS development. The question is, are educators already equipped with the appropriate professional assessment skills and knowledge to assess students' HOTS? What are the existing practices in developing HOTS assessment tasks? An assessment will lose its meaning and have a negative impact on students and teachers if there is no emphasis on the process of developing, administering, interpreting and reporting assessment data. In other words, a weak degree of assessment validity brings various negative implications that are very detrimental to the education system. Unfortunately, this issue is often ignored by the parties involved, whether consciously or not. Thus, this paper focuses on the development of HOTS assessment tasks, as it is the most important stage in the HOTS assessment process and raises many serious issues that have not been resolved. The detailed discussion of the main principles for assessing HOTS, the concept of HOTS, the various strategies to assess HOTS and the scoring of the response structure for HOTS tasks provides a thorough and comprehensive overview of the HOTS task development process. This information is vital for educators and teachers, who should play the role of professional assessors.
Issues in the Assessment of Higher-order Thinking Skills
1. Vague and diverse understanding of the HOTS concept among educators. The concept of HOTS held by an educator directly affects the effectiveness of his or her assessment practice (Yeung, 2015; Yusoff & Seman, 2018). Recent studies have found that Malaysian teachers still have problems understanding the concept of HOTS, although the development of HOTS in teaching and learning has been implemented in the education system for more than 10 years (Norfariza & Nur Fadhillah, 2018). According to a study by Yusoff and Seman (2018), the majority of primary school teachers in Kuala Terengganu were unable to give a clear explanation of the concept of HOTS, and only 50 percent of the teachers involved had ever asked HOTS questions based on the Bloom model in their teaching and learning. In addition, Mahmad et al. (2021) revealed that although the teaching and learning of HOTS has been emphasized since 2013, the teacher-centered system has hindered efforts to develop students' HOTS. These findings indicate that the conception held by school teachers influences their initiative and enthusiasm in preparing effectively for the teaching and learning of HOTS.
The same problem is faced in other countries. For example, although Indonesian educators are aware of the importance of developing 21st-century thinking among students, they face various challenges in teaching and assessing HOTS, such as limited knowledge of the HOTS concept and of the development and application of HOTS tasks. In another study, Wilson and Narasuman (2020) revealed that school teachers confronted various obstacles in integrating the assessment of HOTS into the school-based assessment system, namely limited skill in developing HOTS tasks, limited professional assessment knowledge, and insufficient professional training provided by the authorities. Besides, Fensham and Bellochi (2013) claimed that the majority of the assessment items in Australian high school chemistry courses were categorized as lower-order thinking items. Similarly, FitzPatrick et al. (2015) stated that only around 30% of the assessment items in Canadian pharmacy courses required HOTS. One of the main factors behind these problems is the HOTS assessment skills possessed by educators.
2. Reliance on the verb as a determinant to categorize the thinking ability level in the assessment task. Bloom's model has been applied in the teaching and learning process for a long time, especially in the assessment process. This model has become the main reference for schools and higher education institutions in categorizing the thinking skill levels of assessment tasks in a hierarchical manner. It has also become an important reference for higher education quality monitoring bodies, such as the Malaysian Qualifications Agency (MQA), in categorizing learning outcomes and developing assessment tasks. However, the verbs proposed for each level in the model are always used as the main determinants of the thinking level assessed by the task. The top three levels of this model, namely analyzing, evaluating, and creating, are usually categorized as higher-order thinking levels. The suggested key words for these three levels are listed in Table 1. The key words (all verbs) are provided only as a guide to strengthen the understanding of the concept at each level of the model. They should not be used as the main determinant of the cognitive level to be assessed by a task, yet they have often been misused in this way over the years. In fact, the categorization of the level to be assessed by a HOTS task should be based on the context of the content and the degree of originality of the task. For instance, if the educator has already discussed the comparison between theory A and theory B with students during the teaching and learning process, tasks or questions about the differences or similarities between theory A and theory B should no longer be categorized at the analyzing level. They may assess only the students' understanding or remembering level. In addition, a given verb does not represent only one cognitive level. For example, an assessment task that assesses the students' ability to 'select' carries different expectations at the understanding level and at the evaluating level, depending on the context and content of the task. At the understanding level, 'select' means that the students are able to select the correct representation to explain a certain concept or meaning, whereas at the evaluating level it focuses on the students' ability to select the correct information to justify and criticize a point of view. This clearly shows that if the assessor ignores the principle of content validity at the initial stage of the assessment process, the education system will bear the negative impact.
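The contrast between verb-based and context-based categorization can be illustrated with a minimal Python sketch. It is only an illustration of the argument above, not a validated instrument: the lookup table contains just the two verbs discussed in the text, and the function names, the `already_taught` flag and the decision rule are hypothetical simplifications.

```python
from typing import List

# Illustrative lookup only: the two verbs discussed in the text.
# 'select' appears at two different cognitive levels.
VERB_LEVELS = {
    "compare": ["analyzing"],
    "select": ["understanding", "evaluating"],
}


def level_by_verb(verb: str) -> List[str]:
    """Verb-only categorization: every level the verb might signal."""
    return VERB_LEVELS.get(verb, [])


def level_by_context(intended_level: str, already_taught: bool) -> str:
    """Context-aware categorization (hypothetical rule).

    If the exact content of the task was already discussed in class,
    the task is treated as assessing remembering or understanding,
    whatever verb it uses. Otherwise the intended level stands.
    """
    if already_taught:
        return "remembering or understanding"
    return intended_level


# The verb alone is ambiguous for 'select'.
print(level_by_verb("select"))
# A 'compare' task on material already compared in class is not analyzing.
print(level_by_context("analyzing", already_taught=True))
```

The verb lookup returns more than one candidate level for 'select', while the context-aware rule needs information that only the educator has, namely whether the content was already covered in class.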

The Main Principles for Assessing HOTS
Before discussing the main principles for assessing HOTS, educators should master the three fundamental principles for developing good-quality assessment tasks with a high degree of validity (Brookhart & Nitko, 2019), namely:
1. The content and scope of assessment tasks should be able to assess the main and intended learning standards or learning outcomes for the topic or chapter to be assessed. The test specification table is usually the main reference to ensure that the intended learning standards are representative of the content domain and include different cognitive levels. This principle aims to avoid the development of assessment tasks that focus on trivial learning standards that are too easy or too difficult.
2. The assessment tasks developed should be in line with the cognitive level of the learning standards to be assessed. In other words, the skills and knowledge assessed by a task should be aligned with the learning standards. A task should not go beyond the intended cognitive level, as this will cause students to guess the answer or fail to respond. Meanwhile, assessment tasks that are too easy cannot assess the students' true ability and achievement.
3. The use of clear and simple language and sentence structure in assessment tasks is very important to ensure that students' true abilities can be assessed. Besides, the instructions in the assessment task should be clear and easy to understand so that students can follow the assessment procedures and requirements accordingly. This principle also states that hints or clues should not be provided in the assessment task, as they help students reach the answer easily. At the same time, candidates should not be trapped by language factors that make the tasks or assessment instructions difficult to understand. These problems will definitely affect the quality of the assessment.
In the development of quality HOTS tasks, educators should master three additional principles, namely:
4. Using a relevant source to encourage and develop students' HOTS ability. Visual materials, text, and other types of resource materials can be used as scenario sources for the development of assessment tasks in either the selected-response or the constructed-response item format. In addition, students can use the information in the resources to support their responses to HOTS assessment tasks, such as when carrying out research or a project.
5. Using novel material to challenge the students' HOTS ability. Novel material in this context means original source material that has never been explored in the teaching and learning process. If the material is not novel, it may only assess the students' memorization skills. In this context, only the educator knows whether the material is novel or not. The use of novel material does not mean that the assessment involves content not taught in the classroom; rather, it identifies the students' ability to apply their knowledge and skills in a new context.
6. Distinguishing between difficulty and HOTS in an assessment task. An assessment task is said to be difficult if it assesses skills and knowledge that go beyond the cognitive level of the intended learning standard. HOTS assessment, in contrast, emphasizes the development of critical and creative thinking abilities as well as problem-solving abilities in a new scenario. Thinking may involve both simple and difficult processes, and some problems may be solved through alternative ways of thinking. Similarly, memorization is not necessarily as easy a process as expected. People often believe that low-ability students should focus on memorization because they are not able to achieve HOTS, a belief that confuses difficulty with HOTS.
After understanding the main principles of assessing HOTS, educators should hold a clear concept of HOTS.
According to Brookhart (2010), definitions of HOTS fall into three main categories, based on an analysis of the various definitions expressed by researchers and experts: transfer, critical thinking and problem solving. The transfer category emphasizes meaningful learning; that is, learning does not only involve memorization but should focus on the students' ability to 'transfer' the knowledge and skills acquired to designing and inventing new products or to solving problems. In other words, students should be able to demonstrate their ability to apply all the knowledge and skills learned. The second category is critical thinking. It involves the element of reflective thinking, namely analyzing conclusions about everything that is believed and done. It also involves artful thinking, which includes thinking about producing an idea in the form of an image, feeling or art. This type of thinking is very important in supporting thoughtful learning, such as reasoning, investigating, observing, comparing, looking for complexity, and exploring different points of view. The third category is problem solving. This category describes the students' initiative to achieve a certain result or solution when encountering a problem. Of course, the solution will not be achieved directly or automatically. The journey toward the solution needs to be the focus in the development of HOTS because it involves several complex thinking processes. When facing a situation, students need to apply the relevant information, learn to understand, evaluate ideas critically, form creative alternatives, and communicate effectively. All these definitions are similar to the definition of HOTS discussed by the Malaysian Curriculum Development Division (2014). HOTS is the main focus of 21st-century skills development. In general, the development of students' thinking skills covers critical and creative thinking skills. Both types of skills involve reasoning processes and structured thinking to solve problems, create products, innovate and make decisions. HOTS refers to the students' ability to apply skills, knowledge and values to solve problems, create new products, innovate and make decisions. All these aspects are interrelated and can be assessed simultaneously.

Strategies to Assess HOTS
If the assessor relies only on verbs to develop the HOTS assessment task, he or she may fail to explain and justify clearly the types of thinking being assessed. Therefore, assessors should equip themselves with the skills to select the appropriate strategy when developing the assessment task. Various strategies that can be applied in assessing the development of HOTS in terms of problem solving, creating, innovating and making decisions are discussed below.
1. Strategies of HOTS task development for assessing problem-solving abilities
a. Identifying the problem: Presented with a scenario, students are asked to identify the main problem that needs to be solved.
b. Demonstrating linguistic understanding: Presented with several problems, students are required to outline and solve the key phrases or vocabulary related to the context of the problems. Students also need to explain the meaning of linguistic features in the problems in their own words.
c. Identifying the irrelevant: Relevant and irrelevant material is presented along with the problem. Students are required to identify the information that is irrelevant to finding the solution.
d. Sorting the problem cards: A collection of problems or several types of problems is presented. Students are required to sort the problems into categories or groups and then explain the reasoning behind their sorting. This strategy is particularly suitable for assessing students' ability to group problems that can be solved using the same principle, theory or formula.
e. Identifying the obstacles: A complex problem is presented with some important information deliberately omitted. Students are required to explain why the problem is difficult to solve, the types of obstacles encountered, and the additional information needed.
f. Justifying the strategy selected: The problem is presented along with two or more solutions. Students are required to explain why the strategies are correct. They need to make sure that the problem can be solved with various strategies.
g. Integrating the data: A problem statement is presented with source materials such as stories, cartoons, graphs and tables. Students are required to use the source material to solve the problem and to explain the procedure for reaching the solution.
h. Using analogy: Presented with a problem statement and a correct solution, students are required to describe other problems that can be solved with the same strategy. They should also provide a strong justification for the problems they formed.
i. Solving the problems backwards: A complex problem situation or task involving various solution steps is presented. Students are required to solve the problem backwards; that is, starting from the solution, they need to make a plan or strategy to solve the problem. For example, students can be asked to plan the steps or the time frame for completing a project task.
2. Strategies of HOTS task development for assessing creative and innovative abilities
a. Identifying the assumptions: A problem statement is presented and students are required to propose a creative and innovative solution. Students also need to state their assumptions about the problem in order to reach the solution.
b. Elaborating multiple strategies: A problem situation is presented and students are required to solve the problem in two or three creative ways. The solutions should be shown in the form of a diagram, picture or graph.
c. Modeling the problem: A problem is presented and students are required to draw a diagram or picture that represents and explains the problem situation clearly and creatively.
d. Generating alternative strategies: A problem statement is presented and students are required to state two or more unique and creative solutions. Students can also be required to decide on the most effective solution.
e. Focusing on the question: A problem statement, government policy or experiment is presented. Students are required to state the main problem and the criteria for evaluating the arguments in the problem.
f. Clarifying the questions: A description of, or an argument about, a situation is presented. Students are required to construct questions to be addressed to the author or speaker. They also need to state the reasons for the questions they constructed.
3. Strategies of HOTS task development for assessing decision-making ability
a. Justifying a solution: A problem statement is presented along with two or more possible solutions. Students are required to identify the most appropriate solution and justify their selection.
b. Evaluating and determining the quality of solutions: A problem is presented and students are required to evaluate the different solutions. Students can also be asked to present different solutions and evaluate their quality and effectiveness.
c. Evaluating the solutions systematically: Building on strategy (b), students are assessed on their ability to follow systematic procedures in evaluating the proposed solutions.
d. Assessing the credibility of sources: Texts, arguments or advertisements are presented. Students are required to state which parts of the source are credible and which are not.
e. Reasoning deductively to make decisions: A problem statement is presented with one logical conclusion and two or more illogical conclusions. Students are required to determine the appropriate and logical conclusion to apply.
f. Evaluating inductively to make decisions: A problem statement or a set of data is presented. Students are required to determine logical and appropriate conclusions and to justify them.
g. Making a value judgment: A description of a situation, a problem statement and an expected solution are presented. Students are required to determine the appropriateness of the solution in terms of the value to be tested, and must state strong reasons.

Response Structure for the HOTS Tasks
Response structure refers to the structure or pattern of the student's answer in responding to an assessment task. It reflects the students' cognitive abilities regarding the assessed content. In the scoring process, apart from the accuracy of facts and answers, which remain the main criteria, the assessor should also emphasize the response structure in terms of: 1. the ability to apply, analyze or synthesize the concept or content that has been learned, and 2. the ability to understand and use the information provided in the scenario. The assessment of these two aspects is interrelated. A weak response structure indicates only the ability to quote a piece of information directly from the scenario without relating it (for example, through analysis) to the concepts or facts that have been learned. Although the information is used correctly, the response structure is loose and incomplete. A good response structure, by contrast, shows the ability to relate all the relevant information provided in the scenario to the concepts or facts that have been mastered in order to make a robust and comprehensive inference, generalization or analysis. According to Biggs and Collis (1982), the quality of a student's response structure can in general be classified into four main levels, from concrete to abstract and complex. Although the characteristics of the response structure are general, they are suitable and easy to apply, especially in writing hierarchical scoring rubric descriptions.
At the first level (unistructural), one aspect of the basic information provided in the scenario is understood and applied in the response, while other aspects of the information are neglected. The structure of the response shows a failure to relate the information to the concepts or facts learned.
At the second level (multistructural), some or all of the relevant information provided in the scenario is understood and applied in the response, but it is not related to the concepts or facts that have been learned. The information is still used directly, without analysis.
The third level (relational) shows the ability to relate all the relevant information provided in the scenario in order to make an analysis, generalization or synthesis based on the concepts or facts learned.
The highest level (extended abstract) shows not only the ability to relate all the relevant information provided in the scenario, but also the ability to make predictions, form hypotheses or find alternative solutions based on the concepts or facts that have been mastered as well as the student's existing knowledge.
Based on the characteristics of these four levels, the two lower levels (unistructural and multistructural) are called surface levels, as they involve only the direct use of the information provided without relating it to the concepts or facts learned. The two higher levels (relational and extended abstract) are known as deep levels. They demonstrate higher thinking ability because the information provided is related to concepts or facts in order to make generalizations or to form hypotheses or predictions. The characteristics of this hierarchical response structure are suitable for preparing more detailed and systematic descriptions in the scoring rubric.
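The four-level hierarchy can be sketched as a simple rubric rule in Python. This is a minimal sketch under stated assumptions, not the authors' scoring procedure: the `ResponseFeatures` fields, the function names and the handling of borderline cases (for example, a response that relates only part of the relevant information) are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ResponseFeatures:
    """Hypothetical features an assessor records for one response."""
    aspects_used: int          # pieces of relevant scenario information used
    aspects_available: int     # pieces of relevant information in the scenario
    relates_to_concepts: bool  # information is related to learned concepts or facts
    goes_beyond: bool          # makes predictions, hypotheses or alternative solutions


def response_level(r: ResponseFeatures) -> str:
    """Map recorded features to one of the four levels described above."""
    if r.aspects_used < 1:
        # Responses that use no scenario information are outside the four levels.
        raise ValueError("at least one aspect of the scenario must be used")
    # Relational and extended abstract require all relevant information
    # to be related to the concepts or facts learned.
    if r.relates_to_concepts and r.aspects_used >= r.aspects_available:
        return "extended abstract" if r.goes_beyond else "relational"
    # Assumption: anything less is scored by how much information is used directly.
    if r.aspects_used == 1:
        return "unistructural"
    return "multistructural"


def depth(level: str) -> str:
    """Group the four levels into surface and deep levels."""
    return "deep" if level in ("relational", "extended abstract") else "surface"


# Example: three of three aspects used and related, with a prediction added.
level = response_level(ResponseFeatures(3, 3, True, True))
print(level, depth(level))
```

In a real rubric each level would carry a task-specific description, and the features would come from the assessor's judgment of the response rather than from simple counts.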

Conclusion
HOTS assessment is not a simple process. Moreover, the assessment needs to prioritize achieving a good degree of validity. Each stage of task development, as well as the development of the scoring rubrics, should be carried out professionally. Besides, the newly developed task and scoring rubric should be evaluated by an experienced party to determine their content-based evidence of validity. A pilot study is also highly encouraged to ensure that the language used, the question instructions and the descriptions in the scoring rubrics can be easily understood by all parties involved.