Perception Scale of Online Learning in the Indonesian Context During the Covid-19 Pandemic: Psychometric Properties Based on the Rasch Model

This study evaluates the psychometric properties of a scale measuring students' perceptions of online learning during the Covid-19 pandemic in the Indonesian cultural context. The study involved 176 students (54% male and 46% female) at the junior and senior high school levels from public schools in Yogyakarta, Indonesia. The respondents' ages ranged from 11 to 17 years, with a mean of 13.5 years and a standard deviation of 1.4 years. The online learning perception scale (POSTOL) adopts 16 items developed by Bhagat and colleagues. The psychometric properties of the scale were evaluated based on person and item reliability, item fit to the Rasch model, the functioning of the 5-point rating scale, and unidimensionality. The results show that the scale has good consistency and performance in the Indonesian context. The sixteen items fit the model well and are unidimensional. A 4-point Likert rating scale is more effective than the original 5-point rating scale. The 16 POSTOL items therefore have adequate psychometric properties for use with students in Indonesia. This is an open-access article under the CC-BY-SA license.


I. Introduction
The Covid-19 pandemic has forced sudden and unanticipated shifts in the learning modes of students and teachers. In various parts of the world, including Indonesia, online learning has been regarded as the best temporary approach to reducing the rate of Covid-19 transmission. Supporting online learning methods is crucial for minimizing the effects of pandemics on education [1]-[3]. Online learning media have increased students' knowledge and skills [4], [5]. However, because these changes were rapid and unpredictable, students may not be fully prepared for online learning [6], [7]. Analyzing student perceptions of online learning will thus assist teachers and stakeholders in developing subsequent policy.
Students' perception of online learning is an essential issue in online education [8]. Students perceive both benefits and drawbacks in internet-based learning [9], [10]. A good attitude toward online learning helps with its integration and the success of the process [11]. On the other hand, online learning strategies burden students and parents [12]-[14]. The availability of resources and support for learning needs influenced students' learning throughout the pandemic [15]. These facilities include hardware, software, and internet connections [16], [17], and internet access in Indonesia is still uneven [8], [18]. Numerous studies have examined students' perceptions of online learning over the past two years [19]-[23]. One of the scales developed to measure this perception is the Perception of Students Towards Online Learning (POSTOL). The scale was developed by Bhagat et al. [24] and has been administered to undergraduate, master's, and doctoral students in Taiwan. POSTOL's quality has been evaluated using classical test theory. POSTOL is formed by four factors: Social Presence (SP), Instructor Characteristics (IC), Instructional Design (ID), and Trust (TR). However, because of the differences in educational levels and cultural circumstances, this scale cannot be directly applied to students in Indonesia. Therefore, an adaptation process is required to assess POSTOL's psychometric attributes [25]-[27].
Students' perceptions of online learning need to be evaluated immediately to identify what has supported and what has hindered its implementation over the last two years. Such an evaluation aims to increase the effectiveness and efficiency of online learning so that it does not cause learning loss in students. The evaluation will provide relevant and accurate information only when it uses a scale with good psychometric properties.
The previously reported evaluations, which used classical test theory approaches (EFA and CFA), have not given comprehensive information on the scale's psychometric properties. A contemporary test theory, the Rasch model, is therefore needed to complement them. The Rasch model provides additional in-depth details on psychometric properties. Examples of psychometric properties that cannot be described by classical test theory include the functioning of the Likert rating scale, unidimensionality, the scale's bias with respect to respondent demographics, and item fit (item difficulty level and respondent ability) [28]. Therefore, this study aims to evaluate, using the Rasch model, the psychometric properties of POSTOL among high school students in the Indonesian cultural environment. This paper adds to the evidence on POSTOL's psychometric performance across cultural boundaries.

II. Theory
Cross-Cultural Adaptation Process
One of the common mistakes in adapting measuring instruments is relying solely on translation from the original language to the target language. The adaptation process is more than merely translating the instrument; it must also take into account the socio-cultural situation of the target users. It is widely understood that items must be both linguistically translated and culturally contextualized if a measurement tool is to be used across cultures. Linguistic and cultural adaptation of the items seeks to uphold the conceptual validity of the instrument's content across cultural contexts.
The self-report scale was translated and culturally adapted following the cross-cultural adaptation guidelines of Beaton et al. [29]. The final version was approved against generally accepted standards for choosing measuring tools [26]. The adaptation process aims to ensure that the source and target questionnaires are semantically, idiomatically, experientially, and conceptually equivalent. The suggested procedure for cross-cultural adaptation is shown in Figure 1.

POSTOL Scale
POSTOL is one of the scales developed by Bhagat et al. [24] to evaluate students' perceptions of the implementation of online learning. Due to the Covid-19 pandemic, almost all learning currently takes place online, shifting from the face-to-face mode used previously. Implementing online education is one strategy to stop the spread of the Covid-19 virus. Learning activities have taken place online for more than two years. Students have had varied experiences during online learning, which leads to different perceptions among individual students.
The POSTOL scale developed by Bhagat et al. [24] consists of four dimensions: Social Presence (SP), Instructor Characteristics (IC), Instructional Design (ID), and Trust (TR). These four dimensions were established through a two-stage factor analysis. The first stage was an Exploratory Factor Analysis (EFA), in which the items were grouped according to the patterns in the data. In the second stage, the structure formed in the EFA was re-confirmed through Confirmatory Factor Analysis (CFA), which yielded 16 items that fit the model.

Rasch Model
The Rasch model was first developed by the Danish mathematician Georg Rasch [30]. Rasch modeling is part of Item Response Theory (IRT) and uses a single item parameter, the item difficulty level, which is considered together with person ability [31]. The Rasch model was developed to measure latent human traits, covering both cognitive and noncognitive aspects (such as opinions or perceptions). Because the target of measurement is a latent variable, the Rasch model serves to turn an instrument into a measuring scale comparable to a measuring instrument in physics. The fundamental idea behind the Rasch model is therefore to create a logit ruler with the same interval scale for both item difficulty and person ability [32]. This model can create a hierarchy among persons (test takers or students) and among test items [33]. The Rasch model is probabilistic: the probability that a student gives a particular response depends on the comparison between the person's ability and the item's difficulty. The raw scores are transformed using a logarithmic function so that person ability and item difficulty can be compared directly.
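As an illustration of the probabilistic idea described above, the following minimal sketch shows the dichotomous form of the Rasch model and the log-odds (logit) transformation of a raw score. The function names and the example values are hypothetical and are not taken from this study; the raw-score logit is only a first approximation, since Rasch software such as Winsteps estimates measures iteratively.

```python
import math


def rasch_probability(theta: float, b: float) -> float:
    """Probability of a positive response under the dichotomous Rasch model.

    theta is the person ability and b the item difficulty, both in logits.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def raw_score_to_logit(score: float, max_score: float) -> float:
    """Log-odds of a raw score proportion (undefined at 0 and at max_score)."""
    p = score / max_score
    return math.log(p / (1.0 - p))


# When ability equals difficulty, the probability is exactly one half.
print(rasch_probability(0.5, 0.5))
# A person located above an item has a better-than-even chance of endorsing it.
print(rasch_probability(1.0, -1.65))
# Hypothetical raw score of 12 out of 16 expressed on the logit scale.
print(raw_score_to_logit(12, 16))
```

Because persons and items are placed on the same logit scale, only the difference between ability and difficulty matters for the response probability.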

Psychometric Properties
When choosing and employing an instrument to measure unobservable constructs, it is critical to examine its psychometric properties [34]. Psychometric properties refer to the validity and reliability of a measuring instrument [35]. A scale must be thoroughly analyzed before it can be declared to have good psychometric properties, meaning that it is trustworthy and valid [36]. The main activities in psychometrics include translating psychological theories into psychological measuring and testing tools, as well as developing and analyzing the data from these measurements [37].
Measurement qualities such as measurement invariance, internal consistency, and structural validity have been investigated extensively in education using Rasch analysis as a contemporary psychometric approach [38]. The aspects investigated to evaluate psychometric properties include (a) person and item reliability, person and item separation indices, and internal consistency, (b) item fit to the model and item difficulty level, (c) principal component analysis (PCA) of residuals for structural validity, and (d) differential item functioning (DIF) to assess measurement invariance [28], [39]-[41].

III. Method
Participants
The sample size must be determined to ensure the stability of the estimation results. A minimum sample size of 50 people is needed to reach an accuracy of 1 logit with a confidence level of 99 percent [42]. Ling Lee et al. [43] suggest using between 50 and 250 respondents to evaluate the goodness of the model. The 176 respondents were therefore considered to satisfy the minimum sample size. Three of the 176 respondents were excluded from the analysis because they were outliers. Table 1 lists the respondents' demographic information.

Instrument
The Indonesian version of POSTOL was translated from the scale of students' perceptions of online learning developed by Bhagat et al. [24]. POSTOL consists of 4 factors/dimensions: Social Presence (SP, 5 items), Instructor Characteristics (IC, 5 items), Instructional Design (ID, 3 items), and Trust (TR, 3 items). The translation was carried out by lecturers in the English language field using forward-backward translation techniques [44]. A WhatsApp survey was created from the translation and sent to prospective respondents. Local school teachers assisted in the two-week data collection process. The researchers guaranteed the confidentiality of the information provided by the respondents, and student participation was optional. This was emphasized when introducing the instrument so that respondents felt free in giving their responses.

Data Analysis
Four key aspects were used to assess the psychometric properties of POSTOL: reliability, model fit, the functioning of the 5-point Likert rating scale, and unidimensionality. The Rasch model was used to examine these properties. The data were analyzed using Winsteps 4.6.1 and Microsoft Excel. A cut-off value of ≥ 0.70 was used to indicate acceptable reliability, as recommended in [45]. Item fit to the model was evaluated using the Infit MnSq and Outfit MnSq criteria, with an acceptable range of 0.5-1.5 [46]. The functioning of the Likert rating scale was evaluated according to the criteria used by Llamas-Ramos et al. [47]. Unidimensionality was evaluated based on the raw variance explained by measures and the unexplained variance in the 1st contrast.
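To make the fit criterion concrete, the sketch below computes Infit and Outfit mean squares for one item using their standard definitions (Outfit as the mean of squared standardized residuals, Infit as the information-weighted mean square) and checks them against the 0.5-1.5 range. The function names and the observed, expected, and variance values are hypothetical; in practice the model-expected scores and variances come from the fitted Rasch model, and the study itself obtained these statistics from Winsteps.

```python
def fit_mean_squares(observed, expected, variance):
    """Return (infit, outfit) mean squares for one item.

    observed: responses given by each person to the item
    expected: model-expected score for each person
    variance: model variance of each response
    """
    sq_resid = [(x - e) ** 2 for x, e in zip(observed, expected)]
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / w for r, w in zip(sq_resid, variance)) / len(sq_resid)
    # Infit: squared residuals weighted by the model variance (information).
    infit = sum(sq_resid) / sum(variance)
    return infit, outfit


def within_fit_range(mnsq, low=0.5, high=1.5):
    """Check a mean square against the acceptance range used in this study."""
    return low <= mnsq <= high


# Hypothetical responses of four persons to one item.
infit, outfit = fit_mean_squares(
    observed=[4, 3, 5, 2],
    expected=[3.5, 3.0, 4.0, 3.0],
    variance=[0.5, 1.0, 0.5, 1.0],
)
print(infit, outfit, within_fit_range(infit), within_fit_range(outfit))
```

Values near 1.0 indicate that the responses are about as variable as the model predicts; values above the upper bound signal more noise than expected.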

IV. Results and Discussion
Summary Statistics of POSTOL
An overview of the statistical findings from the POSTOL instrument adaptation is presented in Table 2. The item and person separation indices are 8.00 and 1.67, respectively. Person reliability is 0.74, and item reliability is 0.98. The test reliability, indicated by Cronbach's alpha, is 0.75. Table 3 summarizes the fit indices of the 16 POSTOL items in entry order. Based on Table 3, the Infit MnSq ranges from 0.74 to 1.45, while the Outfit MnSq ranges from 0.71 to 1.52. The item analysis yields difficulty levels ranging from -1.65 to 1.19 logits, and the model standard error (S.E.) ranges from 0.08 to 0.17 logit. ID3 is the easiest item, with a model S.E. of 0.17, whereas TR2 is the most difficult item, with a model S.E. of 0.08. The mean item measure is 0.00, and the standard deviation is 0.93.
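The separation indices and reliabilities reported above are linked by a standard Rasch relationship, R = G² / (1 + G²), where G is the separation index. The short sketch below applies this relationship, together with the commonly used strata formula H = (4G + 1) / 3, to the reported values. The function names are hypothetical, and the sketch is only a consistency check, not the procedure Winsteps uses internally.

```python
def separation_to_reliability(g: float) -> float:
    """Reliability implied by a separation index G: R = G^2 / (1 + G^2)."""
    return g ** 2 / (1.0 + g ** 2)


def strata(g: float) -> float:
    """Number of statistically distinct levels: H = (4G + 1) / 3."""
    return (4.0 * g + 1.0) / 3.0


# Separation indices reported in Table 2.
person_separation = 1.67
item_separation = 8.00

print(round(separation_to_reliability(person_separation), 2))  # 0.74
print(round(separation_to_reliability(item_separation), 2))    # 0.98
print(round(strata(person_separation), 2), round(strata(item_separation), 2))
```

The implied reliabilities match the reported person reliability of 0.74 and item reliability of 0.98.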

Likert Rating Scale
Rating scale analysis was performed to examine the functioning of the 5-point Likert rating scale used in POSTOL. Table 4 shows the structure of the rating scale. In the second column, most responses fall in categories 5 (Strongly Agree), 4 (Agree), and 3 (Doubtful). The third column shows the average measure of the persons who chose each category; this average increases monotonically. In the fourth and fifth columns, the Infit MnSq ranges from 0.92 to 1.19 and the Outfit MnSq from 0.88 to 1.29, indicating that each category is within acceptable limits. The sixth column shows the estimated POSTOL thresholds, in order: zero, -1.03, -0.86, 0.17, and 1.72. Graphically, the responses in each category are represented by the probability curves in Figure 2. On the probability curves, category 2 does not show a distinct peak, so it does not represent a separate level of the construct being measured. This is consistent with the threshold values in Table 4.
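The link between the thresholds and the probability curves can be sketched with the Andrich rating scale model. The code below assumes that the four non-zero values reported above are Andrich thresholds expressed relative to item difficulty, which is an assumption about the Winsteps output rather than something stated in the text; the function names are hypothetical. It computes the category probabilities at a given person location and the gaps between adjacent thresholds.

```python
import math

# Reported thresholds for moving into categories 2, 3, 4, and 5
# (assumed to be Andrich thresholds relative to item difficulty).
THRESHOLDS = [-1.03, -0.86, 0.17, 1.72]


def category_probabilities(theta, item_difficulty, thresholds):
    """Category probabilities under the Andrich rating scale model."""
    # Cumulative sums of (theta - b - tau_k); the lowest category has sum 0.
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + (theta - item_difficulty - tau))
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def threshold_gaps(thresholds):
    """Distance in logits between each pair of adjacent thresholds."""
    return [round(b - a, 2) for a, b in zip(thresholds, thresholds[1:])]


print(threshold_gaps(THRESHOLDS))
# Probabilities of categories 1..5 midway between the first two thresholds.
print([round(p, 3) for p in category_probabilities(-0.945, 0.0, THRESHOLDS)])
```

Under these assumptions, the first gap (0.17 logit) is far narrower than the other two, which leaves category 2 very little room on the logit scale to stand out from its neighbours.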

Unidimensionality
The unidimensionality of the POSTOL scale was assessed through PCA of the residuals. Empirically, the raw variance explained by measures is 45.7%, the unexplained variance in the 1st contrast is 10.0%, and the eigenvalue is 2.78. These statistics are needed to determine whether POSTOL can accurately measure students' perceptions of online learning. Scale unidimensionality is achieved if the raw variance explained by measures is more than 40% and the unexplained variance in the 1st contrast is less than 15% [48], [49]. The results therefore indicate that POSTOL has good unidimensionality.
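The decision rule described above can be written as a small check. This is a minimal sketch with a hypothetical function name; the default cut-offs are the 40% and 15% criteria cited in the text, and the input values are those reported for POSTOL.

```python
def is_unidimensional(explained_pct, first_contrast_pct,
                      min_explained=40.0, max_first_contrast=15.0):
    """Apply the two variance criteria used in this study.

    explained_pct: raw variance explained by measures (percent)
    first_contrast_pct: unexplained variance in the 1st contrast (percent)
    """
    return (explained_pct > min_explained
            and first_contrast_pct < max_first_contrast)


# Values reported for the Indonesian version of POSTOL.
print(is_unidimensional(45.7, 10.0))
```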

Discussions
This study evaluated the psychometric qualities of the adapted instrument used to gauge students' perceptions of online learning during Covid-19. Winsteps software version 4.6.1 was used to analyze the data and verify the construct validity of POSTOL [50]. The initial summary statistics showed that the respondents could separate the 16 items into 8 groups [51]. Linacre [52] states that a good separation index is > 2.0. Person reliability falls in the Good category, and item reliability falls in the Special category [45], [53]. This indicates that the respondents' answers are consistent and that the quality of the POSTOL items is very high. The quality of the interaction between persons and items as a whole is reflected in the Cronbach's alpha value [33], and the results show that persons and items interact well. This finding supports the consistency analysis of the POSTOL instrument reported by Bhagat et al. [24]. The next step was to examine the fit indices, Infit MnSq and Outfit MnSq. All items fit the Rasch model well except item SP5, "Reading my classmates' work will help improve the quality of my work." Item SP5 has an Outfit MnSq value of 1.52, slightly outside the range of 0.5-1.5. However, the point-measure correlation (Pt. Mea. Corr.) values lie in the range 0.30-0.70 [54], [55]. A high Pt. Mea. Corr. indicates that an item can distinguish between respondents' abilities [54]; that is, the item's response pattern is oriented in the same direction as the general response pattern. Item SP5 was therefore retained. The misfit of SP5 may stem from wording that is negative or gives a negative impression [56]. The Rasch analysis of the Indonesian version of POSTOL thus supports the validity of the original version, which was analyzed using a factor analysis approach [24].
The functioning of the 5-point Likert scale was evaluated from the ordering of the thresholds. Although the threshold values increase with the category values, they increase irregularly. This shows that the categories are not clearly defined for the respondents. Respondents could not clearly distinguish among the 5 response options provided, so the rating scale needs to be simplified to 4 categories [57]. Figure 2 supports combining categories 2 and 3 because category 2 does not have a peak of its own. The scale then becomes more effective because the category intervals become wider [56]. These findings add to the psychometric properties of POSTOL, as they were not previously reported by Bhagat et al. [24].
Unidimensionality is one of the fundamental measures for assessing whether an instrument measures what it is intended to measure [33], [58]. The raw variance explained by measures and the unexplained variance in the 1st contrast show that the 16 items form a single dimension of measurement. In more depth, the absence of items from other dimensions was checked through the eigenvalue, which is less than 3. Thus, the adapted POSTOL instrument has good unidimensionality, with no indication of noise or of items from other dimensions. Failure to meet the unidimensionality requirement can jeopardize the reliability and construct validity estimates [58].
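Combining categories 2 and 3, as discussed above, amounts to recoding the raw responses before re-running the analysis. The sketch below shows one way to do this; the mapping, the function name, and the example responses are hypothetical illustrations, and the exact recoding applied in the study is not specified in the text.

```python
# Assumed mapping from the original 5 categories to 4 categories,
# with original categories 2 and 3 merged into new category 2.
RECODE_5_TO_4 = {1: 1, 2: 2, 3: 2, 4: 3, 5: 4}


def collapse_categories(responses, mapping=RECODE_5_TO_4):
    """Recode a list of 5-point responses onto the collapsed 4-point scale."""
    return [mapping[r] for r in responses]


# Hypothetical responses of one student to five items.
print(collapse_categories([5, 3, 2, 4, 1]))  # [4, 2, 2, 3, 1]
```

After recoding, the thresholds and category probability curves would need to be re-estimated to confirm that each of the four categories has a distinct peak.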

V. Conclusion
The psychometric properties of the Indonesian version of the POSTOL instrument were evaluated based on the Rasch model. The results show that the instrument has good psychometric properties for measuring junior and senior high school students' perceptions of online learning in Indonesia. Statistically, the Indonesian version of POSTOL shows good validity and reliability. The sixteen items tested showed good fit to the Rasch model. A 4-point Likert rating scale is more effective for junior and senior high school students in Indonesia than the 5-point rating scale in the original version. In addition, the unidimensionality test shows that all items in the Indonesian version of POSTOL are unidimensional. Based on these findings, teachers or instructors are encouraged to evaluate students' perceptions of the learning they have undertaken during the Covid-19 pandemic.
These findings are limited to the junior and senior high school levels because the sample has not yet covered more varied student demographics. Future research should evaluate the instrument's psychometric properties in a more heterogeneous context. The diversity of school types and student disciplines (social, science, health, or vocational) needs to be considered to obtain information on the instrument's use in a broader area. We also recommend evaluating the psychometric properties of POSTOL among elementary school students.

VI. Acknowledgment
The authors thank the Institute for Research and Community Service, Universitas Ahmad Dahlan, for facilitating and funding this research.