INVESTIGATING UNIDIMENSIONALITY OF MATHEMATICS MULTIPLE-CHOICE TEST ITEMS USED FOR JOINT MOCK EXAMINATION IN KWARA STATE PUBLIC SCHOOLS

ARTICLE INFO
Received: April, 2021. Revised: May, 2021. Accepted: August, 2021.

ABSTRACT
Mathematics is an essential subject for human existence in day-to-day activities. This indicates that, if properly handled, it could enhance national development. Therefore, careful consideration must be given to the construction of test items used to examine students' performance in the subject. The purpose of the study was to investigate the unidimensionality of mathematics multiple-choice test items used for a joint mock examination in Kwara State public schools. The study adopted a descriptive research design of the survey type. The population comprised students in Senior Secondary Schools while the target population consisted of Senior Secondary School II students. A multistage sampling technique involving stratified, proportionate and simple random sampling was used at different stages of selection. The measuring device was the 2018/2019 Kwara State Joint Mock Mathematics Test. The test is a standardized test and thus its validity and reliability coefficients were not disclosed to the public. The statistical technique employed was factor analysis. The study concluded that only 41 (82%) of the 50 multiple-choice items showed evidence of unidimensionality. One of the recommendations was that the items that violated the unidimensionality assumption should be revisited so that the instrument could be 100% compliant with unidimensionality.


INTRODUCTION
In 2009, the Federal Government of Nigeria directed the Nigerian Educational Research and Development Council (NERDC) to review the previous Senior Secondary School (SSS) curriculum and realign it with the Nigerian educational reforms, in which Mathematics was included. It was included because it is the backbone of the sciences and other disciplines. In Nigeria, Mathematics is one of the core subjects in the school curriculum from the primary to the secondary school level, and it is also reflected in the tertiary education curriculum, as recognized by the national policy on education (Federal Republic of Nigeria, 2004; Awofala, 2011; Federal Republic of Nigeria, 2013). Its inclusion in the school curriculum is a result of the role it plays as an indispensable tool in day-to-day activities (Peter, 2011). Knowledge of mathematics involves calculation, computation and the solving of daily problems. Mathematics is a body of knowledge that deals with the quantitative measurement and manipulation of things and facts. Because it deals with numbers and their operations, it plays a significant role in day-to-day interaction with fellow human beings, especially out-of-school interaction, where it is used to solve critical problems and even to place value judgements on issues (Jayanthi, 2019). Ebiendele (2011) noted that Mathematics is an essential subject for human existence in day-to-day activities. This, therefore, warrants its inclusion in the school curriculum (Bennison, 2015).
To ascertain that the above-mentioned qualities are achieved, learners need to be examined periodically, and thus mathematics test items must be systematically prepared. Two measurement frameworks, Classical Test Theory and Item Response Theory, are available to ascertain the quality of mathematics test items.
Classical Test Theory (CTT) is a traditional quantitative approach to testing the reliability and validity of scale-based items (Cappelleri, Lundy and Hays, 2014). This theory assumes that each observed score (X) of a person is the combination of an underlying true score (T) and unsystematic error (E). Item Response Theory (IRT), by contrast, assumes a probabilistic distribution of examinees' success at the item level. IRT focuses primarily on item-level information while CTT focuses on test-level information (Fan, 1998). This study, therefore, emphasises the use of IRT. Most IRT models assume that the latent variable of the examinee is represented by a unidimensional continuum. De Ayala (2009) noted that IRT investigates two things: the characteristics of each item in a test and how each examinee reacts to the items based on his/her ability. It must be noted that IRT rests on basic underlying assumptions, one of which concerns the dimensionality of test items, which can take two forms (unidimensionality and multidimensionality). The emphasis in this study is on unidimensionality.
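The CTT decomposition described in this section can be written compactly. The first relation is the one stated in the text; the second follows only under the usual CTT assumption, not spelled out in this paper, that true score and error are uncorrelated.

```latex
X = T + E,
\qquad
\operatorname{Cov}(T, E) = 0
\;\Longrightarrow\;
\operatorname{Var}(X) = \operatorname{Var}(T) + \operatorname{Var}(E)
```

Read this way, CTT describes the test score as a whole, whereas IRT models the probability of success on each individual item.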
The unidimensionality assumption in IRT holds that the multiple-choice items in a test are expected to measure a single trait (ability) of the test-takers (Meijer & Tendeiro, 2018). This implies that all items in a test are expected to share a common strand, measuring a single variable, while multidimensionality assumes that two or more traits (such as ability, intelligence, motivation, attitude, etc.) could be measured in a test. Ziegler & Hagemann (2015) state that an item is considered unidimensional if differences in answering the item are due to only one source of variance, called the latent trait (ability), and not to any other attribute. The model holds that differences in answering a multiple-choice item must be due only to differences in the ability (latent trait) of the test-takers (Meijer & Tendeiro, 2018). This can be hypothetically illustrated as shown in figure 1. Figure 1 shows six multiple-choice test items (X1, X2, X3, X4, X5 and X6), which are usually dichotomously scored as right or wrong. The circle that contains θ represents the test-takers' ability (latent variable). The arrows indicate the questions/items prepared from three topics. Two questions were drawn from each topic, and all these topics are expected to measure only one attribute (ability) of the test-takers.
The assumption is best shown on a curve shaped like an "S". This shape can be described by a mathematical function whose graph is termed the Item Characteristic Curve (ICC). The most popular form of the ICC is displayed on a graph with X and Y axes. The X-axis is the ability (latent trait), denoted by the symbol θ, and the Y-axis is the chance of getting the item correct, denoted by P(θ). This implies that as the ability of the test-taker increases, the chance of getting an item correct must also increase, as indicated in Figure 2. On the X-axis, the ability (θ) ranges from -3 to +3, while on the Y-axis P(θ) ranges from 0.00 to 1.00. The graph shows that the higher the latent trait, the better the performance of the test-takers on the test items.

Given this, earlier scholars have carried out a series of studies on the psychometric properties of test items. For instance, an earlier study assessed the essential unidimensionality of real data, and a test of essential unidimensionality was applied to CAT items. Verhelst (2001) carried out a study on testing the unidimensionality assumption of the Rasch model, using the Martin-Löf test (ML-test) and the splitter item-technique. Slocum (2005) assessed the unidimensionality of psychological scales using individual and integrative criteria; factor analysis and chi-square GL were used to analyze the data and test the hypotheses postulated. Ikona (2006) investigated the unidimensionality of mock mathematics for senior secondary schools in Cross River State, Nigeria, employing chi-square statistics. O'Neill and Reynolds (2006) investigated the unidimensionality of the National Council Licensure Examination for Registered Nurses (NCLEX-RN), using factor analysis to identify and validate subscales within the test.
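The S-shaped item characteristic curve introduced in this section can be illustrated numerically. The sketch below assumes a two-parameter logistic form, which the paper does not name; the function name `icc_2pl` and the discrimination `a` and difficulty `b` values are hypothetical and chosen only for illustration.

```python
import math


def icc_2pl(theta, a=1.0, b=0.0):
    """Probability of a correct response at ability theta.

    Assumes a two-parameter logistic curve: a is the (hypothetical)
    discrimination and b the (hypothetical) difficulty of the item.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Ability grid from -3 to +3 in steps of 0.5, matching the X-axis range in the text.
thetas = [-3 + 0.5 * i for i in range(13)]
probs = [icc_2pl(t, a=1.2, b=0.0) for t in thetas]

for t, p in zip(thetas, probs):
    print(f"theta={t:+.1f}  P(theta)={p:.3f}")

# Monotonicity check: the probability rises strictly as ability rises.
is_monotone = all(later > earlier for earlier, later in zip(probs, probs[1:]))
print("monotonically increasing:", is_monotone)
```

An item whose empirical curve fails this kind of monotonicity check would not show the "S" shape described in the text.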
In another development, Ajeigbe and Afolabi (2014) examined unidimensionality and the occurrence of differential item functioning in Mathematics and English language test items in the Osun State qualifying examination. Hagell (2014) also tested rating scale unidimensionality using principal component analysis.
The present study investigated the unidimensionality of mathematics multiple-choice test items used for a joint mock examination for Senior Secondary (SS) II students in public schools in Kwara State. This joint mock examination is conducted yearly for SS II students by the Ministry of Education. The basic aim is to prepare and screen SS II students who are ready to sit for the WAEC and NECO external examinations, in order to reduce the percentage of failure in those examinations.
It is no longer news that students in Nigeria, and in Kwara State in particular, perform below expectations in mathematics in both internal and external examinations. Evidence of poor performance can be seen in the yearly West African Senior School Certificate Examination (WASSCE) reports, as indicated in table 1. Table 1 shows the trend of performance in mathematics from 2011 to 2017. In all the years under consideration, performance was below 50.00%, except in 2016 when it was slightly above 50.00%, which implies that students' performance was generally below average. This poor performance could be attributed to a series of factors, among them unconducive school environments, inadequate learning materials, teachers' choice of teaching strategies, the lukewarm attitude of parents towards their children's learning, and students' learning habits (Jimoh, Balogun and Yusuf, 2016).
However, little attention has been given to the individual items in a test as a factor that could adversely affect students' performance in both internal and external examinations. IRT assumes that multiple-choice items are prepared in line with unidimensionality. The essence of the joint mock examination is to prepare SS II students for external examinations such as the West African Examinations and the National Examinations conducted in West Africa and Nigeria in particular. Hence, it serves as a trial run that lets students face the reality of external examinations. The mathematics multiple-choice test items prepared by the Kwara State Ministry of Education are used to ascertain the students' level of preparedness. The items are assumed to be equivalent because they form a standardized set constructed by the Kwara State Ministry of Education. It must be noted that if any of the items violates the unidimensionality assumption, the validity and reliability of the measuring instrument (the joint mock mathematics test items) are reduced.
The research objective, therefore, is to investigate the unidimensionality of the mathematics multiple-choice test items used for the joint mock examination in Kwara State, which were developed by the examination section of the ministry in 2019 and administered in the 2018/2019 academic session. In carrying out this study, one pertinent research question was raised: "To what extent do the 2018/2019 joint mock multiple-choice mathematics test items constructed by the examination section of the ministry comply with the unidimensionality assumption?"

METHOD
The descriptive research design of the survey type was adopted in the study. The population of the study consisted of all students in SS I, SS II and SS III in public schools in Kwara State, while the target population consisted of SS II students who sat for the joint mock examination in the 2018/2019 academic session. A multistage sampling procedure involving stratified, proportionate and simple random sampling techniques was employed at different stages of selection. At the first stage, a stratified random sampling technique was employed to categorize schools by the three senatorial districts. As of the time of this report, Kwara Central has 101 senior secondary schools, Kwara North 82 and Kwara South 163 (Kwara State Ministry of Education and Human Capital Development, 2019; Ilori, Sawa, and Gobir, 2019).
At the second stage, a simple random sampling technique was adopted to select 4 schools from Kwara Central, 2 schools from Kwara North and 6 schools from Kwara South. At the third stage, all the scripts in the sampled schools were considered, to avoid bias in the selection of scripts. Hence, the sample size was 2,152.
The instruments used for data collection were the mathematics question paper and the multiple-choice answer scripts of the candidates who sat for the joint examination. The question paper consisted of sections A and B. Section A contained 50 closed-response (multiple-choice) test items and section B contained 3 free-response (essay) test items. This study used only the 50 closed-response multiple-choice items.
Each item in the closed-response section of the question paper had four options (A to D), from which the candidates were asked to pick one as the answer. The items were generated from a wide range of the SS I and SS II national mathematics curriculum. Specifically, the items were drawn from Number and Numeration, Algebraic Processes, Geometry and Mensuration, and Statistics and Probability, as indicated in the national mathematics curriculum for senior secondary schools. To gain access to the candidates' scripts, the researchers sought permission from the Kwara State Ministry of Education. Students' responses to the test items were extracted from the already marked examination answer scripts of the sampled schools. In each sampled school, all the scripts were considered, to avoid bias in the selection of scripts. Therefore, 2,152 scripts constituted the sample for this study.
A cross-section of students writing the mathematics joint mock examination in one of the public senior secondary schools in Ilorin, Kwara State is shown in figure 1. In this study, emphasis is on mathematics multiple-choice items that are dichotomously scored. A dichotomous outcome has two levels of response, such as true/false, correct/incorrect, agree/disagree, right/wrong, satisfactory/unsatisfactory, pass/fail or high/low. The items were scored 0 for an incorrect response and 1 for a correct response.
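The 0/1 scoring rule described above can be sketched as follows. The answer key and the three response rows are made up for illustration (the actual key of the joint mock test is not given in the paper), and `score_dichotomous` is a hypothetical helper name.

```python
def score_dichotomous(responses, key):
    """Score each response 1 if it matches the key and 0 otherwise.

    responses: list of rows, one per candidate, each a list of chosen options.
    key: list of correct options, one per item.
    A blank or unexpected option simply fails to match and scores 0.
    """
    for row in responses:
        if len(row) != len(key):
            raise ValueError("each response row must have one entry per item")
    return [[1 if chosen == correct else 0 for chosen, correct in zip(row, key)]
            for row in responses]


# Hypothetical four-item key with options A to D.
key = ["B", "D", "A", "C"]

# Hypothetical responses from three candidates; "" marks an unanswered item.
responses = [
    ["B", "D", "A", "C"],
    ["B", "A", "A", "D"],
    ["C", "", "A", "C"],
]

scored = score_dichotomous(responses, key)
totals = [sum(row) for row in scored]
print(scored)
print(totals)
```

The resulting 0/1 matrix (candidates by items) is the kind of input that a factor analysis or an ICC estimation would then work from.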

RESULTS AND DISCUSSION
In the course of data analysis, the test-takers' ability (Theta), which ranged from -3 to +3, was estimated and displayed as a histogram using WinGen version 3.0.10.433 software. Factor analysis via principal components was computed using SPSS 25.0 software to answer the research question. Factor analysis is used here to discover the unexplained factors (i.e., items that do not favour unidimensionality) among multiple observations (the 50 multiple-choice items). These factors represent underlying concepts that cannot be adequately measured by a single variable (unidimensionality). In addition, to identify the number of items that favour unidimensionality, the Item Characteristic Curve (ICC) of each item was computed using WinGen version 3.0.10.433 software. The latent variable (Theta) of the test-takers was examined as shown in figure 2.
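The eigenvalue step of a principal component analysis can be sketched as follows. This is a minimal illustration on simulated data, not the study's SPSS analysis: NumPy stands in for SPSS, and the sample size, number of items, discrimination `a` and difficulties `b` are hypothetical values chosen only to generate 0/1 responses driven by a single ability.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical set-up: 2000 simulated test-takers, 10 dichotomous items.
n_persons, n_items = 2000, 10
theta = rng.normal(0.0, 1.0, size=n_persons)   # one latent ability per person
b = np.linspace(-1.5, 1.5, n_items)            # assumed item difficulties
a = 1.5                                        # assumed common discrimination

# Probability of a correct answer under a logistic curve, then 0/1 responses.
p = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b[None, :])))
scores = (rng.random((n_persons, n_items)) < p).astype(int)

# Principal components of the inter-item correlation matrix.
corr = np.corrcoef(scores, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(corr))[::-1]   # largest first

# Each eigenvalue's share of total variance; the eigenvalues sum to n_items.
pct_variance = 100.0 * eigvals / n_items
cumulative_pct = np.cumsum(pct_variance)

# Eigenvalue-greater-than-1 rule for the number of components to retain.
n_retained = int((eigvals > 1.0).sum())

print("eigenvalues:", np.round(eigvals, 3))
print("% variance :", np.round(pct_variance, 2))
print("cumulative :", np.round(cumulative_pct, 2))
print("components with eigenvalue > 1:", n_retained)
```

A dominant first eigenvalue relative to the rest is the pattern that is usually read as evidence of one underlying dimension.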

Figure 2: Test-takers' Ability (Theta)
Figure 2 revealed the ability (Theta) of the test-takers. The ability of the test-takers ranged from -3 to +3, with most of it concentrated between -2 and +2, as shown in the graph. The test-takers fit the model well because their abilities are approximately normally distributed, that is, they follow a bell-shaped distribution.

Research Question: To what extent do the 2018/2019 joint mock multiple-choice mathematics test items constructed by the Kwara State Ministry of Education and Human Capital Development comply with the unidimensionality assumption?
The eigenvalues of the 50 mathematics multiple-choice items give the total variance explained in the factor analysis carried out via principal component analysis. The results show that the first four principal components have initial eigenvalues (37.733, 5.657, 2.485 and 1.295) greater than 1. These four components together explain 94.341% of the variation in the data and are therefore retained (see Initial Eigenvalues in table 1). The first factor has an initial eigenvalue of 37.733, which is far greater than that of the second factor (5.657). The four factors are, therefore, extracted as sums of squared loadings, as illustrated in table 2. All the remaining factors are less important because their eigenvalues are less than 1 (see table 1). The first principal component explains 75.467% of the total variance, which is regarded as a very high index and suggests that the items in the test are unidimensional, with responses driven by a single source of variance (the ability of the test-taker).

Having used eigenvalues to describe the unidimensionality of the joint mock mathematics multiple-choice test items, a scree plot is used to affirm the number of factors retained. In the scree plot, the point of interest is where the curve starts to flatten, as shown in figure 4. Figures 5a to 5h were randomly selected from the Item Characteristic Curves of items 1 to 50. Examination of the 50 figures reveals that 41 (82%) of the items assumed an "S" shape. Among them are 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 30, 33, 34, 35, 36, 37, 39, 40, 41, 42, 44, 45, 46, 47 and 48. The remaining nine items (12, 18, 29, 31, 32, 37, 43, 49 and 50) violated the assumption of an "S" shape. For the 41 items, as the ability of the test-takers increases, the probability of getting the answer correct increases. The nine (9) items that violated local item independence are shown in figures 6a to 6i.

Figure 6: Item Characteristic Curve
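The variance percentages reported in this section can be checked directly from the four eigenvalues quoted in the text, since in a principal component analysis of 50 standardized items the total variance equals 50. The sketch below uses only those reported figures; the ratio of the first to the second eigenvalue is an extra descriptive quantity added here for illustration and is not reported in the paper.

```python
# Initial eigenvalues of the four retained components, as reported in the text.
eigenvalues = [37.733, 5.657, 2.485, 1.295]
n_items = 50  # total variance of 50 standardized items

# Share of total variance explained by each retained component.
pct_variance = [100.0 * ev / n_items for ev in eigenvalues]
cumulative_pct = sum(pct_variance)

# Components kept under the eigenvalue-greater-than-1 rule.
retained = [ev for ev in eigenvalues if ev > 1.0]

# Illustrative only: how strongly the first component dominates the second.
first_to_second = eigenvalues[0] / eigenvalues[1]

print(f"first component: {pct_variance[0]:.3f}% of total variance")
print(f"four components: {cumulative_pct:.3f}% cumulative")
print(f"retained components: {len(retained)}")
print(f"first-to-second eigenvalue ratio: {first_to_second:.2f}")
```

The results agree with the reported 75.467% and 94.341% to within rounding of the eigenvalues, which are quoted to three decimal places.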
The finding of this study revealed that 41 (82%) out of the 50 mathematics multiple-choice items constructed by the Kwara State Ministry of Education and Human Capital Development in the 2018/2019 academic session are unidimensional. This finding is consistent with the fact that the mathematics multiple-choice test is a state-wide unified mock examination and the product of test standardization, though composed of subsets of items that measure different but related aspects of mathematical concepts. It is also attributable to the non-violation of local item independence and monotonicity.
The result of this study agrees with Orim (2015), who in his test of the unidimensionality of WAEC and NECO Biology items established unidimensionality in 47 out of 50 items for WAEC and 39 out of 50 for NECO. The finding corroborates that of Ikona (2016), which showed that all the 20 items selected for the study were significantly unidimensional, indicating compliance with the unidimensionality assumption of IRT. The finding also confirms that of Ajeigbe and Afolabi (2014), who observed that each of the multiple-choice mathematics and English language items constructed by the Osun State Ministry of Education measured a single construct, which showed evidence of unidimensionality. The finding is in line with Hafiz, Bichi, and Abdullahi (2016), who found that the IRT assumption of unidimensionality held in the test and that one dominant factor explained the majority of the variance, which is the condition assumed for unidimensionality.
The finding is contrary to that of an earlier study that assessed the essential unidimensionality of real data and found that the test items did not measure only one trait. The items in that test were influenced by secondary traits in addition to the major trait intended to be measured.

CONCLUSION
The observations and findings of the researchers showed that the mathematics multiple-choice items constructed and administered by the Kwara State Ministry of Education and Human Capital Development in the 2018/2019 academic session, regarded as a standardized state-wide test, did not show total compliance with the unidimensionality assumption.

Based on this conclusion, the following recommendations are made. Before constructed items are administered to the intended respondents, the examination section of the Kwara State Ministry of Education and Human Capital Development should always revisit the items to make sure they are aligned with the unidimensionality assumption. The examination section should employ expert test developers or psychometricians to examine the clues that could affect the validity and reliability of its instruments and tests. This would allow the developers to ascertain how every test item functions and to address whatever parameter is being measured. The examination section should also give priority to the training and re-training of test item developers, for example through a series of workshops and conferences to update them.