SULTAN KUDARAT STATE UNIVERSITY
Assessment of Student Learning 1
Ernie C. Cerado, PhD
Ma. Dulce P. Dela Cerna, MIE
Editor/Compiler
Preface

COVID-19 has affected the world at large, but this has also given us a glimpse of the good that exists.
- Amit Gupta
With wide-ranging challenges brought about by the pandemic in almost all communities, including the academic community, there is nonetheless an opportunity for the faculty to develop teaching strategies and tools that answer the learning needs of the students. The response, however, is not universal but rather location-specific. There can be no "one-size-fits-all" measure due to the varying resources, capacity, restrictions and peculiarities of the campus, faculty, and students. As SKSU is a state university where funds are normally limited, it is understood to have more constraints than the needed resources. The faculty readiness, students' socio-economic histories, administrative support, and internet connectivity are among the primary considerations in selecting the most workable instructional model. Since these factors are obviously challenging, the use of printed learning modules emerged as the most practical modality to adopt. This instructional material can nonetheless be exploited better when used in combination with other learning options such as online, SMS, voice call, face-to-face or the blended way - thus, the suggested flexible learning system. With the commitment of the university to facilitate the free reproduction of the modules for every student, it is very likely that optimal learning can still be achieved in the apparently crudest yet safest method amidst serious health challenges.

Most importantly, the students are requested to maximize the utilization of these learning modules inasmuch as this material is afforded freely. At this volatile time, let the principle of "active learning" come into play; students are expected to be independent and imaginative in learning. As matured learners, be responsible for your own learning - be competent in "learning to learn." This is the main reason why a lot of assessment exercises and enrichment activities are provided at the conclusion of each lesson.
Table of Contents

Foreword
Chapter 1  Outcomes-Based Education
    Lesson 1  Understanding Outcomes-Based Education
Chapter 2  Introduction to Assessment in Learning
    Lesson 1  Basic Concepts and Principles in Assessing Learning
    Lesson 2  Assessment Purposes, Educational Objectives, Learning Targets and Appropriate Methods
    Lesson 3  Classifications of Assessment
Chapter 3  Development and Enhancement of Tests
    Lesson 1  Planning a Written Test
    Lesson 2  Construction of Written Tests
    Lesson 3  Improving a Classroom-Based Assessment
    Lesson 4  Establishing Test Validity and Reliability
Chapter 4  Organization, Utilization, and Communication of Test Results
    Lesson 1  Organization of Test Data Using Tables and Graphs
    Lesson 2  Analysis, Interpretation, and Use of Test Data
    Lesson 3  Grading and Reporting of Test Results
Appendix 1  Course Syllabus
CHAPTER 1
OUTCOMES-BASED EDUCATION

Overview
In response to the need for standardization of education systems and processes, many higher education institutions in the Philippines shifted attention and efforts toward implementing the OBE system at the school level. The shift to OBE has been propelled predominantly because it is used as a framework by international and local academic accreditation bodies in school- and program-level accreditation, in which many schools invest their efforts. The Commission on Higher Education (CHED) even emphasized the need for the implementation of OBE by issuing a memorandum order on the "Policy Standard to Enhance Quality Assurance in Philippine Higher Education through an Outcomes-Based and Typology-Based QA". Consequently, a Handbook of Typology, Outcomes-Based Education, and Sustainability Assessment was released in 2014.

Given the current status of OBE in the country, this lesson aims to shed light on some critical aspects of the framework with the hope of elucidating important concepts that will ensure proper implementation of OBE. Also, it zeroes in on inferring implications of OBE implementation for the assessment and evaluation of students' performance.

Objective
Upon completion of this chapter, the students can achieve a good grasp of outcomes-based education.

Lesson 1: Understanding Outcomes-Based Education
Pre-discussion
Primarily, this chapter will deal with the shift of educational focus from content to learning outcomes, particularly in OBE: matching intentions with the outcomes of education. The students can state and discuss the change of educational focus from content to learning outcomes. They can
present sample educational objectives and learning outcomes in K to 12 subjects of their own choice.

What to Expect?
At the end of the lesson, the students can:
1. discuss outcomes-based education, its meaning, brief history and characteristics;
2. identify the procedures in the implementation of OBE in subjects or courses; and
3. define outcomes and discuss each type of outcomes.
Meaning of Education
According to some learned people, the word education has been derived from the Latin term "educatum," which means the act of teaching or training. Other groups of educationalists say that it has come from another Latin word, "educare," which means to bring up or to raise. For a few others, the word education has originated from another Latin word, "educere," which means to lead forth or to come out. All these meanings indicate that education seeks to nourish the good qualities in man and draw out the best in every individual; it seeks to develop the inner, innate capacities of man. By educating an individual, we attempt to give him/her knowledge, skills, understanding, interests, attitudes, and critical thinking. That is, he/she acquires knowledge of history, geography, arithmetic, language, and science.

Today, outcomes-based education is the main thrust of the Higher Education Institutions in the Philippines. The OBE comes in the form of competency-based learning standards and outcomes-based quality assurance monitoring and evaluation spelled out under CHED Memorandum Order No. 46. Accordingly, CHED OBE is different from Transformational OBE in the following aspects:
The CMO acknowledges that there are 2 different OBE frameworks, namely: the strong and the weak.
CHED subscribes to a weak or lower-case OBE due to the realities of Philippine higher education.
CHED recognizes that there are better OBE frameworks than what it implemented, and it does not limit HEIs to implementing the weak rather than the strong OBE.
Spady's OBE, or what is otherwise called transformational OBE, falls under the strong category of OBE.
What is OBE?
Outcomes-Based Education (OBE) is a process that involves the restructuring of curriculum, assessment and reporting practices in education to reflect the achievement of high-order learning and mastery rather than the accumulation of course credits. It is a recurring education reform model, a student-centered learning philosophy that focuses on empirically measuring students' performance, which is called outcomes, and on the resources that are available to students, which are called inputs. Furthermore, Outcomes-Based Education means clearly focusing and organizing everything in an educational system around what is essential for all students to be able to do successfully at the end of their learning experiences. This means starting with a clear picture of what is important for students to be able to do, then organizing the curriculum, instruction, and assessment to make sure that this learning ultimately happens.

For education stalwart Dr. William Spady, Outcome-Based Education (OBE) is a paradigm shift in the education system that is changing the way students learn, teachers think, and schools measure excellence and success. He came to the Philippines to introduce OBE in order to share its benefits. Spady said that in conceptualizing OBE in 1968, he observed that the US education system was more bent on how to make students achieve good scores. "So there are graduates who pass exams, but lack skills. Then there are those who can do the job well yet are not classic textbook learners." Furthermore, he said that OBE is not concerned with a single standard for assessing the success rate of an individual. "In OBE, real outcomes take us far beyond the paper-and-pencil test." An OBE-oriented learner thinks of the process of
learning as a journey in itself. He acknowledged that all students can learn and succeed, but not on the same day in the same way. As a global authority in educational management and the founder of the OBE learning philosophy, Spady sees that, unlike previous learning strategies where a learner undergoes assessment to see how much one has absorbed of the lessons, OBE is more concerned with how successful one is in achieving what needs to be accomplished in terms of skills and strategies. "It's about developing a clear set of learning outcomes around which an educational system can focus," he said. "Outcomes are clear learning results that students can demonstrate at the end of significant learning experiences. They are what learners can actually do with what they know and have learned." Outcomes-Based Education expects active learners, continuous assessment, knowledge integration, critical thinking, learner-centeredness, and learning programs. Also, it is designed to match education with actual employment. Philippine higher education institutions are encouraged to implement OBE not only to be locally and globally competitive but also to work for transformative education.

Elevating the Educational Landscape for Higher Education
This shift of learning paradigm is important and necessary as globalization is in the pipeline. Students are prepared not only for the acquisition of professional knowledge but must also be able to perform hands-on work and knowledge application/replication in different work settings and societies. Alongside this, students should possess such generic (all-round) attributes as a lifelong learning aptitude, teamwork attitudes, communication skills, etc. in order to face the ever-changing world and society.

Learning outcomes statements, to be useful, should be crafted to inform effective educational policies and practices. When they are clear about the proficiencies students are to achieve, such statements provide reference points for student performance, not just for individual courses but for the cumulative effects of a program of study.

The CHED required the implementation of Outcomes-Based Education (OBE) in Philippine universities and colleges in 2012 through CHED Memorandum Order No. 46. As a leading learning solutions provider in the
Philippines, learning materials are aligned with OBE through the following features:

Learning Objectives - Statements that describe what learners/students are expected to develop by the time they finish a particular chapter. These may include the cognitive, psychomotor, and affective aspects of learning.

Teaching Suggestions - This section covers ideas, activities, and strategies that are related to the topic and will help the instructor in achieving the Learning Objectives.

Chapter Outline - This section shows the different topics/subtopics found in each chapter of the textbook.

Discussion Questions - This section contains end-of-chapter questions that will require students to use their critical thinking skills to analyze the factual knowledge of the content and its application to actual human experiences.

Experiential Learning Activities - This includes activities that are flexible in nature. These may include classroom/field/research activities, simulation exercises, and actual experiences in real-life situations.

Objective types of tests to test students' knowledge may include any of the following: Identification, True or False, Fill in the blank, Matching type, and Multiple Choice. Answer keys to the test questions must be provided.

Assessment for Learning - This may include rubrics that will describe and evaluate the level of performance/expected outcomes of the learners.

The Outcomes of Education
Learning outcomes are statements that describe significant and essential learning that learners have achieved and can reliably demonstrate at the end of a course or program. In other words, learning outcomes identify what the learner will know and be able to do by the end of a course or program. Examples that are specific and relatively easy to measure are:
…CAN read and demonstrate good comprehension of text in areas of the student’s interest or professional field.
…CAN demonstrate the ability to apply basic research methods in psychology, including research design, data analysis, and interpretation.
…CAN identify environmental problems, evaluate problem-solving strategies, and develop science-based solutions.
…CAN demonstrate the ability to evaluate, integrate, and apply appropriate information from various sources to create cohesive, persuasive arguments, and to propose design concepts.

OBE is grounded on the principles of: clarity of focus on outcomes of significance, expanded opportunity for students to succeed, high expectations for quality performance, and design down from where you want to end up.

Clarity of focus. Educators should be made aware and conscious of the outcomes of education each student must manifest or demonstrate at the course level, and that these outcomes at the classroom level are connected to the attainment of higher-level outcomes (i.e., program/institutional outcomes and culminating outcomes). Thus, at the initial stage of academic or course planning, the higher outcomes serve as a guide for educators in defining and clearly stating the focus of the course/subject. This principle implies that the criteria of attainment of learning outcomes (students' learning performance) that can be elicited through assessments should exhibit a particular standard that applies to all learners. In effect, this standardizes the assessment practices and procedures used by educators in a specific subject/course.

High expectations. As stated in the clarity of focus principle, learning outcomes at the course level are necessarily connected to higher-level outcomes. These connections warrant educators to elicit a high level of performance from students. This level of performance ensures that students successfully meet the desired learning outcomes set for a course, and consequently enables them to demonstrate outcomes at higher levels (program or institutional level). Thus, the kind of assessments in an OBE
learning context should challenge students enough to activate and enable higher-order thinking skills (e.g., critical thinking, decision making, problem solving, etc.), and should be more authentic (e.g., performance tests, demonstration exercises, simulation or role play, portfolios, etc.).
fifirs rstt an andd seco second nd prin princi cipl ples es impo import rtan antltlyy
necessitate that educators deliver students‟ learning experiences at an adva ad vanc nced ed leve level.l. In the the proc proces ess, s, ma many ny st stud uden ents ts ma mayy fifind nd it di diff ffic icul ultt comp co mply lyin ingg wit withh the the stan standa dard rdss se sett for for a cour course se.. As a phil philos osop ophi hica call underpinning of OBE, Spady (1994) emphasized that “all students can learn and succeed, but not on the same day, in the same way.” This disco dis coura urages ges edu educa cator torss fro from m ge gener nerali alizin zingg man manife ifesta statio tions ns of learne learnedd beha be havi vior or from from stud studen ents ts,, co cons nsid ider erin ingg that that ever everyy st stud uden entt is a uniq unique ue learner. Thus, an expanded opportunity should be granted to students in the the proc proces esss of lear learni ning ng an andd more more im impo port rtan antltlyy in asse assess ssin ingg thei their r perfor perf orma manc nce. e. Th Thee expa expans nsio ionn of op oppo port rtun unitityy can can be cons consid ider ered ed multi mu ltidi dime mens nsio iona nall (i. e., e., time, time, me meth thod odss and and moda modalit litie ies, s, oper operat atio iona nall principles, performance standards, curriculum access and structuring). structuring). In the assessment practice and procedures, the time dimension implies that educators should give more opportunities for students to demonstrate learning outcomes at the desired level. Thus, provisions of remedial, make ma ke-u -up, p, remo remova val,l, prac practitice ce test tests, s, and and othe otherr ex expa pand nded ed le lear arni ning ng opportunities are common in OBE classrooms. Design Desig n down down.. This This is the the mo most st cruc crucia iall oper operat atin ingg prin princi cipl plee of OB OBE. E. As
mentioned in the previous section, OBE implements a top-down approach in designing and stating the outcomes of education (i. e., culminating enabling - discrete outcomes). The same principle can be applied in desi de sign gnin ingg an andd impl implem emen entiting ng ou outc tcom omes es‟‟ asse assess ssme ment ntss in cl clas asse ses. s. Traditionally, the design of assessments for classes is done following a bottom-up approach. Educators would initially develop measures for micro learning tasks (e. g., quizzes, exercises, assignments, etc.), then proceed to develop the end-of-term tasks (e. g., major examination, final project, etc.). In OBE context, since the more important outcomes that should be primarily identified and defined are the culminating ones, it follows that the same principle should logically apply. ERNIE C. CERADO, PhD/MA. DULCE P. DELA CERNA, MIE
However, in a traditional education system and economy, students are given grades and rankings compared to each other. Content and performance expectations are based primarily on what was taught in the past to students of a given age. The basic goal of traditional education was to present the knowledge and skills of the old generation to the new generation of students, and to provide students with an environment in which to learn, with little attention (beyond the classroom teacher) to whether or not any student ever learns any of the material. It was enough that the school presented an opportunity to learn. Actual achievement was neither measured nor required by the school system. In fact, under the traditional model, student performance is expected to show a wide range of abilities. The failure of some students is accepted as a natural and unavoidable circumstance. The highest-performing students are given the highest grades and test scores, and the lowest-performing students are given low grades. Local laws and traditions determine whether the lowest-performing students are socially promoted or made to repeat the year. Schools used norm-referenced tests, such as inexpensive, multiple-choice, computer-scored questions with single correct answers, to quickly rank students on ability. These tests do not give criterion-based judgments as to whether students have met a single standard of what every student is expected to know and do: they merely rank the students in comparison with each other. In this system, grade-level expectations are defined as the performance of the median student, a level at which half the students score better and half the students score worse. By this definition, in a normal population, half of the students are expected to perform above grade level and half below grade level, no matter how much or how little the students have learned.

In outcomes-based education, classroom instruction is focused on the skills and competencies that students must demonstrate when they exit. There are two types of outcomes: immediate and deferred outcomes.

Immediate outcomes are competencies and skills acquired upon
completion of a subject, a grade level, a segment of a program, or of a program itself. Examples of these are:

Ability to communicate in writing and speaking
Mathematical problem-solving skills
Skill in identifying objects by using the different senses
Ability to produce artistic or literary works
Ability to do research and write the results
Ability to present an investigative science project
Skill in story-telling
Promotion to a higher grade level
Graduation from a program
Passing a required licensure examination
Initial job placement
On the other hand, deferred outcomes refer to the ability to apply cognitive, psychomotor, and affective skills/competencies in various situations many years after completion of a subject, grade level, or degree program. Examples of these are:
Success in professional practice or occupation
Promotion in a job
Success in career planning, health, and wellness
Awards and recognition
Summary
The change in educational perspective is called Outcomes-Based Education (OBE), which is characterized by the following:
It is student-centered; that is, it places the students at the center of the process by focusing on Student Learning Outcome (SLO).
It is faculty-driven; that is, it encourages faculty responsibility for teaching, assessing program outcomes, and motivating participation from the students.
It is meaningful; that is, it provides data to guide the teacher in making valid and continuing improvement in instruction and other assessment activities.

To implement OBE in a subject or course, the teacher should
identify the educational objectives of the subject or course so that he/she can help students develop and enhance their knowledge, skills, and attitudes;
he/she must list down all learning outcomes specified for each subject or course objective. A good source of learning outcome statements is the taxonomy of educational objectives by Benjamin Bloom, which is grouped into three domains: the Cognitive, also called knowledge, refers to mental skills such as remembering, understanding, applying, analyzing, evaluating, synthesizing, and creating; the Psychomotor, also referred to as skills, includes manual or physical skills, which proceed from mental activities and range from the simplest to the complex, such as observing, imitating, practicing, adapting, and innovating; the Affective, also known as attitude, refers to growth in feelings or emotions, from the simplest behavior to the most complex, such as receiving, responding, valuing, organizing, and internalizing.

The emphasis in an OBE education system is on measured outcomes rather than "inputs," such as how many hours students spend in class or what textbooks are provided. Outcomes may include a range of skills and knowledge. Generally, outcomes are expected to be concretely measurable, that is, "Student can run 50 meters in less than one minute" instead of "Student enjoys physical education class." A complete system of outcomes for a subject area normally includes everything from mere recitation of fact ("Students will name three tragedies written by Shakespeare") to complex analysis and interpretation ("Student will analyze the social context of a Shakespearean tragedy in an essay"). Writing appropriate and measurable outcomes can be very difficult, and the choice of specific outcomes is often a source of local controversies.

Learning outcomes describe the measurable skills, abilities, knowledge or values that students should be able to demonstrate as a result of completing a course. They are student-centered rather than teacher-centered, in that they describe what the students will do, not what the instructor will teach. They are not standalone statements; they must all relate to each other and to the title of the unit, and avoid repetition. Articulating learning outcomes for students is part of good teaching. If you tell students what you expect them to do, and give them practice in doing it, then there is a good chance that they will be able to do it on a test or major assignment. That is to say, they will have learned what you wanted them to know. If you do not tell them what they
will be expected to do, then they are left guessing what you want. If they guess wrong, they will resent you for being tricky, obscure or punishing.

Finally, outcomes assessment procedures must also be drafted to enable the teacher to determine the degree to which the students are attaining the desired learning outcomes. This identifies, for every outcome, the data that will be gathered, which will guide the selection of the assessment tools to be used and at what point assessment will be done.

Enrichment
Secure a copy of CHED Memorandum Order No. 46, s. 2012, re "Policy Standard to Enhance Quality Assurance in Philippine Higher Education through an Outcomes-Based and Typology-Based QA." You may download the document from this link: https://ched.gov.ph/2012-ched-memorandum-orders/. Find out the detailed OBE standards in higher education. You may refer any queries or clarifications about what you have read to your Professor during his/her consultation period.
Assessment

Activity 1. Fill out the matrix based on your findings of the Educational Objectives (EO) and create your own Learning Outcomes (LO).
Activity 2. Research the nature of education and submit/present your outputs in PowerPoint slides.

Activity 3. The following statements are incorrect. On the blank before each number, write the letter of the section which makes the sentence wrong, and on the blank after each number, rewrite the wrong section to make the sentence correct.

____1. Because of knowledge explanation (a)/ brought about by the use of (b)/ computers in education (c)/ the teacher ceased to be the sole source of knowledge (d).
________________________________________

____2. At present, (a)/ the teacher is the giver of knowledge (b)/ by assisting (c)/ in the organization of facts and information (d).
________________________________________

____3. The change of focus (a)/ in instruction (b)/ from outcomes to content (c)/ is known as Outcomes-Based Education (d).
________________________________________

____4. A good source (a)/ of subject matter statement (b)/ is Benjamin Bloom's (c)/ Taxonomy of Educational Objectives (d).
________________________________________

____5. Education comes (a)/ from the Latin root (b)/ "educare" or "educere" (c)/ which means to "pour in" (d).
________________________________________
____6. In the past, (a)/ the focus (b)/ of instruction (c)/ was learning outcomes (d).
________________________________________

____7. Ability to communicate (a)/ in writing and speaking (b)/ is an example (c)/ of deferred outcome (d).
________________________________________

____8. The content and the outcome (a)/ are the two (b)/ main elements (c)/ of the educative process (d).
________________________________________

____9. Affective refers to mental skills (a)/ such as remembering, (b)/ understanding, applying, analyzing, evaluating, (c)/ synthesizing, and creating (d).
________________________________________

____10. Immediate outcome is the ability (a)/ to apply cognitive, psychomotor, and affective skills (b)/ in various situations many years (c)/ after completion of a course or degree program (d).
________________________________________

Activity 4. Give the meaning of the following word or group of words. Write
your answers on the spaces provided after each number.

1. Outcomes-Based Education
________________________________________
2. Immediate Outcome
________________________________________

3. Deferred Outcome
________________________________________

4. Educational Objective
________________________________________

5. Learning Outcome
________________________________________

6. Student-Centered Instruction
________________________________________

7. Content-Centered Instruction
________________________________________

8. Psychomotor Skill
________________________________________
9. Cognitive Skill
________________________________________

10. Clarity of Focus
________________________________________

References
De Guzman, E., & Adamos, J. (2015). Assessment of Learning 1. Quezon City: Adriana Publishing Co., Inc.

Macayan, J. (2017). Implementing Outcome-Based Education (OBE) framework: Implications for assessment of students' performance. Educational Measurement and Evaluation Review, 8(1).

Navarro, R., Santos, R., & Corpuz, B. (2017). Assessment of Learning I (3rd ed.). Metro Manila: Lorimar Publishing, Inc.

CHAPTER 2
INTRODUCTION TO ASSESSMENT IN LEARNING
Overview
A clear understanding of the course on Assessment of Learning has to begin with one's complete awareness of the fundamental terms and principles. Most importantly, a good grasp of concepts like assessment, learning, evaluation, measurement, testing, and test is requisite knowledge for every pre-service teacher. Sufficient information about these pedagogic elements would certainly heighten his or her confidence in teaching. The principles behind assessment similarly need to be studied, as all activities related to it must be properly grounded; otherwise, assessment is unsound and meaningless. Objective, content, method, tool, criterion, recording,
procedure, feedback, and judgment are some significant factors that must be considered to undertake quality assessment.

Objective
Upon completion of the unit, the students can discuss the fundamental concepts, principles, purposes, roles and classifications of assessment, as well as align the assessment methods to learning targets.

Lesson 1: Basic Concepts and Principles in Assessment

Pre-discussion
Study the picture in Figure 1. Has this something to do with assessment? What are your comments?
What to Expect?
At the end of the lesson, the students can:
1. make a personal definition of assessment;
2. compare assessment with measurement and evaluation;
3. discuss testing and grading;
4. explain the different principles in assessing learning;
5. relate an experience as a student or pupil related to each principle;
6. comment on the tests administered by their past teachers; and
7. perform simple evaluation.
What is assessment?
Let us have some definitions of assessment from varied sources:
1. Assessment involves the use of empirical data on student learning to refine programs and improve student learning. (Assessing Academic Programs in Higher Education by Allen, 2004)
2. Assessment is the process of gathering and discussing information from multiple and diverse sources in order to develop a deep understanding of what students know, understand, and can do with their knowledge as a result of their educational experiences; the process culminates when assessment results are used to improve subsequent learning. (Learner-Centered Assessment on College Campuses: Shifting the Focus from Teaching to Learning by Huba and Freed, 2000)
3. Assessment is the systematic basis for making inferences about the learning and development of students. It is the process of defining, selecting, designing, collecting, analyzing, interpreting, and using information to increase students' learning and development. (Assessing Student Learning and Development: A Guide to the Principles, Goals, and Methods of Determining College Outcomes by Erwin, 1991)
4. Assessment is the systematic collection, review, and use of information about educational programs undertaken for the purpose of improving student learning and development (Palomba & Banta, 1999).
5. Assessment refers to the wide variety of methods or tools that educators use to evaluate, measure, and document the academic readiness, learning progress, skill acquisition, or educational needs of students (Great School Partnership, 2020).
6. David et al. (2020:3) defined assessment as the "process of gathering quantitative and/or qualitative data for the purpose of making decisions."
7. Assessment is defined as a process that is used to keep track of learners' progress in relation to learning standards and in the development of 21st-century skills; to promote self-reflection and personal accountability among students about their own learning; and to provide bases for the profiling of student performance on the learning competencies and standards of the curriculum (DepEd Order No. 8, s. 2015).

Assessment is one of the most critical dimensions of the education process; it focuses not only on identifying how many of the predefined education aims and goals have been achieved but also works as a feedback
mechanism that educators should use to enhance their teaching practices. Assessment is among the main factors that contribute to a high-quality teaching and learning environment.

The value of assessment can be seen in the links that it forms with other education processes. Thus, Lamprianou and Athanasou (2009:22) pointed out that assessment is connected with the education goals of "diagnosis, prediction, placement, evaluation, selection, grading, guidance or administration". Moreover, Biggs (1999) regarded assessment as a critical process that provides information about the effectiveness of teaching and the progress of students, and also makes clearer what teachers expect from students.

Meaning of Learning
We all know that the human brain is immensely complex and still somewhat of a mystery. It follows, then, that learning, as a primary function of the brain, is appreciated in many different senses. To provide you sufficient insights into the term, here are several ways that learning can be described:
1. "A change in human disposition or capability that persists over a period of time and is not simply ascribable to processes of growth." (From The Conditions of Learning by Robert Gagne)
2. Learning is the relatively permanent change in a person's knowledge or behavior due to experience. This definition has three components: 1) the duration of the change is long-term rather than short-term; 2) the locus of the change is the content and structure of knowledge in memory or the behavior of the learner; 3) the cause of the change is the learner's experience in the environment rather than fatigue, motivation, drugs, physical condition or physiologic intervention. (From Learning in Encyclopedia of Educational Research by Richard E. Mayer)
3. It has been suggested that the term learning defies precise definition because it is put to multiple uses. Learning is used to refer to (1) the acquisition and mastery of what is already known about something, (2) the extension and clarification of meaning of one's experience, or (3) an organized, intentional process of testing ideas relevant to problems. In
other words, it is used to describe a product, a process, or a function. (From Learning How to Learn: Applied Theory for Adults by R. M. Smith)
4. A process that leads to change, which occurs as a result of experience and increases the potential of improved performance and future learning. (From Make It Stick: The Science of Successful Learning by Peter C. Brown, Henry L. Roediger III, and Mark A. McDaniel)
5. The process of gaining knowledge and expertise. (From How Learning Works: Seven Research-Based Principles for Smart Teaching by Susan Ambrose, et al.)
6. A persisting change in human performance or performance potential which must come about as a result of the learner's experience and interaction with the world. (From Psychology of Learning for Instruction by M. Driscoll)
7. Learning is "a process that leads to change, which occurs as a result of experience and increases the potential for improved performance and future learning" (Ambrose et al., 2010:3). The change in the learner may happen at the level of knowledge, attitude or behavior. As a result of learning, learners come to see concepts, ideas, and/or the world differently. It is not something done to students, but rather something students themselves do. It is the direct result of how students interpret and respond to their experiences.

From the foregoing definitions, learning can be briefly stated as a change in a learner's behavior toward an improved level, resulting from one's experiences and interactions with his environment. Study the following figures to appreciate better the meaning of "learning."
Figures 2, 3, and 4 (illustrations of learning)
You may be thinking that learning to bake cookies and learning something like Chemistry are not the same at all. In a way, you are right; however, the information you get from assessing what you have learned is the same. Brian used what he learned from each batch of cookies to improve the next batch. You also learn from every homework assignment that you complete, and from every quiz you take, what you still need to study to know the material.

Measurement and Evaluation
Calderon and Gonzales (1993) defined measurement as the process of determining the quantity of achievement of learners by means of appropriate measuring instruments. In measuring, we often utilize standard instruments to assign numerals to traits such as achievement, interest, attitudes, aptitudes, intelligence and performance. The paper-and-pencil
test is the primary instrument in the common practice of educators. Such tests measure specific elements of learning such as readiness to learn, recall of facts, demonstration of skills, or ability to analyze and solve practical problems. Generally, values of a certain attribute are translated into numbers by measurement. Nonetheless, a quantitative measure like a score of 65 out of 80 in a written examination does not have meaning unless interpreted. Essentially, measurement ends when a numerical value is assigned, while evaluation comes in next.

On the other hand, evaluation is possibly the most complex and least understood among the basic terms in assessment of learning. Inherent in the idea of evaluation is "value." When we evaluate, what we are doing is engaging in some process that is designed to provide information that will help us make a judgment about a given situation. Generally, any evaluation process requires information about the situation in question.

In education, evaluation is the process of using the measurements gathered in the assessments. Teachers use this information to judge the relationship between what was intended by the instruction and what was learned. They evaluate the information gathered to determine what students know and understand; how far they have progressed and how fast; and how their scores and progress compare to those of other students.

In short, evaluation is the process of making judgments based on standards and evidences derived from measurements. It is now giving meaning to the measured attributes. With this, it is implicit that a sound evaluation is dependent on the way measurement was carried out. Ordinarily, teachers' decision to pass or fail a learner is determined by the obtained grade relative to the school standard. Thus, if one's final grade is 74 or lower, then it means failing; otherwise, it is a pass when the final grade is 75 or better, since the standard passing or cut-off grade is 75. The same scenario takes place in the granting of academic excellence awards such as Valedictorian, Salutatorian, First Honors, Second Honors, Cum laude, Magna cum laude, Summa cum laude, etc. Here, evaluation means comparing one's grade or achievement against established standards or criteria to arrive at a decision.
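As a minimal illustration of this cut-off rule, the Python sketch below applies the passing grade of 75 described above; the sample grades are made up:

```python
# Minimal sketch of evaluation against a fixed standard:
# the cut-off (passing) grade of 75 described in the text.

PASSING_GRADE = 75

def evaluate(final_grade: float) -> str:
    """Judge a final grade against the school's cut-off: 75 or better passes."""
    return "Passed" if final_grade >= PASSING_GRADE else "Failed"

for grade in (74, 75, 88):  # sample grades for illustration
    print(grade, evaluate(grade))
# 74 Failed
# 75 Passed
# 88 Passed
```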
Therefore, grading of students in schools must be credible to ensure that the giving of awards would be undisputable.

Testing and Grading
A test is used to examine someone's knowledge of something to determine what he or she knows or has learned. Testing measures the level of skill or knowledge that has been reached. David et al. (2020:4) wrote that testing is the most common form of assessment. It refers to the use of a test or battery of tests to collect information on student learning over a specific period of time. A test is a form of assessment, but not all assessments use tests or testing. De Guzman and Adamos (2015:2) described testing as a "formal, systematic procedure for gathering information," while a test is a "tool consisting of a set of questions administered during a fixed period of time under comparable conditions for all students."

Most educational tests are intended to measure a construct. They may also be used to measure the learner's progress for both formative and summative purposes. In practice, a typical teacher often gives a short quiz after teaching a lesson to determine attainment of the learning outcomes. He also undertakes longer assessments upon completion of a chapter, unit, or course to test the learners' degree of achievement. In a similar way, the Professional Regulation Commission (PRC) and the Civil Service Commission (CSC) administer licensure and eligibility examinations to test the readiness or competence of would-be professionals.

On the other hand, grading implies combining several assessments, translating the result into some type of scale that has evaluative meaning, and reporting the result in a formal way. Hence, grading is a process and not merely quantitative values. It is one of the major functions, results, and outcomes of assessing and evaluating students' learning in the educational setting (Magno, 2010). Practically, grading is the process of assigning value to the performance or achievement of a learner based on specified criteria like performance tasks, written tests, major examinations, and homework. It is also a form of evaluation which provides information as to whether a learner passed or failed in a certain task or subject. Thus, a student is given a grade of 85 after
scoring 36 in a 50-item midterm examination. He also received a passing grade of 90 in Mathematics after his detailed grades in written tests and performance tasks were computed.
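To make the idea of combining component grades concrete, here is a rough sketch; the 40/60 weights and the component scores are illustrative assumptions only, since actual weighting schemes vary across schools and subjects:

```python
# Illustrative sketch only: combining component grades into a final grade.
# The weights below are assumptions, not a prescribed scheme.

WEIGHTS = {"written_test": 0.40, "performance_task": 0.60}  # assumed weights

def final_grade(component_grades: dict) -> float:
    """Weighted average of component grades, each already on a 0-100 scale."""
    return sum(WEIGHTS[part] * grade for part, grade in component_grades.items())

# A student with 87 in written tests and 92 in performance tasks
# ends up with a final grade of 90 under these assumed weights.
print(final_grade({"written_test": 87, "performance_task": 92}))  # 90.0
```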
Models in Assessment

The two most common psychometric theories that serve as frameworks for assessment and measurement, especially in the determination of the psychometric characteristics of a measure (e.g., tests, scales), are the classical test theory (CTT) and the item response theory (IRT).
The CTT, also known as the true score theory, explains that variations in the performance of examinees on a given measure are due to variations in their abilities. It assumes that an examinee's observed score on a given measure is the sum of the examinee's true score and some degree of error in the measurement caused by some internal and external conditions. Hence, the CTT also assumes that all measures are imperfect and the scores obtained from a measure could differ from the true score (i.e., the true ability of an examinee). The CTT provides an estimation of item difficulty based on the frequency or number of examinees who correctly answer a particular item; items with a fewer number of examinees with correct answers are considered more difficult. It also provides an estimation of item discrimination based on the number of examinees with higher or lower ability who answer a particular item. If an item is able to distinguish between examinees with higher ability (i.e., higher total test score) and lower ability (i.e., lower total test score), then the item is considered to have good discrimination. Test reliability can also be estimated using approaches from CTT (e.g., Kuder-Richardson 20, Cronbach's alpha). Item analysis based on this theory has been the dominant approach because of the simplicity of calculating the statistics (e.g., item difficulty index, item discrimination index, item-total correlation).
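These CTT statistics are straightforward to compute by hand or in code. The sketch below, assuming a small invented matrix of dichotomously scored responses (1 = correct, 0 = wrong), estimates the item difficulty index and an upper-lower discrimination index; splitting into halves by total score is a simplification of the usual upper/lower grouping conventions:

```python
# Sketch of classical (CTT) item analysis on dichotomously scored items.
# Rows are examinees, columns are items; 1 = correct, 0 = incorrect.
# The response matrix is invented for illustration.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]

def item_difficulty(matrix):
    """p-value per item: proportion of examinees answering correctly.
    A higher p means an easier item."""
    n = len(matrix)
    return [sum(col) / n for col in zip(*matrix)]

def item_discrimination(matrix):
    """Upper-lower index: item p-value in the top half (by total score)
    minus the p-value in the bottom half. A larger gap means the item
    separates high and low scorers better."""
    ranked = sorted(matrix, key=sum, reverse=True)
    half = len(ranked) // 2
    upper, lower = ranked[:half], ranked[-half:]
    return [u - l for u, l in zip(item_difficulty(upper), item_difficulty(lower))]

print(item_difficulty(responses))      # ~[0.83, 0.50, 0.50, 0.33]
print(item_discrimination(responses))  # ~[0.33, 0.33, 1.00, 0.67]
```

Reliability coefficients such as KR-20 or Cronbach's alpha are built from the same matrix (item variances against total-score variance), which is part of why CTT item analysis has remained the easily computed, dominant approach.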
The IRT, on the other hand, analyzes test items by estimating the probability that an examinee answers an item correctly or incorrectly. One of the central differences of IRT from CTT is that in IRT, it is assumed that the characteristic of an item can be estimated independently of the characteristic or ability of an examinee, and vice versa. Aside from item difficulty and item discrimination indices, IRT analysis can provide significantly more information on items and tests, such as fit statistics, the item characteristic curve (ICC), and the test characteristic curve (TCC). There are also different IRT models (e.g., the one-parameter model, the three-parameter model) which can provide item and test information that cannot be estimated using the CTT. In recent years, there has been an increase in the use of IRT analysis as a measurement framework, despite the complexity of the analysis involved, due to the availability of IRT software.
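As a small illustration (not from the module itself), the one-parameter logistic (Rasch) model estimates the probability of a correct response from the difference between an examinee's ability (theta) and an item's difficulty (b), both placed on the same scale:

```python
import math

def rasch_probability(theta: float, b: float) -> float:
    """One-parameter logistic (Rasch) model: the probability that an
    examinee with ability theta answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# When ability equals item difficulty, the probability is exactly 0.5;
# higher ability (or an easier item) pushes it toward 1.
print(rasch_probability(theta=0.0, b=0.0))  # 0.5
print(rasch_probability(theta=1.5, b=0.0))  # ~0.82
```

Plotting this probability across a range of theta values traces the item characteristic curve (ICC) mentioned above.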
Types of Assessment

The most common types of assessment are diagnostic, formative and summative, criterion-referenced and norm-referenced, and traditional and authentic. Other experts add ipsative and confirmative assessments.

Pre-assessment or diagnostic assessment
Before creating the instruction, it is necessary to know what kind of students you are creating the instruction for. Your goal is to get to know your students' strengths and weaknesses and the skills and knowledge they possess before taking the instruction. Based on the data you have collected, you can create your instruction. Usually, a teacher conducts a pre-test to diagnose the learners.

Formative assessment
Formative assessment is a continuous process of several assessments conducted during instruction for the purpose of improving teaching or learning (Black & Wiliam, 2003).

Summative assessment
Summative assessments are quizzes, tests, exams, or other formal evaluations of how much a student has learned throughout a subject. The goal of this assessment is to get a grade that corresponds to a student's understanding of the class material as a whole, such as with a midterm or cumulative final exam.

Confirmative assessment
When your instruction has been implemented in your classroom, it is still
necessary to take assessment. Your goal with confirmative assessments is to find out if the instruction is still a success after a year, for example, and if the way you are teaching is still on point. You could say that a confirmative assessment is an extensive form of a summative assessment (LMS, 2020).

Norm-referenced assessment
This assessment primarily compares one's learning performance against an average norm. It indicates the student's performance in contrast with other students of the same age who answered the same question paper (see Figure 5). It assesses whether the students have performed better or worse than the others, relative to a theoretical average determined by comparing scores.
Criterion-referenced assessment
It measures a student's performance against a fixed set of predetermined criteria or learning standards (see Figure 6). It checks what students are expected to know and be able to do at a specific stage of their education. Criterion-referenced tests are used to evaluate a specific body of knowledge or skill set; it is a test to evaluate the curriculum taught in a course. In practice, these assessments are designed to determine whether students have mastered the material presented in a specific unit. Each student's performance is measured based on the subject matter presented (what the student knows and what the student does not know). Again, all students can get 100% if they have fully mastered the material.

Ipsative assessment
It measures the performance of a student against that student's previous performances. With this method, you are trying to improve by comparing your previous results. You are not compared against other students, which may not be good for your self-confidence (LMS, 2020).

Traditional Assessment
Traditional assessments refer to conventional methods of testing, usually objective paper-and-pencil test items. In general, they measure students' knowledge of the content. Common examples are true or false items, multiple-choice tests, standardized tests, achievement tests, intelligence tests, and aptitude tests.

Authentic Assessment
Authentic assessments refer to evaluative activities wherein students are asked to perform real-world tasks that demonstrate meaningful application of what they have learned. They measure students' ability to apply knowledge of the content in real-life situations and their ability to use what they have learned in meaningful ways. Common examples are demonstrations, hands-on experiments, computer simulations, portfolios, projects, multimedia presentations, role plays, recitals, stage plays, and exhibits.

Principles of Assessment
There are many principles in the assessment of learning. Different literature provides its own yet closely related set of assessment principles. According to David et al. (2020), the following may be considered core principles in assessing learning:
1. Assessment should have a clear purpose. The methods used in collecting information should be based on this purpose. The interpretation of the data collected should be aligned with the purpose that has been set. This principle is congruent with the outcome-based education (OBE) principles of clarity of focus and design down.
2. Assessment is not an end in itself. It serves as a means to enhance student learning. It is not a simple recording or documentation of what learners know and do not know. Collecting information about student
learning, whether formative or summative, should lead to decisions that will allow improvement of the learners.
3. Assessment is an on-going, continuous, and formative process. It consists of a series of tasks and activities conducted over time. It is not a one-shot activity and should be cumulative. Continuous feedback is an important element of assessment. This principle is congruent with the OBE principle of expanded opportunity.
4. Assessment is learner-centered. It is not about what the teacher does but what the learner can do. Assessment of learners provides teachers with an understanding of how they can improve their teaching, which corresponds to the goal of improving student learning.
5. Assessment is both process- and product-oriented. It gives equal importance to the learner's performance or product and the process the learner engaged in to perform or produce the product.
6. Assessment must be comprehensive and holistic. It should be performed using a variety of strategies and tools designed to assess student learning in a holistic way. It should be conducted in multiple periods to assess learning over time. This principle is also congruent with the OBE principle of expanded opportunity.
7. Assessment requires the use of appropriate measures. For assessment to be valid, the assessment tools or measures used must have sound psychometric properties, including, but not limited to, validity and reliability. Appropriate measures also mean that learners must be provided with challenging but age- and context-appropriate assessment tasks. This principle is consistent with the OBE principle of high expectations.
8. Assessment should be as authentic as possible. Assessment tasks or activities should closely, if not fully, approximate real-life situations or experiences. Authenticity of assessment can be thought of as a continuum from least authentic to most authentic, with more authentic tasks expected to be more meaningful for learners.

Summary
Assessment is a systematic process of defining, selecting, designing, collecting, analyzing, interpreting, and using information to increase students' learning and development.

Assessment may be described in terms of its purpose, such as assessment FOR, assessment OF, and assessment AS learning.

Learning is a change in the learner's behaviour towards an improved level as a product of one's experience and interaction with the environment.

Measurement is a process of determining or describing the attributes or characteristics of learners, generally in terms of quantity.

Evaluation is the process of making judgments based on standards and evidence derived from measurements.

A test is a tool consisting of a set of questions administered during a fixed period of time under comparable conditions for all students. Testing measures the level of skill or knowledge that has been reached.

Grading is a form of evaluation which provides information as to whether a learner passed or failed in a certain task or subject.

The most common psychometric theories that serve as frameworks for assessment and measurement in the determination of the psychometric characteristics of a measure are the classical test theory (CTT) and the item response theory (IRT).

The most common types of assessment are diagnostic, formative and summative, criterion-referenced and norm-referenced, and traditional and authentic. Other experts add ipsative and confirmative assessments.

Principles of assessment are guides for teachers in their design and development of outcomes-based assessment tools.
Assessment
1. What is assessment in learning? What is assessment in learning for you?
2. Differentiate the following:
2.1. Measurement and evaluation
2.2. Testing and grading
2.3. Formative and summative assessment
2.4. Classical test theory and item response theory
3. Based on the principles that you have learned, make a simple plan on how you will undertake your assessment with your future students. Consider 2 principles only.
| Principles | Plan for applying the principle in your classroom assessment |
|---|---|
| 1. | |
| 2. | |
4. Choose 3 core principles in assessing learning, and explain them in relation to your experiences with past teachers. A model is provided for your reference.

| Principles | Practices |
|---|---|
| Example: 1. Assessment requires the use of appropriate measures. | One of my high school teachers was very unfair when it comes to giving of assessment. I can still recall how he prepared his test questions that were not actually part of our lessons. Before the test, all of us studied well on the various lessons we discussed in the entire grading period. Unfortunately, a lot of items in the actual examinations were out of the topics. What made it worse is that he would get angry when asked about the mismatch. I think the teacher did not consider the validity of his test, and it was not appropriate. |
| 2. | |
| 3. | |
| 4. | |
5. Evaluate the extent of your knowledge and understanding about assessment of learning and its principles.

| Indicators | Great extent | Moderate extent | Not at all |
|---|---|---|---|
| 1. I can explain the meaning of assessment of learning. | | | |
| 2. I can discuss what learning is. | | | |
| 3. I can compare assessment with measurement and evaluation. | | | |
| 4. I can compare testing and grading. | | | |
| 5. I can discuss the classical test theory. | | | |
| 6. I can enumerate the different types of assessment. | | | |
| 7. I can differentiate between formative and summative assessment. | | | |
| 8. I can explain what each of the principles of assessment means. | | | |
| 9. I can give examples of assessment tasks or activities that do not conform to one or more of the core principles in assessment. | | | |
| 10. I understand what it means to have a good assessment practice in the classroom. | | | |
Enrichment
Secure a copy of DepEd Order No. 8, s. 2015 on the Policy Guidelines on Classroom Assessment for the K to 12 Basic Education Program. Study the policies and be ready to clarify any provisions during G-class. You can access the Order from this link: https://www.deped.gov.ph/2015/04/01/do-8-s-2015-policy-guidelines-on-classroom-assessment-for-the-k-to-12-basic-education-program/

Read DepEd Order No. 5, s. 2013 (Policy Guidelines on the Implementation of the School Readiness Year-end Assessment (SReYA) for Kindergarten), accessible through https://www.deped.gov.ph/2013/01/25/do-5-s-2013-policy-guidelines-on-the-implementation-of-the-school-readiness-year-end-assessment-sreya-for-kindergarten/

Questions
1. What assessment is cited in the Order? What is the purpose of giving such assessment?
2. How would you classify the assessment in terms of its nature? Justify.
3. What is the relevance of this assessment to students, parents and teachers, and the school?
References
Alberta Education (2008, October 1). Types of Classroom Assessment. Retrieved from http://www.learnalberta.ca/content/mewa/html/assessment/types.html

David et al. (2020). Assessment in Learning 1. Manila: Rex Book Store.

De Guzman, E. and Adamos, J. (2015). Assessment of Learning 1. Quezon City: Adriana Publishing Co., Inc.

Fisher, M. R., Jr. (2020). Student Assessment in Teaching and Learning. Retrieved from https://cft.vanderbilt.edu/student-assessment-in-teaching-and-learning/

Navarro, L., Santos, R. and Corpuz, B. (2017). Assessment of Learning 1 (3rd ed.). Quezon City: Lorimar Publishing, Inc.

Magno, C. (2010). The Functions of Grading Students. The Assessment Handbook, 3, 50-58.
Lesson 2: Purposes of Classroom Assessment, Educational Objectives, Learning Targets and Appropriate Methods

Pre-discussion
To be able to achieve the intended learning outcomes of this lesson, one is required to understand the basic concepts, theories, and principles in assessing the learning of students. Should these things not yet be clear and understood, it is advised that a thorough review be made of the previous chapter.

What to Expect?
At the end of the lesson, the students can:
1. articulate the purpose of classroom assessment;
2. tell the difference between the Bloom's Taxonomy and the Revised Bloom's Taxonomy in stating learning objectives;
3. apply the Revised Bloom's Taxonomy in writing learning objectives;
4. discuss the importance of learning targets in instruction;
5. formulate learning targets; and
6. match the assessment methods with specific learning objectives/targets.

Purpose of Classroom Assessment
Assessment works best when its purpose is clearly defined. Without a clear purpose, it is difficult to plan and design assessment effectively and efficiently. In classrooms, teachers are expected to understand the instructional goals and learning outcomes, which will inform how they design and implement their assessment. Generally, the purpose of assessment may be classified in terms of the following:

1. Assessment for Learning (Formative Assessment)
The philosophy behind assessment for learning is that assessment and teaching should be integrated into a whole. The power of such an assessment doesn't come from intricate technology or from using a specific assessment instrument. It comes from recognizing how much learning is taking place in the common tasks of the school day – and how much insight into student learning teachers can mine from this material (McNamee and Chen, 2005:76).

Assessment for learning is on-going assessment that allows teachers to monitor students on a day-to-day basis and modify their teaching based on what the students need to be successful. This assessment provides students with the timely, specific feedback that they need to make adjustments to their learning.

After teaching a lesson, we need to determine whether the lesson was accessible to all students while still challenging to the more capable; what the students learned and still need to know; how we can improve the lesson to make it more effective; and, if necessary, what other lesson we might offer as a better alternative. This continual evaluation of instructional choices is at the heart of improving our teaching practice (Burns, 2005).

2. Assessment of Learning (Summative Assessment)

Assessment of learning is the snapshot in time that lets the teacher,
students and their parents know how well each student has completed the learning tasks and activities. It provides information about student achievement. While it provides useful reporting information, it often has little effect on learning.
Comparing Assessment for Learning and Assessment of Learning

| Assessment for Learning (Formative Assessment) | Assessment of Learning (Summative Assessment) |
|---|---|
| Checks learning to determine what to do next and then provides suggestions of what to do - teaching and learning are indistinguishable from assessment. | Checks what has been learned to date. |
| Is designed to assist educators and students in improving learning. | Is designed for the information of those not directly involved in daily learning and teaching (school administration, parents, school board, Alberta Education, post-secondary institutions) in addition to educators and students. |
| Is used continually by providing descriptive feedback. | Is presented in a periodic report. |
| Usually uses detailed, specific and descriptive feedback - in a formal or informal report. | Usually compiles data into a single number, score or mark as part of a formal report. |
| Is not reported as part of an achievement grade. | Is reported as part of an achievement grade. |
| Usually focuses on improvement, compared with the student's "previous best" (self-referenced, making learning more personal). | Usually compares the student's learning either with other students' learning (norm-referenced, making learning highly competitive) or the standard for a grade level (criterion-referenced, making learning more collaborative and individually focused). |
| Involves the student. | Does not always involve the student. |

Adapted from Ruth Sutton, unpublished document, 2001, in Alberta Assessment Consortium, Refocus: Looking at Assessment for Learning (Edmonton, AB: Alberta Assessment Consortium, 2003), p. 4.
3. Assessment as Learning (Self-assessment)

Assessment as learning develops and supports students' metacognitive skills. This form of assessment is crucial in helping students become lifelong learners. As students engage in peer and self-assessment, they learn to make sense of information, relate it to prior knowledge and use it for new learning. Students develop a sense of ownership and efficacy when they use teacher, peer and self-assessment feedback to make adjustments, improvements and changes to what they understand.
As discussed in the previous chapter, assessment serves as the mechanism by which teachers are able to determine whether instruction worked in facilitating the learning of students. Hence, it is very important that assessment is aligned with instruction and the identified learning outcomes for learners. Knowing what will be taught (curriculum content, competency, and performance standards) and how it will be taught (instruction) is as important as knowing what we want from the very start (curriculum outcome) in determining the specific purpose and strategy for assessment. The alignment is easier if teachers have a clear purpose for performing the assessment. Typically, teachers use classroom assessment for assessment OF learning more than assessment FOR learning and assessment AS learning. Ideally, however, all three purposes of classroom assessment must be used. While it is difficult to perform an assessment with all three purposes in mind, teachers must be able to understand the three purposes of assessment, including knowing when and how to use them.

The Roles of Classroom Assessment in the Teaching-Learning Process
Assessment is an integral part of the instructional process where teachers design and conduct instruction (teaching) so learners achieve the specific target learning outcomes defined by the curriculum. While the purpose of assessment may be classified as assessment of learning, assessment for learning, and assessment as learning, the specific purpose of an assessment depends on the teacher's objective in collecting and evaluating assessment data from learners. More specific objectives for assessing student learning are congruent with the following roles of classroom assessment in the teaching-learning process: formative, diagnostic, evaluative, facilitative, and motivational, each of which is discussed below.

Formative. Teachers conduct assessment because they want to acquire
information on the current status and level of learners' knowledge and skills or competencies. Teachers may need information (e.g., prior knowledge, strengths) about the learners prior to instruction, so they can design their instructional plan to better suit the needs of the learners. Teachers may also need information on learners during instruction to allow them to modify instruction or learning activities to help learners
achieve the learning outcomes. How teachers should facilitate students' learning may be informed by the assessment results.

Diagnostic. Teachers can use assessment to identify specific learners'
weaknesses or difficulties that may affect their achievement of the intended learning outcomes. Identifying these weaknesses allows teachers to focus on specific learning needs and provide opportunities for instructional intervention or remediation inside or outside the classroom. The diagnostic role of assessment may also lead to differentiated instruction or even individualized learning plans when deemed necessary.

Evaluative. Teachers conduct assessment to measure learners' performance
or achievement for the purpose of making judgments, or grading in particular. Teachers need information on whether the learners have met the intended learning outcomes after the instruction is fully implemented. The learners' placement or promotion to the next educational level is informed by the assessment results.

Facilitative. Classroom assessment may affect student learning. On the part
of teachers, assessment for learning provides information on students' learning and achievement that teachers can use to improve instruction and the learning experiences of learners. On the part of learners, assessment as learning allows them to monitor, evaluate, and improve their own learning strategies. In both cases, student learning is facilitated.

Motivational. Classroom assessment can serve as a mechanism for learners
to be motivated and engaged in learning and achievement in the classroom. Grades, for instance, can motivate and demotivate learners. Focusing on progress, providing effective feedback, innovating assessment tasks, and using scaffolding during assessment activities provide opportunities for assessment to be motivating rather than demotivating.

Comparing Educational Goals, Standards, and Objectives
Before discussing what learning targets are, it is important to first define educational goals, standards, and objectives.
Goals. Goals are general statements about desired learner outcomes in a given year or during the duration of a program (e.g., senior high school).

Standards. Standards are specific statements about what learners should know and are capable of doing at a particular grade level, subject, or course. McMillan (2014) described four different types of educational standards: (1) content (desired outcomes in a content area), (2) performance (what students do to demonstrate competence), (3) developmental (sequence of growth and change over time), and (4) grade-level (outcomes for a specific grade).

Educational Objectives. Educational or learning objectives are specific statements of learner performance at the end of an instruction unit. These are sometimes referred to as behavioural objectives and are typically stated with the use of verbs. The most popular taxonomy of educational objectives is Bloom's Taxonomy of Educational Objectives.

The Bloom's Taxonomy of Educational Objectives
Bloom's Taxonomy consists of three domains: cognitive, affective, and psychomotor. These three domains correspond to the three types of goals that teachers want to assess: knowledge-based goals (cognitive), skills-based goals (psychomotor), and affective goals (affective). Hence, there are three taxonomies that can be used by teachers depending on the goals. Each taxonomy consists of different levels of expertise with varying degrees of complexity. The most popular among the three taxonomies is the Bloom's Taxonomy of Educational Objectives for Knowledge-Based Goals. The taxonomy describes six levels of expertise: knowledge, comprehension, application, analysis, synthesis, and evaluation. Table 1 presents the description, illustrative verbs, and a sample objective for each of the six levels.

Table 1. Bloom's Taxonomy of Educational Objectives in the Cognitive Domain
| Cognitive Level | Description | Illustrative Verbs | Sample Objective |
|---|---|---|---|
| Knowledge | Recall or recognition of learned materials like concepts, events, facts, ideas, and procedures | defines, recalls, names, enumerates, and labels | Enumerate the six levels of expertise in the Bloom's taxonomy of objectives in the cognitive domain. |
| Comprehension | Understanding the meaning of a learned material, including interpretation and literal translation | explains, describes, summarizes, and translates | Explain each of the six levels of expertise in the Bloom's taxonomy of objectives in the cognitive domain. |
| Application | Use of abstract ideas, principles, or methods in specific concrete situations | applies, demonstrates, produces, illustrates, and uses | Demonstrate how to use Bloom's taxonomy in formulating learning objectives. |
| Analysis | Separation of a concept or idea into constituent parts or elements and an understanding of the nature and association among the elements | compares, contrasts, categorizes, classifies, and calculates | Compare and contrast the six levels of expertise in Bloom's taxonomy of objectives in the cognitive domain. |
| Synthesis | Construction of elements or parts from different sources to form a more complex or novel structure | composes, constructs, creates, designs, and integrates | Compose learning targets using Bloom's taxonomy. |
| Evaluation | Making judgment of ideas or methods based on sound and established criteria | appraises, evaluates, judges, concludes, and criticizes | Evaluate the congruence between learning targets and assessment methods. |
Bloom's taxonomies of educational objectives provide teachers with a structured guide in formulating more specific learning targets, as they provide an exhaustive list of learning objectives. The taxonomies do not only serve as a guide for teachers' instruction but also as a guide for their assessment of student learning in the classroom. Thus, it is imperative that teachers identify the levels of expertise that they expect the learners to achieve and demonstrate. This will then inform the assessment method required to properly assess student learning. It is assumed that a higher level of expertise in a given domain requires more sophisticated assessment methods or strategies.
The Revised Bloom’s Taxonomy of Educational Objectives
Anderson and Krathwohl (2001) proposed a revision of the Bloom's Taxonomy in the cognitive domain by introducing a two-dimensional model for writing learning objectives. The first dimension, the knowledge dimension, includes four types: factual, conceptual, procedural, and metacognitive. The second dimension, the cognitive process dimension, consists of six types: remember, understand, apply, analyze, evaluate, and create. An educational or
learning objective formulated from this two-dimensional model contains a noun (type of knowledge) and a verb (type of cognitive process). The Revised Bloom's Taxonomy provides teachers with a more structured and more precise approach in designing and assessing learning objectives. Below is an example of an educational or learning objective:

Students will be able to differentiate qualitative research and quantitative research.
In the example, differentiate is the verb that represents the type of cognitive process (in this case, analyze), while qualitative research and quantitative research is the noun phrase that represents the type of
knowledge (in this case, conceptual). Tables 2 and 3 present the definition, illustrative verbs, and sample objectives of the cognitive process dimensions and knowledge dimensions of the Revised Bloom's Taxonomy.

Table 2. Cognitive Process Dimensions in the Revised Bloom's Taxonomy of Educational Objectives
| Cognitive Process | Definition | Illustrative Verbs | Sample Objective |
|---|---|---|---|
| Create | Combining parts to make a whole | compose, produce, develop, formulate, devise, prepare, design, construct, propose, and reorganize | Propose a program of action to help solve Metro Manila's traffic congestion. |
| Evaluate | Judging the value of information or data | assess, measure, estimate, evaluate, critique, and judge | Critique the latest film that you have watched. Use the critique guidelines and format discussed in the class. |
| Analyze | Breaking down information into parts | analyze, calculate, examine, test, compare, differentiate, organize, and classify | Classify the following chemical elements based on some categories/areas. |
| Apply | Applying the facts, rules, concepts, and ideas in another context | apply, employ, practice, relate, use, implement, carry out, and solve | Solve the following problems using the different measures of central tendency. |
| Understand | Understanding what the information means | describe, determine, interpret, translate, paraphrase, and explain | Explain the causes of malnutrition in the country. |
| Remember | Recognizing and recalling facts | identify, list, name, underline, recall, retrieve, and locate | Name the 7th president of the Philippines. |
Table 3. Knowledge Dimensions in the Revised Bloom's Taxonomy of Educational Objectives

| Knowledge | Description | Sample Question |
|---|---|---|
| Factual | This type of knowledge is basic in every discipline. It tells the facts or bits of information one needs to know in a discipline. This type of knowledge usually answers questions that begin with "who", "where", "what", and "when". | Who is the national hero of the Philippines? |
| Conceptual | This type of knowledge is also fundamental in every discipline. It tells the concepts, generalizations, principles, theories, and models that one needs to know in a discipline. This type of knowledge usually answers questions that begin with "what". | What makes the Philippines the "Pearl of the Orient Seas"? |
| Procedural | This type of knowledge is also fundamental in every discipline. It tells the processes, steps, techniques, methodologies, or specific skills needed in performing a specific task that one needs to know and be able to do in a discipline. This type of knowledge usually answers questions that begin with "how". | How do you open a new file in Microsoft Word? |
| Metacognitive | This type of knowledge makes the discipline relevant to one's life. It makes one understand the value of learning in one's life. It requires reflective knowledge and strategies on how to solve problems or perform a cognitive task through understanding of oneself and context. This type of knowledge usually answers questions that begin with "why"; questions that begin with "how" and "what" could be used if they are embedded in a situation that one experiences in real life. | Why is Education the most suitable course for you? |

LEARNING TARGETS
"Students who can identify what they are learning significantly outscore those who cannot." – Robert Marzano

The metaphor that Connie Moss and Susan Brookhart use to describe learning targets in their Educational Leadership article, "What Students Need to Learn," is that of a global positioning system (GPS). Much like a GPS, which communicates timely information about where you are, how far and how long until your destination, and what to do when you make a wrong turn, a learning target provides a precise description of the learning destination. Learning targets tell students what they will learn, how deeply they will learn it, and how they will demonstrate their learning.

Learning targets describe, in student-friendly language, the learning to occur in the day's lesson. They are written from the students' point of view and represent what both the teacher and the students are aiming for during the lesson. Learning targets also include a performance of understanding, or learning experience, that provides evidence to answer the question "What do students understand and what are they able to do?" As Moss and Brookhart write, while a learning target is for a daily lesson, "Most complex understandings require teachers to scaffold student understanding across a series of interrelated lessons." In other words, each learning target is a part of a longer, sequential plan that includes short- and long-term goals.

McMillan (2014) defined a learning target as a statement of student performance for a relatively restricted type of learning outcome that will be achieved in a single lesson or a few days, which contains what students should know, understand, and be able to do at the end of the instruction and the criteria for judging the level of demonstrated performance. It is more specific and clear than educational goals, standards, and learning objectives. To avoid
confusion of terms, De Guzman and Adamos (2015) wrote that the definition of learning targets is similar to that of learning outcomes.
Now, how does a learning target differ from an instructional objective? An instructional objective describes an intended outcome and the nature of evidence that will determine mastery of that outcome from a teacher’s point of view. It contains content outcomes, conditions, and criteria. A learning target, on the other hand, describes the intended lesson-sized learning outcome and the nature of evidence that will determine mastery of that outcome from a student’s point of view. It contains the immediate learning aims for today’s lesson (ASCD, 2021).
Why Use Learning Targets?
According to experts, one of the most powerful formative strategies for improving student learning is clear learning targets for students. In Visible Learning, John Hattie emphasizes the importance of “clearly communicating the intentions of the lessons and the criteria for success. Teachers need to know the goals and success criteria of their lessons, know how well all students in their class are progressing, and know where to go next.” Learning targets ensure that students:
know what they are supposed to learn during the lesson; without a clear learning target, students are left guessing what they are expected to learn and what their teacher will accept as evidence of success.
build skilfulness in their ability to assess themselves and be reflective.
are continually monitoring their progress toward the learning goal and making changes as necessary to achieve their goal.
are in control of their own learning, and not only know where they are going, they know exactly where they are relative to where they are going; they are able to choose strategies to help them do their best, and they know exactly what it takes to be successful.
know the essential information to be learned and how they will demonstrate that learning to achieve mastery. Learning targets are a part of a cycle that includes student goal
setting and teacher feedback. Formative assessment, assessment for learning, starts when the teacher communicates the learning target at the beginning of the lesson. Providing examples of what is expected along with the target written in student-friendly language gives students the opportunity to set goals, self-assess, and make improvements. Types of Learning Targets
Many experts consider four (4) types of learning targets, namely: knowledge, skills, reasoning, and product. Table 4 provides the details of each category.

Table 4. Types of Learning Targets, Description and Sample
| Type | Description | Sample ("I can...") |
|---|---|---|
| Knowledge (know, list, identify, understand, explain) | Knowledge targets represent the factual information, procedural knowledge, and conceptual understandings that underpin each discipline or content area. These targets form the foundation for each of the other types of learning targets. | I can explain the role of a conceptual framework in a research. I can identify metaphors and similes. I can read and write quadratic equations. I can describe the function of a cell membrane. I can explain the effects of an acid on a base. |
| Skills (demonstrate, pronounce, perform) | Skill targets are those where a demonstration or a physical skill-based performance is at the heart of the learning. Most skill targets are found in subjects such as physical education, visual and performing arts, and foreign languages. Other content areas may have a few skill targets. | I can facilitate a focus group discussion (FGD) with research participants. I can measure mass in metric and SI units. I can use simple equipment and tools to gather data. I can read aloud with fluency and expression. I can participate in civic discussions with the aim of solving current problems. I can dribble to keep the ball away from an opponent. |
| Reasoning (predict, infer, summarize, compare, analyze, classify) | Reasoning targets specify thought processes students must learn to do well across a range of subjects. Reasoning involves thinking and applying: using knowledge to solve a problem, make a decision, and so on. These targets move students beyond mastering content knowledge to the application of knowledge. | I can justify my research problems with a theory. I can use statistical methods to describe, analyze, evaluate, and make decisions. I can make a prediction based on evidence. I can examine data/results and propose a meaningful interpretation. I can distinguish between historical fact and opinion. |
| Product (create, design, write, draw, make) | Product targets describe learning in terms of artifacts where creation of a product is the focus of the learning target. With product targets, the specifications for quality of the product itself are the focus of teaching and assessment. | I can write a thesis proposal. I can construct a bar graph. I can develop a personal health-related fitness plan. I can construct a physical model of an object. |
Other experts consider a fifth type of learning target – affect. This refers to affective characteristics that students can develop and demonstrate because of instruction. It includes attitudes, beliefs, interests, and values. Some experts use disposition as an alternative term for affect.

Types of Assessment Methods
Assessment methods can be categorized according to the nature and characteristics of each method. McMillan (2007) identified four major categories: selected-response, constructed-response, teacher observation, and student self-assessment.
Selected-Response vs. Constructed-Response
An assessment, test, or exam is classified as selected-response or constructed-response based on the item types used.

An exam using multiple-choice, true/false, matching, or any combination of these item types is called a selected-response assessment because the student "selects" the correct answer from available answer choices. A selected-response exam is considered to be an objective exam because there is no rating of the student's answer choice – it is either correct or incorrect.
Multiple-Choice Test Items have a stem that poses the problem or
question and three or four answer choices (options). One of the choices is the undeniably correct answer, and the other options are, unquestionably, incorrect answers.
Matching items are somewhat like MC items in that there are item stems
(phrases or statements) and answer choices that are required to be matched to the item stems. There should always be one more answer choice than the number of item stems. Generally, matching items are well suited for testing understanding of concepts and principles.
True-false items have the advantage of being easy to write, more can be
given in the same amount of time compared to MC items, reading time is minimized, and they are easy to score.

Constructed-response items require the student to answer a
question, commonly referred to as a "prompt." A constructed-response exam is considered to be a subjective exam because the correctness of the answer is based on a rater's opinion, typically with the use of a rubric scale to guide the scoring. Essay and short-answer exams are constructed-response assessments because the student has to "construct" the answer.

Comparison between Selected-Response and Constructed-Response
| | Selected-response (e.g., multiple choice, true or false, matching type) | Constructed-response (e.g., short answer, essay) |
|---|---|---|
| Advantages | Easier to score. Can be answered quickly. Covers a broader range of curriculum in a shorter time. | Allows students to demonstrate complex, in-depth understanding. Less likelihood of guessing the correct answer. Motivates students to learn in a way that stresses the organization of information, principles, and application. |
| Disadvantages | Constrains students to a single appropriate answer. Encourages students to learn by recognition. Subject to guessing the correct answer. | More time-consuming to answer. More time-consuming to score. |
Teacher Observation

Teacher observation has been accepted readily in the past as a
legitimate source of information for recording and reporting student demonstrations of learning outcomes. As the student progresses to later years of schooling, less and less attention typically is given to teacher observation, and more and more attention typically is given to formal assessment procedures involving required tests and tasks taken under explicit constraints of context and time. However, teacher observation is capable of providing substantial information on student demonstration of learning outcomes at all levels of education.

For teacher observation to contribute to valid judgments concerning student learning outcomes, evidence needs to be gathered and recorded systematically. Systematic gathering and recording of evidence requires preparation and foresight. Teacher observation can be characterised as two types: incidental and planned.
Incidental observation occurs during the ongoing (deliberate) activities of teaching and learning and the interactions between teacher and students. In other words, an unplanned opportunity emerges, in the context of classroom activities, where the teacher observes some aspect of individual student learning. Whether incidental observation can be used as a basis for formal assessment and reporting may depend on the records that are kept.
Planned observation involves deliberate planning of an opportunity for the teacher to observe specific learning outcomes. This planned opportunity
may occur in the context of regular classroom activities or may occur through the setting of an assessment task (such as a practical or performance activity).

Student Self-Assessment
One form of formative assessment is self-assessment or self-reflection by students. Self-reflection is the evaluation or judgment of the worth of one's performance and the identification of one's strengths and weaknesses with a view to improving one's learning outcomes, or more succinctly, reflecting on and monitoring one's own work processes and/or products (Klenowski, 1995). Student self-assessment has long been encouraged as an educational and learning strategy in the classroom, and is both popular and positively regarded by the general education community (Andrade, 2010).

Besides, McMillan and Hearn (2008) described self-assessment as a process by which students 1) monitor and evaluate the quality of their thinking and behavior when learning and 2) identify strategies that improve their understanding and skills. That is, self-assessment occurs when students judge their own work to improve performance as they identify discrepancies between current and desired performance. This aspect of self-assessment aligns closely with standards-based education, which provides clear targets and criteria that can facilitate student self-assessment. The pervasiveness of standards-based instruction provides an ideal context in which these clear-cut benchmarks for performance and criteria for evaluating student products, when internalized by students, provide the knowledge needed for self-assessment. Finally, self-assessment identifies further learning targets and instructional strategies (correctives) students can apply to improve achievement.

Appropriate Methods of Assessment
Once the learning targets are identified, appropriate assessment methods can be selected to measure student learning. The match between a learning target and the assessment method used to measure if students have met the target is very critical. Tables 5 and 6 present a matrix of the different
46
SULTAN KUDARAT STATE UNIVERSITY
types of learning targets and sample assessment methods. Details of these varied assessment methods shall be discussed thoroughly in Chapter 5.

Table 5. Matching Learning Targets and Assessment Methods
(Multiple choice, true or false, and matching type are selected-response methods; short answer, problem-solving, and essay are constructed-response methods.)

| Learning Targets | Multiple Choice | True or False | Matching Type | Short Answer | Problem-solving | Essay |
|---|---|---|---|---|---|---|
| Knowledge | 3 | 3 | 3 | 3 | 3 | 3 |
| Reasoning | 2 | 1 | 1 | 1 | 3 | 3 |
| Skill | 1 | 1 | 1 | 1 | 2 | 2 |
| Product | 1 | 1 | 1 | 1 | 1 | 1 |

Note: Higher numbers indicate better matches (3 = excellent, 1 = poor).
Table 6. Matching Learning Targets with other Types of Assessment
| Learning Targets | Project-based | Portfolio | Recitation | Observation |
|---|---|---|---|---|
| Knowledge | 2 | 2 | 3 | 2 |
| Reasoning | 1 | 3 | 1 | 2 |
| Skill | 2 | 3 | 1 | 1 |
| Product | 3 | 3 | - | - |

Note: Higher numbers indicate better matches (3 = excellent, 1 = poor).
There are still other types of assessment, and it is up to the teachers to select the method of assessment and design appropriate assessment tasks and activities to measure the identified learning targets. Summary
In the educational setting, the purpose of assessment may be classified in terms of assessment of learning, assessment for learning, and assessment as learning.

Assessment OF learning is held at the end of a subject or a course to determine performance. It is equivalent to summative assessment.

Assessment FOR learning is done repeatedly during instruction to check the learners' progress and the teacher's strategies so that intervention or changes can be made.

Assessment AS learning is done to develop the learners' independence and self-regulation.
Classroom assessment in the teaching-learning process has the following roles: formative, diagnostic, evaluative, facilitative, and motivational.

Educational objectives are best explained through Bloom's Taxonomy. It consists of three (3) domains, namely: cognitive, affective, and psychomotor, which correspond to the main goals of teachers.

Instructional objectives guide instruction, and we write them from the teacher's point of view. Learning targets guide learning and are expressed in language that students understand: the lesson-sized portion of information, skills, and reasoning processes that students will come to know deeply.

Assessment methods may be categorized as selected-response, constructed-response, teacher observation, and student self-assessment.
Learning targets may be knowledge, skills, reasoning or product.
Teachers match learning targets with appropriate assessment methods.
Assessment
1. Describe the 3 purposes of classroom assessment by completing the matrix below.

| | Assessment OF learning | Assessment FOR learning | Assessment AS learning |
|---|---|---|---|
| WHAT? | | | |
| WHY? | | | |
| WHEN? | | | |

2. Compare and contrast the different roles of classroom assessment.
3. Distinguish educational goals, standards, objectives and learning targets using the following table.

| | Goals | Standards | Objectives | Learning targets |
|---|---|---|---|---|
| Description | | | | |
| Sample statements | | | | |
4. Learning targets are similar to learning outcomes. Justify.
5. Determine whether the given learning target is knowledge (K), skill (S), reasoning (R), or product (P).

| Learning Targets | Type |
|---|---|
| 1. I can use data from a random sample to draw inferences about a population with an unknown characteristic of interest. | R |
| 2. I can identify the major reasons for the rapid expansion of Islam during the 7th and 8th centuries. | K |
| 3. I can describe the relationship between illustrations and the story in which they appear. | R |
| 4. I can describe how organisms interact with each other to transfer energy and matter in an ecosystem. | R |
| 5. I can recall the influences that promote alcohol, tobacco, and other drug use. | K |
| 6. I can use characteristic properties of liquids to distinguish one substance from another. | R |
| 7. I can evaluate the quality of my own work to refine it. | R |
| 8. I can identify the main idea of a passage. | K |
| 9. I can dribble the basketball with one hand. | S |
| 10. I can list down the first 5 Philippine Presidents. | K |
| 11. I can construct a bar graph. | P |
| 12. I can develop a personal health-related fitness plan. | P |
| 13. I can measure the length of an object. | S |
| 14. I can introduce myself in Chinese. | S |
| 15. I can compare forms of government. | R |
6. Check the DepEd's K to 12 Curriculum Guide at this link: https://www.deped.gov.ph/k-to-12/about/k-to-12-basic-education-curriculum/grade-1-to-10-subjects/, and select a single lesson that interests you. Complete the learning target activity below based on the given model:

Title of Lesson: Writing the Literature Review of a Thesis Proposal
Instructional objective/learning outcome: At the end of the lesson, the students should be able to demonstrate their ability to write the literature review section of a thesis proposal.

Lesson Content: Literature Research and Research Gap
  Type of Learning Target: Knowledge
  Sample Learning Target: I can explain the principles in writing the literature review of a thesis proposal.
Lesson Content: Performing the Literature Search and Reviewing the Literature
  Type of Learning Target: Reasoning
  Sample Learning Target: I can argue the significance of my thesis through the literature review.
Lesson Content: Principles and Guidelines in Writing the Literature Review
  Type of Learning Target: Skills
  Sample Learning Target: I can search and organize related literature from various sources.
Lesson Content: APA Guidelines in Citations and References
  Type of Learning Target: Product
  Sample Learning Target: I can write an effective review section of a thesis proposal.
Title of Lesson: ______________________________________

Lesson Content   Instructional Objective/   Type of Learning   Sample Learning
                 Learning Outcome           Targets            Targets
7. Evaluate the extent of your knowledge and understanding about the purposes of assessment, learning targets, and appropriate assessment methods.

Indicators                                                        Great    Moderate   Not
                                                                  extent   extent     at all
1. I can enumerate the different purposes of assessment.
2. I can explain the role of assessment in the teaching and
   learning process.
3. I can explain the purpose of conducting classroom assessment.
4. I can differentiate between goals, standards, objectives,
   and learning targets.
5. I can explain the different levels of expertise in Bloom's Taxonomy of Educational Objectives in the cognitive domain.
6. I can explain the difference between Bloom's Taxonomy and the Revised Bloom's Taxonomy.
7. I can compare and contrast instructional objectives and learning targets.
8. I can formulate specific learning targets for a specific lesson.
9. I can match assessment methods appropriate to specific learning targets.
10. I can select or design an assessment task or activity to measure a specific learning target.

Enrichment
Open the DepEd's K to 12 Curriculum Guide from this link: https://www.deped.gov.ph/k-to-12/about/k-to-12-basic-education-curriculum/grade-1-to-10-subjects/, and make yourself familiar with the content standards, performance standards, and competencies. Choose a specific lesson for a subject area and grade level that you want to teach in the future. Prepare an assessment plan using the matrix.

Subject
Grade level
Grade level standard
Performance standards
Specific lesson
Learning targets
Assessment task/activity
Why use this assessment task/activity?
How does this task/activity help you improve your instruction?
How does this assessment task/activity help your learners achieve the intended learning outcomes?

References
Andrade, H. (2010). Students as the definitive source of formative assessment: Academic self-assessment and the self-regulation of learning. In H. Andrade & G. Cizek (Eds.), Handbook of formative assessment (pp. 90-105). New York, NY: Routledge.
Clayton, H. (2016). Power standards: Focusing on the essential. Making the Standards Come Alive! Alexandria, VA: Just ASK Publications. Access at www.justaskpublications.com/just-ask-resource-center/e-newsletters/msca/power-standards/
David et al. (2020). Assessment in Learning 1. Manila: Rex Book Store.
De Guzman, E. and Adamos, J. (2015). Assessment of Learning 1. Quezon City: Adriana Publishing Co., Inc.
EL Education (2020). Students unpack a learning target and discuss academic vocabulary. [Video]. https://vimeo.com/440522199
Hattie, J. (2012). Visible Learning for Teachers: Maximizing Impact on Learning. New York: Routledge.
Klenowski, V. (1995). Student self-evaluation processes in student-centred teaching and learning contexts of Australia and England. Assessment in Education: Principles, Policy & Practice, 2(2).
Maxwell, G. S. (2001). Teacher Observation in Student Assessment (Discussion Paper). The University of Queensland.
Moss, C. and Brookhart, S. (2012). Learning Targets: Helping Students Aim for Understanding in Today's Lesson. Alexandria, VA: ASCD.
Navarro, L., Santos, R. and Corpuz, B. (2017). Assessment of Learning 1 (3rd ed.). Quezon City: Lorimar Publishing, Inc.
Lesson 3: Different Classifications of Assessment
Pre-discussion
Ask the students about their experiences when they took the National Achievement Test (NAT) during their elementary and high school days. Who administered it? How did they answer it? What do they think was the purpose of the NAT? What about their experiences in taking quarterly tests or quizzes? What other assessments or tests did they take before? What are their notable experiences relative to taking tests?

What to Expect?
At the end of the lesson, the students can:
1. compare the following forms of assessment: educational vs. psychological, teacher-made vs. standardized, selected-response vs. constructed-response, achievement vs. aptitude, and power vs. speed;
2. give examples of each classification of test;
3. illustrate situations on the use of different classifications of assessment; and
4. decide on the kind of assessment to be used.
Classifications of Assessment

The different forms of assessment are classified according to purpose, form, function, kind of learning, ability, and interpretation of learning.

Classification                Type
Purpose                       Educational and Psychological
Form                          Paper-and-pencil and Performance-based
Function                      Teacher-made and Standardized
Kind of learning              Achievement and Aptitude
Ability                       Speed and Power
Interpretation of learning    Norm-referenced and Criterion-referenced
Educational and Psychological Assessment

Educational assessment is the process of measuring and documenting what students have learned in their educational environments. In a traditional classroom setting, it focuses on identifying the knowledge, skills, and attitudes students have acquired via a lesson, a course, a grade level, and so on. It is an ongoing process, ranging from the activities that teachers do with students in classrooms every day to standardized testing, college theses, and instruments that measure the success of corporate training programs. Let's understand educational assessments by looking at their many aspects:

• The forms educational assessment can take
• The need for educational assessment
• The essentials of a good assessment
• Types of educational assessment

Educational assessments can take any form:

• They may involve formal tests or performance-based activities.
• They may be administered online or using paper and pencil or other materials.
• They may be objective (requiring a single correct answer) or subjective (there may be many possible correct answers, such as in an essay).
• They may be formative (carried out over the course of a project) or summative (administered at the end of a project or a course).

What these types of educational assessments have in common is that all of them measure the learners' performance relative to previously defined goals, which are usually stated as learning objectives or outcomes. And, because assessment is so widespread, it is vital that educators, as well as parents and students, understand what it is and why it is used.

Psychological assessment is the use of standardized measures to evaluate the abilities, behaviors, and personal qualities of people. Typically, psychological tests attempt to shed light on an individual's intelligence, personality, motivation, interest, psychopathology, or ability. Traditionally, these tests were normed on clinical or psychiatric populations and were used primarily for diagnosis and treatment. However, with the increasing presence of forensic psychologists in the courtroom, these tests are being used to help determine legal questions or legal constructs. As a result, there is a growing debate over the utility of these tests in the courtroom.

Paper-pencil and Performance-based Assessments

Paper-and-pencil instruments refer to a general group of assessment tools in which students read questions and respond in writing. This includes tests, such as knowledge and ability tests, and inventories, such as personality and interest inventories. They can be used to assess job-related knowledge and ability or skill qualifications. The possible range of qualifications which can be assessed using paper-and-pencil tests is quite broad. For example, such tests can assess anything from knowledge of office procedures to knowledge of federal legislation, and from the ability to follow directions to the ability to solve numerical problems. Because many takers can be assessed at the same time with a paper-and-pencil test, such tests are an efficient method of assessment.

All assessment methods must provide information that is relevant to the qualification(s) being assessed. There are four (4) steps in developing paper-and-pencil tests, namely: listing topic areas/tasks; specifying the response format, number of questions, time limit, and difficulty level; writing the questions and developing the scoring guide; and reviewing the questions and scoring guide.

Step 1. Listing topic areas/tasks
For each knowledge/ability qualification that will be assessed by the test, list the topic areas/tasks to be covered. Check off any critical topic areas/tasks that are particularly important to the job. For example, the topic areas that will be covered for the qualification "knowledge of office procedures" might be knowledge of correspondence, knowledge of filing, and knowledge of making travel arrangements. Or, for example, the tasks to be assessed for the qualification "ability to solve numerical problems" might be the ability to add, subtract, multiply, and divide.

Step 2. Specifying the response format, number of questions, the time limit and difficulty level
Prior to writing the questions for your test, you should decide on such things as the response format, the number of questions, the time limit and the difficulty level. What type of response format should I choose? The three most common response formats are: (a) multiple-choice; (b) short answer; and (c) essay.
With a multiple-choice response format, a large number of different topic areas/tasks can be covered within the same test and the questions are easy to score. However, because all potential answers must be chosen by some candidates, it is time-consuming to write good questions.
With a short-answer response format, as in multiple choice, a large number of different topic areas/tasks can be covered within the same test and these questions are easy to score. In addition, less time is required to write these questions compared to multiple-choice ones.
With an essay response format, only a few topic areas/tasks can be covered due to the amount of time it takes to answer questions; however, the content can be covered in greater detail. Essay questions require little time to write but they are very time-consuming to score.
Although at first glance a multiple-choice format may seem a relatively easy and logical choice if breadth of coverage is emphasized, don't be fooled. It is hard to write good multiple-choice questions and you should only choose this type of response format if you are willing to devote a lot of time to editing, reviewing, and revising the questions. If depth of coverage is emphasized, use an essay response format.
Performance-based Assessment
Performance assessment is one alternative to traditional methods of testing student achievement. While traditional testing requires students to answer questions correctly, performance assessment requires students to demonstrate knowledge and skills, including the process by which they solve problems. Performance assessments measure skills such as the ability to integrate knowledge across disciplines, contribute to the work of a group, and develop a plan of action when confronted with a new situation. Performance assessments are also appropriate for determining if students are achieving the higher standards set for all students. This section explains the features of this assessment alternative.

What Are Performance Assessments?

The goal of performance-based learning should be to enhance what the students have learned, not just have them recall facts. The following six (6) types of activities provide good starting points for assessments in performance-based learning.

1. Presentations

One easy way to have students complete a performance-based activity is to have them do a presentation or report of some kind. This activity could be done by students individually, which takes time, or in collaborative groups.
The basis for the presentation may be one of the following:

• Providing information
• Teaching a skill
• Reporting progress
• Persuading others

Students may choose to add in visual aids, a PowerPoint presentation, or Google Slides to help illustrate elements in their speech. Presentations work well across the curriculum as long as there is a clear set of expectations for students to work with from the beginning.

2. Portfolios
Student portfolios can include items that students have created and collected over a period. Art portfolios are for students who want to apply to art programs in college. Another example is when students create a portfolio of their written work that shows how they have progressed from the beginning to the end of a class. The writing in a portfolio can be from any discipline or a combination of disciplines. Some teachers have students select the items they feel represent their best work to be included in a portfolio. The benefit of an activity like this is that it is something that grows over time and is therefore not just completed and forgotten. A portfolio can provide students with a lasting selection of artifacts that they can use later in their academic career. Reflections may be included in student portfolios, in which students may make a note of their growth based on the materials in the portfolio.

3. Performances
Dramatic performances are one kind of collaborative activity that can be used as a performance-based assessment. Students can create, perform, and/or provide a critical response. Examples include dance, recitals, and dramatic enactments. There may be prose or poetry interpretation. This form of performance-based assessment can take time, so there must be a clear pacing guide. Students must be provided time to address the demands of the activity; resources must be readily available and meet all safety standards. Students should have opportunities to draft stage work and practice. Developing the criteria and the rubric and sharing these with students before evaluating a dramatic performance is critical.

4. Projects

Projects are commonly used by teachers as performance-based activities. They can include everything from research papers to artistic representations of information learned. Projects may require students to apply their knowledge and skills while completing the assigned task. They can be aligned with the higher levels of creativity, analysis, and synthesis. Students might be asked to complete reports, diagrams, and maps. Teachers can also choose to have students work individually or in groups.
58
SULTAN KUDARAT STATE UNIVERSITY
Journals may be part of a performance-based assessment. They can be used to record student reflections. Teachers may require students to complete journal entries. Some teachers may use journals as a way to record participation.

5. Exhibits and Fairs

Teachers can expand the idea of performance-based activities by creating exhibits or fairs for students to display their work. Examples range from history fairs to art exhibitions. Students work on a product or item that will be exhibited publicly. Exhibitions show in-depth learning and may include feedback from viewers. In some cases, students might be required to explain or defend their work to those attending the exhibition. Some fairs, like science fairs, could include the possibility of prizes and awards.

6. Debates
A debate in the classroom is one form of performance-based learning that teaches students about varied viewpoints and opinions. Skills associated with debate include research, media and argument literacy, reading comprehension, evidence evaluation, public speaking, and civic skills.

Teacher-made and Standardized Tests
Carefully constructed teacher-made tests and standardized tests are similar in many ways. Both are constructed on the basis of a carefully planned table of specifications, both have the same types of test items, and both provide clear directions to the students. Still, the two differ: in the quality of test items, the reliability of test measures, the procedures for administering and scoring, and the interpretation of scores. No doubt, standardized tests are better in quality, more reliable, and more valid. But a classroom teacher cannot always depend on standardized tests. These may not suit his local needs, may not be readily available, may be costly, and may have different objectives. In order to fulfill the immediate requirements, the teacher has to prepare his own tests, which are usually objective type in nature.
What is a Teacher-made Test?

Teacher-made tests are normally prepared and administered for testing classroom achievement of students, evaluating the method of teaching adopted by the teacher, and assessing other curricular programmes of the school. A teacher-made test is one of the most valuable instruments in the hands of the teacher to serve his purpose. It is designed to address the problems or requirements of the class for which it is prepared. It is prepared to measure the outcomes and content of the local curriculum. It is very flexible, so it can be adapted to any procedure and material. It does not require any sophisticated technique for preparation. Taylor has highly recommended the use of these teacher-made objective-type tests, which require neither all four steps of standardized tests nor the rigorous processes of standardization. Only the first two steps, planning and preparation, are sufficient for their construction.

Features of Teacher-Made Tests
1. The items of the tests are arranged in order of difficulty.
2. These are prepared by the teachers and can be used for prognosis and diagnosis purposes.
3. The test covers the whole content area and includes a large number of items.
4. The preparation of the items conforms to the blueprint.
5. Test construction is not a single man's business; rather, it is a cooperative endeavour.
6. A teacher-made test does not cover all the steps of a standardized test.
7. Teacher-made tests may also be employed as a tool for formative evaluation.
8. Preparation and administration of these tests are economical.
9. The test is developed by the teacher to ascertain the student's achievement and proficiency in a given subject.
10. Teacher-made tests are least used for research purposes.
11. They do not have norms, whereas providing norms is quite essential for standardized tests.
60
SULTAN KUDARAT STATE UNIVERSITY
Uses of Teacher-Made Tests
1. To help a teacher know whether the class is normal, average, above average, or below average.
2. To help him in formulating new strategies for teaching and learning.
3. A teacher-made test may be used as a full-fledged achievement test which covers the entire course of a subject.
4. To measure students' academic achievement in a given course.
5. To assess how far specified instructional objectives have been achieved.
6. To know the efficacy of learning experiences.
7. To diagnose students' learning difficulties and to suggest necessary remedial measures.
8. To certify, classify, or grade the students on the basis of resulting scores.
9. Skilfully prepared teacher-made tests can serve the purpose of standardized tests.
10. Teacher-made tests can help a teacher to render guidance and counselling.
11. Good teacher-made tests can be exchanged among neighbouring schools.
12. These tests can be used as a tool for formative, diagnostic, and summative evaluation.
13. To assess pupils' growth in different areas.

Standardized Test

A standardized test is a test that is given to students in a very consistent manner. It means that the questions on the test are all the same, the time given to each student is also the same, and the way in which the test is scored is the same for all students. Standardized tests are constructed by experts along with explicit instructions for administration, standard scoring procedures, and a table of norms for interpretation.

Thus, a standardized test is administered and scored in a consistent or "standard" manner. These tests are designed in such a way that the questions, conditions for administering, scoring procedures, and interpretations are consistent. Any test in which the same test is given in the same manner to all test takers, and graded in the same manner for everyone, is a standardized test.
Standardized tests do not need to be high-stakes tests, time-limited tests, or multiple-choice tests. The questions can be simple or complex. The subject matter among school-age students is frequently academic skills, but a standardized test can be given on nearly any topic, including driving tests, creativity, personality, professional ethics, or other attributes. The purpose of standardized tests is to compare the performance of one individual with another, an individual against a group, or one group with another group. Below is a list of common standardized tests. You can explore the details of these test titles at http://www.study.com.

• Standardized K-12 Exams
• ISEE: Independent School Entrance Examination
• SSAT: Secondary School Admission Test
• HSPT: High School Placement Test
• SHSAT: Specialized High School Admissions Test
• COOP: Cooperative Admissions Examination Program
• PSAT: Preliminary Scholastic Aptitude Test
• GED: General Educational Development Test
• HiSET: High School Equivalency Test
• ACT: American College Test
• SAT: Scholastic Aptitude Test

Locally, the Department of Education has the National Achievement Test (NAT) for Grades 3, 6, 10, and 12 (see Table 1). Moreover, the Center for Educational Measurement (CEM), a private firm, also has a list of standardized tests for incoming Grade 7 and Grade 11 students, and several others for students entering college, such as the Readiness Test for Colleges and Universities, Nursing Aptitude Test, and Philippine Aptitude Test for Teachers.
Table 1. NAT Examination Information

Grade 3 (Elementary)
  Examinee: All students in both public and private schools.
  Description: Serves as an entrance assessment for the elementary level.
Grade 6 (Elementary)
  Examinee: Graduating students in both public and private schools.
  Description: One of the entrance examinations to proceed to Junior High School.
Grade 10 (Junior High School)
  Examinee: Graduating students in both public and private schools.
  Description: One of the entrance examinations to proceed to Senior High School.
Grade 12 (Senior High School completers; called the Basic Education Exit Assessment (BEEA))
  Examinee: Graduating students in both public and private schools.
  Description: Taken for purposes of systems evaluation; not a prerequisite for graduation or college enrolment.

Note: The test is a system-based assessment designed to gauge learning outcomes across target levels in identified periods of basic education. Empirical information on the achievement level of pupils/students serves as a guide for policy makers, administrators, curriculum planners, principals, and teachers, along with analysis of the performance of regions, divisions, schools, and other variables overseen by DepEd.
Achievement and Aptitude Tests

How do we determine what a person knows about a certain subject? Or how do we determine an individual's level of skill in a certain area? One of the most common ways to do this is to use an achievement test.

What is an Achievement Test?

An achievement test is designed to measure a person's level of skill, accomplishment, or knowledge in a specific area. The achievement tests that most people are familiar with are the standard exams taken by every student in school. Students are regularly expected to demonstrate their learning and proficiency in a variety of subjects. In most cases, certain scores on these achievement tests are needed in order to pass a class or continue on to the next grade level (Cherry, 2020). Some examples of achievement tests include:

• A math exam covering the latest chapter in your book
• A test in your Psychology class
• A comprehensive final in your Purposive Communication class
• The ACT and SAT exams
• A skills demonstration in your PE class

Each of these tests is designed to assess how much you know at a specific point in time about a certain topic. Achievement tests are not used to determine what you are capable of; they are designed to evaluate what you know and your level of skill at the given moment. Achievement tests are often used in educational and training settings. In schools, achievement tests are frequently used to determine the level of education for which students might be prepared. Students might take such a test to determine if they are ready to enter a particular grade level, or if they are ready to pass a particular subject or grade level and move on to the next. Standardized achievement tests are also used extensively in educational settings to determine if students have met specific learning goals. Each grade level has certain educational expectations, and testing is used to determine if schools, teachers, and students are meeting those standards.

Aptitude Test

Unlike achievement tests, which are concerned with looking at a person's level of skill or knowledge at any given time, aptitude tests are instead focused on determining how capable a person might be of performing a certain task. An aptitude test is designed to assess what a person is capable of doing or to predict what a person is able to learn or do given the right education and instruction. It represents a person's level of competency to perform a certain type of task. Such aptitude tests are often used to assess academic potential or career suitability and may be used to assess either mental or physical talent in a variety of domains. Some examples of aptitude tests include:
• A test assessing an individual's aptitude to become a fighter pilot
• A career test evaluating a person's capability to work as an air traffic controller
• An aptitude test given to high school students to determine which types of careers they might be good at
• A computer programming test to determine how a job candidate might solve different hypothetical problems
• A test designed to test a person's physical abilities needed for a particular job, such as a police officer or firefighter

Students often encounter a variety of aptitude tests throughout school as they think about what they might like to study in college or do for a career someday. High school students often take a variety of aptitude tests designed to help them determine what they should study in college or pursue as a career. These tests can sometimes give a general idea of what might interest students as a future career. For example, a student might take an aptitude test suggesting that they are good with numbers and data. The results might imply that a career as an accountant, banker, or stockbroker would be a good choice for that particular student. Another student might find that they have strong language and verbal skills, which might suggest that a career as an English teacher, writer, or journalist might be a good choice.

Thus, an aptitude test measures one's ability to reason and learn new skills. Aptitude tests are used worldwide to screen applicants for jobs or educational programs. Depending on your industry and role, you may have to take one or more of the following kinds of tests, each focused on specific skills:
• Numerical Reasoning Test
• Verbal Reasoning Test
• Abstract Reasoning Test
• Mechanical Aptitude Test
• Inductive Reasoning Test
Speed Test versus Power Test

Speed tests consist of easy items that need to be completed within a time limit. Most group tests of mental ability and achievement are administered with time limits. In some cases, the time limits are of no importance, as nearly every subject completes all they can do correctly. In other tests, the limits are short enough to make rate of work an important factor in the score; these are called speed tests.

In the context of educational measurement, a power test usually refers to a measurement tool composed of several items and applied without a relevant time limit. The respondents have a very long time, or even unlimited time, to solve each of the items, so they can usually attempt all of them. The total score is often computed as the number of items correctly answered, and individual differences in the scores are attributed to differences in the ability under assessment, not to differences in basic cognitive abilities such as processing speed or reaction time.

An example of a speed test is a typing test in which examinees are required to type correctly as many words as possible given a limited amount of time. An example of a power test was the one developed by the National Council of Teachers in Mathematics that determines the ability of the examinees to utilize data to reason and become creative, formulate, solve, and reflect critically on the problems provided.

Summary
In this lesson, we identified and distinguished from each other the different classifications of assessment. We learned when to use educational and psychological assessment, or paper-and-pencil and performance-based assessment. Also, we were able to differentiate teacher-made and standardized tests, achievement and aptitude tests, as well as speed and power tests.
Assessment
1. Which classification of assessment is commonly used in the classroom setting? Why?
2. To demonstrate understanding, try giving more examples for each type of assessment.

Type                  Examples
Educational
Psychological
Paper and pencil
Performance-based
Teacher-made
Standardized
Achievement
Aptitude
Speed
Power
Norm-referenced
Criterion-referenced

3. Match the learning target with the appropriate assessment methods. Check if the type of assessment is appropriate. Be ready to justify.

Methods: Selected-response | Essay | Performance Task | Teacher observation | Self-assessment

Example: Exhibit proper dribbling of a basketball - Performance Task √, Teacher observation √, Self-assessment √

1. Identify parts of a microscope and its functions
2. Compare the methods of assessment
3. Arrange the eating utensils on the table
4. Perform the dance steps in "Pandanggo sa Ilaw"
5. Define assessment
6. Compare and contrast testing and grading
7. List down all the Presidents of the Philippines
8. Find the speed of a car
9. Recite the mission of SKSU
10. Prepare a lesson plan in Mathematics
4. Give the features and uses of the following assessments.

Classifications of Assessment                  Description   Use or purpose
1. Speed vs. Power tests
2. Achievement vs. Aptitude tests
3. Educational vs. Psychological tests
4. Selected- vs. constructed-response tests
5. Paper-pencil vs. performance-based tests
5. Evaluate the extent of your knowledge and understanding about assessment of learning and its principles.

Indicators                                                        Great    Moderate   Not
                                                                  extent   extent     at all
1. I can discuss performance-based assessment.
2. I can explain the meaning of a selected-response test.
3. I can compare power and speed tests.
4. I can compare achievement and aptitude tests.
5. I can discuss the constructed-response test.
6. I can list down the different classifications of assessment.
7. I can differentiate between teacher-made and standardized
   tests.
8. I can explain the portfolio as one of the performance-based
   assessments.
9. I can give examples of aptitude tests.
10. I can decide what response format (multiple choice, short
    answer, essay) is more applicable.
Enrichment
Check the varied products of the Center for Educational Measurement (CEM) as regards standardized tests. Access them through this link: https://www.cem-inc.org.ph/products
Try taking a free Personality Test available online. You can also try an IQ test. Share the results with the class.
References
Aptitude Tests. Retrieved from https://www.aptitude-test.com/aptitude-tests.html
Cherry, K. (2020, February 06). How achievement tests measure what people have learned. Retrieved from https://www.verywellmind.com/what-is-an-achievement-test-2794805
Classroom Assessment. Retrieved from https://fcit.usf.edu/assessment/selected/responseb.html
David et al. (2020). Assessment in Learning 1. Manila: Rex Book Store.
De Guzman, E. and Adamos, J. (2015). Assessment of Learning 1. Quezon City: Adriana Publishing Co., Inc.
Improving your test questions. Retrieved from https://citl.illinois.edu/citl-101/measurement-evaluation/exam-scoring/improving-your-test-questions?src=cte-migration-map&url=%2Ftesting%2Fexam%2Ftest_ques.html
Navarro, L., Santos, R. and Corpuz, B. (2017). Assessment of Learning 1 (3rd ed.). Quezon City: Lorimar Publishing, Inc.
University of Lethbridge (2020). Creating assessments. Retrieved from https://www.uleth.ca/teachingcentre/exams-and-assignments
CHAPTER 3 DEVELOPMENT AND ENHANCEMENT OF TESTS
Overview
This chapter deals with the process and mechanics of developing a written test, which is understandably a teacher-made type. As future professional teachers, one has to be competent in the selection of the learning objectives or outcomes, the preparation of a table of specifications (TOS), the guidelines in writing varied written test formats, and the writing of the test itself. Adequate knowledge of TOS construction is indispensable in formulating a test that is valid in terms of content and construct. Also, a complete understanding of the rules and guidelines in writing a specific test format would probably ensure an acceptable and unambiguous test which is fair to the learners. In addition, reliability and validity are two important characteristics of a test that shall likewise be included to guarantee quality. For test item enhancement, topics such as the difficulty index, the index of discrimination, and even distracter analysis are introduced.

Objective
Upon completion of the unit, the students can demonstrate their knowledge, understanding, and skills in planning, developing, and enhancing a written test.

Lesson 1: Planning a Written Test
Pre-discussion
The setting of learning objectives for the assessment of a course or subject and the construction of a table of specifications for a classroom test require specific skills and experience. To successfully perform these tasks, a pre-service teacher should be able to distinguish the different levels of cognitive behavior and identify the appropriate assessment
method for them. It is assumed that in this lesson, the competencies for instruction that are cognitive in nature are the ones identified as the targets in developing a written test, which should be reflected in the test’s table of specifications to be created. What to Expect?
At the end of the lesson, the students can:
1. define the necessary instructional outcomes to be included in a written test;
2. describe what a table of specifications (TOS) is and its formats;
3. prepare a TOS for a written test; and
4. demonstrate the systematic steps in making a TOS.
Planning a Written Test
To be prepared to learn, write, or enhance skills in planning for a good classroom test, pre-service teachers need to review their prior knowledge of lesson plan development, constructive alignment, and different test formats. Hence, aside from this chapter, it is strongly suggested that you read books and other references in print or online that could help you design a good written test.

Defining the Test Objectives or Learning Outcomes for Assessment
In designing a well-planned written test, first and foremost, you should be able to identify the intended learning outcomes in a course where a written test is an appropriate method to use. These learning outcomes are the knowledge, skills, attitudes, and values that every student should develop throughout the course or subject. Clear articulation of learning outcomes is a primary consideration in lesson planning because it serves as the basis for evaluating the effectiveness of the teaching and learning process as determined through testing or assessment. Learning objectives or outcomes are measurable statements that articulate, at the beginning of the course, what students should know, be able to do, or value as a result of taking the course. These learning goals provide the rationale for the curriculum and
instruction. They provide teachers the focus and direction on how the course is to be handled, particularly in terms of course content, instruction, and assessment. On the other hand, they provide the students with the reasons and motivation to study and endure. They provide students the opportunities to be aware of what they need to do to be successful in the course, take control and ownership of their progress, and focus on what they should be learning. Setting objectives for assessment is the process of establishing direction to guide both the teacher in teaching and the student in learning. What are the objectives for testing?
In developing a written test, the cognitive behaviors of learning outcomes are usually targeted. For the cognitive domain, it is important to identify the levels of behavior expected from the students. Typically, Bloom's Taxonomy is used to classify learning objectives based on levels of complexity and specificity of the cognitive behaviors. With knowledge at the base (i.e., the lower-order thinking skill), the categories move to comprehension, application, analysis, synthesis, and evaluation. However, Anderson and Krathwohl (2001), Bloom's student and research partner, respectively, came up with a revised taxonomy, in which the nouns used to represent the levels of cognitive behavior were replaced by verbs, and synthesis and evaluation were switched. Figure 1 presents the two taxonomies.
Figure 1. Taxonomies of Instructional Objectives
In developing the cognitive domain of instructional objectives, key verbs can be used. Benjamin Bloom created a taxonomy of measurable verbs to help us describe and classify observable knowledge, skills, attitudes, behaviors, and abilities. The theory is based upon the idea that there are levels of observable actions that indicate something is happening in the brain (cognitive activity). By creating learning objectives using measurable verbs, you indicate explicitly what the student must do in order to demonstrate learning. Please refer to Figure 2 and Table 1.
Figure 2. Bloom’s Taxonomy of Measurable Verbs
For better understanding, Bloom has the following description for each cognitive domain level:
Knowledge - Remember previously learned information
Comprehension - Demonstrate an understanding of the facts
Application - Apply knowledge to actual situations
Analysis - Break down objects or ideas into simpler parts and find evidence to support generalizations
Synthesis - Compile component ideas into a new whole or propose alternative solutions
Evaluation - Make and defend judgments based on internal evidence or external criteria

Table 1. Bloom's verb charts
Revised Bloom's Level   Key Verbs (keywords)
Create                  design, formulate, build, invent, create, compose, generate, derive, modify, develop
Evaluate                choose, support, relate, determine, defend, judge, grade, compare, contrast, argue, justify, convince, select, evaluate
Analyze                 classify, break down, categorize, analyze, diagram, illustrate, criticize, simplify, associate
Apply                   calculate, predict, apply, solve, illustrate, use, demonstrate, determine, model, perform, present
Understand              describe, explain, paraphrase, restate, give original examples of, summarize, contrast, interpret, discuss
Remember                list, recite, outline, define, name, match, quote, recall, identify, label, recognize
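To see how the verb chart can be put to work, consider the short sketch below. It is an illustration only, written in Python; the verb lists are abridged from Table 1, and the helper name classify_objective is our own, not part of any standard library.

    # Illustrative only: map abridged key verbs from Table 1 to the levels
    # of the Revised Bloom's Taxonomy, then guess the level of a stated
    # objective from its leading verb.
    BLOOM_VERBS = {
        "Remember": {"list", "define", "name", "recall", "identify", "label"},
        "Understand": {"describe", "explain", "paraphrase", "summarize", "interpret"},
        "Apply": {"calculate", "solve", "use", "demonstrate", "perform"},
        "Analyze": {"classify", "categorize", "analyze", "diagram", "criticize"},
        "Evaluate": {"judge", "defend", "justify", "argue", "evaluate"},
        "Create": {"design", "formulate", "compose", "invent", "develop"},
    }

    def classify_objective(objective):
        """Return the Bloom level whose verb list contains the objective's first word."""
        first_verb = objective.lower().split()[0]
        for level, verbs in BLOOM_VERBS.items():
            if first_verb in verbs:
                return level
        return "Unclassified"

    print(classify_objective("Explain the principles of test construction"))  # Understand
    print(classify_objective("Design a table of specifications"))             # Create

A real objective bank would need richer matching (verbs can appear mid-sentence, and some verbs sit in more than one level), but the lookup idea is the same.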
Bloom's Definitions

Remembering - Exhibit memory of previously learned material by recalling facts, terms, basic concepts, and answers.

Understanding - Demonstrate understanding of facts and ideas by organizing, comparing, translating, interpreting, giving descriptions, and stating main ideas.

Applying - Solve problems in new situations by applying acquired knowledge, facts, techniques, and rules in a different way.

Analyzing - Examine and break information into parts by identifying motives or causes. Make inferences and find evidence to support generalizations.

Evaluating - Present and defend opinions by making judgments about information, validity of ideas, or quality of work based on a set of criteria.

Creating - Compile information together in a different way by combining elements in a new pattern or proposing alternative solutions.
Table of Specifications

A table of specifications (TOS), sometimes called a test blueprint, is a tool used by teachers to design a written test. It is a table that maps out the test objectives; the contents or topics covered by the test; the levels of cognitive behavior to be measured; the distribution, number, placement, and weights of test items; and the test format. It helps ensure that the course's intended learning outcomes, assessments, and instruction are aligned. Generally, the TOS is prepared before a test is created. However, it is ideal to prepare one even before the start of instruction. Teachers need to create a TOS for every test that they intend to develop. The test TOS is important because it does the following:

• Ensures that the instructional objectives and what the test captures match
• Ensures that the test developer will not overlook details that are considered essential to a good test
• Makes developing a test easier and more efficient
• Ensures that the test will sample all important content areas and processes
• Is useful in planning and organizing
• Offers an opportunity for teachers and students to clarify achievement expectations
General Steps in Developing a Table of Specifications
Learner assessment within the framework of classroom instruction requires good planning. These are the steps in developing a table of specifications:

1. Determine the objectives of the test. The first step is to identify the test objectives. These should be based on the instructional objectives. In general, the instructional objectives or the intended learning outcomes are identified at the start, when the teacher creates the course syllabus. Normally, there are three types of objectives: (1) cognitive, (2) affective, and (3) psychomotor. Cognitive objectives are designed to increase an individual's knowledge, understanding, and awareness. On the other hand, affective objectives aim to change an individual's attitude into something desirable, while psychomotor objectives are designed to build physical or motor skills. When planning for assessment, choose only the objectives
that can be best captured by a written test. There are objectives that are not meant for a written test. For example, if you test the psychomotor domain, it is better to do a performance-based assessment. There are also cognitive objectives that are sometimes better assessed through performance-based assessment. Those that require the demonstration or creation of something tangible, like projects, would also be more appropriately measured by performance-based assessment. For a written test, you can consider cognitive objectives, ranging from remembering to creating ideas, that can be measured using common formats for testing, such as multiple choice, alternative response, matching type, and even essays or open-ended tests.
2. Determine the coverage of the test. The next step in creating the TOS is to determine the contents of the test. Only topics or contents that have been discussed in class and are relevant should be included in the test.

3. Calculate the weight for each topic. Once the test coverage is determined, the weight of each topic covered in the test is determined. The weight assigned per topic is based on the relevance of, and the time spent to cover, each topic during instruction. The percentage of time for a topic is determined by dividing the time spent on that topic by the total instructional time covered by the test. For example, for a test on the Theories of Personality in a General Psychology 101 class, the teacher spent 0.5 to 1.5 class sessions per topic, or five one-hour sessions in all. As such, the weight for each topic is as follows:

Topics                      No. of Sessions       Time Spent          Percent of Time (Weight)
Theories and Concepts       0.5 class sessions    30 min              10.0
Psychoanalytic Theories     1.5 class sessions    90 min              30.0
Trait Theories              1 class session       60 min              20.0
Humanistic Theories         0.5 class session     30 min              10.0
Cognitive Theories          0.5 class session     30 min              10.0
Behavioral Theories         0.5 class session     30 min              10.0
Social Learning Theories    0.5 class session     30 min              10.0
Total                       5 class sessions      300 min (5 hours)   100
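The computation in Step 3 is simple enough to check with a few lines of code. The following Python sketch assumes the session times from the sample table above; the function name percent_weights is our own illustration.

    # Step 3 as arithmetic: weight (%) = time spent on a topic / total time x 100.
    # Topic names and minutes follow the sample table above.
    time_spent = {
        "Theories and Concepts": 30,
        "Psychoanalytic Theories": 90,
        "Trait Theories": 60,
        "Humanistic Theories": 30,
        "Cognitive Theories": 30,
        "Behavioral Theories": 30,
        "Social Learning Theories": 30,
    }

    def percent_weights(minutes_per_topic):
        total = sum(minutes_per_topic.values())  # 300 minutes in this example
        return {topic: 100 * m / total for topic, m in minutes_per_topic.items()}

    for topic, weight in percent_weights(time_spent).items():
        print(f"{topic}: {weight:.1f}%")  # e.g., Psychoanalytic Theories: 30.0%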
4. Determine the number of items for the whole test. To determine the number of items to be included in the test, the amount of time needed to answer the items is considered. As a general rule, students are given 30-60 seconds for each item in test formats with choices. For a one-hour class, this means that the test should not exceed 60 items. However, because you also need to give time for test paper/booklet distribution and giving instructions, the number of items should be less, maybe just 50 items.

5. Determine the number of items per topic. To determine the number of items per topic, the weights per topic are considered. Thus, using the examples above, for a 50-item final test, Theories and Concepts, Humanistic Theories, Cognitive Theories, Behavioral Theories, and Social Learning Theories will have 5 items each, Trait Theories will have 10 items, and Psychoanalytic Theories will have 15 items.

Topic                       Percent of Time (Weight)   No. of Items
Theories and Concepts       10.0                       5
Psychoanalytic Theories     30.0                       15
Trait Theories              20.0                       10
Humanistic Theories         10.0                       5
Cognitive Theories          10.0                       5
Behavioral Theories         10.0                       5
Social Learning Theories    10.0                       5
Total                       100                        50 items
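Steps 4 and 5 reduce to two small calculations: fix the test length from the available testing time (using the 30-60 seconds-per-item rule of thumb), then distribute the items according to the Step 3 weights. A minimal, illustrative sketch follows; with weights as round as these, simple rounding works, but ragged weights may need manual adjustment so the counts still sum to the total.

    # Steps 4-5: derive the test length from testing time, then allocate
    # items per topic in proportion to the Step 3 weights.
    testing_minutes = 50       # one-hour period minus time for distribution/instructions
    seconds_per_item = 60      # generous end of the 30-60 second rule
    total_items = testing_minutes * 60 // seconds_per_item  # -> 50 items

    weights = {                # percent of time, from the Step 3 table
        "Theories and Concepts": 10.0,
        "Psychoanalytic Theories": 30.0,
        "Trait Theories": 20.0,
        "Humanistic Theories": 10.0,
        "Cognitive Theories": 10.0,
        "Behavioral Theories": 10.0,
        "Social Learning Theories": 10.0,
    }

    items = {topic: round(total_items * w / 100) for topic, w in weights.items()}
    assert sum(items.values()) == total_items  # adjust by hand if rounding drifts
    print(items)  # {'Theories and Concepts': 5, 'Psychoanalytic Theories': 15, ...}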
Different Formats of a Table of Specifications
The TOS of a test may be drafted in a one-way, two-way, or three-way format.
1. One-Way TOS. A one-way TOS maps out the content or topic, test objectives, number of hours spent, and the format, number, and placement of items. This type of TOS is easy to develop and use because it just works around the objectives without considering the different levels of cognitive behavior. However, a one-way TOS cannot ensure that all levels of cognitive behavior that should have been developed by the course are covered in the test.

Topics           Test Objectives               No. of Hours   Format and Placement   No. and Percent
                                               Spent          of Items               of Items
Theories and     Recognize important           0.5            Multiple Choice,       5 (10.0%)
Concepts         concepts in personality                      Item #s 1-5
                 theories
Psychoanalytic   Identify the different        1.5            Multiple Choice,       15 (30.0%)
Theories         theories of personality                      Item #s 6-20
                 under the Psychoanalytic
                 Model
Others           xxx                           xxx            xxx                    xxx
Total                                          5                                     50 (100%)
2. Two-Way TOS. A two-way TOS reflects not only the content, time spent, and number of items but also the levels of cognitive behavior targeted per test content, based on the theory behind cognitive testing. For example, the common framework for testing at present in the DepEd Classroom Assessment Policy is the Revised Bloom's Taxonomy (DepEd, 2015). One advantage of this format is that it allows one to see the levels of cognitive skills and dimensions of knowledge that are emphasized by the test. It also shows the framework of assessment used in the development of the test. Nonetheless, this format is more complex than the one-way format.

Content          Time    No. & Percent   KD   R           U          AP           AN           E          C
                 Spent   of Items
Theories and     0.5     5 (10.0%)       F    I.3 #1-3
Concepts         hours                   C                I.2 #4-5
Psychoanalytic   1.5     15 (30.0%)      F    I.2 #6-7
Theories         hours                   C                I.2 #8-9   I.2 #10-11
                                         P                           I.2 #12-13   I.2 #14-15
                                         M                                        I.3 #16-18   II.1 #41   II.1 #42
Others
Scoring                                       1 point per item      2 points per item         3 points per item
Overall Total            50 (100.0%)          20                    20                        10

Another presentation is shown below:
Content     Time    No. of     Level of Cognitive Behavior & Knowledge Dimension*,
            Spent   Items      Item Format, No. & Placement of Items
                               R          U         AP          AN          E         C
Theories    0.5     5          I.3        I.2
and         hours   (10.0%)    #1-3       #4-5
Concepts                       (F)        (C)
Psycho-     1.5     15         I.2        I.2       I.2         I.2         II.1      II.1
Analytic    hours   (30.0%)    #6-7       #8-9      #10-11      #14-15      #41       #42
Theories                       (F)        (C)       (C)         (P)         (M)       (M)
                                                    I.2         I.3
                                                    #12-13      #16-18
                                                    (P)         (M)
Others
Scoring                        1 point per item     3 points per item       5 points per item
Overall             50         20                   20                      10
Total               (100.0%)
*Legend: KD = Knowledge Dimension (Factual, Conceptual, Procedural, Metacognitive); I – Multiple Choice; II – Open-Ended
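To make the scoring arithmetic concrete, here is a brief illustrative sketch in Python (added here, not part of the original module). It assumes, per the scoring row above, that the 1-, 3-, and 5-point values apply to the remembering/understanding, applying/analyzing, and evaluating/creating item groups, respectively:

    # Illustrative only: item counts and point values taken from the
    # scoring row of the two-way TOS above (an assumption of this sketch).
    scoring_bands = [
        (20, 1),  # R/U items: 20 items at 1 point each
        (20, 3),  # AP/AN items: 20 items at 3 points each
        (10, 5),  # E/C items: 10 items at 5 points each
    ]
    total_items = sum(count for count, _ in scoring_bands)           # 50 items
    total_points = sum(count * pts for count, pts in scoring_bands)  # 130 points
    print(total_items, total_points)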
3. Three-Way TOS. This type of TOS reflects the features of one-way and two-way TOS. One advantage of this format is that it challenges the test writer to classify objectives based on the theory behind the assessment. It also shows the variability of the thinking skills targeted by the test. However, it takes much longer to develop this type of TOS.

Content         Learning Objective                 Time    No. of      R          U         AP          AN          E         C
                                                   Spent   Items
Theories and    Recognize important concepts       0.5     5           I.3 #1-3   I.2 #4-5
Concepts        in personality theories            hours   (10.0%)     (F)        (C)
Psycho-         Identify the different theories    1.5     15          I.2 #6-7   I.2 #8-9  I.2 #10-11  I.2 #14-15  II.1 #41  II.1 #42
Analytic        of personality under the           hours   (30.0%)     (F)        (C)       (C)         (P)         (M)       (M)
Theories        psychoanalytic model                                                        I.2 #12-13  I.3 #16-18
                                                                                            (P)         (M)
Others
Scoring                                                                1 point per item     3 points per item       5 points per item
Overall Total                                              50 (100%)   20                   20                      10
*Legend: KD = Knowledge Dimension (Factual, Conceptual, Procedural, Metacognitive); I – Multiple Choice; II – Open-Ended
Summary
Bloom's taxonomy is a set of three hierarchical models used to classify learning objectives into levels of complexity and specificity. The three lists cover the learning objectives in cognitive, affective and psychomotor domains.
The cognitive domain list has been the primary focus of most traditional education and is frequently used to structure curriculum learning objectives, assessments and activities.
In the original version of the taxonomy, the cognitive domain is broken into the following six levels of objectives: knowledge, comprehension, application, analysis, synthesis, and evaluation.

In the 2001 revised edition of Bloom's taxonomy, the levels are slightly different: Remember, Understand, Apply, Analyze, Evaluate, and Create (replacing Synthesis).
Knowledge involves recognizing or remembering facts, terms, basic concepts, or answers without necessarily understanding what they mean.
Comprehension involves demonstrating an understanding of facts and ideas by organizing, comparing, translating, interpreting, giving descriptions, and stating the main ideas.
Application involves using acquired knowledge—solving problems in new situations by applying acquired knowledge, facts, techniques, and rules. Learners should be able to use prior knowledge to solve problems, identify connections and relationships, and determine how they apply in new situations.

Analysis involves examining and breaking information into component parts, determining how the parts relate to one another, identifying motives or causes, making inferences, and finding evidence to support generalizations.
Synthesis involves building a structure or pattern from diverse elements; it also refers to the act of putting parts together to form a whole. Evaluation involves presenting and defending opinions by making judgments about information, the validity of ideas, or the quality of work based on a set of criteria.
A Table of Specifications or a test blueprint is a table that helps teachers align objectives, instruction, and assessment. This strategy can be used for a variety of assessment methods but is most commonly associated with constructing traditional summative tests.
Written tests have varied formats, and each format has a set of guidelines to follow.
Enrichment
1. Read the research article titled “Classroom Test Construction: The Power of a Table of Specifications” from https://www.researchgate.net/publication/257822687_Classroom_Test_Construction_The_Power_of_a_Table_of_Specifications.
2. Watch the video titled “How to use an automated Table of Specifications: TOS Made Easy 2019,” accessible from https://www.youtube.com/watch?v=75W_N4UKP3A.
3. Explore the post of Jessica Shabatura (September 27, 2013) on “Using Bloom’s Taxonomy to Write Effective Learning Objectives.” Use this link: https://tips.uark.edu/using-blooms-taxonomy/.
4. Watch the video titled “How to write learning objectives using Bloom’s Taxonomy,” accessible from https://www.youtube.com/watch?v=nq0Ou1li_p0.

Assessment
1. Answer the following questions:
1. When planning for a test, what should you do first?
2. Are all instructional objectives measured by a paper-and-pencil test?
3. When constructing a TOS where objectives are set without classifying them according to their cognitive behavior, what format do you use?
ERNIE C. CERADO, PhD/MA. DULCE P. DELA CERNA, MIE
81
SULTAN KUDARAT STATE UNIVERSITY
4. If you designed a two-way TOS for your test, what does this format have?
5. Why would a teacher consider a three-way TOS over the other formats?

2. To be able to check whether you have learned the important information about planning the test, please provide your answers to the questions given in the graphical representation.

3. Below are sets of competencies targeted for instruction taken from a particular subject area in the DepEd K to 12 curriculum. Check (√) the assessment method appropriate for the given competency.

1. Sample 1 in Mathematics
Check the competencies appropriate for the given test format or method. Be ready to justify.

Competencies                                               Appropriate for   Appropriate for        Appropriate for
                                                           Objective Test    Constructed-Response   Methods Other than
                                                           Format            Test Format            a Written Test
1. Order fractions less than 1
2. Construct plane figures using a ruler and compass
3. Identify cardinal numbers from 9001 through 900,000
4. Solve 2-3 step word problems on decimals involving
   the four operations
5. Transform a division sentence into a multiplication
   sentence and vice-versa

2. Sample 2 in Science
Check (√) the competencies appropriate for the given test format or method.

Competencies                                               Appropriate for   Appropriate for        Appropriate for
                                                           Objective Test    Constructed-Response   Methods Other than
                                                           Format            Test Format            a Written Test
1. Infer that the weather changes during the day and
   from day to day
2. Practice care and concern for animals
3. Participate in campaigns and activities for
   improving/managing one's environment
4. Compare the ability of land and water to absorb
   and release heat
5. Describe the four types of climate in the Philippines

3. Sample 3 in Language
Check (√) the competencies appropriate for the given test format or method.

Competencies                                               Appropriate for   Appropriate for        Appropriate for
                                                           Objective Test    Constructed-Response   Methods Other than
                                                           Format            Test Format            a Written Test
1. Use words that describe persons, places, animals,
   and events
2. Draw conclusions based on picture-stimuli/passages
3. Write a different story ending
4. Write a simple friendly letter observing the
   correct format
5. Compose riddles, slogans, and announcements from
   the given stimuli
4. For the table of specifications, you can apply what you have learned by creating a two-way TOS for the final exam of your class. Take into consideration the content or topic; the time spent on each topic; the knowledge dimension; and the item format, number, and placement for each level of cognitive behavior. An example
of a TOS for a long exam for an Abnormal Psychology class is shown below. Some parts are missing. Complete the TOS based on the given information.

Content                               Time      # of       KD*   R            U          AP        AN     E     C
                                      Spent     Items
Disorders Usually First Diagnosed     3 hours   ?          F     I.10 #1-10   I.10 #?
in Infancy, Childhood or
Adolescence
Cognitive Disorder                    3 hours   ?          C     I.10 #?      I.10 #?    I.5 #?
Substance-Related Disorder            1 hour    10% (10)   P     I.10 #?      I.5 #?
Schizophrenia and Other Psychotic     3 hours   ?          M     I.10 #?      I.10 #?    ?         ?      ?     ?
Disorder
Total                                 10        ?
Overall Total                                   100%             45%                     25%              30%
5. Test Yourself
Choose the letter of the correct answer for every item given.
1. The instructional objective focuses on the development of learners’ knowledge. Can this objective be assessed using the multiple-choice format?
A. No, this objective requires an essay format.
B. No, this objective is better assessed using a matching-type test.
C. Yes, as multiple-choice is appropriate in assessing knowledge.
D. Yes, as multiple-choice is the most valid format when assessing learning.
2. You prepared an objective test format for your quarterly test in Mathematics. Which of the following could NOT have been your test objective?
A. Interpret a line graph
B. Construct a line graph
C. Compare the information presented in a line graph
D. Draw conclusions from the data presented in a line graph
3. Teacher Lanie prepared a TOS as her guide in developing a test. Why is this necessary?
A. To guide the planning of instruction
B. To satisfy the requirements in developing a test
C. To have a test blueprint, as accreditation usually requires this plan
D. To ensure that the test is designed to cover what it intends to measure
4. Mr. Arceo prepared a TOS that shows both the objectives and the different levels of cognitive behavior. What format could he have used?
A. One-way format
B. Two-way format
C. Three-way format
D. Four-way format
5. The school principal wants the teachers to develop a TOS that uses the two-way format rather than the one-way format. Why do you think this is the principal’s preferred format?
A. So that the different levels of cognitive behavior to be tested are known
B. So that the formats of the test are known by just looking at the TOS
C. So that the test writer would know the distribution of test items
D. So that objectives for instruction are also reflected in the TOS
6. Review the table of specifications that you have developed for your quarterly examination.
6.1. Is the purpose of the assessment clear and relevant to measure the desired learning outcomes?
6.2. Are the topics or course contents discussed in class well covered by the test? Is the number of test items per topic and for the whole test enough? Does the test cover only relevant topics?
6.3. Are all levels of thinking skills appropriately represented across topics?
6.4. Are the test formats chosen for the specific desired learning outcomes the most appropriate methods to use? Can you employ other types of test?
6.5. Would you consider your table of specifications good and effective to guide you in developing your test? Are there components in the TOS that need major revisions? How can you improve the TOS?
7. Evaluate your skills in planning your test in terms of setting objectives and designing a table of specifications based on the following scale. Circle the
performance level you are at for (1) setting test objectives and (2) creating a table of specifications.

Level        Performance Benchmark                                      Setting Test   Creating Table of
                                                                        Objectives     Specifications
Proficient   I know them very well. I can teach others where and       4              4
             when to use them appropriately.
Master       I can do it by myself, though I sometimes make            3              3
             mistakes.
Developing   I am getting there, though I still need help to be        2              2
             able to perfect it.
Novice       I cannot do it myself. I need help to plan for my         1              1
             tests.
Based on your self-assessment above, choose from the following tasks to help you enhance your skills and competencies in setting course objectives and in designing a table of specifications.

Level                  Possible Tasks
Proficient             Help or mentor peers or classmates who are having difficulty in setting test objectives and designing a table of specifications.
Master                 Examine the areas that you need to improve on and address them immediately. Benchmark with the test objectives and TOS developed by your peers/classmates who are known to be proficient in this area.
Developing or Novice   Read more books/references about setting test objectives and designing a table of specifications. Ask your teacher to evaluate the test objectives and table of specifications that you have developed and to give suggestions on how you can improve them.
Educator’s Feedback

In an interview with a high school teacher, this is what he shared on his practice when preparing a test:

“When I plan my test, I first design its TOS, so I know what I should cover. I usually prepare a two-way TOS. Actually, because I have been teaching the same course for many years now, I have come to a point that all my tests have their two-way TOS ready to be shown to anybody, most specially my students. Hence, even at the start of the term, I know what I should teach and how it would be assessed. I know those topics that are appropriately assessed through a written test. Weeks before the test is given, I usually give the TOS to my students, so they have a guide in preparing for the test. I allot time in my class for my students to examine the TOS of the test for them to check if there were topics not actually taught in the class. My students usually are surprised when I do this, as they don’t normally see the TOS of their teacher’s test. But I do this as I want them to be successful. I find it fair for them to know how much weight is given to every topic covered in the test. Most often, the outcome of the test is good, as almost all, if not all, of my students would pass my test.”
This interview merely indicates that preparing a TOS and making it accessible to students as their guide in preparing for their test is actually very helpful for them to successfully pass the test. Thus, preparing a TOS should become a standard practice of all teachers when assessing students’ learning through a test. References
Armstrong, P. (2020). Bloom’s Taxonomy. TN: Vanderbilt University Center for Teaching. Retrieved from https://cft.vanderbilt.edu/guides-sub-pages/blooms-taxonomy/
David et al. (2020). Assessment in Learning 1. Manila: Rex Book Store.
De Guzman, E. and Adamos, J. (2015). Assessment of Learning 1. Quezon City: Adriana Publishing Co., Inc.
Fives, H. & DiDonato-Barnes, N. (February 2013). Classroom Test Construction: The Power of a Table of Specifications. Practical Assessment, Research & Evaluation, 18(3).
Isaacs, Geoff (1996). Bloom’s Taxonomy of Educational Objectives. The University of Queensland: TEDI. Retrieved from https://kaneb.nd.edu/assets/137952/bloom.pdf
Macayan, J. (2017). Implementing Outcome-Based Education (OBE) Framework: Implications for Assessment of Students’ Performance. Educational Measurement and Evaluation Review, 8(1).
Magno, C. (2011). A Closer Look at Other Taxonomies of Learning: A Guide for Assessing Student Learning. The Assessment Handbook, 5.
Lesson 2: Construction of Written Tests
Pre-discussion
The construction of good tests requires specific skills and experience. To be able to successfully demonstrate your knowledge and skills in constructing traditional types of tests that are most applicable to a particular learning outcome, you should be able to distinguish the different test types
and formats, and understand the process and requirements in setting learning objectives and outcomes and in preparing the table of specifications. For proper guidance in this lesson, the performance tasks and success indicators are presented below.

Performance Tasks         Success Indicators
Classifying tests         Identify the test format that is most appropriate for a particular learning outcome
Designing a test          Create a test table of specifications (TOS) or assessment plan aligned with the desired learning outcomes and the teaching-learning activities
Constructing test items   Develop test items following the general guidelines for test construction of different test formats
What to Expect?
At the end of the lesson, the students can:
1. describe the characteristics of selected-response and constructed-response tests;
2. classify whether a test is selected-response or constructed-response;
3. identify the test format that is most appropriate to a particular learning outcome/target;
4. apply the general guidelines in constructing test items;
5. prepare a written test based on the prepared TOS; and
6. evaluate a given teacher-made test based on guidelines.

Constructing Various Types of Traditional Test Formats
Classroom assessments are an integral part of learners’ learning. They do more than just measure learning. They also inform the learners what needs to be learned, to what extent, and how to learn it. They also provide the parents some feedback about their child’s achievement of the desired learning outcomes. The schools also benefit from classroom assessments because the learners’ test results can provide them evidence-based data that are useful for instructional planning and decision-making. As
such, it is important that assessment tasks or tests are meaningful, promote deep learning, and fulfill the criteria and principles of test construction.

There are many ways by which learners can demonstrate their knowledge and skills and show evidence of their proficiencies at the end of a lesson, unit, or subject. While authentic or performance-based assessments have been advocated as the better and more appropriate methods of assessing learning outcomes, particularly as they assess higher-order thinking skills (HOTS), traditional written assessment methods, such as multiple-choice tests, are also considered appropriate and efficient classroom assessment tools for some types of learning targets. This is mainly true for large classes and when test results are needed immediately for some educational decisions. Traditional tests are also deemed reliable and exhibit excellent content and construct validity.

To learn or enhance your skills in developing good and effective test items for a particular test format, you need to possess adequate knowledge of the different test formats; how and when to choose a particular test format that is the most appropriate measure of the identified learning objectives and desired learning outcomes of your subject; and how to construct good and effective items for each format.

General Guidelines in the Selection of an Appropriate Test Format
Not every test is universally valid for every type of learning outcome. For example, if an intended outcome for a Research Method 1 course is “to design and produce a research study relevant to one’s field of study,” you cannot measure this outcome through a multiple-choice test or a matching-type test. Hence, to guide you in choosing the appropriate test format and in designing fair and appropriate yet challenging tests, you should ask the following important questions:

1. What are the objectives or desired learning outcomes of the subject/unit/lesson being assessed?
Deciding on what test format to use generally depends on your learning objectives or the desired learning outcomes of the subject/unit/lesson. Desired learning outcomes (DLOs) are statements of what learners are expected to do or demonstrate as a result of engaging in the learning process. It is suggested that you return to Lesson 4 to review how to set or write instructional objectives or intended learning outcomes for a subject.
level
of
thinking
is
to
be
assessed
(i.e.,
remembering,
understanding, applying, analysing, evaluating and creating)? Does the cognitive level of the test question match your instructional objectives or DLOs?
The level of thinking to be assessed and also an important factor to consider when designing your test, as this will guide you in choose the appropriate test format. For example, if you intend to assess, how much your learners are able to identify important concepts discussed in class (i.e., remembering or understanding level), a selected-response format such as multiple-choice test would be appropriate. However, if you intend to assess how your students will be able to explain and apply in another setting a concept or framework learned in class (i.e., applying and/or analysing level), you may consider giving constructed-response test format such as essays. It is important that when constructing classroom assessment tool, all levels of cognitive behaviour are represented – from remembering (R), understanding (U), applying (AP), analysing (AN), evaluating (E), and creating (C) – and taking into consideration the knowledge dimension, i.e., factual (F), conceptual (C), procedural (P), and metacognition (M). You may return to Lesson 2 and Lesson 4 to review the different levels of Cognitive Behaviour and Knowledge Dimensions. 3. Is the the test test ma matc tch h or alig aligne ned d wi with th th the e co cour urse se’s ’s DL DLOs Os an and d th the e co cour urse se contents or learning activities?
ERNIE C. CERADO, PhD/MA. DULCE P. DELA CERNA, MIE
91
SULTAN KUDARAT STATE UNIVERSITY
The assessment tasks should be aligned with the instructional activities and the DLOs. Thus, it is important that you are clear about what DLOs are to be addressed by your test and what course activities or tasks are to be implemented to achieve the DLOs. For example, if you want learners to articulate and justify their stand on ethical decision-making and social responsibility practices in business (i.e., the DLO), then an essay test and a class debate are appropriate measures and tasks for this learning outcome. A multiple-choice test may be used, but only if you intend to assess learners’ ability to recognize what is ethical versus unethical decision-making practice. In the same manner, matching-type items may be appropriate if you want to know whether your students can differentiate and match the different approaches or terms to their definitions.

4. Are the test items realistic to the students?
The test should be meaningful and realistic to the learners. It should be relevant or related to their everyday experience. The use of concepts, terms, or situations that have not been discussed in class, or that the learners have never encountered, read, or heard about, should be minimized or avoided. This is to prevent learners from making wild guesses, which will undermine your measurement of what they have really learned from the class.

What are the major categories and formats of traditional tests?

For the purposes of classroom assessment, traditional tests fall into two general categories: 1) the selected-response type, in which learners select the correct response from the given options, and 2) the constructed-response type, in which learners are asked to formulate their own answers. The cognitive capabilities required to answer selected-response items are different from those required by constructed-response items, regardless of content.

Selected-response tests require learners to choose the correct answer or best alternative from several choices. While they can cover a wide range of learning materials very efficiently and measure a variety of learning outcomes,
they are limited when assessing learning outcomes that involve more complex and higher-level thinking skills. Selected-response tests include:

Multiple Choice Test. It is the most commonly used format in formal testing and typically consists of a stem (problem), one correct or best alternative (correct answer), and three or more incorrect or inferior alternatives (distractors).

True-False or Alternative Response Test. It generally consists of a statement, and the learner decides whether the statement is true (accurate/correct) or false (inaccurate/incorrect).

Matching Type Test. It consists of two sets of items to be matched with each other based on a specified attribute.

Constructed-response tests require learners to supply answers to a given question or problem. These include:

Short Answer Test. It consists of open-ended questions or incomplete sentences that require learners to create an answer for each item, which is typically a single word or short phrase. This includes the following types:

Completion. It consists of incomplete statements that require the learners to fill in the blanks with the correct word or phrase.

Identification. It consists of statements that require the learners to identify or recall the terms/concepts, people, places, or events that are being described.

Essay Test. It consists of problems/questions that require learners to compose or construct written responses, usually long ones with several paragraphs.

Problem-Solving Test. It consists of problems/questions that require learners to solve problems in quantitative or non-quantitative settings using knowledge and skills in mathematical concepts and procedures and/or other higher-order cognitive skills (e.g., reasoning, analysis, and critical thinking).

General Guidelines in Writing Multiple-Choice Test Items
Writing multiple-choice items requires content mastery, writing skills, and time. Only good and effective items should be included in the test. Poorly-written test items can be confusing and frustrating to learners and can yield test scores that are not appropriate for evaluating their learning and achievement. The following are the general guidelines in writing good multiple-choice items. They are classified in terms of content, stem, and options.

A. Content

1. Write items that reflect only one specific content and cognitive processing skill.

Faulty: Which of the following is a type of statistical procedure used to test a hypothesis regarding significant relationship between variables, particularly in terms of the extent and direction of association?
A. ANCOVA
B. ANOVA
C. Correlation
D. t-test

Good: Which of the following is an inferential statistical procedure used to test a hypothesis regarding significant difference between two qualitative variables?
A. ANCOVA
B. ANOVA
C. Chi-Square
D. Mann-Whitney Test

2. Do not lift and use statements from the textbook or other learning materials as test questions.
3. Keep the vocabulary simple and understandable based on the level of the learners/examinees.
4. Edit and proofread the items for grammatical and spelling errors before administering them to the learners.
B. Stem
1. Write the directions in the stem in a clear and understandable manner.

Faulty: Read each question and indicate your answer by shading the circle corresponding to your answer.

Good: This test consists of two parts. Part A is a reading comprehension test, and Part B is a grammar/language test. Each question is a multiple-choice item with five (5) options. You need to answer each question, but you will not be penalized for a wrong answer or for guessing. You can go back and review your answers during the time allotted.

2. Write stems that are consistent in form and structure; that is, present all items either in question form or in descriptive or declarative form.

Faulty: (1) Who was the Philippine president during Martial Law?
        (2) The first president of the Commonwealth of the Philippines was _______.

Good:   (1) Who was the Philippine president during Martial Law?
        (2) Who was the first president of the Commonwealth of the Philippines?

3. Express the stem positively and avoid double negatives, such as NOT and EXCEPT, in a stem. If a negative word is necessary, underline or capitalize the word for emphasis.

Faulty: Which of the following is not the measure of variability?
Good: Which of the following is NOT a measure of variability?

4. Refrain from making the stem too wordy or containing too much information, unless the problem or question requires the facts presented to solve the problem.

Faulty: What does DNA stand for, and what is the organic chemical of complex molecular structure found in all cells and viruses that codes genetic information for the transmission of inherited traits?
Good: As a chemical compound, what does DNA stand for?
C. Options

1. Provide three (3) to five (5) options per item, with only one being the correct or best answer/alternative.
2. Write options that are parallel or similar in form and length to avoid giving clues about the correct answer.

Faulty: What is an ecosystem?
A. It is a community of living organisms in conjunction with the non-living components of their environment that interact as a system. These biotic and abiotic components are linked together through nutrient cycles and energy flows.
B. It is a place on the Earth’s surface where life dwells.
C. It is an area that one or more individual organisms defend against competition from other organisms.
D. It is the biotic and abiotic surroundings of an organism or population.
E. It is the largest division of the Earth’s surface filled with living organisms.

Good: What is an ecosystem?
A. It is a place on the Earth’s surface where life dwells.
B. It is the biotic and abiotic surroundings of an organism or population.
C. It is the largest division of the Earth’s surface filled with living organisms.
D. It is a large community of living and non-living organisms in a particular area.
E. It is an area that one or more individual organisms defend against competition from other organisms.

3. Place options in a logical order (e.g., alphabetical, from shortest to longest).

Faulty: Which experimental gas law describes how the pressure of a gas tends to increase as the volume of the container decreases? (i.e., “The absolute pressure exerted by a given mass of an ideal gas is inversely proportional to the volume it occupies.”)
A. Boyle’s Law
B. Charles’ Law
C. Beer-Lambert Law
D. Avogadro’s Law
E. Faraday’s Law

Good: Which experimental gas law describes how the pressure of a gas tends to increase as the volume of the container decreases? (i.e., “The absolute pressure exerted by a given mass of an ideal gas is inversely proportional to the volume it occupies.”)
A. Avogadro’s Law
B. Beer-Lambert Law
C. Boyle’s Law
D. Charles’ Law
E. Faraday’s Law

4. Place the correct response randomly to avoid a discernible pattern of correct answers.
5. Use None-of-the-above carefully and only when there is one absolutely correct answer, such as in spelling or math items.

Faulty: Which of the following is a nonparametric statistic?
A. ANCOVA
B. ANOVA
C. Correlation
D. t-test
E. None of the Above

Good: Which of the following is a nonparametric statistic?
A. ANCOVA
B. ANOVA
C. Correlation
D. Mann-Whitney U
E. t-test

6. Avoid All of the Above as an option, especially if it is intended to be the correct answer.

Faulty: Who among the following has become the President of the Philippine Senate?
A. Ferdinand Marcos
B. Manuel Quezon
C. Manuel Roxas
D. Quintin Paredes
E. All of the Above

Good: Who was the first ever President of the Philippine Senate?
A. Eulogio Rodriguez
B. Ferdinand Marcos
C. Manuel Quezon
D. Manuel Roxas
E. Quintin Paredes

7. Make all options realistic and reasonable.
General Guidelines in Writing Matching-Type Items

The matching test item requires learners to match a word, sentence, or phrase in one column (i.e., premise) to a corresponding word, sentence, or phrase in a second column (i.e., response). It is most appropriate when you need to measure the learners’ ability to identify the relationship or association between similar items. Matching items work best when the course content has many parallel concepts. While the matching-type test format is generally used for simple
recall of information, you can find ways to make it applicable or useful in assessing higher levels of thinking, such as applying and analyzing. The following are the general guidelines in writing good and effective matching-type tests:

1. Clearly state in the directions the basis for matching the stimuli with the responses.

Faulty: Directions: Match the following.

Good: Directions: Column I is a list of countries, while Column II presents the continents where these countries are located. Write the letter of the continent corresponding to the country on the line provided in Column I.

Item #1’s instruction is less preferred, as it does not detail the basis for matching the stem and the response options.
2. Ensure that the stimuli are longer and the responses are shorter.

Faulty: Match the description of the flag to its country.

A                    B
____ Bangladesh      A. Green background with red circle in the center
____ Indonesia       B. One red strip on top and white strip at the bottom
____ Japan           C. Red background with white five-petal flower in the center
____ Singapore       D. Red background with large yellow circle in the center
____ Thailand        E. Red background with large yellow pointed star in the center
                     F. White background with large red circle in the center

Good: Match the description of the flag to its country.

A                                                                  B
___ Green background with a red circle in the center               A. Bangladesh
___ One red strip on top and white strip at the bottom             B. Hong Kong
___ Red background with five-petal flower in the center            C. Indonesia
___ Red background with large yellow pointed star in the center    D. Japan
___ White background with red circle in the center                 E. Singapore
                                                                   F. Vietnam

Item #2 is the better version because the descriptions are presented in the first column while the response options are in the second column. The stems are also longer than the options.
3. For each item, include only topics that are related to one another and share the same foundation of information.

Faulty: Match the following.

A                                      B
_____ 1. Indonesia                     A. Asia
_____ 2. Malaysia                      B. Bangkok
_____ 3. Philippines                   C. Jakarta
_____ 4. Thailand                      D. Kuala Lumpur
_____ 5. Year ASEAN was established    E. Manila
                                       F. 1967

Good: On the line to the left of each country in Column I, write the letter of the country’s capital presented in Column II.

Column I               Column II
_____ 1. Indonesia     A. Bandar Seri Begawan
_____ 2. Malaysia      B. Bangkok
_____ 3. Philippines   C. Jakarta
_____ 4. Thailand      D. Kuala Lumpur
                       E. Manila

Item #1 is considered an unacceptable item because its response options are not parallel and include different kinds of information that can provide clues to the correct/wrong answers. On the other hand, item #2 details the basis for matching, and its response options include only related concepts.
4. Make the response options short, homogeneous, and arranged in logical order.

Faulty: Match the chemical elements with their characteristics.

A                  B
_____ Gold         A. Au
_____ Hydrogen     B. Magnetic metal used in steel
_____ Iron         C. Hg
_____ Potassium    D. K
_____ Sodium       E. With lowest density
                   F. Na

Good: Match the chemical elements with their symbols.

A                  B
_____ Gold         A. Au
_____ Hydrogen     B. Fe
_____ Iron         C. H
_____ Potassium    D. Hg
_____ Sodium       E. K
                   F. Na

In item #1, the response options are not parallel in content and length. They are also not arranged alphabetically.
5. Include response options that are reasonable and realistic and similar in length and grammatical form.

Faulty: Match the subjects with their course descriptions.

A                       B
___ History             A. Studies the production and distribution of goods/services
___ Political Science   B. Study of politics and power
___ Psychology          C. Study of society
___ Sociology           D. Understand the role of mental functions in social behavior
                        E. Uses narratives to examine and analyze past events

Good: Match the subjects with their course descriptions.

A                                              B
___ 1. Study of living things                  A. Biology
___ 2. Study of mind and behavior              B. History
___ 3. Study of politics and power             C. Political Science
___ 4. Study of recorded events in the past    D. Psychology
___ 5. Study of society                        E. Sociology
                                               F. Zoology

Item #1 is less preferred because the response options are not consistent in terms of their length and grammatical form.
6. Provide more response options than the number of stimuli.

Faulty: Match the following fractions with their corresponding decimal equivalents.

A          B
___ 1/4    A. 0.25
___ 5/4    B. 0.28
___ 7/25   C. 0.90
___ 9/10   D. 1.25

Good: Match the following fractions with their corresponding decimal equivalents.

A          B
___ 1/4    A. 0.09
___ 5/4    B. 0.25
___ 7/25   C. 0.28
___ 9/10   D. 0.90
           E. 1.25

Item #1 is considered inferior to item #2 because it includes the same number of response options as that of the stimuli, thus making it more prone to guessing.
General Guidelines in Writing True or False Items

True or false items are used to measure learners’ ability to identify whether a statement or proposition is correct/true or incorrect/false. They are best used when learners’ ability to judge or evaluate is one of the desired learning outcomes of the course. There are different variants of the true or false item. These include the following:

1. T-F Correction or Modified True or False Question. In this format, the statement is presented with a key word or phrase that is underlined, and the learner has to supply the correct word or phrase.
e.g., Multiple-choice test is authentic.

2. Yes-No Variation. In this format, the learner has to choose yes or no, rather than true or false.
e.g., The following are kinds of test. Circle Yes if it is an authentic test and No if it is not.

Multiple Choice Test       Yes   No
Debates                    Yes   No
End-of-the-Term Project    Yes   No
True or False Test         Yes   No

3. A-B Variation. In this format, the learner has to choose A or B, rather than true or false.
e.g., Indicate which of the following are traditional or authentic tests by circling A if it is a traditional test and B if it is authentic.

                           Traditional   Authentic
Multiple Choice Test       A             B
Debates                    A             B
End-of-the-Term Project    A             B
True or False Test         A             B

Because true or false test items are prone to guessing, as learners are asked to choose between two options, utmost care should be exercised in writing them. The following are the general guidelines in writing true or false items:

1. Include statements that are completely true or completely false.

Faulty: The presidential system of government, where the president is only the head of state or government, is adopted by the United States, Chile, Panama, and South Korea.

Good: The presidential system, where the president is only the head of the state or government, is adopted by Chile.

Item #1 is of poor quality because, while the description is right, the countries given are not all correct. While South Korea has a presidential system of government, it also has a prime minister who governs alongside the president.

2. Use simple and easy-to-understand statements.

Faulty: Education is a continuous process of higher adjustment for
human beings who have evolved physically and mentally, which is free and conscious of God, as manifested in nature around the intellectual, emotional, and humanity of man.

Good: Education is the process of facilitating learning or the acquisition
of knowledge, skills, values, beliefs, and habits.

Item #1 is somewhat confusing, especially for younger learners, because there are many ideas in one statement.
3. Refrain from using negatives, especially double negatives.

Faulty: There is nothing illegal about buying goods through the internet.
Good: It is legal to buy things or goods through the internet.

Double negatives are sometimes confusing and could result in wrong answers, not because the learner does not know the answer but because of how the test item is presented.

4. Avoid using absolutes such as “always” and “never.”

Faulty: The news and information posted on the CNN website is always accurate.

Good: The news and information posted on the CNN website is usually accurate.

Absolute words such as “always” and “never” restrict possibilities and make a statement true 100 percent of the time. They are also a hint for a “false” answer.
5. Express a single idea in each test item.

Faulty: If an object is accelerating, a net force must be acting on it, and the acceleration of an object is directly proportional to the net force applied to the object.

Good: If an object is accelerating, a net force must be acting on it.

Item #1 packs two ideas into a single statement, which can confuse the learners.

6. Avoid the use of unfamiliar words or vocabulary.

Faulty: Esprit de corps among soldiers is important in the face of hardships and opposition in fighting the terrorists.

Students may have a difficult time understanding the statement, especially if the term “esprit de corps” has not been discussed in class. Using unfamiliar words would likely lead to guessing.

7. Avoid lifting statements from the textbook and other learning materials.

General Guidelines in Writing Short-Answer Items
A short-answer test item requires the learner to answer a question or to finish an incomplete statement by filling in the blank with the correct word or phrase. While it is most appropriate when you only intend to assess learners’ lower-level thinking, such as their ability to recall facts learned in class, you
can create items that minimize guessing and avoid giving clues to the correct answer. The following are the general guidelines in writing good fill-in-the-blank or completion test items:

1. Omit only significant words from the statement.

Faulty: Every atom has a central _____ called a nucleus.
Good: Every atom has a central core called a(n) ______.

In item #1, the word “core” is not the significant word. The item is also prone to many and varied interpretations, resulting in many possible answers.

2. Do not omit too many words from the statement, such that the intended meaning is lost.

Faulty: _______ is to Spain as _______ is to the United States and as _______ is to Germany.
Good: Madrid is to Spain as ______ is to France.

Item #1 is prone to many and varied answers. For example, a student may answer based on the capitals of these countries or based on the continents where they are located. Item #2 is preferred because it is more specific and requires only one correct answer.

3. Avoid obvious clues to the correct response.

Faulty: Ferdinand Marcos declared martial law in 1972. Who was the president during that period?
Good: The president during the martial law years was ___.

Item #1 already gives the clue that Ferdinand Marcos was the president during this time, because only the president of a country can declare martial law.

4. Be sure that there is only one correct response.

Faulty: The government should start using renewable energy sources for generating electricity, such as ____.
Good: The government should start using renewable sources of energy by using turbines called ___.

Item #1 has many possible answers because the statement is very general (e.g., wind, solar, biomass, geothermal, and hydroelectric). Item #2 is more specific and requires only one correct answer (i.e., wind).
5. Avoid grammatical clues to the correct response.

Faulty: A subatomic particle with a negative electric charge is called an _____.
Good: A subatomic particle with a negative electric charge is called a(n) ____.

The word “an” in item #1 provides a clue that the correct answer starts with a vowel.

6. If possible, put the blank at the end of a statement rather than at the beginning.

Faulty: ___ is the basic building block of matter.
Good: The basic building block of matter is ___.

In item #1, learners may need to read the sentence to the end before they can recognize the problem, then re-read it again to answer the question. In item #2, learners can already identify the context of the problem by reading through the sentence only once, without having to go back and re-read it.
General Guidelines in Writing Essay Tests
An essay test is an item that requires a response composed by the examinee, usually in the form of one or more sentences, of a nature that no single response or pattern of responses can be listed as correct, and the accuracy and quality of which can be judged subjectively only by one skilled or informed in the subject.

Teachers generally choose and employ essay tests over other forms of assessment because essay tests require learners to create a response rather than simply select a response from among the alternatives. They are the preferred form of assessment when teachers want to measure learners’ higher-order thinking skills, particularly their ability to reason, analyze, synthesize, and evaluate. They also assess learners’ writing abilities. They are most appropriate for assessing learners’ (1) understanding of subject-matter content, (2) ability to reason with their knowledge of the subject, and (3) problem-solving and decision-making skills, because the items or situations presented in the test are authentic or close to real-life experiences.
There are two types of essay test: (1) the extended-response essay and (2) the restricted-response essay.

These are the general guidelines in constructing good essay questions:

1. Clearly define the intended learning outcomes to be assessed by the essay test.

To design effective essay questions or prompts, the specific intended learning outcomes must be identified. If the intended learning outcomes to be assessed lack clarity and specificity, the questions or prompts may assess something other than what they intend to assess. Appropriate directive verbs that most closely match the ability the learners should
demonstrate must be used in the prompts. These include verbs such as compose, analyze, interpret, explain, and justify, among others. 2. Refra Refrain in from usi using ng essay essay test for int intende endedd learn learning ing outcomes outcomes tthat hat are better assessed by other kinds of assessment.
Somee intend Som intended ed learni learning ng outcom outcomes es can be effici efficient ently ly and rel reliab iably ly assessed by selected-type test rather than by essay test. In the same mann ma nner er,, ther theree are are inte intend nded ed lear learni ning ng ou outc tcom omes es that that are are bett better er assessed using other authentic assessments, such as performance test, rather than by essay test. Thus, it is important to take into cons co nsid ider erat atio ionn the the limi limita tatition onss of es essa sayy test testss when when pl plan anni ning ng and and deciding what assessment method to employ for an intended learning outcome.
3. Clearl Clearlyy defin definee and situ situate ate the ta task sk within within a problem problem situat situation ion as well well as the type of thinking required to answer the test.
Essay questions or prompts should provide clear and well- defined tasks to the learners. It is important to carefully choose the directive verb, to write clearly the object or focus of the directive verb, and to delimit the scope of the task. Having clear and well-defined tasks will guided learners on what to focus on when answering the prompts, thuss avoid thu avoiding ing res respon ponses ses tha thatt con contai tainn ideas ideas tha thatt are unr unrela elated ted or irr irrel elev evan ant, t, too too long long,, or focu focusi sing ng only only on some some part part of the the task task.. Emphasizing the types of thinking required to answer the question will also guide students on the extent to which they should be creative, deep, complex, and analytical in addressing and responding to the questions.
4. Present tasks that are fair, reasonable, and realistic to the students.
Essay questions should contain tasks or questions that students will be able to do or address. These include those that are within the level of instruction or training, expertise, and experience of the students.
5. Be specific in the prompts about the time allotment and criteria for grading the response.
Essay prompts and directions should indicate the approximate time given to the students to answer the essay questions to guide them on
how much time they should allocate for each item, especially if several essay questions are presented. How the responses are to be graded or rated should also be clarified to guide the students on what to include in their responses.
General Guidelines in Problem-solving Test Items
Problem-solving test items are used to measure learners' ability to solve problems that require quantitative knowledge and competencies and/or critical thinking skills. These items present a problem situation or task that will require learners to demonstrate work procedures or come up with a correct solution. Full or partial credit can be assigned to the answer, depending on the answers or solutions required. There are different variations of quantitative problem-solving items, which include the following:
1. One answer choice – This type of question contains four or five options, and students are required to choose the best answer.
Example: What is the mean of the following score distribution: 32, 44, 56, 69, 75, 77, 95, 96?
A. 68
B. 69
C. 72
D. 74
E. 76
2. All possible answer choices – This type of question has four or five options, and students are required to choose all of the options that are correct.
Example: Consider the following score distribution: 12, 14, 14, 14, 17, 24, 27, 28, and 30. Which of the following is/are the correct measure/s of central tendency? Indicate all possible answers.
A. Mean = 20
B. Mean = 22
C. Median = 16
D. Median = 17
E. Mode = 14
Options A, D, and E are all correct answers.
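(As a quick check: the nine scores sum to 180, so the mean is 180 ÷ 9 = 20; the fifth of the nine ordered scores is 17, the median; and 14 occurs three times, making it the mode; hence options A, D, and E.)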
3. Type-in answer – This type of question does not provide options to choose from. Instead, the learners are asked to supply the correct answer. The teacher should inform the learners at the start how their answer will be rated. For example, the teacher may require just the correct answer or may require learners to present the step-by-step procedures in coming up with their answers. On the other hand, for non-mathematical problem solving, such as a case study, the teacher may present a rubric on how the answer will be rated.
Example: Compute the mean of the following score distribution: 32, 44, 56, 69, 75, 77, 95, and 96. Indicate your answer in the blank provided.
In this case, the learners will only need to give the correct answer without having to show the procedures for computation.
Example: Lillian, a 55-year-old accountant, has been suffering from frequent dizziness, nausea, and light-headedness. During the interview, Lillian was obviously restless and sweating. She reported feeling so stressed and fearful of anything without any apparent reason. She could not sleep and eat well. She also started to withdraw from family and friends, as she experienced
frequent panic attacks. She also said that she was constantly worrying about everything at work and at home. What might be Lillian's problem? What should she do to alleviate all her symptoms?
Problem-solving test items are a good test format as they minimize guessing, measure instructional objectives that focus on higher cognitive levels, and measure an extensive amount of content or topics. However, they require more time for teachers to construct, read, and correct, and are prone to rater bias, especially when scoring rubrics/criteria are not available. It is therefore important that good quality problem-solving test items are constructed. The following are some of the general guidelines in constructing good problem-solving test items:
1. Identify and explain the problem clearly.
Faulty: Tricia was 135.6 lbs. when she started with her zumba exercises. After three months of attending the sessions three times a week, her weight was down to 122.8 lbs. About how many lbs. did she lose after three months? Write your final answer in the space provided and show your computations. [This question asks "about how many" and does not indicate whether learners need to give the exact weight or whether they need to round off their answer and to what extent.]
Good: Tricia was 135.6 lbs. when she started with her zumba exercises. After three months of attending the sessions three times a week, her weight was down to 122.8 lbs. How many lbs. did she lose after three months? Write your final answer in the space provided and show your computations. Write the exact weight; do not round off.
2. Be specific and clear about the type of response required from the students.
Faulty: ASEANA Bottlers, Inc. has been producing and selling Tutti Fruity juice in the Philippines, aside from their Singapore market. The sales for the juice in the Singapore market were S$5 million
more than those of their Philippine market in 2016, S$3 million more in 2017, and S$4.5 million more in 2018. If the sales in the Philippine market in 2018 were PHP 35 million, what were the sales in the Singapore market during that year? [This is a faulty question because it does not specify in what currency the answer should be presented.]
Good: ASEANA Bottlers, Inc. has been producing and selling Tutti
Fruity juice in the Philippines, aside from their Singapore market. The sales for the juice in the Singapore market were S$5 million more than those of their Philippine market in 2016, S$3 million more in 2017, and S$4.5 million more in 2018. If the sales in the Philippine market in 2018 were PHP 35 million, what were the sales in the Singapore market during that year? Provide the answer in Singapore dollars (S$1 = PHP 36.50). [This is a better item because it specifies in what currency the answer should be presented, and the exchange rate is given.]
3. Specify in the directions the bases for grading students' answers/procedures.
Faulty: VCV Consultancy Firm was commissioned to conduct a survey
on the voters' preferences in Visayas and Mindanao for the upcoming presidential election. In Visayas, 65% are for the Liberal Party (LP) candidate, while 35% are for the Nationalist Party (NP) candidate. In Mindanao, 70% of the voters are Nationalists while 30% are LP supporters. A survey was conducted among 200 voters for each region. What is the probability that the survey will show a greater percentage of Liberal Party supporters in Mindanao than in the Visayas region? [This question is undesirable because it does not specify the basis for grading the answer.]
Good: VCV Consultancy Firm was commissioned to conduct a survey
on the voters' preferences in Visayas and Mindanao for the upcoming presidential election. In Visayas, 65% are for the Liberal
Party (LP) candidate, while 35% are for the Nationalist Party (NP) candidate. In Mindanao, 70% of the voters are Nationalist while 30% are LP supporters. A survey was conducted among 200 voters for each region. What is the probability that the survey will show a greater percentage of Liberal Party supporters in Mindanao than in the Visayas region? Please show your solutions to support your answer. Your answer will be graded as follows:
0 points = wrong answer and wrong solution
1 point = correct answer only (i.e., without a solution or with a wrong solution)
3 points = correct answer with partial solutions
5 points = correct answer with complete solutions
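For teachers who record scores electronically, a grading scheme like this can be translated directly into a short program. Below is a minimal Python sketch; the function name and its two inputs are illustrative assumptions, not part of the module:

```python
# A minimal sketch of the 0/1/3/5 scheme above. It assumes the grader has
# already judged two things (hypothetical inputs): whether the final answer
# is correct, and whether the shown solution is "none", "partial", or "complete".
def score_response(answer_correct, solution):
    """Return the points earned under the sample grading scheme."""
    if not answer_correct:
        return 0          # wrong answer and wrong solution
    if solution == "complete":
        return 5          # correct answer with complete solutions
    if solution == "partial":
        return 3          # correct answer with partial solutions
    return 1              # correct answer only (without or with a wrong solution)

print(score_response(True, "partial"))   # prints 3
```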
Assessment
A. Let us review what you have learned about constructing traditional tests.
1. What factors should be considered when choosing a particular test format?
2. What are the major categories and formats of traditional tests?
3. When are the following traditional tests appropriate to use?
- Multiple-choice test
- Matching-type test
- True or false test
- Short-answer test
- Essay test
- Problem-solving test
4. How should the items for the above traditional tests be constructed?
To check whether you have learned the important information about constructing the traditional types of tests, please complete the following graphical representation:
5. Based on the guidelines on writing items for traditional tests and the examples of good and faulty items presented, you are now ready to construct effective tests of different formats to assess your learners on the learning outcomes. Let us apply what you have learned by creating an assessment plan for your chosen subject. List down the desired learning outcomes and the subject topic or lesson; and for each desired learning outcome, identify the appropriate test format to assess learners' achievement of the outcome. It is important that you have an assessment plan for each subject.

Example of an Assessment Plan
Subject: Economics
Desired Learning Outcomes | Topic/Lesson | Types of Test
e.g., Show understanding of the concept of demand and supply | Definition of shortage, demand and supply, surplus, and market equilibrium; Effects of change of demand and supply on market price | Multiple-choice, true or false, matching-type, and completion tests
Apply the concepts of demand and supply in actual cases | Exchange rate, change in the price of goods in the market, price ceiling and price floor | Essay, problem sets, case analysis, and exercises
Others | |
B. Now that you are able to identify the types of assessment that you will employ for each desired learning outcome of a subject, you are ready to construct sample tests for the subject. Construct a three-part test that includes test formats of your choice. In the development of the test, you will need the following information:
1. Desired learning outcomes for the subject area
2. Level of cognitive/thinking skills appropriate to assess the desired learning outcomes
3. Appropriate test format to use
4. Number of items per learning outcome or area and their weights
5. Number of points for each item and the total number of points for the whole test
Note: In the development of the test, you should take into consideration
the guidelines on developing a table of specifications and on constructing the test items.
C. Evaluate the sample tests that you have developed by using the following checklists for the three test formats that you used.
1. Checklist for Writing Multiple-Choice Test Items
(Check Yes or No for each item.)
1. Does the item reflect specific content and a mental task?
2. Are statements from the textbook avoided?
3. Is the item stated in simple and clear language?
4. Is the item free of grammatical and spelling errors?
5. Are the directions in the stem clear?
6. Are double negatives avoided?
7. Does the item contain irrelevant information, making it too wordy?
8. Does the item contain no more than five options?
9. Is the intended answer correct or clearly the best alternative?
10. Are the options parallel in structure and equal in length to avoid clues?
11. Are the options written in logical order?
12. Are the correct answers for all items in the test placed randomly?
13. Is the None of the Above option used cautiously?
14. Is the All of the Above option as the right answer avoided?
15. Are the options plausible and homogeneous?
2. Checklist for Writing Matching-Type Tests
(Check Yes or No for each item.)
1. Do the directions clearly state the basis for matching the stimuli with the responses?
2. Is the item free from grammatical or other clues to the correct response?
3. Are the stems longer and the responses shorter?
4. Do the items share the same foundation of information?
5. Are the answer choices short, homogeneous, and arranged logically?
6. Are the options reasonable and realistic?
7. Are the options similar in length and grammatical form?
8. Are there more response options than stems?
3. Checklist for True or False Test Items
(Check Yes or No for each item.)
1. Is the item completely true or completely false?
2. Is the item written in simple, easy-to-follow statements?
3. Are negatives avoided?
4. Are absolutes such as "always" and "never" used sparingly or not at all?
5. Do items express only a single idea?
6. Is the use of unfamiliar vocabulary avoided?
7. Is the item or statement not lifted from the text, lecture, or other materials?
4. Checklist for Completion or Fill-in-the-Blank Test Items
(Check Yes or No for each item.)
1. Are only the significant words in the statement omitted?
2. Are only a few words omitted from the statement so that the intended meaning is not lost?
3. Are obvious clues to the correct response avoided?
4. Is there only one correct response to the item?
5. Are grammatical clues to the correct response avoided?
6. Is the blank placed at the end of the statement rather than at the beginning?
5. Checklist for Writing Essay Questions
(Check Yes or No for each item.)
1. Can the item/topic best be assessed by an essay test?
2. Is the essay question aligned with the desired learning outcomes?
3. Does the essay question contain a clear and delimited task?
4. Is the task presented to students realistic and reasonable?
5. Is the time allotment enough for each essay question?
6. Do the students know how many points the essay is worth?
D. Evaluate the level of your skills in developing different test formats using the following scale:

Level | Performance Benchmarking | Multiple-Choice | Matching-Type | True-False | Short-Answer | Essay
Proficient | I know this very well. I can teach others how to make one. | 4 | 4 | 4 | 4 | 4
Master | I can do it by myself, though I sometimes make mistakes. | 3 | 3 | 3 | 3 | 3
Developing | I am getting there, though I still need help to be able to perfect it. | 2 | 2 | 2 | 2 | 2
Novice | I cannot do it myself. I need help to make a good/effective test. | 1 | 1 | 1 | 1 | 1
E. Based on your self-assessment, choose from the following tasks to help you enhance your skills and competencies in developing different test formats:

Level | Possible Tasks
Proficient | Help or mentor peers/classmates who are having difficulty in developing good items for their course assessment.
Master | Examine the areas that you need to improve on and address them immediately. Read more books/references on how to develop effective items.
Developing/Novice | Work and collaborate with your peers/classmates in developing a particular test format. Ask your teacher to evaluate the items that you have developed and to give suggestions on how you can improve your skills in constructing items.
F. Test your understanding about constructing test items for different test formats. Answer the following items.
1. What do you call the statements of what learners are expected to do or demonstrate as a result of engaging in the learning process?
A. Desired learning outcomes
B. Learning goals
C. Learning intents
D. Learning objectives
2. Which of the following is NOT a factor to consider when choosing a particular test format?
A. Desired learning outcomes of the lesson
B. Grade level of students
C. Learning activities
D. Level of thinking to be assessed
3. Ms. Daniel is planning to use a traditional/conventional type of classroom assessment for her Trigonometry quarterly quiz. Which of the following test formats will she likely NOT use?
A. Fill-in-the-blank test
B. Matching type
C. Multiple-choice
D. Oral presentation
4. What is the type of test in which the learners are asked to formulate their own answers?
A. Alternative-response type
B. Constructed-response type
C. Multiple-choice type
D. Selected-response type
5. What is the type of true or false test item in which the statement is presented with a key word or brief phrase that is underlined, and the student has to supply the correct word or phrase?
A. A-B variation
B. T-F correction question
C. T-F substitution variation
D. Yes-No variation
6. What is the type of test item in which learners are required to answer a question by filling in a blank with the correct word or phrase?
A. Essay test
B. Fill-in-the-blank or completion test item
C. Modified true or false test
D. Short answer test
7. What is the most appropriate test format to use if teachers want to measure the learners' higher-order thinking skills, particularly their abilities to reason, analyze, synthesize, and evaluate?
A. Essay
B. Matching type
C. Problem-solving skills
D. True or False
8. What is the first step when planning to construct a final examination in Algebra?
A. Come up with a table of specifications
B. Decide on the length of the test
C. Define the desired learning outcomes
D. Select the type of test to construct
9. What type of learning outcome is Dr. Oňas assessing if he wants to construct a multiple-choice test for his Philippine History class?
A. Knowledge
B. Performance
C. Problem-solving skills
D. Product
10. In constructing a fill-in-the-blank or completion test, what guidelines should be followed?
Educators’ Feedback
Ms. Cudera teaches Practical Research 1 and 2 in a public senior high school. When asked about her experiences in writing test items for her subjects, she cited her practice of referring back to the expected learning outcomes as specified in the DepEd Curriculum Guide and of using varied types of assessments to measure her students' achievement of these expected outcomes. This is what she shared:
"As a teacher in senior high school, I always make sure that my periodical exams measure the expected learning competencies stipulated in the curriculum guide of the Department of Education. I then create a table of specifications, wherein I follow the correct item allocation per competency based on the number of hours being taught in the class and the appropriate cognitive domain expected of every learning competency. In assessing students, I am always guided by DepEd Order No. 8, s. 2015, also known as the Policy Guidelines on Classroom Assessment for the K to 12 Basic Education Program.
For this school year, I was assigned to teach the Practical Research 1 and 2 courses. To assess students' learning or achievement, I first conducted a formative assessment to give me some background on what students know about research. The result of the formative assessment allowed me to revise my lesson plans and gave me some directions on how to proceed with and handle the courses.
As part of the course requirements, I gave the students a lot of writing activities, wherein they were required to write the drafts of each part of a research paper. For each work submitted, I read, checked, and gave comments and suggestions on how to improve their drafts. I then allowed them to rewrite and revise their works. The final research paper is used as the basis for summative assessment. I made use of different types of tests to determine how my students are performing in my class. I administered selected-response types of tests such as multiple-choice, matching-type, completion, and true or false tests to determine how much they have learned about the different concepts, methods, and data gathering and analysis procedures used in research. In the development of the test items, I made sure that I edited them for content, grammar, and spelling. I also checked if the test items conformed to the table of specifications. Furthermore, I also relied heavily on essay tests and other performance tasks. As I have mentioned, I required students to produce or write the different parts of a research paper as outputs. They were also required to gather data for their research. I utilized a rubric that was conceptualized collaboratively with my students in order to evaluate their outputs. I used a 360-degree evaluation of their output, wherein aside from my assessment, other members would assess
the work of others, and the group leader would also evaluate the work of its members.
I also conducted item analysis after every periodical exam to identify the least mastered competencies for a given period, which helped to improve the performance of the students."
References
Brame, C. (2013). Writing good multiple choice test questions. Retrieved on August 26, 2020 from https://cft.vanderbilt.edu/guides-sub-pages/writing-good-multiple-choice-test-questions/
Clay, B. (2001). A Short Guide to Writing Effective Test Questions. Kansas Curriculum Center, Department of Education: Kansas, USA. Retrieved on August 25, 2020 from https://www.k-state.edu/ksde/alp/resources/Handout-Chapter6.pdf
David et al. (2020). Assessment in Learning 1. Manila: Rex Book Store.
De Guzman, E. and Adamos, J. (2015). Assessment of Learning 1. Quezon City: Adriana Publishing Co., Inc.
Popham, W. (2011). Classroom Assessment: What Teachers Need to Know. Boston, MA: Pearson Education, Inc.
Reiner et al. (2020). Preparing Effective Essay Questions: A Self-directed Workbook for Educators. Utah, USA: New Forums Press. Available at https://testing.byu.edu/handbooks/WritingEffectiveEssayQuestions.pdf
Truckee Meadows Community College (2015, February 18). Writing Multiple Choice Test Questions [Video]. YouTube. https://youtu.be/3zQLZVqksGg
Lesson 3: Improving a Classroom-based Assessment
Pre-discussion
By now, it is assumed that you know how to plan a classroom test by specifying the purpose for constructing it, identifying the instructional outcomes to be assessed, and preparing a test blueprint to guide the construction process. The techniques and strategies for selecting and constructing different item formats to match the intended instructional outcomes make up the second phase of the test development process, which was the content of the preceding lesson. The process, however, is not complete without ensuring that the classroom instrument is valid for the purpose for which it is intended. Ensuring this requires reviewing and improving the items, which is the next stage in the process. This lesson offers pre-service teachers practical and necessary ways of improving teacher-developed assessment tools.
What to Expect?
At the end of the lesson, the students can:
1. list down the different ways for judgmental item-improvement and other empirically-based procedures;
2. evaluate which type of test item-improvement is appropriate to use;
3. compute and interpret the results for the index of difficulty, index of discrimination, and distracter efficiency; and
4. demonstrate knowledge of the procedures for improving a classroom-based assessment.
Judgmental Item-Improvement
This approach basically makes use of human judgment in reviewing the items. The judges are the teachers themselves, who know exactly what the test is for, the instructional outcomes to be assessed, and the items' level of difficulty appropriate to the class; the teacher's peers or colleagues, who are familiar with the curriculum standards for the target grade level, the subject matter content, and the ability of the learners; and the students themselves, who can perceive difficulties based on their past experiences.
Teachers' Own Review
It is always advisable for teachers to take a second look at the assessment tools they have devised for a specific purpose. To presume perfection right away after its construction may lead to failure to detect the shortcomings of the test or assessment tasks. There are five suggestions given by Popham (2011) for teachers to follow in exercising judgment:
1. Adherence to item-specific guidelines and general item-writing commandments. The preceding lesson provides specific
guidelines in writing various forms of objective and non-objective constructed-response types and the selected-response type for measuring higher-level thinking skills. These guidelines should be used by the teachers to check how the items have been planned and written, particularly their alignment to the intended instructional outcomes.
2. Contribution to score-based inference. The teacher examines if the expected scores generated by the test contribute to making valid inferences about the learners. Can the scores reveal the amount of learning achieved or show what has been mastered? Can the scores infer the students' capability to move on to the next instructional level? Or do the scores obtained not make any difference at all in describing or differentiating various abilities?
3. Accuracy of contents. This review should especially be considered when tests have been developed after a certain period of time. Changes that may have occurred due to new discoveries or developments can refine the test contents of a summative test. If this happens, the items or the key to correction may need to be revisited.
strengthening the score-based inference capability of the test. If the current tool misses out on important content now prescribed by a new curriculum standard, the score will likely not give an accurate description of what is expected to be assessed. The teacher always ensures that the assessment tool matches what is currently required to be learned. This is a way to check on the content validity of the test. 5. Fairness. The discussion on item-writing guidelines always give warning unintentionally favoring the uninformed student obtain higher
scores. These are due inadvertent grammatical clues, unattractive ERNIE C. CERADO, PhD/MA. DULCE P. DELA CERNA, MIE
122
SULTAN KUDARAT STATE UNIVERSITY
distracters, ambiguous problems, and messy test instructions. Sometimes, unfairness can happen because of undue advantage received by a particular group, like those seated in the front of the classroom or those coming from a particular socio-economic level. Getting rid of faulty and biased items and writing clear instructions definitely add to the fairness of the test.
Peer Review
There are schools that encourage peer or collegial review of assessment instruments. Time is provided for this activity, and it has almost always yielded good results for improving tests and performance-based assessment tasks. During these teacher dyad or triad sessions, those teaching the same subject area can openly review together the classroom tests and tasks they have devised against some consensual criteria. The suggestions given by test experts can actually be used collegially as the basis for a review checklist:
a. Do the items follow the specific and general guidelines in writing items, especially on:
Being aligned to instructional objectives?
Making the problem clear and unambiguous?
Providing plausible options?
Avoiding unintentional clues?
Having only one correct answer?
b. Are the items free from inaccurate content?
c. Are the items free from obsolete content?
d. Are the test instructions clearly written for students to follow?
e. Is the level of difficulty of the test appropriate to the level of learners?
f. Is the test fair to all kinds of students?
Student Review
Engagement of students in reviewing items has become a laudable practice for improving classroom tests. The judgment is based on the students' experience in taking the test, and their impressions and reactions during the
testing event. The process can be efficiently carried out through the use of a review questionnaire. Popham (2011) illustrates a sample questionnaire, shown in the textbox below. It is better to conduct the review activity a day after taking the test so the students still remember the experience when they see a blank copy of the test.
Item-Improvement Questionnaire for Students
If any of the items seemed confusing, which ones were they?
Did any items have more than one correct answer? If so, which ones?
Did any items have no correct answers? If so, which ones?
Were there words in any item that confused you? If so, which ones?
Were the directions for the test, or for particular sub-sections, unclear? If so, which ones?
Another technique of eliciting student judgment for item improvement is by going over the test with the students before the results are shown. Students usually enjoy this activity since they can get feedback on the answers they have written. As they tackle each item, they can be asked to give their answer, and if there is more than one possible correct answer, the teacher makes notations for item alterations. Having more than one correct answer signals ambiguity either in the stem or in the given options. The teacher may also take the chance to observe sources of confusion, especially when answers vary. During this session, it is important for the teacher to maintain an atmosphere that allows students to question and give suggestions. It also follows that after an item review session, the teacher should be willing to modify incorrectly keyed answers.
Empirically-based Procedures
Item-improvement using empirically-based methods is aimed at improving the quality of an item using students' responses to the test. Test developers refer to this technical process as item analysis, as it utilizes data obtained separately for each item. An item is considered good when its quality indices, i.e., the difficulty index and discrimination index, meet certain
characteristics. For a norm-referenced test, these two indices are related
since the level of difficulty of an item contributes to its discriminability. An item is good if it can discriminate between those who perform well in the test and those who do not. However, an extremely easy item, one which can be answered correctly by more than 85% of the group, or an extremely difficult item, one which can be answered correctly by only 15%, is not expected to perform well as a "discriminator". The group will appear to be quite homogeneous on items of this kind. They are weak items since they do not contribute to "score-based inference". The difficulty index, however, takes a different meaning when used in the context of criterion-referenced interpretation or testing for mastery. An item with a high difficulty index will not be considered an "easy item" and therefore a weak item, but rather an item that displays the capability of the learners to perform the expected outcome. It therefore becomes an evidence of mastery. Particularly for objective tests, the responses are binary in form, i.e., right or wrong, translated into the numerical figures 1 and 0, for obtaining nominal data like frequency, percentage, and proportion. Useful data then are in the form:
a. Total number of students answering the item (T)
b. Total number of students answering the item right (R)
Difficulty Index
An item is difficult if the majority of students are unable to provide the correct answer. The item is easy if the majority of the students are able to answer correctly. An item can discriminate if the examinees who score high in the test can answer more of the items correctly than the examinees who got low scores. Below is a data set of five items on the addition and subtraction of integers. Follow the procedure to determine the difficulty and discrimination of each item.
1. Get the total score of each student and arrange the scores from highest to lowest.
Student | Item 1 | Item 2 | Item 3 | Item 4 | Item 5
Student 1 | 0 | 0 | 1 | 1 | 1
Student 2 | 1 | 1 | 1 | 0 | 1
Student 3 | 0 | 0 | 0 | 1 | 1
Student 4 | 0 | 0 | 0 | 0 | 1
Student 5 | 0 | 1 | 1 | 1 | 1
Student 6 | 1 | 0 | 1 | 1 | 0
Student 7 | 0 | 0 | 1 | 1 | 0
Student 8 | 0 | 1 | 1 | 0 | 0
Student 9 | 1 | 0 | 1 | 1 | 1
Student 10 | 1 | 0 | 1 | 1 | 0
2. Obtain the upper and lower 27% of the group. Multiplying 0.27 by the total number of students gives a value of 2.7, which rounds to 3. Get the top three students and the bottom three students based on their scores. The top three students are Students 2, 5, and 9. The bottom three students are Students 7, 8, and 4. The rest of the students are not included in the item analysis.
Student | Item 1 | Item 2 | Item 3 | Item 4 | Item 5 | Total score
Student 2 | 1 | 1 | 1 | 0 | 1 | 4
Student 5 | 0 | 1 | 1 | 1 | 1 | 4
Student 9 | 1 | 0 | 1 | 1 | 1 | 4
Student 1 | 0 | 0 | 1 | 1 | 1 | 3
Student 6 | 1 | 0 | 1 | 1 | 0 | 3
Student 10 | 1 | 0 | 1 | 1 | 0 | 3
Student 3 | 0 | 0 | 0 | 1 | 1 | 2
Student 7 | 0 | 0 | 1 | 1 | 0 | 2
Student 8 | 0 | 1 | 1 | 0 | 0 | 2
Student 4 | 0 | 0 | 0 | 0 | 1 | 1
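For large classes, steps 1 and 2 are tedious to do by hand. The following is a minimal Python sketch of the same procedure; the variable names are ours, and the 0/1 data mirror the sample table above:

```python
# Steps 1 and 2: total each student's score, rank from highest to lowest,
# and take the upper and lower 27% of the group (0.27 x 10 = 2.7, rounded to 3).
scores = {
    "Student 1":  [0, 0, 1, 1, 1], "Student 2":  [1, 1, 1, 0, 1],
    "Student 3":  [0, 0, 0, 1, 1], "Student 4":  [0, 0, 0, 0, 1],
    "Student 5":  [0, 1, 1, 1, 1], "Student 6":  [1, 0, 1, 1, 0],
    "Student 7":  [0, 0, 1, 1, 0], "Student 8":  [0, 1, 1, 0, 0],
    "Student 9":  [1, 0, 1, 1, 1], "Student 10": [1, 0, 1, 1, 0],
}

ranked = sorted(scores, key=lambda s: sum(scores[s]), reverse=True)
k = round(0.27 * len(scores))   # 2.7 rounds to 3
upper, lower = ranked[:k], ranked[-k:]
print(upper)   # ['Student 2', 'Student 5', 'Student 9']
print(lower)   # ['Student 7', 'Student 8', 'Student 4']
```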
3. Obtain the proportion correct for each item. This is computed separately for the upper 27% group and the lower 27% group by summing the correct answers per item and dividing by the number of students in the group.
Student | Item 1 | Item 2 | Item 3 | Item 4 | Item 5 | Total score
Student 2 | 1 | 1 | 1 | 0 | 1 | 4
Student 5 | 0 | 1 | 1 | 1 | 1 | 4
Student 9 | 1 | 0 | 1 | 1 | 1 | 4
Total | 2 | 2 | 3 | 2 | 3 |
Proportion of the high group (pH) | 0.67 | 0.67 | 1.00 | 0.67 | 1.00 |
Student 7 | 0 | 0 | 1 | 1 | 0 | 2
Student 8 | 0 | 1 | 1 | 0 | 0 | 2
Student 4 | 0 | 0 | 0 | 0 | 1 | 1
Total | 0 | 1 | 2 | 1 | 1 |
Proportion of the low group (pL) | 0.00 | 0.33 | 0.67 | 0.33 | 0.33 |
4. The item difficulty is obtained using the following formula:

Item difficulty = (pH + pL) / 2

The difficulty is interpreted using the following table:

Difficulty | Remark
0.76 or higher | Easy item
0.25 to 0.75 | Average item
0.24 or lower | Difficult item
Computations:

Item | 1 | 2 | 3 | 4 | 5
Index of difficulty | 0.33 | 0.50 | 0.83 | 0.50 | 0.67
Remark | Average | Average | Easy | Average | Average
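The same computation can be scripted. This minimal Python sketch mirrors the worked example: it computes pH and pL per item and applies Item difficulty = (pH + pL)/2, reproducing the indices in the table above:

```python
# Step 3 and step 4 on the worked example's groups. Rows are Students 2, 5, 9
# (upper) and Students 7, 8, 4 (lower); columns are Items 1-5.
upper = [[1, 1, 1, 0, 1],
         [0, 1, 1, 1, 1],
         [1, 0, 1, 1, 1]]
lower = [[0, 0, 1, 1, 0],
         [0, 1, 1, 0, 0],
         [0, 0, 0, 0, 1]]

for i in range(5):
    p_h = sum(row[i] for row in upper) / len(upper)   # proportion correct, high group
    p_l = sum(row[i] for row in lower) / len(lower)   # proportion correct, low group
    difficulty = (p_h + p_l) / 2
    print(f"Item {i + 1}: difficulty = {difficulty:.2f}")
# Output: 0.33, 0.50, 0.83, 0.50, and 0.67, matching the computations above.
```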
Discrimination Index
Obviously, the power of an item to discriminate between informed and uninformed groups, or between more knowledgeable and less knowledgeable learners, is shown using the item-discrimination index (D). This is an item statistic that can reveal useful information for improving an item. Basically, an item discrimination index shows the relationship between a student's performance on an item (i.e., right or wrong) and his or her total performance in the test, represented by the total score. Item-total correlation is usually part of a package from item analysis. High item-total
correlations indicate that the items contribute well to the total score, so that responding correctly to these items gives a better chance of obtaining relatively high total scores in the whole test or subtest.
For classroom tests, the discrimination index shows if a difference exists between the performance of those who scored high and those who scored low in the item. As a general rule, the higher the discrimination index (D), the more marked the magnitude of the difference is, and thus, the more discriminating the item is. The nature of the difference, however, can take different directions.
a. Positively discriminating item – the proportion of the high-scoring group is greater than that of the low-scoring group
b. Negatively discriminating item – the proportion of the high-scoring group is less than that of the low-scoring group
c. Non-discriminating item – the proportion of the high-scoring group is equal to that of the low-scoring group
Computing the discrimination index therefore requires obtaining the difference between the proportion of the high-scoring group getting the item correctly and the proportion of the low-scoring group getting the item correctly, using this simple formula:

D = RU/TU – RL/TL
where D = item discrimination index
RU = number of the upper group getting the item correct
TU = number of the upper group
RL = number of the lower group getting the item correct
TL = number of the lower group
Another calculation can bring about the same result:

D = (RU – RL)/T
where RU = number of the upper group getting the item correct
RL = number of the lower group getting the item correct
T = number of either group
As you can see, R/T is actually the p-value of an item. So to get
D is to get the difference between the p-value involving the upper half and the
p-value involving the lower half. So the formula for discrimination index (D) can also be given as (Popham, 2011): D = pU – pL
where pU is the p-value for the upper group (RU/TU) and pL is the p-value for the lower group (RL/TL).
To obtain the proportions of the upper and lower groups responding to the item correctly, the teacher follows these steps:
a. Score the test papers using a key to correction to obtain the total scores of the students. The maximum score is the total number of objective items.
b. Order the test papers from highest to lowest score.
c. Split the test papers into halves: high group and low group.
For a class of 50 or fewer students, do a 50-50 split. Take the upper half as the HIGH score group and the lower half as the LOW group.
For a big group of 100 or so, take the upper 25% - 27% and the lower 25% - 27%.
Maintain equal numbers of test papers for the Upper and Lower groups.
d. Obtain the p-value for the Upper Group and the p-value for the Lower Group:
pUpper = RU/TU; pLower = RL/TL
e. Get the discrimination index (D) by getting the difference between the p-values.
For purposes of evaluating the discriminating power of items, Popham (2011) offers the guidelines proposed by Ebel and Frisbie (1991) shown below. The teachers can be guided on how to select the satisfactory items and what to do to improve the rest.

Discrimination Index | Item Evaluation
.40 and above | Very good items
.30 – .39 | Reasonably good items, but possibly subject to improvement
.20 – .29 | Marginal items, usually needing improvement
.19 and below | Poor items, to be rejected or improved by revision
Items with negative discrimination indices, even when large in absolute value, are subject right away to revision, if not deletion. With multiple-choice items, a negative D is forensic evidence of errors in item writing. It suggests the possibility of:
Wrong key – More knowledgeable students selected the distracter which is the correct answer but is not the keyed option.
Unclear problem in the stem leading to more than one correct answer
Ambiguous distracters – leading the more informed students to be divided in choosing the attractive options
Implausible keyed option – which the more informed students will not choose
As you can see, awareness of item-writing guidelines can provide cues on how to improve items bearing negative or non-significant discrimination indices.
Distracter Analysis
Another empirical procedure to discover areas for item-improvement utilizes an analysis of the distribution of responses across the distracters. Obviously, when the difficulty index and discrimination index of the item seem to suggest its being candidate for revision, distracter analysis becomes a useful follow-up. In distractor analysis, however, we are no longer interested in how test takers select the correct answer, but how the distracters were able to function effectively by drawing the test takers away from the correct answer. The number of times each distractor is selected is noted in order to determine the effectiveness of the distractor. We would expect that the distractor is selected by enough candidates for it to be a viable distractor. What exactly is an
acceptable value? This depends to a large extent on the difficulty of the item itself and what we consider to be an acceptable item difficulty value for test items. If we assume that 0.7 is an appropriate item difficulty value, then we should expect the remaining 0.3 to be about evenly distributed among the distractors. Let us take the following test item as an example:

In the story, he was unhappy because…
A. it rained all day
B. he was scolded
C. he hurt himself
D. the weather was hot
Let us assume that 100 students took the test. If we assume that A is the answer and the item difficulty is 0.7, then 70 students answered correctly. What about the remaining 30 students and the effectiveness of the three distractors? If all 30 selected D, the distractors B and C are useless in their
role as distractors. Similarly, if 15 students selected D and another 15 selected B, then C is not an effective distractor and should be replaced. The ideal situation would be for each of the three distractors to be selected by 10 students. Therefore, for an item which has an item difficulty of 0.7, the ideal effectiveness of each distractor can be quantified as 10/100 or 0.1. What would be the ideal value for distractors in a four-option multiple-choice item when the item difficulty is 0.4? Hint: You need to identify the proportion of students who did not select the correct option.
From a different perspective, the item discrimination formula can also be used in distractor analysis. The concept of upper and lower groups would still remain, but the analysis and expectation would differ slightly from the regular item discrimination that we looked at earlier. Instead of expecting a positive value, we should logically expect a negative value, as more students from the lower group should select the distracters. Each distractor can have its own item discrimination value in order to analyse how the distracters work and ultimately refine the effectiveness of the test item itself. If we use the above item as an example, the item discrimination concept can be used to assess the effectiveness of each distractor. Consider a class of 100
students; we then form the upper and lower groups of 30 students each. Assume the following results are observed:

Distractor | Number of Upper Group who selected | Number of Lower Group who selected | Discrimination Index
A. it rained all day* | 20 | 10 | (20 - 10)/30 = .33
B. he was scolded | 3 | 3 | (3 - 3)/30 = 0
C. he hurt himself | 4 | 16 | (4 - 16)/30 = -.40
D. the weather was hot | 3 | 1 | (3 - 1)/30 = .07
*Correct answer

The values in the last column of the table can once again be interpreted according to how we examined item discrimination values, but with a twist. Alternative A is the key, and a positive value is what we would want. However, the value of 0.33 is rather low considering the maximum value is 1. The value for distractor B is 0, and this tells us that the distractor did not discriminate between the proficient students in the upper group and the weaker students in the lower group. Hence, the effectiveness of this distractor is questionable. Distractor C, on the other hand, seems to have functioned effectively. More students in the lower group than in the upper group selected this distractor. As our intention in distractor analysis is to identify distractors that would seem to be the correct answer to weaker students, distractor C seems to have done its job. The same cannot be said of the final distractor. In fact, the positive value obtained here indicates that more of the proficient students selected this distractor. We should understand by now that this is not what we would hope for.
Distractor analysis can be a useful tool in evaluating the effectiveness of our distractors. It is important for us to be mindful of the distractors that we use in a multiple-choice test, as when distractors are not effective, they are virtually useless. As a result, there is a greater possibility that students will be able to select the correct answer by guessing, as the options have been reduced.
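For completeness, here is a minimal Python sketch of the same distracter analysis; it reproduces the discrimination indices in the table above by computing (upper - lower) / group size per option:

```python
# Distracter analysis on the story item: counts are (upper group, lower group)
# selections out of 30 examinees per group; A is the keyed answer.
counts = {
    "A. it rained all day":   (20, 10),
    "B. he was scolded":      (3, 3),
    "C. he hurt himself":     (4, 16),
    "D. the weather was hot": (3, 1),
}
group_size = 30

for option, (up, low) in counts.items():
    index = (up - low) / group_size
    print(f"{option}: {index:+.2f}")
# A: +0.33 (key, positive as hoped), B: +0.00 (not discriminating),
# C: -0.40 (a working distracter), D: +0.07 (attracts the upper group)
```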
Summary
Judgmental item-improvement is accomplished through the teacher's own review, peer review, and student review.
Enhancement of tests and test items may be possible using empirically-based procedures like computing the index of difficulty, the discrimination index, or distracter analysis.
For items with one correct alternative worth a single point, the item difficulty is simply the percentage of students who answer the item correctly.
Item discrimination refers to the ability of an item to differentiate among students on the basis of how well they know the material being tested.
One important element in the quality of a multiple-choice item is the quality of the item's distractors. A distractor analysis addresses the performance of these incorrect response options.
Enrichment
Read the following studies:
1. "Difficulty Index, Discrimination Index and Distractor Efficiency in Multiple Choice Questions," available from https://www.researchgate.net/publication/323705126
2. "Item Discrimination and Distractor Analysis: A Technical Report on Thirty Multiple Choice Core Mathematics Achievement Test Items," available from https://www.researchgate.net/publication/335892361
3. "Index and Distractor Efficiency in a Formative Examination in Community Medicine," available from https://www.researchgate.net/publication/286478898
4. "Impact of distractors in item analysis of multiple choice questions," available from https://www.researchgate.net/publication/332050250
Assessment

A. Below are descriptions of procedures done to review and improve test items. On the space provided, write J if a judgmental approach is used and E if empirically-based.
1. The Math coordinator of the Grade 7 classes examined the periodical tests made by the Math teachers to see if their items are aligned to the target outcomes for the first quarter.
2. The alternatives of the multiple-choice items of the Social Studies test were reviewed to discover if they have only one correct answer.
3. To determine if the items are efficiently discriminating between the more able students and the less able ones, a Biology teacher obtained the discrimination index (D) of the items.
4. A Technology Education teacher was interested to see if the criterion-referenced test he has devised shows a difference in the items' post-test and pre-test p-values.
5. An English teacher conducted a session with his students to find out if there are other responses acceptable in their literature test. He encouraged them to rationalize their answers.

B. A final test in Science was administered to a Grade 6 class of 50. The teacher wants to further improve the items for next year's use. Calculate a quality index using the given data and indicate the possible revision needed by some items.

Item | Number of students getting the correct answer | Index | Revision needed to be done
1 | 14 | |
2 | 18 | |
3 | 10 | |
4 | 45 | |
5 | 8 | |
C. Below are additional data collected for the same items. Calculate another quality index and indicate what needs to be done, with the obtained index as a basis.

Item | Upper Group | Lower Group | Index | Revision needed to be done
1 | 25 | 9 | |
2 | 9 | 9 | |
3 | 2 | 8 | |
4 | 38 | 8 | |
5 | 1 | 7 | |
D. A distracter analysis is given for a test item administered to a class of 60. Obtain the necessary item statistics using the given data.

Item 1 (N = 30 per group):
Group | A | B | C | D | Omit
Upper | | | | |
Lower | | | | |
Difficulty index: _____  Discrimination index: _____
Write your evaluation on the following aspects of the item:
a. Difficulty of the item
b. Discrimination power of the item
c. Plausibility of the options
d. Ambiguity of the options

E. For each item, write the letter of the correct answer on the space provided.
1. Below are different ways of utilizing the concept of discrimination as an index of item quality EXCEPT
a. Getting the proportion of those answering the item correctly over those answering the item
b. Obtaining the difference between the proportion of the high-scoring group and the proportion of the low-scoring group getting the item correctly
c. Getting how much better the performance of the class by item is after instruction than before
d. Differentiating the performance in an item of a group that has received instruction and a group that has not
2. What can enable some students to answer items correctly even without having enough knowledge of what is intended to be measured?
a. Clear and brief test instructions
b. Comprehensible statement of the item stem
c. Obviously correct and obviously wrong alternatives
d. Simple Simple senten sentence ce struc structure ture of tthe he problem problem ERNIE C. CERADO, PhD/MA. DULCE P. DELA CERNA, MIE
135
SULTAN KUDARAT STATE UNIVERSITY
3. An instructor is going to prepare an end-of-course summative test. What major consideration should he observe so it will differ from a unit test?
a. Inclusion of all intended learning outcomes of the course
b. Appropriate length of the test to cover all subject matter topics
c. Preparation of a key to correction in advance for ease of scoring
d. Adequate sampling of higher-level learning outcomes
4. Among the strategies for improving test questions given below, which is empirical in approach?
a. Items that students find confusing are collected and revised systematically
b. Teachers who are teaching the same subject matter collegially meet to discuss the alignment of items to their learning outcomes
c. Item responses of the high-scoring group are compared with those of the low-scoring group
d. The teacher examines the stem and alternatives for accuracy of content
5. Which of the following multiple-choice item data shows a need for revision?

Item | Group | A | B | C | D
1 | Upper | 5* | 4 | 9 | 2
  | Lower | 15 | 0 | 5 | 0
2 | Upper | 2 | 4 | 12* | 2
  | Lower | 4 | 4 | 5 | 7
3 | Upper | 2 | 14* | 2 | 0
  | Lower | 4 | 4 | 5 | 7
4 | Upper | 2 | 4 | 2 | 10*
  | Lower | 8 | 5 | 0 | 7

*correct answer

References

Conduct the Item Analysis. Retrieved from http://www.proftesting.com/test_topics/steps_9.php
David et al. (2020). Assessment in Learning 1. Manila: Rex Book Store.
De Guzman, E. and Adamos, J. (2015). Assessment of Learning 1. Quezon City: Adriana Publishing Co., Inc.
ExamSoft (2015, August 4). Putting it All Together: Using Distractor Analysis [Video]. YouTube. https://www.youtube.com/watch?v=c8r_6bT_VQo
ExamSoft (2015, July 21). The Definition of Item Difficulty [Video]. YouTube. https://www.youtube.com/watch?v=oI_7HkgZKj8
ExamSoft (2015, July 23). Twenty-Seven Percent: The Index of Discrimination [Video]. YouTube. https://www.youtube.com/watch?v=Fr1KMb8GNNs
Exploring Reliability in Academic Achievement. Retrieved from https://chfasoa.uni.edu/reliabilityandvalidity.htm
Mahjabeen et al. (2017). Efficiency in Multiple Choice Questions. Annals of PIMS. Available at https://www.researchgate.net/publication/323705126
Popham, W. (2011). Classroom Assessment: What Teachers Need to Know. Boston, MA: Pearson Education, Inc.
Professional Testing, Inc. (2020). Building High Quality Examination Programs. Retrieved from http://www.proftesting.com/test_topics/steps_9.php
The Graide Network, Inc. (2019). Importance of Validity and Reliability in Classroom Assessments. Retrieved from https://www.thegraidenetwork.com/blog-all/2018/8/1/the-two-keys-to-quality-testing-reliability-and-validity
Lesson 4: Establishing Test Validity and Reliability
Pre-discussion
To be able to successfully perform the expected performance tasks, students should have prepared a test following the proper procedure, with clear learning targets (objectives), a table of specifications, and pre-test data per item. In the previous lesson, guidelines were provided for constructing tests following different formats. Students have also learned that an assessment becomes valid when the test items represent a good set of objectives, and this should be reflected in the table of specifications. The learning objectives or targets will help them construct appropriate test items.

What to Expect?
At the end of this lesson, the students can:
1. explain the different tests of validity;
2. identify the most practical test to apply when validating a typical teacher-made assessment;
3. tell when to use a certain type of reliability test;
4. apply the suitable method of reliability test given a set of assessment results/test data; and
5. decide whether a test is valid or reliable.
In order to establish the validity and reliability of an assessment tool, pre-service teachers need to know the different ways of establishing test validity and reliability. They are expected to read this before they can analyse their test items. Test Validity
A test is valid when it measures what it is supposed to measure. Validity pertains to the connection between the purpose of the test and the data the teacher chooses to quantify that purpose. If a quarterly exam is valid, then its contents should directly measure the objectives of the curriculum. If a scale that measures personality is composed of five factors, then the scores on the five factors should have items that are highly correlated. If an entrance exam is valid, it should predict students' grades after the first semester.

It is easier to understand the definition by looking at examples of invalidity. Colin Foster, an expert in mathematics education at the University of Nottingham, gives the example of a reading test meant to measure literacy that is given in a very small font size. A highly literate student with bad eyesight may fail the test because they cannot physically read the passages supplied. Thus, such a test would not be a valid measure of literacy (though it may be a valid measure of eyesight). Such an example highlights the fact that validity is wholly dependent on the purpose behind a test. More generally, in a study plagued by weak validity, "it would be possible for someone to fail the test situation rather than the intended test subject."
Different Ways to Establish Test Validity
Validity can be divided into several categories, some of which relate closely to one another. Let us discuss a few of the most relevant types through this matrix.

Type of validity | Definition | Procedure
Content Validity | When the items represent the domain being measured | The items are compared with the objectives of the program. The items need to measure the objectives directly (for achievement tests) or the definition (for scales). A reviewer conducts the checking.
Face Validity | When the test is presented well, free of errors, and administered well | The test items and layout are reviewed and tried out on a small group of respondents. A manual for administration can be made as a guide for the test administrator.
Predictive Validity | A measure should predict a future criterion. An example is an entrance exam predicting the grades of the students after the first semester. | A correlation coefficient is obtained where the X-variable is used as the predictor and the Y-variable as the criterion.
Construct Validity | The components or factors of the test should contain items that are strongly correlated. | The Pearson r can be used to correlate the items for each factor. However, there is a technique called factor analysis to determine which items are highly correlated to form a factor.
Concurrent Validity | When two or more measures are present for each examinee that measure the same characteristic | The scores on the measures should be correlated.
Convergent Validity | When the components or factors of a test are hypothesized to have a positive correlation | Correlation is done for the factors of the test.
Divergent Validity | When the components or factors of a test are hypothesized to have a negative correlation. An example is the scores in a test on intrinsic and extrinsic motivation. | Correlation is done for the factors of the test.
Cases are provided for each type of validity to illustrate how it is conducted. After reading the cases about the different kinds of validity, look for a partner and answer the following questions. Discuss your answers. You may use other references and browse the internet.

1. Content Validity

A coordinator in science is checking the science test paper for Grade 4. She asked the Grade 4 science teacher to submit the table of specifications containing the objectives of the lesson and the corresponding items. The coordinator checked whether each item is aligned with the objectives.

How are the objectives used when creating test items?
How is content validity determined given the objectives and the items in a test?
What should be present in a test's table of specifications when determining content validity?
Who checks the content validity of items?
2. Face Validity

The assistant principal browsed the test paper made by the math teacher. She checked if the contents of the items are about mathematics. She examined if the instructions are clear. She browsed through the items to see if the grammar is correct and if the vocabulary is within the students' level of understanding.

What can be done in order to ensure that the assessment appears to be effective?
What practices are done in conducting face validity?
Why is face validity the weakest form of validity?
3. Predictive Validity

The school admissions office developed an entrance examination. The officials wanted to determine if the results of the entrance examination are accurate in identifying good students. They took the grades of the students accepted for the first quarter. They correlated the entrance exam results and the first quarter grades. They found significant and positive correlations between the entrance examination scores and the grades. The entrance examination results predicted the grades of students after the first quarter. Thus, there was predictive validity.
Why are two measures needed in predictive validity?
What is the assumed connection between these two measures?
How can we determine if a measure has predictive validity? What statistical analysis is done to determine predictive validity?
How can the test results of predictive validity be interpreted?
4. Concurrent Validity
A school Guidance Counsellor administered a math achievement test to Grade 6 students. She also has a copy of the students’ grades in math. She wanted to verify if the math grades of the students are measuring the same competencies as the math achievement test. The school counsellor correlated the math achievement scores and math grades to determine if they are measuring the same competencies.
What needs to be available when conducting concurrent validity?
At least how many tests are needed for conducting concurrent validity?
What statistical analysis can be used to establish concurrent validity?
How are the results of a correlation coefficient interpreted for concurrent validity?
5. Construct Validity

A science test was made by a Grade 10 teacher composed of four domains: matter, living things, force and motion, and earth and space. There are 10 items under each domain. The teacher wanted to determine if the 10 items made under each domain really belonged to that domain. The teacher consulted an expert in test measurement. They conducted a procedure called factor analysis. Factor analysis is a statistical procedure done to determine if the items written will load under the domain they belong to.
What type of test requires construct validity?
What should the test have in order to verify its constructs?
What are constructs and factors in a test?
How can these factors be verified if they are appropriate for the test?
What results come out in construct validity?
How are the results in construct validity interpreted?

The construct validity of a measure is reported in journal articles. The following are guide questions used when searching for the construct validity of a measure in reports:
What was the purpose of construct validity? What type of test was used?
What are the dimensions or factors that were studied using construct validity?
What procedure was used to establish the construct validity?
What statistics were used for the construct validity?
What were the results of the test’s construct validity?
6. Convergent Validity

A Math teacher developed a test to be administered at the end of the school year, which measures number sense, patterns and algebra, measurement, geometry, and statistics. The math teacher assumes that students' competencies in number sense improve their capacity to learn patterns and algebra and other concepts. After administering the test, the scores were separated for each area, and these five domains were intercorrelated using the Pearson r. The positive correlation between number sense and patterns and algebra indicates that, when number sense scores increase, the patterns and algebra scores also increase. This shows that student learning in number sense scaffolds patterns and algebra competencies.

What should a test have in order to conduct convergent validity?
What are done with the domains in a test on convergent validity?
What analysis is used to determine convergent validity?
How are the results in convergent validity interpreted?
7. Divergent Validity

An English teacher taught a metacognitive awareness strategy for comprehending a paragraph to Grade 11 students. She wanted to determine if the performance of her students in reading comprehension would reflect well in the reading comprehension test. She administered the same reading comprehension test to another class which was not taught the metacognitive awareness strategy. She compared the results using a t-test for independent samples and found that the class that was taught the metacognitive awareness strategy performed significantly better than the other group. The test has divergent validity.

What conditions are needed to conduct divergent validity?
What assumption is being proved in divergent validity?
What statistical analysis can be used to establish divergent validity?
How are the results of divergent validity interpreted?
Test Reliability
Reliability is not at all concerned with intent; instead, it asks whether the test used to collect data produces accurate results. In this context, accuracy is defined by consistency, or whether the results could be replicated. Reliability is the consistency of the responses to a measure under three conditions:
1. when retested on the same person;
2. when retested on the same measure; and
3. similarity of responses across items that measure the same characteristic.
In the first condition, a consistent response is expected when the test is given to the same participants. In the second condition, reliability is attained if the responses to the same test are consistent with an equivalent or another test that measures the same characteristic when administered at a different time. In the third condition, there is reliability when the person responded in the same way or consistently across items that measure the same characteristic.

There are different factors that affect the reliability of a measure. The reliability of a measure can be high or low, depending on the following factors:
1. The number of items in a test – The more items a test has, the higher the likelihood of reliability. The probability of obtaining consistent scores is high because of the large pool of items.
2. Individual differences of participants – Every participant possesses characteristics that affect their performance in a test, such as fatigue, concentration, innate ability, perseverance, and motivation. These individual factors change over time and affect the consistency of the answers in a test.
3. External environment – The external environment may include room temperature, noise level, depth of instruction, exposure to materials, and quality of instruction, which could affect changes in the responses of examinees in a test.

What are the different ways to establish test reliability?
There are different ways of determining the reliability of a test. The specific kind of reliability will depend on the (1) variable you are measuring, (2) type of test, and (3) number of versions of the test. The different methods of reliability testing are indicated below, along with how they are done. Please note that, for each method, a statistical analysis is needed to determine the test reliability.
1. Test-retest
How it is done: Administer a test at one time to a group of examinees, then administer it again at another time to the same group of examinees. There is a time interval of not more than six months between the first and second administration for tests that measure stable characteristics, such as standardized aptitude tests. The post-test can be given with a minimum time interval of 30 minutes. The responses in the test should be more or less the same across the two points in time. Test-retest is applicable for tests that measure stable variables, such as aptitude and psychomotor measures (e.g., a typing test or tasks in physical education).
Statistics used: Correlate the test scores from the first and the second administration. A significant and positive correlation indicates that the test has temporal stability over time. Correlation refers to a statistical procedure in which a linear relationship is expected between two variables. The Pearson Product Moment Correlation (Pearson r) may be used because test data are usually on an interval scale (refer to a statistics book for the Pearson r).

2. Parallel Forms
How it is done: There are two versions of a test. The items need to measure exactly the same skill. Each test version is called a "form." Administer one form at one time and the other form at another time to the same group of participants. The responses on the two forms should be more or less the same. Parallel forms are applicable if there are two versions of the test. This is usually done when the test is repeatedly used for different groups, such as entrance examinations and licensure examinations, where different versions of the test are given to different groups of examinees.
Statistics used: Correlate the test results of the first form and the second form. A significant and positive correlation coefficient is expected, indicating that the responses in the two forms are the same or consistent. The Pearson r is usually used for this analysis.

3. Split-Half
How it is done: Administer a test to a group of examinees. The items are split into halves, usually using the odd-even technique: get the sum of the points in the odd-numbered items and correlate it with the sum of the points in the even-numbered items. Each examinee will thus have two scores coming from the same test. The scores on each set should be close or consistent. Split-half is applicable when the test has a large number of items.
Statistics used: Correlate the two sets of scores using the Pearson r. After the correlation, apply another formula called the Spearman-Brown coefficient. The coefficient obtained using the Pearson r and Spearman-Brown should be significant and positive to mean that the test has internal consistency reliability.

4. Test of Internal Consistency Using Kuder-Richardson and Cronbach's Alpha
How it is done: This procedure involves determining if the scores for each item are consistently answered by the examinees. After administering the test to a group of examinees, record the scores for each item. The idea is to see if the responses per item are consistent with each other. This technique works well when the assessment tool has a large number of items; it is also applicable for scales and inventories (e.g., a Likert scale from "strongly agree" to "strongly disagree").
Statistics used: A statistical analysis called Cronbach's alpha or the Kuder-Richardson formula is used to determine the internal consistency of the items. A Cronbach's alpha value of 0.60 and above indicates that the test items have internal consistency.

5. Inter-rater Reliability
How it is done: This procedure is used to determine the consistency of multiple raters when using rating scales and rubrics to judge performance. The reliability here refers to the similar or consistent ratings provided by more than one rater or judge when they use an assessment tool. Inter-rater reliability is applicable when the assessment requires the use of multiple raters.
Statistics used: A statistical analysis called the Kendall's W coefficient of concordance is used to determine if the ratings provided by multiple raters agree with each other. A significant Kendall's W value indicates that the raters concur with each other in their ratings.
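The Spearman-Brown step in the split-half method is worth making concrete. The sketch below correlates hypothetical odd-item and even-item half scores and then applies the standard two-half Spearman-Brown correction, r_full = 2r_half / (1 + r_half); the score lists are illustrative assumptions, not data from this module.

```python
# Split-half reliability with the Spearman-Brown correction.
# The odd/even half scores below are hypothetical, for illustration only.
import statistics

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

odd_half = [8, 6, 7, 9, 5, 8, 7, 6]    # sums of odd-numbered items per examinee
even_half = [7, 6, 8, 9, 4, 7, 7, 5]   # sums of even-numbered items per examinee

r_half = pearson_r(odd_half, even_half)
r_full = (2 * r_half) / (1 + r_half)   # Spearman-Brown correction for two halves
print(f"Half-test r = {r_half:.2f}; Spearman-Brown full-test reliability = {r_full:.2f}")
```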
Notice that a statistical analysis is needed to determine the reliability of a measure. The very basis of the statistical analyses used to determine reliability is linear regression.

1. Linear regression

Linear regression is demonstrated when you have two variables that are measured, such as two sets of scores in a test taken at two different times by the same participants. When the two sets of scores are plotted in a graph (with an X- and Y-axis), they tend to form a straight line, and the straight line formed by the two sets of scores can be described by a linear regression. When a straight line is formed, we can say that there is a correlation between the two sets of scores. This correlation is shown in the graph below, called a scatterplot. Each point in the scatterplot is a respondent with two scores (one for each test).
Figure 1. Scatterplot diagram
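Since Figure 1 may not reproduce clearly in print, here is a minimal sketch of how such a scatterplot can be drawn with Python's matplotlib; the points use the Monday/Tuesday spelling scores from the worked example in the next section.

```python
# Scatterplot of two administrations of the same test (as in Figure 1).
# Each point is one student: X = Monday score, Y = Tuesday score.
import matplotlib.pyplot as plt

monday = [10, 9, 6, 10, 12, 4, 5, 7, 16, 8]
tuesday = [20, 15, 12, 18, 19, 8, 7, 10, 17, 13]

plt.scatter(monday, tuesday)
plt.xlabel("Monday test score (X)")
plt.ylabel("Tuesday test score (Y)")
plt.title("Scatterplot of two test administrations")
plt.show()
```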
2. Computation of the Pearson r correlation

The index of the linear regression is called a correlation coefficient. When the points in a scatterplot tend to fall along the linear line, the correlation is said to be strong. When the direction of the scatterplot is directly proportional, the correlation coefficient will have a positive value. If the line is inverse, the correlation coefficient will have a negative value. The statistical analysis used to determine the correlation coefficient is called the Pearson r. The Pearson r is obtained by the following formula and is illustrated below.

Formula:

r = [N∑XY − (∑X)(∑Y)] / √{[N∑X² − (∑X)²][N∑Y² − (∑Y)²]}

where
N – the number of paired scores
∑X – the sum of all the X scores (Monday scores)
∑Y – the sum of all the Y scores (Tuesday scores)
∑XY – the sum of the products of the X and Y scores
∑X² – the sum of all the squared values of X
∑Y² – the sum of all the squared values of Y
Suppose that a teacher gave a spelling test of two-syllable words with 20 items on Monday and Tuesday. The teacher wanted to determine the reliability of the two sets of scores by computing the Pearson r.

Monday Test (X) | Tuesday Test (Y) | X² | Y² | XY
10 | 20 | 100 | 400 | 200
9 | 15 | 81 | 225 | 135
6 | 12 | 36 | 144 | 72
10 | 18 | 100 | 324 | 180
12 | 19 | 144 | 361 | 228
4 | 8 | 16 | 64 | 32
5 | 7 | 25 | 49 | 35
7 | 10 | 49 | 100 | 70
16 | 17 | 256 | 289 | 272
8 | 13 | 64 | 169 | 104
∑X = 87 | ∑Y = 139 | ∑X² = 871 | ∑Y² = 2125 | ∑XY = 1328

Applying the formula, we have:

r = [10(1328) − (87)(139)] / √{[10(871) − (87)²][10(2125) − (139)²]}
  = (13280 − 12093) / √[(8710 − 7569)(21250 − 19321)]
  = 1187 / √(1141 × 1929)
  = 1187 / 1483.57
  ≈ 0.80
The value of a correlation coefficient does not exceed 1.00 or −1.00. A value of 1.00 or −1.00 indicates a perfect correlation. In a test of reliability, though, we aim for a high positive correlation, which means that there is consistency in the way the students answered the tests taken.
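As a check on the hand computation, the same Pearson r can be obtained with a few lines of Python. This sketch applies the raw-score formula above to the Monday and Tuesday spelling scores and reproduces r ≈ 0.80.

```python
# Pearson r from the raw-score formula, applied to the spelling-test data.
x = [10, 9, 6, 10, 12, 4, 5, 7, 16, 8]      # Monday scores (X)
y = [20, 15, 12, 18, 19, 8, 7, 10, 17, 13]  # Tuesday scores (Y)
n = len(x)

sx, sy = sum(x), sum(y)
sxy = sum(a * b for a, b in zip(x, y))
sx2, sy2 = sum(a * a for a in x), sum(b * b for b in y)

r = (n * sxy - sx * sy) / ((n * sx2 - sx**2) * (n * sy2 - sy**2)) ** 0.5
print(f"r = {r:.2f}")  # prints r = 0.80
```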
Difference between a Positive and a Negative Correlation

When the value of the correlation coefficient is positive, it means that the higher the scores in X, the higher the scores in Y. This is called a positive correlation. In the case of the two spelling scores, a positive correlation was obtained. When the value of the correlation coefficient is negative, it means that the higher the scores in X, the lower the scores in Y, and vice versa. This is called a negative correlation. When the same test is administered to the same group of participants, a positive correlation usually indicates reliability or consistency of the scores.

Determining the Strength of a Correlation
The strength of the correlation also indicates the strength of the reliability of the test. This is indicated by the value of the correlation coefficient: the closer the value to 1.00 or −1.00, the stronger the correlation. Below is the guide:

0.80 – 1.00  Very strong relationship
0.60 – 0.79  Strong relationship
0.40 – 0.59  Substantial/marked relationship
0.20 – 0.39  Weak relationship
0.00 – 0.19  Negligible relationship
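For convenience, the guide above can be expressed as a small helper function; this is a minimal sketch that maps the absolute value of a computed coefficient to the verbal labels.

```python
# Map a correlation coefficient to the verbal strength labels above.
def strength(r):
    a = abs(r)
    if a >= 0.80:
        return "very strong relationship"
    if a >= 0.60:
        return "strong relationship"
    if a >= 0.40:
        return "substantial/marked relationship"
    if a >= 0.20:
        return "weak relationship"
    return "negligible relationship"

print(strength(0.80))  # very strong relationship
```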
Internal Consistency of a Test
Another statistical analysis to determine the internal consistency of a test is Cronbach's alpha. Follow the given procedure to determine the internal consistency. Suppose that five students answered a checklist about their hygiene using a scale of 1 to 5, wherein the following are the corresponding scores:
5 – Always
4 – Often
3 – Sometimes
2 – Rarely
1 – Never
The checklist has five items. The teacher wanted to determine if the items have internal consistency.

Student | Item 1 | Item 2 | Item 3 | Item 4 | Item 5 | Total per case (X) | Score − Mean | (Score − Mean)²
A | 5 | 5 | 4 | 4 | 1 | 19 | 2.8 | 7.84
B | 3 | 4 | 3 | 3 | 2 | 15 | −1.2 | 1.44
C | 2 | 5 | 3 | 3 | 3 | 16 | −0.2 | 0.04
D | 1 | 4 | 2 | 3 | 3 | 13 | −3.2 | 10.24
E | 3 | 3 | 4 | 4 | 4 | 18 | 1.8 | 3.24
Total for each item (∑X) | 14 | 21 | 16 | 17 | 13 | Mean of totals = 16.2 | | ∑(Score − Mean)² = 22.8
Mean | 2.8 | 4.2 | 3.2 | 3.4 | 2.6 | | |
SD² | 2.2 | 0.7 | 0.7 | 0.3 | 1.3 | ∑SD² = 5.2 | | σ²total = 22.8/4 = 5.7
The Cronbach's alpha formula is given by:

α = [k / (k − 1)] × [1 − (∑σᵢ² / σₓ²)]

where
k – the number of scale items
σᵢ² – the variance associated with item i
σₓ² – the variance associated with the observed total scores

Hence,

α = [5 / (5 − 1)] × [1 − (5.2 / 5.7)] = 1.25 × 0.088 ≈ 0.11
The internal consistency of the responses in the hygiene checklist is approximately 0.11, indicating low internal consistency.
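The hand computation can be verified with a short script. This sketch implements the same alpha formula, computing the item variances and the variance of the total scores with the n − 1 denominator used in the worked table.

```python
# Cronbach's alpha for the 5-student, 5-item hygiene checklist.
import statistics

scores = [  # rows = students A..E, columns = items 1..5
    [5, 5, 4, 4, 1],
    [3, 4, 3, 3, 2],
    [2, 5, 3, 3, 3],
    [1, 4, 2, 3, 3],
    [3, 3, 4, 4, 4],
]
k = len(scores[0])  # number of items

item_vars = [statistics.variance(col) for col in zip(*scores)]  # sample variances
total_var = statistics.variance([sum(row) for row in scores])   # variance of totals

alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(f"alpha = {alpha:.2f}")  # about 0.11
```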
The consistency of ratings can also be obtained using a coefficient of concordance. The Kendall's W coefficient of concordance is used to test the agreement among raters. The next illustration is a performance task demonstrated by five students and rated by three raters. The rubric used a scale of 1 to 4, wherein 4 is the highest and 1 is the lowest.

Demonstration | Rater 1 | Rater 2 | Rater 3 | Sum of Ratings | D | D²
A | 4 | 4 | 3 | 11 | 2.6 | 6.76
B | 3 | 2 | 3 | 8 | −0.4 | 0.16
C | 3 | 4 | 4 | 11 | 2.6 | 6.76
D | 3 | 3 | 2 | 8 | −0.4 | 0.16
E | 1 | 1 | 2 | 4 | −4.4 | 19.36
Mean of Sum of Ratings = 8.4 | ∑D² = 33.2

The scores given by the three raters are first summed to get the total rating for each demonstration. The mean of the sums of ratings is obtained (8.4). The mean is subtracted from each Sum of Ratings to get D. Each difference is squared (D²), then the sum of squares is computed (∑D² = 33.2). The sum of squared differences is substituted in the Kendall's W formula, where m is the number of raters and k is the number of students who performed the demonstrations. Let us consider the formula and the substitution of values:

W = 12∑D² / [m²k(k² − 1)]
W = 12(33.2) / [(3)²(5)(5² − 1)]
W = 398.4 / 1080
W ≈ 0.37

A Kendall's W coefficient value of 0.37 indicates the degree of agreement of the three raters on the five demonstrations. Clearly, there is only moderate concordance among the three raters because the value is far from 1.00.
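The same W can be computed directly from the ratings. A minimal sketch follows; note that it applies the formula to raw rating sums, following the module's worked example (textbook treatments of Kendall's W typically use rank sums).

```python
# Kendall's W coefficient of concordance for m raters and k subjects,
# computed from the rating sums as in the worked example above.
ratings = {  # demonstration: [rater 1, rater 2, rater 3]
    "A": [4, 4, 3],
    "B": [3, 2, 3],
    "C": [3, 4, 4],
    "D": [3, 3, 2],
    "E": [1, 1, 2],
}
m = 3             # number of raters
k = len(ratings)  # number of demonstrations

sums = [sum(r) for r in ratings.values()]
mean_sum = sum(sums) / k
ss = sum((s - mean_sum) ** 2 for s in sums)  # sum of squared deviations, 33.2

w = 12 * ss / (m**2 * k * (k**2 - 1))
print(f"W = {w:.2f}")  # prints W = 0.37
```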
A test is valid when it measures what it is supposed to measure. It can be categorized as face, content, construct, predictive, concurrent, convergent, or divergent validity.
Reliability is the consistency of the responses to a measure. It can be established through test-retest, parallel forms, split-half, internal consistency, and inter-rater reliability methods.
Enrichment

A. Get a journal article about a study that developed a measure or conducted validity or reliability tests. You may also download one from any of the following open-access sources:

Google Scholar
Directory of Open Access Journals
Multidisciplinary open-access journals
Allied Academics journals

Your task is to write a short report focusing on important information about how the authors conducted and established test validity and reliability. Provide the following information.
1. Purpose of the study
2. Description of the instrument with its underlying factors
3. Validity technique used in the study and the analysis used
4. Reliability technique used in the study and the analysis used
5. Results of the tests of validity and reliability

B. Learn more about reliability and validity in student assessment by watching a clip from http://www.youtube.com/watch?v=gzv8Cm1jC4M.
C. Read Magno's (2009) work titled "Demonstrating the Difference between Classical Test Theory and Item Response Theory Using Derived Test Data," published in the International Journal of Educational and Psychological Assessment, Volume 1. Access it through https://files.eric.ed.gov/fulltext/ED506058.pdf

Assessment

A. Indicate the type of reliability applicable for each case. Write the type of reliability on the space before the number.

1. Mr. Perez conducted a survey of his students to determine their study habits. Each item is answered using a five-point scale (always, often, sometimes, rarely, never). He wanted to determine if the responses for each item are consistent. What reliability technique is recommended?
2. A teacher administered a spelling test to her students. After a day, another spelling test was given with the same length and stress of words. What reliability can be used for the two spelling tests?
3. A PE teacher requested two judges to rate the dance performance of her students in physical education. What reliability can be used to determine the reliability of the judgments?
4. An English teacher administered a 20-item test to determine students' use of verbs given a subject. The scores were divided into items 1 to 10 and items 11 to 20. The teacher correlated the two sets of scores that form the same test. What reliability is done here?
5. A computer teacher gave a set of typing tests on Wednesday and gave the same set the following week. The teacher wanted to know if the students' typing skills are consistent. What reliability can be used?
B. Indicate the type of validity applicable for each case. Write the type of validity on the blank before the number.

1. The science coordinator developed a science test to determine who among the students will be placed in an advanced science section. The students who scored high in the science test were selected. After two quarters, the grades of the students in the advanced science section were determined. The scores in the science test were correlated with the science grades to check if the science test was accurate in the selection of students. What type of validity was used?
2. A test composed of listening comprehension, reading comprehension, and visual comprehension items was administered to students. The researcher determined if the scores on each area refer to the same skill of comprehension. The researcher hypothesized a significant and positive relationship among these factors. What validity was established?
3. The guidance counsellor conducted an interest inventory that measured the following factors: realistic, investigative, artistic, scientific, enterprising, and conventional. The guidance counsellor wanted to provide evidence that the items constructed really belong to the factors proposed. After her analysis, the proposed items had high factor loadings on the domains they belong to. What validity was conducted?
4. The technology and livelihood education teacher developed a performance task to determine student competency in preparing a dessert. The students were tasked with selecting a dessert, preparing the ingredients, and making the dessert in the kitchen. The teacher developed a set of criteria to assess the dessert. What type of validity is shown here?
5. The teacher in a robotics class taught students how to create a program to make the arms of a robot move. The assessment was a performance task of making a program for three kinds of robot arm movements. The same assessment task was given to students with no robotics class. The programming performance of the two classes was compared. What validity was established?

C. An English teacher administered a spelling test to 15 students. The spelling test is composed of 10 items. Each item is encoded, wherein a correct answer is marked "1" and an incorrect answer is marked "0". The grade in English is also provided in the last column. The first five items are words with two stresses, and the next five are words with a single stress. The recording is indicated in the table. Your task is to determine whether the spelling test is reliable and valid using the data, computing the following: (1) split-half reliability, (2) Cronbach's alpha, (3) predictive validity with the English grade, (4) convergent validity between words with single and two stresses, and (5) the difficulty index of each item.

Student No. | Item 1 | Item 2 | Item 3 | Item 4 | Item 5 | Item 6 | Item 7 | Item 8 | Item 9 | Item 10 | English grade
1 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 80
2 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 81
3 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 83
4 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 85
5 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 84
6 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 89
7 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 87
8 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 88
9 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 79
10 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 90
11 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 90
12 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 87
13 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 88
14 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 88
15 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 85
D. Create a short test and report its validity and reliability. Select a grade level and subject. Choose one or two learning competencies and make at least 10-20 items for these two learning competencies. Consult your teacher on the items and the table of specifications.
1. Have your items checked by experts to see if they are aligned with the selected competencies.
2. Revise your items based on the reviews provided by the experts.
3. Make a layout of your test and administer it to about 100 students.
4. Encode your data; you may use an application to compute the needed statistical analyses.
5. Determine the following:

Split-half reliability
Cronbach's alpha
Item difficulty and discrimination

Write a report on your procedure. The report will contain the following parts:

Introduction. Give the purpose of the study. Describe what the test measures, its components, the competencies selected, and the kinds of items. Rationalize the need to determine the validity and reliability of the test.

Method. Describe the participants who took the test. Describe what the test measures, the number of items, the test format, and how content validity was established. Describe the procedure for how the data was collected or how the test was administered. Describe what statistical analysis was used.

Results. Present the results in a table and provide the necessary interpretations. Make sure to show the results of the split-half reliability, Cronbach's alpha, construct validity of the items with the underlying factors, convergent validity of the domains, and item difficulty and discrimination.

Discussion. Provide implications about the test's validity and reliability.
E. Multiple Choice

Choose the letter of the correct and best answer in every item.
1. Which is a way of establishing test reliability?
A. The test is examined if it is free from errors and properly administered.
B. Scores in a test with different versions are correlated to test if they are parallel.
C. The components or factors of the test contain items that are strongly uncorrelated.
D. Two or more measures are correlated to show the same characteristics of the examinee.
2. What is being established if items in the test are consistently answered by the students?
A. Internal consistency    C. Test-retest
B. Inter-rater reliability    D. Split-half
3. Which type of validity is established if the components or factors of a test are hypothesized to have a negative correlation?
A. Construct validity    C. Content validity
B. Predictive validity    D. Divergent validity
4. How do we determine if an item is easy or difficult?
A. An item is easy if the majority of students are not able to provide the correct answer. The item is easy if the majority of the students are able to answer correctly.
B. An item is difficult if the majority of students are not able to provide the correct answer. The item is easy if the majority of the students are able to answer correctly.
C. An item can be determined difficult if the examinees who scored high in the test can answer more of the items correctly than the examinees who got low scores. If not, the item is easy.
D. An item can be determined easy if the examinees who scored high in the test can answer more of the items correctly than the examinees who got low scores. If not, the item is difficult.
5. Which is used when the scores of the two variables measured by a test taken at two different times by the same participants are correlated?
A. Pearson r correlation    C. Significance of the correlation
B. Linear regression    D. Positive and negative correlation
F. Use the rubric to rate students' work on the previous task.

Introduction
Very Good: All the parts, such as the purpose, characteristics of the measure, and rationale, are indicated. The rationale justifies well the purpose of the study, and adequate details about the test are described and supported.
Good: One of the parts is not sufficiently explained. The rationale somehow justifies the purpose.
Fair: Two of the parts are not sufficiently explained. The rationale justifies the purpose; however, several details about the test are not indicated.
Needs Improvement: All parts of the report are not sufficiently explained. The connection between the purpose and rationale is difficult to follow; the features of the test are not described well.

Method
Very Good: All the parts, such as participants, test description, validity and reliability, procedure, and analysis, are present. All the parts describe sufficiently how the data was gathered and analysed.
Good: One of the parts is not sufficiently explained. One part lacks adequate information on how the data was gathered and analysed.
Fair: Two of the parts are not sufficiently explained. Two parts lack adequate information about the data gathering and analysis.
Needs Improvement: All parts of the method are not sufficiently explained. Two or more parts are missing.

Results
Very Good: The necessary tables and interpretations are all present. All the required analyses are complete and accurately interpreted.
Good: There is one table and interpretation missing. One table and/or interpretation does not have accurate content.
Fair: There are two tables and interpretations missing. Two tables and interpretations have inaccurate information.
Needs Improvement: There are more than two tables and interpretations missing. Three or more tables and interpretations have inaccurate information.

Discussion
Very Good: Implications of the test's validity and reliability are well explained with three or more supporting reviews. A detailed discussion of the results of reliability and validity is provided with explanation.
Good: Implications of the test's validity and reliability are explained with two supporting reviews. One of the results for reliability and validity is not provided with explanation.
Fair: Implications of the test's validity and reliability are explained with no supporting review. Two of the results for validity and reliability are not provided with explanation.
Needs Improvement: Implications of the test's validity and reliability are not explained, and there is no supporting review. Three or more of the validity and reliability results are not provided with explanation.
G. Summarize the results of your performance in doing the culminating task using the checklist below.

Learning Targets | Ready | Not yet ready
1. I can independently decide on the appropriate type of validity and reliability to be used for a test. | □ | □
2. I can analyse the results of the test data independently. | □ | □
3. I can interpret the results from the statistical analysis of the test. | □ | □
4. I can distinguish the use of each type of test reliability. | □ | □
5. I can distinguish the use of each type of test validity. | □ | □
6. I can explain the procedure for establishing test validity and reliability. | □ | □
References
David et al. (2020). Assessment in Learning 1. Manila: Rex Book Store.
De Guzman, E. and Adamos, J. (2015). Assessment of Learning 1. Quezon City: Adriana Publishing Co., Inc.
Exploring Reliability in Academic Achievement. Retrieved from https://chfasoa.uni.edu/reliabilityandvalidity.htm
Price et al. (2017). Reliability and Validity of Measurement. In Research Methods in Psychology (3rd ed.). California, USA: The Saylor Foundation. Retrieved from https://opentext.wsu.edu/carriecuttler/chapter/reliability-and-validity-of-measurement/
Professional Testing, Inc. (2020). Building High Quality Examination Programs. Retrieved from http://www.proftesting.com/test_topics/steps_9.php
The Graide Network, Inc. (2019). Importance of Validity and Reliability in Classroom Assessments. Retrieved from https://www.thegraidenetwork.com/blog-all/2018/8/1/the-two-keys-to-quality-testing-reliability-and-validity
CHAPTER 4 ORGANIZATION, UTILIZATION, AND COMMUNICATION OF TEST RESULTS Overview
As we have learned in previous lessons, tests used to measure learning or achievement are a form of assessment. They are undertaken to gather data about student learning. These test results can assist teachers and the school in making informed decisions to improve curriculum and instruction. Thus, collected information such as test scores has to be organized to appreciate its meaning. Usually, charts and tables are the common ways of presenting data. In addition, statistical measures are utilized to help interpret the data correctly.

Most often, students are interested to know, "What is my score in the test?" Nonetheless, the more critical question is, "What does one's score mean?" Test score interpretation is important not just for the students concerned but also for their parents. Knowing how a certain student performs with respect to the group or other members of the class is important. Similarly, it is significant to determine the intellectual characteristics of the students through their scores or grades.

Moreover, a student who received an overall score in the 60th percentile in mathematics would be placed in the average group. The learner's performance is as good as or better than that of 60% of the students in the group. A closer look into the sub-skill scores of the pupil can help teachers and parents in identifying problem areas. For instance, a child may be good in addition and subtraction but struggling in multiplication and division.

In some cases, assessment and grading are used interchangeably, but they are actually different. One difference is that assessment focuses on the learner: it gathers information about what the student knows and what he/she can do. Grading is a part of evaluation because it involves judgment made by the teacher. This chapter concludes with the grading system in the Philippines' K to 12 program. Other reporting systems shall likewise be introduced and discussed. A short segment on progress monitoring is included to provide pre-service teachers with an idea of how to track student progress through formative assessment.
Upon completion of the chapter, the students can demonstrate their knowledge, understanding and skills in organizing, presenting, utilizing and communicating the test results. Lesson 1: Organization of Test Data Using Tables and Graphs
Pre-discussion
At the end of this lesson, pre-service teachers are expected to present in an organized manner test data collected from an existing database or from pilot-tested materials in any of the assessment tools implemented in the earlier lessons. Your success in this performance task would be determined by your ability to organize ungrouped raw test results through tables, use frequency distributions for presenting test data, describe the characteristics of frequency polygons, histograms, and bar graphs and their interpretation, interpret test data presented through tables and graphs, determine which types of tables and graphs are appropriate for a given data set, and use technology like statistical software in organizing and interpreting test data.

What to Expect?
At the end of the lesson, the students can:
1. organize the raw data from a test;
2. construct a frequency distribution;
3. acquire knowledge of the basic rules in preparing tables and graphs;
4. summarize test data using an appropriate table or graph;
5. use Microsoft Excel to construct appropriate graphs for a data set;
6. interpret the graph of a frequency and cumulative frequency distribution; and
7. characterize a frequency distribution graph in terms of skewness and kurtosis.

Frequency Distribution
In statistics, a frequency distribution is a list, table, or graph that displays the frequency of various outcomes in a sample. Each entry in the table contains the frequency or count of the occurrences of values within a particular group or interval. Here is an example of a univariate (single-variable) frequency table. The frequency of each response to a survey question is depicted.
Degree of Agreement | Frequency
Strongly agree | 30
Somewhat agree | 15
Not sure | 20
Somewhat disagree | 20
Strongly disagree | 15
Total | 100
A different tabulation scheme aggregates values into bins such that each bin encompasses a range of values. For example, the heights of the students in a class could be organized into the following frequency table.

Height range of students | Frequency
less than 5.0 feet | 45
5.0 – 5.5 feet | 35
5.5 – 6.0 feet | 20
6.0 – 6.5 feet | 20
Total | 120
In order to make the data collected from tests and measurements meaningful, they must be arranged and classified systematically. Therefore, we have to organize the data into groups or classes on the basis of certain characteristics. This principle of classifying data into groups is called frequency distribution. In this process, we combine the scores into a relatively small number of class intervals and then indicate the number of cases in each class.

Constructing a Frequency Distribution
Below are the suggested steps to draw up a frequency distribution:

Step 1:

Find the highest score and the lowest score. Then, determine the Range, which is the highest score minus the lowest score.
Step 2:
The second step is to decide the number and size of the groupings to be used. In this process, the first task is to decide the size of the class interval. According to H.E. Garrett (1985:4), the most "commonly used grouping intervals are 3, 5, 10 units in length." The size should be such that the number of classes will be within 5 to 10 classes. This can be determined approximately by dividing the range by the grouping interval tentatively chosen.
Step 3:
Prepare the class intervals. It is natural to start the intervals with their lowest scores at multiples of the size of the intervals. For example, when the interval is 3, it has to start with 9, 12, 15, 18, etc. Also, when the interval is 5, it can start with 5, 10, 15, 20, etc. The class intervals can be expressed in three different ways:

First Type:

The first type of class intervals includes all scores. For example:
10 - 15 includes scores of 10, 11, 12, 13 and 14 but not 15
15 - 20 includes scores of 15, 16, 17, 18 and 19 but not 20
20 - 25 includes scores of 20, 21, 22, 23 and 24 but not 25

In this type of classification, the upper limit of each class is repeated as the lower limit of the next class. This repetition can be avoided in the following type.
In this type the class intervals are arranged in the following way:
10 - 14 includes scores of 10, 11, 12, 13 and 14
15 - 19 includes scores of 15, 16, 17, 18 and 19
20 - 24 includes scores of 20, 21, 22, 23 and 24
Here, there is no question of confusion about the scores in the higher and lower limits as the scores are not repeated. Third Type:
Sometimes, we are confused about the exact limits of class intervals because very often the computations need to work with exact limits. A score of 10 actually extends from 9.5 to 10.5, and 11 from 10.5 to 11.5. Thus, the interval 10 to 14 actually contains scores from 9.5 to 14.5. The same principle holds no matter what the size of the interval is or where it begins in terms of a given score. In the third type of classification, we use the real lower and upper limits:
9.5 - 14.5
14.5 - 19.5
19.5 - 24.5 and so on.
Step 4:
Once we have adopted a set of class intervals, we need to list the scores in their respective class intervals. Then, we have to put tallies in their proper intervals. (See illustration in Table 1.)

Step 5:

Make a column to the right of the tallies headed "f" (frequency). Write the total number of tallies for each class interval under column f. The sum of the f column will be the total number of cases, "N".

The next matrix contains the scores of students in mathematics. Tabulate the scores into a frequency distribution using a class interval of 5 units.
Solution:
Table 1. Frequency distribution
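Because the original score matrix is not reproduced here, the sketch below uses a hypothetical list of mathematics scores to show how the tallying in Steps 3 to 5 could be automated; only the score list is invented:

# Hypothetical mathematics scores (the actual matrix is not shown above)
scores = [23, 41, 35, 28, 47, 32, 25, 39, 44, 30, 27, 36]

interval_size = 5
start = (min(scores) // interval_size) * interval_size   # lowest multiple of 5

freq = {}
lower = start
while lower <= max(scores):
    upper = lower + interval_size - 1                    # second type: 20-24, 25-29, ...
    freq[(lower, upper)] = sum(lower <= s <= upper for s in scores)
    lower += interval_size

n = sum(freq.values())                                   # N, the total number of cases
for (lo, hi), f in sorted(freq.items(), reverse=True):
    print(f"{lo}-{hi}: {'/' * f} ({f})")                 # tally marks per interval
print("N =", n)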
Cumulative Frequency Distribution
Sometimes, our concern is with the number or percentage of values greater than or less than a specified value. We can get this by successively adding the individual frequencies. The new frequencies obtained by this process of adding the individual frequencies of class intervals are called cumulative frequencies. If the frequencies of the individual class intervals are denoted as f1, f2, f3, … fk, then the cumulative frequencies will be f1, f1 + f2, f1 + f2 + f3, f1 + f2 + f3 + f4, and so on. An illustration of determining cumulative frequencies is given in Table 2.
Table 2. Cumulative Frequency and Class Midpoint (n=60)

Class Intervals (CI)    f    Midpoint (M)    cf>    cf<     c%>     c%<
90 - 94                 2        92            2     60      3%    100%
85 - 89                 2        87            4     58      7%     97%
80 - 84                 4        82            8     56     13%     93%
75 - 79                 8        77           16     52     27%     87%
70 - 74                 7        72           23     44     38%     73%
65 - 69                10        67           33     37     55%     62%
60 - 64                 9        62           42     27     70%     45%
55 - 59                 6        57           48     18     80%     30%
50 - 54                 5        52           53     12     88%     20%
45 - 49                 3        47           56      7     93%     12%
40 - 44                 2        42           58      4     97%      7%
35 - 39                 2        37           60      2    100%      3%

Note: cf> and c%> give the cumulative frequency and percentage of scores at or above the lower limit of each interval (“greater than”); cf< and c%< give those at or below the upper limit (“less than”).
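The cumulative columns above can be checked with a short Python sketch; nothing is assumed beyond the f column of Table 2:

from itertools import accumulate

# Frequencies from Table 2, listed from the lowest interval (35-39) upward
f = [2, 2, 3, 5, 6, 9, 10, 7, 8, 4, 2, 2]
n = sum(f)                                    # N = 60

cf_less = list(accumulate(f))                 # cf<: f1, f1+f2, f1+f2+f3, ...
cf_greater = list(accumulate(f[::-1]))[::-1]  # cf>: accumulated from the top interval down

print(cf_less)      # [2, 4, 7, 12, 18, 27, 37, 44, 52, 56, 58, 60]
print(cf_greater)   # [60, 58, 56, 53, 48, 42, 33, 23, 16, 8, 4, 2]
print([round(100 * c / n) for c in cf_less])  # cumulative percentages for cf<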
Determining the Midpoint of the Class Intervals
In a given class interval, the scores are spread over the entire interval. But when we want to represent all the scores within a given interval by some single value, we take the midpoint as the representative score. For example, from Table 2, all 10 scores of the class interval 65 to 69 are represented by the single value 67, while 35 to 39 is represented by 37. We can also take the same value when the other two types of class intervals are used. Below is the formula used to find the midpoint:
Midpoint = (Lower Limit + Upper Limit) / 2

Hence, the midpoint of 65 to 69 is (65 + 69) / 2 = 134 / 2 = 67. Other class midpoints can be derived in the same way.
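As a brief sketch, the same computation can be run in Python over all the intervals of Table 2 (the loop bounds simply mirror the table):

# Midpoints for the width-5 intervals of Table 2 (35-39 up to 90-94)
for lower in range(35, 95, 5):
    upper = lower + 4
    midpoint = (lower + upper) // 2   # e.g., (65 + 69) // 2 = 67
    print(f"{lower}-{upper}: {midpoint}")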
Graphic Representation of Data
Most of us are familiar with the saying, “A picture is worth a thousand words.” By the same token, a graph can be worth a hundred or a thousand numbers. The use of tables may not be enough to give a clear picture of the properties of a group of test scores. If numbers presented in tables are transformed into visual models, then the reader becomes more interested in reading the material. Consequently, understanding of the information and problems for discussion is facilitated. Graphs are very useful for the comparison of test results of different groups of examinees.
The graphic method is mainly used to give a simple, permanent idea and to emphasize the relative aspect of data. Graphic presentation is highly desired when a fact at one time or over a period of time has to be described. It must be stressed that tabulation of statistical data is necessary, while graphic presentation is not. Data is plotted on a graph from a table. This means that the graphic form cannot replace the tabular form of data; it can only supplement it. Graphic presentation has a number of advantages, some of which are enumerated below:
1. Graphs are visual aids which give a bird’s eye view of a given set of numerical data. They present the data in a simple, readily comprehensible form.
2. Graphs are generally more attractive, fascinating and impressive than sets of numerical data. They are more appealing to the eye and leave a more lasting impression on the mind compared to dry and uninteresting statistical figures. Even a layman, who has no knowledge of statistics, can understand them easily.
3. They are more catching and as such are extensively used to present statistical figures and facts in most exhibitions, trade or industrial fairs, public functions, statistical reports, etc. Graphs have universal applicability.
4. They register a meaningful impression on the mind almost before we think. They also save a lot of time, as very little effort is required to grasp them and draw meaningful inferences from them.
5. Another advantage of the graphic form of data is that it makes the principal characteristics of groups and series visible at a glance. If the data is not presented in graphic form, the viewer has to study the whole details of a particular phenomenon, and this takes a lot of time. When data is presented in graphic form, we can get the information without going into many details.
6. If the relationship between two variables is to be studied, the graphic form of data is a useful device. Graphs help us in studying the relations of one part to the other and to the whole set of data.
7. The graphic form of data is also a very useful device to suggest the direction of investigations. Investigations cannot be conducted without regard to the desired aim, and the graphic form helps in fulfilling that aim by suggesting the direction of investigations.
8. In short, the graphic form of statistical data converts complex and huge data into a readily intelligible form and introduces an element of simplicity into it.

Basic Rules for the Preparation of Tables and Graphs

Ideally, every table should:
1. Be self-explanatory;
2. Present values with the same number of decimal places in all its cells (standardization);
3. Include a title informing what is being described and where, as well as the number of observations (N) and when the data were collected;
4. Have a structure formed by three horizontal lines, defining the table heading and the end of the table at its lower border;
5. Not have vertical lines at its lateral borders;
6. Provide additional information in the table footer, when needed;
7. Be inserted into a document only after being mentioned in the text; and
8. Be numbered by Arabic numerals.

Similarly to tables, graphs should:
1. Include, below the figure, a title providing all relevant information;
2. Be referred to as figures in the text;
3. Identify figure axes by the variables under analysis;
4. Quote the source which provided the data, if required;
5. Demonstrate the scale being used; and
6. Be self-explanatory.

The graph’s vertical axis should always start with zero. A usual type of distortion is starting this axis with values higher than zero. Whenever this happens, differences between variables are overestimated, as can be seen
in Figure 1.
Figure 1. Students’ Math and English Grades
Figure 1 shows how graphs in which the Y-axis does not start with zero tend to overestimate the differences under analysis. On the left is a graph whose Y-axis does not start with zero, and on the right is a graph reproducing the same data but with the Y-axis starting with zero. Other graphic presentations are hereby illustrated to interpret the test data clearly.

1. Line graph (polygon)
This is also used for quantitative data, and it is one of the most commonly used methods of presenting test scores. It is the line graph or frequency polygon. It is very similar to a histogram, but instead of bars, it uses lines to compare sets of test data on the same axes. In a frequency polygon, the lines run across the scores on the horizontal axis. Each point in the frequency polygon represents two numbers: the score or class midpoint on the horizontal axis and the frequency of that class interval on the vertical axis. Frequency polygons can also be superimposed to compare several frequency distributions, which cannot be done with histograms.

You can construct a frequency polygon manually from the histogram in Figure 2 by following these simple steps:
a. Locate the midpoint on the top of each bar. Bear in mind that the height of each bar represents the frequency in each class interval, and the width of the bar is the class interval. As such, the point in the middle of each bar is actually the midpoint of that class interval.
b. Draw a line to connect all the midpoints in consecutive order.
c. The line graph is an estimate of the frequency polygon of the test scores.
Figure 2. Frequency Polygon
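As a sketch, assuming matplotlib is available, a frequency polygon like Figure 2 can be drawn from the midpoints and frequencies of Table 2:

import matplotlib.pyplot as plt

# Midpoints and frequencies from Table 2, lowest interval first
midpoints = [37, 42, 47, 52, 57, 62, 67, 72, 77, 82, 87, 92]
freq = [2, 2, 3, 5, 6, 9, 10, 7, 8, 4, 2, 2]

plt.plot(midpoints, freq, marker="o")  # lines joining the class midpoints
plt.xlabel("Class midpoint")
plt.ylabel("Frequency")
plt.title("Frequency Polygon")
plt.ylim(bottom=0)                     # the vertical axis starts at zero
plt.show()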
2. Cumulative Frequency Polygon
This graph is quite different from a frequency polygon because the cumulative frequencies are plotted. In addition, each point is plotted above the exact limits of the interval. As such, a cumulative polygon gives a picture of the number of observations that fall below or above a certain score instead of the frequency within a class interval. In Table 2, the cumulative frequencies (greater than and less than) are in the 4th and 5th columns; in the 6th and 7th columns are their conversions to cumulative percentages. A cumulative percentage polygon is more useful when there is more than one frequency distribution with unequal numbers of observations. Thus, consider the class interval of 70-74, where cf> and cf< are 23 and 44, respectively. This means that 23 (or 38%) students have scores of 70 and above, while 44 (or 73%) students have scores of 74 and below. (Please see illustrations in Figures 3 and 4.)
Figure 3. Cumulative Frequency Polygon (cf>)
Figure 4. Cumulative Frequency Polygon (cf<)
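A similar sketch, again assuming matplotlib, plots the two cumulative frequency curves of Figures 3 and 4 against the exact class limits described earlier:

import matplotlib.pyplot as plt

# Exact (real) limits and cumulative frequencies from Table 2, lowest interval first
upper_limits = [39.5, 44.5, 49.5, 54.5, 59.5, 64.5, 69.5, 74.5, 79.5, 84.5, 89.5, 94.5]
lower_limits = [u - 5 for u in upper_limits]               # 34.5, 39.5, ..., 89.5
cf_less = [2, 4, 7, 12, 18, 27, 37, 44, 52, 56, 58, 60]    # cf< from Table 2
cf_greater = [60, 58, 56, 53, 48, 42, 33, 23, 16, 8, 4, 2] # cf> from Table 2

plt.plot(upper_limits, cf_less, marker="o", label="cf< (at exact upper limits)")
plt.plot(lower_limits, cf_greater, marker="s", label="cf> (at exact lower limits)")
plt.xlabel("Exact class limits")
plt.ylabel("Cumulative frequency")
plt.legend()
plt.show()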