Psychological Testing 1
HISTORY OF PSYCHOLOGICAL TESTING
Rise of the Civil Service Examination (c. 2200 B.C., China) – testing was first introduced in China, where examinations were used to determine one's fitness for government positions. Areas tested: music, archery, horsemanship, writing, arithmetic, and geography.
RISE OF BRASS INSTRUMENTS
1862 – Wilhelm Wundt used a calibrated pendulum, or "thought meter" (a brass instrument), to measure the "speed of thought."
1884 – Francis Galton (the "Father of Mental Tests") administered the first anthropometric and psychometric test battery to thousands of visitors at the International Health Exhibition in London. Anthropometric and psychometric tests were used to measure individual differences.
1890 – James McKeen Cattell used the term "mental test" in a publication.
1892 – Emil Kraepelin, a psychiatrist who studied with Wundt, published his work on a test involving word association.
USE OF TESTS WITH THE MENTALLY RETARDED & THE ADVENT OF IQ
1904 – Charles Spearman proposed that intelligence consists of a single general factor, g, plus numerous specific factors (s1, s2, s3, ...), thereby laying the foundation for the concept of test reliability and, later, for factor analysis. Around the same time, Karl Pearson formulated the theory of correlation.
1905 – Alfred Binet and Theodore Simon invented the first modern intelligence test, the Binet-Simon Scale (later revised as the Stanford-Binet Intelligence Test).
1908 – Henry H. Goddard translated the Binet-Simon Scale from French into English.
1913 – John Watson published "Psychology as the Behaviorist Views It," making behavioral observation a key tool in assessment.
1914 – Stern and Terman introduced the ratio IQ formula: IQ = MA/CA x 100. Lewis Terman revised the Binet-Simon Scale and published the Stanford-Binet Intelligence Test; revisions followed in 1937, 1960, and 1986.
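For quick reference, the ratio IQ formula above as a tiny worked example (a minimal sketch in Python; the function name and sample ages are illustrative, not from the notes):

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ as defined above: IQ = MA / CA x 100."""
    return mental_age / chronological_age * 100

# Hypothetical examples: a child whose mental age outpaces the
# chronological age scores above 100, and vice versa.
print(ratio_iq(10, 8))   # 125.0
print(ratio_iq(8, 10))   # 80.0
```

A mental age equal to the chronological age always yields an IQ of exactly 100.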
INVENTION OF NONVERBAL TESTS, USE OF IQ/GROUP TESTS IN THE MILITARY
1917 – Robert Yerkes spearheaded the Army Alpha and Army Beta examinations used for testing WWI recruits.
Army Alpha – for native-born recruits who were fluent in English and literate (a written test).
Army Beta – for immigrant and illiterate recruits (an abstract/nonverbal test).
Robert Woodworth developed the Personal Data Sheet, the first major personality test.
RISE OF PROJECTIVE TESTS, EDUCATIONAL TESTS, VOCATIONAL TESTS, AND OTHER LANDMARKS
1920 – The Rorschach Inkblot Test is published by Hermann Rorschach.
1921 – The Psychological Corporation, the first major test publishing company, is founded by Cattell, Thorndike, and Woodworth with the goal of the "useful application of psychology."
1926 – Florence Goodenough published the Draw-A-Man Test.
1927 – Kurt Goldstein developed neurodiagnostic (neuropsychological) tests with soldiers who suffered brain injury during WWI.
1931 – L.L. Thurstone introduced Multiple Factor Analysis.
1938 – Lauretta Bender published the Bender Gestalt Test (a neuropsychological test that uses 9 stimulus cards).
1939 – Wechsler-Bellevue Intelligence Scale (David Wechsler), later revised as the Wechsler Adult Intelligence Scale (WAIS).
1942 – Minnesota Multiphasic Personality Inventory (MMPI) – a clinical, paper-and-pencil test used to determine whether one has a mental disorder.
1943 – Henry Murray – his theory revolves around the needs of a person. He made the Thematic Apperception Test (TAT), a test that involves storytelling.
1948 – The Office of Strategic Services (OSS) used situational techniques for the selection of officers.
Situational tests/techniques – officers were given different situations and asked to enact what they would do; role-playing.
1949 – Wechsler Intelligence Scale for Children (WISC)
Rotter Incomplete Sentences Blank
Psychological Testing – the process of measuring psychology-related variables by means of devices or procedures designed to obtain a sample of behavior (Cohen & Swerdlik, 2010).
Variable – what we want to measure.
Psychological Assessment – the gathering and integration of psychology-related data for the purpose of making a psychological evaluation, accomplished through the use of tools (such as tests, interviews, case studies, and behavioral observation).
- A problem-solving process that begins with a referral source and culminates in a psychological report.
Assessment – general and broader than testing; it involves a complex/comprehensive approach and judgment, and entails evaluation through the use of instruments/tools.
Testing – the actual administering of psychological tests. Testing is a part of assessment.
Psychometrician/Psychometrist – a specialist in psychology and education who develops and evaluates psychological tests; also known as a "test user."
Test taker – one who takes the test.
Test user – one who develops tests and administers/evaluates test data.
Tests – measuring devices or procedures.
Features of a test:
- Content
- Format
- Administration Procedures – usually found in the test manual
- Scoring & Interpretation Procedures
- Technical Quality & Utility
Assessment Tools
Interview – communication as a method of data gathering (e.g., face-to-face, telephone, panel/board). It requires note-taking as to content, follows a guide, and is formal in nature.
Transcription – writing the client's responses word for word.
Portfolio Assessment – samples of one's ability and accomplishments in different media used as a tool for evaluation. It consists of the client's work samples (canvas, paper, film, video, audio, etc.).
Case History Data – records, transcripts, and other accounts in written, pictorial, or other form; archival information and other data relevant to an assessee (such as medical records, TOR, etc.).
Behavioral Observation – monitoring the actions of others or oneself by visual or electronic means while recording quantitative and/or qualitative information regarding the assessee's actions. Two types of behavioral observation: 1. Naturalistic 2. Controlled/Laboratory environment.
Role-Play Tests – assessees are directed to act as if they were in a particular situation and are evaluated through their expressed thoughts, behaviors, abilities, and other variables. Also called "situational performance tests."
Computers – used in test administration, scoring, and interpretation: Computer-Assisted Psychological Assessment (CAPA) and Computer Adaptive Testing (CAT).
USES OF TESTING/ASSESSMENT
1. Selection/Screening
2. Placement
3. Classification – categorization of people according to their skills
4. Description/Profiling
5. Decision-making
6. Tracking/Monitoring
7. Counseling/Self-Knowledge
8. Diagnosis/Prognosis
9. Treatment Planning/Intervention
10. Research & Evaluation
11. Program Development
12. Case Conferencing – the professionals talk about the client's case
13. Expert Testimony/Forensic Use
14. Credentialing/Certification
MAIN TYPES OF PSYCHOLOGICAL TESTS
1. Intelligence Tests – measure an individual's ability in relatively global areas and thereby help determine potential for scholastic work or certain occupations.
2. Aptitude Tests – measure the capability for a relatively specific task or type of skill; a narrower form of ability testing.
3. Achievement Tests – measure a person's degree of learning, success, or accomplishment in a subject or task (e.g., board exams).
4. Personality Tests – measure the traits, qualities, or behaviors that determine our individuality (includes checklists, inventories, and projective techniques).
5. Career/Vocational Tests – measure an individual's preference for a certain activity or topic and thereby help determine occupational choice.
6. Behavioral Procedures – objectively describe and count the frequency of a behavior, identifying its antecedents and consequences. Includes checklists, rating scales, interviews, and structured observations.
7. Neuropsychological Tests – measure cognitive, sensory, perceptual, and motor performance to determine the extent, locus, and behavioral consequences of brain damage.
TESTING PROCESS
Standardized Procedures – uniformity of procedures; usually found in the test manual.
WHY CONTROL THE USE OF PSYCHOLOGICAL TESTS?
- To ensure that the test is used by a qualified examiner
- To prevent general familiarity with the test content, which would invalidate the test
LEVELS OF TEST USERS
Level A – paper-and-pencil tests; can be utilized by psychology and education graduates.
Level B – for degree holders (MS/MA) with exposure to assessment and a background in psychometrics and statistics; covers Level A tests plus paper-and-pencil personality tests.
Level C – projective tests; MA/PhD with training and background in psychological testing and assessment.
GUIDELINES ON TEST ADMINISTRATION 1. Advance preparation of the examiner (reading the manual before the test) 2. Physical conditions 3. Rapport and test-taker orientation 4. Sensitivity to disabilities 5. Giving directions/manner of administration 6. Proper Timing
EXAMINER & SITUATIONAL VARIABLES 1. Examiner Personality 2. Rapport 3. Test Anxiety 4. Motivation 5. Test taker’s activities immediately preceding the test 6. Faking/Guessing 7. Test Sophistication 8. Training/Coaching
SOURCES OF INFORMATION ON TESTS 1. Reference Books 2. Publisher's Catalogues 3. Journals 4. Test Manuals
ETHICS
1. Right of informed consent – Test takers have the right to know why they are being evaluated, how the test data will be used, and what information will be released to whom. The consent must be in written form.
2. Right to be informed of test findings – Test takers are also entitled to know the recommendations. Providing feedback can also be therapeutic (oral feedback).
3. Right to privacy and confidentiality – withholding information; keeping the client's disclosures confidential.
4. Right to the least stigmatizing label – the least stigmatizing labels should always be assigned when reporting test results. Psychologists must take reasonable steps to avoid harming their clients/patients and to minimize harm.
OTHER APPLICABLE ETHICAL CONSIDERATIONS
1. Bases for assessment – Psychologists' opinions must be based on actual data. All things written in the report should be data-driven.
2. Use of assessments – Assessment instruments must have validity and reliability, and assessment methods must be appropriate to the client's language.
3. Interpreting assessment results – Take into account the purpose of the assessment.
4. Assessment by unqualified persons
5. Maintaining test security – test materials should be secured.
Test data – refers to raw and scaled scores.
PSYCHOMETRIC PROPERTIES OF PSYCHOLOGICAL TESTS
Reliable test = consistent test (dependable)
Properties of a "good" test:
- Clear instructions for administration, scoring, and interpretation
- Economy of time and money in administration, scoring, and interpretation
- Measures what it purports to measure
- Psychometric soundness, i.e., reliability and validity
*A test is considered unethical to use if it is not reliable and valid. A test with a test manual is expected to document its reliability and validity.
ASSUMPTIONS UNDER THE CLASSICAL TEST THEORY
A score on a test is presumed to reflect not only the test taker’s true score on the attribute being measured but also the error. Error – refers to the component of the observed test score that does not have to do with the test taker’s ability.
X = T + E
- the formula representing the presence of error
X = observed score (raw score); T = true score; E = error (external/environmental factors)
Environmental factors – affect the test taker's performance (e.g., lighting, humidity, noise, etc.)
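To make the X = T + E assumption concrete, here is a minimal simulation sketch (the true score of 100, the error spread, and the function name are illustrative assumptions, not from the notes):

```python
import random

def observed_score(true_score: float, error_sd: float) -> float:
    """Classical test theory: X = T + E, with E simulated as random noise
    (standing in for lighting, humidity, noise, and similar factors)."""
    return true_score + random.gauss(0, error_sd)

random.seed(1)
# The same test taker (T = 100) tested on five occasions gets five
# different observed scores purely because of error:
print([round(observed_score(100, 5), 1) for _ in range(5)])
```

Because E is assumed to be random with a mean of zero, the average of many observed scores drifts toward the true score.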
SOURCES OF ERROR VARIANCE
1. Test Construction – the way the test was developed; involves item/content sampling.
2. Test Administration – the way the test was delivered/handled; involves the test environment, test-taker variables, and examiner-related variables. (The test taker must be relaxed before taking the test.)
3. Test Scoring & Interpretation – always refer to the test manual.
4. Other potential sources of error
Reliability – dependability or consistency. A test is reliable if the measuring tool is consistent and it generates consistent test scores.
Reliability Coefficient – an index of reliability that shows the degree of error present. High reliability coefficient = good test.
Reliability Estimates:
1. Test-Retest Reliability (Coefficient of Stability) – an estimate of reliability obtained by correlating pairs of scores from the same people on two different administrations of the same test.
CONDITIONS:
- Appropriate when evaluating the reliability of a test that purports to measure something relatively stable over time, such as a personality test.
- The passage of time can be a source of error variance. *Avoid a very long interval between the two administrations; the ideal interval is 1-3 weeks.
2. Parallel Forms/Alternate Forms Reliability (Coefficient of Equivalence)
Parallel Forms – scores obtained on each parallel form of the test correlate equally; 2 testing sessions. The forms should have the same validity (e.g., both IQ tests) and the same test length.
Alternate Forms – scores obtained on different versions of the test are correlated; 1 or 2 testing sessions. The forms are equivalent in content and level of difficulty; the same type of test but by different authors.
3. Split-Half Reliability – an estimate obtained by correlating scores on two equivalent halves of a single test administered once. Three steps in computing the estimate:
a. Divide the test into equivalent halves.
b. Calculate a Pearson r between scores on the two halves of the test.
c. Adjust the half-test reliability using the Spearman-Brown formula.
The Spearman-Brown (SB) formula estimates internal-consistency reliability from the correlation of the two halves: r_SB = 2r / (1 + r), where r is the half-test correlation (see the sketch after this list).
4. Inter-item/Internal Consistency – refers to the degree of correlation among all items on a scale. An index of inter-item consistency is useful in assessing test homogeneity.
Homogeneity – refers to the degree to which the test measures a single factor (unifactorial).
Heterogeneity – refers to the degree to which a test measures different items/factors.
2 formulas that measure test homogeneity & heterogeneity:
- KR-20 / KR-21 (Kuder-Richardson)
- Cronbach's α (Coefficient Alpha; Lee Cronbach)
5. Interscorer/Interrater Reliability – the degree of agreement or consistency between two or more scorers/judges/raters with regard to a particular measure.
*The way to determine the degree of consistency among scorers is to calculate a coefficient of correlation (Pearson r).
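As a concrete illustration of two of the estimates above, here is a minimal sketch of split-half reliability with the Spearman-Brown correction and of coefficient alpha, in plain Python (the 0/1 item matrix and function names are hypothetical):

```python
import statistics as st

def pearson_r(x, y):
    """Pearson correlation between two lists of scores."""
    mx, my = st.mean(x), st.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / ((sum((a - mx) ** 2 for a in x) ** 0.5) *
                  (sum((b - my) ** 2 for b in y) ** 0.5))

def split_half_sb(items):
    """Split-half reliability with the Spearman-Brown correction.
    `items` is one list of item scores per test taker; the test is
    split into odd-numbered and even-numbered items."""
    odd = [sum(p[0::2]) for p in items]
    even = [sum(p[1::2]) for p in items]
    r_half = pearson_r(odd, even)
    return 2 * r_half / (1 + r_half)  # Spearman-Brown: 2r / (1 + r)

def cronbach_alpha(items):
    """Coefficient alpha: (k/(k-1)) * (1 - sum(item variances)/total variance)."""
    k = len(items[0])
    item_vars = [st.pvariance([p[i] for p in items]) for i in range(k)]
    total_var = st.pvariance([sum(p) for p in items])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Six hypothetical test takers answering a 6-item scale (0/1 scoring):
data = [[1, 1, 1, 1, 0, 1],
        [1, 0, 1, 1, 1, 1],
        [0, 1, 0, 1, 0, 0],
        [1, 1, 1, 0, 1, 1],
        [0, 0, 0, 1, 0, 0],
        [0, 0, 1, 0, 0, 0]]
print(round(split_half_sb(data), 2))
print(round(cronbach_alpha(data), 2))
```

Either coefficient would then be judged against the acceptability bands given below.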
SUMMARY OF RELIABILITY TYPES

TYPE OF RELIABILITY             | NO. OF TESTING SESSIONS | NO. OF TEST FORMS | SOURCES OF ERROR VARIANCE                    | STATISTICAL PROCEDURES
Test-Retest                     | 2                       | 1                 | Administration; changes over time            | Pearson r / Spearman rho
Parallel Form/Alternate Forms   | 1 (A) or 2 (P)          | 2                 | Test construction or administration          | Pearson r / Spearman rho
Split-Half                      | 1                       | 1                 | Test construction; nature of split           | Pearson r / SB formula
Inter-item/Internal Consistency | 1                       | 1                 | Item sampling; test heterogeneity            | Coefficient alpha
Interscorer/Interrater          | 1                       | 1                 | Scoring & interpretation; scorer differences | Pearson r / Spearman rho
INTERPRETING A RELIABILITY COEFFICIENT
.90 - .95 = High acceptability
.80 - .89 = Moderate acceptability
.70 - .79 & below = Unacceptable
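A tiny helper mirroring these bands (a sketch; the cutoffs are from the notes, the function name is illustrative):

```python
def interpret_reliability(r: float) -> str:
    """Map a reliability coefficient to the acceptability bands above."""
    if r >= 0.90:
        return "High acceptability"
    if r >= 0.80:
        return "Moderate acceptability"
    return "Unacceptable"

print(interpret_reliability(0.93))  # High acceptability
print(interpret_reliability(0.84))  # Moderate acceptability
print(interpret_reliability(0.72))  # Unacceptable
```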
Validity – an important attribute of a test; it refers to "the degree to which a test measures what it claims to measure" (Gregory, 2000). A test is valid to the extent that inferences made from it are appropriate, meaningful, and useful.
Validity also concerns what the test measures and what a test score really means. High validity coefficient = valid test.
Major Categories of Validity: 1. Content Validity – involves essentially the systematic examination of the test content to determine whether it covers a representative sample of the behavior domain to be measured. It is done when test items are judged by experts in the field to be suitable for their purpose. The test should be checked/reviewed by professionals to know if it is valid.
Face Validity – not validity in the technical sense; it refers not to what the test actually measures but to what it appears to measure.
- A judgment based on the test's appearance (the test did not undergo technical examination). Ex: quizzes/tests in magazines.
2. Criterion Validity – demonstrated when a test is shown to be effective in estimating an examinee's performance on some outcome measure (the criterion).
Concurrent Validity – measured by the test's correlations with other similar tests taken at the same time (e.g., an IQ test correlated with another IQ test).
Predictive Validity – test scores are used to estimate outcome measures obtained at a later date.
3. Construct Validity – a test designed to measure a construct must estimate the existence of an inferred, underlying characteristic based on a limited sample of behavior.
Construct – a theoretical, intangible quality or trait in which individuals differ. (e.g IQ, personality, aptitude, self-esteem, etc.) Approaches to construct validity:
- Analysis to determine whether the test items or subtests are homogeneous and therefore measure a single construct.
- Study of developmental changes to determine whether they are consistent with the theory of the construct.
- Research to ascertain whether group differences on test scores are theory-consistent.
- Analysis to determine whether intervention effects on test scores are theory-consistent.
- Correlation of the test with other related and unrelated tests and measures.
- Factor analysis of test scores in relation to other sources of information.
Factor analysis – a statistical procedure used to check a test's construct validity.
Types of Construct Validity:
a. Convergent Validity – the test correlates highly with other variables or tests with which it shares an overlap of constructs.
b. Discriminant Validity – the test does not correlate with variables or tests from which it should differ.
NORMS AND TEST STANDARDIZATION
Norms – based upon the distribution of scores obtained by a representative sample of examinees.
- Scores on psychological tests are interpreted with reference to norms.
Norm-referenced test – a test that follows a norm (e.g., psychological tests).
Criterion-referenced test – the score is compared against a certain criterion (e.g., classroom tests).
Norm group – a sample of examinees who are representative of the population for whom the test is intended.
Sample – representatives of the whole population.
Varieties of norms:
- Percentile rank
- Age equivalents
- Grade equivalents
- Standard scores
*Norms indicate the examinee’s standing on the test relative to the performance of other persons of the same age, grade, sex, etc.
Norm-referenced tests – a method of evaluation and a way of deriving meaning from test scores by evaluating an individual test taker's score and comparing it to the scores of a group of test takers on the same test.
RAW SCORE TRANSFORMATIONS (a conversion sketch follows this list)
1. Percentiles & Percentile Ranks – express the percentage of people in the standardization sample who scored below a specific raw score.
- Indicates only how an examinee compares to the standardization sample; it does not convey the percentage of questions answered correctly.
- Ranks within a group of 100 representative subjects.
2. Standard Scores (z scores) – use the standard deviation of the total distribution of raw scores as the unit of measurement.
- A z score expresses the distance from the mean in standard deviation units.
- It always retains the relative magnitude of differences found in the original raw scores.
3. T scores – standardized scores with a mean of 50 and a standard deviation of 10. T scores eliminate fractions and negative signs; T-score scales are common among personality tests.
4. Stanine (standard nine) scale – all raw scores are converted to a single-digit system of scores ranging from 1 to 9. Mean = always 5.
Variations of the stanine:
- Sten scale – with 5 units above and 5 below the mean (expressed in 10 units)
- C scale – consisting of 11 units
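A minimal sketch of the transformations above, assuming the normative sample is available as a list of raw scores (the data and function names are illustrative; the stanine is approximated here as 5 + 2z, rounded and clamped to 1-9):

```python
import statistics as st

def percentile_rank(raw: float, sample: list) -> float:
    """Percentage of the standardization sample scoring below `raw`."""
    below = sum(1 for s in sample if s < raw)
    return 100 * below / len(sample)

def z_score(raw: float, sample: list) -> float:
    """Distance from the sample mean in standard deviation units."""
    return (raw - st.mean(sample)) / st.pstdev(sample)

def t_score(raw: float, sample: list) -> float:
    """T = 50 + 10z: mean 50, SD 10, no fractions or negative signs
    for scores within the usual range."""
    return 50 + 10 * z_score(raw, sample)

def stanine(raw: float, sample: list) -> int:
    """Single-digit 1-9 scale with a mean of 5 (approximated as 5 + 2z)."""
    return max(1, min(9, round(5 + 2 * z_score(raw, sample))))

norm_sample = [42, 47, 50, 53, 55, 58, 60, 63, 67, 72]  # hypothetical raw scores
print(percentile_rank(60, norm_sample))          # 60.0
print(round(z_score(60, norm_sample), 2))        # ~0.38
print(round(t_score(60, norm_sample), 1))        # ~53.8
print(stanine(60, norm_sample))                  # 6
```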
SELECTING A NORM GROUP
1. Random Sampling – used in research and testing. *Fishbowl technique
2. Stratified Random Sampling – clusters, groupings
Nonprobability sampling is NOT used in testing. (A sampling sketch follows.)
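A brief sketch of the two probability-sampling methods above, using only the Python standard library (the population and strata are hypothetical):

```python
import random

population = [f"examinee_{i}" for i in range(100)]

# 1. Simple random sampling: every examinee has an equal chance of
# selection (the "fishbowl" technique, done by computer).
simple_sample = random.sample(population, 10)

# 2. Stratified random sampling: split the population into strata
# (e.g., grade levels) and draw randomly from each stratum.
strata = {"grade_7": population[:50], "grade_8": population[50:]}
stratified_sample = [e for group in strata.values()
                     for e in random.sample(group, 5)]

print(simple_sample)
print(stratified_sample)
```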
Age Norms – depict the level of test performance for each separate age group in the normative sample.
- Purpose: to facilitate same-aged comparisons
- The examinee's performance is interpreted in relation to standardization subjects of the same age.
Grade Norms – depict the level of test performance for each separate grade group in the normative sample.
- Very useful in school settings when reporting the achievement levels of schoolchildren
Local and Subgroup Norms – derived from representative local examinees, as opposed to a national sample.
- Subgroup norms consist of the scores obtained from an identified subgroup (e.g., African Americans, Hispanics, Tagalog, Ilocano, etc.)
Criterion-referenced tests – compare an examinee's accomplishments to a well-defined content domain or standard (e.g., classroom tests).
- The focus is on what the test taker can do rather than on comparisons to the performance levels of others.
- They identify an examinee's absolute mastery or non-mastery of specific behaviors or expectations.
PSYCHOLOGICAL TEST REPORT WRITING
Psychological Report – a document written as a means of understanding certain features about a person and his or her current life circumstances in order to make decisions and to intervene positively in a problem situation.
- It is the culmination of the practice of psychological testing and is usually the product of assessment procedures.
- It results from a process that starts with a referral source.
Psychological report writing is an important skill because of the potentially lasting impact of the written document.
A psychological test report is a communication; therefore, it must be written in a way that corresponds to the reader's level of understanding and training. It must meet the criteria of clarity, meaningfulness, analysis/synthesis, and good clinical judgment.
The ultimate goal is to provide helpful perspectives on the client, not to impress the referral source.
POSSIBLE REFERRAL SOURCES/RECIPIENTS OF PSYCHOLOGICAL REPORTS
1. Clients themselves
2. Parents of minors/mentally deficient clients
3. Regular class/Special education teachers
4. Counselors and school administrators
5. Company employers/Human resource officers
6. Psychologists/Psychotherapists
7. Medical doctors
8. Court and correctional personnel
9. Social/Community/Mental health volunteers
CONDITIONS IN WRITING THE REPORT
The report should be written with the needs of the referring person in mind and should explicitly target the referral questions. Reports should provide information that is relevant to the reader's work with the client, even information that may not have been requested.
Reports should communicate in a way that is appropriate to the report's intended recipient. Reports should affect the way the reader/recipient works with the client through recommended specific courses of action. Effective reports stay within the bounds of the examiner's expertise; when a concern falls outside those bounds, the best way to meet the client's needs is to recommend immediate consultation with appropriate professionals.
ORGANIZATION OF THE REPORT
1. Identifying Information/Demographic Information
2. Reason for Referral (e.g., "The client was referred for assessment as part of the requirements in the subject Psychological Testing 1.")
3. Psychological Tests Given and the Dates
4. Psychological Test Results/Interpretation
- Intellectual Functioning
- Career Interest
- Personality
- Purpose in Life
5. Summary of the Findings and Recommendation
6. Prepared by: *Name, Course/Year/Section*
Noted:
*Name of faculty-in-charge* Faculty-in-Charge
Identifying Information/Demographic Information
Name of client, chronological age, sex, date of birth, place of birth, home address, current address, tel./contact number, educational background, school, occupation, company, marital status, citizenship, date of report, name of examiner
Reason for Referral
This should cover, in 1-2 sentences, the reason for referral and a specific statement of the question(s) to be answered.
Psychological Test Results/Interpretation
This should cover the list of assessment tools (with abbreviations) and the dates when the tests were taken. Results should be data-driven and based above all on the test manual, but they may also include the examiner's analysis of the client's test results or direct responses. The test protocols must also be attached to the psychological test report. Words that may be used: tends, has the tendency, shows, likely, may, might, possibly, suggests, indicates, appears, seems, etc.
Summary and Recommendation
This should cover the integration or interplay of all findings, with statements linking the summary/conclusions to the recommendations. It should also highlight or emphasize the overall psychological functioning of the client based on all the tests given (with the examiner's own analysis as well).