Ability test An ability test measures the present level of functioning and can provide an estimate of the future performance of an individual on specific tasks or domains in cognitive or psychomotor areas. See also achievement test and aptitude test.
Accountability Both students and teachers are being required to show that students have mastered course or grade objectives. Minimum level skills, essential skills, survival skills, and other types of achievement tests are used to provide evidence of mastery.
Achievement test An achievement test measures the degree or extent of the knowledge, information, skills, and competencies that a person has acquired through training, instruction, or experience. There are survey achievement tests as well as subject-related tests.
Acquiescence response set Some test takers have a tendency to select positive responses (e.g., "true" or "yes") on attitude and personality tests.
Adaptive testing This procedure adjusts the test questions presented according to an individual's responses to previous items on the test. Test items can thus be geared to the individual's ability or achievement levels, and test takers may start and finish at different levels.
Adjustment test An adjustment test is one of the major types of personality tests. Such tests measure the ability of an individual to function normally in society and achieve personal needs.
Affective domain The affective domain covers dimensions of personality such as attitudes, motives, emotional behavior, temperament, and personality traits.
Age norms Age norms for a particular test provide the median score made by test takers of a given chronological age. In addition to intellectual and social age, many tests provide information on typical characteristic behavior of individuals at given age levels.
Alternate forms On many achievement and aptitude tests it is necessary to have more than one form of the test available. Alternate forms are constructed according to the same blueprint, that is, with the same set of objectives, the same type of items, and similar difficulty and discrimination values for the test items. The two forms also have similar statistical characteristics; the means, standard deviations, and correlations with other measures should all be approximately equal.
Alternate-forms reliability This type of reliability requires correlating the scores of individuals on one form with the scores they made on the second form. This coefficient provides evidence of the equivalence of the two forms as well as the stability of the individual's performance.
Alternative assessment Alternative assessment emphasizes assessing performance of the test taker through portfolios, interviews, observations, work samples, and the like instead of through multiple-choice, norm-referenced, or criterion-referenced examinations.
Anecdotal records Anecdotal records require a series of observations on one or more individuals. The observer should provide an objective description of the behavior observed and an interpretation of the situation. School psychologists are often required to record two observations of the child along with test information.
Aptitude test An aptitude test is used to provide an estimate of future performance on tasks that may or may not be similar to the tasks measured on the test. Aptitude tests are used to assess the educational readiness of individuals to learn or become proficient in a given area if education or training is provided. Aptitude tests may contain the same type of items as achievement tests.
Arithmetic mean See mean.
Assessment procedures Assessment procedures are the methods that enable one to appraise or estimate the attributes of a person, group, or program. The tools of assessment include checklists, inventories, observation schedules, needs assessments, rating scales, and all types of tests.
Attenuation Attenuation is a phenomenon that takes place in the statistical determination of correlation and regression. The correlation or regression is reduced because of the imperfect reliability of one or both of the measures being correlated or compared.
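As an illustration of attenuation, the sketch below applies the classical correction for attenuation, which estimates what the correlation would be if both measures were perfectly reliable. The formula and the function name correct_for_attenuation are not given in this glossary; they are assumptions taken from classical test theory, and the numbers are made up.

```python
import math


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimate the true-score correlation from an observed correlation.

    r_xy: observed correlation between measures X and Y
    r_xx, r_yy: reliability coefficients of X and Y (assumed known)
    """
    if not (0 < r_xx <= 1 and 0 < r_yy <= 1):
        raise ValueError("reliabilities must be in the interval (0, 1]")
    # Observed correlation divided by the geometric mean of the reliabilities.
    return r_xy / math.sqrt(r_xx * r_yy)


# Hypothetical values: observed r = .42, reliabilities .70 and .80.
corrected = correct_for_attenuation(0.42, 0.70, 0.80)
print(round(corrected, 3))
```

With perfectly reliable measures (both reliabilities equal to 1) the function returns the observed correlation unchanged, which matches the definition above: the lower the reliability, the larger the reduction.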
Attitude Attitude is a dimension of the affective domain and one aspect of an individual's personality. Attitude is reflected in reactions to events, other individuals, objects, or institutions.
Authentic Assessment This type of assessment focuses on assessing realistic tasks or activities that relate to the performance on a domain or set of constructs being measured.
Basal age Basal age is the age at which the test taker passes all of the items on a given test.
Basic skills Many achievement tests are designed to measure the basic skills required to be successful in school. These are usually reading, writing, and arithmetic competencies. Such skills are necessary for the student to learn other subjects, such as science and social studies.
Battery A battery is a set of tests usually standardized on the same population. Survey achievement tests are one example of a battery. A battery facilitates comparison of a test taker's performance in different areas.
Behavioral assessment Behavioral assessment focuses on the more objective and observable components of behavior and utilizes a wide variety of techniques, such as observation, checklists, and self-monitoring.
Behavioral objectives Behavioral objectives require counselors and teachers to specify desired behavioral outcomes in objective, observable forms and to identify the conditions of measurement.
Bias Bias in testing results in scores that are higher or lower than they would be if the measurement were more reliable and valid. The error caused by bias is systematic rather than random.
Bimodal distribution A bimodal distribution is a frequency distribution with two modes or high points.
Biographical inventory A biographical inventory is a questionnaire survey instrument used to obtain information about the individual's educational, social, medical, and work experiences. It is one of the tools used by counselors and employment psychologists.
Buckley Amendment The Buckley Amendment is federal legislation that gives individuals and their parents or guardians access to information, including the results of standardized tests.
Ceiling Ceiling is the level or point at which a test taker fails a test or sub-test.
Central tendency Central tendency relates to the typical or average score in a distribution. The three measures of central tendency are the mean, median, and mode. Any one of these statistics summarizes the typical or average performance of a group.
Central tendency error An error of central tendency occurs when the rater avoids all the extreme judgments, both high and low, and rates all items in the middle.
Checklist A checklist is a list of words, phrases, or statements describing the behavior of an individual or situation. The rater checks the presence or absence of the item.
Coaching Coaching occurs prior to the administration of a test and involves short-term instructional activities designed to help test takers increase their test scores. Sessions often include instruction on test-taking strategies and control of test anxiety.
Coefficient alpha Coefficient alpha is a reliability coefficient that measures the internal consistency of a test. The coefficient is the expected correlation of one test form with an alternate form that contains the same number of items.
Coefficient of determination The coefficient of determination is computed by squaring the correlation coefficient. It provides an estimate of the proportion of variance in one variable that is predictable from the other variable.
Coefficient of equivalence The coefficient of equivalence is used to compute the reliability of alternate forms of a test, based either upon two administrations or on a single administration with odd and even items constituting separate forms.
Coefficient of internal consistency The coefficient of internal consistency is based upon one testing and provides an estimate of the homogeneity of test items. The split-half and Kuder-Richardson methods provide coefficients of internal consistency.
Coefficient of stability A coefficient of stability provides a picture of how consistent an individual's scores are over a period of time. The test-retest method and alternate forms provide some information on the stability of scores over time.
Cognitive domain The cognitive domain encompasses the different levels individuals use in perceiving, thinking, and remembering. These levels are knowledge, comprehension, application, synthesis, analysis, and evaluation.
Cognitive style Cognitive style refers to the strategies or approaches an individual prefers to use in cognitive activities. Some of the styles discussed in the text are internal and external locus of control, field independence-field dependence, and reflectivity-impulsivity.
Competency test A competency test is an achievement test that assesses a test taker's level of knowledge or skill in some defined domain.
Computer-assisted testing Computer-assisted testing refers to testing that is presented on the computer rather than in a test booklet.
Computer-based interpretation Computer-based interpretation is a method of providing the interpretation of test scores according to algorithms built into the computer program. The test user is provided with score reports and narrative statements about the results.
Concurrent validity Concurrent validity is one type of criterion-referenced validity. Test scores are compared with a criterion measure obtained at about the same time, and the coefficient describes their relationship.
Confidence interval The confidence interval, or confidence band, is marked by two points that define with specified probability the range that includes an individual's true score.
Construct A construct is a theory or concept used to explain data in an orderly way. In a psychometric sense it is a psychological attribute or trait.
Construct validity Construct validity is the extent to which a test measures the intended psychological trait or attribute.
Content validity Content validity is the degree to which a test measures a defined body of knowledge. This type of validity is extremely important for achievement tests.
Correlation Correlation is a statistic used to measure the strength and direction of the association between two sets of scores. Coefficients range from +1.00 to -1.00. A correlation of +1.00 indicates a perfect positive relationship between the scores, a correlation of .00 indicates no relationship between the scores, and a correlation of -1.00 indicates a perfect inverse relationship.
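A minimal sketch of how such a coefficient can be computed follows. It assumes the Pearson product-moment formula, which the entry does not name explicitly; the function name pearson_r and the two score lists are hypothetical. Squaring the result gives the coefficient of determination defined earlier.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and of squared deviations from each mean.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores of five test takers on two measures.
test_scores = [10, 12, 15, 18, 20]
criterion_scores = [11, 14, 14, 19, 22]

r = pearson_r(test_scores, criterion_scores)
print("r =", round(r, 3), " r squared =", round(r ** 2, 3))
```

The sign of r carries the direction of the relationship and its absolute value the strength, so a value near -1.00 is as strong an association as one near +1.00.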
Covariation Covariation refers to the variance that two or more tests or variables have in common.
Criterion-referenced tests Criterion-referenced tests are designed to assess a rather limited range of objectives or goals. They are usually used as mastery tests to assess whether an individual can demonstrate a specific skill.
Criterion-referenced validity Criterion-referenced validity is validity based on the correlation of test scores with some type of criterion measure.
Cronbach's alpha Cronbach's alpha is a procedure for estimating the internal consistency of a test, based on parts of the test. It is one of the procedures used to compute reliability.
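The sketch below shows one common way to compute the coefficient from item-level data. The formula (number of items, sum of item variances, variance of total scores) is the standard textbook form and is an assumption here, since the entry does not state it; cronbach_alpha and the sample ratings are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    scores: one row per test taker, one column per item.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item (column) across test takers.
    item_variances = [pvariance(column) for column in zip(*scores)]
    # Variance of the total score of each test taker.
    total_variance = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings of four test takers on three items.
ratings = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
]
print(round(cronbach_alpha(ratings), 3))
```

Population variances are used for both the items and the totals; using sample variances throughout gives the same result, because the correction factor cancels in the ratio.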
Crystallized intelligence Crystallized intelligence is one of the types of intelligence in Cattell's model. The term refers to the part of intelligence acquired through experience and education.
Culture-fair tests Culture-fair tests are designed to be fair to all types of cultural and socioeconomic groups. These tests attempt to include only content to which all groups have been exposed during maturation.
Decision theory Test users are interested in making decisions and predictions on the basis of test results. Such predictions can be classified as positive, false positive, negative, and false negative. Decision theory is also used in selecting a test, to identify the types of information needed and the types of information already available.
Derived scores A derived score is a score into which a raw score is converted by some type of mathematical operation. Percentile ranks, standard scores, stanines, and grade-equivalent scores are all derived scores.
Descriptive statistics Descriptive statistics are used to summarize characteristics of a group of scores, for example, central tendency and dispersion or variability of scores. The mean, median, mode, variance, and standard deviation are descriptive statistics.
Deviation IQ The deviation IQ is the score that compares an individual with his age or grade group. The fixed mean is usually 100, and the standard deviation is 15.
Diagnostic tests A diagnostic test is an achievement test, most often in mathematics and reading, used to identify the strengths and weaknesses of the individual. Such tests include a wide range of items on a given skill or objective.
Differential prediction Differential prediction is used in situations of criterion-referenced validity. It indicates the degree to which a test that is used to predict individual attainment yields different predictions for the same criteria for groups with different demographic characteristics, prior experience, or treatment.
Domain-sampling Domain refers to a given area from which sample items are taken. Often criterion-referenced tests are called domain-referenced tests. The three major domains are cognitive, affective, and psychomotor.
Equivalence reliability Equivalence reliability gives an estimate of the extent to which two or more forms of a test are consistent. In split-half reliability, the test is subdivided into two parts, for example, odd items versus even items. In parallel forms, the scores from form X of a test are correlated with the scores from form Y to provide an estimate of the equivalence of the forms.
Equivalent forms Test makers may construct more than one form of a test to measure the same objectives with items of similar difficulty.
Error of measurement Error of measurement refers to the discrepancy between an individual's observed score and her true score.
Evaluation Evaluation is the process an individual uses to judge information from one or more sources. That process may focus on test data as well as observations and other sources.
Factor The term factor can describe a psychological construct, such as verbal, spatial, or numerical aptitude. It can also represent the covariance of various subtests that tend to cluster together; that is, it represents their intercorrelations or intersections.
Factor analysis Factor analysis is a statistical multivariate procedure used to analyze the intercorrelations or covariance of variables. The method results in the identification of a reduced number of factors needed to explain the intercorrelations of variables.
False negative A false negative is a type of error in which an individual is predicted to fail but actually succeeds if given a chance.
Field dependence Field dependence is one of the dimensions of cognitive style that relates to an individual's dependence on the surrounding visual field, rather than on body cues, in space perception.
Field independence Field independence is a dimension of cognitive style that relates to an individual's dependence on body cues in space perception but independence from the surrounding visual field.
Fluid intelligence Fluid intelligence is one of the two types of intelligence identified by Cattell. It is the inherited dimension of intelligence and includes problem-solving and thinking ability.
Forced choice Some interest and personality tests utilize the forced choice method, whereby the test taker is required to select one or more items from two or more similar or related options. This method helps to control for response set.
Frequency distribution A frequency distribution is a way of organizing and arranging data in a table. Scores are usually grouped in fewer intervals to summarize overall performance.
g factor The g factor is a generalized intelligence factor that is measured in most intelligence tests; tests that yield one score are following the g-factor approach. Spearman was the leading theorist of this conceptual model.
Grade-equivalent scores Grade-equivalent scores are a type of derived test score utilized primarily by survey achievement tests. A raw score is translated into the grade level for which the achieved score is the real or estimated mean or median.
Group test A group test can be given to more than one test taker in a single setting by one test administrator.
Halo effect Halo effect is the tendency of a rater to let the ratings in other areas of a scale influence the ratings in areas that cannot be observed or are difficult to rate.
Histogram A histogram is a bar graph that provides a picture or description of scores.
In-basket technique The in-basket technique requires the examinee to take action on a series of problem situations presented through correspondence, memos, and other documents found in a sample in-basket.
Informed consent Informed consent requires test takers to give their consent before being tested; it is legally and ethically required. Test takers are to be told what the purposes of the test are, who will have access to the scores, and how the results will be used.
Intelligence test An intelligence test measures dimensions of aptitude and ability that are needed for success in educational and vocational fields. There are both individual and group tests.
Interest inventory An interest inventory assesses an individual's likes and dislikes, preferences, and interests. This information is then related to occupational fields and clusters and can sometimes be compared to individuals working in given occupational fields.
Internal consistency Internal consistency is a method of estimating reliability that is computed from a single administration of a test. The coefficients reflect the degree to which the items are measuring the same construct and are homogeneous. Cronbach's alpha and the Kuder-Richardson formulas are measures of the internal consistency of a test.
Interpretative report Most interest, aptitude, and achievement tests provide not only a summary of the test scores but also information that helps test takers interpret and understand those scores.
Interval scale The interval scale is one of the four measurement scales and can be used to classify and order measurements. It plots equal distances between score points but does not have a true zero point. Examples of interval scales include IQ, Celsius, and Fahrenheit scales.
Ipsative measurement Ipsative measurement is a type of item format, such as forced choice or ranking, in which the variables (options or items) are compared with each other. Ipsative comparisons are only intraindividual and are not appropriate for normative interpretation.
Item bias An item is said to be biased when the average expected score on the item for the group in question is substantially higher or lower than it is for the overall population and when this difference results from factors that the item is not intended to measure.
Job analysis The general process of identifying the abilities, competencies, knowledge, and skills needed to perform jobs is called job analysis.
Kuder-Richardson 20/21 The KR 20 and 21 formulas are used to compute reliability in one administration of a test. They are internal consistency measures. KR 20 provides an estimate equal to the mean of all possible split-half coefficients. KR 21 can be substituted for KR 20 if the item difficulty levels are similar.
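The two formulas can be sketched as below for items scored 1 (correct) or 0 (incorrect). The formulas are the standard textbook forms and are an assumption here, as the entry does not print them; the function names kr20 and kr21 and the response table are hypothetical.

```python
from statistics import mean, pvariance


def kr20(responses):
    """KR 20 for dichotomous items; one row per test taker, one column per item."""
    k = len(responses[0])
    total_variance = pvariance([sum(row) for row in responses])
    # p is the proportion passing an item, so p * (1 - p) is the item variance.
    pq_sum = 0.0
    for item in zip(*responses):
        p = mean(item)
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / total_variance)


def kr21(responses):
    """KR 21 uses only the mean and variance of total scores.

    It assumes all items are of similar difficulty.
    """
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    m = mean(totals)
    total_variance = pvariance(totals)
    return (k / (k - 1)) * (1 - (m * (k - m)) / (k * total_variance))


# Hypothetical responses of four test takers to three items.
answers = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
print(round(kr20(answers), 2), round(kr21(answers), 2))  # 0.75 0.6
```

In this sample the items differ in difficulty, so KR 21 comes out lower than KR 20; when every item has the same difficulty the two values coincide, which is why the substitution described above is reasonable only for items of similar difficulty.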
Least restrictive environment Public Law 94-142 calls for the placement of each handicapped individual in the most normal situation in which that individual can be successful.
Leniency error When a person is rated higher on an item than he should be rated, an error of leniency is operating.
Likert scale A Likert scale is an attitude scale that asks people to rate the intensity of their agreement with certain statements.
Locus of control Rotter identifies a cognitive and perceptual style that relates to how individuals perceive themselves as being controlled. The two ends of the continuum are external and internal, with external indicating reliance on external reinforcement and a belief in chance and fate.
Mean The mean is the arithmetic average of a set of scores. It is equal to the sum of the scores divided by the number of scores. The mean is one of three statistics used to indicate central tendency.
Measurement Measurement is the process used to assign numerals to objects or constructs according to rules so that the numbers have quantitative meaning.
Median The median is the 50th percentile or the midpoint of a distribution.
Minimum level competency test A minimum level competency test measures the essential or minimum skills that school systems and states have defined as a standard to be met.
Mode The mode is the most frequently occurring score or number in a set of scores.
National norms National norms give the average or median performance of a probability sample representative of the whole country. Most survey achievement batteries and scholastic aptitude tests report national norms.
Nominal scale The nominal scale is a scale that can be used to classify data into mutually exclusive and exhaustive categories.
Nonverbal test In a nonverbal test the test taker is not required to respond to the tasks verbally and items are not presented in a written format. The examinee may be shown pictures and asked to point to a specific one. Or the test taker may be asked to manipulate materials, copy block or bead designs, assemble puzzles, and so on.
Norm groups Norm groups are composed of the individuals on whom a test is standardized. These groups provide the basis for interpreting scores.
Norm-referenced test A norm-referenced test presents score interpretation based on a comparison of individual performance with that of other individuals in specified groups.
Normal curve The normal curve is a smooth bell-shaped curve that is symmetrical around the mean; the curve can be computed with the use of an equation. Most educational and psychological variables, such as achievement and intelligence, have a normal bell-shaped distribution.
Objective test An objective test has a predetermined scoring key. Multiple choice tests are an example of such a test.
Observational techniques Observational techniques are used to look at the behavior of the test taker, sometimes during the examination and sometimes in naturalistic situations. In behavior assessment self-observation is sometimes used; the client is asked to keep a log or diary of her behavior. Situational tests require raters to observe individual behavior during the test.
Ordinal scale An ordinal scale requires an individual to rank order measurements.
Parallel forms See alternate or equivalent forms.
Percentile rank Percentile rank places a score in a distribution by identifying the percentage of scores that fall at or below the given score. A percentile rank is a type of derived score utilized on most norm-referenced tests.
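The definition translates directly into a short computation, sketched below. The function name percentile_rank and the norm-group scores are hypothetical; published tests may use a slightly different convention (for example, counting half of the tied scores), which this sketch does not attempt.

```python
def percentile_rank(scores, score):
    """Percentage of scores in the distribution at or below the given score."""
    if not scores:
        raise ValueError("the distribution must contain at least one score")
    at_or_below = sum(1 for s in scores if s <= score)
    return 100.0 * at_or_below / len(scores)


# Hypothetical raw scores of a ten-person norm group.
norm_group = [12, 15, 15, 18, 20, 21, 23, 25, 27, 30]
print(percentile_rank(norm_group, 20))  # 50.0
```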
Performance test A performance test requires the test taker to engage in some type of process, such as manipulating physical objects, rather than marking an answer on an answer sheet.
Personality inventory A personality inventory measures one or more dimensions of personality, such as attitude, adjustment, temperament, and values. The test taker is usually presented with a wide variety of behaviors and asked if they are characteristic of her.
Portfolio A portfolio is a collection of products produced by a person, such as the papers, themes, tests, and book reports of a student in language arts, or a collection of products produced by a person in a fine arts class.
Predictive validity Predictive validity is a form of criterion-referenced validity in which test scores are compared with performance that is measured sometime in the future. The predictor variable is the test and the criterion variable is the future performance, often on the job or in school.
Predictor A predictor is a measurable characteristic (such as a test score, previous performance, or a rating or observation) that is correlated with a criterion variable to indicate future success or failure.
Primary standards Primary standards are the essential and fundamental characteristics that should be met by all tests before they are used.
Profile A profile is a graphic representation of individual or group scores. It provides a picture of the relative magnitude of scores.
Projective techniques Projective techniques are one method of assessing personality. The test taker gives free response to a series of stimuli, such as inkblots, pictures, or incomplete sentences. It is assumed that individuals will project their own perceptions, feelings, and attitudes in their answers.
Psychomotor test A psychomotor test measures fine and gross motor skills. The psychomotor domain, unlike the cognitive and affective domains, organizes and classifies psychomotor behaviors in terms of the amount of concentration required.
Public Law 94-142 Public Law 94-142, entitled the Education of All Handicapped Children Act, was passed in 1975 and established the requirements for free and appropriate education for all handicapped children. It also restricts the use of tests and assessment procedures with handicapped individuals.
Questionnaire A questionnaire is similar to a structured interview; it contains a list of questions on a topic or issue. A questionnaire is usually administered to a group of individuals to find out about their attitudes, beliefs, behaviors, and so on.
Range Range is a statistic used to measure the variability or spread of the scores. It is the difference between the highest and lowest score in a distribution.
Rapport Rapport is a warm and friendly relationship or interpersonal environment. An examiner wants to ensure valid results by establishing this type of positive environment and thereby encouraging the proper motivation and cooperation of the test taker.
Rating scale Rating scale is a measure that requires the rater to estimate the value of a person or thing or assess the presence of some trait or characteristic. Sometimes the scale calls for self-ratings; at other times ratings are given by peers, teachers, parents, and so on.
Ratio scale The ratio scale has equal units of measurement and has a true zero point. Time, height, and weight are examples of measurements that utilize the ratio scale. Most educational and psychological variables cannot be measured on the ratio scale.
Raw score Raw score is the unadjusted number of correct answers.
Readiness test A readiness test measures the extent to which the test taker has acquired the skills and knowledge necessary to learn a more complex skill.
Regression Regression is a statistical technique used to help individuals predict x when they know y and the relationship between x and y. A linear equation can be computed to predict criterion scores with one or more predictor variables.
Reliability Reliability refers to the degree to which test scores are consistent, dependable, or repeatable. Reliability is a function of the degree to which test scores are free from errors of measurement.
Response set Response set refers to the tendency of a test taker to respond to test items in a stereotyped or fixed way. The test taker may consciously or unconsciously choose the most socially desirable answers or perhaps true rather than false options.
Scatter plot A scatter plot is a bivariate graph that shows the paired values of two variables being correlated.
Scholastic aptitude test A scholastic aptitude test measures the cognitive skills necessary for success in school. It is used to predict how well individuals will do in educational contexts.
Screening test A screening test makes broad categorizations as a first step in a selection or diagnostic process in school or industry.
Situational test A situational test is a performance test in which an individual is placed in a realistic but contrived situation and then is rated on her role competence and problem solving abilities.
Skewness Skewness is the degree of asymmetry in a frequency distribution. In positively skewed distributions the scores are piled up at the lower end of the distribution, to the left of the mean. In a negatively skewed distribution the scores are piled up at the high end of the distribution, to the right of the mean.
Social desirability response set The social desirability response set is active when an individual tries to portray himself in a socially desirable light. The individual fakes "good" instead of putting down what is truly descriptive of his behavior.
Spearman-Brown prophecy formula The Spearman-Brown formula is used to estimate the reliability of a test if the test length is increased. Theoretically, a longer test samples more behaviors and covers more items from the measured domain, thereby increasing the reliability of the test.
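A short sketch of the formula follows. The entry does not print it, so the general form used here (predicted reliability = n r / (1 + (n - 1) r), where n is the factor by which the test is lengthened) is an assumption drawn from standard treatments; the function name spearman_brown is hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when test length is multiplied by length_factor."""
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Doubling a test whose current reliability is .60.
print(round(spearman_brown(0.60, 2), 2))  # 0.75
```

With a length factor of 2 this is the same correction applied to a split-half coefficient to estimate the reliability of the full-length test.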
Speeded test A speeded test measures performance by the number of tasks performed in a given time period. Clerical speed and accuracy, typing, and coding tests are examples of this type of test.
Split-half reliability Split-half reliability is a method of computing the reliability of a test from one administration of the test. An internal analysis coefficient is obtained by using one-half of the items on the test for one score, and the other half for the second score. These scores are then correlated and corrected for a full-length test, using the Spearman-Brown Formula. The method provides an estimate of the alternate form reliability.
Standard deviation The standard deviation is the square root of the variance; it is a statistic that describes the spread or dispersion of scores. Standard deviation is used with derived scores as an index of how far above or below the mean the score falls.
Standard error of measurement The standard error of measurement is the standard deviation of the errors of measurement associated with the test scores for a specific group of test takers.
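In practice the standard error of measurement is usually estimated from the standard deviation of the scores and the reliability coefficient. The sketch below assumes that classical test theory estimate (SD times the square root of 1 minus the reliability), which is not stated in the entry, and uses it to build the kind of confidence band described under Confidence interval. The function names and the example values are hypothetical.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical estimate: SD * sqrt(1 - reliability)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1 - reliability)


def confidence_band(observed, sem, z=1.0):
    """Band of z standard errors around an observed score.

    z = 1.0 gives roughly a 68% band and z = 1.96 roughly a 95% band,
    assuming normally distributed errors.
    """
    return observed - z * sem, observed + z * sem


# Hypothetical test with SD = 15 and reliability = .91.
sem = standard_error_of_measurement(15, 0.91)
low, high = confidence_band(100, sem)
print(round(sem, 2), round(low, 2), round(high, 2))
```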
Standard score A standard score describes the location of an individual's score within a set of scores. Its distance from the mean is expressed in terms of standard deviation units. Such a score is used in norm-referenced measurement contexts.
Standardized test A standardized test is administered under standard directions and conditions.
Stanines Stanines are a 9-point scale having a mean of 5 and a standard deviation of 2. All but stanines 1 and 9 are one-half standard deviation in width. They are used to describe an individual's position relative to the norming group.
Statistics Statistics are an area of mathematics focusing on the collection, organization, and interpretation of numerical data. A statistic is a number used to describe some characteristic, such as the central tendency or variability of a set of scores.
Subtest A subtest is a grouping of items measuring the same function.
T-score A T score is a derived score on a scale usually having a mean score of 50 and a standard deviation of 10.
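The conversion from a raw score to a T score can be sketched in two steps: first express the raw score as a z score (distance from the mean in standard deviation units), then rescale it to a mean of 50 and a standard deviation of 10. The function names and the example norm-group values are hypothetical.

```python
def z_score(raw, mean, sd):
    """Distance of a raw score from the mean, in standard deviation units."""
    if sd <= 0:
        raise ValueError("the standard deviation must be positive")
    return (raw - mean) / sd


def t_score(raw, mean, sd):
    """Rescale the z score to a mean of 50 and a standard deviation of 10."""
    return 50 + 10 * z_score(raw, mean, sd)


# Hypothetical norm group with mean 100 and standard deviation 15.
print(z_score(115, 100, 15), t_score(115, 100, 15))  # 1.0 60.0
```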
Test anxiety Test anxiety is a psychological state of stress and fear caused by testing situations. Although some anxiety may be beneficial, extreme test anxiety can disrupt performance.
Test-retest reliability Test-retest methods require giving the same test to the same group of examinees on two different occasions and correlating the two sets of scores. The resulting coefficient gives an indication of the stability of the results.
True score The true score in classical test theory is the average of the scores earned by an individual on an unlimited number of perfectly parallel forms of the same test.
Usability Usability refers to the practical factors that must be considered in selecting a test, for example, cost, time, ease of administration, and ease of scoring.
User's guide The user's guide contains a statement of the purpose of the test, its content, and appropriate uses. The guide often contains information on how to administer, score, and interpret the test.
Validity Validity is the degree to which a certain inference from a test is appropriate or meaningful.
Variance Variance is the average squared deviation from the mean or the standard deviation squared. The statistic is a measure of variability or dispersion of the scores.
Verbal test A verbal test is a test requiring oral or written responses to test items.
z score A z score is a type of standard score in which the mean is zero and the standard deviation is 1. A z score represents the raw score in standard deviation