Available data on U.S. student performance in mathematics and science
present a mixed picture. Although data show some overall gains in
achievement, most students still perform below levels considered
proficient or advanced by a national panel of experts. Furthermore,
sometimes substantial achievement gaps persist between various U.S.
student subpopulations, and U.S. students continue to do poorly
in international comparisons, particularly in the higher grades.
This section describes long-term trends based on curriculum frameworks
developed in the late 1960s, recent trends based on frameworks aligned
more closely with current standards, and the performance of U.S.
students relative to their peers in other countries.
The National Assessment of Educational Progress (NAEP), also known
as "The Nation's Report Card," has charted U.S. student
performance for the past 3 decades (Campbell,
Hombo, and Mazzeo 2000) and is the only nationally representative,
continuing assessment of what students know and can do in a variety
of academic subjects, including reading, writing, history, civics,
mathematics, and science. NAEP consists of three separate testing
programs. The "long-term trend" assessment of 9-, 13-,
and 17-year-olds has remained substantially the same since it was
first given in mathematics in 1973 and in science in 1969, and it
thereby provides a good basis for analyzing achievement trends.
[More detailed explanations of the NAEP long-term trend study are
available in Science and Engineering Indicators — 2002
(National Science Board 2002)
and at http://www.nces.ed.gov/naep3/mathematics/trends.asp.]
A second testing program, the "National" or main NAEP,
is based on more contemporary standards of what students should
know and be able to do in a subject. It assesses students in grades
4, 8, and 12. A third program, "state" NAEP, is similar
to national NAEP, but involves representative samples of students
from participating states. The NAEP data summarized here come from
the long-term trend assessment and the national NAEP. Chapter 8
covers the considerable variation by state.
The most recent NAEP long-term trend assessment took place in 1999.
Because the 1999 NAEP data have already been reported widely (including
in the 2002 version of this report), this chapter only summarizes
the main findings.
Trends in Mathematics and Science Performance:
Early 1970s to Late 1990s
The NAEP trend assessment shows that student performance in mathematics improved
overall from 1973 to 1999 for 9-, 13-, and 17-year-olds, although
not at a consistent rate across the 3 decades (Campbell,
Hombo, and Mazzeo 2000) (figure 1-1).
In general, declines occurred in the 1970s, followed by increases
in the 1980s and early 1990s and relative stability since that time.
The average performance of 9-year-olds held steady in the 1970s,
increased from 1982 to 1990, and showed additional modest increases
after that. For 13-year-olds, average scores improved from 1978
to 1982 with additional improvements in the 1990s. The average performance
of 17-year-olds dropped from 1973 to 1982, rose from 1982 to 1992,
and has since remained about the same, resulting in an overall gain
from 1973 to 1999.
Average student performance in science also improved from the early
1970s to 1999 for 9- and 13-year-olds, although again, not consistently
over the 3 decades. Achievement declined in the 1970s and increased
in the 1980s and early 1990s, holding relatively stable since that
time. By 1999, increases had overcome the declines of the 1970s.
In 1999, 9-year-olds' average performance was higher than in 1970.
Among 13-year-olds, average performance in 1999 was higher than
in 1973 and essentially the same as in 1970. By 1999, 17-year-olds
had not recouped decreases in average scores that took place during
the 1970s and early 1980s. This resulted in lower performance in
1999 than in 1969 when NAEP first assessed 17-year-olds in science.
The NCLB Act requires every student, regardless of poverty level,
sex, race, ethnicity, disability status, or English proficiency,
to meet challenging standards in mathematics and science. Patterns
in the NAEP long-term trend data can show whether the nation's school
systems are providing similar learning outcomes for all students
and whether performance gaps between different groups of students
have narrowed, remained steady, or grown.
Performance Trends for Males and Females
In general, the average performance of both males and females in
mathematics improved from the early 1970s to the late 1990s, including
the period from 1990 to 1999 (Campbell,
Hombo, and Mazzeo 2000). For 9- and 13-year-olds, differences
in average mathematics scores shifted from favoring females in the
1970s to favoring males by the 1990s (figure 1-2 and appendix table 1-1).
Among 17-year-olds, the performance gap that favored males in 1973
had narrowed by 1999. By 1999, none of the apparent sex differences
in mathematics performance were statistically significant. In science,
average scores tended to favor males through 1999, although the
apparent difference in 1999 for 9-year-olds was not statistically
significant. The gender gap in science has remained relatively stable
for 9- and 13-year-olds, but it narrowed for 17-year-olds between
1969 and 1999.
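To illustrate what "statistically significant" means for a comparison such as the male-female score gap, the following minimal sketch tests the difference between two group means. It assumes independent samples, known standard errors, and a conventional 5 percent two-sided threshold. The function name and all numbers are hypothetical, and the approach is a textbook simplification, not a description of NAEP's actual procedures.

```python
import math

def difference_is_significant(mean_a, se_a, mean_b, se_b, critical=1.96):
    """Two-sided test of a difference between two independent group means.

    Assumption for illustration: each mean comes with its own standard
    error, and the groups are independent, so the standard error of the
    difference is the square root of the summed squared standard errors.
    """
    diff = mean_a - mean_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(diff / se_diff) > critical

# Hypothetical scale scores, not NAEP results.
small_gap = difference_is_significant(310.0, 1.4, 307.0, 1.3)
large_gap = difference_is_significant(310.0, 1.4, 303.0, 1.3)
```

With these made-up inputs, a 3-point gap falls short of the threshold while a 7-point gap exceeds it, which is why an apparent difference in average scores is not always reported as a real one.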
Performance Trends for Racial/Ethnic Subgroups
In every racial/ethnic subgroup, a general trend of improved mathematics
performance occurred over the past 3 decades. Scores for white,
black, and Hispanic students, regardless of age, were higher in
1999 than in 1973 (Campbell, Hombo,
and Mazzeo 2000). (Trends for other racial/ethnic groups are
not reported because the samples for these groups are too small
to analyze separately.) However, during the 1990s, although the
performance of white students increased for each age group, the
performance for blacks in each age group and for Hispanic 9- and
13-year-old students remained flat. The performance of Hispanic
17-year-olds increased from 1990 to 1999.
In science, scores for 9- and 13-year-olds from each racial/ethnic
subgroup in 1999 were higher than in the year NAEP first assessed
a particular subgroup (1970 for whites and blacks, 1977 for Hispanics)
but held steady from 1990 to 1999. Among 17-year-olds, science performance
trends varied. White students in that age group had lower scores
in 1999 than in 1969, although the average score did increase between
1990 and 1999. The performance of black 17-year-old students was
about the same in 1969, 1990, and 1999. Science scores of Hispanic
17-year-olds were higher in 1999 than in 1969 and increased from
1990 to 1999.
Despite improved performance overall from the 1970s to the late
1990s for all racial/ethnic subgroups studied, significant performance
gaps persist among these subgroups (figure 1-3 and appendix table 1-2).
In mathematics, the sizable gap between white and black students
of all ages in 1973 narrowed until 1986 but remained relatively
stable in the 1990s. Even larger performance gaps exist between
white and black students in science. These gaps narrowed somewhat
from 1970 to 1999 for 9- and 13-year-olds but remained essentially
unchanged among 17-year-olds from 1969 to 1999. To place these gaps
in perspective, in 1999 in mathematics, black students averaged
about 30 points lower than did white students; in science, their scores
ranged from 39 to 52 points lower than those of white students,
depending on the age level. These differences are roughly the same
size as the differences between the average 13-year-old and 17-year-old
in these subjects (figure 1-1).
Substantial gaps also exist between Hispanic and white students
at each age level for both mathematics and science. Among 9-year-olds,
the mathematics gap favoring white students widened between 1982
and 1999. Hispanic-white mathematics performance differences for
13- and 17-year-olds persist but have lessened over the past 3 decades.
In science performance, even larger gaps exist. For 9-year-olds,
the science gap did not narrow overall. The 1977 science gap for
13-year-olds narrowed during the 1980s and early 1990s, but by 1999,
it had returned to nearly the 1973 level. The score difference between
17-year-old white and Hispanic youth did increase at several points
in time, but by the end of the 1990s, was at the same point as in
the late 1970s. The white-Hispanic differences in average scale
scores in 1999 ranged from 22 to 26 points in mathematics and from
30 to 39 points in science (figure 1-3).
Racial/ethnic subgroups differ in several characteristics generally
agreed to influence academic achievement. For example, black and
Hispanic students' parents have less education compared with the
parents of white students, and black and Hispanic students are more
likely to live in poverty (Peng, Wright,
and Hill 1995). Economic hardship and low education levels can
limit parents' ability to provide stimulating educational materials
and experiences for their children (Hao
1995; and Smith, Brooks-Gunn, and
Klebanov 1997). Appendix table 1-3
illustrates the persistent achievement gaps between students whose
parents have different levels of education.
Recent Performance in Mathematics and Science
Thus far, this section has presented NAEP results based on the
long-term trend assessments, which use the same items each time.
The next analysis uses data from the national NAEP program, which
updates instruments to measure the performance of students based
on more current standards. These assessments are based on frameworks
developed through a national consensus process involving educators,
policymakers, assessment and curriculum experts, and representatives
of the public, then approved by the National Assessment Governing
Board (NAGB).
NAEP first developed a mathematics framework in 1990, then refined
it in 1996 (NCES 2001c).
It contains five broad content strands (number sense, properties,
and operations; measurement; geometry and spatial sense; data analysis,
statistics, and probability; and algebra and functions). The assessment
also tests mathematics abilities (conceptual understanding, procedural
knowledge, and problem solving) and mathematical power (reasoning,
connections, and communication). Along with multiple-choice questions,
assessments include constructed-response questions that require
students to provide answers to computation problems or describe
solutions in sentence form.
NAEP developed the science framework in 1991 and used it in the
1996 and 2000 assessments (NCES 2003c).
It includes a content dimension divided into three major fields
of science (earth, life, and physical) and a cognitive dimension
covering conceptual understanding, scientific investigation, and
practical reasoning. The science assessment also relies on both
multiple-choice and constructed-response test questions. A subsample
of students in each school also conducts a hands-on task and answers
questions related to that task.
Student performance on the national NAEP is classified according
to three achievement levels developed by NAGB that are based on
judgments about what students should know and be able to do. The
basic level represents partial mastery of the knowledge and skills
needed to perform proficient work at each grade level. The proficient
level represents solid academic performance at grade level and the
advanced level signifies superior performance. Disagreement exists
as to whether NAEP has appropriately defined these levels, but they
do provide a useful benchmark for examining recent changes in achievement.
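As a rough illustration of how a scale score maps onto the three achievement levels, the sketch below classifies a score against a set of cut scores and computes the share of students at or above a level. The cut-score values and function names are hypothetical placeholders chosen for the example. They are not NAGB's actual cut points, which differ by subject and grade.

```python
from bisect import bisect_right

# Hypothetical cut scores (NOT the actual NAGB values): the minimum scale
# score needed to reach each achievement level, in ascending order.
CUT_SCORES = [("basic", 200), ("proficient", 250), ("advanced", 300)]

def achievement_level(score, cuts=CUT_SCORES):
    """Return the highest level whose cut score the student meets."""
    thresholds = [cut for _, cut in cuts]
    labels = ["below basic"] + [name for name, _ in cuts]
    # bisect_right counts how many cut scores are <= score.
    return labels[bisect_right(thresholds, score)]

def share_at_or_above(scores, level, cuts=CUT_SCORES):
    """Fraction of scores at or above the named level's cut score."""
    cut = dict(cuts)[level]
    return sum(s >= cut for s in scores) / len(scores)

# Hypothetical scores for four students.
sample = [190, 210, 260, 310]
levels = [achievement_level(s) for s in sample]
proficient_share = share_at_or_above(sample, "proficient")
```

Because the levels are cumulative, a student counted "at or above proficient" is also counted "at or above basic," which is how the percentages in this section should be read.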
The proportion of fourth and eighth grade students reaching at
least the proficient level in mathematics increased by a few percentage
points from 1996 to 2000, when just over one-fourth of fourth and
eighth grade students scored at or above that level (NCES
2001c) (figure 1-4).
Among 12th graders, only 17 percent reached that level. Approximately
one-third of students at each grade level scored below the basic
level in 2000. The proportion of fourth and eighth grade students
scoring below the basic level decreased from 1996 to 2000, but the
proportion for 12th graders increased.
In general, the 2000 science results mirror the mathematics results
(NCES 2003c). Only a minority of
students reached the proficient level, and at least one-third of
students at each grade level did not reach the basic level. Among
12th graders, that figure approached half, an increase
from 1996. Across both subjects, very few students performed at
the advanced level (only 2 to 5 percent).
Mathematics and Science Proficiency for Males
and Females
Like the NAEP long-term assessment program, the national NAEP assessment
reports results by subgroups, which allows comparisons of achievement
levels among different subgroups. In 2000, similar percentages of
males and females in each grade reached at least the basic level
in mathematics (figure 1-5).
However, more males scored at or above the proficient level. The
2000 mathematics results show improvement over 1996 for both sexes
in the percentage scoring at or above the basic level in grade 4,
but a decline in grade 12 (appendix table 1-4).
The 2000 science results show that a greater percentage of males
than females in both grades 4 and 8 attained at least the basic
level, and higher percentages of males at each grade level scored
at or above the proficient level. The period between 1996 and 2000
saw no significant change in the proportion of females scoring at
or above basic, or at or above proficient. Males in grade 12 registered
a decline in the percentage at or above the basic level, and males
in grade 8 registered an increase in the percentage at or above
proficient (appendix table 1-4).
Mathematics and Science Proficiency by Racial/Ethnic
Subgroups
Variations in performance levels across racial/ethnic groups are
more apparent than variations between males and females (figure 1-6).
At each grade level in mathematics in 2000, higher proportions of
white and Asian/Pacific Islander students (when scores for the latter
group were reported) scored at or above the basic and proficient
levels compared with black, Hispanic, and American Indian/Alaskan
Native students. Among 12th grade students, 74 percent of white
students and 80 percent of Asian/Pacific Islander students scored
at or above the basic level compared with 31 percent of blacks,
44 percent of Hispanics, and 57 percent of American Indians/Alaskan
Natives. Overall, black students had the lowest percentage scoring
both at or above the basic level and at or above the proficient
level. Only one statistically significant change occurred from 1996
to 2000: the proportion of white fourth grade students scoring at
or above the proficient level in mathematics increased (appendix table 1-5).
These differences in mathematics performance across racial/ethnic
groups are evident even when children begin school (Denton
and West 2002). Children from low-income and minority family
backgrounds start kindergarten at a disadvantage in mathematics
knowledge and skills. This disadvantage persists throughout kindergarten
and into the first grade. By the first grade, black and Hispanic
children are less likely than white children to solve addition,
subtraction, multiplication, and division problems, and children
from poor families are also less likely than those from nonpoor
families to demonstrate proficiency in these areas.
Similar racial/ethnic differences hold true for science. In 2000,
higher percentages of white and Asian/Pacific Islander students
scored at or above the basic level and at or above the proficient
level at each grade level compared with their black, Hispanic, and
American Indian/Alaskan Native counterparts. Black students at all
grade levels were least likely to reach these performance goals.
Only one statistically significant change occurred from 1996 to
2000, a decrease in the proportion of white 12th graders reaching
or exceeding the basic level (appendix table 1-5).
Mathematics Achievement in High-Poverty Schools
Poverty is negatively associated with student achievement. Analyses
of NAEP 2000 mathematics data show that fourth graders in schools
with higher proportions of students eligible for the Free/Reduced-Price
Lunch Program, a commonly used indicator of poverty, tend to have
lower scores (NCES 2002a) (figure 1-7).
This pattern occurred among both eligible and ineligible students.
These high-poverty schools also enrolled a greater percentage of
black and Hispanic students and had higher rates of absenteeism,
a lower proportion of students with a very positive attitude toward
academic achievement, and lower levels of parent involvement in
school activities (NCES 2002a).
International Comparisons of Mathematics and
Science Performance
Two international assessment programs collected data on student
performance in mathematics and science during the past decade. The
1995 Third International Mathematics and Science Study (TIMSS) involved
41 nations and studied the performance of fourth and eighth grade
students as well as students in their final year of secondary school
(12th grade in the United States). Four years later, a repeat study
focused on the performance of eighth graders (TIMSS-R) in 38 countries.
In 2000, the Program for International Student Assessment (PISA),
organized by the Organisation for Economic Co-operation and Development
(OECD), assessed 15-year-olds from 32 countries in reading, mathematics,
and science.
The design and purpose of the two assessment programs differ somewhat
(Nohara 2001). TIMSS and TIMSS-R
measured students' mastery of curriculum-based scientific and mathematical
knowledge and skills. PISA assessed students' scientific and mathematical
"literacy," with the aim of understanding how well students
can apply scientific and mathematical concepts and thinking skills
to real-life challenges and nonschool situations. The TIMSS and
TIMSS-R findings have been reported extensively, including in the
two most recent editions of Science and Engineering Indicators
(National Science Board 2000 and
2002). Therefore, this section only
briefly reviews the main findings from TIMSS and TIMSS-R, and devotes
more coverage to the PISA findings.
Achievement of Fourth and Eighth Grade
U.S. Students on TIMSS and TIMSS-R
In 1995, U.S. students performed slightly better than the international
average in mathematics and science in grade 4, but by grade 8, their
relative international standing had declined, and it continued to
erode through grade 12 (figure 1-8).
Of the 25 other countries participating in the fourth grade component
of the assessment, 12 had lower average mathematics scores than
the United States, 6 had equivalent average scores, and 7 had higher
average scores. In science, 19 countries had lower scores, 5 had
equivalent scores, and 1 had a higher score. Not all nations participated
in every aspect of the TIMSS assessment.
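The fourth grade country counts above can be checked and restated as shares with a few lines of arithmetic. The sketch below uses only the counts reported in the text. The variable and function names are illustrative choices, not part of any TIMSS tooling.

```python
# Counts reported in the text for the 25 other countries in the 1995
# fourth grade TIMSS assessment, relative to the United States.
OTHER_COUNTRIES = 25
fourth_grade = {
    "mathematics": {"lower": 12, "equivalent": 6, "higher": 7},
    "science": {"lower": 19, "equivalent": 5, "higher": 1},
}

def percent_shares(counts, total):
    """Convert country counts to whole-number percentages of the total."""
    if sum(counts.values()) != total:
        raise ValueError("counts do not add up to the number of countries")
    return {group: round(100 * n / total) for group, n in counts.items()}

shares = {
    subject: percent_shares(counts, OTHER_COUNTRIES)
    for subject, counts in fourth_grade.items()
}
```

In share terms, about three-quarters of the other participating countries scored below the United States in fourth grade science, compared with roughly half in mathematics.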
U.S. eighth graders scored below the international average in mathematics
but above the international average in science (NCES
1997b). However, nine countries outperformed the United States
in eighth grade science, compared with only one in the fourth grade
science assessment.
The fourth and eighth grade results from the 1995 TIMSS study suggest
that U.S. students perform less well on international comparisons
as they advance through school. TIMSS-R, by enabling comparisons
between the relative international standing of U.S. fourth grade
students in 1995 and U.S. eighth grade students 4 years later, tended
to confirm this interpretation (NCES
2000b).
Achievement of 12th Grade U.S. Students on
TIMSS
TIMSS assessed the mathematics and science performance of students
in their final year of secondary school (12th grade in the United
States).
It included a test of general knowledge of mathematics and science
for all students and a more specialized assessment for students
enrolled in advanced courses. U.S. 12th graders performed below
the 21-country international average on the TIMSS test of general
knowledge in mathematics and science (NCES
1998).
U.S. students taking advanced mathematics and science courses also
did not fare well in comparison with their international counterparts.
The advanced mathematics assessment was administered to students
in 15 other countries who were taking or who had taken advanced
mathematics courses and to U.S. students who were taking or who
had taken precalculus, calculus, or Advanced Placement (AP) calculus.
Among students who participated in the advanced assessment, U.S.
students registered lower average scores compared with their international
counterparts, even though the United States tends to have fewer
young people taking advanced mathematics and science courses relative
to other countries. A total of 11 nations outperformed the United
States, and 4 nations scored similarly. No nation scored significantly
below the United States.
TIMSS administered the advanced science assessment, a physics assessment,
to students in 15 other countries who were taking science courses
and to U.S. students who were taking or had taken physics I and
II, advanced physics, or AP physics. U.S. students performed below
the international average, with 14 countries having average scores
higher than the United States, and 1, Australia, having an average
score equivalent to that of the United States.
Mathematics and Science Literacy of U.S. 15-Year-Olds
on PISA
OECD first conducted PISA in 2000 and plans two additional assessments
at 3-year intervals (NCES 2001d).
Although PISA 2000 concentrated on reading, it did include some
mathematics and science items.
PISA aims to measure how well equipped students are for the future
by emphasizing items that have a real-world context. (See sidebar
"Sample Mathematics and Science Items From PISA.")
In both mathematics and science literacy, U.S. student performance
did not differ from the average performance of students in the other
OECD countries (appendix tables 1-6 and 1-7).
Of the seven countries that had significantly higher average science
scores, all also had higher average mathematics scores (Australia,
Canada, Finland, Japan, New Zealand, South Korea, and the United
Kingdom). In addition, Switzerland significantly outperformed the
United States in mathematics. A common set of six countries had
average scores significantly lower than the United States in both
mathematics and science: Brazil, Greece, Latvia, Luxembourg, Mexico,
and Portugal.
Subgroup Differences in Mathematics and Science
Literacy
A recent report released by the U.S. Department of Education (NCES
2001d) considers PISA score differences by sex, parents' education,
parents' occupation, parents' national origin, and language spoken
in the home. Findings reveal no statistically significant sex difference
among U.S. 15-year-olds in mathematics. This was also true for 16
other countries that participated in PISA; however, males outperformed
females in mathematics in 14 countries. In science literacy, male
and female students in the United States, as in most other nations,
performed equally well. This absence of sex differences in mathematics
and science literacy in the United States is generally consistent
with findings from the NAEP, TIMSS, and TIMSS-R assessments, all
of which assess more curriculum- and school-based achievement.
PISA also collected information on parents' education levels and
occupation, both of which have been linked to student achievement
(Coleman et al. 1966; NCES
2000b and 2001c; West,
Denton, and Reaney 2000; and Williams
et al. 2000). PISA data indicate that parents' education level
and occupation are more strongly associated with mathematics and
science literacy in the United States than in some other countries,
although links between parents' education level and student achievement
existed in all PISA countries (NCES
2001d). For example, in every country, students whose parents
have college degrees outperformed students whose parents did not
have a high school diploma. However, in only 12 of 29 countries,
including the United States, did students whose parents graduated from
college score higher in science literacy than students whose parents
completed high school but not college. In the remaining countries,
science performance did not differ between the subgroups of students
with these two levels of parental education. A stronger association
between parents' occupation and student mathematics and science
literacy existed in the United States compared with some other PISA
countries. In Finland, Iceland, Japan, Latvia, and South Korea,
the relationship between parents' occupation and mathematics and
science literacy was smaller than it is in the United States; for mathematics,
the relationship was also smaller in Canada and Italy. No country
had a stronger relationship than the United States between parents'
occupation and student performance on PISA's mathematics and science
portions.
Students who are foreign born or who have foreign-born parents
face challenges in adjusting to a new country and a new school system.
According to PISA data, approximately 13 percent of U.S. students
have parents who were both born outside the United States. In about
half of the participating countries that reported these data (15
of 26), including the United States, students whose parents were
both native-born scored significantly higher in mathematics. In
the United States, no difference in science literacy by parent nativity
existed, although differences did exist in 17 of 26 participating
countries.
U.S. schools educate many students who speak a language other than
English at home. In 19 of the 28 nations that reported data on students'
home language, including the United States, students who spoke the
language of the assessment at home scored better in mathematics
literacy than students who did not. U.S. students registered a greater
difference in mathematics performance by home language than the
average OECD difference. In science, in 21 of 28 participating nations,
including the United States, students who spoke the language of
the assessment at home scored better than those who did not. Many
PISA items impose a fairly high reading (and sometimes writing)
load, which contributes to home language effects.
