
Chapter 1. Elementary and Secondary Mathematics and Science Education

Student Learning in Mathematics and Science

Increasing overall student achievement, especially lifting the performance of low achievers, is an essential goal of education reform in the United States. Reform efforts center on improving student learning in mathematics and science because these fields are widely regarded as critical to the nation’s economy (Atkinson and Mayo 2010; President’s Council of Advisors on Science and Technology 2012). This section presents indicators of U.S. student performance in mathematics and science, beginning with a snapshot of the mathematics and science test scores of a recent cohort of U.S. kindergartners. It then presents long-term trends in the mathematics and science performance of U.S. fourth and eighth graders,[4] examining more than two decades of changes in overall performance and in gaps between different groups. The section ends by placing U.S. student performance in an international context, comparing U.S. fourth and eighth graders’ mathematics and science test scores with those of their peers in other nations.

Mathematics and Science Performance During the Kindergarten Year

The Early Childhood Longitudinal Study, Kindergarten Class of 2010–11 (ECLS-K:2011) is a nationally representative, longitudinal study of children’s development, early learning, and school progress (Mulligan, Hastedt, and McCarroll 2012). The study began with approximately 18,200 children in kindergarten in fall 2010 and will follow and test them every year until spring 2016, when most of them are expected to be in fifth grade. The study gathers information from many sources, including the students themselves, their families, teachers, schools, and before- and after-school care providers. These data provide a wealth of information on children’s cognitive, social, emotional, and physical development; family and neighborhood environments; school conditions; and before- and after-school care. The longitudinal study design will enable research on how various family, school, community, and individual factors are associated with school performance over time. At the time this chapter was prepared, only data from the initial year of the study were available for analysis. This section, therefore, presents descriptive information on children when they enter school and their initial mathematics and science assessment results (mathematics and science assessment scores cannot be compared directly because scales are developed independently for each subject). This information will serve as a baseline for measuring students’ progress on future assessments as they advance through elementary school. Findings from these assessments will be presented in future editions of Science and Engineering Indicators.

Demographic Profile of U.S. First-Time Kindergartners. In fall 2010, about 3.5 million U.S. children entered kindergarten for the first time (Mulligan, Hastedt, and McCarroll 2012). Students in this cohort came from diverse backgrounds: about two-fifths of kindergartners (38%) had at least one parent with a bachelor’s degree or higher, 32% had parents who attended some college but did not earn a bachelor’s degree, and 29% had parents with no more than a high school education (appendix table 1-1). About one-quarter of children were living in families with incomes below the federal poverty level (25%) or in single-parent households (22%). Fifteen percent of students came from families where the primary language used at home was not English. Nearly half (47%) were racial and ethnic minorities, with Hispanics being the largest minority group (24%), followed by blacks (13%) and Asians (4%).[5] The following analysis examines the size and direction of achievement differences among different groups at the outset of formal schooling.

Mathematics Performance.[6] Even as early as kindergarten, large gaps in mathematical understanding already exist among different subpopulations. Initial mathematics assessment scores varied by parental education level; for example, children whose parents had less than a high school education scored 15 points (on a scale of 0–75) below their peers whose parents attended a graduate or professional school (figure 1-1). Students from homes with a primary language other than English earned an average of 24 points on the initial mathematics test, compared with 30 points earned by those with a primary home language of English. Students from families with incomes below the federal poverty level scored 9 points below their peers from families with incomes at or above 200% of the federal poverty level. Those from single-parent households also did not perform as well as those from two-parent households (26 versus 31 points). The gaps were further evident among different racial and ethnic groups: black and Hispanic students lagged behind Asian students by 9 to 10 points and white students by 6 to 7 points.

By spring 2011, the overall average mathematics score of kindergartners had increased by 13 points, from 29 to 42, on the 0–75 scale (figure 1-1). All groups gained 12–13 points from fall 2010 to spring 2011. Although the performance gaps did not widen during this period, students’ initial exposure to formal schooling did not help narrow these gaps either.
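The observation that gaps neither widened nor narrowed follows from simple arithmetic: when every group gains about the same number of points, the difference between groups barely moves. The sketch below is illustrative only. It uses the rounded fall scores and the 12–13 point range of gains quoted above, and the function name is a hypothetical helper rather than part of any ECLS-K:2011 tool. Because the inputs are rounded, the bounds are approximate.

```python
def spring_gap_bounds(fall_high, fall_low, gain_min=12, gain_max=13):
    """Bound the spring gap between two groups, given their fall scores
    and the range of fall-to-spring gains reported for all groups.

    spring gap = fall gap + (gain of higher group - gain of lower group),
    and that difference in gains lies within +/- (gain_max - gain_min).
    """
    fall_gap = fall_high - fall_low
    spread = gain_max - gain_min
    return fall_gap - spread, fall_gap + spread

# Fall 2010 averages quoted in the text (0-75 scale).
comparisons = {
    "two-parent vs. single-parent household": (31, 26),
    "English vs. other primary home language": (30, 24),
}

for label, (high, low) in comparisons.items():
    lo_bound, hi_bound = spring_gap_bounds(high, low)
    print(f"{label}: fall gap {high - low}, "
          f"spring gap between {lo_bound} and {hi_bound}")
```

Under these assumptions, no gap can shift by more than about 1 point over the kindergarten year, which is consistent with the statement that the gaps persisted.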

Science Performance. Overall, kindergartners earned an average of 11 points (on a scale of 0–20) on their initial science assessment administered several months after the beginning of the school year (appendix table 1-1). As in mathematics, variations in science performance among kindergartners with different characteristics were evident at this early stage of schooling, and the pattern of variations was largely similar. For example, science assessment scores increased with parental education level, with children whose parents had less than a high school education scoring 4 points below their peers whose parents attended a graduate or professional school (9 versus 13 points). Kindergartners from homes with a primary home language other than English earned an average of 9 points on the initial science assessment, compared with 12 points earned by those with a primary home language of English. Those from households with incomes below the federal poverty level also had lower scores than their peers from households with incomes at or above 200% of the federal poverty level (10 versus 13 points). Among all racial and ethnic groups, white children earned the highest average score (12 points), followed by American Indian or Alaska Native and Asian children (about 11 points for both groups); black and Hispanic children earned the lowest average score (about 10 points for both groups).

Large gaps in student performance at the beginning of formal schooling suggest that nonschool factors play a big role in these disparities. Although a body of research has attempted to identify various factors underlying students’ achievement gaps, efforts have mostly focused on school-related factors such as teacher quality, available resources, principal leadership, and school climate, or such nonschool factors as sex, race and ethnicity, and family socioeconomic status (SES) (Coleman et al. 1966; Corcoran and Evans 2008; Fryer and Levitt 2004; Greenwald, Hedges, and Laine 1996; Hanushek and Rivkin 2006; Lamb and Fullarton 2002; Leonidas et al. 2010; OECD 2005; Rivkin, Hanushek, and Kain 2005). Researchers are now turning their attention to a broader range of nonschool factors beyond students’ demographic and socioeconomic backgrounds, and probing deeper into their roles in student achievement (Henig and Reville 2011) (see sidebar, “The Role of Nonschool Factors in Student Learning”).

Mathematics and Science Performance in Grades 4 and 8

The National Assessment of Educational Progress (NAEP), a congressionally mandated study, has monitored changes in U.S. students’ academic performance in mathematics, science, and other subjects since 1969 (NCES 2011a, 2012). NAEP has two assessment programs: the main NAEP and the NAEP Long-Term Trend (LTT).[7] The main NAEP assesses national samples of fourth and eighth graders at regular intervals, and twelfth graders on an occasional basis. These assessments are updated periodically to reflect changes in curriculum standards. The NAEP LTT assesses the performance of students ages 9, 13, and 17. Its content framework has remained the same since it was first administered in 1969 in science and in 1973 in mathematics, permitting analyses of trends over more than three decades. This section examines recent performance results using the main NAEP data only. The most recent available findings based on NAEP LTT data have been reported in previous editions of Science and Engineering Indicators.[8]

Reporting Results for the Main NAEP

The main NAEP reports student performance in two ways: scale scores and achievement levels. Scale scores use a continuous scale to measure student learning. For mathematics assessments, scales range from 0 to 500 for grades 4 and 8 and from 0 to 300 for grade 12. For science assessments, scales range from 0 to 300 for all grades. Scores cannot be compared across subjects because NAEP scales are developed independently for each subject.

In addition to scale scores, NAEP reports student results in terms of achievement levels. Developed by the National Assessment Governing Board (NAGB), achievement levels are intended to measure the extent to which students’ actual achievement matches the achievement expected of them. Based on recommendations from panels of educators, policymakers, and the general public, NAGB sets three achievement levels for mathematics (NAGB 2010a), science (NAGB 2010b), and other subjects assessed by NAEP:

  • Basic denotes partial mastery of materials appropriate for the grade level.
  • Proficient indicates solid academic performance.
  • Advanced represents superior academic performance.

Based on their test scores, students’ performance can be categorized as below basic, basic, proficient, and advanced.[9] Achievement levels cannot be compared across grade levels because they were developed independently at each grade level.[10] Although the NAEP achievement levels can be helpful in understanding and interpreting student results and have been widely used by national and state officials, there is ongoing disagreement about whether they are appropriately defined (Harvey 2011). A study commissioned by the National Academy of Sciences judged the NAEP achievement levels to be “fundamentally flawed” (Pellegrino, Jones, and Mitchell 1999). In addition, the National Mathematics Advisory Panel concluded that NAEP scores for the two highest achievement categories (proficient and advanced) were set too high (NMAP 2008). Because of criticisms like these, NCES has recommended that achievement levels be used on a trial basis and interpreted with caution (NCES 2011a, 2012). The following review of NAEP results reports both average scale scores and the percentage of students performing at or above the proficient level.
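The categorization described above (and spelled out in note 9) amounts to comparing a scale score against three ascending minimum scores. A minimal Python sketch of that rule follows. The function name is hypothetical, and the cut scores are placeholders chosen purely for illustration; they are not NAGB's actual cut points, which differ by subject and grade.

```python
from bisect import bisect_right

LEVELS = ("below basic", "basic", "proficient", "advanced")

def achievement_level(score, cuts):
    """Return the achievement category for a scale score.

    `cuts` holds the minimum scores for the basic, proficient, and
    advanced levels, in ascending order. A score at or above a minimum
    falls in that level; a score below the basic minimum is below basic.
    """
    if len(cuts) != 3 or list(cuts) != sorted(cuts):
        raise ValueError("cuts must be three ascending minimum scores")
    # bisect_right counts how many minimums the score meets or exceeds.
    return LEVELS[bisect_right(cuts, score)]

# Hypothetical cut scores, for illustration only (not NAGB values).
EXAMPLE_CUTS = (200, 250, 300)

for s in (199, 200, 249, 250, 300):
    print(s, achievement_level(s, EXAMPLE_CUTS))
```

The share of students "at or above proficient" reported in this section corresponds to the last two categories combined.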

Mathematics Performance from 1990 to 2011

Average Score. The average mathematics score of U.S. fourth graders increased by 27 points from 1990 to 2007, leveled off between 2007 and 2009, and then rose by 1 point from 2009 to 2011 (figure 1-2). This overall trend was reflected in almost all demographic groups,[11] across students at all performance levels (i.e., 10th to 90th percentiles[12]), and among students at both public and private schools. For example, from 1990 to 2007, the fourth grade average mathematics score increased substantially—by 28 points for white students, 34 points for black students, 27 points for Hispanic students, and 28 points for Asian or Pacific Islander students (appendix table 1-2). Average scores for these racial and ethnic groups remained unchanged between 2007 and 2009 and then increased by 1 or 2 points from 2009 to 2011.

Among U.S. eighth graders, the average mathematics score increased continuously from 1990 to 2011, with a total gain of 21 points over the period (figure 1-2). Although the scores of all demographic groups have improved substantially since 1990, not all groups have experienced this upward trend in recent years. For example, the average mathematics scores for male students, whites, Asians or Pacific Islanders, American Indians or Alaska Natives, and those attending private schools remained unchanged between 2009 and 2011 (appendix table 1-2). Groups that experienced score gains during this period included black female students (whose scores increased by 2 points), Hispanic male and female students (by 3 and 5 points, respectively), and low- and high-income students (by 2 and 3 points, respectively).[13]

Achievement Level. Trends in the percentages of fourth and eighth graders reaching the proficient level parallel the scale score trends (figure 1-3). The percentage of fourth graders performing at or above the proficient level increased steadily through 2007 and essentially leveled off from 2009 to 2011. Eighth graders overall showed continuous improvement from 1990 to 2011, though the improvement did not persist for some groups during recent years (appendix table 1-3). Furthermore, despite overall upward trends, the actual percentage of students reaching the proficient level in mathematics remained well below half—in 2011, 40% of fourth graders and 35% of eighth graders performed at or above this level.

Science Performance from 2009 to 2011

In 2009, the framework for the main NAEP science assessment was significantly changed to reflect advances in science, curriculum standards, assessments, and research on science learning (NAGB 2010b). Because of these modifications, the results from the 2009 and 2011 assessments cannot be compared with those from the earlier assessments. Whereas the 2009 assessment included students in grades 4, 8, and 12, the 2011 assessment targeted students only in grade 8. This section, therefore, discusses the 2009 and 2011 assessment results for students in grade 8 only.[14]

Average Score. The average science score of eighth graders increased from 150 in 2009 to 152 in 2011 (figure 1-4).[15] With a few exceptions (Asian or Pacific Islander students, high-performing students [at the 90th percentile], and private school students), most demographic groups improved their science scores during this period, with score gains ranging from 1 point for female students and white students to 3 points for black students, 4 points for low-income students, and 5 points for Hispanic students (appendix table 1-4).

Achievement Level. As with scale scores, the percentage of eighth graders performing at or above the proficient level in science increased slightly, from 30% in 2009 to 32% in 2011 (appendix table 1-5). Despite this improvement, the majority of students performed below the proficient level on the science assessment in both years. In 2011, for example, 68% of eighth graders failed to reach the proficient level in science. The percentage who scored below this level was especially high among black and Hispanic students (90% and 84%, respectively), particularly among female students in both groups (91% and 87%, respectively).

Changes in Performance Gaps in Mathematics and Science

Most performance gaps that existed in earlier years persisted in 2011, although none of these gaps have widened since 1990 (appendix tables 1-2 and 1-4). Overall, sex differences were small, with male students performing slightly better than female students in mathematics and science. Differences between male and female students, however, were not consistent across racial and ethnic groups. Although eighth grade white male students in 2011 had higher mathematics scores than their female counterparts (295 versus 292), similar sex differences were not observed among Hispanic, Asian or Pacific Islander, and American Indian or Alaska Native students (figure 1-5). Among black eighth graders, the gap was reversed: female students performed slightly better than male students (264 versus 261).

Large performance gaps existed among other groups. For both mathematics and science at grades 4 and 8, white and Asian or Pacific Islander students performed better than their black, Hispanic, or American Indian or Alaska Native counterparts (appendix tables 1-2 and 1-4). Students from higher-income families also had higher scores in mathematics and science than those from lower-income families. Gaps were observed by school type as well, with private school students scoring higher than public school students.[16]

Some gaps in mathematics and science scores have narrowed over time (table 1-2). In mathematics, gap reductions occurred among fourth grade students but not among eighth grade students. Specifically, the 32-point white-black gap in mathematics performance among fourth grade students decreased to 25 points between 1990 and 2011 because of larger gains by black students (figure 1-6). The reduction in the white-black gap occurred among both male and female fourth graders (table 1-2; appendix table 1-2). Further, the fourth graders’ score at the 10th percentile rose more than did the score at the 90th percentile, reducing the gap between low- and high-performing students from 82 to 73 points between 1990 and 2011. None of these gap reductions was observed among eighth grade students, however.

In science, the eighth graders’ average score increased more for black students (3 points) and Hispanic students (5 points) than for white students (1 point) between 2009 and 2011, narrowing the white-black gap (especially among male students) and the white-Hispanic gap (among both male and female students) (table 1-2; appendix table 1-4). Finally, the eighth graders’ science score at the 10th percentile rose faster than that at the 90th percentile, reducing the gap between low- and high-performing students from 89 to 87 points.
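The gap figures in the preceding paragraphs are differences between group scores, and a narrowing is the change in that difference over time. The sketch below recomputes the changes from the numbers quoted in the text. The helper name is an assumption for illustration, and the last two lines rely on the rounded gains as reported, so gap changes derived from unrounded scores could differ slightly.

```python
def gap_change(gap_start, gap_end):
    """Return (change in points, percent change) for a score gap."""
    change = gap_end - gap_start
    return change, 100.0 * change / gap_start

# Gaps quoted in the text, in scale-score points: (earlier, later).
gaps = {
    "grade 4 math, white-black, 1990 to 2011": (32, 25),
    "grade 4 math, 90th-10th percentile, 1990 to 2011": (82, 73),
    "grade 8 science, 90th-10th percentile, 2009 to 2011": (89, 87),
}

for label, (start, end) in gaps.items():
    change, pct = gap_change(start, end)
    print(f"{label}: {start} -> {end} ({change:+d} points, {pct:+.1f}%)")

# When the lower-scoring group gains more, the gap shrinks by the
# difference in gains (grade 8 science, 2009 to 2011, rounded gains).
white_gain, black_gain, hispanic_gain = 1, 3, 5
print("white-black gap change:", white_gain - black_gain)
print("white-Hispanic gap change:", white_gain - hispanic_gain)
```

Expressing each change relative to the starting gap makes the reductions comparable across scales, since the mathematics and science scales are developed independently.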

International Comparisons of Mathematics and Science Performance

Two international assessments, the Trends in International Mathematics and Science Study (TIMSS) and the Program for International Student Assessment (PISA), compare U.S. students’ achievement in mathematics and science with that of students in other countries. These two assessments differ in several fundamental ways, including the purpose of the study, age of the students tested, test content, and the number of participating nations.[17] Targeting students in grades 4 and 8 regardless of their age, the TIMSS tests focus on students’ application of skills and knowledge to tasks akin to those encountered in school. The PISA tests, in contrast, assess the abilities of 15-year-olds to apply mathematics and science skills and information to solve real problems they may face at work or in daily life. This section compares the mathematics and science performance of U.S. students with that of their counterparts in other countries using assessment data from the latest administration of TIMSS (2011). No new data from PISA were available for this volume. The most recent PISA results showed that U.S. 15-year-olds did not perform as well as their peers in many developed countries. In 2009, the U.S. average score ranked 18th in mathematics and 13th in science out of 34 Organisation for Economic Co-operation and Development (OECD) nations participating in the assessment.[18]

First conducted in 1995, TIMSS assesses the mathematics and science performance of fourth and eighth graders every 4 years. TIMSS has been administered five times, most recently in 2011. Over 20,000 students in more than 1,000 schools across the United States took the assessment in spring 2011, joining almost 500,000 other students from 62 countries and jurisdictions (Provasnik et al. 2012).

TIMSS is designed to test students’ knowledge of specific mathematics and science topics that are closely tied to the curricula of the participating education systems (Mullis et al. 2009). The assessment framework includes two dimensions: a content domain for the subject matter to be assessed within mathematics and science and a cognitive domain for the skills (e.g., knowing, applying, and reasoning) expected of students as they learn the mathematics or science content. Specifically, the content domain for fourth and eighth grade mathematics and science in TIMSS 2011 includes the following topics (see sidebar, “TIMSS 2011 Sample Items”):


Mathematics
  • Number, Geometric Shapes and Measures, Data Display (Grade 4)
  • Number, Algebra, Geometry, Data and Chance (Grade 8)


Science
  • Life Science, Physical Science, Earth Science (Grade 4)
  • Biology, Chemistry, Physics, Earth Science (Grade 8)

Within each topic in the content domain, students are assessed on several skills, including their knowledge of facts, concepts, and procedures; application of those facts, concepts, and procedures to solve problems; and reasoning (i.e., solving unfamiliar, complex, or multistep problems). Although the content differs for fourth and eighth graders, reflecting the nature and difficulty of the mathematics and science taught at each grade, the cognitive domain is the same for both grade levels and subjects. A more detailed discussion of the framework for the TIMSS 2011 mathematics and science assessments can be found at

Mathematics Performance of U.S. Students in Grades 4 and 8 on TIMSS

Performance on the 2011 TIMSS Mathematics Tests. The U.S. average score on the 2011 TIMSS mathematics assessment was 541 at grade 4 and 509 at grade 8 (figure 1-7). Both scores were higher than the international TIMSS average, which is set to 500 at both grades.[19] Among 50 countries/jurisdictions that participated in the 2011 TIMSS mathematics assessment at grade 4, the U.S. average mathematics score was among the top 13 (seven scored higher; five did not differ), outperforming 37 countries/jurisdictions (appendix table 1-6).[20] The top scorers—Singapore, Republic of Korea, and Hong Kong (China)—each had average scores above 600.
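Because the TIMSS scale average is set at 500 with a standard deviation of 100 (note 19), a score's distance from the scale average can be read roughly in standard deviation units. The sketch below does that conversion for the scores quoted above. The function and constant names are assumptions for illustration. Treating 100 points as one standard deviation is also a simplification, because the spread of scores within any single country or assessment year need not equal the scale's reference value.

```python
TIMSS_SCALE_AVERAGE = 500  # scale average, per note 19
TIMSS_SCALE_SD = 100       # scale standard deviation, per note 19

def sd_units(score, reference=TIMSS_SCALE_AVERAGE, sd=TIMSS_SCALE_SD):
    """Express (score - reference) in units of the scale standard deviation."""
    return (score - reference) / sd

# 2011 U.S. mathematics averages quoted in the text.
us_scores = {"grade 4": 541, "grade 8": 509}

for grade, score in us_scores.items():
    print(f"U.S. {grade}: {score} is {sd_units(score):+.2f} SD "
          f"from the scale average")

# The top grade 4 scorers each averaged above 600, that is, more than
# one scale standard deviation above the scale average.
print(f"600 is {sd_units(600):+.2f} SD from the scale average")
```

Read this way, the U.S. fourth grade average sits about four-tenths of a scale standard deviation above the TIMSS scale average, and the eighth grade average about one-tenth.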

At grade 8, the U.S. average mathematics score was below the scores of six countries/jurisdictions, not different from the scores of seven, and higher than those of 28, placing the United States among the top 14 in eighth grade mathematics. The average scores of students in the Republic of Korea, Singapore, and Taipei[21] (the top three leaders) were at least 100 points higher than the average score of U.S. eighth graders (609–613 versus 509).
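The "among the top N" placements in the two preceding paragraphs can be reproduced from the counts given. The sketch below assumes that the top group consists of the systems that scored higher, the systems whose scores were not measurably different, and the United States itself. That reading is inferred from the numbers in the text rather than stated there, and the function names are hypothetical.

```python
def top_group_size(scored_higher, not_different):
    """Size of the group described as 'among the top N'.

    Counts systems that scored higher, systems whose scores were not
    measurably different, and the United States itself.
    """
    return scored_higher + not_different + 1

def systems_outperformed(total, scored_higher, not_different):
    """Number of participating systems that scored lower."""
    return total - top_group_size(scored_higher, not_different)

# Grade 4 mathematics: 50 participants, 7 higher, 5 not different.
print(top_group_size(7, 5))            # 13
print(systems_outperformed(50, 7, 5))  # 37

# Grade 8 mathematics: 6 higher, 7 not different, 28 lower.
print(top_group_size(6, 7))            # 14
print(top_group_size(6, 7) + 28)       # 42 systems implied in total
```

The grade 8 total of 42 is implied by the counts and is not stated directly in the text. Grouping statistically indistinguishable systems with the United States avoids treating small score differences as meaningful rank differences.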

Performance Trends. Over the 16 years since the first TIMSS mathematics administration in 1995, U.S. fourth and eighth graders raised both their scores and their international ranking.[22] At grade 4, the average mathematics score of 541 in 2011 was 23 points higher than the score of 518 in 1995 (figure 1-8). The U.S. position relative to other nations also improved over this period. Among the 17 countries that participated in both the 1995 and 2011 TIMSS mathematics assessments of fourth graders, 7 outscored the United States in 1995 compared with 4 in 2011 (Provasnik et al. 2012).

At grade 8, the U.S. average score of 509 in 2011 reflected a 17-point increase over the 1995 score (492) (figure 1-8). The relative standing of U.S. eighth graders’ mathematics performance has also improved over this time period: among the 16 countries that participated in both the 1995 and 2011 TIMSS mathematics assessments of eighth graders, 5 outperformed the United States in 2011, down from 8 in 1995 (Provasnik et al. 2012).

Science Performance of U.S. Students in Grades 4 and 8 on TIMSS

Performance on the 2011 TIMSS Science Tests. In 2011, the average science scores of both U.S. fourth and eighth grade students (544 and 525, respectively) were higher than the international TIMSS scale average (500) (figure 1-9). At grade 4, the United States was among the top seven countries/jurisdictions, outperforming 43 among a total of 50 participants (appendix table 1-7). Students in Republic of Korea, Singapore, Finland, Japan, Russian Federation, and Taipei outscored students in the United States (552–587 versus 544). At grade 8, the U.S. average science score of 525 was lower than those of 8 countries/jurisdictions, higher than those of 29, and not measurably different from those of the remaining 4.

Performance Trends. In contrast to the mathematics trends, which showed significant improvement in both grades, the average scores of U.S. students on the TIMSS science assessment have remained flat since 1995 for fourth graders and improved 12 points for eighth graders (figure 1-8). U.S. fourth and eighth graders have not improved their international position. Among 17 countries and jurisdictions that participated in both the 1995 and 2011 fourth grade TIMSS science assessments, 3 outscored the United States in 2011 compared with 2 in 1995; at grade 8, the number scoring higher than the United States was 6 in both years (Provasnik et al. 2012).

[4] No new assessment data on high school students were available at the time this chapter was prepared. The 2012 volume of Science and Engineering Indicators (NSB 2012) contains recent trend data on mathematics and science performance of students in grade 12.
[5] Asians and Pacific Islanders are combined into one category in some indicators for which the data were not collected separately for the two groups.
[6] Mathematics assessments were administered in fall 2010 and spring 2011. These assessments were designed to measure students’ conceptual knowledge, procedural knowledge, and problem-solving skills and included questions on number sense, properties, and operations; measurement; geometry and spatial sense; data analysis, statistics, and probability; and pre-algebra skills (Mulligan, Hastedt, and McCarroll 2012). Although the assessments included largely items related to students’ knowledge at the kindergarten level, easier and more difficult items were included to measure the achievement of students performing below or above grade level. Some students who spoke a language other than English or Spanish at home did not participate in mathematics assessments because of low English proficiency. Because the ECLS-K:2011 is a longitudinal study, the assessments were developed to measure the growth in performance of children from kindergarten entry through fifth grade.
[7] These two NAEP assessment programs differ in many respects, including samples of students and assessment times, instruments, and contents. See
[8] The 2010 volume reviewed long-term trends in mathematics from 1973 to 2008, and the 2004 volume examined trends in science from 1969 to 1999. The long-term trend assessments in mathematics were administered again in 2012 and are not yet available; no long-term trend assessments in science have been conducted since 1999.
[9] Students in the below-basic category have scores that are lower than the minimum score for the basic level. Students in the basic category have scores that are at or above the minimum score for the basic level but lower than the minimum score for the proficient level. Students in the proficient category have scores that are at or above the minimum score for the proficient level but lower than the minimum score for the advanced level. Students in the advanced category have scores that are at or above the minimum score for the advanced level.
[10] See NAEP’s mathematics and science achievement levels defined by grade at and
[11] Estimates for long-term trends could not be performed for American Indian or Alaska Native students because of unavailable data in the 1990s.
[12] Percentiles are scores below which the scores of a specified percentage of the population fall. For example, among fourth graders in 2011, the 10th percentile score for mathematics was 203. This means that 10% of fourth graders had mathematics scores at or below 203, and 90% scored above 203. The scores at various percentiles indicate students’ performance levels.
[13] Students’ eligibility for free/reduced-price lunch is often used as a proxy measure of family poverty. In this chapter, students who are eligible for free/reduced-price lunch are considered to come from low-income families.
[14] For fourth and twelfth graders’ science assessment results in 2009, see Science and Engineering Indicators 2012 (NSB 2012:1-13). For results from administration years prior to 2009, see Science and Engineering Indicators 2008 (NSB 2008:1-13–1-14).
[15] The substantive implication of this small increase will be clearer when more assessment data are available for analysis in the future.
[16] Differences in performance between public and private school students reflect in part different types of students enrolled in public and private schools and differences in the availability of resources, admissions policies, level of parental involvement, and school conditions.
[17] For detailed comparisons between PISA and TIMSS, see Science and Engineering Indicators 2010 (NSB 2010:1-16).
[18] For more information about the PISA results, see Science and Engineering Indicators 2012 (NSB 2012:1-14–1-16).
[19] The scores are reported on a scale from 0 to 1,000, with the TIMSS scale average set at 500 and the standard deviation set at 100.
[20] The TIMSS results presented in this report exclude individual U.S. states, Canadian provinces, and Dubai and Abu Dhabi in the United Arab Emirates. These states/provinces participated in 2011 TIMSS as “benchmarking participants” in order to assess the comparative international standing of their students’ achievement and to view their curriculum and instruction in an international context.
[21] Taipei is the capital city of Taiwan.
[22] The TIMSS scale for each subject and grade originally was established to have a mean of 500 as the average of all of the countries and jurisdictions that participated in TIMSS 1995. TIMSS assessments since then have scaled the achievement data so that scores are comparable from assessment to assessment. Thus, for example, a score of 500 in fourth grade mathematics in 2011 is equivalent to a score of 500 in fourth grade mathematics in 1995, 1999, 2003, or 2007.