- Mathematics and Science Performance During the Kindergarten Year
- Mathematics and Science Performance in Grades 4 and 8
- International Comparisons of Mathematics and Science Performance

Increasing overall student achievement, especially lifting the performance of low achievers, is an essential goal of education reform in the United States. Reform efforts center on improving student learning in mathematics and science because these fields are widely regarded as critical to the nation’s economy (Atkinson and Mayo 2010; President’s Council of Advisors on Science and Technology 2012). This section presents indicators of U.S. student performance in mathematics and science, beginning with a snapshot of the mathematics and science test scores of a recent cohort of U.S. kindergartners. It then presents long-term trends in the mathematics and science performance of U.S. fourth and eighth graders,^{[4]} examining more than two decades of changes in overall performance and in gaps between different groups. The section ends by placing U.S. student performance in an international context, comparing U.S. fourth and eighth graders’ mathematics and science test scores with those of their peers in other nations.

The Early Childhood Longitudinal Study, Kindergarten Class of 2010–11 (ECLS-K:2011) is a nationally representative, longitudinal study of children’s development, early learning, and school progress (Mulligan, Hastedt, and McCarroll 2012). The study began with approximately 18,200 children in kindergarten in fall 2010 and will follow and test them every year until spring 2016, when most of them are expected to be in fifth grade. The study gathers information from many sources, including the students themselves, their families, teachers, schools, and before- and after-school care providers. These data provide a wealth of information on children’s cognitive, social, emotional, and physical development; family and neighborhood environments; school conditions; and before- and after-school care. The longitudinal study design will enable research on how various family, school, community, and individual factors are associated with school performance over time. At the time this chapter was prepared, only data from the initial year of the study were available for analysis. This section, therefore, presents descriptive information on children when they enter school and their initial mathematics and science assessment results (mathematics and science assessment scores cannot be compared directly because scales are developed independently for each subject). This information will serve as a baseline for measuring students’ progress on future assessments as they advance through elementary school. Findings from these assessments will be presented in future editions of *Science and Engineering Indicators*.

**Demographic Profile of U.S. First-Time Kindergartners.** In fall 2010, about 3.5 million U.S. children entered kindergarten for the first time (Mulligan, Hastedt, and McCarroll 2012). Students in this cohort came from diverse backgrounds: about two-fifths of kindergartners (38%) had at least one parent with a bachelor’s degree or higher, 32% had parents who attended some college but did not earn a bachelor’s degree, and 29% had parents with no more than a high school education (appendix table ^{[5]} The following analysis examines the size and direction of achievement differences among different groups at the outset of formal schooling.

**Mathematics Performance.**^{[6]} Even as early as kindergarten, large gaps in mathematical understanding already exist among different subpopulations. Initial mathematics assessment scores varied by parental education level; for example, children whose parents had less than a high school education scored 15 points (on a scale of 0–75) below their peers whose parents attended a graduate or professional school (figure

By spring 2011, the overall average mathematics score of kindergartners had increased by 13 points, from 29 to 42, on the 0–75 scale (figure

**Science Performance.** Overall, kindergartners earned an average of 11 points (on a scale of 0–20) on their initial science assessment administered several months after the beginning of the school year (appendix table

Large gaps in student performance at the beginning of formal schooling suggest that nonschool factors play a big role in these disparities. Although a body of research has attempted to identify various factors underlying students’ achievement gaps, efforts have mostly focused on school-related factors such as teacher quality, available resources, principal leadership, and school climate, or such nonschool factors as sex, race and ethnicity, and family socioeconomic status (SES) (Coleman et al. 1966; Corcoran and Evans 2008; Fryer and Levitt 2004; Greenwald, Hedges, and Laine 1996; Hanushek and Rivkin 2006; Lamb and Fullarton 2002; Leonidas et al. 2010; OECD 2005; Rivkin, Hanushek, and Kain 2005). Researchers are now turning their attention to a broader range of nonschool factors beyond students’ demographic and socioeconomic backgrounds, and probing deeper into their roles in student achievement (Henig and Reville 2011) (see sidebar, “The Role of Nonschool Factors in Student Learning”).

The National Assessment of Educational Progress (NAEP), a congressionally mandated study, has monitored changes in U.S. students’ academic performance in mathematics, science, and other subjects since 1969 (NCES 2011a, 2012). NAEP has two assessment programs: the main NAEP and the NAEP Long-Term Trend (LTT).^{[7]} The main NAEP assesses national samples of fourth and eighth graders at regular intervals, and twelfth graders on an occasional basis. These assessments are updated periodically to reflect changes in curriculum standards. The NAEP LTT assesses the performance of students ages 9, 13, and 17. Its content framework has remained the same since it was first administered in 1969 in science and in 1973 in mathematics, permitting analyses of trends over more than three decades. This section examines recent performance results using the main NAEP data only. The most recent available findings based on NAEP LTT data have been reported in previous editions of *Science and Engineering Indicators*.^{[8]}

The main NAEP reports student performance in two ways: scale scores and achievement levels. Scale scores use a continuous scale to measure student learning. For mathematics assessments, scales range from 0 to 500 for grades 4 and 8 and from 0 to 300 for grade 12. For science assessments, scales range from 0 to 300 for all grades. Scores cannot be compared across subjects because NAEP scales are developed independently for each subject.

In addition to scale scores, NAEP reports student results in terms of achievement levels. Developed by the National Assessment Governing Board (NAGB), achievement levels are intended to measure the extent to which students’ actual achievement matches the achievement expected of them. Based on recommendations from panels of educators, policymakers, and the general public, NAGB sets three achievement levels for mathematics (NAGB 2010a), science (NAGB 2010b), and other subjects assessed by NAEP:

- *Basic* denotes partial mastery of materials appropriate for the grade level.
- *Proficient* indicates solid academic performance.
- *Advanced* represents superior academic performance.

Based on their test scores, students’ performance can be categorized as *below basic, basic, proficient,* and *advanced*.^{[9]} Achievement levels cannot be compared across grade levels because they were developed independently at each grade level.^{[10]} Although the NAEP achievement levels can be helpful in understanding and interpreting student results and have been widely used by national and state officials, there is ongoing disagreement about whether they are appropriately defined (Harvey 2011). A study commissioned by the National Academy of Sciences judged the NAEP achievement levels to be “fundamentally flawed” (Pellegrino, Jones, and Mitchell 1999). In addition, the National Mathematics Advisory Panel concluded that NAEP scores for the two highest achievement categories (proficient and advanced) were set too high (NMAP 2008). Because of criticisms like these, NCES has recommended that achievement levels be used on a trial basis and interpreted with caution (NCES 2011a, 2012). The following review of NAEP results reports both average scale scores and the percentage of students performing at or above the proficient level.
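As a rough illustration of how a scale score maps onto the four reporting categories, and of how the "percentage at or above proficient" statistic is formed, consider the minimal sketch below. The cut-score values are placeholders invented for illustration, not NAGB's published cut scores, and the function names `achievement_level` and `percent_at_or_above_proficient` are hypothetical. The sketch also assumes that a score exactly at a cut score counts toward that level.

```python
from bisect import bisect_right

# Placeholder cut scores (hypothetical values, NOT NAGB's published cut scores),
# in the order: basic, proficient, advanced.
HYPOTHETICAL_CUTS = (200, 250, 300)
LEVELS = ("below basic", "basic", "proficient", "advanced")


def achievement_level(score, cuts=HYPOTHETICAL_CUTS):
    """Return the reporting category for one scale score.

    Assumption: a score at or above a cut score falls in that level.
    """
    # bisect_right counts how many cut scores are <= score.
    return LEVELS[bisect_right(cuts, score)]


def percent_at_or_above_proficient(scores, cuts=HYPOTHETICAL_CUTS):
    """Share of scores (in percent) at or above the proficient cut score."""
    proficient_cut = cuts[1]
    hits = sum(1 for s in scores if s >= proficient_cut)
    return 100.0 * hits / len(scores)


if __name__ == "__main__":
    sample = [190, 210, 260, 310]  # made-up scores for demonstration only
    for s in sample:
        print(s, achievement_level(s))
    print(percent_at_or_above_proficient(sample))
```

Because cut scores are set separately for each subject and grade, a real analysis would need one set of cut scores per subject and grade rather than the single placeholder tuple used here.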

**Average Score.** The average mathematics score of U.S. fourth graders increased by 27 points from 1990 to 2007, leveled off between 2007 and 2009, and then rose by 1 point from 2009 to 2011 (figure ^{[11]} across students at all performance levels (i.e., 10th to 90th percentiles^{[12]}), and among students at both public and private schools. For example, from 1990 to 2007, the fourth grade average mathematics score increased substantially: by 28 points for white students, 34 points for black students, 27 points for Hispanic students, and 28 points for Asian or Pacific Islander students (appendix table

Among U.S. eighth graders, the average mathematics score increased continuously from 1990 to 2011, with a total gain of 21 points over the period (figure ^{[13]}

**Achievement Level.** Trends in the percentages of fourth and eighth graders reaching the proficient level parallel the scale score trends (figure

In 2009, the framework for the main NAEP science assessment was significantly changed to reflect advances in science, curriculum standards, assessments, and research on science learning (NAGB 2010b). Because of these modifications, the results from the 2009 and 2011 assessments cannot be compared with those from the earlier assessments. Whereas the 2009 assessment included students in grades 4, 8, and 12, the 2011 assessment targeted students only in grade 8. This section, therefore, discusses the 2009 and 2011 assessment results for students in grade 8 only.^{[14]}

**Average Score.** The average science score of eighth graders increased from 150 in 2009 to 152 in 2011 (figure ^{[15]} With a few exceptions (Asian or Pacific Islander students, high-performing students [at the 90th percentile], and private school students), most demographic groups improved their science scores during this period, with score gains ranging from 1 point for female students and white students to 3 points for black students, 4 points for low-income students, and 5 points for Hispanic students (appendix table

**Achievement Level.** Like scale scores, the percentage of eighth graders performing at or above the proficient level in science increased slightly from 30% in 2009 to 32% in 2011 (appendix table

Most performance gaps that existed in earlier years persisted in 2011, although none of these gaps have widened since 1990 (appendix tables

Large performance gaps existed among other groups. For both mathematics and science at grades 4 and 8, white and Asian or Pacific Islander students performed better than their black, Hispanic, or American Indian or Alaska Native counterparts (appendix tables ^{[16]}

Some gaps in mathematics and science scores have narrowed over time (table

In science, the eighth graders’ average score increased more for black students (3 points) and Hispanic students (5 points) than for white students (1 point) between 2009 and 2011, narrowing the white-black gap (especially among male students) and the white-Hispanic gap (among both male and female students) (table

Two international assessments, the Trends in International Mathematics and Science Study (TIMSS) and the Program for International Student Assessment (PISA), compare U.S. students’ achievement in mathematics and science with that of students in other countries. These two assessments differ in several fundamental ways, including the purpose of the study, age of the students tested, test content, and the number of participating nations.^{[17]} Targeting students in grades 4 and 8 regardless of their age, the TIMSS tests focus on students’ application of skills and knowledge to tasks akin to those encountered in school. The PISA tests, in contrast, assess the abilities of 15-year-olds to apply mathematics and science skills and information to solve real problems they may face at work or in daily life. This section compares the mathematics and science performance of U.S. students with that of their counterparts in other countries using assessment data from the latest administration of TIMSS (2011). No new data from PISA were available for this volume. The most recent PISA results showed that U.S. 15-year-olds did not perform as well as their peers in many developed countries. In 2009, the U.S. average score ranked 18th in mathematics and 13th in science out of 34 Organisation for Economic Co-operation and Development (OECD) nations participating in the ^{[18]}

First conducted in 1995, TIMSS assesses the mathematics and science performance of fourth and eighth graders every 4 years. TIMSS has been administered five times, most recently in 2011. Over 20,000 students in more than 1,000 schools across the United States took the assessment in spring 2011, joining almost 500,000 other students from 62 countries and jurisdictions (Provasnik et al. 2012).

TIMSS is designed to test students’ knowledge of specific mathematics and science topics that are closely tied to the curricula of the participating education systems (Mullis et al. 2009). The assessment framework includes two dimensions: a content domain for the subject matter to be assessed within mathematics and science and a cognitive domain for the skills (e.g., knowing, applying, and reasoning) expected of students as they learn the mathematics or science content. Specifically, the content domain for fourth and eighth grade mathematics and science in TIMSS 2011 includes the following topics (see sidebar, “TIMSS 2011 Sample Items”):

Mathematics

- Number, Geometric Shapes and Measures, Data Display (Grade 4)
- Number, Algebra, Geometry, Data and Chance (Grade 8)

Science

- Life Science, Physical Science, Earth Science (Grade 4)
- Biology, Chemistry, Physics, Earth Science (Grade 8)

Within each topic in the content domain, students are assessed on several skills, including their knowledge of facts, concepts, and procedures; application of those facts, concepts, and procedures to solve problems; and reasoning (i.e., solving unfamiliar, complex, or multistep problems). Although the content differs for fourth and eighth graders, reflecting the nature and difficulty of the mathematics and science taught at each grade, the cognitive domain is the same for both grade levels and subjects. A more detailed discussion of the framework for the TIMSS 2011 mathematics and science assessments can be found at http://timssandpirls.bc.edu/timss2011/downloads/TIMSS2011_Frameworks.pdf.
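The two-dimensional structure of the framework described above (content domains crossed with cognitive domains) can be sketched as a small data structure. This is only an illustrative representation: the names `CONTENT_DOMAINS`, `COGNITIVE_DOMAINS`, and `framework_cells` are hypothetical, and the three cognitive labels are taken from the examples given in the text (knowing, applying, and reasoning).

```python
from itertools import product

# Content domains by (subject, grade), as listed in the text above.
CONTENT_DOMAINS = {
    ("mathematics", 4): ("Number", "Geometric Shapes and Measures", "Data Display"),
    ("mathematics", 8): ("Number", "Algebra", "Geometry", "Data and Chance"),
    ("science", 4): ("Life Science", "Physical Science", "Earth Science"),
    ("science", 8): ("Biology", "Chemistry", "Physics", "Earth Science"),
}

# The cognitive domain is the same for both grades and both subjects.
COGNITIVE_DOMAINS = ("knowing", "applying", "reasoning")


def framework_cells(subject, grade):
    """Return every (content topic, cognitive skill) pair for a subject and grade."""
    return list(product(CONTENT_DOMAINS[(subject, grade)], COGNITIVE_DOMAINS))


if __name__ == "__main__":
    for topic, skill in framework_cells("mathematics", 8):
        print(f"{topic}: {skill}")
```

Keeping the cognitive domains in a single shared tuple mirrors the point that only the content differs between grade 4 and grade 8.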

**Performance on the 2011 TIMSS Mathematics Tests.** The U.S. average score on the 2011 TIMSS mathematics assessment was 541 at grade 4 and 509 at grade 8 (figure ^{[19]} Among 50 countries/jurisdictions that participated in the 2011 TIMSS mathematics assessment at grade 4, the U.S. average mathematics score was among the top 13 (seven scored higher; five did not differ), outperforming 37 countries/jurisdictions (appendix table ^{[20]} The top scorers—Singapore, Republic of Korea, and Hong Kong (China)—each had average scores above 600.

At grade 8, the U.S. average mathematics score was below the scores of six countries/jurisdictions, not different from the scores of seven, and higher than those of 28, placing the United States among the top 14 in eighth grade mathematics. The average scores of students in the Republic of Korea, Singapore, and Taipei^{[21]} (the top three leaders) were at least 100 points higher than the average score of U.S. eighth graders (609–613 versus 509).
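The "among the top N" figures quoted above follow from simple counting: the systems that scored higher, plus those that did not differ, plus the United States itself. The sketch below reproduces that arithmetic from the counts given in the text. The function name `rank_band` is hypothetical, and the grade 8 total of 42 is derived here by addition rather than stated in the text.

```python
def rank_band(higher, not_different, lower):
    """Summarize a country's standing from counts of systems that scored
    higher, did not differ, or scored lower than it did."""
    return {
        # +1 accounts for the country itself.
        "total_systems": higher + not_different + lower + 1,
        "top_group_size": higher + not_different + 1,
        "outperformed": lower,
    }


# Counts as reported in the text for the 2011 TIMSS mathematics assessment.
grade4 = rank_band(higher=7, not_different=5, lower=37)
grade8 = rank_band(higher=6, not_different=7, lower=28)

# Grade 8 score gaps between the three top scorers (609 to 613) and the U.S. (509).
US_GRADE8 = 509
smallest_gap = 609 - US_GRADE8
largest_gap = 613 - US_GRADE8

if __name__ == "__main__":
    print(grade4)
    print(grade8)
    print(smallest_gap, largest_gap)
```

The grade 4 counts add up to the 50 participating countries/jurisdictions cited in the text, which is a quick consistency check on the reported figures.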

**Performance Trends.** Over the 16 years since the first TIMSS mathematics administration in 1995, U.S. fourth and eighth graders raised their scores and international ranking.^{[22]} At grade 4, the average mathematics score of 541 in 2011 was 23 points higher than the score of 518 in 1995 (figure

At grade 8, the U.S. average score of 509 in 2011 reflected a 17-point increase over the 1995 score (492) (figure

**Performance on the 2011 TIMSS Science Tests.** In 2011, the average science scores of both U.S. fourth and eighth grade students (544 and 525, respectively) were higher than the international TIMSS scale average (500) (figure

**Performance Trends.** In contrast to the mathematics trends, which showed significant improvement in both grades, the average scores of U.S. students on the TIMSS science assessment have remained flat since 1995 for fourth graders and improved 12 points for eighth graders (figure