In spring 2007, an estimated 1.5 million students ages 5–17 were homeschooled in the United States, accounting for about 3% of the K–12 student population at that time (Bielick 2008). Trends from 1999 to 2007 show a 74% increase in the number of homeschooled students over this 8-year period. Homeschooled students in these estimates are defined as those who are schooled at home for at least part of their education and whose enrollment in public or private school does not exceed 25 hours per week.
The decentralized nature of the homeschooled population limits researchers' ability to collect nationally representative data on these students' achievement and other outcomes. A growing number of families choose to homeschool their children, and many states are drafting and implementing regulations for homeschooling (Belfield 2004; Lines 2003; Lips and Feinberg 2008). Thus far, however, no national data allow researchers to examine homeschooled students' involvement with mathematics and science courses or to compare their achievement with that of students who attend public or private schools (Lips and Feinberg 2008).
ECLS-K measures student proficiency at nine specific mathematics skill levels. These skill levels, identified on the basis of frameworks from other national assessments and the advice of a panel of education experts, represent a progression of mathematics skills and knowledge. Levels 6, 7, and 8 were first assessed in third grade, and level 9 was first assessed in fifth grade. By fifth grade, levels 1 through 4 were no longer assessed. Each level is labeled by the most sophisticated skill in the set (Princiotta, Flanagan, and Germino Hausken 2006; West, Denton, and Reaney 2000):
Level 1, Number and shape: Recognize single-digit numbers and shapes.
Level 2, Relative size: Count beyond 10, recognize the sequence in basic patterns, and compare the relative size and dimensional relationship of objects.
Level 3, Ordinality and sequence: Recognize two-digit numbers, identify the next number in a sequence, identify the ordinal position of an object, and solve simple word problems.
Level 4, Add and subtract: Solve simple addition and subtraction items and identify relationships of numbers in sequence.
Level 5, Multiply and divide: Perform basic multiplication and division and recognize more complex number patterns.
Level 6, Place value: Demonstrate understanding of place value in integers to the hundreds place.
Level 7, Rate and measurement: Use knowledge of measurement and rate to solve word problems.
Level 8, Fractions: Solve problems using fractions.
Level 9, Area and volume: Solve problems using area and volume.
The National Assessment of Educational Progress (NAEP) assessments use frameworks developed by educators, policymakers, assessment and curriculum experts, and skilled practitioners (e.g., mathematicians) in a consensus-oriented process. Frameworks define what students should know at a given grade level and provide a blueprint for the assessment (Lee, Grigg, and Dion 2007). Once developed, the frameworks are reviewed and approved by the National Assessment Governing Board (NAGB). NAGB then defines three performance levels for each grade. The basic level indicates partial mastery of material appropriate for the grade level, proficient indicates solid academic performance, and advanced indicates superior performance. (Students in the basic category have scores at or above the minimum score for basic but lower than the minimum for proficient.) For more detailed definitions of the NAEP proficiency levels, see Science and Engineering Indicators 2006, pp. 1-13 and 1-14 (NSB 2006).
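The categorization rule described above is a simple threshold ladder: a student's level is the highest category whose minimum score the student reaches. The sketch below illustrates the rule in Python; the cut scores in the example are invented placeholders, not actual NAGB values.

```python
def naep_performance_level(score, cut_basic, cut_proficient, cut_advanced):
    """Classify a NAEP scale score using three cut scores.

    A student falls in a category when the score is at or above that
    category's minimum but below the next category's minimum.
    """
    if score >= cut_advanced:
        return "advanced"
    if score >= cut_proficient:
        return "proficient"
    if score >= cut_basic:
        return "basic"
    return "below basic"

# Hypothetical cut scores, for illustration only:
level = naep_performance_level(250, cut_basic=214,
                               cut_proficient=249, cut_advanced=282)
# level == "proficient" (250 is at or above 249 but below 282)
```

Note that "below basic" is not an official NAGB achievement level; it simply labels scores that do not reach the basic minimum.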
Although experts have approved the NAEP tests as measuring achievement with sufficient accuracy (for example, see Daro et al. 2007), some have disagreed about whether the NAEP proficiency levels are appropriately defined. The National Mathematics Advisory Panel concluded in its 2008 study, "On the basis of international performance data, there are indications that the NAEP cut scores for the two highest performance categories [the proficient and advanced levels] are set too high" (NMP 2008). An earlier study commissioned by the National Academy of Sciences concluded that the process used to set these levels was "fundamentally flawed" (Pellegrino, Jones, and Mitchell 1999). NAGB acknowledges the controversy surrounding the proficiency levels (Bourque and Byrd 2000) and warns data users to interpret findings related to these levels with caution (NCES 2006b). Some of the disagreement may stem from different understandings of the word proficient, as well as differing expectations about what students should be able to do at particular grade levels. (Mandated statewide achievement tests reflect a range of these expectations; see sidebar "NAEP Contrasted With State Achievement Tests" for more information.)
Provisions in the No Child Left Behind (NCLB) Act require states to test students annually in mathematics, science, and English in grades 3–8 and once in high school. These test results, in particular the percentage of students reaching the proficient level, must be reported to the U.S. Department of Education, and schools must show adequate gains every year toward the goal of 100% of students reaching proficient. However, states are free to create their own tests and to set the minimum score for proficient wherever they choose. The incentives of these accountability systems may push policymakers to use easy tests, to set the minimum proficient score low, or both (Cronin et al. 2007; Peterson and Hess 2008; Loveless 2006). The requirement for steady improvement also encourages making the tests easier to pass over time, thus boosting apparent achievement. State standards vary widely in difficulty, content coverage, and the minimum scores set for reaching proficient. This variation has prevented valid comparisons of student achievement across states based on state test scores.
To address this problem, a recent study converted the proficient cutoff scores that states set for their own fourth and eighth grade mathematics tests to the 2005 National Assessment of Educational Progress (NAEP) scale (NCES 2007b). For example, for a state that rated 75% of its students proficient on the state's test, researchers assigned as that state's "NAEP-equivalent score" the NAEP score above which 75% of that state's students scored on NAEP's test for the same subject and grade level. After converting score data from all available states to this NAEP metric, the study found that, in most states, students who reached the proficient level on state tests had reached only NAEP's basic level, and many states' average scores were below basic. In addition, the range of states' proficiency standards was 55 NAEP score points at grade 4 and 81 at grade 8. To put those ranges in context, they are roughly two to three times the difference between black and white students' scores on recent NAEP mathematics assessments.
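The mapping just described amounts to reading a percentile off a state's NAEP score distribution: if 75% of students passed the state test, the equivalent cutoff sits where 75% of the state's NAEP scores lie above it. A minimal sketch, using a nearest-rank percentile as a simplification of the study's actual statistical estimation, and with invented scores for illustration:

```python
def naep_equivalent_score(state_naep_scores, pct_proficient_on_state_test):
    """NAEP score above which the same share of a state's students falls
    as the share rated proficient on the state's own test.

    Uses a simple nearest-rank percentile; the NCES study used more
    careful distributional estimation.
    """
    scores = sorted(state_naep_scores)
    # 75% proficient on the state test -> cutoff near the 25th
    # percentile (from the bottom) of the state's NAEP distribution.
    k = round(len(scores) * (100 - pct_proficient_on_state_test) / 100)
    return scores[min(max(k, 0), len(scores) - 1)]

# Invented data: 101 evenly spaced NAEP-style scores from 150 to 350.
fake_scores = range(150, 352, 2)
cutoff = naep_equivalent_score(fake_scores, 75)
# cutoff == 200: a quarter of the fake scores fall below 200.
```

A low cutoff relative to NAEP's proficient cut score is exactly the pattern the study reported: most states' "proficient" mapped only to NAEP's basic range.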
To document progress in achievement for NCLB, a federal regulation issued in late 2008 (Title I—Improving the Academic Achievement of the Disadvantaged, Final Rule, 73 Fed. Reg. 64435) adds another requirement: states must report their students' NAEP test scores along with state test results for the same grade and subject. These data will provide information for observers interested in state-by-state comparisons as well as the overall range of achievement across states.
Several primary differences in the design and purpose of these two assessments likely contribute to the differing U.S. results: age of students tested, test content, and participating nations.
First, the Trends in International Mathematics and Science Study (TIMSS) tests the mathematics and science achievement of students in grades 4 and 8, regardless of their age. The Program for International Student Assessment (PISA) assesses the performance of secondary school students by sampling 15-year-olds, who are nearing the age when compulsory schooling ends in many countries. The divergent international results shown here are consistent with differences by age in the main National Assessment of Educational Progress (NAEP) results: U.S. 12th graders have generally shown flat or even declining achievement over time, whereas younger students, particularly 4th graders, have demonstrated steadily rising scores (NSB 2008). Similar patterns from the NAEP Long-Term Trend assessment are described in "Long-Term Trends in Mathematics Performance."
A second difference between TIMSS and PISA is how closely each adheres to the mathematics and science curriculums used for instruction in various countries. TIMSS focuses on the application of familiar skills and knowledge often emphasized in classrooms. (Content experts and teachers from various countries select elements of curriculums common to most participating nations.) The PISA tests, in contrast, emphasize students' ability to apply skills and information learned in school (or from life experience) to solve problems or make decisions they may face at work or in other circumstances. PISA test questions tend to deemphasize factual recall and demand more complex reasoning and problem-solving skills than those in TIMSS (Neidorf et al. 2006; Loveless 2009), requiring students to apply logic, synthesize information, and communicate solutions clearly. Curriculums and teaching methods may vary in their emphasis on these skills. (See sidebar "Sample Items From TIMSS and PISA Assessments" for examples of science test questions included on the two assessments.)
A third main difference between the two assessments is the number of participating countries and their levels of economic development. Countries participating in TIMSS form a large and diverse group: some highly industrialized nations and many developing ones, the latter growing in number over time. In contrast, nearly all countries that participated in PISA were members of OECD and thus economically advanced. However, the international comparisons here were limited to nations that are current or likely future competitors with the United States in scientific and technical fields. This restriction increased the overlap between the nations taking the two tests and excluded nearly all developing nations from these analyses. Restricting the comparison group in this way prevents the growing number of developing countries from artificially inflating the United States' standing relative to other nations over time, particularly in TIMSS.
Sample Science Items from Trends in International Mathematics and Science Study (TIMSS) Tests (for Eighth Graders)
1) Food and oxygen are produced during photosynthesis in green plants. Chlorophyll is one thing that is needed for photosynthesis. Name two more factors that are needed for photosynthesis.
Correct answer: Sunlight and carbon dioxide.
Difficulty level: High international benchmark (550)
2a) The diagram shows what happens to three magnets when they are placed close together on a pencil. Magnets X and Y move until they touch each other, but magnets Y and Z remain separated. Explain why magnets X and Y touch each other.
Correct answer: Because north and south poles were facing each other.
2b) Explain why magnets Y and Z remain separated.
Correct answer: Because they may have had south and south or north and north facing each other.
Difficulty level: Advanced international benchmark (625)
Additional sample questions: http://timss.bc.edu/TIMSS2007/items.html.
Sample Science Items from Program for International Student Assessment (PISA) Tests (for 15-Year-Olds)
1) Statues called Caryatids were built on the Acropolis in Athens more than 2,500 years ago. The statues are made of a type of rock called marble. Marble is composed of calcium carbonate. In 1980, the original statues were transferred inside the museum of the Acropolis and were replaced by replicas. The original statues were being eaten away by acid rain.
1a) Normal rain is slightly acidic because it has absorbed some carbon dioxide from the air. Acid rain is more acidic than normal rain because it has absorbed gases like sulfur oxides and nitrogen oxides as well. Where do these sulfur oxides and nitrogen oxides in the air come from?
Correct answer: For full credit, students needed to include one or more major sources: car exhausts, factory emissions, burning fossil fuels such as oil and coal, gases from volcanoes, or "burning of materials that contain sulphur and nitrogen." Answers that mentioned one actual source and one incorrect source (such as nuclear power plants) received only partial credit.
Difficulty level: 506
1b) The effect of acid rain on marble can be modeled by placing chips of marble in vinegar overnight. Vinegar and acid rain have about the same acidity level. When a marble chip is placed in vinegar, bubbles of gas form. The mass of the dry marble chip can be found before and after the experiment. A marble chip has a mass of 2.0 grams before being immersed in vinegar overnight. The chip is removed and dried the next day.
What will the mass of the dried marble chip be?
A. Less than 2.0 grams
B. Exactly 2.0 grams
C. Between 2.0 and 2.4 grams
D. More than 2.4 grams
Correct answer: A
Difficulty level: 460
1c) Students who did this experiment also placed marble chips in pure (distilled) water overnight. Explain why the students included this step in their experiment.
Correct answer: For full credit, students needed to explain that the acid in vinegar dissolves some of the marble just like acid in acid rain does, and that distilled water does not dissolve marble because it's much less acidic (the water test is a control).
Difficulty level: 717 for full-credit answers, 513 for partial credit
Additional sample questions: http://www.pisa.oecd.org/dataoecd/13/33/38709385.pdf (for science) and http://www.oecd.org/dataoecd/14/10/38709418.pdf (for mathematics).
Massachusetts and Minnesota participated in a special benchmarking study included in the Trends in International Mathematics and Science Study (TIMSS) 2007, along with three Canadian provinces, the city of Dubai, and one region of Spain. Results for these entities were compared with those for all participating nations. These two states, particularly Massachusetts, are among the higher-scoring states on the National Assessment of Educational Progress (NAEP), and thus provide some insight into how some of the best students in the United States compare with their competitors in other nations.
In mathematics, Massachusetts fourth graders scored 572, far above the scale average of 500, and in third place after only two jurisdictions, Hong Kong and Singapore (Mullis et al. 2008). Massachusetts' average score was equivalent to scores in Chinese Taipei and Japan. Minnesota scored slightly lower (554), below only four Asian leaders and on par with Kazakhstan, England, and the Russian Federation. At grade 8, both U.S. states (at 547 and 532, respectively) scored below the five leading Asian nations but above all other participants, including European nations.
In fourth grade science, Massachusetts ranked second with its score of 571, after only Singapore (Martin et al. 2008). Minnesota also performed well (551), bested only by Massachusetts and Singapore and scoring on par with eight jurisdictions (including the United States overall) but above all the rest. Massachusetts eighth graders' science score (556) was similar to the four leading Asian economies' scores (Singapore, Chinese Taipei, Japan, and Republic of Korea) and higher than scores from all other participants. At grade 8, Minnesota (at 539) was outscored by the four top Asian countries but performed similarly to a group that included Hong Kong and several high-scoring European nations.
To compare the performance of other countries' students with the achievement standards set for the National Assessment of Educational Progress (NAEP), a series of studies has used various statistical methods to project the results of one assessment into the scale of the other (Beaton and Gonzales 1993; Johnson and Siegendorf 1998; Johnson et al. 2005; Pashley and Phillips 1993). In the most recent of these studies (Phillips 2007), scores from the Trends in International Mathematics and Science Study (TIMSS) eighth grade mathematics and science assessments in 1999 and 2003 were translated into the 2000 NAEP eighth grade performance levels using data from a sample of students who participated in both the 1999 TIMSS and 2000 NAEP assessments.
NAEP results have long demonstrated that only a minority of U.S. students reaches the proficient level of performance as defined in NAEP. The linked TIMSS data for both years were not only consistent with this finding in both mathematics and science but also showed that few countries' students met the standard for achievement set by NAEP's proficient criterion. In mathematics, six countries had an average score that met this criterion in 1999, and five countries did so in 2003. In science, two countries' average scores fell within the proficient level in each year.
Local Systemic Change (LSC) Through Teacher Enhancement is a teacher professional development program that aims to improve K–12 instruction in science, mathematics, and technology. LSC embraces many characteristics of effective professional development—it requires all mathematics and science teachers from schools or districts to participate in a minimum of 130 hours of professional development over the course of the project, provides ongoing support during the school year, adopts a range of formats, emphasizes subject content and pedagogy, offers active learning opportunities, and promotes efforts to build a supportive environment for change.
A decade's worth of data indicate that LSC has had a positive impact in many areas, including teachers' attitudes toward reform-oriented teaching, perceptions of their pedagogical preparedness, and adoption of reform-oriented teaching practices in the classroom; the quality of instruction delivered to students; and student achievement, attitudes, and coursetaking patterns in mathematics and science (Banilower et al. 2006; Heck, Rosenberg, and Crawford 2006; Shimkus and Banilower 2004). Moreover, the program's impact increased as teachers accrued more professional development hours (although there appeared to be a limited increase in impact beyond 80 hours).
National and international organizations endorse technology as both a tool for instruction in various academic subjects and an important area in which K–12 students should achieve some competency. The National Governors Association (NGA) recently argued that the prevalence of technology in most professions requires all students to have a strong foundation in using technology—along with other science, technology, engineering, and mathematics (STEM) competencies—to compete in a 21st century economy (NGA 2007). Several organizations, including the International Technology Education Association (ITEA), have developed technology standards outlining what students should know about various types of technologies, the concepts behind them, and their significance to society (ITEA 2007). The International Society for Technology in Education (ISTE) has developed technology standards for teachers and students that have been widely adopted by states and districts interested in integrating technology into their educational goals (ISTE 2007; Trotter 2009). Beginning in 2012, the National Assessment of Educational Progress (NAEP) will pilot a computer-based evaluation of students' understanding of all technologies, including their information technology literacy (Kerr 2008).
State policies also reflect a growing emphasis on K–12 students' learning about technology and the use of technology in education (table
Historically, state education agencies have used different methods for estimating graduation rates, rendering state-by-state comparisons problematic. Experts disagree on the best method to calculate these rates, but there is wide consensus that the current calculations are badly flawed (Greene 2002; Swanson and Chapman 2003; NGA 2005).
To facilitate comparability, the National Governors Association (NGA) endorsed an adjusted cohort method in 2005, and all 50 governors agreed to work toward implementing that method (NGA 2005). Using this method, the high school graduation rate is calculated by dividing the number of graduates in a given year by the number of students who entered ninth grade 4 years earlier, adjusting the denominator for migration into and out of the state over those 4 years.
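The adjusted cohort calculation itself is simple arithmetic; a sketch with hypothetical enrollment counts (the numbers are invented for illustration):

```python
def adjusted_cohort_graduation_rate(graduates, entering_ninth_graders,
                                    transferred_in, transferred_out):
    """High school graduation rate under the NGA adjusted cohort method.

    The denominator is the ninth grade entering class four years
    earlier, adjusted for students who migrated into or out of the
    state over those four years.
    """
    adjusted_cohort = entering_ninth_graders + transferred_in - transferred_out
    return 100.0 * graduates / adjusted_cohort

# Hypothetical counts for one graduating class:
rate = adjusted_cohort_graduation_rate(
    graduates=850,
    entering_ninth_graders=1000,
    transferred_in=120,
    transferred_out=120,
)
# rate == 85.0 (850 graduates out of an adjusted cohort of 1,000)
```

The hard part for states is not this formula but the longitudinal data systems needed to track individual students across four years, which is why implementation has been gradual.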
States require substantial time and funding to develop data collection, storage, and analysis procedures before they can use this method. In 2008, 16 states were calculating graduation rates using the cohort method (NGA 2008). Another 29 states planned to implement data procedures to enable such reporting by 2012. The remaining 5 states either lacked necessary data capacity or had no plans in mid-2008 to calculate rates according to this formula.
In 2008, the U.S. Department of Education directed states to use a cohort method that tracks individual students, beginning with reports for academic year 2010–11.* The following year, states must include this graduation rate as one of the measures used to document adequate yearly progress for schools that include 12th grade.
*Title I—Improving the Academic Achievement of the Disadvantaged, Final Rule (73 Fed. Reg. 64435).