PART I.
Introduction to Mixed Method Evaluations

Chapter 1: Introducing This Handbook

The Need for a Handbook on Designing and Conducting Mixed Method Evaluations

Evaluation of the progress and effectiveness of projects funded by the National Science Foundation’s (NSF) Directorate for Education and Human Resources (EHR) has become increasingly important. Project staff, participants, local stakeholders, and decisionmakers need to know how funded projects are contributing to knowledge and understanding of mathematics, science, and technology. To meet this need, some simple but critical questions must be addressed:

What are we finding out about teaching and learning?
How can we apply our new knowledge?
Where are the dead ends?
What are the next steps?

Although there are many excellent textbooks, manuals, and guides dealing with evaluation, few are geared to the needs of the EHR grantee who may be an experienced researcher but a novice evaluator. One of the ways that EHR seeks to fill this gap is by the publication of what have been called "user-friendly" handbooks for project evaluation.

The first publication, User-Friendly Handbook for Project Evaluation: Science, Mathematics, Engineering and Technology Education, issued in 1993, describes the types of evaluations principal investigators/project directors (PIs/PDs) may be called upon to perform over the lifetime of a project. It also describes in some detail the evaluation process, which includes the development of evaluation questions and the collection and analysis of appropriate data to provide answers to these questions. Although this first handbook discussed both qualitative and quantitative methods, it covered techniques that produce numbers (quantitative data) in greater detail. This approach was chosen because decisionmakers usually demand quantitative (statistically documented) evidence of results. Indicators that are often selected to document outcomes include percentage of targeted populations participating in mathematics and science courses, test scores, and percentage of targeted populations selecting careers in the mathematics and science fields.

The current handbook, User-Friendly Guide to Mixed Method Evaluations, builds on the first but seeks to introduce a broader perspective. It was initiated because of the recognition that by focusing primarily on quantitative techniques, evaluators may miss important parts of a story. Experienced evaluators have found that most often the best results are achieved through the use of mixed method evaluations, which combine quantitative and qualitative techniques. Because the earlier handbook did not include an indepth discussion of the collection and analysis of qualitative data, this handbook was initiated to provide more information on qualitative techniques and discuss how they can be combined effectively with quantitative measures.

Like the earlier publication, this handbook is aimed at users who need practical rather than technically sophisticated advice about evaluation methodology. The main objective is to make PIs and PDs "evaluation smart" and to provide the knowledge needed for planning and managing useful evaluations.

Key Concepts and Assumptions

Why Conduct an Evaluation?

There are two simple reasons for conducting an evaluation:

To gain direction for improving projects as they are developing, and
To determine projects’ effectiveness after they have had time to produce results.

Formative evaluations (which include implementation and process evaluations) address the first set of issues. They examine the development of the project and may lead to changes in the way the project is structured and carried out. Questions typically asked include:

To what extent do the activities and strategies match those described in the plan? If they do not match, are the changes in the activities justified and described?
To what extent were the activities conducted according to the proposed timeline? By the appropriate personnel?
To what extent are the actual costs of project implementation in line with initial budget expectations?
To what extent are the participants moving toward the anticipated goals of the project?
Which of the activities or strategies are aiding the participants to move toward the goals?
What barriers were encountered? How and to what extent were they overcome?

Summative evaluations (also called outcome or impact evaluations) address the second set of issues. They look at what a project has actually accomplished in terms of its stated goals. Summative evaluation questions include:

To what extent did the project meet its overall goals?
Was the project equally effective for all participants?
What components were the most effective?
What significant unintended impacts did the project have?
Is the project replicable and transportable?

For each of these questions, both quantitative data (data expressed in numbers) and qualitative data (data expressed in narratives or words) can be useful in a variety of ways.

The remainder of this chapter provides some background on the differing and complementary nature of quantitative and qualitative evaluation methodologies. The aim is to provide an overview of the advantages and disadvantages of each, as well as an idea of some of the more controversial issues concerning their use.

Before doing so, however, it is important to stress that there are many ways of performing project evaluations, and that there is no recipe or formula that is best for every case. Quantitative and qualitative methods each have advantages and drawbacks when it comes to an evaluation's design, implementation, findings, conclusions, and utilization. The challenge is to find a judicious balance in any particular situation. According to Cronbach (1982),

There is no single best plan for an evaluation, not even for an inquiry into a particular program at a particular time, with a particular budget.

What Are the Major Differences Between Quantitative and Qualitative Techniques?

As shown in Exhibit 1, quantitative and qualitative measures are characterized by different techniques for data collection.

Exhibit 1. Common techniques
Quantitative	Qualitative
Questionnaires	Observations
Tests	Interviews
Existing databases	Focus groups

Aside from the most obvious distinction between numbers and words, the conventional wisdom among evaluators is that qualitative and quantitative methods have different strengths, weaknesses, and requirements that will affect evaluators’ decisions about which methodologies are best suited for their purposes. The issues to be considered can be classified as being primarily theoretical or practical.

Theoretical issues. Most often, these center on one of three topics:

The value of the types of data;
The relative scientific rigor of the data; or
Basic, underlying philosophies of evaluation.

Value of the data. Quantitative and qualitative techniques provide a tradeoff between breadth and depth and between generalizability and targeting to specific (sometimes very limited) populations. For example, a sample survey of high school students who participated in a special science enrichment program (a quantitative technique) can yield representative and broadly generalizable information about the proportion of participants who plan to major in science when they get to college and how this proportion differs by gender. But at best, the survey can elicit only a few, often superficial reasons for this gender difference. On the other hand, separate focus groups (a qualitative technique) conducted with small groups of male and female students will provide many more clues about gender differences in the choice of science majors and the extent to which the special science program changed or reinforced attitudes. But this technique may be limited in the extent to which findings apply beyond the specific individuals included in the focus groups.
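The quantitative side of this example can be sketched in a few lines of Python. The responses below are invented purely for illustration (they are not data from any EHR project), and the function name proportion_by_group is hypothetical. The sketch shows only the kind of summary a survey readily yields: the share of participants in each group who plan to major in science.

```python
from collections import defaultdict

# Hypothetical survey responses (invented for illustration, not real data):
# each tuple is (gender, plans_to_major_in_science).
responses = [
    ("female", True), ("female", False), ("female", True), ("female", False),
    ("male", True), ("male", True), ("male", False), ("male", True),
]

def proportion_by_group(rows):
    """Return {group: share of respondents in that group who answered True}."""
    counts = defaultdict(lambda: [0, 0])  # group -> [yes count, total count]
    for group, answer in rows:
        counts[group][0] += int(answer)
        counts[group][1] += 1
    return {group: yes / total for group, (yes, total) in counts.items()}

shares = proportion_by_group(responses)
for group, share in sorted(shares.items()):
    print(f"{group}: {share:.0%} plan to major in science")
```

A tabulation like this documents that a difference exists, but it says nothing about why; that is the gap the focus groups are meant to fill.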

Scientific rigor. Data collected through quantitative methods are often believed to yield more objective and accurate information because they were collected using standardized methods, can be replicated, and, unlike qualitative data, can be analyzed using sophisticated statistical techniques. In line with these arguments, traditional wisdom has held that qualitative methods are most suitable for formative evaluations, whereas summative evaluations require "hard" (quantitative) measures to judge the ultimate value of the project.

This distinction is too simplistic. Both approaches may or may not satisfy the canons of scientific rigor. Quantitative researchers are becoming increasingly aware that some of their data may not be accurate and valid, because some survey respondents may not understand the meaning of questions to which they respond, and because people’s recall of even recent events is often faulty. On the other hand, qualitative researchers have developed better techniques for classifying and analyzing large bodies of descriptive data. It is also increasingly recognized that all data collection - quantitative and qualitative - operates within a cultural context and is affected to some extent by the perceptions and beliefs of investigators and data collectors.

Philosophical distinction. Some researchers and scholars differ about the respective merits of the two approaches largely because of different views about the nature of knowledge and how knowledge is best acquired. Many qualitative researchers argue that there is no objective social reality, and that all knowledge is "constructed" by observers who are the product of traditions, beliefs, and the social and political environment within which they operate. And while quantitative researchers no longer believe that their research methods yield absolute and objective truth, they continue to adhere to the scientific model and seek to develop increasingly sophisticated techniques and statistical tools to improve the measurement of social phenomena. The qualitative approach emphasizes the importance of understanding the context in which events and outcomes occur, whereas quantitative researchers seek to control the context by using random assignment and multivariate analyses. Similarly, qualitative researchers believe that the study of deviant cases provides important insights for the interpretation of findings; quantitative researchers tend to ignore the small number of deviant and extreme cases.

This distinction affects the nature of research designs. According to its most orthodox practitioners, qualitative research does not start with narrowly specified evaluation questions; instead, specific questions are formulated after open-ended field research has been completed (Lofland and Lofland, 1995). This approach may be difficult for program and project evaluators to adopt, since specific questions about the effectiveness of interventions being evaluated are usually expected to guide the evaluation. Some researchers have suggested that a distinction be made between Qualitative and qualitative work: Qualitative work (large Q) refers to methods that eschew prior evaluation questions and hypothesis testing, whereas qualitative work (small q) refers to open-ended data collection methods such as indepth interviews embedded in structured research (Kidder and Fine, 1987). The latter are more likely to meet EHR evaluators' needs.

Practical issues. On the practical level, four issues can affect the choice of method:

Credibility of findings;
Staff skills;
Costs; and
Time constraints.

Credibility of findings. Evaluations are designed for various audiences, including funding agencies, policymakers in governmental and private agencies, project staff and clients, researchers in academic and applied settings, as well as various other "stakeholders" (individuals and organizations with a stake in the outcome of a project). Experienced evaluators know that they often deal with skeptical audiences or stakeholders who seek to discredit findings that are too critical or uncritical of a project's outcomes. Such audiences may attack the evaluation methodology itself as unsound or weak for the case at hand.

The major stakeholders for EHR projects are policymakers within NSF and the federal government, state and local officials, and decisionmakers in the educational community where the project is located. In most cases, decisionmakers at the national level tend to favor quantitative information because these policymakers are accustomed to basing funding decisions on numbers and statistical indicators. On the other hand, many stakeholders in the educational community are often skeptical about statistics and "number crunching" and consider the richer data obtained through qualitative research to be more trustworthy and informative. A particular case in point is the use of traditional test results, a favorite outcome criterion for policymakers, school boards, and parents, but one that teachers and school administrators tend to discount as a poor tool for assessing true student learning.

Staff skills. Qualitative methods, including indepth interviewing, observations, and the use of focus groups, require good staff skills and considerable supervision to yield trustworthy data. Some quantitative research methods can be mastered easily with the help of simple training manuals; this is true of small-scale, self-administered questionnaires, where most questions can be answered by yes/no checkmarks or selecting numbers on a simple scale. Large-scale, complex surveys, however, usually require more skilled personnel to design the instruments and to manage data collection and analysis.

Costs. It is difficult to generalize about the relative costs of the two methods; much depends on the amount of information needed, quality standards followed for the data collection, and the number of cases required for reliability and validity. A short survey based on a small number of cases (25-50) and consisting of a few "easy" questions would be inexpensive, but it also would provide only limited data. Even cheaper would be substituting a focus group session for a subset of the 25-50 respondents; while this method might provide more "interesting" data, those data would be primarily useful for generating new hypotheses to be tested by more appropriate qualitative or quantitative methods. To obtain robust findings, the cost of data collection is bound to be high regardless of method.
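One way to see why a survey of 25-50 cases provides only limited data is to look at the precision of a percentage estimated from it. The sketch below rests on assumptions that are not part of this handbook: a simple random sample, the usual normal approximation for a proportion, and the worst case of a true proportion of 50 percent. The sample size of 400 is added only for contrast, and the function name margin_of_error is hypothetical.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion.

    Assumes a simple random sample of size n and the normal approximation;
    p=0.5 is the worst case (widest margin).
    """
    return z * math.sqrt(p * (1 - p) / n)

# 25 and 50 are the small-survey sizes mentioned above; 400 is for contrast.
for n in (25, 50, 400):
    print(f"n={n}: about +/-{margin_of_error(n):.1%}")
```

Under these assumptions, a result from 25 cases carries a margin of roughly 20 percentage points, whereas 400 cases bring it down to about 5 points, which is one reason robust findings are costly whatever the method.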

Time constraints. Similarly, data complexity and quality affect the time needed for data collection and analysis. Although technological innovations have shortened the time needed to process quantitative data, a good survey requires considerable time to create and pretest questions and to obtain high response rates. However, qualitative methods may be even more time consuming because data collection and data analysis overlap, and the process encourages the exploration of new evaluation questions (see Chapter 4). If insufficient time is allowed for the evaluation, it may be necessary to curtail the amount of data to be collected or to cut short the analytic process, thereby limiting the value of the findings. For evaluations that operate under severe time constraints - for example, where budgetary decisions depend on the findings - the choice of the best method can present a serious dilemma.

In summary, the debate over the merits of qualitative versus quantitative methods is ongoing in the academic community, but when it comes to the choice of methods for conducting project evaluations, a pragmatic strategy has been gaining increased support. Respected practitioners have argued for integrating the two approaches, building on their complementary strengths.¹ Others have stressed the advantages of linking qualitative and quantitative methods when performing studies and evaluations, showing how the validity and usefulness of findings will benefit (Miles and Huberman, 1994).

¹ See especially the article by William R. Shadish in Program Evaluation: A Pluralistic Enterprise, New Directions for Program Evaluation, No. 60 (San Francisco: Jossey-Bass, Winter 1993).

Why Use a Mixed Method Approach?

The assumption guiding this handbook is that a strong case can be made for using an approach that combines quantitative and qualitative elements in most evaluations of EHR projects. We offer this assumption because most of the interventions sponsored by EHR are not introduced into a sterile laboratory, but rather into a complex social environment with features that affect the success of the project. To ignore the complexity of the background is to impoverish the evaluation. Similarly, when investigating human behavior and attitudes, it is most fruitful to use a variety of data collection methods (Patton, 1990). By using different sources and methods at various points in the evaluation process, the evaluation team can build on the strength of each type of data collection and minimize the weaknesses of any single approach. A multimethod approach to evaluation can increase both the validity and reliability of evaluation data.

The range of possible benefits that carefully crafted mixed method designs can yield has been conceptualized by a number of evaluators.²

The validity of results can be strengthened by using more than one method to study the same phenomenon. This approach - called triangulation - is most often mentioned as the main advantage of the mixed method approach.
Combining the two methods pays off in improved instrumentation for all data collection approaches and in sharpening the evaluator's understanding of findings. A typical design might start out with a qualitative segment such as a focus group discussion, which will alert the evaluator to issues that should be explored in a survey of program participants, followed by the survey, which in turn is followed by indepth interviews to clarify some of the survey findings (Exhibit 2).

Exhibit 2. Example of a mixed method design
Qualitative	Quantitative	Qualitative
(exploratory focus group)	(questionnaire)	(personal interview with subgroup)

But this sequential approach is only one of several that evaluators might find useful (Miles and Huberman, 1994). Thus, if an evaluator has identified subgroups of program participants or specific topics for which indepth information is needed, a limited qualitative data collection can be initiated while a more broad-based survey is in progress.

A mixed method approach may also lead evaluators to modify or expand the evaluation design and/or the data collection methods. This action can occur when the use of mixed methods uncovers inconsistencies and discrepancies that alert the evaluator to the need for reexamining the evaluation framework and/or the data collection and analysis procedures used.

There is a growing consensus among evaluation experts that both qualitative and quantitative methods have a place in the performance of effective evaluations. Both formative and summative evaluations are enriched by a mixed method approach.

² For a full discussion of this topic, see Jennifer C. Greene, Valerie J. Caracelli, and Wendy F. Graham, Toward a Conceptual Framework for Mixed Method Evaluation Designs, Educational Evaluation and Policy Analysis, Vol. 11, No. 3 (Fall 1989), pp. 255-274.

How To Use This Handbook

This handbook covers a lot of ground, and not all readers will want to read it from beginning to end. For those who prefer to sample sections, some organizational features are highlighted below.

To provide practical illustrations throughout the handbook, we have invented a hypothetical project, which is summarized in the next chapter (Part 1, Chapter 2); the various stages of the evaluation design for this project will be found in Part 3, Chapter 6. These two chapters may be especially useful for evaluators who have not been involved in designing evaluations for major, multisite EHR projects.
Part 2, Chapter 3 focuses on qualitative methodologies, and Chapter 4 deals with analysis approaches for qualitative data. These two chapters are intended to supplement the information on quantitative methods in the previous handbook.
Part 3, Chapters 5, 6, and 7 cover the basic steps in developing a mixed method evaluation design and describe ways of reporting findings to NSF and other stakeholders.
Part 4 presents supplementary material, including an annotated bibliography and a glossary of common terms.

Before turning to these issues, however, we present the hypothetical NSF project that is used as an anchoring point for discussing the issues presented in the subsequent chapters.

References

Cronbach, L. (1982). Designing Evaluations of Educational and Social Programs. San Francisco: Jossey-Bass.

Kidder, L., and Fine, M. (1987). Qualitative and Quantitative Methods: When Stories Converge. Multiple Methods in Program Evaluation. New Directions for Program Evaluation, No. 35. San Francisco: Jossey-Bass.

Lofland, J., and Lofland, L.H. (1995). Analyzing Social Settings: A Guide to Qualitative Observation and Analysis. Belmont, CA: Wadsworth Publishing Company.

Miles, M.B., and Huberman, A.M. (1994). Qualitative Data Analysis, 2nd Ed. Newbury Park, CA: Sage, pp. 40-43.

National Science Foundation. (1993). User-Friendly Handbook for Project Evaluation: Science, Mathematics, Engineering and Technology Education. NSF 93-152. Arlington, VA: NSF.

Patton, M.Q. (1990). Qualitative Evaluation and Research Methods, 2nd Ed. Newbury Park, CA: Sage.
