Saturday, February 1, 2025

Carnoy, M., Elmore, R., & Siskin, L. S. (Eds.) (2003). The new accountability: High schools and high-stakes testing. Reviewed by Sharon L. Nichols, Arizona State University

 

Carnoy, M., Elmore, R., & Siskin, L. S. (Eds.) (2003). The new accountability: High schools and high-stakes testing. NY: RoutledgeFalmer.

218 pp.

$19.95     ISBN: 0-415-94705-7

Reviewed by Sharon L. Nichols
Arizona State University, Tempe

June 17, 2004

In the book The new accountability: High schools and high-stakes testing, editors Martin Carnoy, Richard Elmore, and Leslie Santee Siskin present a series of essays that describe the impact of contemporary education reform efforts on high school operations and outcomes. Chapter authors describe qualitative and quantitative data drawn from fifteen high schools in four states that explore how schools’ internal accountability operations meet the demands of externally imposed accountability requirements.

The New Accountability is a generally solid contribution to the field; a useful theoretical framework is introduced for thinking about the impact of education reform on practice. Nonetheless, theoretical clarity is lacking and the emphasis on high-stakes testing as a means of improving student achievement is misplaced. An introductory essay lays out the critical theoretical and empirical framework that guides the studies presented in subsequent chapters. This framework is both a strength and a weakness of the book.

In the introductory chapter, the editors introduce the plight of the American high school. They state that one of the most dominant concerns of our high schools is “the polarization of high school outcomes—increased dropouts at one end and increased college attendance for high school graduates at the other end” (p. 2). Further, they note that the organizing principles of high schools—as departmentalized institutions—are distinct from elementary and middle school settings in terms of mission and organizational structure. This argument lends substance to their focus on high schools for exploring how external (state-imposed) policy demands affect internal operations (school management, leadership).

Carnoy, Elmore, and Siskin briefly describe the history of educational reform, and introduce the reader to the role of “assessment” and “accountability” in education policy rhetoric and practice in the U.S. As the practice of standardized achievement assessment has grown in popularity over time (and with the growing concern over standards-based learning) so has the notion of holding students, teachers, and administrators “accountable” for reaching predefined achievement goals.

The driving concept behind the book is the notion of “internal accountability” (originally described by Abelmann & Elmore, 1999). This theory maintains that a high school’s approach to education delivery is tied to how the teachers, students, administrators and parents discuss and internalize educational values and expectations. Specifically, it is “based on the premise that schools actually have conceptions of accountability embedded in the patterns of their day-to-day operations, and that a school’s conception of accountability significantly influences how it delivers education” (p. 3). Internal accountability is defined by three layers of interaction:

The individual’s sense of accountability or responsibility; parents’, teachers’, administrators’, and students’ collective sense of accountability, or expectations; and the organizational rules, incentives, and implementation mechanisms that constitute the formal accountability system in schools. (pp. 3-4)

These three “tiers” form the basis by which formal and informal information is communicated to outside agencies, and they define the existence, level, and type of stakes attached to success or failure.

The editors hypothesize that schools are more successful (i.e., more learning occurs) when formal and informal accountability mechanisms are aligned with individuals’ internalized notions of accountability and responsibility. A strongly “aligned” school would be one where, for instance, there is coherence among teachers’ and administrators’ expectations for student success and a philosophical agreement on how that success is obtained. In contrast, a highly misaligned school is one with a relatively “weak or dysfunctional internal accountability system.” For example, it may be a school where “the principal forces teachers to adhere to rules that they know result in poor achievement outcomes” (p. 4).

The editors also describe “external accountability” as top-down forces, such as those promulgated by the state, exerted on individual schools. The current No Child Left Behind Act is an example of “external accountability,” whereby the federal government tells each state, which subsequently tells each school, what must be done to increase achievement or it will face a predefined set of sanctions. Thus, they introduce another notion of “alignment”: the extent to which internal mechanisms are aligned with external ones.

The purpose of this book is to use this organizing framework to present data on how and where internal and external accountability mechanisms meet. The specific questions posed by the research presented in the book include: “Does external accountability tend to ‘align’ atomistic schools [those that operate around largely separate and independent goals] around clearly defined goals? Does external accountability help less aligned schools more than aligned schools? What does external accountability do to schools that are already aligned but around something different from state standards?” (p. 5).

Study Design

The broader study looked at three schools in each of three states (Kentucky, New York, and Vermont) and six in Texas. In two states (Kentucky and New York), all three schools were in a single district. In Texas, the six schools were from two districts, and in Vermont the three schools were from three different districts. States were selected to represent a range of external accountability systems and of historical precedents for implementing accountability. For example, in New York students have been required for decades to pass the state’s Regents Examination to receive a diploma. And in Texas, schools, teachers, administrators, and students have been held accountable (for at least a decade) through a system of sanctions and rewards tied to test performance. By contrast, Vermont has had very little state-imposed accountability.

Chapter one, written by Rhoten, Carnoy, Chabrán, and Elmore, provides a rich contextual backdrop for thinking about data presented in subsequent chapters. The authors provide a relatively comprehensive account of each study state’s accountability history and how assessment systems and sanctions-based reform have evolved. This chapter gives the reader a solid political and educational context from which school-level processes can be understood. Following a state-by-state description of accountability policy, the authors conclude with a few statements regarding the achievement trajectories of each state, implying a link between policy and achievement. For example, they note that between 1993 and 1998, the percentage of Kentucky students scoring “proficient” rose in every subject at every level, except in science among middle school students (p. 46). Similarly, “More New York high school students took and passed Regents exams in 1998 and 1999 than ever before, with 73 percent of the state’s twelfth graders passing the English Regents exam in 1998 and 78 percent passing in 1999” (p. 50).

The authors seem to imply that accountability reforms, in place prior to these trends, have caused them. Thankfully, the authors qualify these assertions and are extremely careful to argue that the “positive” trends are not necessarily indicative of accountability policies per se. That is, it is not clear if externally imposed accountability policies have caused higher achievement. Indeed, the authors ask “Is this simply because accountability needs many years to work its way up the educational ladder? Or is there something about high schools that makes them relatively unresponsive to accountability efforts?” (p. 50). This is a difficult question to answer, but one that subsequent chapter authors attempt to address.

Chapters 2, 3, 4, and 5 report on interview, observation and achievement data collected from high schools in each study state. Each chapter gives the reader a different set of constructs and perspectives from which to consider the broader organizing framework. In Chapter 2, Debray, Parson, and Avila set up this framework by presenting a model of internal/external accountability to explore how individual schools meet the demands of an external accountability system. Schools were selected for study based on one of three ways they were “positioned” in their state (i.e., where their internal accountability was with respect to their external accountability). The researchers defined these three positions as “target,” “better positioned,” and “orthogonal.”

“Target” schools were described as those schools that “had not been performing well by traditional measures, but that had not been declared failing or selected for reconstitution” (p. 7). These schools were selected because they were the focus of the accountability movement: low-performing schools that were most at risk of receiving high-stakes consequences such as being labeled “failing” or being taken over by the state. A “better positioned” school was defined as one that had historically performed well on standardized achievement tests. Lastly, “orthogonal” schools were schools with specifically defined missions and included career academy schools, alternative high schools, magnets or a school with a particularly strong “external constituency that drives its mission.” The authors drew on observational and interview data with teachers and administrators in each study school to identify the main factors that facilitated or impeded each type of school’s progress toward meeting the goals of the externally imposed accountability system.

For each state, Debray et al. located each study school within a two-by-two configuration. Each of the four quadrants represented some combination of internal accountability (how strong it was from low to high) and alignment with external accountability (how aligned they were from low to high). They argued that schools with weaker internal accountability systems were much harder pressed to meet the demands of the external accountability system, and that for these schools the process of alignment was more cumbersome and likely to fail.

In their examination of “target,” “better positioned,” and “orthogonal” schools in each of the four states, the authors came to a few general conclusions. First, the nature of external accountability is important in how schools react to external pressures. Second, whether stakes are low or high, the better-positioned schools are likely to incorporate state requirements more quickly and more easily. Third, when stakes are low, target schools are unlikely to align themselves to external requirements, but when stakes are high and consistently enforced, many of these same target schools are likely over time to “accept and successfully incorporate new state demands into their teaching/learning structures.” In conclusion, they write:

The only chance of the desired alignment of schools with weak internal accountability structures is a state accountability system with clear goals and strong sanctions and rewards. Even then, however, many target schools may not have the capacity to respond adequately to external demands. (p. 85)

Although the chapter is meant to be exploratory, I found the concepts, and therefore the “conclusions,” to be suspect on several grounds. First, their operational definition of “high” versus “low” internal accountability was thin and difficult to conceptualize in practice. It was assumed that “low” internal accountability meant that the educational players were not on the same page. However, what does that mean? Did it mean principals and teachers agreed on accountability, but inconsistently enforced it? Did it mean teachers did not see the principal as credible? Did it mean teachers held equally strong, but very disparate views of learning? It just is not clear. And, to compound this confusion, it was difficult to understand what an “aligned” school looked like. For example, if the state believes that teachers should be replaced on account of chronic low achievement, is the school “aligned” with the state when all teachers agree this is an appropriate consequence? Furthermore, what if teachers believe in eliminating a bad teacher but vehemently oppose the state’s process for doing so: is that alignment or misalignment? It is simply unclear what this actually looks like in practice or, perhaps more importantly, why this so-called “alignment” is a worthy goal.

Lastly, their model and how schools of varying “positions” react to each state’s external system is only vaguely substantiated with data (i.e., only a select few individuals in each school were quoted; how are we to be sure these represent the school as a whole? What kind of in-school variation existed?). Further, there is no description of the methods used that would help the reader contextualize the conclusions made about each school. Were all teachers in all schools interviewed? At what point in the school year were they interviewed? Were they all interviewed the same number of times? This chapter raised more questions than it gave compelling answers or implications for practice. As qualitative research, it was either seriously underdeveloped or, at best, under-reported.

Chapters 3 and 4 are stronger in design, focus and clarity. In Chapter 3, Siskin compares the impact of external accountability on tested (math and reading) and untested (music in particular) subjects. She explores the effects of standards-based reform and external accountability as they relate to music and compares them with the effects on “core” subjects such as math and reading. Kentucky is of particular interest because it has attempted to keep a focus on music by making it a tested subject. Siskin documents music teachers’ reactions to the new pressures of preparing students for a test and compares evolving pressures to those once experienced by math and reading teachers. In doing so, she contemplates the potentially damaging effect of “standardizing” a subject such as music. Interview data revealed teachers’ laments that focusing on music theory or composition (commonly tested areas) took the focus away from performance and improvisation (less testable aspects of music). Siskin concludes with a few powerful rhetorical questions:

The example of music dramatically illuminates one of the major problems confronting high-stakes accountability at the high school level. Once we move beyond (or arrive at) required standards for reading and writing, is there actually agreement on what all high school students should know and be able to do to earn a high school diploma? Are high standards possible across all subjects? We will expect all students to achieve high standards in chemistry or to ‘know how their body works, so they can make informed medical decisions?’ to perform the Gloria Vivaldi or to correctly label the kind of dance illustrated by a picture of dancers in poodle skirts and flat tops? In transforming subjects into something all students need to be able to demonstrate on a test, do we inadvertently lower performance standards, weaken existing professional accountability systems, or lose knowledge outside the core altogether? (p. 97)

Chapter 4 is equally compelling as it raises questions about the effects of accountability, this time as it relates to school leadership. The authors define their notion of “leadership” at the outset: “[We] adopt a distributed theory of leadership, one that examines multiple sources of school-level leadership and how this leadership is distributed across the organization.” They go on to define their approach to data analysis using this conceptualization of leadership: “We look at how leadership, broadly defined, emerges and is distributed in high schools responding to standards-based accountability” (p. 100). In their exploratory analysis, they conclude that successful responses to the external accountability requirements in schools with relatively low “capacity usually come from the top—from the school principal, who has the most at stake in a strong, sanction-based accountability system” (p. 100). Strong leadership skills have the potential benefit of getting individuals to focus on similar goals.

They especially believed that strong leadership, in combination with quick and strong consequences imposed from the state, greatly benefited low performing schools. They note,

Perhaps the most striking feature of our comparison is the dramatic differences across the states in our target schools. In Texas, where there are more immediate and observable stakes, such as reconstitution, the schools seem to act more coherently and under more coherent leadership. In [Kentucky] the weaker sanctions do not inspire the school to cohere, and the school continues to be fragmented and unorganized in its leadership and response. (p. 126)

What is troubling about this conclusion (based as it is on qualitative data from one target school in Kentucky and two in Texas) is that it leads the reader to infer that strong, rapidly applied sanctions have the positive effect of engendering stronger leadership than in contexts where stakes are weak or delayed. This is a troubling endorsement for high-stakes accountability. Although the authors provide some interview data to show school members supported the notion that a strong leader was critical for helping turn the school around academically, the link between leadership skills and sanctions-based accountability was not empirically established. Could it not have been good leadership that would have occurred in spite of a consequence-based approach to accountability?

In Chapter 5, Chabrán presents data on the effects of high-stakes testing on student motivation and attitudes towards testing. This chapter raises a series of important questions for thinking about the intended and unintended effects of accountability reforms. The author notes that in states where stakes that directly affect students are attached to tests, a few notable reactions from students and teachers emerged. Students and teachers worried, for instance, about the anxiety-producing effect of high-stakes tests. In Texas, where the class of 1990-1991 was the first required to pass a test to receive a diploma, teachers explicitly worried that the stress attached to testing was significantly detrimental to students. Other worries were that too much pressure on testing was influencing teachers to teach to the test, with a possible de-emphasis on creativity and innovation in the classroom. In spite of some of these worries, the author points out that some teachers and students believed that if students are to be tested, they are more likely to take the test seriously if stakes are attached to it.

In Chapter 6, Carnoy, Loeb, and Smith present data to test the notion that high-stakes testing motivates teachers to “teach to the test” and influences students to drop out of high school at higher rates. Looking at the legacy of accountability in Texas, the authors argue that the institution of the Texas Assessment of Academic Skills (TAAS) (first given in 1990 as a graduation test) has not had the negative effects on student outcomes (increasing dropout rates, decreasing achievement) suggested by others (e.g., Haney, 2000). Indeed, the main goal of this chapter is to show the opposite effect: that the first and later administrations of the TAAS are associated with the desired outcomes of decreasing dropout rates and increasing student achievement. The authors present enrollment and high school survival trend data to argue that since the TAAS was first given to students as a high-stakes test in 1990, it has not had the effect of “increasing” dropout rates. In fact, they present trend lines disaggregated by ethnicity to show how they arrived at this conclusion. Interestingly, between 1991 and 1995, the trend lines for minority students continue a steady downward trend: more black and Hispanic students are in fact dropping out, with the dropout gap between them and white students getting larger. However, the authors are correct in their assertion that by the mid-1990s more students are staying in school and dropouts start to decrease. In spite of these trends, the inference made by the authors is that TAAS is somehow affecting the tendency of students to drop out. In fact, these data are merely correlational and causal conclusions are dubious.

The authors also explore achievement data (TAAS and the National Assessment of Educational Progress, or NAEP) to show that students, in general, are doing better on both tests. However, the authors contradict themselves when they talk about their own data:

From our observations and interviews, it appears that teachers and principals in schools with higher percentage of lower income, African-American and Latino pupils are more likely to focus on teaching the test than those in schools with higher income pupils. (p. 151)

If teachers are engaging in teaching-to-the-test practices, this invalidates the meaning of the test scores in the first place. Thus, achievement patterns on the TAAS cannot be considered as indicators of learning. Students’ NAEP performance, by contrast, is a solid indicator of transfer of learning, but the NAEP data are mixed at best for showing whether Texas students are actually achieving at higher levels. I strongly encourage readers to review the data presented in this chapter and to conclude for themselves the exact nature of achievement trends. In short, the authors present weak correlational data to argue a causal relationship between the implementation of high-stakes testing and student outcomes (e.g., achievement, tendency to drop out of school, post high school aspirations).

The final two chapters provide a few concluding comments on the contents of the book and the broader study. Although the majority of the arguments seem to amount to an overall endorsement of high-stakes testing as a reform movement for improving student learning (with the exception of Chapter 3 on untested subjects), the last two chapters offer a more balanced and thought-provoking perspective on reform and the broader goals of schooling.

In the next to last chapter, Siskin pulls together and articulates the broader mission of high schools as an educational institution and draws comparisons to the external accountability forces they are required to meet. She argues that almost all educators agree on the notion of equitable education and providing equal opportunities for students at all levels of the achievement spectrum. The problem, she argues, is that there are wide disagreements as to the best approach for achieving that goal. First and foremost are disparate ideas as to what the fundamental goals of high school education are in general. She writes:

There is wide agreement, for example, that all students need literacy and numeracy, and relative agreement on the content and skill levels that all elementary school students should achieve. But at the high school level, where curriculum and faculty are officially and organizationally divided along subject lines, the questions are far more complicated, and states are struggling with the question of which subjects will count in their accountability system. Does every student need to appreciate music, or to be able to play an instrument? Should they be required to demonstrate mastery of world history, or U.S. government, or both? How well does every student actually need to perform on a chemistry test? (p. 178).

Perhaps it should not be surprising that the soundest conclusion that can be drawn from this book is that schools vary widely in how they operate to meet the demands of an external reform policy, and that the effects of these reforms on achievement vary greatly.

In the final chapter, Elmore questions the role of “capacity” for meeting the complex demands of current accountability reforms. Elmore defines capacity in terms of: (a) how much teachers know about their subjects, (b) how leadership is defined and distributed throughout the school, (c) the resources available to the school, and (d) the inherent relationships among the foregoing. He notes that all schools have some internalized sense of accountability, and that the extent to which these internalized expectations and values were present determined how schools reacted to externally imposed accountability systems.

He concludes, based on data presented throughout the book, that high internal accountability is a necessary condition for schools to be successful in “responding to the pressures of external accountability systems.” But Elmore acknowledges that even given this conclusion, it is difficult to ask high schools to develop this sense of “coherence.” High schools by definition are compartmentalized institutions focusing on a range of subject-matter emphases. Thus, bringing an entire school onto the same page with respect to “internal accountability” is a significant challenge. The argument presented by Siskin in Chapter 3 dealing with teachers’ reactions to accountability as it affects tested and untested subjects is a particularly good example of this problem.

In general, I found the notion of internal accountability, the framework that guides the entire book, difficult to fully conceptualize. I kept hoping for more information such as vignettes, anecdotes, or any kind of detailed analytic summary that could better illustrate the notion of weak/strong internal accountability in practice, but it just was not there. Additionally, the authors’ underlying assumption that alignment with external accountability is laudable is troubling. The assumption being made was that alignment with state-imposed accountability will yield greater learning, achievement, and overall benefits for students. Aside from streamlining the bureaucracy that is imposed on schools, it isn’t fully established why accountability alignment is so critical for creating successful schools. Further, authors throughout this book rarely challenge the current reform movement; again, Siskin’s Chapter 3 is a notable exception. The book fails to mount a convincing argument that accountability as an “end” is a worthy goal for enhancing educational opportunities for students of all backgrounds.

References

Abelmann, C., & Elmore, R. F. (1999). When accountability knocks, will anyone answer? Philadelphia: Consortium for Policy Research in Education.

Haney, W. (2000). The myth of the Texas miracle in education. Education Policy Analysis Archives, 8(41). Retrieved June 17, 2004 from http://epaa.asu.edu/epaa/v8n41/.

About the Author

Sharon L. Nichols is currently a Postdoctoral Research Associate working with David Berliner at Arizona State University. Her postdoctoral research project is a study investigating the effects of high-stakes testing on student achievement patterns. She received her degree from University of Arizona in educational psychology under the supervision of Tom Good. In Fall 2004, she will join the faculty of the University of Texas, San Antonio.

 
