The Data Puzzle: The Nature of Interpretation in Quantitative Research

Herbert M. Kritzer



American Journal of Political Science, Vol. 40 (February 1996), pp. 1-32



Theory: Interpretation is central to the social scientist's process of analysis, regardless of whether that analysis relies on quantitative or qualitative data. This essay presents a "reconstructed logic" of the interpretation process involved in quantitative data analysis.

Argument: Drawing upon a broad literature on interpretation, the paper shows how the interpretive process for quantitative "data" has significant similarities to interpretation in other settings. For example, both qualitative textual analysis and quantitative statistical analysis rely upon contextual and tropological paradigms, although the specific conventions differ in many respects. The process of play employed by musicians and actors in developing an interpretation of a piece of music or a dramatic role suggests ways in which the quantitative analyst might let data perform to help in arriving at appropriate interpretations of statistical results.

Conclusion and Implications: The lines between quantitative and qualitative social science are less clear than often presumed. Both types of analysis involve extensive interpretation, and tools of interpretation that have many fundamental similarities.





There is no royal road to statistical induction, that the informed judgment of the investigator is the crucial element in the interpretation of data.

--Jacob Cohen (1990, 1304)

Even those texts or archeological documents which seem the clearest and most accommodating will speak only when they are properly questioned.

--Marc Bloch (1953, 64)

Introduction

How do those of us who do quantitative political science (and quantitative social science more generally) interpret our analyses and data? To answer this question, we might be tempted to turn to a convenient book shelf, pull off a favorite statistics text, and look up "interpretation" in the index. Most likely we would discover that neither our favorite statistics book, nor most of the others on our shelves, have index entries for "interpretation" (only about 10% of the more than 60 texts on my shelf have such an entry). Scholars in the humanities (e.g., literary criticism, cultural history, performing arts) have extensively considered the idea and practice of interpretation, and social scientists who rely upon qualitative tools to study social phenomena (e.g., anthropology, social history) regularly debate interpretive practices and theory. Beyond the elementary process of understanding what a specific statistical result "means" (e.g., Achen 1982; Liao 1994; but see McCloskey 1985), however, those of us who do quantitative social science seldom address the issue of how we "do interpretation." While some might argue that this neglect is because interpretation is not, and should not be, central to quantitative social science, such a position at best begs the issue. In fact, as one moves from previously existing texts of the type central to the humanities, through the textual materials of qualitative social science, to the quantitative data many of us use, the role of interpretation--which I broadly define as the process of ascertaining the meaning(s) and implication(s) of a set of materials--actually increases.

The archetypical qualitative method in social science is ethnographic fieldwork. The data ultimately analyzed by ethnographers are largely in the form of field notes. These data themselves are the result of a process of interpretation: "The data fieldworkers come to hold are not like dollar bills found on the sidewalk and stealthily tucked away in our pockets for later use. Field data are constructed from talk and action. They are then interpretations of other interpretations" (Van Maanen 1988, 95; see also Sarat 1990, 163). Similarly, most quantitative social science data are constructed by some process of interpretation. The provider of the data may make the interpretation, such as when someone interprets and responds to a question posed by an interviewer (see Fee 1981). Alternatively, the researcher who collects the data engages in interpretation through the process of writing survey questions, constructing and applying a set of codes, or establishing rules for what is and is not included when something is counted or measured. Furthermore, with quantitative data, the selection and application of statistical procedures represents another element of the interpretive process.

That interpretation is important in quantitative social science should not be surprising because interpretation is central to analysis of human phenomena. In literary analysis, one is typically presented with a text for interpretation. In qualitative social science, the analyst must construct the text for interpretation. In quantitative social science, the analyst constructs both a first order text (in assembling the data) and a second order text (in the form of statistical results). With each additional step in the process, the role of interpretation increases, as do the technical elements that must be considered as part of the interpretive process. Thus, rather than being more divorced from the human process of interpretation, quantitative social science probably involves more levels of interpretation than does qualitative social science.

The important role of interpretation is often evident in our reports of quantitative research. Take for example what many would describe as the central finding of The American Voter, the role of party identification. The key statement of that finding is not any particular statistical result, but the interpretation of a pattern of results (Campbell et al. 1960, 135):

We are convinced that the relationships in our data reflect primarily the role of enduring partisan commitments in shaping attitudes toward political objects. Our conviction on this point is rooted in what we know of the relative stability and priority in time of party identification and the attitudes it may affect. We know that persons who identify with one of the parties typically have held the same partisan tie for all or almost all of their adult lives. But within their experience since coming of age many of the elements of politics have changed ...

The reactions to the personalities of Eisenhower and Stevenson, to the issues of the Far Eastern war, and to the probity of the Democratic Administration differed markedly according to the individual's party allegiance. If we are to trust the evidence on the stability of party identification, these differences must be attributed to the capacity of a general partisan orientation to color responses to particular political objects.

Here interpretation is apparent as the bringing together of disparate pieces of evidence. As I will argue in the pages that follow, however, interpretation is central throughout the analysis of quantitative data.

In the following sections I seek to move the understanding of interpretation in quantitative analysis beyond the usual discussions of the technical meaning of the results yielded by particular methodologies. In doing so, I draw on a variety of sources of discussions of interpretation, with the goal of showing how social science can transcend many of the traditional boundaries between quantitative and qualitative analysis.

Text and Interpretation

Texts and Text Analogues

In recent years, the concept of "text" has expanded to include a wide variety of phenomena that are appropriate for the type of interpretive activities that traditional texts are subject to. The model of the text has been applied to human action (see Ricoeur 1979) and to society as a whole (see Brown 1977). In this section I extend the model of the text to quantitative data and statistical results derived from those data; that is, I view a set of quantitative materials as a "text-analogue" (Taylor 1971, 4).

Some data may fall outside the bounds of a text-analogue in that they do not require interpretation. Such "brute data" are those "data whose validity cannot be questioned by offering another interpretation or reading, data whose credibility cannot be founded or undermined by further reasoning" (Taylor 1971, 8). While data can frequently be used to answer directly a question in a brute data sense (i.e., unambiguously, such as, what is the numerical balance between boys and girls in an elementary school classroom?), the more interesting, more theoretical questions that social scientists typically pose cannot be answered by direct reference to brute data. This is illustrated by the phenomenon of party identification discussed in the introduction. This problem is by no means unique to the social sciences. As Hempel points out (1966, 81), in a natural science that limited itself to brute data observables (my term, not his), it would not be possible to develop precise explanatory principles for the kinds of phenomena that are explained by "underlying entities such as molecules, atoms, and subatomic particles."

To be interesting and useful for social inquiry, data must be interpreted. Data seldom speak unless asked. Many analysts often will say to themselves or others, "what are these data telling me" or "these data say to me...." These statements can only be made in the context of asking particular questions within the context of some model; given such a context, one can look to data for help in answering the questions raised by the model. Some of the answers provided by the data may be fairly clear in the context of the questions asked, but a slight modification of the question may substantially cloud the answers. In fact, the practical problem is often that of trying to define the questions and/or the model so that the answers yielded by the data are internally consistent.

In discussing the need for interpretation of text, Taylor observes (1971, 11):

The meaning of a word depends ... on those words with which it contrasts, on those which define its place in the language ... on those which define the activity ... in which it figures.

Consider the following modification:

The meaning of a datum depends on those data with which it contrasts, on the theory that defines its place within the larger data set, on the context to which it relates....

The problems of data interpretation parallel those of textual interpretation.

Precisely this parallelism leads me to view quantitative materials as a text analogue. As I noted previously, in choosing the statistical methods to use in analyzing quantitative data, the analyst actually produces much of the text that is subject to direct interpretation. Those analyzing quantitative data generally do not interpret the raw data; they interpret statistical results derived from the data. Researchers do not determine the outcomes of the analyses, but they have a central role in constructing and selecting results.

One distinction that some observers make is between reading and interpretation. According to Scholes (1985, 21-23), "the supposed skill of reading is actually based upon a knowledge of the codes that were operative in the composition of any given text and the historical situation in which it was composed." In contrast, interpretation is the process of connecting between "texts" at different levels: "Interpretation is not a pure skill but a discipline deeply dependent upon knowledge. It is not so much a matter of generating meanings out of a text as it is a matter of making connections between a particular verbal text and a larger cultural text" (ibid., 33). As will be evident in the discussion that follows, what I refer to as the first level of interpretation is very close to what Scholes labels reading. I do not choose to apply that label because, as I will discuss in detail below, even the simplest levels of interpretation rely on devices such as metaphors and analogies. Once such devices become second-nature to the analyst, however, they are almost invisible and the process will seem very much like reading.

Levels of Interpretation

In thinking about interpretation of quantitative data and the results of statistical analysis, one can identify three levels at which the interpretive process operates in assessing the meaning and adequacy of quantitative results. The first level--first order interpretation--draws upon the formal definition of the statistical measures being considered. For example, how do we explain what a regression coefficient means to a student or a lay audience? At the heart of the typical explanation is some type of analogy to a machine-like process based on Newton's action-reaction third law of mechanics.(1) One such analogy is to a rigid lever: as one variable changes the other variable changes in the same general way that one end of a lever moves as the other end is moved (albeit in opposite directions).(2) While there are problems with this lever analogy--there is a causal element but almost no stochastic component to the lever movement--the implicit analogy, or one like it (e.g., the dial on an experimental machine), facilitates both our presentation of results to our audience(s) and our own attribution of meanings to regression slopes. The experienced data analyst backs off from the simple analogy to recognize the ambiguities of causation, and to introduce the stochastic component. Without something like the lever analogy, however, most first order interpretation would be extremely difficult.

The importance of the lever-type analogy can be seen by looking at the problems of interpretation that arise when the metric of the dependent variable lacks intuitive meaning. It is common to find reports of analyses of log-linear, logistic, and probit models with no references to the actual values of the coefficients, other than the statistical conclusion that individual coefficients do or do not differ from zero. It is as if we have measured the movement of the manipulated end of the lever in centimeters and the movement of the opposite end in log cubits; the latter metric is unfamiliar (the cubit is a biblical unit of measure) even though it could be meaningfully interpreted by someone accustomed to thinking in terms of that metric. Compare this to the problem of the "line" in Abbott's 19th century classic Flatland (1882 [1953]). When the line, who came from a two dimensional world, visited a one dimensional world, he could not communicate the nature of his world to the inhabitants of one dimension; similarly, after having the opportunity to experience a third dimension, the line could not communicate that experience to the other inhabitants of Flatland (in fact, the line was declared insane). In the models referred to above, the dependent variable is measured on an interpretable metric, but the metric is so unfamiliar to most people, both analysts and potential audiences, that no effort is made at interpretation. The problem here is not simply one of metrics. The problem is that the familiar linkage to something like the lever does not work, and there is no accepted alternative that takes into account the nonlinearities imposed by limiting factors but uses a familiar metric; one such alternative is the variable amount of force needed to stretch a rubber band--small amounts of force are needed at first, and this increases sharply as the stretch approaches its limit.
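The nonlinearity described above can be made concrete with a small numerical sketch. It assumes a logistic model and a hypothetical coefficient of 1.0 on the log-odds scale; the function names and the values are illustrative only and are not drawn from any analysis discussed in this paper. The same coefficient corresponds to a large change in probability for a case near 0.50 and a small one for a case near the floor or ceiling, which is why the lever analogy fails in this metric.

```python
import math

def logistic(z):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def probability_change(coef, baseline_prob, delta_x=1.0):
    """Change in predicted probability when x rises by delta_x,
    starting from a case whose predicted probability is baseline_prob."""
    baseline_logit = math.log(baseline_prob / (1.0 - baseline_prob))
    return logistic(baseline_logit + coef * delta_x) - baseline_prob

# Hypothetical coefficient of 1.0 on the log-odds scale (an assumption).
for p in (0.05, 0.50, 0.95):
    print(f"baseline {p:.2f}: change {probability_change(1.0, p):+.3f}")
```

Reporting such changes at a few substantively meaningful baselines is one way of translating the unfamiliar metric into a familiar one.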

Interestingly, many statistical results have little in the way of first order interpretive tools. What does one say to a student who asks "what does the value we get for an F-ratio mean?" Of course, one can easily say, "the bigger, the better," but that does not help a lot in terms of interpretation; one could say that the F-ratio does not have any direct meaningful interpretation, but that is not correct. What is lacking here is an easily understood interpretive device of the type regularly used in contexts such as regression slopes. In fact, there may be many statistical indicators which do not lend themselves to interpretive tools, and these are usually the ones that are the most difficult to teach or to communicate to persons not already familiar with the indicator.

Second order interpretation is the use of statistical results to identify "problems" in the data and the analysis. Some of the problems are data-oriented, some merge theoretical issues with data issues, and some are primarily the result of theoretical complexities in the substantive model. Typically the identification of problems involves linking patterns in a nonobvious manner; that is, one might be tempted to apply some first order interpretation, but the pattern in the results suggests something different.

With experience, the analyst learns how to recognize the need to move beyond the first level of interpretation focusing on the simple meaning of statistical indicators.(3)

For example, what is indicated by regression coefficients that are large in absolute terms, but have the "wrong" sign (based on theoretical expectations) and fail to achieve statistical significance? Is this an indication that the variables are unrelated, or that the relationship is the exact opposite of what the researcher was expecting? In fact, the pattern described here frequently occurs when two or more predictors have a high degree of collinearity; recognizing this pattern as reflecting collinearity can lead to further analyses that can confirm that interpretation. While analysts may tend to think of collinearity as a problem of data (i.e., lack of information in the data) rather than a problem that combines theory and data, there are cases of collinearity that are fundamentally substantive reflecting a strong theoretical relationship between two variables. One example is what difference, if any, does the vice presidential candidate make in voting decisions? While one might want to include feeling thermometer scores for vice presidential candidates in an equation predicting vote, the collinearity between presidential and vice presidential feeling thermometers is so high, and it would be theoretically surprising if it was not high, that separating the effects of the two candidates is extremely difficult.
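A brief sketch may help show why high collinearity produces large but statistically insignificant coefficients. In a regression with two predictors correlated at r, the sampling variance of each slope is multiplied by 1 / (1 - r^2) relative to the case of uncorrelated predictors, other things equal. The function name and the values of r below are illustrative assumptions, not estimates from election data.

```python
import math

def variance_inflation(r):
    """Variance inflation factor for two predictors correlated at r."""
    return 1.0 / (1.0 - r * r)

# Illustrative correlations between two predictors (assumed values).
for r in (0.0, 0.5, 0.9, 0.95, 0.99):
    vif = variance_inflation(r)
    print(f"r = {r:.2f}: variance x {vif:6.1f}, std. error x {math.sqrt(vif):4.1f}")
```

As r approaches 1, the standard errors grow without bound, so that estimates can swing widely in size and sign from one sample to the next.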

For some types of models, however, the kinds of inconsistencies associated with collinearity may be a product of the structure of the model itself (i.e., connected to the substantive theory that defines the model). For example, models that incorporate statistical interactions (i.e., the effect of two or more predictors is not equal to the sum of their individual effects) may exhibit patterns associated with collinearity. The sign reversal and/or the failure to achieve statistical significance, however, may reflect that several coefficients must be interpreted jointly rather than looking at the individual coefficients one at a time. Even if individual coefficients are not statistically significant, the coefficients taken as a set can be. The interpretive process involves learning to recognize these kinds of patterns, and associating particular meanings to the patterns.
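The need for joint interpretation can be sketched with invented numbers. In a model of the form y = b0 + b_x*x + b_z*z + b_xz*x*z, the effect of x is not b_x alone but b_x + b_xz*z, so a "wrong" sign on b_x describes only the case where z equals zero. The coefficient values below are hypothetical and chosen purely for illustration.

```python
def conditional_effect(b_x, b_xz, z):
    """Effect of x on y at a given z in the model
    y = b0 + b_x*x + b_z*z + b_xz*x*z."""
    return b_x + b_xz * z

# Hypothetical estimates: the coefficient on x alone has the "wrong" sign.
b_x, b_xz = -0.5, 0.25
for z in (0, 2, 4, 6):
    print(f"z = {z}: effect of x = {conditional_effect(b_x, b_xz, z):+.2f}")
```

If the observed values of z lie mostly above 2 in this invented case, the substantively relevant effect of x is positive even though its individual coefficient is negative.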

Another issue of second order interpretation is that of recognizing how specific features of the data can influence statistical results in ways that are not closely tied to the substantive theory. For example, a small number of extreme outliers can greatly change regression results. Unexpected coefficients may be traceable to such outliers. In recent years, tools such as regression diagnostics have become available that can help the analyst in identifying individual cases that may be particularly important in generating specific results. While one might say that analysts should always use such techniques, in reality the range of diagnostic procedures is so broad that analysts tend to use specific procedures selectively: i.e., when there is reason, either because of some known feature of the data or because of problematic results of some type. That is, in practice, the effective use of diagnostics is heavily dependent upon the interpretation of initial results as indicating possible problems.
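The influence of a single case can be illustrated with a crude leave-one-out check, offered here in the spirit of regression diagnostics rather than as any particular published procedure. The data are invented: eight cases with no relationship between x and y, plus one extreme case. The function names are likewise assumptions made for this sketch.

```python
def slope(xs, ys):
    """Ordinary least squares slope of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def leave_one_out_slopes(xs, ys):
    """Slope re-estimated with each case dropped in turn."""
    return [slope(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
            for i in range(len(xs))]

# Invented data: eight unrelated cases plus one extreme case at (20, 20).
xs = [1, 2, 3, 4, 5, 6, 7, 8, 20]
ys = [5, 3, 6, 4, 5, 3, 6, 4, 20]

full = slope(xs, ys)
dropped = leave_one_out_slopes(xs, ys)
# The case whose removal changes the slope the most.
most_influential = max(range(len(xs)), key=lambda i: abs(dropped[i] - full))
print(f"slope with all cases: {full:.2f}")
print(f"most influential case: index {most_influential}, "
      f"slope without it: {dropped[most_influential]:.2f}")
```

In these invented data the apparent relationship is produced entirely by the one extreme case; removing it leaves no slope at all.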

The failure of an analysis to yield results consistent with a theoretical proposition may be a "data" problem rather than a problem with the theory. An example is the relatively low relationship between scores on Graduate Record Examinations (GRE) and performance in graduate school. The range of GRE scores represented in a typical graduate program is relatively narrow because the process of selecting students for admission screens out many or most students who fail to perform well unless there is some information that suggests that the GRE may not be a good indicator for a specific individual. If true random samples of students with GRE scores across the full range were available, the posited relationships might appear much stronger, both in terms of the correlation between GRE scores and performance and in terms of the slope(s) predicting expected performance from GRE scores. Part of second order interpretation involves looking for indicators of this type of "selectivity bias" and knowing what types of patterns in results are indicative of such bias. In some situations, one should expect to have problems obtaining results, and obtaining theoretically consistent results may suggest problems. In one analysis of the predictability of graduate school GPA based on GRE, I found a modest relationship, but that relationship was in fact generated by a single outlier in the data set. Interestingly, at the time I did that analysis, I had not thought through the selectivity bias issue, and was expecting to find a relationship on the order of what I found. Because the result was what I expected, I did not initially check for things like outliers; when finally I did this check, and removed that outlier, the relationship disappeared. In fact, it may be common not to consider appropriate second order interpretations regarding problems in the analysis when the first order interpretations confirm our expectations.
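The effect of a narrowed range of scores on the observed correlation can be sketched with a small simulation. Everything in it is an assumption made for illustration: standardized test scores, a true correlation of 0.6 with performance, and an admission rule that keeps only the top fifth of scorers. It does not reproduce any actual GRE data.

```python
import random

def correlation(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(1996)  # fixed seed so the sketch is repeatable
# Simulated standardized scores; performance correlates about 0.6 with them.
scores = [rng.gauss(0, 1) for _ in range(5000)]
performance = [0.6 * s + 0.8 * rng.gauss(0, 1) for s in scores]

# Assumed admission rule: only the top fifth of scorers are observed.
cutoff = sorted(scores)[int(0.8 * len(scores))]
admitted = [(s, p) for s, p in zip(scores, performance) if s >= cutoff]

r_all = correlation(scores, performance)
r_admitted = correlation([s for s, _ in admitted], [p for _, p in admitted])
print(f"correlation, full range of scores: {r_all:.2f}")
print(f"correlation, admitted students only: {r_admitted:.2f}")
```

Under these assumptions the correlation among the admitted students is markedly weaker than in the full simulated population, even though the underlying relationship is identical for everyone.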

A final type of second order interpretation problem arises from recurring patterns in data that have roots in substantive theory. One of the most common of these involves an intervening variable that has recurring influence across a range of substantive questions. One example of this can be found in aggregate state-level analysis. Inexperienced analysts, such as graduate students working with aggregate state data for the first time, will often find relatively weak patterns in their analyses. Frequently, controlling for region, in one of a variety of ways (the simplest of which is to omit the Southern states from the analysis), will drastically change the results. These kinds of patterns recur in such disparate contexts as variation in party identification (see Norrander 1989), criminal sanctioning and deterrence (Kritzer 1975), and state expenditures and policies (Erikson, Wright, and McIver 1989, 745-7). Knowing what kinds of patterns to look for involves learning how to interpret data within a given substantive context, in this case the context of aggregate state-level analysis.

Third order interpretation is the most complex and the least understood: connecting the statistical results to broader theoretical patterns. One can approach this aspect of interpretation as a problem in Bayesian inference. Since an analyst can never be certain that his or her theoretical model accounts for the patterns observed in a set of data, the "truth" of any given model can only be known to some degree of certainty. Implicitly, researchers attach varying probabilities to a range of models (most of which are never articulated, or even explicitly recognized to exist); at its simplest, there is a probability that THE model is true, with the complement being the probability that any OTHER model is true (obviously this presumes that some model is "true," which is an arguable issue). The process of linking data (and data analyses) to the substantive theory is one of revising the probabilities associated with the substantive models.
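In its simplest two-model form, this revision of probabilities follows Bayes' rule, and a minimal sketch may make the mechanics concrete. The prior of 0.5 and the two likelihoods (0.8 and 0.2) are invented for illustration; in practice analysts rarely, if ever, state such quantities explicitly.

```python
def posterior_probability(prior, lik_model, lik_other):
    """Posterior probability that THE model is true after one result.

    prior:     probability attached to THE model beforehand
    lik_model: probability of the observed result if THE model is true
    lik_other: probability of the observed result if some OTHER model is true
    """
    numerator = prior * lik_model
    return numerator / (numerator + (1.0 - prior) * lik_other)

# Hypothetical analyst: starts undecided, then sees three results, each
# four times as likely under THE model as under the alternative.
belief = 0.5
for result in range(1, 4):
    belief = posterior_probability(belief, 0.8, 0.2)
    print(f"after result {result}: P(THE model) = {belief:.3f}")
```

Each consistent result raises the probability attached to the model, but never to certainty, which accords with the point that the "truth" of a model is known only to some degree.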

Drawing upon this type of Bayesian perspective, Leamer described a variety of types of "specification searches" which he defines as "data dependent processes of selecting a statistical model" which can be "used to bring to the surface the nuggets of truth that may be buried in a data set" (1978, 1). In Leamer's analysis, specification searches differ from what is derogatorily referred to as data grubbing:

The essential ingredients of specification searches are judgment and purpose, which jointly determine where in a data set one ought to be digging and also which stones are gems and which are rocks. Without judgment and purpose, a specification is merely a fishing expedition, and the product of the search will have a value that is difficult or impossible to assess (ibid., 2).

Specification searches are processes of "inference," and inference is one way of thinking about interpretation (i.e., the connecting of statistical evidence to substantive propositions).(4) Leamer's concern is with the statistical aspects of inference, but he clearly recognizes that a full model of inference involves substantial elements that lie outside the realm of statistics and probability (ibid., 16-20).

Whether or not one adopts an explicit Bayesian perspective, third order interpretation is the use of data to assess and develop theoretical propositions. As I will show below, third order interpretation is closely tied to contextual elements such as substantive theory, data collection/generation, and side information available to the analyst. Understanding third order interpretation involves both the recognition of the key contextual elements and a model of how the analyst links specific results to those elements.

Approaches to Textual Interpretations

There are many approaches to textual interpretation. Two of these, contextual analysis and tropological analysis, are particularly useful because they have direct analogues in the interpretation of quantitative data and results of statistical analyses. Space does not permit comprehensive discussions of each approach; consequently, I provide only a summary of the elements necessary for my discussion of interpretation of quantitative materials.

Contextual Analysis

I am using the term "contextual analysis" of text to refer to the method of interpretation that seeks to understand the broad context in which the text was produced (e.g., the supposed ideological nature of the 1964 election, or the selection of new justices for the Supreme Court by Franklin Roosevelt), and how that context served to influence the content of the text (e.g., responses to issue questions in election surveys, or decisions by Supreme Court justices). In social science terms, one might think of this as a model of individual "author" decision-making, where the experience, background, and environment of the author of the text influence, although they do not necessarily determine, the creation of a given text. Thus, the key part of interpreting a text is understanding what influenced the author: what had the author done previously, what responses had the author's earlier work provoked, who was the author in communication with, etc.

This type of contextual analysis can be linked to movements in literary analysis that have been labeled "historical criticism," "sociological criticism," and "psychological criticism" (see Beljame 1948; Duncan 1953).

During the 1970s and 1980s, these approaches to reading texts through various presumed lenses of the authors were displaced by post-modern/post-structuralist approaches (see, for example, the selections in Harari 1979). The contextual elements surrounding the creation of a text have, however, remained an important aspect of interpretation, as evidenced in the emergence of the "new historicism" (see Smarr 1993; Bernstein 1991; Greenblatt 1990). Late 20th Century critical approaches have added a second contextual element: what influences the way that a text is perceived; the impact of the interpreter's context is the subject of a school of post-structuralist criticism that is called "reader-response" criticism (see Jauss 1982; Tompkins 1980).

Both forms of contextual interpretation are important. Students embarking on their first "real" literature courses are encouraged to understand the context in which the author wrote and the context in which they are reading the text. Those who must present a text to others, such as directors of theatrical productions, rely heavily upon contextual analysis, often having to choose which context (the author's or the audience's) should serve as the focus of interpretation.

Tropological Analysis

Traditional tropological analysis is a form of interpretation that relies upon the identification and explication of tropes found in a text. In traditional language and text analysis, "tropes are deviations from literal, conventional, or 'proper' language use, swerves in locution sanctioned neither by custom nor logic" (White 1978, 2). More generally, tropes are those forms or elements of presentation that have come to be accepted norms of expression and explication; they are the kinds of usages that we have come to expect and that we are able to readily interpret. Students of literary texts are taught to look for and understand tropes such as metaphor, metonymy, synecdoche, and irony--what Kenneth Burke (1945, 503-17) describes as the four "master tropes." Social science students are taught to be concerned about a different set of tropes; these include, but are not limited to, the trope of setting (see Kondo 1990, 7-9), trope of control, trope of statistical significance (see McCloskey 1985, 154-9), and the trope of explained variation. In both literary and social scientific analysis tropes provide ways to express a thought or describe a "reality" that is complex, inconsistent, and even contradictory.

The purpose of recognizing tropes is to facilitate interpretation, and to standardize the communication of those interpretations. If one sees a particular statement in tropological terms, one can understand the meanings attached to the words by the author of the text, and one can begin to share the full message that the author intended to convey by his or her selection of those words. Tropes can be thought of as a way of recognizing levels of meanings attached to words. At one level, there is what might be described as the plain meaning, or, to anticipate a later usage, "brute meaning;" that is, words mean exactly what they are defined to mean ("I say what I mean and I mean what I say"). Tropological analysis seeks to move beyond the "brute meaning" of words to recognize that words can mean more than one thing, that those meanings need not be consistent, and that the meaning of a specific word depends upon the other words around it.

Probably the best known of the classic tropes is metaphor, which is an implied comparison that uses a word or phrase ordinarily applied to one object to refer to something very different (e.g., "all the world's a stage"). More generally, metaphor is "seeing something from the viewpoint of something else" (Brown 1977, 77), or "general mappings across conceptual domains" (Lakoff 1993, 203). As applied to word usage, metaphors are "an illustrative device whereby a term from one level or frame of reference is used within a different level or frame" (Brown 1977, 78). In electoral studies, the spatial metaphor has been a central conceptual tool at least since the publication of An Economic Theory of Democracy (Downs 1957). Researchers have used this metaphor to develop theories, to design survey questions, to guide statistical analyses, and to interpret and communicate statistical results for a wide range of choice-oriented political behaviors (e.g., party strategies, mass voting decisions, appellate court decision-making, etc.).

As I will discuss below, metaphors do more than communicate the results of social scientific analysis: tropes underlie the interpretive aspects of the analysis itself. Miles and Huberman (1984, 221) point out four ways that metaphors are used in qualitative analysis:

While they have in mind theory-specific metaphors, their discussion (1984, 234-5) of "triangulation"--the method of pinpointing a location by sighting from several different perspectives--is one example of the role of metaphor in the process of data analysis. Thus, as Brown (1977) has argued, the "logic of discovery" in social science involves an "aesthetic" which is heavily dependent on metaphoric thinking, and the end result of interpretation, discovery, relies heavily upon a more general tropological process.

Interpretation in Quantitative Analysis

Contextual Interpretation

The issue of understanding the broad context in which a set of data was obtained suffuses all empirical analysis, both quantitative and qualitative. A failure to take into account the context can raise questions about the validity of an interpretation. The influence of context suffuses the entire research process from the selection of theory and data to the analysis and interpretation that follow. For example, the contextual elements underlying The American Voter include the social-psychological orientations of the authors, the domestic political context of the post-New Deal era, the international context of the Cold War, and the cultural context of Western democratic theory. Simply put, if the authors of The American Voter had been scholars at the Moscow Institute of American Studies, one would have expected a radically different book.

Fenno has pointed out, in his presidential address to the American Political Science Association, that context is frequently a very important variable in an analysis (1986, 4-6); however, context is more than just another variable. For example, Huff's classic work, How to Lie With Statistics (1954), makes clear the role of a comparative base in attributing meaning to a set of statistical results ("the little figures that are not there," "the gee-whiz graph," "the one-dimensional picture," "the semi-attached figure"). Many of the same points are echoed in the more recent book Innumeracy (Paulos 1988): people are afraid of numbers because they do not know how to interpret them, which often means that they do not know how to put them in context.

In good social science, the contextual element of interpretation extends beyond the generation of the data to include the theoretical issues that motivated the collection and analysis of the data (or, in the case of secondary analysis, the selection of a particular data set). While an analyst can always report any given datum as an isolated statistic, a single datum is useful only in the context of both other data and theory; as Brown observes (1977, 43):

Scientific representations are significant only in relation to a surrounding body of theory. The fact that such theories are justified only by supplemental scientific observation does not matter; for the definition of what constitutes such observation is also made from the point of view of a surrounding body of theory [emphasis in the original].

To illustrate this point, Brown quotes Polanyi's description of how medical students learn to read X-rays (Polanyi 1958, 101):

Think of a medical student attending a course in X-ray diagnosis of pulmonary diseases....At first the student is completely puzzled.... Then as he goes on listening for a few weeks, looking carefully at ever new pictures of different cases, a tentative understanding will dawn on him...And eventually a rich panorama of significant details will be revealed....He has entered a new world....Thus, at the very moment when he has learned the language of pulmonary radiology, the student will also have learned to understand pulmonary radiograms. The two can only happen together. Both halves of the problem set to us by an unintelligible text, referring to an unintelligible subject, jointly guide our efforts to solve them, and they are solved eventually together by discovering a conception which comprises a joint understanding of both words and the things.

It is notable that the term customarily used to refer to the analysis of X-rays is "read" rather than "interpret" although the latter term is probably more appropriate in many situations (but see Scholes 1985, 21-3). The use here of "read" indicates that there is tight linkage between a particular pattern of "results" and a clearly recognized disease; the experienced analyst can immediately draw the connection between the X-ray "text" and the standard diagnosis. Extending the analogy to include other medical diagnostic tools, the interpretive process comes to resemble more and more the problem of interpretation confronting a quantitative social scientist: taking a variety of quantitative and nonquantitative pieces of information and assessing them within a theoretical context (which has also directed the selection of the information to be examined). The process of medical diagnosis is relatively straightforward because the theoretical contexts are relatively narrow and generally accepted. While social scientists engage in similar theoretically-based interpretation, the range of potential theory is much broader, and the links between theory and data are much weaker and/or not agreed upon.

Despite these weaknesses, the contextual element in data interpretation is crucial. Without reference to context, both theoretical and otherwise, we are confronted with a problem of information overload: there is virtually an infinite range of information one could look at. Furthermore, once data are selected for interpretation, it is the context that allows the analyst to attach meanings to the data. This is a problem both for data generated from naturalistic settings and from experimental settings. For example, a group of followers of the Maharishi Mahesh Yogi undertook an experiment to test the ability of group practice of transcendental meditation to reduce societal stress (not directly related to practitioners); applying sophisticated statistical methodologies, Orme-Johnson and his colleagues (1988) reported experimental evidence supporting the theory they were testing. Not surprisingly, despite the technical correctness of the methods applied, scholars who are not part of the transcendental meditation fraternity have a great deal of difficulty accepting the validity of Orme-Johnson et al.'s interpretation (Russett 1988; Duval 1988).

I am not saying that by selecting the appropriate context, one can ascribe any meaning that one wants. There are wrong interpretations of data under any context, and there are interpretations that would be wrong under all contexts. At the same time, an interpretation that is right in one context may not hold in a different context. For example, the description of party identification as a deeply rooted psychological attachment and the recognition of that description based on the various analyses reported in The American Voter makes sense in the context of a voting model that is essentially social-psychological in nature. In contrast, an analyst working in a rational choice framework almost certainly would not come to the same conclusion, and probably would have never chosen to collect much of the data included in the early ANES surveys, or, taking those data as a given, to look at the analyses described by Campbell and his colleagues. It is only in the specific theoretical context that the data become relevant for analysis.

The larger context in which the analyst is working drives the choices of both theory and data. In an almost trivial sense, we can see this in the variations in disciplinary approaches to the same problem (see Lehman, Lempert, and Nisbett 1988). One expects, and finds, that a social psychologist and an economist adopt different theories when looking at the same phenomenon, whether that be voting, bargaining, the commission of criminal acts, or any of the many other questions found in both the economic and social psychology literature. Beyond that, context of current events, current problems, and current political priorities shape the selection of explanatory theories and the availability of data for analysis. We know much more about bargaining in criminal cases than we know about bargaining in civil cases in the American justice system because the political climate of the 1960s and 1970s directed attention and money toward the "problem of crime;" the models derived from the study of crime influence the analysis of parallel processes in the civil arena (see Kritzer 1991) simply because more analysis has already been done. We will never know how our understanding of the criminal system would be different if initial attention had been focused on the civil system.

The American Voter's focus on the 1952 and 1956 elections, rather than the 1964 and 1968 elections, had very important implications for how interpretation of data evolved over time. The elections during the 1950s did not have strong issue content, and this focused attention on other kinds of factors to help differentiate between Eisenhower and Stevenson supporters. This led to very careful and thoughtful development of models that could account for voting in nonissue elections. These models then became the base against which the more issue-oriented elections of the 1960s were evaluated. If, on the other hand, the paradigmatic elections had been more issue-oriented, the focus on nonissue-based models might not have come to the forefront; the theoretical context in which the elections without strong issue content were analyzed might have been very different, and the intellectual history of American voting studies might show little resemblance to what we see when we look back at 40 years of research.

Another contextual element is the "side information" the analyst brings to the research. By side information I mean information not contained within the data set being analyzed. The simplest kind of side information is "common sense" or "common knowledge." While we all know that common sense or common knowledge can be wrong, we still regularly rely upon it, if nothing else as what Wonnacott and Wonnacott (1990, 295n6) call a "personal prior probability." Side information extends well beyond this level as the following examples illustrate.

When a researcher has collected original data, the process of collecting the data can yield important insights. This may involve information distilled from the experience of conducting pretest face-to-face interviews, or perhaps from semi-structured follow-up interviews specifically intended to clarify quantitative analysis. Some years ago, Converse (1974, 65) provided a vivid illustration of the role of such information by referring to his own experience as an interviewer as part of a defense of his classic article "The Nature of Belief Systems in Mass Publics" (1964):

Many respondents could not understand that a battery of pure opinion items had no objective "right-wrong" scoring, or that don't know responses were not a confession of the most abject ignorance, to be avoided at all cost.

I was also struck by the frequency with which respondents chose a response alternative dutifully but accompanied the choice with side cues (shoulder shrugging, eye-rolling, giggles, and even sotto voce comments) indicating that they were very much out of their element and would pick any alternative haphazardly by way of helping me out.

The type of side information used in the secondary analysis of survey data is fundamentally different, but no less important. It may involve reference to prior research, or to events related to but outside the data actually analyzed. For example, in The Changing American Voter, Nie, Verba, and Petrocik (1979) argue that while Americans may not have viewed politics in ideological terms in the 1950s, ideological "constraint" jumped sharply in 1964. This interpretation of increased constraint relied heavily upon the "ideologically tempestuous campaign of 1964" (Kinder and Palfrey 1991, 4). As this well-known example shows, the use of side information does not necessarily lead to correct interpretation, even if the interpretation seems logical in the context of the political events in which it is embedded; Sullivan, Piereson, and Marcus (1978) demonstrated that changes in question wording that happened to coincide with the increased ideology content of the 1964 election could account for the measured difference in constraint.

Side information is by no means unique to survey research. Early studies of judicial ideology relied upon behavioral measures of agreement among Supreme Court justices (Pritchett 1948). The interpretation that agreement reflected ideology relied upon the knowledge that the appointing President sought to select new justices who agreed with the President's political goals (i.e., who shared the President's ideology). Thus, here again, the context (or situation background) of the analysis was centrally important to the interpretation of what a pattern of data meant.

The Tropes of Quantitative Interpretation

The idea that context is important for interpretation, both in qualitative and quantitative analysis, represents the clearest parallel between textual and statistical analysis. How can tropological analysis be extended to quantitative analysis? In fact, quantitative analysis relies heavily on tropes, including tropes that take the traditional form of metaphor, as well as a set of tropes that are specific to quantitative analysis. Some of these tropes are explicit, others are not.

Metaphor is important for the interpretation of statistical analysis. Take for example the previously discussed "slope trope" for interpreting regression coefficients. One immediate objection to this metaphor is the implication of causation. While regression analysts regularly acknowledge that a regression coefficient does not automatically imply causation, the interpretation of the coefficients cannot avoid the machine metaphor, even if the connecting word(s) are changed from "causes" to "is associated with."(5) One variable is seen as reacting in concert with the movement of the other variable. The movement of both variables may be attributable to a third variable, one variable may be influencing the second variable, or the two variables may be influencing each other. Regardless of the nature of the connection, the coefficient is interpreted as "if X changes one unit, we expect to see beta units change in Y" in exactly the same way that we say "if we move one end of a lever one unit, we expect to see P units movement in the other end of the lever." Understanding the former relies on familiarity with phenomena like the latter. Someone from a world where action-reaction machines did not exist would have a great deal of difficulty grasping the meaning of a regression coefficient; at the same time, someone from such a world might devise theoretical frameworks that differed sharply from the linear model that underlies much of the statistical analysis of social science (cf. Achen 1994).
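The lever reading of a coefficient can be illustrated with a minimal sketch in Python. The data are hypothetical and constructed so that a third variable, z, drives both x and y; the function name ols_line and all variable names are illustrative assumptions, not part of any analysis cited here.

```python
from statistics import mean

def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data: z moves both x and y; x has no direct effect on y.
z = [0, 1, 2, 3, 4, 5, 6, 7]
wobble = [1, -1, 1, -1, 1, -1, 1, -1]
x = [2 * zi + w for zi, w in zip(z, wobble)]
y = [3 * zi for zi in z]

intercept, slope = ols_line(x, y)
print(f"if X changes one unit, we expect to see {slope} units change in Y")
```

In this constructed example the fitted slope is 1.5, and the sentence the code prints has exactly the form of the lever statement, although by construction moving x alone would leave y untouched. The coefficient itself cannot distinguish among the three kinds of connection listed above.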

The machine analogy is not the only "trope" (the "slope trope") that one might use for interpreting coefficients. Gary King (1989, 102-10) has spelled out four alternative, but related, strategies for interpreting coefficients, each of which, with sufficient use, could become an interpretive trope:

These alternatives have potential advantages over the slope trope because they are not as closely tied to linearity, or to the machine analogy.

Images with metaphoric value are regularly used in describing distributions, the relationship among variables, and the relationship between metrics. One common example is the "s-curve" that relates the logistic and probit metrics to the probability metric. The particular graphical display used as an example by King (1989, 105) is such an s-curve, which he calls an "escalator-shaped" curve (note how this term also suggests movement upward or downward). Examples that are even more familiar include the normal "bell-shaped" curve and the "hat-shaped" bivariate normal distribution.

Aside from adapting standard tropes like metaphor to quantitative analysis, what are the special tropes that we regularly employ? One of the most common tropes in statistical analysis, interpretation, and presentation is the trope of statistical significance. It is one of the first elements of interpretation taught in introductory statistics courses, and its subtlety is something that students, and practitioners, have difficulty grasping. The first point of confusion enters with the use of the word significance, which is frequently confused with its near synonym importance. The standard line that teachers of statistics attempt to convey to students is to avoid confusing "statistical significance" with "substantive significance." Of course, experienced analysts understand that all statistical significance is concerned with is the likelihood of seeing the observed pattern if multiple observations were drawn under the condition that the null hypothesis is true. Conveying this understanding in a way that makes clear the distinction between statistical and substantive significance is difficult, and few texts effectively take on the challenge (see Hanushek and Jackson 1977, 60-5, for one of the more successful efforts).
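The distinction between statistical and substantive significance can be sketched with an exact permutation test, used here as one simple way of computing "the likelihood of seeing the observed pattern" under the null hypothesis. This is an illustration with hypothetical numbers, not a method drawn from the texts cited; the function name exact_permutation_p is an assumption.

```python
from itertools import combinations

def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in group means."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    hits = count = 0
    # Under the null, every relabeling of the pooled values is equally likely.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:
            hits += 1
        count += 1
    return hits / count

# Hypothetical scores: a large gap between groups ...
p_large_gap = exact_permutation_p([5, 6, 7, 8], [1, 2, 3, 4])
# ... and a substantively trivial gap with the same ordering of cases.
p_tiny_gap = exact_permutation_p([1.004, 1.005, 1.006, 1.007],
                                 [1.000, 1.001, 1.002, 1.003])
print(p_large_gap, p_tiny_gap)
```

Both comparisons yield the same p-value, 2 out of 70 possible relabelings, although the difference in means is 4 units in the first case and 0.004 units in the second. The test speaks to how surprising the pattern would be under the null hypothesis and says nothing about whether the difference is large enough to matter.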

An even more subtle problem arises in comparisons of results that are statistically significant with results that are not. One strategy often pursued in analysis is to stratify a sample based on a theoretically important variable and then repeat a set of statistical procedures separately for the various strata. Frequently, the results achieve statistical significance for some strata but not for others (or there are differences in significance for the stratified analyses compared to the unstratified analyses). Analysts often interpret this as reflecting meaningful differences among strata when in fact parameter estimates are similar and all that differs are standard errors (often reflecting variations in sample sizes). Of course this is clearly wrapped up with a closely related trope, statistical power, and the fact that "a large sample is like a large magnifying glass that allows us to discern the smallest molehill" (Wonnacott and Wonnacott 1990, 291n3). Learning the subtleties of the trope of statistical significance is fraught with traps that often lead to reading much more into a set of results than is actually there.(6)
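The stratification trap can be sketched with hypothetical data in which two strata have identical slope estimates and differ only in sample size. The larger stratum is simply the smaller one repeated ten times, an artificial device chosen so that the estimates match exactly; the function name slope_and_t and the rough cutoff of 2 for the t ratio are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean

def slope_and_t(x, y):
    """Return (slope, standard error, t ratio) for a bivariate least-squares fit."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = sqrt(sse / (n - 2) / sxx)
    return slope, se, slope / se

# Hypothetical small stratum (n = 6).
small_x = [1, 2, 3, 4, 5, 6]
small_y = [1.5, -1.0, 2.5, 3.0, 0.5, 4.0]
# Hypothetical large stratum (n = 60): the same pattern repeated ten times.
large_x = small_x * 10
large_y = small_y * 10

slope_small, se_small, t_small = slope_and_t(small_x, small_y)
slope_large, se_large, t_large = slope_and_t(large_x, large_y)
print(f"small stratum: slope={slope_small:.2f}, t={t_small:.2f}")
print(f"large stratum: slope={slope_large:.2f}, t={t_large:.2f}")
```

The slope is 0.5 in both strata, but the t ratio is roughly 1.2 in the small stratum and roughly 4.6 in the large one. Reading "significant in one stratum, not in the other" as a substantive difference between strata would be a mistake here, since only the standard errors differ.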

A second common specialized trope is the trope of "control" or the trope of "partialing." This is the statistical effort to achieve the ceteris paribus ideal that underlies perfect randomization.(7) In multiple regression, the coefficients represent the amount of change expected in the dependent variable for a unit change in each predictor taken one at a time. In interpreting these results, we proceed as if one predictor could change with no other variables changing when we know that this is at best very unlikely (unless the predictors are essentially independent of one another--in which case we did not need to use multiple regression in the first place); the movement of predictors is joint and so the world does not operate in the way that the regression equation leads us to think. The trope of control is important, nonetheless, for interpretation because we usually need to break the world into small slices if we are going to understand it in a way that is both parsimonious and communicable to our audience.
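A minimal sketch of the control trope, using hypothetical data in which two correlated predictors jointly determine y with no error. The helper names (centered, cross, two_predictor_slopes) are illustrative assumptions; the partial slopes come from solving the two-predictor normal equations directly.

```python
from statistics import mean

def centered(v):
    """Subtract the mean from every value."""
    m = mean(v)
    return [vi - m for vi in v]

def cross(a, b):
    """Sum of cross-products of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))

def two_predictor_slopes(x1, x2, y):
    """Partial slopes (b1, b2) from the two-predictor normal equations."""
    c1, c2, cy = centered(x1), centered(x2), centered(y)
    s11, s22, s12 = cross(c1, c1), cross(c2, c2), cross(c1, c2)
    s1y, s2y = cross(c1, cy), cross(c2, cy)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

# Hypothetical predictors that tend to move together.
x1 = [0, 1, 2, 3, 4, 5]
x2 = [0, 1, 1, 2, 2, 3]
y = [a + 2 * b for a, b in zip(x1, x2)]  # y = 1*x1 + 2*x2 by construction

b1, b2 = two_predictor_slopes(x1, x2, y)
simple_b1 = cross(centered(x1), centered(y)) / cross(centered(x1), centered(x1))
print(f"partial slope for x1: {b1:.2f}; bivariate slope for x1: {simple_b1:.2f}")
```

The partial slope for x1 is 1, the change in y if x1 moved one unit while x2 stood still. The bivariate slope is about 2.09, which describes what happens in these data when x1 moves and x2 moves along with it. The partial coefficient is the tidier slice, but the joint movement is how these constructed cases actually behave.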

A third common statistical trope is the trope of "explained variation." That is, in assessing the quality of a regression-type model, analysts often look to some type of Proportional Reduction in Error statistic (e.g., R²) as an indicator of how much substantive importance should be attached to the results. As recent critiques have made clear (see, as one example among many, Achen 1991) these statistics, particularly the coefficient of determination (R²), are not without their limitations. The vigorous defense of R² (see, for example, Lewis-Beck and Skalaban, 1991) arises from the statistic's tropological value: analysts find it extremely valuable as a tool of interpretation. Critics, while arguing that other statistics (or combinations of statistics) contain the same information as R², have not provided analysts with tools that are as interpretable as R². The critics typically argue that statistical tests for individual coefficients should be the core of interpretation, but as I have outlined above, this is a different trope that yields different types of information and suffers from problems of its own; in fact, the explained variation trope is typically used to put tests of statistical significance for individual coefficients into a broader interpretive framework. I would argue that underlying the debate is a set of problems of interpretation that is escapable only by recognizing the need to rely upon both tropes rather than simply one or the other.
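That the slope and the proportional reduction in error carry different information can be sketched with two hypothetical data sets that share the same fitted slope. The function name slope_and_r2 and the data are assumptions made for illustration; R² is computed as one minus the ratio of the error sum of squares to the total sum of squares.

```python
from statistics import mean

def slope_and_r2(x, y):
    """Return (slope, R-squared), with R-squared computed as 1 - SSE/SST."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    # Error when every case is predicted by the mean of y.
    sst = sum((yi - y_bar) ** 2 for yi in y)
    # Error remaining when cases are predicted from x.
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, 1 - sse / sst

x = [1, 2, 3, 4, 5, 6]
noisy_y = [1.5, -1.0, 2.5, 3.0, 0.5, 4.0]    # hypothetical, widely scattered
tight_y = [0.75, 0.5, 1.75, 2.25, 2.0, 3.25]  # hypothetical, same line, less scatter

slope_noisy, r2_noisy = slope_and_r2(x, noisy_y)
slope_tight, r2_tight = slope_and_r2(x, tight_y)
print(f"noisy: slope={slope_noisy:.2f}, R2={r2_noisy:.2f}")
print(f"tight: slope={slope_tight:.2f}, R2={r2_tight:.2f}")
```

Both fits give a slope of 0.5, yet R² is about 0.27 in the first data set and about 0.85 in the second. An analyst attending only to the coefficient would call the two results identical; one attending only to R² would call them very different.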

There are other tropes of statistical analysis. One of these is the "factoring" trope. In factor analysis (and a number of other techniques of "dimensional" analysis) the goal is to move from pieces of underlying constructs reflected in measured variables to estimates of those constructs. Particularly in exploratory factor analysis, one of the key issues is interpreting and labeling the factors. While ideally there are theoretically determined expectations about what should emerge from the analysis, looking at the items that "load" on each "factor" is central to interpreting what each factor "is." Thus, the parts are used to arrive at an understanding of the whole. Another common trope is that of "temporal ordering;" in the context of the statistical analysis of causation, temporal position is often central to developing a causal argument because one of the key elements required to infer causation from association is temporal ordering: the cause must precede the effect. Without this link it is difficult, if not impossible, to make a causal argument. In some situations, temporal positioning is the crucial element of the causal argument. For example, the link between party identification and voting behavior in The American Voter is based on the finding that party identification is typically established in childhood, well before the voting decision. This example suggests one aspect of the trope of causal ordering: conceptualizing causes and effects as discrete (as opposed to continuous or ongoing) events; one could just as well conceptualize many causes and effects as ongoing (and this is certainly true for many of the constructs used in the study of voting behavior such as partisan identification). Nonetheless, the tropological convention here is to focus on the discrete so that it is possible to arrive at some temporal ordering.

The list above is by no means exhaustive. Rather it simply shows that tropological conventions that serve functions equivalent or similar to tropes like metaphor, metonymy, and synecdoche are central to statistical analysis and interpretation. Except at the point of initial training the tropes are usually implicit in the interpretation process rather than explicit. Without these kinds of tropological tools, nonetheless, we would have a great deal of difficulty both in recognizing the meaning of statistical results and in communicating those results to an audience, particularly an audience that is not thoroughly familiar with the underlying tropes. When analysts from disciplines with differing tropological conventions (e.g., those trained in the econometrics tradition that shuns R² and those trained in the psychometric tradition where R² is accepted and used) are presented with a set of statistical results, the tropes they focus on can affect the substantive interpretation of the "statistical text," possibly leading to very different emphases and in extreme cases to opposite conclusions.

Summary

In this section I discussed various elements of the interpretation process in quantitative analysis. My discussion drew upon concepts and constructs from the type of textual analysis done by literary critics and some social scientists working with textual data. My goal has been to present a reconstruction of what it is that we do. Simply describing many of the common conventions used in interpretation as tropes (e.g., R² or explained variance as a trope) does not change how we use them, although recognizing the way we use such conventions may improve our understanding of our own processes of interpretation.

Note that I do not posit that interpretation in one setting is just the same as interpretation in another. For example, one of the key elements of interpretation theory in contemporary literary criticism is the indeterminacy of interpretation and meanings, the implication being that, within very broad limits, one interpretation is just as good as another. While there may be ambiguity in much of the interpretation involved in quantitative analysis, both theory and method impose much tighter constraints on those doing the interpreting, and provide criteria against which both the analyst and the audience can judge specific interpretations.

An important question is whether there is anything to be gained with regard to our method and practice of interpretation by recognizing the commonalities between interpretation in very different endeavors. That is, can other disciplines, far removed from social science, suggest to us practices or approaches that we might employ to help us find interpretations of our data and analysis? I turn to this in my final section.

Bringing Art into Data Analysis and Interpretation

Performing the Data, or The Data Analyst at Play

In the preceding sections of this essay, I have tried to show the commonalities between interpretational methods used by quantitative analysts and textual analysts. There is (at least) one other realm where interpretation plays a key role: performance. Is there an approach to interpreting quantitative data that might get the data to "perform" for the analyst? That is, interpretation can include "performing the data." In suggesting that we as data analysts perform the data, I draw on an important commonality between performing a musical score or a dramatic text, and interpreting the results of a data analysis: in both cases one is looking for more than is simply in the notes, or text, or coefficients. That is, one is trying to read something into the notes, or the lines, or the numbers in order to grasp the "subtext."

What does this parallel really mean for our methods of interpretation? The answer lies in looking at the methods used by actors or musicians seeking to understand the script or the score. (Here I focus on the performer's methods of interpretation, leaving aside the interpretive tools of the director or conductor--one might imagine a senior researcher, working with one or more associates, who plays a role in the interpretation process analogous to that of a conductor.) In both cases, rehearsal plays a central role in arriving at interpretations, and there are specific techniques used during rehearsal that help the performer in finding his or her interpretation. For the actor, this might involve engaging in some types of exercises using the text, singing his or her lines, over-dramatizing those lines, racing through the lines as fast as possible, or filling in "facts" about his or her character--e.g., what experiences the fictional character might have had before the dramatized events (see Hagen 1973). For a musician, developing an interpretation involves things such as experimenting with fermata (the length of a hold or pause), testing the range of the dynamics specified in the piece of music (loudness or softness), or varying aspects of the tempo (speed and timing). A musician might try developing variations and improvisations based on a theme in the piece, and manipulating them in various ways; or, the musician may play a piece in a way that he or she would never do in performance--very fast, very slow, in a varying, nonsensical tempo (see Barra 1983; Goodrich 1899). In a sense, arriving at an interpretation for playing a dramatic role or playing a piece of music requires that the performer play with the text or score in ways that would not be appropriate for public presentation.

In much the same way, for a data analyst playing (in the sense of performing) the data may be aided by playing with the data. That is, an analyst can do things with the data that he or she knows are technically inappropriate or problematic. The rationale here is that by stretching the limits of the data or the techniques it is possible to arrive at new insights and perspectives that aid in interpreting the results of the overall analysis. As an undergraduate learning and doing statistical analysis, I worked for a social psychologist. He would almost always do a factor analysis as his first look at any data set, explaining that he found factor analysis a good way to get a sense of the structure within the data set, even while knowing that the primary methods of analysis would not be factor analysis. He also knew that he had to be very cautious in considering the results of these analyses.

A good example of a method that we have long realized is fraught with dangers is stepwise regression (see Lewis-Beck 1978). The standard line that we preach to our students is that you should never, ever use this method. In making such a definite statement, however, we may be discarding a useful tool for interpretation because it is a bad tool for model building. That is, after specifying a model based on our theory, it may be that applying one or more variations on stepwise regression will provide some insights into how the variables combine, and why they combine the way that they do. By playing with the data and the model, we may come to see aspects of it, either those present or those omitted, that deepen our understanding of what is going on. The stepwise variations might involve controlling the order of entry or removal of variables rather than relying solely upon the mindless computer algorithms, or it might involve watching what happens when we let the mindless algorithm make the decisions (because, if nothing else, that mindless algorithm is doing it differently than we would do it). One of the dangers of using stepwise regression for model building, the tendency to be overly responsive to sample specific patterns, can help us understand why a theoretical model does not seem to fit in a particular sample. That is, by carefully watching how a stepwise procedure works through a set of variables, we may see something that tells us why our theoretical model did not work quite the way we expected it to (e.g., it may be that two variables are correlated in a peculiar way in a sample and this can become evident in the order in which the variables enter the equation).
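Watching the order of entry can be sketched as follows. This is a bare forward-selection trace written for illustration, not the algorithm of any particular statistical package: at each step it enters the candidate that adds the most to R², then removes that variable's contribution from the outcome and from the remaining candidates. The data are hypothetical; "proxy" is constructed to be largely the sum of x1 and x2, and all names are assumptions.

```python
def center(v):
    m = sum(v) / len(v)
    return [vi - m for vi in v]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def sweep_out(v, basis):
    """Residualize v on basis (remove the part of v that basis predicts)."""
    coef = dot(v, basis) / dot(basis, basis)
    return [vi - coef * bi for vi, bi in zip(v, basis)]

def forward_trace(predictors, y, min_gain=1e-9):
    """Forward selection; returns [(name, gain in R-squared)] in order of entry."""
    resid_y = center(y)
    total_ss = dot(resid_y, resid_y)
    remaining = {name: center(col) for name, col in predictors.items()}
    trace = []
    while remaining:
        gains = {}
        for name, col in remaining.items():
            ss = dot(col, col)
            gains[name] = 0.0 if ss < 1e-12 else dot(col, resid_y) ** 2 / ss / total_ss
        best = max(gains, key=gains.get)
        if gains[best] < min_gain:
            break
        chosen = remaining.pop(best)
        trace.append((best, gains[best]))
        resid_y = sweep_out(resid_y, chosen)
        remaining = {n: sweep_out(c, chosen) for n, c in remaining.items()}
    return trace

# Hypothetical sample: the theoretical model is y on x1 and x2.
predictors = {
    "x1": [-1, -1, -1, -1, 1, 1, 1, 1],
    "x2": [-1, -1, 1, 1, -1, -1, 1, 1],
    "proxy": [-1.5, -2.5, 0.5, -0.5, 0.5, -0.5, 2.5, 1.5],
}
y = [-1.25, -2.25, -0.75, 0.25, -0.25, 0.75, 2.25, 1.25]

trace = forward_trace(predictors, y)
for name, gain in trace:
    print(f"{name} enters, adding {gain:.3f} to R2")
```

In this constructed sample the proxy enters first, and x1 and x2 then add only modest amounts. An analyst who expected x1 to lead would be prompted to ask how the proxy relates to the two theoretical variables in this sample, which is the kind of question the procedure is being used to raise rather than to settle.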

Various kinds of dimensional methods--factor analysis, cluster analysis, multidimensional scaling (smallest space analysis)--provide all sorts of opportunities for "playing" with the data. The authors of The Hollow Core (Heinz, et al. 1993), a recent book on interest representation in Washington, described to me sitting before a computer screen as one of them dynamically manipulated the three-dimensional plots of some of their smallest space analysis results. It was by looking at the results from different perspectives that they began to see what they came to label the "hollow core." This was not a pattern or finding that they had anticipated.

Dynamically manipulating a spatial solution is only one of many graphical methods for playing with data. The richest discussion of how such manipulations can lead to insights into data and statistical results is Cleveland's recent book, Visualizing Data (1993). Graphical methods have been around for at least two hundred years, although their widespread use is largely a 20th century phenomenon (Lewandowsky and Spence 1989-90, 202). With some exceptions (e.g., scatter plots associated with regression-type analyses), until recently graphics have primarily been a tool for communication; this largely reflected the labor required to prepare graphs. Rapidly improving desktop graphic software has opened new opportunities for play. In his book, Cleveland shows how manipulating graphs in a variety of ways (ordering, banking, varying the aspect ratio, etc.) can lead to insights that would otherwise be overlooked.

A last technique that is good for playing with the data is the old Automatic Interaction Detector (AID) algorithm (Sonquist, Baker, and Morgan 1973), which takes a set of data where one variable is defined as a dependent variable, and does successive splits that maximize the differences in the means. As with stepwise regression, this is fraught with the dangers of sample-specific results. It may suggest, nonetheless, alternative model specifications, or nonobvious interrelationships among important variables.
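The logic of successive splitting can be sketched in a few lines of Python. This is a simplified illustration in the spirit of AID rather than a reimplementation of the original program: it assumes predictors that can be ordered and split at a threshold, it chooses the binary split with the largest between-group sum of squares on the dependent variable, and the names aid_tree and best_split, the variables educ, urban, and y, and the nine cases are all hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def best_split(rows, target, predictors, min_size):
    """Return (between-group SS, predictor, threshold) for the binary split
    that most separates the group means, or None if no split is allowed."""
    overall = mean([r[target] for r in rows])
    best = None
    for p in predictors:
        for t in sorted({r[p] for r in rows})[:-1]:
            low = [r[target] for r in rows if r[p] <= t]
            high = [r[target] for r in rows if r[p] > t]
            if len(low) < min_size or len(high) < min_size:
                continue
            bss = (len(low) * (mean(low) - overall) ** 2
                   + len(high) * (mean(high) - overall) ** 2)
            if best is None or bss > best[0]:
                best = (bss, p, t)
    return best

def aid_tree(rows, target, predictors, min_size=2, depth=2):
    """Split the cases repeatedly, recording the size and mean of each group."""
    node = {"n": len(rows), "mean": mean([r[target] for r in rows])}
    split = best_split(rows, target, predictors, min_size) if depth > 0 else None
    if split is None or split[0] <= 1e-12:
        return node  # nothing left worth splitting
    _, p, t = split
    node["split"] = (p, t)
    node["low"] = aid_tree([r for r in rows if r[p] <= t],
                           target, predictors, min_size, depth - 1)
    node["high"] = aid_tree([r for r in rows if r[p] > t],
                            target, predictors, min_size, depth - 1)
    return node

# Hypothetical cases in which y is high only when educ > 0 AND urban == 1.
cases = [
    {"educ": 0, "urban": 0, "y": 10}, {"educ": 1, "urban": 0, "y": 12},
    {"educ": 2, "urban": 0, "y": 13}, {"educ": 0, "urban": 1, "y": 11},
    {"educ": 1, "urban": 1, "y": 30}, {"educ": 2, "urban": 1, "y": 32},
    {"educ": 0, "urban": 0, "y": 9},  {"educ": 2, "urban": 1, "y": 31},
    {"educ": 0, "urban": 1, "y": 12},
]
tree = aid_tree(cases, "y", ["educ", "urban"])
print(tree["split"], tree["low"].get("split"), tree["high"].get("split"))
```

In these made-up cases the second-level splits show urban residence separating the means far more among the higher-education cases than among the others, which is the sort of nonobvious interrelationship that would suggest adding an interaction term to a theoretically specified model. With so few cases the result is, of course, exactly the kind of sample-specific pattern that must be read with a critical eye.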

We sometimes speak of the "art" of data analysis as well as the "method." I have sought here to exploit the idea of art by suggesting that a method of interpretation can involve letting the data perform for us. The way to make the data perform is to stretch the limits of what we customarily view as acceptable. As we relax controls, we let the data go, in much the same way that a performer can develop and/or enhance his or her interpretation by "letting go." One must approach the results of these performances with an extremely critical eye, but from the performances new and enhanced understandings can emerge in unexpected ways.

A skeptic might argue that drawing analogies to the actor or the musician is grossly inappropriate because a musical score leaves much more room for interpretation than does a set of data or a mathematical model. A key difference that negates the comparison is the precision of mathematical representation, which is why science looks so strongly to mathematical models and mathematical analysis. A musical score, however, is a very precise representation, with each note showing a precise frequency and duration, and the score as a whole showing tempo, meter, rhythm, and attack (Levinson and Balkin 1991, 1608). Even with this precision, there is much room for interpretation by the performer because there are more elements to the music than what the score represents. Likewise, while a mathematical representation is very precise, achieving that precision involves leaving a lot out; connecting what the representation does show to the real world involves adding elements, and choosing the elements to add is fundamentally a process of interpretation. While mathematical forms provide more precision than do most other forms, that precision is at best relative. Karl Llewellyn, writing on law, another area where interpretation is controversial, rejected the view that in law there could be a single right answer, commenting that "square roots shook me out of this, with the +/- answer" (1960, 213).

The Pleasure of the Statistical Text

The choice between quantitative and qualitative analysis is often posed in terms of selecting the "right" method. This problem of selection may be framed around problem-specific issues or may be described as rooted in more deeply held philosophical perspectives. The debates that surround methodological choices are important, but they often are specious. Most analysts make their broad methodological choices based on what they like doing. I do quantitative research because the analytic process is fun. Most members of the Political Methodology Section do quantitative research because they enjoy the process of puzzling out the problems and implications of the data they are working with. If the process of quantitative analysis were a simple one--posit a theory, do a straightforward analysis, report the results--most good analysts would quickly get bored and look for another line of work. But data are fun.

The idea that the research and analysis process is a source of pleasure is seldom discussed in an explicit way. The evidence for this proposition is clear, however, in the genre of writing that Van Maanen (1988, 73-200) has described as "confessional tales." In the domain of political science, Fenno's writings on participant observation (1990) make clear the pleasure that he draws from his research on Congress, and the collection of reflective essays assembled by Shively (1984) is typified by Stimson's comment (1984, 76):

Scratch a satisfied researcher and you are likely to uncover bemused guilt. "If this work is so much fun," he asks, "why is someone paying me to do it? Should I enjoy it less or come clean and admit that I have been accepting salary under a false pretense of work?" For what we do, when it is satisfying, is what sensible people experience when they read or watch detective thrillers.

Of course the research process can be and often is difficult and frustrating, but working through these problems heightens the pleasure that the analyst derives from his or her work. As Stimson notes (drawing from Kuhn), research often has a puzzle-like quality: we do not want it to be too easy (because it will become boring), but we do want to have a chance of reaching something resembling closure (1984, 76).

Where the qualitative researcher often finds positives in the fieldwork experience, the quantitative researcher's opportunities for fun concentrate more around the data themselves (or the interaction of data and theory). In writing about textual analysis, Barthes referred to The Pleasure of the Text (1975 [1982], 412):

There is supposed to be a mystique of the Text.--On the contrary, the whole effort consists in materializing the pleasure of text, in making the text an object of pleasure like the others. That is: either relate the text to the "pleasures" of life (a dish, a garden, an encounter, a voice, a moment, etc.) and to it join the personal catalogue of our sensualities, or force the text to breach bliss, that immense subjective loss, thereby identifying this text with the purest moments of perversion, with its clandestine sites. The important thing is to equalize the field of pleasure, to abolish the false opposition of practical life and contemplative life.

There is an important distinction to be drawn between the fun associated with applying a methodology and the fun of discovery. One pattern that can be observed among some quantitative analysts is the use of a particular statistical technique for reasons of "love of technique" rather than a good fit between substance and method. Kaplan (1964, 28) referred to this problem as the law of the instrument: "Give a small boy a hammer, and he will find that everything he encounters will need pounding." A generation ago there was a period in which a number of political scientists became enamored with factor analysis, and proceeded to factor sets of data with little attention to either the substantive or methodological issues involved; this inspired a highly critical essay with the subtitle, "Tom Swift and His Electric Factor Analysis Machine" (Armstrong 1967).

The pleasures of the statistical text arise not from being the first on the block to try the latest statistical technique. The pleasures come from discovering the links between the text and the "real world" that the text derives from. This discovery process is fundamentally interpretive. Of course technical issues are important. Applying statistical techniques in grossly inappropriate ways precludes a meaningful interpretive process. Good quantitative analysis, however, is more than an issue of technique. Moving from the data and the statistical results to an understanding of the significance of the findings involves working at a variety of levels and using a variety of skills that transcend particular methods. These skills of interpretation, when used well, serve to bind together social scientists working with either quantitative (statistical) or qualitative (textual) data because, regardless of the type of "text," interpretation relies upon similar processes.

The process of data analysis, regardless of whether the data are qualitative or quantitative, involves an interaction between the analyst and the data. The better the questions the analyst asks of the data, the more informative the answers that can be extracted. King has argued (1991, 24) that quantitative political science will advance only by "bringing more politics into our quantitative analyses," and that this can be accomplished by "using more sophisticated stochastic modeling, understanding and developing our own theories of inference, and developing and using graphical analysis more often." In fact, developing an understanding of the interpretive process is a precondition to bringing more politics into the analytic process; interpretation is a human process which is inseparable from politics. As Toulmin (1983) has made clear, the political nature of that process pervades interpretation regardless of the object of interpretation. Furthermore, interpretation is a problem of language and communication (see the "rhetoric of inquiry" literature, including Simons 1989; Nelson, Megill, and McCloskey 1987; McCloskey 1985), even where that language is mathematical in form. This leads to the recognition that modes of interpretation must have core commonalities regardless of whether the object of interpretation is traditional text or statistical data.

Manuscript submitted 6 January 1995.

Final manuscript received 10 May 1995.

Notes

1. Newton's third law is typically stated as "for every action there is an equal and opposite reaction."

2. The lever is one of two "machines" (the other being the pulley) whose governing principle was first described by the founder of the science of mechanics, Archimedes. Other simple machines, some of which might be used in explaining the interpretation of regression coefficients, include the wheel and axle, the screw, the inclined plane, and the wedge.

3. This idea is nicely captured by the distinction made in semiotics (see Saussure 1915; Barthes 1967; Eco 1976) between denotation and connotation. Semiotics posits that meanings are produced by the union of communication units (such as words, pictures, numbers) called "signifiers" with definitions or meanings called "signifieds" to produce "signs." This process goes on at multiple levels, with the most basic level involving "denotation" and the higher levels (where signs themselves become signifiers) involving "connotation."

4. One of the types of specification searches defined by Leamer is the "interpretative search" which involves the interpretation of multidimensional evidence. While this sounds very much like the idea of interpretation being used more broadly in this paper, it is in fact much narrower, since "multidimensional evidence" is defined as the specific use of alternate regression models to update the analyst's "opinions from prior to posterior distribution in response to the data evidence" (1978, 123). The general Bayesian problem is much broader, using evidence that extends beyond the alternative regression models that might be estimated.

5. When one moves to the realm of analyzing causation, particularly through the method of controlled comparison (one form of which is regression), the process can be described as one of "elaborating analogic metaphors" (see Brown 1977, 120-1).

6. I have only touched the surface of issues related to the trope of statistical significance. Standard presentations of significance testing in introductory texts describe the method advocated by Neyman and Pearson (1928; see also Pearson 1955, 1966), with no mention of controversial aspects of that approach (see Fisher 1955; Hacking 1965; Morrison and Henkel 1970) or alternative approaches such as Bayesian inference.

7. Strictly speaking, these two tropes need not be the same. There are types of "control" that differ from the "partialing" approach used in multiple regression. For example, one common kind of control is to create a set of data with a common base such as is done when an analyst works with per capita measures. This kind of control has led to a debate over the problem of "ratio variables." This debate (see, for example, Firebaugh 1988; Long 1980; Lyons 1977; Uslaner 1976), which is traceable to work by Karl Pearson almost 100 years ago (1897), is typically expressed as a mathematically based problem of method, but is better understood as a problem of interpretation (see Kritzer 1990).

REFERENCES

Abbott, Edwin A. 1882 [1953]. Flatland: A Romance of Many Dimensions. Reprint. New York: Dover.

Achen, Christopher H. 1982. Interpreting and Using Regression. Beverly Hills, CA: Sage Publications.

Achen, Christopher H. 1991. "What Does 'Explained Variance' Explain?: Reply." In Political Analysis, 1990, ed. James Stimson. Ann Arbor: University of Michigan Press.

Achen, Christopher H. 1994. "The Data Analysis Revolution and S-PLUS." The Political Methodologist 6(1):2-6.

Armstrong, J. Scott. 1967. "Derivation of Theory by Means of Factor Analysis, or Tom Swift and His Electric Factor Analysis Machine." American Statistician 21:17-27.

Barra, Donald. 1983. The Dynamic Performance: A Performer's Guide to Musical Expression and Interpretation. Englewood Cliffs, NJ: Prentice Hall, Inc.

Barthes, Roland. 1967. Elements of Semiology. New York: Hill and Wang.

Barthes, Roland. 1975 [1982]. "From The Pleasure of the Text." In A Barthes Reader, ed. Susan Sontag. New York: Hill and Wang.

Beljame, A. 1948. Men of Letters and the English Public in the Eighteenth Century, 1660-1744: Dryden, Addison, and Pope. London: Kegan Paul.

Bernstein, Richard. "It's Back to the Blackboard for Literary Criticism." New York Times, 19 February 1991, sec. B.

Bloch, Marc. 1953. The Historian's Craft. New York: Alfred Knopf.

Brown, Richard Harvey. 1977. A Poetic for Sociology: Toward a Logic of Discovery for the Human Sciences. Chicago: University of Chicago Press.

Burke, Kenneth. 1945. A Grammar of Motives. New York: Prentice-Hall.

Campbell, Angus, Philip E. Converse, Warren E. Miller, and Donald E. Stokes. 1960. The American Voter. New York: John Wiley & Sons.

Cleveland, William S. 1993. Visualizing Data. Summit, NJ: Hobart Press.

Cohen, Jacob. 1990. "Things I Have Learned (So Far)." American Psychologist 45:1304-12.

Converse, Philip E. 1964. "The Nature of Belief Systems in Mass Publics." In Ideology and Discontent, ed. David E. Apter. Glencoe: The Free Press.

Converse, Philip E. 1974. "Comment: The Status of Nonattitudes." American Political Science Review 68:650-60.

Downs, Anthony. 1957. An Economic Theory of Democracy. New York: Harper.

Duncan, Hugh Dalziel. 1953. Language and Literature in Society. New York: Bedminster Press.

Duval, Robert. 1988. "TM or Not TM? A Comment on 'International Peace Project in the Middle East'." Journal of Conflict Resolution 32:813-7.

Eco, Umberto. 1976. A Theory of Semiotics. Bloomington: Indiana University Press.

Erikson, Robert S., Gerald C. Wright, Jr., and John P. McIver. 1989. "Political Parties, Public Opinion, and State Policy in the United States." American Political Science Review 83:729-49.

Fee, Joan Flynn. 1981. "Symbols in Survey Questions: Solving the Problem of Multiple Word Meanings." Political Methodology 7(2): 71-95.

Fenno, Richard F., Jr. 1986. "Observation, Context, and Sequence." American Political Science Review 80:3-15.

Fenno, Richard F., Jr. 1990. Watching Politicians: Essays on Participant Observation. Berkeley, CA: IGS Press.

Firebaugh, Glenn. 1988. "The Ratio Variables Hoax in Political Science." American Journal of Political Science 32:523-35.

Fisher, Ronald. 1955. "Statistical Methods and Scientific Induction." Journal of the Royal Statistical Society (Series B) 17:69-78.

Goodrich, A. J. 1899. Theory of Interpretation Applied to Artistic Musical Performance. Philadelphia: Theodore Presser.

Greenblatt, Stephen. 1990. Learning to Curse. New York: Routledge.

Hacking, Ian. 1965. The Logic of Statistical Inference. Cambridge: Cambridge University Press.

Hagen, Uta. 1973. Respect for Acting. New York: Macmillan.

Hanushek, Eric A., and John E. Jackson. 1977. Statistical Methods for Social Scientists. New York: Academic Press.

Harari, Josue V., ed. 1979. Textual Strategies: Perspectives in Post-Structuralist Criticism. Ithaca, NY: Cornell University Press.

Heinz, John P., Edward O. Laumann, Robert L. Nelson, and Robert Salisbury. 1993. The Hollow Core: Private Interests in National Policy Making. Cambridge, MA: Harvard University Press.

Hempel, Carl G. 1966. Philosophy of Natural Science. Englewood Cliffs, New Jersey: Prentice-Hall, Inc.

Huff, Darrell. 1954. How To Lie with Statistics. New York: W.W. Norton.

Jauss, Hans Robert. 1982. Toward an Aesthetic of Reception. Minneapolis: University of Minnesota Press.

Kaplan, Abraham. 1964. The Conduct of Inquiry: Methodology for Behavioral Science. San Francisco: Chandler Publishing Company.

Kinder, Donald R., and Thomas R. Palfrey. 1991. "An Experimental Political Science? Yes, an Experimental Political Science." The Political Methodologist 4(1):2-8.

King, Gary. 1989. Unifying Political Methodology: The Likelihood Theory of Statistical Inference. New York: Cambridge University Press.

King, Gary. 1991. "On Political Methodology." In Political Analysis, 1990, ed. James Stimson. Ann Arbor: University of Michigan Press.

Kondo, Dorinne K. 1990. Crafting Selves: Power, Gender, and Discourses of Identity in a Japanese Workplace. Chicago: University of Chicago Press.

Kritzer, Herbert M. 1975. "Sanctions and Deviance: Another Look." IUSTITIA 3:18-28.

Kritzer, Herbert M. 1990. "Substance and Method in the Use of Ratio Variables, or the Spurious Nature of Spurious Correlation?" Journal of Politics 52:243-54.

Kritzer, Herbert M. 1991. Let's Make a Deal: Negotiation and Settlement in Ordinary Litigation. Madison: University of Wisconsin Press.

Kuhn, Thomas. 1962. The Structure of Scientific Revolutions. Chicago: University of Chicago Press.

Lakoff, George. 1993. "The Contemporary Theory of Metaphor." In Metaphor and Thought [2d Edition], ed. Andrew Ortony. New York: Cambridge University Press.

Leamer, Edward E. 1978. Specification Searches: Ad Hoc Inference with Nonexperimental Data. New York: John Wiley & Sons.

Lehman, Darrin R., Richard O. Lempert, and Richard E. Nisbett. 1988. "The Effects of Graduate Training on Reasoning: Formal Discipline and Thinking About Everyday Life Events." American Psychologist 43:431-42.

Levinson, Sanford, and J.M. Balkin. 1991. "Law, Music, and Other Performing Arts." University of Pennsylvania Law Review 139:1597-1659.

Lewandowsky, Stephan, and Ian Spence. 1989-90. "The Perception of Statistical Graphs." Sociological Methods and Research 18:200-42.

Lewis-Beck, Michael S. 1978. "Stepwise Regression: A Caution." Political Methodology 5:213-40.

Lewis-Beck, Michael S., and Andrew Skalaban. 1991. "The R-Squared: Some Straight Talk." In Political Analysis, 1990, ed. James Stimson. Ann Arbor: University of Michigan Press.

Liao, Tim Futing. 1994. Interpreting Probability Models: Logit, Probit, and Other Generalized Linear Models. Thousand Oaks, California: Sage Publications.

Llewellyn, Karl N. 1960. The Common Law Tradition: Deciding Appeals. Boston: Little, Brown.

Long, Susan B. 1980. "The Continuing Debate Over the Use of Ratio Variables: Facts and Fiction." In Sociological Methodology 1980, ed. Karl Schuessler. San Francisco: Jossey-Bass.

Lyons, William. 1977. "Per Capita Index Construction: A Defense." American Journal of Political Science 21:177-82.

McCloskey, Donald N. 1985. The Rhetoric of Economics. Madison: University of Wisconsin Press.

Miles, Matthew B., and A. Michael Huberman. 1984. Qualitative Data Analysis: A Sourcebook of New Methods. Beverly Hills, CA: Sage Publications.

Morrison, Denton E., and Ramon E. Henkel, eds. 1970. The Significance Test Controversy. Chicago: Aldine.

Nelson, John S., Allan Megill, and Donald N. McCloskey, eds. 1987. The Rhetoric of the Human Sciences. Madison: University of Wisconsin Press.

Neyman, Jerzy, and Egon S. Pearson. 1928. "On the Use and Interpretation of Certain Test Criteria for the Purposes of Statistical Inference." Biometrika 20:174-233.

Nie, Norman H., Sidney Verba, and John R. Petrocik. 1979. The Changing American Voter. Cambridge: Harvard University Press.

Norrander, Barbara. 1989. "Explaining Cross-State Variation in Independent Identification." American Journal of Political Science 33:516-36.

Orme-Johnson, David W., Charles N. Alexander, John L. Davies, Howard M. Chandler, and Wallace E. Larimore. 1988. "International Peace Project in the Middle East: The Effects of the Maharishi Technology of the Unified Field." Journal of Conflict Resolution 32:776-812.

Paulos, John Allen. 1988. Innumeracy: Mathematical Illiteracy and Its Consequences. New York: Hill and Wang.

Pearson, Egon S. 1955. "Statistical Concepts and Their Relation to Reality." Journal of the Royal Statistical Society (Series B) 17:204-7.

Pearson, Egon S. 1966. "Some Thoughts on Statistical Inference." In The Selected Papers of E.S. Pearson. Berkeley: University of California Press.

Pearson, Karl. 1897. "Mathematical Contributions to the Theory of Evolution: On a Form of Spurious Correlation When Indices Are Used in the Measurement of Organs." Proceedings of the Royal Society of London 60:489-98.

Polanyi, Michael. 1958. Personal Knowledge: Toward a Post-critical Philosophy. Chicago: University of Chicago Press.

Pritchett, C. Herman. 1948. The Roosevelt Court: A Study in Judicial Politics and Values, 1937-1947. New York: Macmillan.

Ricoeur, Paul. 1979. "The Model of the Text: Meaningful Action Considered as a Text." In Interpretive Social Science: A Reader, ed. Paul Rabinow and William M. Sullivan. Berkeley: University of California Press.

Russett, Bruce. 1988. "Editor's Comment." Journal of Conflict Resolution 32:773-5.

Sarat, Austin. 1990. "Off to Meet the Wizard: Beyond Validity and Reliability in the Search for a Post-empiricist Sociology of Law." Law & Social Inquiry 15:155-70.

Saussure, Ferdinand de. 1915 [1959]. Course in General Linguistics. Trans. Wade Baskin. New York: McGraw-Hill.

Scholes, Robert. 1985. Textual Power: Literary Theory and the Teaching of English. New Haven: Yale University Press.

Shively, W. Phillips, ed. 1984. The Research Process in Political Science. Itasca, IL: F.E. Peacock Publishers.

Simons, Herbert W., ed. 1989. Rhetoric in the Human Sciences. Newbury Park, CA: Sage Publications.

Smarr, Janet Levarie, ed. 1993. Historical Criticism and the Challenge of Theory. Urbana: University of Illinois Press.

Sonquist, James A., Elizabeth Lauh Baker, and James N. Morgan. 1973. Searching for Structure. Ann Arbor: Institute for Social Research.

Stimson, James E. 1984. "Pursuing Belief Structure: A Research Narrative." In The Research Process in Political Science, ed. W. Phillips Shively. Itasca, IL: F.E. Peacock Publishers.

Sullivan, John L., James E. Piereson, and George E. Marcus. 1978. "Ideological Constraint in the Mass Public: A Methodological Critique and Some New Findings." American Journal of Political Science 22:233-249.

Taylor, Charles. 1971. "Interpretation and the Sciences of Man." Review of Metaphysics 25:3-51.

Tompkins, Jane P., ed. 1980. Reader-Response Criticism: From Formalism to Post-Structuralism. Baltimore: Johns Hopkins University Press.

Toulmin, Stephen. 1983. "The Construal of Reality: Criticism in Modern and Postmodern Science." In The Politics of Interpretation, ed. W.J.T. Mitchell. Chicago: University of Chicago Press.

Uslaner, Eric. 1976. "The Pitfalls of Per Capita." American Journal of Political Science 20:125-33.

Van Maanen, John. 1988. Tales of the Field: On Writing Ethnography. Chicago: University of Chicago Press.

White, Hayden. 1978. Tropics of Discourse: Essays in Cultural Criticism. Baltimore: Johns Hopkins University Press.

Wonnacott, Thomas H., and R. J. Wonnacott. 1990. Introductory Statistics. 5th ed. New York: John Wiley & Sons.