20 Argument, Evidence, and the Limits of Digital Literary Studies

David L. Hoover

There are many kinds of literary arguments, and two ways they vary are in the extent to which they deploy evidence and in the nature of that evidence. Here I want to discuss a small sampling of these varieties of argument and evidence and how they intersect with and affect or limit the usefulness of digital approaches to literary studies. I will argue that at least some approaches (intentionally) devalue evidence and use arguments that barely deserve the name. The specious attractiveness of such approaches is dangerous to the unwary and has the potential to damage further how the humanities are viewed both by the public at large and by disciplines with approaches that are more closely based on evidence. Finally, I will look more closely at two problematic discussions of the nature of evidence and argument in the digital humanities and suggest ways that digital methods might improve them.

Argument and Evidence in (Digital) Literary Studies

I begin with a brief look back at a famous hoax. Alan Sokal, a New York University physicist, submitted “Transgressing the Boundaries: Toward a Transformative Hermeneutics of Quantum Gravity” to Social Text. When the journal published it (Sokal, “Transgressing the Boundaries”) without having anyone who knew anything about science read it, Sokal revealed to the editors of Lingua Franca that it was a hoax (Sokal, “A Physicist Experiments with Cultural Studies”). This fiasco resonates in many ways with the all-too-common lack of respect for argument and evidence in literary studies before and since the hoax (see Guillory for a thorough discussion of the hoax in a broader context). Sokal put it this way:

Social Text’s acceptance of my article exemplifies the intellectual arrogance of Theory—postmodernist literary theory, that is—carried to its logical extreme. No wonder they didn’t bother to consult a physicist. If all is discourse and “text,” then knowledge of the real world is superfluous; even physics becomes just another branch of cultural studies. If, moreover, all is rhetoric and language games, then internal logical consistency is superfluous too: a patina of theoretical sophistication serves equally well.

Incomprehensibility becomes a virtue; allusions, metaphors, and puns substitute for evidence and logic. My own article is, if anything, an extremely modest example of this well-established genre. (Sokal, “A Physicist Experiments with Cultural Studies,” 52)

I do not think Sokal’s wonderful proposal that gravity is a social construct is “an extremely modest example,” but egregious examples of the misuse of argument and evidence in literary studies help to explain why his spoof was not so far beyond the pale as to alert the editors of Social Text to its nature. To take just two examples I have discussed elsewhere, consider Stanley Fish’s argument in his influential Is There a Text in This Class? that there is nothing about Faulkner’s “A Rose for Emily” that prevents it from being read as a story about Eskimos (347), or Jerome McGann’s argument in his prize-winning Radiant Textuality that Joyce Kilmer’s “Trees” is actually a good modernist poem about God having sex with trees since “only God can make a tree” (29–52). (See Hoover, “Hot-Air Textuality” and “The End of the Irrelevant Text,” for discussion of these and other examples.) Both of these arguments were intended as provocations and as invitations to rethink interpretation, but such arguments, and the tolerance for them that made Sokal’s hoax possible, have, I think, damaged the public perception of literary studies. Worse yet, some scholars may, because of the prestige of those making the arguments and the venues in which they appear, come to think of such arguments as legitimate and acceptable outside the provocative discussions in which they first appeared. I turn now to two discussions of digital humanities that explicitly take up the question of evidence and argument.

Ramsay’s Reading Machines and Virginia Woolf’s The Waves

In Reading Machines: Toward an Algorithmic Criticism, Stephen Ramsay suggests that computational studies of literature remain marginalized because they lack “bold statements, strong readings, and broad generalizations” (2). They are too cautious, too scientific, to interest literary critics, who value opening texts to new interpretations more than they value solving problems (10–11). (My discussion here is based, in part, on Hoover, “Making Waves,” and Plasek and Hoover.) Ramsay quotes from a feminist discussion of The Waves (Woolf) that he says challenges the digital humanities:

In this essay I want to resituate The Waves as complexly formulating and reformulating subjectivity through its playful formal style and elision of corporeal materiality. The Waves models an alternative subjectivity that exceeds the dominant (white, male, heterosexual) individual western subject through its stylistic usage of metaphor and metonymy. . . . Focusing on the narrative construction of subjectivity reveals the pertinence of The Waves for current feminist reconfigurations of the feminine subject. This focus links the novel’s visionary limitations to the historic moment of Modernism. (Wallace, 295–96)

Ramsay argues that

Wallace frames her discourse as a “resituation” of Woolf’s novel within several larger fields of critical discourse. This will presumably involve the marshaling of evidence and the annunciation of claims. It may even involve offering various “facts” in support of her conclusions. But hermeneutically, literary critical arguments of this sort do not stand in the same relationship to facts, claims, and evidence as the more empirical forms of inquiry. There is no experiment that can verify the idea that Woolf’s “playful formal style” reformulates subjectivity or that her “elision of corporeal materiality” exceeds the dominant Western subject. (Ramsay, Reading Machines, 7)

Literary criticism’s problematic relationship to facts, claims, and evidence seems more like a bug than a feature, but I think John Guillory is right in arguing that “if positivism is a holistic or totalizing ideology that reserves the name of knowledge only for the results of the scientific method (narrowly defined), it does not follow that the critical disciplines must be based on a counter-holism in which everything is interpretation, in which the very possibility of a positive knowledge is called into question” (Guillory, 504). A reexamination and interrogation of Ramsay’s algorithmic provocation will clarify these issues.

The Waves consists of alternating monologues by three male and three female characters, an experimental technique that has invited critical comment about what axes of difference or unity characterize the novel:

Are Woolf’s individuated characters to be understood as six sides of an individual consciousness (six modalities of an idealized Modernist self?), or are we meant to read against the fiction of unity that Woolf has created by having each of these modalities assume the same stylistic voice?

It is tempting for the text analysis practitioner to view this as a problem to be solved—as if the question were rhetorically equivalent to “Who wrote Federalist 10?” The category error arises because we mistake questions about the properties of objects with questions about the phenomenal experience of observers. . . . We may ask “What does it mean?” but in the context of critical discourse this is often an elliptical way of saying “Can I interpret (or read) it this way?” (Ramsay, Reading Machines, 10)

It is not clear to me that Woolf creates a “fiction of unity” in The Waves. One could argue instead that separating the voices of six “different” characters creates a “fiction of diversity.” (For a thorough recent discussion of the various views about the similarities and differences among the voices in The Waves, see Balossi, A Corpus Linguistic Approach to Literary Language and Characterization, chaps. 1–2.)

Computationally Tractable and Computationally Intractable Questions

There is, I suggest, a hierarchy in the tractability of various literary claims to a computational investigation, though I am not suggesting an equation of tractability with value or importance. Clearly many tractable problems will be of limited interest to literary scholars (are Faulkner’s sentences longer than those of Henry James?). “We should expect six different voices (or one unified voice) in The Waves” seems a computationally intractable claim, a matter of interpretation. Evidence might be adduced in its favor, but such evidence would necessarily be oblique and the argument for it largely a matter of persuasion. “The Waves has a ‘playful formal style’” also seems intractable. The style does not seem playful to me, but it is difficult to imagine how convincing evidence could be brought to bear on the question. (Evidence to support or contest the argument that the style is “experimental” would be easier to produce.) “The Waves displays an ‘elision of corporeal materiality,’” in contrast, seems more amenable to an argument based on evidence. The claim that Woolf elides “corporeal materiality” surely has implications for the vocabulary of the text that could be tested computationally and statistically. I reject Ramsay’s claim that it is a category error to treat the question of whether the voices of the six monologues in The Waves are the same or different as a problem to be solved. It seems instead an important question that is quite amenable to digital humanities methods, which have been extensively tested in the area of authorship attribution (where Ramsay accepts the idea of solvable problems). I will make the bold claim that the six voices are demonstrably different. I should emphasize here that the fact that a problem is computationally tractable does not mean that a definitive or certain solution is necessarily possible; nor does the fact that a problem is computationally intractable mean that a legitimate and effective argument cannot be made about it.1

In an attempt to provoke discussion rather than solve a problem, Ramsay treats the six monologues as a corpus of documents and investigates them with tf-idf, a measure from the field of information retrieval that is used (in multiple variations) in many search engines.2 Simply put, a tf-idf score is the frequency of a word multiplied by the total number of documents and divided by the number of documents containing the word. These scores, he suggests, should identify each monologue’s characteristic words more effectively than a traditional word-frequency list that is dominated by very frequent words shared by most texts. (The overwhelming evidence of the last thirty years that the most frequent words are very effective for authorship attribution problems makes this argument less than compelling.) Tf-idf scores are lower for function words and higher for speakers’ characteristic words because the frequencies of words used by only one speaker are multiplied by six (six total documents divided by the one in which the word occurs), while the frequencies of words used by all six speakers are multiplied by one (six total documents divided by the six in which the word occurs) (Ramsay, Reading Machines, 11). After identifying each speaker’s most characteristic words, Ramsay reveals that he has actually used a slightly different formula that includes a log function (to reduce the effect of a word’s appearance in only one speaker) and prevents the scores from becoming negative.3 The purpose of the alterations in the formula “is not to bring the results into closer conformity with ‘reality,’ but merely to render the weighting numbers more sensible to the analyst” (Reading Machines, 15). Yet the variants are not “merely” at the whim of the analyst; they have testable consequences.
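The simplified weighting described above can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not a reconstruction of Ramsay’s program: the function names, the six placeholder “speakers,” and their invented tokens are hypothetical, and the log-damped variant shown (frequency times the logarithm of total documents over documents containing the word) is one common form, not necessarily the exact formula Ramsay used.

```python
import math
from collections import Counter

def document_frequency(docs):
    """Count, for each word, the number of documents that contain it."""
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))
    return df

def simple_tfidf(docs):
    """Simplified score: raw frequency * total documents / documents containing the word."""
    n_docs = len(docs)
    df = document_frequency(docs)
    return {name: {w: tf * n_docs / df[w] for w, tf in Counter(tokens).items()}
            for name, tokens in docs.items()}

def log_tfidf(docs):
    """Assumed log-damped variant: raw frequency * log(N / df); never negative because N >= df."""
    n_docs = len(docs)
    df = document_frequency(docs)
    return {name: {w: tf * math.log(n_docs / df[w]) for w, tf in Counter(tokens).items()}
            for name, tokens in docs.items()}

# Invented toy corpus: six placeholder documents that all share "the" and "and";
# only the first also contains "beast" (twice).
docs = {f"speaker{i}": ["the", "and", "the"] for i in range(1, 7)}
docs["speaker1"] = docs["speaker1"] + ["beast", "beast"]

simple = simple_tfidf(docs)
logged = log_tfidf(docs)
print(simple["speaker1"]["beast"])  # frequency 2, multiplied by six
print(simple["speaker1"]["the"])    # frequency 2, multiplied by one
print(logged["speaker1"]["the"])    # a word shared by all documents scores zero
```

In the simplified version a word confined to one document has its frequency multiplied by six and a word shared by all six has its frequency multiplied by one, as in the description above. In the log variant the shared word drops to zero, which shows that the choice between variants changes which words surface.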

Tf-idf and the Question of Character-Individualization in The Waves

But let us travel a bit further with Ramsay by looking at his list of the words with the highest tf-idf scores in Louis’s monologue (I have added the raw frequency of each word). He suggests that “few readers of The Waves would fail to see some emergence of pattern in this list” (Ramsay, Reading Machines, 12). For example, western seems to echo Louis’s concern about his Australian accent and England (all four of these words are in Louis’s top twenty-five). But actually western, wilt, and thou appear in Louis’s monologue only in quotations from a sixteenth-century poem. Ramsay’s intervention raises interesting questions: Why choose this algorithm? How do the results affect our emerging reading of The Waves? (Reading Machines, 15). But how to answer these questions? Ramsay also ignores some interesting questions: Would other algorithms give similar results? Should Louis’s quotations be considered his speech (and retained) or the anonymous author’s (and omitted)? My own answer is that the anonymous author’s words are not Louis’s words for the purposes of characterizing his speech. Louis’s repeated quotation of this poem and the nature of the author and the quotation are certainly significant in characterizing him, but I would distinguish between his mention of words and his use of them.4 (The styles of the six monologues are obviously not characterized only by their vocabularies, but discussing the vocabulary separately seems reasonable.) Some of the questions I have just raised are not computationally tractable, and other analysts would give answers different from mine. Ramsay has argued that computational analyses are often too reductive, but here his own argument suffers because it rejects any attempt to answer the questions in a reasonable way.

The tf-idf algorithm was designed to retrieve relevant documents rather than to characterize or analyze them, but my point is not that any particular choice of algorithm is definitively correct or incorrect. Rather, the choice of algorithm is important and its consequences need more discussion if we are to achieve more than mere provocation. In any case, Ramsay’s lists of the words with the highest tf-idf scores for each of the six characters suggest a possible answer to whether we should read the novel as six voices or one. The differences are surely real, though Ramsay is right in suggesting that they still need to be interpreted. As I will soon clarify, the legitimacy of treating Louis as if he were a real person also needs further discussion.

Trying to recreate Ramsay’s analysis reveals further interesting points. First, and probably least controversial, is his omission of the long final chapter of The Waves from his analysis. This chapter is all in Bernard’s voice, and it begins “Now to sum up,” showing that it is likely to be quite different from the rest of the novel. I have also removed it from my analysis, and Burrows does the same (206), as does Balossi (84). Ramsay (personal communication) agrees that he should have noted this. Even after removing the final chapter, however, calculating the tf-idf scores for Louis’s words shows that Ramsay’s analysis and mine yield different word frequencies. His score for beast requires six occurrences, for example, not the five my analysis finds. Although there is a sixth occurrence in the omniscient narration at the beginning of the fifth chapter, the difference in frequencies seems to be a result of different tokenizations: beast’s also occurs only in Louis’s monologue, and this implies that Ramsay’s tokenization must break words at apostrophes. Another indication of this is that Bernard’s does not occur among the men-only words later in Ramsay’s discussion, even though it occurs in all the men and none of the women. His score for accent indicates thirteen occurrences, not the fourteen I found, but Rhoda’s most characteristic words include them — and accent occurs once as accent —. This presumably reduces his count by one. What constitutes a word is a surprisingly complex question, but counting them — and accent — as different words from them and accent while counting beast and beast’s as the same word seems at best debatable (see Hoover, “The Trials of Tokenization,” for a discussion). The rarity of the characteristic words identified by tf-idf makes questions of tokenization much more important than they would be if more frequent words were involved. Tf-idf also strongly privileges words limited to one character: only accent is used by more than one speaker.
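How much a tokenization decision can move the counts of rare words is easy to demonstrate. The following Python sketch compares two hypothetical tokenizers, one that keeps word-internal apostrophes and one that breaks at them. The regular expressions and the sample sentence are invented for illustration and are not taken from either analysis or from the novel.

```python
import re
from collections import Counter

# Keeps word-internal apostrophes (straight or curly): "beast’s" stays one token.
KEEP_APOSTROPHES = re.compile(r"[a-z]+(?:['’][a-z]+)*")
# Breaks at apostrophes: "beast’s" becomes "beast" and "s".
SPLIT_APOSTROPHES = re.compile(r"[a-z]+")

def count_words(text, pattern):
    """Lowercase the text and count the tokens matched by the given pattern."""
    return Counter(pattern.findall(text.lower()))

# Invented sample sentence, not a quotation from The Waves.
sample = "The beast sleeps; the beast’s shadow moves."

kept = count_words(sample, KEEP_APOSTROPHES)
split = count_words(sample, SPLIT_APOSTROPHES)

print(kept["beast"], kept["beast’s"])  # the possessive is counted as a separate word
print(split["beast"], split["s"])      # the possessive is folded into "beast"
```

With words that occur only five or six times, a single possessive form folded in or left out is enough to change a score, which is why the two analyses can disagree even on the same text.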

The identification of characteristic words is problematic in another way. Low, one of Bernard’s characteristic words, also occurs once in the sections of omniscient narration that begin the chapters of the novel. Analyzing only the six monologues seems reasonable, but should words that also occur in the narration be counted as Bernard’s characteristic words? (Some consider Bernard to be modeled on Woolf herself [Ramsay, Reading Machines, 13].) The possibility of including the omniscient third-person narration seems intriguing but problematic, not least because it has fairly well-known differences from the “speech” of the six monologues (see Biber). Including the narration would remove low, canopy, bowled, and brushed from Bernard’s most characteristic words and beast, steel, and discord from Louis’s. What questions does this raise? Is there a reasoned and rational way to answer them? Asking them should deepen and inform our discussion of the novel.

Tf-idf and the Question of a Gender Division in The Waves

Most algorithms for computational approaches come from authorship attribution, where ostensibly correct answers exist. But Ramsay is certainly right that the existence of “correct” answers to questions like “Do the men and women speak differently?” or “Do the six characters have distinct and consistent voices?” is precisely at issue. Although these characters are literary constructs created by a single author rather than authors themselves, many studies have shown that some authors’ characters can profitably be treated as if they were individuals (see Burrows, for example). Furthermore, if we attempt to distinguish individual voices where none really exist, the attempt will simply fail. Ramsay shows us provocatively that all three women share only fourteen words that are used by none of the men, but that all three men share ninety words that are used by none of the women (Reading Machines, 13–14):

Women-words:

shoes, Lambert, million, pirouetting, antlers, bowl, breath, coarse, cotton, diamonds, rushes, soften, stockings, wash

Men-words:

boys, possible, ends, church, sentences, everybody, Larpent, tortures, feeling, office, united, felt, rhythm, weep, heights, wheel, able, however, banker, accepted, hundred, Brisbane, act, included, ourselves, alas, inflict, poetry, approach, irrelevant, power, background, knew, arms, baker, language, destiny, banks, Latin, letters, became, meeting, lord, block, neat, poet, board, novel, reason, brake, observe, respect, burnt, oppose, telephone, central, pointing, waistcoat, certainly, sensations, beak, chose, sheer, chaos, cinders, story, difficult, clamour, suffering, endure, course, torture, forgotten, crucifix, troubling, friend, distinctions, use, god, distracted, waste, king, doctor, watched, notice, ease, willows, ordinary, edges, works

These lists are indeed provocative, as Ramsay suggests: many of the words even seem disturbingly stereotypical. He goes on to say that “critics who have argued for a deep structure of difference among the characters—one perhaps aligned along the gender axis—might also feel as if the program vindicates their impressions. Is there a gender divide? Yes; the characters are divided along the gender axis by a factor of 6.4285 to 1” (Ramsay, Reading Machines, 14). But he then remarks that the algorithm that produces this difference “has no more claim to truth value than any ordinary reading procedure” (14–15), and that his analysis does not settle the question of whether there is a gender divide in The Waves. I think this last claim is right, though not because the question cannot be settled.
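The construction of such lists amounts to simple set operations: intersect the vocabularies of one group and subtract every word used by anyone in the other group. A minimal Python sketch follows. The function name and the toy vocabularies are hypothetical, and only the counts of ninety and fourteen come from the discussion above.

```python
def shared_exclusive(group, others):
    """Words used by every speaker in `group` and by no speaker in `others`."""
    shared = set.intersection(*(set(tokens) for tokens in group))
    used_by_others = set().union(*(set(tokens) for tokens in others))
    return shared - used_by_others

# Invented toy vocabularies for three speakers in each group.
women = [["shoes", "wash", "tree"], ["shoes", "wash", "sea"], ["wash", "shoes", "tree"]]
men = [["office", "tree", "poet"], ["office", "sea", "poet"], ["poet", "office"]]

women_only = shared_exclusive(women, men)
men_only = shared_exclusive(men, women)
print(sorted(women_only), sorted(men_only))

# The ratio quoted from Ramsay follows from the reported counts of ninety and fourteen.
ratio = 90 / 14
print(round(ratio, 4))
```

Because a word must appear in all three monologues of one group to qualify, the result is sensitive to how much text each speaker contributes: the more a speaker says, the more chances a word has to appear.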

Attempting to recreate Ramsay’s analysis of the genders reveals a few discrepancies like those in his list of Louis’s distinctive words. For example, the men-only words banker and Brisbane appear in Neville’s and Bernard’s monologues only as imagined quotations from Louis. My list of men-only words, unlike Ramsay’s, also includes Bernard’s because my tokenization does not break words at apostrophes. Bernard’s only use of Bernard’s, however, is in a quotation from his girlfriend, which one might consider deleting as not really Bernard’s language. But the girlfriend is actually imaginary. Given that the novelist Virginia Woolf has invented the monologue of her imaginary novelist Bernard, does Bernard’s invented dialogue for his imaginary girlfriend count as his “own” language? Why? Why not? I think her language should count as Bernard’s here, but discussing these kinds of computationally intractable decisions, I suggest, could deepen and enrich a conversation about The Waves.

More significantly, Ramsay’s provocative ratio of ninety men-only to fourteen women-only words rests problematically on the amounts of text by the two genders. Even without Bernard’s final chapter, there are about 35,000 words by the men and only 20,000 by the women, a discrepancy that explains the preponderance of men-only words. To “prove” this one could simply cut each male monologue to the length of its corresponding female monologue. Corresponding how? Matching longest to longest, shortest to shortest? Why? I have chosen a different “deformation,” randomizing the lines of each monologue and cutting each one to 6,067 words, the length of the shortest (Susan’s), equalizing each character’s contribution. Why? Why not just take the first 6,067 words of each? (Answer: so as not to test all of Susan’s language against only the first part of the language of the other characters.) This deformed text produces thirty-one women-only and twenty-nine men-only words.
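The deformation just described can be sketched in Python as follows. This is a hedged illustration rather than the procedure actually used: what counts as a “line,” the fixed random seed, and the function names `deform` and `equalize` are all assumptions, and the toy monologues are invented.

```python
import random

def deform(lines, target_words, seed=0):
    """Shuffle the lines of one monologue, then keep only the first `target_words` words."""
    rng = random.Random(seed)      # fixed seed so the sketch is repeatable
    shuffled = list(lines)
    rng.shuffle(shuffled)
    words = [word for line in shuffled for word in line.split()]
    return words[:target_words]

def equalize(monologues, seed=0):
    """Cut every monologue to the word count of the shortest one."""
    target = min(sum(len(line.split()) for line in lines)
                 for lines in monologues.values())
    return {name: deform(lines, target, seed) for name, lines in monologues.items()}

# Invented toy input: one longer and one shorter monologue, given as lists of lines.
monologues = {
    "long": ["a b c", "d e", "f g h i"],
    "short": ["x y", "z"],
}
equalized = equalize(monologues)
print({name: len(words) for name, words in equalized.items()})
```

Shuffling before truncating draws the retained words from across the whole of each longer monologue, so the shortest speaker’s complete text is not compared with only the opening of the others.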

Ramsay is right that the algorithm merely begins the argument, but the provocative revelation that the men share more words than the women is deceptively and inappropriately provocative: it rests merely on the lengths of the monologues. Why the male monologues are longer is also a provocative question, especially as the monologues of all three men are longer than the longest monologue by any of the women. Nevertheless, the nature of the male and female words remains provocative and suggestive for a conversation about gender in The Waves:

All Men, No Women:

boys, feeling, poet, dropped, letters, waste, weep, swing-doors, office, oppose, wheel, hundred, waistcoat, however, telephone, ease, suffering, board, arms, Bernard’s, sheer, beak, possible, bag, lord, approach, god, friend, able

All Women, No Men:

pavement, bedroom, stockings, wash, shoes, wander, hide, tennis, step, soften, front, shelter, settle, million, music, matter, Lambert, diamonds, fills, rushes, breath, pulls, shot, fling, real, cotton, pirouetting, coarse, branch, bowl, antlers

My deformation’s doubling of women-only words raises questions of its own and warns against over-interpreting the lists. Obviously, the men use some of these women-only words in the parts I left out, a problem exacerbated by the rarity of these words: the highest frequency of any of the words I have listed is four. All this suggests a reconsideration of the initial decision to use tf-idf and a search for a method that can cope in a reasonable way with the variation in how much the characters of each gender speak.

Figure 20.1. Tf-idf and Character-Individualization (words with the 20 highest tf-idf scores).

Character-Individualization in The Waves Revisited

Examining The Waves in the light of Ramsay’s provocation raises so many intriguing questions that they cannot all be addressed here, but the question of character-individualization can be revisited by using the same deformation I used for examining the men-only and women-only words. I identified the fifty most characteristic words of these six sections of 6,067 randomly chosen words from each character using Ramsay’s tf-idf formula. I then tested how well they group with the remainders of the longer monologues using cluster analysis, starting with all 300 words (in descending tf-idf order), then reducing the number gradually. The best result, for the twenty most distinctive words, is shown in Figure 20.1.5

Bernard’s and Louis’s sections group together, while Neville’s and Rhoda’s fail (Jinny and Susan have too little text for two sections). A simple word frequency list, however, correctly groups all four in many analyses, providing a tentative answer to the question of whether the voices are distinct. (The only adjustment I made here was to base the word list on the six random 6,067-word sections only, to equalize the amount of text by each character.)
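For readers unfamiliar with cluster analysis, the logic of such a test can be sketched in pure Python: represent each section as the relative frequencies of a chosen word list, measure the distance between every pair of sections, and repeatedly merge the closest groups. The distance measure (mean absolute difference), the average-linkage rule, the function names, and the toy sections are all assumptions made for illustration; the text does not specify the settings behind the figure.

```python
from collections import Counter
from itertools import combinations

def profile(tokens, word_list):
    """Relative frequency of each listed word in one section."""
    counts = Counter(tokens)
    return [counts[word] / len(tokens) for word in word_list]

def distance(p, q):
    """Mean absolute difference between two frequency profiles."""
    return sum(abs(a - b) for a, b in zip(p, q)) / len(p)

def cluster(sections, word_list):
    """Average-linkage agglomerative clustering; returns the merges in order."""
    profiles = {name: profile(tokens, word_list) for name, tokens in sections.items()}

    def linkage(pair):
        a, b = pair
        return sum(distance(profiles[x], profiles[y])
                   for x in a for y in b) / (len(a) * len(b))

    clusters = [(name,) for name in sections]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=linkage)
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b))
    return merges

# Invented toy sections: two from a hypothetical speaker X, two from a speaker Y.
sections = {
    "X1": "sea sea sea bird".split(),
    "X2": "sea sea bird bird".split(),
    "Y1": "road road road bird".split(),
    "Y2": "road road road road".split(),
}
merges = cluster(sections, ["sea", "road", "bird"])
for left, right in merges:
    print(left, "+", right)
```

A grouping counts as correct in the sense used above when sections by the same speaker merge with each other before they merge with anyone else’s.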

Selecting the most characteristic words for each monologue using Zeta (Craig and Kinney) also produces many perfect results (see Figure 20.2, based on the twenty most characteristic words). This very different method, which measures consistency of use rather than frequency, confirms the distinctness of the voices and, like tf-idf, also eliminates the most frequent words. It also suggests, as does the analysis of the most frequent words, that Susan’s narrative is unusual compared to the others, not just shorter. Agreement between kinds of evidence gathered in two distinct ways strengthens an argument.
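The contrast between consistency and frequency can be illustrated with a small Python sketch of a Zeta-style score. It assumes the common formulation in which a word’s score is the proportion of the target speaker’s segments that contain it plus the proportion of the other speakers’ segments that lack it, so scores run from 0 to 2. The function name and the toy segments are hypothetical, and the exact variant behind the figure may differ.

```python
def zeta_scores(target_segments, counter_segments):
    """Score each word by consistency of use: present in target segments, absent from counter segments."""
    target_sets = [set(segment) for segment in target_segments]
    counter_sets = [set(segment) for segment in counter_segments]
    vocabulary = set().union(*target_sets, *counter_sets)
    scores = {}
    for word in vocabulary:
        present_in_target = sum(word in s for s in target_sets) / len(target_sets)
        absent_from_counter = sum(word not in s for s in counter_sets) / len(counter_sets)
        scores[word] = present_in_target + absent_from_counter
    return scores

# Invented toy segments: two from a target speaker, two from everyone else.
target = [["sea", "the"], ["sea", "the", "bird"]]
counter = [["road", "the"], ["the", "bird"]]

scores = zeta_scores(target, counter)
for word, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(word, score)
```

A word that appears in every segment of both sets, such as a very frequent function word, scores exactly 1 and so never ranks as characteristic, however often it occurs. Only presence or absence within a segment matters, not how many times the word is repeated.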

Figure 20.2. Zeta and Character-Individualization (20 most characteristic words).

Finally, testing the six characters in 2,000-word sections with two-word sequences (based on the six full monologues, minus Bernard’s final chapter) also yields some completely correct clusters (see Figure 20.3 for an analysis based on the 900 most frequent two-word sequences) and reconfirms the distinctiveness of Susan’s monologue.
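Preparing the data for an analysis of this kind involves two small steps, cutting each monologue into fixed-size sections and counting adjacent word pairs. The Python sketch below is illustrative only: the function names are hypothetical, and dropping a short final remainder is an assumption, since the text does not say how leftover words were handled.

```python
from collections import Counter

def split_sections(tokens, size):
    """Consecutive sections of exactly `size` words; a shorter final remainder is dropped."""
    return [tokens[i:i + size] for i in range(0, len(tokens) - size + 1, size)]

def bigram_counts(tokens):
    """Count each pair of adjacent words."""
    return Counter(zip(tokens, tokens[1:]))

def most_frequent_bigrams(monologues, n):
    """The n most frequent two-word sequences, counted within each monologue and then pooled."""
    total = Counter()
    for tokens in monologues.values():
        total.update(bigram_counts(tokens))
    return [bigram for bigram, _ in total.most_common(n)]

# Invented toy monologues standing in for the real texts.
toy = {
    "first": "i see a ring i see a slab".split(),
    "second": "i hear a sound".split(),
}
print(split_sections(toy["first"], 3))
print(most_frequent_bigrams(toy, 2))
```

Counting pairs within each monologue separately keeps a sequence from being formed across the boundary between two speakers.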

As I have noted, Ramsay suggests that treating the question of whether the six characters in The Waves share “the same stylistic voice” as a problem to solve is a “category error,” and that the proper question (one computers cannot answer) is “Can I interpret (or read) it this way?” (Reading Machines, 9–10). Critics still can read the novel as a single stylistic voice, and the six monologues undoubtedly share many characteristics. There is, after all, a Woolf style that should distinguish her from other authors and a Waves style that should distinguish it from her other novels. In spite of a host of very interesting remaining questions about the status of algorithms, arguments, and evidence, however, the bold claim that there are six distinct character voices in The Waves seems strongly confirmed. Reading them as the “same stylistic voice” should require at the very least some recognition that they are quite easy to distinguish, especially because, as more than twenty-five years of research has shown, even different authors are sometimes much more difficult to distinguish than this.6 An analysis of the monologues divided into 3,000-word sections and based on the 800 most frequent words suggests that the early chapters, when the characters are children, are also significantly different from the adult chapters (see also Balossi, A Corpus Linguistic Approach, chap. 6 and app. E). The fact that the characters do not group completely by gender in most of the analyses I have described also suggests a tentative negative answer to the question “Do the men and women speak differently?” that is different from Ramsay’s answer. More analysis would be needed to see how this answer should be qualified, but the results I have described suggest that the characters are more distinct than the genders, a conclusion that agrees with much previous work (for a discussion of the strengths of various signals in texts, see Jockers, Macroanalysis, chap. 6). The grouping of the early sections of the text (when the characters are children) by gender suggests an intriguing direction for further research into the gender question that might be folded into Wallace’s feminist discussion of the novel.

Figure 20.3. Two-Word Sequences and Character-Individualization (900 most frequent two-word sequences).

Ramsay’s provocative intervention is valuable for forcing us to reexamine our methods and focus on questions of interest to traditional literary scholars. But further analysis of his provocation and his algorithms suggests that more attention to the text, to the nature and function of the algorithms, and to method can prompt bold claims that rest on a sounder foundation—claims that can improve and deepen the discussion, not just make sure it continues. Further work will help us explore the boundary between computationally tractable and computationally intractable questions and the significance of that boundary for the future of literary criticism.

Stanley Fish on Argument, Evidence, and Method in the Digital Humanities

In “Mind Your P’s and B’s: The Digital Humanities and Interpretation,” Stanley Fish remarks, by way of introduction, that

Halfway through “Areopagitica” (1644), his celebration of freedom of publication, John Milton observes that the Presbyterian ministers who once complained of being censored by Episcopalian bishops have now become censors themselves. Indeed, he declares, when it comes to exercising a “tyranny over learning,” there is no difference between the two: “Bishops and Presbyters are the same to us both name and thing.” That is, not only are they acting similarly; their names are suspiciously alike. (Fish, “Mind Your P’s and B’s”)

He goes on to argue the phonetic similarity of bishop and presbyter and to claim that in the sentences following Milton’s equation of these words “‘b’s’ and ‘p’s’ proliferate in a veritable orgy of alliteration and consonance.”

I will return to these specific claims, but I want to start at the end of his comments and work backward. Fish says that the “interpretive proposition” that Milton believes that the censors have turned into their oppressors led him to notice the prevalence of [p] and [b] in the passage. He argues as follows:

The direction of my inferences is critical: first the interpretive hypothesis and then the formal pattern, which attains the status of noticeability only because an interpretation already in place is picking it out.

The direction is the reverse in the digital humanities: first you run the numbers, and then you see if they prompt an interpretive hypothesis. The method, if it can be called that, is dictated by the capability of the tool. (Fish, “Mind Your P’s and B’s”)

I do not think that Fish is right that inference usually works this way in literary studies. Surely it is also quite common to notice a phrase, a plot twist, a quirk of characterization, or some other detail that acts as a discovery tool and leads to an interpretation, once further details are integrated into a coherent whole. He is certainly right that evidence only becomes fully meaningful and valuable when it is integrated into an interpretation, but just as certainly wrong that a pattern only attains “noticeability” because an already-existing interpretation is “picking it out.” Surely a reader needs no interpretation to notice the alliteration at the beginning of Hopkins’s “The Windhover”:

I caught this morning morning’s minion, king-

dom of daylight’s dauphin, dapple-dawn-drawn Falcon, in his riding

Of the rolling level underneath him steady air

As in any field of inquiry, there are multiple methods and multiple approaches in digital humanities, and Ramsay’s critique, as I have just discussed, has been that digital humanities is too much hypothesis-driven, too positivistic, too definite. Yet the example Fish critiques really is problematic. He notes that computational tools can identify patterns we cannot see for ourselves, so that we cannot know in advance what a computational analysis will show, and he suggests that digital humanities practitioners therefore proceed randomly or on whim. In the example he mentions, a computational analysis shows that nineteenth-century American fiction mentions many international locations, a fact that leads the analyst to suggest that it was more “outward looking” than previously thought. Here Fish rightly points out that we cannot know whether this claim is true without a great deal more investigation: there are many possible reasons for the presence of a large number of international locations, though probably not the “infinite” number Fish claims. I have argued, in my discussion of McGann’s claim that “The Snow Man” is a noun-heavy poem, that the claim is not meaningful until we know how many nouns is normal (Hoover, “Hot-Air Textuality” and “The End of the Irrelevant Text”); similarly, the mere presence of many international locations may surprise the analyst, but that surprise is not very meaningful until we know how it compares to something other than the analyst’s expectation (British novels of the same period, for example) and how the international locations function in the texts.

Fish goes on to discuss Ramsay’s project of deforming texts and multiplying interpretations, and his praise of “screwing around” as a method (Ramsay, “The Hermeneutics of Screwing Around”). He concludes as follows:

But whatever vision of the digital humanities is proclaimed, it will have little place for the likes of me and for the kind of criticism I practice: a criticism that narrows meaning to the significances designed by an author, a criticism that generalizes from a text as small as half a line, a criticism that insists on the distinction between the true and the false, between what is relevant and what is noise, between what is serious and what is mere play. Nothing ludic in what I do or try to do. I have a lot to answer for. (Fish, “Mind Your P’s and B’s”)

Those who have followed the many phases of Fish’s career, especially those of us who remember “the Eskimo ‘A Rose for Emily,’” may find this statement somewhat surprising. More important, many of us who have been doing digital humanities for a very long time have more in common with this orientation than Fish understands. I doubt that criticism can (even if it should) narrow interpretation to the author’s meanings, an idea that has long been unpopular in literary criticism (including in Fish’s own work). But narrowing interpretations, focusing on even the smallest details, and insisting on a distinction between what is relevant and what is noise, between “what is serious and what is mere play,” are all very much mainstream in many approaches to digital humanities. Yet even “screwing around” can be valuable as a method, and I have often argued that one benefit of digital tools and methods is that they help us to uncover details and facts that we would not have noticed otherwise.

I have argued that “mere play” is not sufficient, and although it seems unwise to “generalize” from “a text as small as half a line,” as Fish says he does, it is certainly important at times to focus even on a single word. As a simple example, I have argued that the single instance of the word arrow in William Golding’s The Inheritors, in which he imagines a Neanderthal society that does not use or even understand such weapons and refers to arrows many times as twigs, is an error (Hoover, Language and Style, 147–48). Although distant readings, like those of Moretti, can be valuable, I find more valuable the close readings of most work done in the previous twenty-five years of what is now called digital humanities—work that very often features detailed, minute, and hypothesis-driven analysis of texts.

Argument, Evidence, and Areopagitica

Now I want to take up Fish’s specific example and try to mind his “p’s” and “b’s” a little more carefully. As a consequence, he will have still more to answer for, though not in the way he suggests. As I have noted, Fish says that Milton declares that “Bishops and Presbyters are the same to us both name and thing.” But consider this phrase from Areopagitica in context:

if some who but of late were little better then silenc’t from preaching, shall come now to silence us from reading, except what they please, it cannot be guest what is intended by som but a second tyranny over learning: and will soon put it out of controversie that Bishops and Presbyters are the same to us both name and thing. (Milton, Areopagitica)

Milton does not quite assert that Bishops and Presbyters are the same and that their names are suspiciously similar. It seems inappropriate to ignore the conditional future meaning of the statement. If these things come to pass, it will remove all question that Bishops and Presbyters are the same: a weaker foundation on which to base Fish’s observation of the importance of [p] and [b]. It even seems uncertain that name here means the words bishops and presbyters rather than “reputation,” as it does elsewhere in Milton’s essay; for example, in “Dionysius Alexandrinus was . . . a person of great name in the Church for piety and learning” and “fain he would have the name to be religious, fain he would bear up with his neighbours in that.”

What comes next is worse:

In both names the prominent consonants are “b” and “p” and they form a chiasmic pattern: the initial consonant in “bishops” is “b”; “p” is the prominent consonant in the second syllable; the initial consonant in “presbyters” is “p” and “b” is strongly voiced at the beginning of the second syllable. The pattern of the consonants is the formal vehicle of the substantive argument, the argument that what is asserted to be different is really, if you look closely, the same. That argument is reinforced by the phonological fact that “b” and “p” are almost identical. Both are “bilabial plosives” (a class of only two members), sounds produced when the flow of air from the vocal tract is stopped by closing the lips. (Fish, “Mind Your P’s and B’s”)

Are [b] and [p] really the prominent consonants in both words? The initial [b] of bishops and the initial [p] of presbyters clearly qualify, but the [p] of bishops is surely less prominent than the medial [sh] in the stressed first syllable (especially when the following [s] robs the [p] of its aspiration). The [z] and [t] of presbyters both seem more prominent than the [b] at the beginning of the unstressed second syllable, and the first [z] is reinforced by the final [z]. The claim that the [b] of presbyters is “strongly voiced” is both unclear (is it spoken strongly or with strong voicing?) and doubtful. Compare the much more salient [b] of prebendary, proboscis, or preboreal, where it begins the second, stressed syllable. Furthermore, Fish never explains why chiasmus should be considered a sign of similarity rather than of difference. Still further, how significant is the chiasmus in the full consonant sequence?

[b]—[sh]—[p] [s] vs [p][r]—[z]—[b]—[t]—[r] [z]

Fish also claims that “the phonological fact that ‘b’ and ‘p’ are almost identical” reinforces his argument, ignoring the fact that the voicing difference between [p] and [b] is crucial in English and many other languages, that it also distinguishes [t] from [d], [k] from [g], [f] from [v], [s] from [z], [ch] from [j], and [sh] from [zh], making it arguably the most important difference/differance in the English consonant system. We might say that if Milton really means that the words bishops and presbyters sound suspiciously alike, he is on shaky ground.

After claiming that the sentences that follow his quotation from Milton demonstrate a “veritable orgy of alliteration and consonance,” Fish gives a partial list of words containing “p” or “b” and continues as follows:

Even without the pointing provided by syntax, the dance of the “b’s” and “p’s” carries a message, and that message is made explicit when Milton reminds the presbyters that their own “late arguments . . . against the Prelats” should tell them that the effort to block free expression “meets for the most part with an event utterly opposite to the end which it drives at.” The stressed word in this climactic sentence is “opposite.” Can it be an accident that a word signifying difference has two “p’s” facing and mirroring each other across the weak divide of a syllable break? Opposite superficially, but internally, where it counts, the same.

There is a wealth of confusion here, but first consider the final confused notion that opposite has two “p’s” separated by a syllable break. This is true for print, but there is only one [p] sound in opposite, belonging unequivocally to the first syllable. Indeed, Fish’s entire argument might seem more defensible if he concentrated on spelling rather than sound, but he repeatedly emphasizes sound, and the spelling similarity between bishops and presbyters is hardly compelling. Bishops and Prelates may be alike, but Milton’s point is that the effect of censorship will truly be opposite from its intent. Could Milton really be using two written “p’s,” only one pronounced, in a word signifying difference to support the identity of bishop and presbyter based on the reversed positions of the [p] and [b] sounds in the two words? I suppose it is possible, but it is not unreasonable to hope for better kinds of argument and evidence than this. Although Fish describes his quotation from Milton as a “climactic” sentence, it is worth noting that the sentence does not end with “meets for the most part with an event utterly opposite to the end which it drives at,” but rather continues, after a colon, with “instead of suppressing sects and schisms, it raises them and invests them with a reputation,” and goes on to emphasize censorship’s likely encouragement of the growth of sects. Can it be an accident that “suppressing” also contains two “p’s” facing each other across a syllable break? Surely “yes” is a possible answer for both “suppressing” and “opposite.”

Further, is it really true that the dance of [p] and [b] “carries a message”? Does Fish’s list of words with these consonants constitute “a veritable orgy of alliteration and consonance”? There is no way to answer these questions without knowing if the dance and density are unusual (as Fish later suggests). And it would help if Fish gave us some rationale for his choice of what words with [p] and [b] to include in his list. If Milton is playing with contrast/similarity, should the listed words be related thematically? Should only content words count? Only stressed words? (Fish only lists nouns, verbs, and adjectives.)

Fish suggests that a full argument for his hypothesis would have to demonstrate that Milton intentionally put the pattern of [p] and [b] in the text, building an argument from the counts of the sounds to his intention and back, partly by citing other places Milton plays with sound in a similar way. Fish claims that, given only twenty-one consonants in the alphabet of twenty-six letters, he would have to “separate the patterns produced by the scarcity of alphabetic resources (patterns to which meaning can be imputed only arbitrarily) from the patterns designed by an author.” Undeniably, accidental patterns must be distinguished from meaningful ones, but Fish unfortunately again equates spelling with sound here, even though he explicitly argues that Milton is playing with sound. Twenty-one of the letters of the alphabet may typically be called consonants (<y> is a bit problematic), but there are not twenty-one consonants in English. Many letters represent more than one sound (<c f g s x z>, for example), and some consonants must be indicated with combinations of letters (<th>, <ch>, <sh>, <ng>), some of which also represent more than one sound (bath vs bathe).

The mere fact that Fish noticed the prevalence of [p] and [b] in Areopagitica because of Milton’s equation of bishop and presbyter is completely irrelevant to whether or not the question can or should be studied computationally. Fish’s method of collecting evidence only after adopting a preliminary hypothesis is even somewhat problematic because of the danger of self-fulfilling prophecy, but a computational (and consistent) method of counting [p] and [b] sounds in texts is entirely compatible with Fish’s method and could help him persuade us that the pattern might be intentional. I have not performed a computational analysis, but a check of many passages of the same size by Milton’s contemporaries reveals passages from Joseph Hall’s An Humble Remonstrance (a tract that Milton probably had a hand in answering) that show higher proportions of words with [p] and [b] than the passage Fish analyzes. Further, the parts of Areopagitica that precede the alleged equation of bishop and presbyter show proportions of [p] and [b] that are almost identical to those of the passages following it. Is this what we should expect if Milton is intentionally using a high frequency of the two sounds to imply that bishops and presbyters are the same?
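A consistent count of the kind described above might be sketched as follows. This is only an illustration under loud assumptions: the function names are hypothetical, and the code counts the letters <p> and <b> in spelling, not the sounds [p] and [b], so it reproduces the very spelling-for-sound shortcut criticized earlier (it would count the <p> of a word like philosophy and both <p>'s of opposite). A serious analysis would need phonemic transcriptions of the period texts.

```python
import re

def pb_proportion(text, letters="pb"):
    """Return the share of word tokens that contain at least one of `letters`.

    Assumption: spelling stands in for sound, which is only a rough proxy.
    """
    # Lowercase and keep runs of ASCII letters and straight apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if any(ch in w for ch in letters))
    return hits / len(words)

def windows(text, size):
    """Split a text into consecutive, non-overlapping passages of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size])
            for i in range(0, len(words) - size + 1, size)]

def compare(target_passage, comparison_text):
    """Compare one passage with same-sized passages from a comparison text."""
    size = len(target_passage.split())
    target = pb_proportion(target_passage)
    others = [pb_proportion(w) for w in windows(comparison_text, size)]
    higher = sum(1 for p in others if p > target)
    return target, others, higher
```

Running `compare` on the passage Fish analyzes against same-sized passages from the rest of Areopagitica, or from a contemporary tract, would show whether the density he notices is unusual; the sketch itself makes no claim about what the result would be.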

I have only suggested some directions for a more comprehensive and reliable analysis of Areopagitica, but such an analysis would surely enhance and deepen the discussion of Fish’s hypothesis that Milton is using the prevalence of [p] and [b] to argue that bishops and presbyters are the same. Digital humanities is far from being antithetical to his argument, which seems to cry out for the very precision and accuracy that computational approaches can provide.7

Conclusion

Digital literary studies offers no panacea, and its tools and methods can never eliminate the importance of literary intuition and close reading. There is plenty of room for a productive and vigorous discussion of what kinds of literary questions can profit from computational approaches, what kinds cannot, and what the differences between them mean for the practice of literary criticism. The results of studies using digital tools and methods must, like any results arrived at by any method, still be interpreted and must still be integrated into coherent rhetorical arguments. New methods, tools, and results must also be rigorously interrogated, questioned, tested, and replicated. Digital methods can provide new evidence and even new kinds of evidence in support of literary claims, and can make new kinds of claims possible. They can also make some claims untenable. In addition to allowing for “distant” kinds of readings of enormous collections of texts that are simply too large to be studied otherwise, the extraordinary powers of the computer to count, compare, collect, and analyze can be used to make our close readings even closer and more persuasive. Perhaps the availability of new and more persuasive kinds of evidence can also inspire a greater insistence on evidence for literary claims and push traditional literary scholars in some productive new directions. I would not argue that digital methods should supplant traditional approaches (well, maybe some of them). Instead, they should be integrated into the set of accepted approaches to literary texts.

Notes

1. I have argued that the question of whether dictation caused the change from Henry James’s early to late style is also computationally tractable (Hoover, “Modes of Composition in Henry James”). Even interpretation seems amenable to new kinds of evidence from giant natural language corpora (Louw; Hoover, Language and Style in The Inheritors; Hoover, “The End of the Irrelevant Text”; Hoover, “Some Approaches to Corpus Stylistics”), though some corpus approaches have more in common with postmodern theory (Hoover, Culpeper, and O’Halloran, Digital Literary Studies, chaps. 6–7).

2. Ramsay’s initial formula is tf-idf = tf *(N/df), where “tf” is the term frequency (the frequency of the term in the corpus), “N” = the number of documents (here, the six monologues), and “df” = the document frequency (the number of documents containing the term); see Ramsay, Reading Machines, 11.

3. My attempt to duplicate Ramsay’s results revealed that he actually uses a somewhat different formula. Many formulas exist, but for the four I tested, twenty of the twenty-five most characteristic words are the same. For similar problems reproducing Ramsay’s tf-idf scores, see Forster.

4. In his brief but suggestive and valuable discussion of the monologues of The Waves, John Burrows goes a bit further than I have in removing pseudo dialogue from the monologues, a practice that reduces Susan’s part from the 6,067 words I analyze to 5,690 (Burrows, 191, 205–7).

5. Cluster analysis is an exploratory method often used in authorship studies. There is no space here for a full discussion, but the method compares similarities and differences among the frequencies of all the words being analyzed in all of the texts and groups most closely those texts that use the words in the most similar way. The closer to the left of the graph that they form a cluster, the more similar two texts are. In Figure 20.1, Neville Remainder and Rhoda Remainder are the two most similar (the up-down proximity of Bernard Remainder to Neville Rand 6067 is not meaningful). The words are not truly randomly chosen because I have only sorted the lines of the text using a randomizing function, but this is random enough for my purposes.

6. Lexical and semantic differences among the six monologues are strongly confirmed using quite different methodology and in much greater detail in Balossi’s impressive corpus linguistics study of characterization in this novel (Balossi, A Corpus Linguistic Approach, see especially chaps. 6–8).

7. In a more informal context and blog post, Mark Liberman performed a quick test of Fish’s claim that this section of the Milton contains a preponderance of “p” and “b” and found it (somewhat) lacking; see Liberman, “The ‘Dance of the p’s and b’s’: Truth or Noise?” This post and a succession of comments on it show that other readers were also bothered by many of the problems I cite in Fish’s argument.
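The formula quoted in note 2 can be sketched in a few lines. This is a minimal illustration, not Ramsay's implementation (note 3 reports that he actually uses a somewhat different formula). The function names are hypothetical, and counting tf within a single monologue, so that each speaker gets a separate score, is an assumption made here for the example.

```python
def tf_idf(tf, n_docs, df):
    """Score a term as tf * (N / df), the formula quoted in note 2."""
    return tf * (n_docs / df)

def score_term(term, target_doc, all_docs):
    """Score `term` for one document, given all documents as lists of words.

    Assumption: tf is the term's count in `target_doc` only.
    """
    df = sum(1 for doc in all_docs if term in doc)  # documents containing the term
    if df == 0:
        return 0.0  # the term never occurs, so there is nothing to score
    tf = target_doc.count(term)
    return tf_idf(tf, len(all_docs), df)

def top_terms(target_doc, all_docs, k=25):
    """Return the k highest-scoring distinct terms of `target_doc`."""
    scored = {t: score_term(t, target_doc, all_docs) for t in set(target_doc)}
    return sorted(scored, key=lambda t: (-scored[t], t))[:k]
```

With six monologues as the documents, a word used by only one speaker is weighted six times as heavily as a word used by all six, which is why small changes to the formula or to the tokenization can reorder the lists of most characteristic words.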

Bibliography

Balossi, Giuseppina. A Corpus Linguistic Approach to Literary Language and Characterization: Virginia Woolf’s The Waves. Amsterdam: John Benjamins, 2014.

Biber, Douglas. Variation across Speech and Writing. Cambridge: Cambridge University Press, 1988.

Burrows, John F. Computation into Criticism. Oxford: Clarendon Press, 1987.

Craig, Hugh, and Arthur Kinney. Shakespeare, Computers, and the Mystery of Authorship. Cambridge: Cambridge University Press, 2009.

Fish, Stanley. Is There a Text in This Class? Cambridge, Mass.: Harvard University Press, 1980.

—. “Mind Your P’s and B’s: The Digital Humanities and Interpretation.” New York Times, January 23, 2012. http://opinionator.blogs.nytimes.com/2012/01/23/mind-your-ps-and-bs-the-digital-humanities-and-interpretation/?_r=0.

Forster, Chris. “With Thanks to Woolf and emacs, Reading ‘The Waves’ with Stephen Ramsay,” February 13, 2013. http://cforster.com/2013/02/reading-the-waves-with-stephen-ramsay/.

Guillory, John. “The Sokal Affair and the History of Criticism.” Critical Inquiry 28, no. 2 (2002): 470–508.

Hoover, David L. “Hot-Air Textuality: Literature after Jerome McGann.” Text Technology 14, no. 2 (2005): 71–103.

—. Language and Style in The Inheritors. Lanham, Md.: University Press of America, 1999.

—. “Making Waves: Algorithmic Criticism Revisited.” DH2014, University of Lausanne and Ecole Polytechnique Fédérale de Lausanne, July 8–12, 2014.

—. “Modes of Composition in Henry James: Dictation, Style, and What Maisie Knew.” Henry James Review 35, no. 3 (Fall 2014): 257–77.

—. “Some Approaches to Corpus Stylistics.” In Stylistics: Past, Present and Future, ed. Yu Dongmin, 40–63. Shanghai: Foreign Language Education Press, 2010.

—. “The End of the Irrelevant Text: Electronic Texts, Linguistics, and Literary Theory.” Digital Humanities Quarterly 1, no. 2 (2007). http://www.digitalhumanities.org/dhq/vol/1/2/000012/000012.html.

—. “The Trials of Tokenization.” DH2015, University of Western Sydney, Australia, June 29–July 3, 2015.

Hoover, David L., Jonathan Culpeper, and Kieran O’Halloran. Digital Literary Studies: Corpus Approaches to Poetry, Prose, and Drama. London: Routledge, 2014.

Jockers, Matthew L. Macroanalysis: Digital Methods and Literary History. Urbana: University of Illinois Press, 2013.

Liberman, Mark. “The ‘Dance of the p’s and b’s’: Truth or Noise?” Language Log, January 26, 2012. http://languagelog.ldc.upenn.edu/nll/?p=3730.

Louw, Bill. “Irony in the Text or Insincerity in the Writer? The Diagnostic Potential of Semantic Prosodies.” In Text and Technology, ed. M. Baker, G. Francis, and E. Tognini-Bonelli, 157–76. Philadelphia: Benjamins, 1993.

McGann, Jerome J. Radiant Textuality: Literature after the World Wide Web. New York: Palgrave, 2004.

Milton, John. Areopagitica. Available at http://www.dartmouth.edu/~milton/reading_room/areopagitica/text.shtml.

Plasek, Aaron, and David L. Hoover. “Starting the Conversation: Literary Studies, Algorithmic Opacity, and Computer-Assisted Literary Insight.” DH2014, University of Lausanne and Ecole Polytechnique Fédérale de Lausanne, July 8–12, 2014.

Ramsay, Stephen. Reading Machines: Toward an Algorithmic Criticism. Urbana: University of Illinois Press, 2011.

—. “The Hermeneutics of Screwing Around; or What You Do with a Million Books.” In Pastplay: Teaching and Learning History with Technology, ed. Kevin Kee, 111–20. Ann Arbor: University of Michigan Press, 2014.

Sokal, Alan D. “A Physicist Experiments with Cultural Studies.” In The Sokal Hoax: The Sham That Shook the Academy, ed. Lingua Franca, 49–53. Lincoln: University of Nebraska Press, 1996.

—. “Transgressing the Boundaries: Toward a Transformative Hermeneutics of Quantum Gravity.” Social Text 46/47 (1996): 217–52.

Wallace, Miriam L. “Theorizing Relational Subjects: Metonymic Narrative in The Waves.” Narrative 8 (2000): 294–323.

Woolf, Virginia. The Waves. London: Hogarth Press, 1931.
