Chapter 18

The Challenges and Possibilities of Social Media Data: New Directions in Literary Studies and the Digital Humanities

Melanie Walsh

During the Q&A after my 2017 talk at Michigan State University’s Global Digital Humanities conference, a faculty member raised his hand and asked a series of questions that challenged me—even scared me—because I wasn’t confident about the answers. “How would you describe your ethical approach to this data?” he asked. “Did you get IRB approval for this research?” At the time, I was a graduate student in English literature, and I had just presented a paper about my nascent digital humanities project—an analysis of #BlackLivesMatter tweets that cited the American novelist and civil rights activist James Baldwin. Then, as now, I argued that social media data can provide a rich archive of reading, literary reception, and textual circulation, that it can help us understand how people feel about books and authors and how they use them in the world. This social media data might include tweeted quotations, like the ones I referenced in my paper, but it might also include Goodreads ratings, TikTok (or “BookTok”) videos, fanfiction stories, Tumblr discussions, Reddit memes, YouTube reviews, or countless other kinds of content from countless other platforms. Because this material is published on the internet, and because it can be collected as data, I believed, and still believe, that it can fuel new, data-rich approaches to literary history, opening up additional understandings of how literature lives in the contemporary world.

Yet the audience member’s questions about ethics and obtaining approval from the IRB gave me pause. I understood the Twitter users in my dataset as authors in their own right, people who deserved credit and citation in my paper, and I had planned to inform these users before I formally published any research that cited their tweets. But I had not discussed this approach with many other scholars, nor was I certain about best practices for the field. And I pointedly shied away from the second question. What’s an IRB again? I thought to myself.

An IRB, or institutional review board, is an administrative body that evaluates and approves the ethical dimensions of research involving human participants, with the goal of protecting the rights and welfare of the people involved. These review boards, a requirement for any U.S. institution or university that receives federal research money, were established in response to heinous human subject research abuses committed in the early- and mid-twentieth century.1 Though familiar to many scholars in the sciences and social sciences, IRBs were not familiar to me, an English literature graduate student. Working with human subjects is not common in literary studies research—or at least, historically, it has not been. Though I later learned that publicly available data like tweets do not typically require IRB approval, I also learned that IRB approval or exemption is hardly the end of the story when it comes to dealing with social media data.2 As Moya Bailey argues, “social media users require a level of forethought that extends beyond the purview of the IRB.”3 This perspective is shared by many other scholars as well as by the social media archiving project Documenting the Now, which has specifically grappled with the difficult questions raised by archiving #BlackLivesMatter data.

I begin with the story of this conference Q&A because I think it highlights a basic disconnect between my humanistic training and the more social-scientific research that I was beginning to do, a kind of interdisciplinary research that has come to characterize my scholarship and that has been embraced by other digital humanists and quantitative literary critics, too. Literary studies education does not typically cover how to conduct responsible research about people, or at least not “ordinary” people—in other words, people who are not published authors. Yet research that involves social media data always demands understanding how to treat people and their information with responsibility and care. As social media data becomes a more commonly used source in the digital humanities—and I use “social media data” as a catchall term for any user-generated content published on the internet—the absence of ethical frameworks to guide this type of research will become an even more pressing problem.

To be clear, I believe that social media data holds great promise for the humanities and for literary studies in particular, especially for an emerging subfield that I call computational reception studies, which explores questions related to reading, reception, and textual circulation.4 But social media data is a fundamentally different kind of data than the digitized text collections that are familiar to most digital humanists and quantitative literary scholars. Most consequentially, social media data is tethered to living people who, unlike traditionally published authors, may not be expecting to be featured in scholarship and may even be harmed by it. This difference demands that humanists who use social media data in their research engage with the people behind the data being studied and with the ethical questions that their data invites.

The many challenges involved with collecting and analyzing data produced by communities have already been a central focus of digital humanities scholarship, and this body of thought serves as an essential resource. But there is additional work that needs to be done to connect this scholarship to the specific practices associated with computationally assisted social media research, and there are other gaps that remain. In the first half of this chapter, I sketch out the subfield of computational reception studies, showing how social media data resembles earlier forms of reception data featured in existing digital humanities work (e.g., nineteenth-century library records or twentieth-century book reviews) and how it also figures in emerging research on contemporary literature, readership, and politics. In the second half, I highlight the approaches to social media data being used and recommended by leading researchers at the border of the social sciences and digital humanities. These scholars—including Sarah J. Jackson, Moya Bailey, and Brooke Foucault Welles; Deen Freelon, Charlton D. McIlwain, and Meredith D. Clark; Dorothy Kim and Eunsong Kim; Brianna Dym and Casey Fiesler; and members of the Documenting the Now project—are currently pursuing the complex questions animated by the scholarly use of social media data. By drawing on these scholars as well as my own research experience, I outline three unresolved questions for humanists who plan to study social media data with the help of computational tools: (1) How should scholars engage with the online communities whose data they computationally collect and analyze? (2) How should scholars cite social media users in published research? (3) How, if at all, should scholars share users’ data? These three issues—community engagement, citation, and data sharing—do not have easy or universal solutions, but I suggest some best practices for addressing them.

Though this chapter mostly focuses on the reception turn in quantitative literary studies, social media data is already being used in other parts of literary studies and the humanities, and the best practices that I offer also apply to those areas. Matt Kirschenbaum, for example, has argued that scholars “cannot write seriously about contemporary literature without taking into account myriad channels and venues for online exchange.” Proving Kirschenbaum’s point, Laura McGrath has studied how aspiring authors pitch their work to literary agents on Twitter by using both qualitative and quantitative methods plus an archive of computationally collected tweets. Advocating for the broad potential of social media data in digital humanities research, Michael L. Black has suggested that it might help build a more interdisciplinary and global digital humanities community, drawing in “fields like new media studies, software studies, science and technology studies, and Internet history” while also offering an alternative to “major print data repositories, which continue to rely on cultural categories defined around national identities” (96). Whether or not this exciting vision is fulfilled, an increasing number of scholars and students will inevitably apply the technical methods of the digital humanities (e.g., computational text analysis, data mining, machine learning) to the social media sphere. The issues of community engagement, citation, and data sharing will thus become essential considerations not only for scholars who are interested in reception, or scholars who plan to use social media data in their own research, but for many more. We must shift the way we teach and do research with computational methods in the humanities so that we can more responsibly engage with communities and the social data they create, and so that we can build a lasting framework for the research to come.

The Reception Turn in Quantitative Literary Studies

At the 2020 Modern Language Association (MLA) annual convention, Ted Underwood heralded a shift in quantitative literary studies, or “distant reading,” when he announced a turn away from the field’s primary focus on the text of books and toward evidence about the life of books. “The collections distant readers built ten or twenty years ago informed us about literary production, not circulation or reception,” Underwood said. “But if we want to learn about the other half of a book’s life cycle, we’re going to need other kinds of evidence.” This shift has partly been driven, as Underwood acknowledged, by scholars like Katherine Bode and Lauren Klein, who argue that quantitative approaches to literary history are reductive if scholars only apply them to a single copy of a literary work and do not account for the work’s publication or circulation history (Bode, 79), or if they frame their analyses only along the axis of “close” and “distant” reading and do not consider additional dimensions of scale (Klein, 25). In their own research, both Bode and Klein offer promising ways to enrich quantitative approaches to literary history. But using more and better evidence about reception is another clear way that scholars can produce deeper, more complex cultural histories.

Many digital humanities scholars have in fact already begun to produce these multidimensional cultural histories by incorporating a wide variety of reception evidence in their work. Bode, for example, has pointed out that many of Franco Moretti’s claims about readers are tenuously based on the “publication [date] and/or formal features of literary works,” not on actual evidence about readers (Bode, 92).5 Anne DeWitt, rather than relying on formal textual features alone, explores reader responses to nineteenth-century theological novels through their discussion in newspapers. Lynne Tatlock, Matt Erlin, Doug Knox, and Steve Pentecost similarly study late-nineteenth-century readership by examining library book checkout records digitized by the What Middletown Read project. Underwood, Wenyi Shang, and their team at the University of Illinois Urbana-Champaign are parsing twentieth-century book reviews from volumes like Book Review Digest, hoping to understand how reviews influence literary prestige, popularity, and style. There are other leading digital humanities projects that rely on reception and circulation data as well.6 Projects like Viral Texts and America’s Public Bible use newspaper archives to track the reprinting of texts and the popularity of biblical quotations; the Reading Chicago Reading project relies on library records from the Chicago Public Library to understand how people across the city engage with the same books; and the Open Syllabus Project uses crowdsourced syllabi to understand the contours of college curricula and the most frequently taught texts and authors. By shifting attention away from primary textual evidence and toward these other forms of evidence, this emerging subfield seeks to answer questions that have long been central to critical traditions like reader-response criticism, reception studies, book history, audience theory, and the history of reading.

These reception-oriented approaches, which first took shape in the 1970s and 1980s, emerged, much like their digital humanities descendants, from a dissatisfaction with the narrow scope of then-dominant critical methods and with scholars’ excessive focus on the text. “Although theorists of reader-oriented criticism disagree on many issues,” as Jane Tompkins once put it, “they are united in one thing: their opposition to the belief that meaning inheres completely and exclusively in the literary text” (201). The same could be said of computational reception studies, which, while varied in its approaches, seems united around the belief that “other kinds of evidence” are required to understand literary and cultural texts.

The kinds of evidence discussed thus far—historical book reviews, newspaper articles, library circulation records—are largely familiar to scholars who use computational methods in the digital humanities because they come from familiar databases (e.g., the HathiTrust Digital Library, the Library of Congress’s Chronicling America newspaper archive) or entail familiar processes of digitization (e.g., scanning and optical character recognition). But data about reading and reception is also being curated from a source far less familiar to the field: the internet, where book reviews, fanfiction stories, tweeted quotations, and other forms of reception evidence are routinely and copiously published by readers, writers, and amateur critics. Compared to more traditional digitized text collections, this born-digital data offers new affordances. Its abundance, its relative ease of collection, and its unique, often intimate documentation of reader responses open new possibilities for literary research.7 For example, Goodreads, the social networking site for readers, currently has more than 120 million users who have published, since the site’s founding in 2006, more than 90 million reviews of books. While bigger data is not always better data, bigger data, in this case, represents an expanded archive of reception. Goodreads reviews and other social media data also represent readers’ responses in their own words, a kind of evidence that has been historically difficult, if not impossible, for literary critics to find, especially in large quantities.8 Further, because these reviews are published on the internet, they can be “scraped” or otherwise computationally collected as structured data, and scraping data from the web is generally easier and faster than digitizing print materials.9
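
To make this kind of collection concrete, below is a minimal sketch in Python of how a public page of book reviews might be scraped into structured, tabular data. The URL and the HTML class names are hypothetical placeholders, and any real collection should respect a platform’s terms of service and rate limits (see note 9 on the access restrictions that Goodreads imposes).

```python
# A minimal scraping sketch. The URL and CSS selectors below are
# hypothetical placeholders, not any real platform's markup.
import csv

import requests
from bs4 import BeautifulSoup

URL = "https://example.com/books/giovannis-room/reviews"  # hypothetical

response = requests.get(URL, timeout=30)
response.raise_for_status()
soup = BeautifulSoup(response.text, "html.parser")

rows = []
for review in soup.select("div.review"):  # hypothetical selector
    rows.append(
        {
            "reviewer": review.select_one(".reviewer-name").get_text(strip=True),
            "rating": review.select_one(".rating").get_text(strip=True),
            "text": review.select_one(".review-text").get_text(strip=True),
        }
    )

# Write the reviews out as structured data for later analysis.
with open("reviews.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["reviewer", "rating", "text"])
    writer.writeheader()
    writer.writerows(rows)
```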

Social media data’s abundance, availability, and richness have made it a growing focus in humanistic research. For example, literary scholars have used Goodreads reviews and ratings to empirically investigate various differences between reception communities: the readers of best-selling novels versus the readers of critically acclaimed novels (English et al.), the language of amateur reviewers versus the language of professional reviewers (Hegel), the tastes of conservative readers versus the tastes of liberal readers (Piper and So), and the books rated by Goodreads users versus the books cited by literary scholars (Manshel, McGrath, and Porter). Additionally, teams at the Stanford Literary Lab have explored how genres of fanfiction, in which amateur authors write and share stories based on existing fiction, evolve over time and how Harry Potter fanfiction compares across different languages. A team at the University of Pennsylvania’s Price Lab for Digital Humanities, meanwhile, has studied how quotations get recycled from movies to Archive of Our Own fanfiction stories. They have also published an interactive tool, The Fan Engagement Meter (fanengagement.org), which can be used to explore these trends. In my own research, I have drawn on an archive of #BlackLivesMatter tweets to explore quotations of the writer James Baldwin, showing that Twitter users overwhelmingly cited Baldwin’s 1960s mass media material and circulated various misquotations of his words (Walsh).10 And while Twitter has received outsized scholarly attention because it makes data more available to researchers, many other platforms—such as YouTube, TikTok, and Reddit—can also give us enlightening perspectives on how people discuss, deploy, and use literature in everyday life.

Though social media data clearly offers new opportunities when compared to traditional digitized texts, it also poses new challenges. The most serious challenges stem from the fact that social media data is published by people who, unlike traditionally published authors, may not be expecting or may not want to be the subjects of research. These are people who may, consequently, be at special risk of doxxing, trolling, harassment, or of otherwise experiencing negative, undesired outcomes from the mishandling of their data. Scholars who study fanfiction communities are intimately familiar with such risks and with the difficulties of separating online texts from the people who authored them. One of the most complicated questions for fanfiction scholars, as Kristina Busse puts it, is “whether online evidence ought to be viewed as a textual document or as an utterance by the person who wrote it” (11). Scholars of computational reception studies must face the same question. Is a Goodreads dataset, for example, a collection of texts or a collection of utterances made by people? It is both, of course. But I believe that we can take a cue from fanfiction scholarship in more clearly recognizing the people in the data and the living quality of the data. “Unlike traditional texts,” as Brianna Dym and Casey Fiesler write, “fan works are personal and tied to the people and communities they are created in as living data, so they carry consequences with their use and analysis” (emphasis added). For scholars who use fanfiction stories or similar “living data” in their work, these consequences must be addressed.

This is not, of course, the first time that humanists or literary critics have engaged with living data. Perhaps the most influential work of reception scholarship, Janice Radway’s Reading the Romance, is centered around a community of real Midwestern women romance readers. Interestingly enough, Underwood has previously pointed to Radway as a proto-quantitative literary critic because she uses, in this study, social-scientific methods like experimental design, samples, and hypotheses. This genealogy, as Underwood tells it, is meant to highlight quantitative literary studies’ investment in social-scientific approaches rather than in digital technology or the digital humanities. But Radway’s decision to engage with actual readers—“to move beyond the various concepts of the inscribed, ideal, or model reader and to work with actual subjects in history” (5)—demanded not only that she formulate hypotheses and create samples, but also that she incorporate ethnographic approaches such as interviews, surveys, participant observation, and collaboration. This aspect of Radway’s social-scientific approach is, in my view, not what disconnects quantitative literary studies from the rest of the digital humanities but perhaps one of the things that most meaningfully connects them.

As many of the chapters in this 2023 edition of Debates in the Digital Humanities testify, digital humanities scholars have been leaders in conversations around data ethics and at the forefront of developing guidelines for data-driven research with and for communities. In Chapter 8 (The Feminist Data Manifest-NO), Tonia Sutherland, Marika Cifor, T. L. Cowan, Jas Rault, and Patricia Garcia propose distinct principles for feminist approaches to data that center minoritized communities. These principles—informed by humanistic thinking about data and ethics, people and lives—can be carried over, and indeed have been carried over, into social-scientific work. In the sections that follow, I show how current computational research with social media data must continue to bridge digital humanities and social science perspectives, particularly with regard to three key challenges: community engagement, citation, and data sharing. To think through these challenges, I draw on models and recommendations from the Documenting the Now project; from research about social media activism by Moya Bailey, Sarah J. Jackson, Brooke Foucault Welles, Deen Freelon, Charlton D. McIlwain, and Meredith D. Clark; and from research about ethical approaches to fanfiction data by Brianna Dym and Casey Fiesler. These researchers and their projects are especially helpful because they focus on the data of marginalized or otherwise vulnerable communities, such as Black activists and LGBTQ fanfiction writers. Centering research on these communities can help scholars develop best practices that consider the most vulnerable as a baseline and adjust from this baseline based on context.

Community Engagement

Because of the technical affordances of social media data, it is not only possible but common for researchers to collect users’ personal data without their permission and even without their knowledge. Fiesler and Nicholas Proferes have shown that many Twitter users are not aware that researchers can and do collect their data. Perhaps equally worrisome is that researchers often lack knowledge about the users and communities whose data they collect. As Bergis Jules, Ed Summers, and Vernon Mitchell of Documenting the Now assert, “the internet affords the luxury of a certain amount of distance to be able to observe people, consume information generated by and about them, and collect their data without having to participate in equitable engagement as a way to understand their lives, communities, or concerns” (3). For these reasons, they and other scholars argue that researchers should engage with and be knowledgeable about the communities whom they are studying and collecting data from—whether through conversation, collaboration, interviews, or ethnographic approaches. This insistence on community engagement aligns with one of the principles of the Feminist Data Manifest-NO (see Chapter 8), in which the authors refuse “work about minoritized people” and commit instead to “working with and for minoritized people in ways that are consensual and reciprocal and that understand data as always co-constituted” (emphasis added). Here, I argue that humanities scholars who work with social media data must meaningfully and deliberately engage with the communities they study, as well.

The significance of community engagement for humanistic social media research has already been demonstrated in some of the public discourse surrounding the Fan Engagement Meter, the fanfiction project out of Penn’s Price Lab for Digital Humanities. Led by Peter Decherney, James Fiumara, and Scott Enderle, the Fan Engagement Meter is an interactive tool that displays lines of film dialogue commonly reused in Archive of Our Own fanfiction stories—stories that were computationally collected and analyzed for the purposes of the project. After the Fan Engagement Meter was spotlighted by the university’s news publication, Penn Today, some fanfiction writers and scholars, who were previously unaware of the project, began to voice concerns about it (Shepard). In a piece for the online publication The Mary Sue, fanfiction writer Jessica Mason referred to the project’s methods as “creepy data mining,” and she amplified similar criticism from fanfiction scholar and literary critic Alexandra Edwards, who tweeted: “And I have to wonder . . . did they go through IRB for this? How do we know these researchers won’t sell the aggregated data (especially the bullshit predictive model stuff)? Are works on AO3 protected from this kind of exploitation in any way?”11 These comments and questions register some people’s discomfort, confusion, and apprehension about researchers collecting their data and the subsequent uses or potential misuses of that data. Such apprehension is especially heightened for fanfiction communities, as Dym and Fiesler address, because they often represent vulnerable populations and privacy-sensitive contexts “due to not only the large number of LGBTQ participants . . . but also different stigmas associated with fandom.” For computational reception studies, and for broader digital humanities research, these concerns underscore the importance of addressing ethical questions head on and clearly explaining approaches to data collection, data analysis, and data sharing in all published results.

But the central criticism voiced in Mason’s piece is ultimately not about data mining or even quantification; it is about the researchers’ lack of engagement with the fanfiction community and, more specifically, their lack of dialogue with relevant work by women, queer people, and people of color:

The use of quantitative, data-based research isn’t new in fan studies. In fact, there are many academics and non-academics out there doing amazing work, such as DestinationToast on tumblr. The problem here is these men approaching fanfic like they’re the first people to analyze it. . . . I hope this tool can somehow be useful, but I also hope these researchers take the time to listen to the female, queer and POC voices in fan studies that are doing great work and see what they can learn. (Mason)

This hope, the conclusion of Mason’s piece, underlines that working with, understanding, and listening to communities—as well as to scholars already researching in these spaces, even if they do not use the same methods—is one of the most important considerations for computational research with social media data, and indeed for all academic research.

By directly engaging with the users and online communities whom they hope to study, scholars can also make better, more informed, and more context-dependent decisions about other parts of the research process, such as whether to cite a specific user in published research. For example, in Dym and Fiesler’s extremely useful best practices for studying online fandom data, they recommend that researchers who are unfamiliar with fan communities “spend time [in online fandom spaces] and take the time to talk to fans and to understand and learn their norms,” not only to learn more about the community but also to become more “mindful of each user’s reasonable expectations of privacy, which may be dependent on the community or platform.”12 They further underscore that making connections with individual users is possible even for large-scale, data-driven research: “Even for public data sets in which individual participants might number in the tens of thousands to the millions, it might be possible to talk to some members of the target population in order to better understand what values and concerns people might hold that would deter them from consenting to their data being used” (Dym and Fiesler). Likewise, for digital humanities scholars, even those working with large datasets, talking to actual internet users can be an essential step toward producing more ethical research.

Mixed-methods approaches to social media data have already proved successful in leading research at the border of the social sciences and digital humanities. For example, in her research on the #GirlsLikeUs hashtag, created by trans advocate Janet Mock, Moya Bailey sought Mock’s permission to work on the project before it began, and she collaborated with Mock to develop her research questions and determine the project’s direction. Though Bailey’s research included data collection and quantitative analyses, it was also shaped by a consenting collaborator who was part of the community being studied. In their 2020 book #HashtagActivism: Networks of Race and Gender, Bailey and her colleagues Sarah Jackson and Brooke Foucault Welles similarly frame the social media users whom they study as collaborators, and they intentionally make space for “hashtag users to speak for themselves” (Jackson, Bailey, and Foucault Welles).13 To this end, they pair each chapter of their book with “an essay written by an influential member of a particular hashtag activism network” (Jackson, Bailey, and Foucault Welles). Along similar lines, Deen Freelon, Charlton McIlwain, and Meredith Clark used large-scale network analysis techniques to study 40 million #BlackLivesMatter (BLM) tweets, but they also interviewed dozens of BLM activists and allies “to better understand their thoughts about how social media was and was not useful in their work.” Such mixed-methods approaches, shaped by the social sciences, offer a productive model for computational reception studies and broader digital humanities research.

Yet, as we have seen, mixed-methods approaches are also not without precedent in the humanities, as Radway’s study of romance readers clearly demonstrates. After all, in Reading the Romance, Radway does not argue that ethnographic approaches should “replace textual interpretation” but rather that they might be “fruitfully employed as an essential component of a multifocused approach that attempts to do justice to . . . historical subjects” (6). In a similar vein, I argue that direct interactions with online users and ethnographic approaches need not replace computational text analysis, but rather that they might be integrated as one part of a multifaceted approach that can more richly and responsibly consider what readers, writers, and amateur critics actually care about, what texts mean to them, and why they share texts in the contemporary world.

Citation

Researchers often face another difficult question when dealing with social media data: how, if at all, to quote or cite specific social media posts in published research. While some researchers attempt to avoid this issue by sticking with aggregated representations of data, more humanistic research often demands engagement with specific examples. In my research on Baldwin and the BLM movement, for instance, I found that one of the most popular tweeted quotations was a misquotation of Baldwin’s words, one that had many subtle mutations in the dataset, and I wanted to attend to these differences by closely reading some of the individual tweets (Walsh). But I then faced the difficult decision of how to cite the authors of those tweets, how to balance protecting users’ privacy and safety with honoring their creativity and agency. I ultimately decided to contact each user, inform them of the citation, and give them the option of not being included in the article. Most people replied and actively voiced interest in being included. Some of these users asked for more details about the research, some expressed surprise that one of their tweets had been deemed research-worthy, and some conveyed mild shock that they had even authored the tweet I was referring to, forgetting about the 140 characters they had released into the ether years earlier. While this approach to citation worked for me at the time, I have continued to reflect on my choices, and today I would make slightly different ones. In alignment with arguments by Dorothy Kim and Eunsong Kim, I would now follow a stronger consent-centered approach, one that recommends that researchers who wish to cite specific social media posts should make their best effort to contact the authors (especially if they are nonpublic figures) and that they should seek explicit permission to use posts, in addition to asking for a desired authorial attribution, such as a username, real name, or pseudonym. Even when researchers make their best efforts, however, gaining direct permission from users is not always possible—a challenge that fanfiction scholars have noted and that my own research experience affirms as well.14 In these cases, it can be helpful to turn to other citation strategies, which I will discuss in more detail below.

First, however, it is important to establish why both citation and anonymization are potentially harmful for users.15 Citation can be detrimental to users because it can expose their material to a new, unexpected, and/or larger audience, which can lead to unwanted attention, harassment, doxxing, physical harm, adverse professional consequences, personal complications, and other negative outcomes. For example, the Documenting the Now project discusses how “activists of color . . . face a disproportionate level of harm from surveillance and data collection by law enforcement,” and thus amplifying their words or actions (as documented through social media) might put them in danger, even more so than the average social media user (Jules, Summers, and Mitchell). In a similar vein, when fanfiction writers were asked how they felt about researchers or journalists citing their stories, many fanfiction writers, “whether identifying as LGBTQ themselves or simply thinking about their friends, worried that exposing fandom content to a broader audience could lead to fans being accidentally outed” (Dym and Fiesler). Because of such risks and other privacy concerns, some scholars choose to anonymize social media posts. This would seem to be an easy way to protect the privacy of individual social media users while also including direct textual evidence in published research.

Yet anonymity is not a sufficient strategy both because it is not actually effective for protecting users’ privacy and because it robs users of authorship. “Even when anonymized by not including usernames, content from social media collected and shared in research articles can be easily traced back to its creator,” as Dym and Fiesler assert, drawing on a study by John W. Ayers and colleagues. Furthermore, anonymity does not give users proper intellectual credit, as Amy Bruckman argues. For these reasons, seeking explicit permission from users is usually the best approach to citation, and it can even lead to more meaningful and substantive forms of community engagement. For example, in my research with Maria Antoniak on Goodreads reviews, we sought permission from each Goodreads user whom we directly quoted in our published work. We messaged these users on the Goodreads platform, described our research, asked for permission to quote from a specific review, and offered different attribution options, such as their real name, username, or the anonymous pseudonym “Goodreads user.” While some users simply replied with a quick “yes” or “no,” many users responded with follow-up questions about the research, how exactly we planned to cite them, and where they could find the article when it was published. Some users even offered helpful contextualization and further explanation of their reviews. Asking for consent thus opened the door to a more reciprocal, collaborative, and informed relationship with the users whom we were studying.

Though seeking explicit permission from these Goodreads users was a rewarding experience in many ways, it also highlighted some of the drawbacks of this approach, including the fact that many Goodreads users did not respond to our inquiries at all.16 When it is not possible to gain permission from users, Dym and Fiesler recommend paraphrasing posts in such a way that they are not traceable back to the original post or ethically “fabricating” material in ways that Annette Markham has advocated and described, such as by creating a composite account of a person, interaction, or dialogue. Paraphrasing strategies were effectively used by Antoniak, David Mimno, and Karen Levy in their computational analysis of the Reddit community r/BabyBumps, a community for sharing birth stories. The authors chose to paraphrase Reddit posts included in the article in order to “minimize the possible identification of and harm to the authors” (23). Taking a different tack, Freelon, McIlwain, and Clark decided to include links to tweets rather than the full texts of tweets in their study, which allowed Twitter users to delete their tweets and to effectively remove themselves from the research. Additionally, they only linked to tweets that were already reasonably exposed to the public, such as tweets with more than 100 retweets, tweets published by officially verified Twitter accounts, or tweets published by Twitter accounts with more than 3,000 followers.17 Such thresholds—metrics that can be used to assess whether republishing a post will unduly increase exposure or risk to the author—can be devised for other platforms and contexts as well. All of these strategies—paraphrasing posts, linking to posts rather than quoting text, and establishing “reasonably public” thresholds—can be effective approaches to citation in humanistic research with social media data.
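
As a concrete illustration of the threshold strategy just described, the sketch below encodes the “reasonably public” criteria that Freelon, McIlwain, and Clark applied to tweets. The record format is a simplified assumption for illustration, not any platform’s actual API schema, and the thresholds themselves would need to be recalibrated for other platforms and contexts.

```python
# A sketch of a "reasonably public" filter, following the thresholds
# Freelon, McIlwain, and Clark used: more than 100 retweets, an
# officially verified account, or more than 3,000 followers. The field
# names are simplified assumptions, not a real API schema.

def is_reasonably_public(post: dict) -> bool:
    """Return True if linking to this post is unlikely to meaningfully
    increase its author's exposure."""
    return (
        post.get("retweet_count", 0) > 100
        or post.get("author_verified", False)
        or post.get("author_followers", 0) > 3000
    )

posts = [
    {"id": "1", "retweet_count": 250, "author_verified": False, "author_followers": 80},
    {"id": "2", "retweet_count": 3, "author_verified": False, "author_followers": 120},
]

# Link to (rather than quote) only the posts that clear a threshold,
# so that authors retain the ability to delete them.
citable = [p for p in posts if is_reasonably_public(p)]
print([p["id"] for p in citable])  # prints: ['1']
```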

Yet it is important to acknowledge that many of these approaches disrupt comfortable and commonly used methodologies in humanistic scholarship, such as close reading. I feel the discomfort, too. As a literary critic with a reverence for texts, it is slightly painful for me to think about paraphrasing an especially colorful tweet or a hilarious Goodreads review. Yet I also recognize that the evolving nature of humanistic research demands that we expand beyond our comfort zones and disciplinary trainings. If embracing new approaches means better protecting people, isn’t it worth the discomfort?

Sharing Data

Sharing data and code has become an important practice in quantitative literary studies and in the broader digital humanities. As the Journal of Cultural Analytics, one of the field’s leading research journals, contends: “Shared data helps foster a community of critical analysis and widens the circle of who can participate.”18 In keeping with this conviction, the Journal of Cultural Analytics requires that “all data and code relevant to articles published in [the journal] will be made publicly available,” including “underlying text, audio, or image files; derived data used in the analysis; and code used to acquire, clean, and analyze collections.” But social media data troubles this policy, as the journal acknowledges, because “user-generated content should not be recirculated without permission.” The risks previously discussed with regard to social media citation are amplified many times over when sharing full datasets, because full datasets include more potentially identifiable data about more users.

Testifying to such risks, the authors of a white paper on the Documenting the Now project discuss how their technical lead, Ed Summers, was once asked to share some of the Twitter data that he had collected about the 2014 #Ferguson protests—data that documented both virtual and on-the-ground protests sparked by the unjust murder of Michael Brown, who was shot and killed by a police officer in Ferguson, Missouri. The person asking for this #Ferguson data, Summers discovered, was an employee of a social media data mining company that was collaborating with law enforcement and security services. Though Summers refused the request, the Documenting the Now team members realized “how easy it could be for the collections we build to be used against marginalized communities” (Jules, Summers, and Mitchell). On the other hand, Jackson, Bailey, and Foucault Welles have also stressed how essential social media data access is for research and knowledge production: “The threats to privacy and security that are introduced through unwanted use of social media data are real, but so too are the threats to social-scientific insight if we are unable to create pathways for researchers to access data within and across social media platforms” (206). The question of how to safely share social media data is an extremely challenging one, and there is no universal solution to the issues that are raised.

While the conversation around the sharing of social media data is still evolving, I will briefly point to two potential paths for ethically sharing social media data: sharing data in ways that allow for users’ right to be forgotten, and sharing data in repositories that offer varying levels of restriction and access. The first path is perhaps best exemplified by Twitter’s policies for sharing data. Twitter’s terms of service do not allow users to share full datasets of tweets, but they do allow the sharing of tweet IDs—unique identifiers assigned to every tweet that can be used to retroactively access tweets from Twitter’s application programming interface (API). If an ID is connected to a tweet that has been deleted, however, the tweet can no longer be accessed. “If you squint right,” Ed Summers has argued of this policy, “Twitter is taking an ethical position for their publishers to be able to remove their data: to exercise their right to be forgotten.”19 This system allows researchers to share Twitter data while also allowing individual users to remove themselves from future versions of the data if they wish. Many researchers, such as Freelon, McIlwain, and Clark, have shared their data in this format, and the Documenting the Now project even hosts a crowdsourced repository of tweet IDs that currently contains more than 6 billion of them (catalog.docnow.io). Though this policy is specific to Twitter, the same principles can be applied to other platforms and domains. If a researcher collected millions of photos from Instagram, as the Journal of Cultural Analytics describes in one of its data-sharing scenarios, that researcher would be discouraged from sharing the actual photo data, but they would be encouraged to share the code used to collect the photo data. This would allow other researchers to collect similar data while also affording Instagram users more time to remove their posts. Admittedly, this prioritization of users’ privacy comes at the cost of replicability. These data-sharing strategies do not allow for the exact replicability of data or the exact reproducibility of results. But they can still come close to replicating data and reproducing results, and they can do so, importantly, in ways that honor users’ right to be forgotten.
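
To make the tweet ID approach concrete, here is a minimal sketch of the pattern, assuming tweets are stored one JSON object per line with Twitter’s conventional “id_str” field. Only the IDs are published; anyone re-collecting (“hydrating”) the dataset later will silently lose the tweets that their authors have since deleted, which is what preserves the right to be forgotten.

```python
# A sketch of preparing a shareable tweet ID list from a local
# collection. Assumes tweets.jsonl holds one tweet per line in JSON,
# with Twitter's conventional "id_str" field.
import json

with open("tweets.jsonl", encoding="utf-8") as infile, open(
    "tweet_ids.txt", "w", encoding="utf-8"
) as outfile:
    for line in infile:
        tweet = json.loads(line)
        outfile.write(tweet["id_str"] + "\n")

# tweet_ids.txt is the artifact that gets deposited or shared. Others
# can re-collect the tweets with a tool like twarc, developed by the
# Documenting the Now community, e.g.:
#
#     twarc hydrate tweet_ids.txt > rehydrated.jsonl
#
# Deleted tweets simply do not come back, honoring users' removal.
```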

The second possible path for safely sharing social media data is to use data repositories that offer varying levels of restriction and access. For example, when Ryan Gallagher and his colleagues collected tweets for a study about the #MeToo movement and online disclosures of sexual violence, they chose to share tweet IDs for the underlying data through the Inter-University Consortium for Political and Social Research (ICPSR), housed at the University of Michigan (Gallagher et al.). ICPSR is a data repository commonly used in the social sciences and affiliated with more than 780 universities and research organizations around the world. To download any data from this repository, interested parties must agree to terms of responsible use, which include pledges to protect users’ privacy and commitments not to redistribute or sell data. ICPSR also allows researchers to place restrictions on who can access their data—only affiliated ICPSR members, for example, or, more restrictively, only affiliated ICPSR members who fill out an extensive application and gain IRB approval. Because of the sensitive nature of #MeToo tweets, Gallagher et al. chose to place strict restrictions on their data.20 To access it, a researcher must submit an application package that includes, among other things, IRB approval, an approved security plan, and a confidentiality pledge. I believe that data repositories like ICPSR may also be useful for humanistic research with social media data. In fact, ICPSR is developing a repository specifically designed for social media data, the Social Media Archive (SOMAR), which will make it particularly useful (Hemphill, Leonard, and Hedstrom; Hemphill). More important than the specific data repository, however, is the larger principle of controlling who data can be shared with, and by what means it can be shared. For scholars who wish to share full social media datasets, especially if the data involves sensitive information, I recommend placing restrictions on who can access it.

Living Data

Computational reception studies is not just a future chapter of quantitative literary studies—as I have shown, it is a chapter already being written. But the living data that computational reception studies often relies on (and will increasingly rely on)—Goodreads reviews, fanfiction stories, Tumblr comments, and more—requires a rethinking of how we cite texts, how we share data, and how we engage with the communities who produce these texts and data. Moving forward, we must listen to and participate more fully in data ethics conversations happening in other areas of the digital humanities and in the social sciences, and we must recognize our deep connections to both of these fields. The ethical complexities involved with studying social media data can be overwhelming, especially for scholars who have not worked with social media data before. But if we take the time to understand current best practices and ethical guidelines, we can continue to author this exhilarating new chapter of quantitative literary studies and the digital humanities.

This type of research can also help advance a goal that has long been associated with the digital humanities: connecting our work back to the communities that initially inspired it. Among Dym and Fiesler’s findings on fanfiction data was that, “despite some risk, many of [their] participants were excited about the idea of research shining a light on the practices and communities of fandom.” I have witnessed similar excitement when interacting with Twitter and Goodreads users. Some of them have sent me long direct messages explaining why they love books and publishing reviews online, and some have been unexpectedly eager to learn about my research. It has been a surprise and joy to meet the people behind my data, and it has made me ever more aware that this data is, in the words of Dym and Fiesler, living data. Recognizing and being responsible to the people behind social media data may be one of the most challenging aspects of future research in this area, but as I have found in my own work, it may also be one of the most rewarding.

Notes

  1. For more on the history of institutional review boards and the egregious abuses that motivated them, see Won Oak Kim, “Institutional Review Board (IRB) and Ethical Issues in Clinical Research,” Korean Journal of Anesthesiology 62, no. 1 (January 2012): 3–12, https://doi.org/10.4097/kjae.2012.62.1.3; Todd W. Rice, “The Historical, Ethical, and Legal Background of Human-Subjects Research,” Respiratory Care 53, no. 10 (2008): 5; Patricia Cohen, “As Ethics Panels Expand Grip, No Field Is Off Limits,” New York Times, February 28, 2007, Arts, https://www.nytimes.com/2007/02/28/arts/28board.html.

    Return to note reference.

  2. Some IRB offices still recommend caution with publicly available data, however. For example, as of 2013, Cornell University’s Office of Research Integrity and Assurance recommended that researchers who use publicly available social media data seek “formal confirmation of non-human participant research status for the study . . . because of the emerging ethical sensitivities in this area” (see https://researchservices.cornell.edu/sites/default/files/2019-05/IRB%20Policy%2020.pdf).

    Return to note reference.

  3. See Bailey.

    Return to note reference.

  4. Though the subfield that encompasses this work does not yet have a coherent or widely recognized name, I offer computational reception studies as one potentially unifying term. I use the term reception studies because it capaciously captures the diversity of data relevant to the subfield, and I use the term computational rather than quantitative because computation holds special consequences in this area (e.g., the computational collection of users’ data).

    Return to note reference.

  5. See also DeWitt, who points out that readers are often “central” to Franco Moretti’s arguments yet completely “absent from his evidence” (162). Similarly, Tatlock and colleagues claim that quantitative approaches in literary studies tend “to downplay reader agency and heighten attention to ‘objective’ textual features.” However, they emphasize that the same methods can also be used to analyze reading behavior and “enhance our understanding of how meaning is co-constructed” (Tatlock et al.).

    Return to note reference.

  6. See Ryan Cordell and David Smith’s 2017 “Viral Texts Project: Mapping Networks of Reprinting in 19th-Century Newspapers and Magazines,” http://viraltexts.org; Lincoln Mullen, “America’s Public Bible: Biblical Quotations in U.S. Newspapers,” http://americaspublicbible.org/; “Reading Chicago Reading,” https://dh.depaul.press/reading-chicago/; and “Open Syllabus,” https://opensyllabus.org/.

    Return to note reference.

  7. See Black, who similarly argues that “using the Internet as a data source would afford access to text written by both professionals and amateurs, a distinction that is often not available when working with more formal archives” (103).

    Return to note reference.

  8. Emphasizing the usual absence of firsthand evidence from readers, Richard D. Altick once said of Victorian readers that “the great majority of the boys and girls and men and women into whose hands fell copies of cheap classic reprints did not leave any printed record of their pleasure. Only occasionally did the mute, inglorious common reader take pen in hand.” Altick, “From Aldine to Everyman: Cheap Reprint Series of the English Classics 1830–1906,” Studies in Bibliography 11 (1958): 3–24, https://www.jstor.org/stable/40371227.

    Return to note reference.

  9. Scraping data from the web is by no means free of complication. As my work with Maria Antoniak shows, Goodreads and its parent company Amazon purposely limit the amount of review data that can be accessed from the website (Walsh and Antoniak).

    Return to note reference.

  10. Similarly, Micah Bateman traces references to poets such as Maya Angelou and Audre Lorde in the tweets of Democratic politicians, arguing that the citation of these Black poets is strategically intended to mark the politicians as progressive. Micah Bateman, Lyric Publics: The Uses of Poetry in American Social Media Campaigns (PhD diss., University of Texas at Austin, 2021). He also traces quotations of Bertolt Brecht in Trump-era tweets; Micah Bateman, “Tweeting (in) ‘Dark Times’: Brecht’s Second Svendborg ‘Motto’ Post-Trump,” Ecibs: Communications of the International Brecht Society, no. 2020:1 (April 6, 2020), https://e-cibs.org/issue-2020-1/#bateman.

    Return to note reference.

  11. Alexandra Edwards, PhD (@nonmodernist), “And I have to wonder . . . did they go through IRB for this?” Twitter, December 19, 2019, https://twitter.com/nonmodernist/status/1207804598414647296.

    Return to note reference.

  12. This emphasis on the contextual nature of privacy echoes another principle of the Feminist Data Manifest-NO, which asserts that “risk and harm associated with data practices can[not] be bounded to mean the same thing for everyone, everywhere, at every time” and that “historical and systemic patterns of violence and exploitation produce differential vulnerabilities for communities” (see Chapter 8, The Feminist Data Manifest-NO).

    Return to note reference.

  13. “We view hashtag users and creators as researchers themselves, and we see part of our charge as practicing a more egalitarian model of research whereby our ‘subjects’ are understood to be collaborators, particularly in light of the way some researchers have exploited prominent Twitter and hashtag users. We shift this practice of potential harm by working collaboratively, ensuring that creative voices are front and center” (Jackson, Bailey, and Foucault Welles, xl).

    Return to note reference.

  14. See especially Busse (12–13) for an insightful reflection on why hard permission policies are not always tenable.

    Return to note reference.

  15. As Moya Bailey puts it, “Digital Humanists interested in conducting research that is ethical and feminist must go beyond the simple politics of citation, as citation itself may be the thing that creates the harm to the community.”

  16. Busse (12–13) discusses encountering similar issues with regard to permission.

  17. Freelon, McIlwain, and Clark note that 3,000 followers placed a user in the top one percent of the most followed Twitter accounts at that time.

  18. On its About web page, the Journal of Cultural Analytics explains various policies, including its Data Sharing Policy.

  19. Ed Summers, “On Forgetting,” Medium, May 19, 2017, https://medium.com/on-archivy/on-forgetting-e01a2b95272.

  20. See also Ryan J. Gallagher, Elizabeth Stowell, Andrea G. Parker, and Brooke Foucault Welles, “#MeToo Tweet IDs, October 15–28, 2017,” Inter-University Consortium for Political and Social Research, November 11, 2019, https://doi.org/10.3886/ICPSR37447.V1.

Bibliography

  1. Antoniak, Maria, David Mimno, and Karen Levy. “Narrative Paths and Negotiation of Power in Birth Stories.” Proceedings of the ACM on Human-Computer Interaction 3, CSCW (November 2019): 1–27, https://doi.org/10.1145/3359190.

  2. Ayers, John W., Theodore L. Caputi, Camille Nebeker, and Mark Dredze. “Don’t Quote Me: Reverse Identification of Research Participants in Social Media Studies.” npj Digital Medicine 1, no. 1 (2018): 1–2, https://doi.org/10.1038/s41746-018-0036-2.

  3. Bailey, Moya. “#transform(ing)DH Writing and Research: An Autoethnography of Digital Humanities and Feminist Ethics.” DHQ: Digital Humanities Quarterly 9, no. 2 (2015), http://www.digitalhumanities.org/dhq/vol/9/2/000209/000209.html.

  4. Black, Michael L. “The World Wide Web as Complex Data Set: Expanding the Digital Humanities into the Twentieth Century and Beyond through Internet Research.” International Journal of Humanities and Arts Computing 10, no. 1 (2016): 95–109, https://doi.org/10.3366/ijhac.2016.0162.

  5. Bode, Katherine. “The Equivalence of ‘Close’ and ‘Distant’ Reading; or, Toward a New Object for Data-Rich Literary History.” Modern Language Quarterly 78, no. 1 (2017): 77–106, https://doi.org/10.1215/00267929-3699787.

  6. Bourrier, Karen, and Mike Thelwall. “The Social Lives of Books: Reading Victorian Literature on Goodreads.” Journal of Cultural Analytics 5, no. 1 (February 2020), https://doi.org/10.22148/001c.12049.

  7. Bruckman, Amy. “Studying the Amateur Artist: A Perspective on Disguising Data Collected in Human Subjects Research on the Internet.” Ethics and Information Technology 4, no. 3 (2002): 217–31, https://doi.org/10.1023/A:1021316409277.

  8. Busse, Kristina. “The Ethics of Studying Online Fandom.” In The Routledge Companion to Media Fandom, edited by Melissa A. Click and Suzanne Scott, 9–17. New York: Routledge, 2017, https://doi.org/10.4324/9781315637518-3.

  9. DeWitt, Anne. “Advances in the Visualization of Data: The Network of Genre in the Victorian Periodical Press.” Victorian Periodicals Review 48, no. 2 (2015): 161–82, https://doi.org/10.1353/vpr.2015.0030.

  10. Dombrowski, Quinn, Steele Douris, and Masha Gorshkova. “Harry Potter and the Global Phenomenon of Fanfic.” Presentation at the Center for Spatial and Textual Analysis (CESTA) Seminar, January 21, 2020, https://cesta.stanford.edu/events/cesta-seminar-harry-potter-and-global-phenomenon-fanfic.

  11. Douris, Steele, Mark Algee-Hewitt, and David McClure. “Fanfiction: Generic Genesis and Evolution.” Stanford Literary Lab. Accessed August 24, 2022, https://litlab.stanford.edu/projects/.

  12. Dym, Brianna, and Casey Fiesler. “Ethical and Privacy Considerations for Research Using Online Fandom Data.” Transformative Works and Cultures 33 (June 2020), https://doi.org/10.3983/twc.2020.1733.

  13. English, Jim, Lyle Ungar, Rahul Dhakecha, and Scott Enderle. “Mining Goodreads: Literary Reception Studies at Scale.” Price Lab for Digital Humanities. Accessed August 24, 2022, https://pricelab.sas.upenn.edu/projects/goodreads-project.

  14. Fiesler, Casey, and Nicholas Proferes. “‘Participant’ Perceptions of Twitter Research Ethics.” Social Media + Society 4, no. 1 (2018), https://doi.org/10.1177/2056305118763366.

  15. Freelon, Deen, Charlton D. McIlwain, and Meredith Clark. “Beyond the Hashtags: #Ferguson, #Blacklivesmatter, and the Online Struggle for Offline Justice.” Center for Media & Social Impact, American University. 2016, https://doi.org/10.2139/ssrn.2747066.

  16. Gallagher, Ryan J., Elizabeth Stowell, Andrea G. Parker, and Brooke Foucault Welles. “Reclaiming Stigmatized Narratives: The Networked Disclosure Landscape of #MeToo.” Proceedings of the ACM on Human-Computer Interaction 3, CSCW (November 2019): 1–30, https://doi.org/10.1145/3359198.

  17. Hegel, Allison. “Social Reading in the Digital Age.” PhD diss., University of California, Los Angeles, 2018, https://search.proquest.com/pqdtglobal/docview/2061506661/3AEADF32C9E44C24PQ/1.

  18. Hemphill, Libby. “Updates on ICPSR’s Social Media Archive (SOMAR).” Presentation, May 30, 2019, https://doi.org/10.5281/zenodo.3612677.

  19. Hemphill, Libby, Susan H. Leonard, and Margaret Hedstrom. “Developing a Social Media Archive at ICPSR.” In Proceedings of Web Archiving and Digital Libraries (WADL’18). New York: Association for Computing Machinery, 2018, https://pdfs.semanticscholar.org/3006/94cfbcb169c55bd14331a493d983aaa352ed.pdf.

  20. Jackson, Sarah J., Moya Bailey, and Brooke Foucault Welles. #HashtagActivism: Networks of Race and Gender Justice. Cambridge, Mass.: MIT Press, 2020.

  21. Journal of Cultural Analytics. “About the Journal: Data Sharing Policy.” Accessed December 1, 2020, https://culturalanalytics.org/about.

  22. Jules, Bergis, Ed Summers, and Vernon Mitchell. “Ethical Considerations for Archiving Social Media Content Generated by Contemporary Social Movements: Challenges, Opportunities, and Recommendations.” Documenting the Now White Paper. April 2018, https://www.docnow.io/docs/docnow-whitepaper-2018.pdf.

  23. Kim, Dorothy, and Eunsong Kim. “The #TwitterEthics Manifesto.” Model View Culture (blog). April 7, 2014, https://modelviewculture.com/pieces/the-twitterethics-manifesto.

  24. Kirschenbaum, Matthew. “What Is an @uthor?” Los Angeles Review of Books. February 6, 2015, https://lareviewofbooks.org/article/uthor/.

  25. Klein, Lauren F. “Dimensions of Scale: Invisible Labor, Editorial Work, and the Future of Quantitative Literary Studies.” PMLA 135, no. 1 (January 1, 2020): 23–39, https://doi.org/10.1632/pmla.2020.135.1.23.

  26. Manshel, Alexander, Laura B. McGrath, and J. D. Porter. “Who Cares about Literary Prizes?” Public Books. September 3, 2019, https://www.publicbooks.org/who-cares-about-literary-prizes/.

  27. Markham, Annette. “Fabrication as Ethical Practice.” Information, Communication & Society 15, no. 3 (April 2012): 334–53, https://doi.org/10.1080/1369118X.2011.641993.

  28. Mason, Jessica. “Researchers Use Algorithm to Mansplain Fanfiction and Miss the Whole Point.” The Mary Sue. December 20, 2019, https://www.themarysue.com/researchers-use-algorithm-fanfiction/.

  29. McGrath, Laura B. “America’s Next Top Novel.” Post45. April 8, 2020, https://post45.org/2020/04/americas-next-top-novel/.

  30. Piper, Andrew, and Richard Jean So. “Study Shows Books Can Bring Republicans and Democrats Together.” The Guardian. October 12, 2016, https://www.theguardian.com/books/2016/oct/12/goodreads-study-books-bridge-political-divide-america.

  31. Porter, J. D. “Popularity/Prestige.” Stanford Literary Lab, pamphlet 17 (September 2018), https://litlab.stanford.edu/LiteraryLabPamphlet17.pdf.

  32. Radway, Janice A. Reading the Romance: Women, Patriarchy, and Popular Literature. Chapel Hill: University of North Carolina Press, 1991.

  33. Shepard, Louisa. “‘May the Force Be with You’ and Other Fan Fiction Favorites.” Penn Today. December 18, 2019, https://penntoday.upenn.edu/news/penn-digital-humanities-fan-fiction-meter-star-wars.

  34. Tatlock, Lynne, Matt Erlin, Douglas Knox, and Stephen Pentecost. “Crossing Over: Gendered Reading Formations at the Muncie Public Library, 1891–1902.” Journal of Cultural Analytics 3, no. 3 (March 2018), https://culturalanalytics.org/article/11038.

  35. Tompkins, Jane P., ed. Reader-Response Criticism: From Formalism to Post-Structuralism. Baltimore: Johns Hopkins University Press, 1980.

  36. Underwood, Ted. “No Such Thing as Bad Publicity: Toward a Distant Reading of Reception.” Presentation at the Modern Language Association Annual Convention, Seattle, Washington, January 10, 2020, https://tedunderwood.github.io/badpublicity/.

  37. Walsh, Melanie. “Tweets of a Native Son: The Quotation and Recirculation of James Baldwin from Black Power to #BlackLivesMatter.” American Quarterly 70, no. 3 (2018): 531–59.

  38. Walsh, Melanie, and Maria Antoniak. “The Goodreads ‘Classics’: A Computational Study of Readers, Amazon, and Crowdsourced Amateur Criticism.” Post45 1, no. 7, and Journal of Cultural Analytics 6, no. 2 (April 2021).
