Voices from the Server Room
Humanists in High-Performance Computing
Quinn Dombrowski, Tassie Gniady, David Kloster, Megan Meredith-Lobay, Jeffrey Tharsen, and Lee Zickel
High-performance computing (HPC) is a technical resource that can vastly reduce the processing time necessary for computationally intensive digital humanities work. HPC is generally considered to involve aggregating or parallelizing computational resources. Tasks that benefit include very large-scale optical character recognition (hundreds of thousands of pages or more), photogrammetry for generating 3D models, and dependency parsing corpora of thousands of documents for text analysis. While centrally run HPC clusters with free or low-cost access for faculty are common service offerings at research universities, national HPC resources in the United States and Canada are available to anyone affiliated with a university or college. In the annual usage reports prepared by HPC service providers, disciplinary diversity is valued as a way to show that the considerable funds directed toward HPC are serving the entire research community and not just the “usual suspects” in astrophysics or computational chemistry. In this context, humanities examples of HPC use are particularly valued. And yet, despite the utility of HPC for certain kinds of digital humanities (DH) work and the desire of HPC staff for humanists to use their services, HPC use by humanists remains rare and is often marked by frustration on both sides.
This chapter, focusing on HPC labor and the North American institutions in which the resources are located, draws on the experiences of six humanists who work (or have worked) in HPC support roles at institutions or national organizations in order to identify some of the barriers to HPC use by humanists and to assess the ways in which humanists can better engage with and in institutional HPC infrastructure, locally or nationally. Awareness of potential pitfalls and ways to mitigate them is important for managing the expectations of both the new HPC user and the HPC providers. In this chapter, we describe the value of having humanities disciplinary experts as part of HPC support staff, with the goal of offering readers arguments to advocate for these roles at their institutions or as part of national HPC centers open to all institutionally affiliated scholars. Particularly for scholars who do not have access to local HPC staff, we describe the peer support resources available through three national centers and how scholars can get involved with those groups. Finally, given the expanding number of HPC support roles open to humanists, we describe the skills necessary to succeed in such a role and the diverse ways of pursuing such a professional pathway.
In addition, we argue for the ongoing need for humanists on the service provider side of HPC and advocate for greater consideration of HPC support as a meaningful alt-ac career path.
Humanities Use Cases for HPC
The growing availability of large datasets, such as those created and maintained by cultural heritage institutions, has increased the likelihood that scholars in the humanities may be confronted with more data than they can realistically process on their own computers. At the same time, there have been advances in useful but computationally expensive methods, such as the creation and application of machine learning models for everything from optical character recognition (OCR) to text analysis and image classification. Together, these developments have increased the need for humanists to have access to more powerful hardware than has traditionally been common outside of fields that process multimedia.
Despite the demand for better computational resources stemming from more data and more complex methods, the number of projects that have been able to take advantage of the HPC resources offered locally at R1 institutions and nationally for any institutionally affiliated scholar in the United States and Canada has been limited.1 Typically, projects that have successfully used HPC have involved collaboration with DH-aware staff on HPC support teams or staff with a mix of technical and disciplinary expertise in the library, DH centers, or similar groups. Melissa Terras and colleagues describe a series of case studies where a team collaborated with humanities scholars to help them use large datasets from the British Library—one involving tracking the term “cholera” over time and another tracking changes in the size and technique of images as they corresponded to book genre. A number of projects have used HPC to perform OCR at scale (e.g., Köntges et al.) or develop systems to improve OCR quality for particular kinds of text (e.g., Christy et al.). Others describe the use of HPC for photogrammetry to generate high-quality 3D models for researchers in fields including cultural heritage and architecture (Gniady et al.; Ruan et al.). HPC also has a role in managing large textual corpora at the Stanford Literary Lab (McClure et al.). And work such as the large-scale investigations of maps without first rendering them as vectors, as described by Katherine McDonough in “Maps as Data” in this volume, is likewise well-suited to HPC.
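To make the first of these case studies concrete, the following is a minimal Python sketch of tracking a term over time. It is not the method used by Terras and colleagues; the function name `term_counts_by_year`, the (year, text) input shape, and the sample sentences are all assumptions made for illustration. At the scale of a national library collection, this per-document counting step is the kind of work that would be spread across many HPC nodes.

```python
import re
from collections import Counter


def term_counts_by_year(documents, term):
    """Count whole-word, case-insensitive occurrences of `term` per year.

    `documents` is an iterable of (year, text) pairs. This input shape is
    an assumption made for illustration, not a format used by any project
    cited in this chapter.
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    counts = Counter()
    for year, text in documents:
        # Adding zero still records the year, so years without the term
        # appear in the result instead of silently vanishing.
        counts[year] += len(pattern.findall(text))
    return dict(sorted(counts.items()))


# Tiny invented sample; a real corpus would hold many thousands of volumes.
sample = [
    (1831, "Cholera reached the port. The cholera spread inland."),
    (1832, "Reports of cholera continued."),
    (1850, "The harvest was good this year."),
]

print(term_counts_by_year(sample, "cholera"))
```

Run on the three invented sample documents, this prints `{1831: 2, 1832: 1, 1850: 0}`.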
But just because there is a match between the technical needs of humanists doing the kinds of work described here and the computational resources provided by HPC centers locally or nationally, it does not mean that there is always an easy path for humanists to use HPC. In the next section, we describe the translation work that needs to take place to go from a humanist’s needs (e.g., around large-scale text or image analysis or computationally intensive processing tasks like OCR or dependency parsing a corpus) to a successful engagement with HPC service providers.
Domain Translation
When we talk about support for HPC in the humanities, it helps to broaden the view to think less about large servers and compute nodes and more about a digital research infrastructure. What is digital research infrastructure? The PARTHENOS training project, part of the European Commission DARIAH network, defines research infrastructures for humanities as “shared, unbounded, heterogeneous, open, and evolving socio-technical systems comprising an installed base of diverse information technology capabilities and their user, operations, and design communities.”2 This definition works for digital research using HPC as well as for any other research computing context. But how do the needs of DH scholars differ from those of other disciplines, and how can institutions (including national HPC centers) address those needs through the development of digital research infrastructures that better support humanities scholars?
The area in which DH scholars benefit the most from domain-specific HPC support is in translating humanistic inquiries into questions that a computer can answer. The questions posed by humanists are often fundamentally different from the mathematical or scientific questions historically solved by computing clusters. John Unsworth, in a 2006 report on cyberinfrastructure for the humanities and social sciences from the American Council of Learned Societies, wrote:
Humanities scholars and social scientists will require similar facilities but, obviously, not exactly the same ones: grids of computational centers are needed in the humanities and social sciences, but they will have to be staffed with different kinds of subject-area experts; comprehensive and well-curated libraries of digital objects will certainly be needed, but the objects themselves will be different from those used in the sciences; software toolkits for projects involving data-mining and data-visualization could be shared across the sciences, humanities, and social sciences, but only up to the point where the nature of the data begins to shape the nature of the tools.3
Importance of Local Support
Supporting this domain-to-domain translation for DH scholars wishing to use HPC can best be realized with local, institutionally situated support from members of the HPC support staff.4 Having a staff member on the HPC team who understands who is doing DH and how they are doing DH at the institution can increase the success of researchers wanting to access the infrastructure. The diversity of DH needs can also be addressed when that staff member understands the backgrounds of other members of the HPC support team, as well as their ability and willingness to work with humanists, and can act as a go-between where necessary.
It is well known that DH is a field often defined more by its broad scope than by specific tools or methods common across the field. Entire institutions themselves often specialize in only a few genres of DH. One institution may have a number of people working on large-scale text analysis of thousands of written documents and centered in an English department, while another may have archaeologists whose datasets are a heterogeneous mix of text, images, and physical objects, such as soil samples and artifacts, and the digital analogues of the same. Lack of support at the local level can mean that researchers are unable to bridge the gaps between their disciplinary needs and the infrastructure available at the institution.
In addition to helping individual researchers, advocating upward for digital research infrastructure that supports DH is an important part of a humanities liaison role. When money is available for system improvements, it is more likely to go toward the kinds of features that make HPC more accessible for humanists (e.g., user-friendly portals of various kinds) if someone on the HPC support staff is advocating for them and soliciting use cases from local humanities faculty. For an institution to achieve something close to Unsworth’s vision of “grids of computational centers,” there needs to be support from staff who understand humanities research, are well versed in how humanities research translates into digital research, and can articulate the infrastructure needs of the humanities community to those who build and support that infrastructure.
This brings us to the challenges facing DH scholars who do not have access to someone with experience in both humanities research and HPC systems. Though such a researcher may be able to access HPC infrastructure through their institution or nationally, receiving help on specific problems may require going through a service process or ticketing support service that may not address the full context of the challenges the researcher is facing. Researchers in this position can look toward national HPC platforms, such as the Digital Research Alliance of Canada or ACCESS, for help.5 Looking toward a group like this, even if it is not in your home country, is one way of finding others with similar challenges and receiving help with best practices.
Navigating Culture Clashes in HPC Support
The translation needed by most humanists to successfully engage with HPC is not limited to bridging the gap between humanities research questions and the specific technical affordances of HPC resources. The presentation of HPC service offerings and the customer service style typically employed in addressing requests and questions from users are steeped in scientific culture and expectations. Long-standing HPC support staff are accustomed to a particular kind of collegially terse exchange with their expected users: scholars for whom programming is an integral, basic part of their academic training and for whom navigating command-line interfaces is as intuitive and familiar as humanists find navigating bibliographies. The move toward opening HPC to all scholars as a core resource has had some welcome effects from the perspective of HPC support teams (i.e., additional funding), but many of the new users, including scholars from the humanities, do not have the background knowledge that HPC staff are used to.
Humanists are not alone here: computational biologists, psychologists, and many others do not share the expectations and practices of the mainstream core HPC user bloc from fields like astrophysics, nuclear engineering, and chemistry. A question about how to run code on the cluster may be “answered” with a one-line response containing a link to highly technical documentation that covers specific parameters and variables used by that individual cluster but none of the scaffolding needed to get to the point where one could make use of that information. Important information that can often be overlooked includes how to log in, how and where to transfer files, how compute-time allocations work or how long one might wait in a queue, the importance of parallelization and how to do it, and how to install software or run containers. A synchronous conversation can surface these gaps quickly, but a user facing documentation in isolation may not even realize what prerequisites have been omitted until they can’t make the instructions “work.” These kinds of exchanges are highly frustrating for both parties: HPC staff grow annoyed that their usual responses are met with bafflement instead of gratitude, and scholars outside traditional HPC disciplines conclude that HPC is too complicated and the staff can’t be counted on to make it any easier.
One way to improve this relationship could be to implement technical scaffolding documents as well as documentation with some basic explanation of why a process works and what it is doing. A humanities domain expert on the HPC support team is well positioned to develop these kinds of documents and help both nontraditional HPC users and HPC support staff navigate moments of culture clash.
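As one illustration of what such scaffolding might explain, the idea of parallelization mentioned above can be sketched in a few lines of Python: a corpus is divided so that each of several independent tasks processes only its own share of the files. The sketch assumes a cluster scheduled with Slurm, where each task in a job array can read its index from the `SLURM_ARRAY_TASK_ID` environment variable; the function name `shard_for_task`, the file names, and the array size of four are hypothetical and chosen only for illustration.

```python
import os


def shard_for_task(items, task_id, n_tasks):
    """Return the share of `items` that one of `n_tasks` parallel tasks handles.

    Assignment is round-robin: task 0 takes items 0, n, 2n, ...; task 1
    takes items 1, n + 1, ...; and so on. Every item lands in exactly one
    shard, so the tasks never duplicate each other's work.
    """
    if not 0 <= task_id < n_tasks:
        raise ValueError("task_id must be in range(n_tasks)")
    return items[task_id::n_tasks]


# Hypothetical file names standing in for a corpus of page images or texts.
corpus = [f"page_{i:04d}.txt" for i in range(10)]

# Assumption: under Slurm, a job array exposes each task's index in
# SLURM_ARRAY_TASK_ID. Defaulting to 0 lets the sketch run on a laptop too.
task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
n_tasks = 4  # assumed array size, for illustration only

my_files = shard_for_task(corpus, task_id, n_tasks)
print(f"task {task_id} of {n_tasks} handles {len(my_files)} files")
```

The design choice worth explaining to a new user is that the tasks never communicate: each one works out its own share from its index alone, which is why this kind of job scales simply by asking the scheduler for more tasks.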
While the addition of humanists to HPC staff groups is a significant step toward better support for nontraditional HPC users, the overall impact is limited when the HPC team takes the approach of directing all “unusual” requests to the staff member with the unusual background rather than adopting a more integrated cross-training model. Humanists working with HPC groups (at their institution or at national organizations) that lack a “translator” can benefit from reaching out to colleagues in similar disciplines who use HPC to see if they can help navigate the process or serve as a kind of advocate for humanists with the HPC group.
Online workshop materials are another resource to support those unfamiliar with HPC, especially complete beginners. An example is HPC Carpentry.6 These self-directed courses may be daunting for scholars who have never worked with the command line or code or those who prefer some hands-on training. However, working through some preliminary Programming Historian tutorials that explain the command line may be enough of an on-ramp to then work through HPC Carpentry successfully.7 An introductory overview like HPC Carpentry can, at a minimum, provide scholars with enough of a vocabulary to ask the kinds of specific questions that are more likely to get a useful answer from HPC staff if there is no humanities domain expert who can serve as a translator.
Peer Support Groups for HPC
National HPC organizations, such as the Digital Research Alliance of Canada, ACCESS, or the HathiTrust Research Center (HTRC), can play a significant role in ensuring that the entire DH community, regardless of individual institutional resources, is not left out of the HPC ecosystem and its opportunities. The national reach of these organizations allows them to interact with the DH community across broad geographic areas and collaborate more easily across institutional boundaries. Each of these organizations has its own emphasis and strategies for user engagement, as described below.
The Digital Research Alliance of Canada (The Alliance)
The Digital Research Alliance of Canada (formerly Compute Canada) uses the model of a national disciplinary support team. One opportunity for DH scholars to build on this model would be through discipline-specific support teams within a broader DH team. This could take the shape of discipline-specific mini teams within national HPC organizations, headed by a member of the national team with members drawn from the research communities. A national disciplinary support team could also encourage more collaboration among the researchers using national HPC systems. This is a strategy that would not only work in the humanities but would help all researchers using national digital research infrastructure to share best practices and stories of things that went well and things that could have gone better and to give coordinated feedback to national coordination teams on user needs, challenges, and opportunities. An example might be a group made up of all DH teams hosting a project on the Alliance infrastructure using the Islandora platform or a group of researchers whose projects are all hosted on Alliance cloud infrastructure. There is also room for HPC practitioners at the national level to engage in prototyping for the community on national infrastructure in order to allow DH scholars to more easily use the infrastructure and integrate datasets from outside sources such as the cultural heritage datasets (described in Terras et al., “Enabling Complex Analysis of Large-Scale Digital Collections”). Finally, DH researchers, with some coordination at the national level, should take a more active role in helping to shape future funding calls that take into account the needs of DH scholars in terms of sustainability of web-based research projects and the need for expertise in development and tool-building.
ACCESS
Even though ACCESS (previously XSEDE) and its service providers are based in the United States, ACCESS has community groups to help promote global outreach around HPC resources. ACCESS stands for Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support. The program is funded by the National Science Foundation, yet it offers help for DH practitioners in the form of MATCH Services. With MATCH Services the researcher submits an engagement request and a mentor and student are assigned based on the project needs, including DH projects. Additionally, Campus Champions are still offered as an assistance option with ACCESS as they were with XSEDE. While some champions are domain specific, other champions are specific to an institution (campus champions) or a region (regional champions), the latter of which could be a resource for scholars whose institution does not have its own HPC services. In addition, ACCESS offers online training and YouTube video tutorials to help new users. However, concerns about gaps in The Alliance’s support are also applicable to ACCESS. Even with DH assistance, there is a need for training and workshops developed with humanists in mind. These humanist-specific trainings and workshops could be more easily created within a DH support team made up of humanists from different disciplines, as suggested above.
HTRC
Another global resource that offers HPC services is the HathiTrust Research Center. While the HTRC is not a traditional HPC resource in the same vein as ACCESS and The Alliance, it is an asset for humanists looking to leverage not only additional computing power but datasets, tools, and resources not available with other national or global HPC organizations. The HTRC allows users access to over 16 million digitized volumes in the HathiTrust Digital Library and provides built-in analytics tools as well as Linux data capsules where users can pull volumes both in and out of copyright and run various text analysis algorithms using tools like Voyant or their own scripts.8 In addition, the HTRC provides access to volume metadata and lets users create worksets of volumes and then share those worksets with other users, who can then pull them into their data capsules and run their own analyses. Finally, HTRC users are not limited to what is in the HathiTrust; they can transfer their own datasets via secure file transfer protocol (SFTP) or download them from their preferred cloud storage option while the capsule is in maintenance mode. This allows users to combine their texts or other data with other texts or data in the HathiTrust. While these tools and resources are invaluable, the HTRC data capsules only allow a default maximum of twenty gigabytes of RAM and ten CPUs. This makes a data capsule about as powerful as a decent laptop. However, if more computing power is needed for special projects, it is possible to request additional resources, apply for special project grant money outside of HTRC, or apply to be one of the Advanced Collaborative Support (ACS) projects offered yearly by the HTRC. The ACS projects come with specialized expertise, developer time, and compute resources.
Outside the ACS projects, the HTRC has a combination of programmers and computer scientists as well as DH specialists on hand to provide support to users and help bridge the gap between humanists and developers.
Pursuing Professional Paths in HPC Support
HPC support groups have been expanding to better serve disciplinarily diverse audiences by hiring specialists from a wider variety of fields who have deep expertise in HPC. As a result, the kinds of humanities liaison roles we have described here have become a viable alt-ac career path for many humanities scholars. Unlike the digital scholarship librarian position, which is becoming an increasingly well-defined (and credentialed) career path, humanists come to HPC support roles through many different routes. Humanists working in HPC groups tasked with helping other humanists use available HPC resources need to be able to communicate effectively in both worlds, which means having training and experience in both domains. The academic paths taken by humanists in HPC support roles are as varied as the field of DH itself.
A quick look at the backgrounds of the authors of this chapter attests to this diversity. Many have advanced humanities degrees, and they acquired their computational expertise through vocational roles before earning their degrees, while earning their degrees, or after the fact. Others have multiple degrees, often combining a humanities degree with a degree in informatics, computer science, or library science with some emphasis in digital libraries or, increasingly, digital humanities. For example, Indiana University’s Department of Information and Library Sciences offers a specialization in digital humanities for its Master in Library Science and Master in Information Science degrees, the University of Central Florida offers a DH minor, and Loyola University Chicago offers an MA in DH.9 The common thread is that humanists in HPC (or even DH in general) have a degree in a humanities discipline, be it undergraduate or graduate, and a degree or vocational experience in a computational or informatics discipline.
Another popular degree trend is the aforementioned library science and/or information science degree or an MA in digital humanities. The appeal of earning these degrees is that they can be a one-stop shop for those seeking to gain the knowledge and skills necessary to acquire employment in a DH position, such as providing HPC support to humanists. For those with a library science (as opposed to information science) degree, a focus in digital libraries or digital humanities is helpful if a job working with humanists and HPCs is desired. However, even with the digital library or digital humanities focus, much on-the-job learning is necessary in order to help humanists with their HPC needs as these specializations, at best, provide only introductory training. Many universities house their DH centers in the library, and they are often funded, at least partly if not entirely, by the library. In the absence of a DH center, many academic libraries employ librarians whose job descriptions require skills that are well suited to help with potential DH projects. This close association between DH and libraries and library degrees has led to many taking this road to obtain the degree, knowledge, and skills necessary to gain employment in a DH position.
For individuals not currently holding a PhD, having an MLS/MIS degree is a qualifying alternative when applying for library-associated DH positions. The proliferation of these types of jobs has led to the notion that a library degree is an advantage in the field. This route has become so popular that the other degree paths mentioned above are often viewed as less conventional, even when bridging the HPC and humanities gap. However, many library-based DH specialists do not have the experience or training needed to help with HPC needs that may arise. This is why even with a library-based DH center or DH librarian, a person who understands HPCs in addition to humanities scholarship is necessary. While the MLS/MIS degrees can provide a good base to prepare for a DH position involving HPCs, it should be expected that the lion’s share of the learning will be done as part of the vocation until MLS/MIS degree programs begin to offer more courses on HPC systems. This would allow those interested in a DH career to qualify for DH positions in both the library-centric DH world as well as the emerging world of DH in the information technology (IT) centers supporting HPC systems. Until this happens, university IT centers looking to hire people who can bridge the gap between humanists and HPCs will need to understand that unless they happen to find one of the handful of people who are not already employed and already have experience or training with HPC systems, they will need to be patient and provide the necessary training.
Traditional and Nontraditional Paths to Expertise in Tech/Computer Science and DH
The traditional path to advanced computer science skill-building comes through a degree or concentration in computer science, often during undergrad, when basic and advanced algorithm design in languages like C++ or Java, training in a variety of data structures, and working daily within a command-line environment on an HPC cluster are all part of the standard curriculum. Few digital humanists take this route, however, as students wishing to pursue both humanistic and computer science training must balance their workload and fulfill degree requirements across divisions that often do not coordinate well. The far more common path is for a student pursuing a humanities degree to become aware of and interested in applications of technology (including computer science) to further their research and then to begin learning those technologies on an ad hoc basis with a research goal in mind. Other humanities-focused students have found a general knowledge of technology (including computer science) can be beneficial as they work to diversify their skill sets and serves to open professional career options that would otherwise not be available with solely a humanities degree.
In DH, even for those working in high-performance computing, all of these tracks can be viable options, depending on one’s area of specialization, overall skill set, and the primary needs of the research community. While formal training in computer science affords a deeper understanding of some forms of technology (algorithm design in particular), advanced competency and even expertise can be achieved by the dedicated student without it. The relevant skill set includes, but is not limited to, a scripting language such as Python or R, methods such as text analysis or network analysis, and technologies such as geographic information systems (GIS) and natural language processing (NLP).
Indeed, a broader skill set is likely more useful in the long run for translator positions and for supporting the needs of humanistic and social scientific research across the academic spectrum, especially when combined with an MLS or MIS degree and deep knowledge in a specific discipline such as art history, digital media, linguistics, or philology. Skills acquired within a computer science (CS) curriculum are generally easier for potential employers to measure than skills gained without certification. Researchers without a CS degree (or analogous certificate) wishing to demonstrate their competency will likely want to spend some time either working professionally in information technologies (for example, as a developer or software engineer, either in the private sector or at a public institution) or taking certification exams offered by leading information technology corporations like Microsoft, Google, or Amazon/AWS.
As with most disciplines, there are no real shortcuts to mastery of any part of DH. However, an achievable short-term goal is focusing on one area or specialization that will allow the digital humanist interested in working in HPC to become competent, proficient, and ultimately an expert in one part of the spectrum that makes up the current range of digital and computational approaches to humanistic inquiry. Many of these core skills can also be learned in concert; for example, most modern programming languages tend to use similar logical structures, and gaining an understanding of the command line and *NIX operating systems is excellent preparation for working within an HPC environment. Many skill sets within the geographical sciences tend to build on each other (from basic GIS to spatial data science to complex computational approaches to urban design). Similarly, a focus on textual data and corpus creation can lead to expertise in corpus linguistics, digital philology, or library collection and federation practices.
Another skill component to DH HPC is project management. Strong project management skills allow for the articulation of research goals into achievable and usually quantifiable outcomes.10 Experience working on and ultimately managing projects is an excellent way to learn how to identify potential pitfalls and roadblocks and how to avoid them, how to plan for the entire lifetime of a project, and how to help make the process as efficient and as appropriate for the needs of the project researchers and stakeholders as possible. We would again like to emphasize that the DH practitioner need not enter HPC as an expert programmer. The fact that humanities disciplinary liaisons are required to navigate different communication styles, levels of technical fluency, and types of documentation, as well as to advocate for humanities-oriented priorities in infrastructure development, means that high levels of coding proficiency are less crucial than in roles such as research software engineers (described by Damerow et al., “Of Coding and Quality,” in this volume). Skill-building and gaining expertise in DH methods using HPC resources is a process, and no single practitioner can reliably claim expertise in the entirety of the field. As is the case with most disciplines, a broad base of general knowledge supplemented by knowledge acquisition in particular areas can most reliably predict individual success.
National computing resources in the United States and Canada have made strides in opening up their infrastructures to humanities practitioners, not only by making available computing resources but by creating staff positions for disciplinary experts who can translate humanists’ questions and needs to HPC support staff, and vice versa. Many R1 institutions have likewise moved in this direction. In both local and national contexts, there continues to be a need for humanities scholars to advocate for better support for the kinds of work they are interested in doing with HPC, be it through the hiring of additional staff fluent in DH methods or by fostering communities of practice oriented toward humanities tools and methods.
The urgency around this kind of support has increased significantly in the wake of the Librarian of Congress granting a 2020 request for a new exemption to section 1201 of the Digital Millennium Copyright Act (DMCA) that supports the computational analysis of multimedia, including texts (in the form of encrypted ebooks) and video (in the form of encrypted DVDs). While this exemption paves the way for more U.S.-based humanities scholars to legally use cultural materials that have until recently been rendered inaccessible by digital rights management and other technological protection measures, the onerous security and access conditions on the exemption will inevitably lead humanities scholars to the doorstep of HPC support groups, with requests akin to those from the medical school.
The likely expansion of HPC support with an eye toward a growing humanities user base will foster new humanities career tracks. For students in the humanities who are considering different career options, the path toward becoming a humanities domain specialist in an HPC group shares some overlap with paths for digital scholarship librarians, developers, or research software engineers. However, the lack of HPC-specific courses in programs such as LIS degrees means some degree of personal motivation and self-study is likely part of the journey. Having more humanities students develop some exposure to the research methods that HPC enables, and then take up HPC staff roles themselves, will lead to increasing support for humanists using computational methods. In the meantime, scholars—particularly faculty with the power to influence their institutions or advocate nationally—can make strides toward this vision by developing some technical know-how, through the formal and informal paths to mastery we have laid out, and engaging with HPC support staff, if only to speak up for their needs. This combination of advocacy from senior scholars and skill-building among students who are then positioned to pursue careers in HPC support will establish a path for the successful and sustainable use of HPC in the humanities.
Notes
1. According to the Carnegie Classification of Institutions of Higher Education 2021 update, R1 institutions are those that give research a “very high” institutional priority. The classification includes “institutions that awarded at least 20 research/scholarship doctoral degrees during the update year and also institutions with below 20 research/scholarship doctoral degrees that awarded at least 30 professional practice doctoral degrees in at least 2 programs.” In addition, these institutions had “at least $5 million in total research expenditures” (https://carnegieclassifications.acenet.edu/carnegie-classification/classification-methodology/basic-classification/).
2. “Welcome to the PARTHENOS Project,” https://www.parthenos-project.eu/.
3. J. B. Unsworth, “Our Cultural Commonwealth: The Report of the American Council of Learned Societies Commission on Cyberinfrastructure for the Humanities and Social Sciences” (New York: ACLS, 2006), 8.
4. We recognize that not every institution will be able to provide local support, and we address other means of HPC support elsewhere. However, there are advantages to having someone on the ground who is able to facilitate HPC interaction and effectively advocate for changes, where necessary, to support practices in order to better meet humanists’ needs. Scholars using national-level cluster resources (e.g., when local resources do not exist at their institution) can also benefit from establishing a working relationship with a staff person at the national cluster who is familiar with humanities questions (e.g., the digital humanities specialist role at Compute Canada), though that person may not be able to provide the same level of attention and one-on-one support as a local, institutional person.
5. The Digital Research Alliance of Canada (https://alliancecan.ca/en/services/advanced-research-computing/national-services/humanities-and-social-sciences), for instance, has a national humanities and social sciences group that consults regularly with researchers and conducts outreach and training for the DH community. ACCESS (previously XSEDE; https://support.access-ci.org/match/overview) matches researchers with a mentor and a student based on their research project and computing needs.
7. As an example of a tutorial, see the “set up” category of lessons at Programming Historian here: https://programminghistorian.org/en/lessons/?topic=get-ready.
8. The main page for the built-in tools can be found at https://analytics.hathitrust.org/explore. Some examples include the HTRC Feature Reader (https://github.com/htrc/htrc-feature-reader), the workset builder (https://solr2.htrc.illinois.edu/solr-ef/), and use of data capsules (https://analytics.hathitrust.org/staticcapsules).
9. To learn more about the degrees at IU, UCF, and Loyola-Chicago, see, respectively, https://ils.indiana.edu/programs/specializations/digital-humanities.html, https://www.ucf.edu/degree/digital-humanities-minor/, and https://www.luc.edu/digitalhumanities/.
10. Currently, end goals/outcomes often come in the form of statistics to support a research conclusion, data visualizations, or interactive environments, but these are just a few of many viable (project) outcomes.
Bibliography
- Christy, Matthew, Anshul Gupta, Elizabeth Grumbach, Laura Mandell, Richard Furuta, and Ricardo Gutierrez-Osuna. “Mass Digitization of Early Modern Texts with Optical Character Recognition.” Journal on Computing and Cultural Heritage 11, no. 1 (December 2017): 1–25. https://doi.org/10.1145/3075645.
- Gniady, Tassie, Guangchen Ruan, William Sherman, Esen Tuna, and Eric Wernert. “Scalable Photogrammetry with High Performance Computing.” In PEARC ’17: Proceedings of the Practice and Experience in Advanced Research Computing 2017 on Sustainability, Success and Impact, art. 72, 1–3. New York: Association for Computing Machinery, 2017. https://doi.org/10.1145/3093338.3104174.
- Köntges, Thomas, Rhea Lesage, Bruce Robertson, Jennie Sellick, and Lucie Wall Stylianopoulos. “Open Greek and Latin: Digital Humanities in an Open Collaboration with Pedagogy.” IFLA World Library and Information Congress (2019). http://library.ifla.org/2551/1/178-kontges-en.pdf.
- McClure, David, Mark Algee-Hewitt, Steele Douris, Erik Fredner, and Hannah Walser. “Organizing Corpora at the Stanford Literary Lab. Balancing Simplicity and Flexibility in Metadata Management.” In Proceedings of the Workshop on Challenges in the Management of Large Corpora and Big Data and Natural Language Processing (CMLC-5+BigNLP) 2017, Including the Papers from the Web-as-Corpus (WAC-XI) Guest Section. Mannheim: Institut für Deutsche Sprache, 2017. https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/6261.
- Ruan, Guangchen, Eric Wernert, Tassie Gniady, and Esen Tuna. “High Performance Photogrammetry for Academic Research.” In PEARC ’18: Proceedings of the Practice and Experience on Advanced Research Computing, art. 45, 1–8. New York: Association for Computing Machinery, 2018. https://doi.org/10.1145/3219104.3219148.
- Terras, Melissa, James Baker, James Hetherington, David Beavan, Martin Zaltz Austwick, Anne Welsh, Helen O’Neill, et al. “Enabling Complex Analysis of Large-Scale Digital Collections: Humanities Research, High-Performance Computing, and Transforming Access to British Library Digital Collections.” Digital Scholarship in the Humanities 33, no. 2 (2018): 456–66. https://doi.org/10.1093/llc/fqx020.