Chapter 10
Digital Brush Talk
Challenges and Potential Connections in East Asian Digital Research
Aliz Horvath
Hitsudan 筆談, literally “brush talk,” refers to a common premodern means of communication, predominantly practiced by the literati of the East Asian region, constituting a live “written conversation” in classical Chinese.1 This was both possible and necessary because the major languages of the area, Japanese, Chinese, and Korean, are linguistically distant from one another, but they historically shared a common characteristic in the form of writing. Both Japan and Korea, while developing their own writing systems, adopted the use of classical Chinese script as well, thus making it a sort of “written lingua franca” in the Sinosphere. However, the dominance of China, the most important source of innovation and cultural products for its neighbors until approximately the early nineteenth century, gradually decreased with the increased interest of other areas in exploring their own individual heritage and thus strengthening their positions in the region. One manifestation of this phenomenon was the spread in the use of separate writing systems, the hangŭl in Korea and the kana system in Japan, which eventually influenced the channels of transnational communication as well.2
The importance of heterogeneity can be detected in modern scholarship as well, in the form of geographically distinct research specializations, although with a growing interest in and demand for transnational inquiries. This relative divisiveness and, particularly in the context of digital humanities, the difference in writing systems are among the most important aspects defining the landscape of East Asian digital humanities. This field is relatively young and small; the number of relevant publications is therefore very limited, but growing interest in specialized workshops and conferences will likely lead to an increase in this regard as well.
One factor that seems to further describe the impact of the digital in East Asian studies, besides the “spatial” or “geographical” differences, is what can be called a “temporal” divide. This latter element stems from the distinction between premodern texts written entirely in classical Chinese and more modern materials, where the content is dominated by simplified characters or locally created scripts. Dealing with non-Western corpora can always be challenging, but as I will later explain in more detail, modern materials generally present relatively fewer obstacles for scholars compared to premodern pieces, which often cannot be processed through the same tools, hindering the completion of projects involving them. Nevertheless, as I will argue, this problem could perhaps be lessened by combining the above-mentioned “spatial” and “temporal” aspects through a more systematic collaboration between scholars conducting digital projects using premodern corpora from various parts of the Sinosphere, or as Thomas Mullaney named it, the “kanjisphere” (“Controlling the Kanjisphere”). This is what I will refer to as “modern hitsudan.”
I will break up this question into two major parts. First, I will introduce the peculiarities of the different segments of digital practice in the context of East Asian studies, followed by a survey of the state of the field, focusing on a variety of recent initiatives that aim to tackle the problem and expand the usage of digital components. My own current research is based on the study of the Dai Nihonshi (History of Great Japan), the most monumental product of history writing in Japan, written entirely in kanbun (by and large classical Chinese with reading aids). The compilation of this history took about 250 years, spanning what we would call the early modern period (seventeenth to nineteenth centuries) in Japan. Owing to the nature of this corpus, the examples featured in this chapter will come predominantly from the Japanese field, but I will make repeated references to the cases of China and Korea as well.
Digitization and Data Management
As I mentioned above, besides the spatial distinction of digital projects in East Asian studies, there seems to exist a temporal divide as well. One of the most important reasons for this latter phenomenon is clearly related to the question of digitization and how we interpret the “digital.” We can discuss this from both an institutional and an individual perspective.
From an institutional point of view, the “digital” manifests itself predominantly in the form of texts available online, as in the case of the Academia Sinica (one of the pioneers of digitization in Taipei), the National Library of China in Beijing, or the National Diet Library in Tokyo. As major national repositories, these organizations can all be considered hubs and, to a large extent, starting points for research both domestically and abroad. Their range of materials is, for the most part, accessible without any particular difficulty, rendering the initial exploratory stage of research projects smoother (and more economical). But because many of these collections are not searchable, their usefulness for text mining can be limited.
What hinders the progress of these digitization initiatives in general, besides potential financial and copyright issues, are the linguistic characteristics of the Asian languages in question. The Chinese script consists of thousands of logographic characters (hanzi in Chinese)3 with numerous variations in their classical form, whereas the modern version of the Japanese writing system is essentially a combination of Chinese characters and the two types of syllabic kana. Modern Korean writing, on the other hand, is based on a limited number of hangŭl characters, occasionally supplemented in South Korea with classical Chinese hanzi (hanja in Korean) to eliminate ambiguities between homophonous words.
Another interesting feature of these writing systems, with the exception of modern Korean, is the lack of space between words and, in premodern texts, even between sentences. This point may appear to be more relevant to text mining, but certain premodern materials exist only in the cursive format—sometimes, as in Japanese female writing, in top-down diagonal rows, rendering the separation of characters, and thus the OCR process, exceedingly challenging.
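To make the consequence of unspaced text concrete, the following is a minimal sketch of greedy longest-match segmentation, one simple way of splitting an unspaced string into words with the help of a word list. It is an illustration only and not the method of any tool named in this chapter; the function name `segment`, the `max_len` parameter, and the toy lexicon are all assumptions introduced here.

```python
def segment(text, lexicon, max_len=4):
    """Greedy forward longest-match segmentation of an unspaced string.

    `lexicon` is a set of known words (hypothetical here). Characters that
    match no lexicon entry are emitted one at a time as a fallback.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Toy lexicon, invented for illustration only.
TOY_LEXICON = {"大日本史", "日本", "歴史", "研究"}

print(segment("大日本史の研究", TOY_LEXICON))
```

The sketch also hints at why premodern material is harder: any word or character variant missing from the word list falls back to single characters, so a lexicon built from modern usage will fragment much of a classical text.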
As an example, Figure 10.1 shows a scanned page from the first volume of the Dai Nihonshi (History of Great Japan), one of the central primary sources of my current research project.4 The section in question is an early modern item, but it was printed in the early 1900s. Despite the high quality of the scanned version, Figure 10.2 shows the questionable outcome of the OCR process, depicting the frequent experience of scholars working with nonmodern corpora. It is clear that the problem did not stem from the direction of the text (top-down, right to left, vertical) or its format. Rather, it is owing to two other factors: first, the fairly large number of character variations that the software failed to recognize; and second, the reading aids, or miniature marks next to the main body of kanbun texts (Japanese materials written in classical Chinese). These marks facilitate the reordering of the characters in the course of reading to match the word order of the Japanese language, which vastly differs from the grammatical structure of Chinese.5 However, most OCR software interprets these supplementary characters as integral parts of the actual text, often leading to confusing results.
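The difficulty with reading aids can be illustrated with a small sketch. Unicode reserves a dedicated “Kanbun” block (U+3190 to U+319F) for annotation marks of this kind, so a transcription that encodes the marks with those code points can have them removed safely. The sketch below assumes such a transcription; the function name `strip_kanbun_marks` is hypothetical, and the placement of the mark in the sample string is illustrative rather than taken from an annotated edition.

```python
# Code points of the Unicode "Kanbun" block (U+3190..U+319F), which holds
# dedicated annotation marks such as the reverse mark and numbered marks.
KANBUN_BLOCK = range(0x3190, 0x31A0)


def strip_kanbun_marks(text):
    """Drop characters from the Kanbun block, leaving the base text intact."""
    return "".join(ch for ch in text if ord(ch) not in KANBUN_BLOCK)


# Illustrative sample: a reverse mark (U+3191) inserted into a short phrase.
sample = "學而時習\u3191之"
print(strip_kanbun_marks(sample))

# Ordinary numerals are not in the Kanbun block, so they are left untouched.
print(strip_kanbun_marks("一二"))
```

This only works when the marks are encoded distinctly. If OCR output renders a reading aid as an ordinary character such as 一 or レ, the mark is indistinguishable from the main text at the level of code points, which is one way to understand why such output is confusing.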
A further issue, not present in these images, is a stylistic peculiarity of a number of classical Chinese texts and the Dai Nihonshi as well: they follow the example set by ancient Chinese dynastic histories of inserting footnotes in the form of dual lines squeezed into the space of a single column. This feature reduces the size of the text in the footnotes, resulting in a tendency of most OCR software to treat two neighboring characters as one.
Figure 10.1. A page from the Dai Nihonshi.
Figure 10.2. The result of the OCR process conducted on the same page from the Dai Nihonshi.
It is important to point out that this example is part of an individual project. Institutional and larger collaborative initiatives with more substantial resources may achieve somewhat higher degrees of accuracy, although most software employed by scholars in the West was not designed with premodern Asian scripts in mind. The prospects are significantly brighter in the case of modern corpora. Two collaborative projects in Japan, however, have each developed a tool, named SMART-GS and KuroNet, respectively, which may provide a solution to mitigate, if not solve, the “OCR problem” for premodern material as well. Scholars engaged in digitization projects at the Shiryō hensanjo (the Historiographical Institute of the University of Tokyo), one of the premier archives in Japan, inform me that this software in its current form can achieve 80 percent accuracy in recognizing kana components in premodern handwritten sources (Clanuwat, Lamb, and Kitamoto, “KuroNet”; Hashimoto, “Shugōji de yomu rekishi shiryō”; Hashimoto et al., “SMART-GS Project”).6 The Japanese kana system is significantly simpler and more limited in number than the range of Chinese characters, but the fact that both SMART-GS and KuroNet are able to identify handwritten script to a degree, and even distinguish between variations, is unequivocally a major breakthrough in the field, especially as the reading of handwritten premodern texts (kuzushiji in Japanese) itself requires special training.
Collaboration serves as the solid basis of a variety of long-term projects both on an institutional and a more modest level, showcasing another layer of “the digital.” For instance, the award-winning Japanese Text Initiative, housed at the University of Virginia (UVA) with the contribution of scholars from a variety of institutions, offers access to a growing number of premodern Japanese literary classics. What distinguishes this project from most of the library-initiated collections mentioned above is its aim to offer fully searchable texts, tagged in accord with the Text Encoding Initiative (TEI), both in Japanese and in English translation.7 Such collections can not only facilitate scholarly projects with text mining components, but can also be useful pedagogical tools, providing more rapid and economical access to major primary sources that frequently appear on Japanese literature syllabi.
The University of Tokyo, with the participation of researchers from other Japanese institutions, has initiated a similar concept but with a more religious focus. The SAT Daizōkyō Text Database, complemented by the Digital Dictionary of Buddhism (DDB), offers a comprehensive repository of Buddhist texts accessible online, with the purpose of laying the foundation of a complex platform for more analytical scholarly activities as well. The question of online dictionaries, another layer of digital humanities, could form the subject of an entire paper on its own; I will therefore mention only the dictionary component of the above-mentioned project.8 This is the outcome of a major international collaboration led by Charles Muller at the University of Tokyo. Established in 1995 and with more than 70,000 bilingual entries, the DDB represents a model of meticulously curated online dictionaries.9
There is growing interest in DH among scholars of the China field as well. In this case, however, digitization and database-related engagements seem to be more diversified: this means that besides a variety of macro-level initiatives between institutions, generating projects like the China Biographical Database and the related China Historical GIS project, led by Peter Bol at Harvard University, the number of independent collaborations outside any institutional framework is also on the rise. An emblematic example of this latter category is the Chinese Text Project, or Ctext, created by Donald Sturgeon in 2006 (Sturgeon, “Chinese Text Project”). It is one of the largest hubs of materials accessible online. Owing to the research interests of the owner, the project initially focused on digitizing ancient Chinese classics, but with the active contribution of other volunteers, the range of texts has gradually expanded and now incorporates later works as well. At first glance, this project seems to take the same direction as the UVA Japanese Text Initiative, but the additional embedded features of Ctext, such as the parallel passages tool or the dictionary, allow users to experiment across the various layers of “the digital.”10
Korea, on the other hand, takes a somewhat different direction. As Javier Cha explains in his comprehensive overview of the relation between the humanities and “the digital” in Korea, there too most large-scale projects are collaborative institutional initiatives. But unlike in the cases of Japan and China, the digital has been used in South Korea as an instrument in the government’s hand to encourage preservation of the historical heritage for the purpose of economic development (Cha, “Digital/Humanities,” 130–31). Interestingly though, as Cha points out, in spite of the monumental financial support, which surpasses the funding most Western projects receive, participating scholars often seem to resent these novel methods, reverting to a more traditional approach in their own work once the digitization project is completed (139).
This curious tendency, which Cha named the “Digital/Humanities divide,” represents another though not extensive manifestation of “the digital,” since certain scholars have still incorporated digital components into their research. These individual efforts include large-scale digitization by Kim Hyŏn and the creation of databases by Edward Wagner as the basis of a statistical analysis of the characteristics of premodern Korean civil service examinees. They are independent of the government’s initiatives and, unlike the tools of the Ctext project which encourage digital engagements, their purpose has been merely functional, not a broader promotion of humanities computing (Cha, “Digital/Humanities,” 137–38).
Further Layers of “the Digital” and the Importance of Collaboration
Besides these projects in large-scale digitization and database design, another way to explore the characteristics and meaning of digital research in East Asian studies is by examining how scholars handle their corpora. One way to gain insight into the outcome of data processing is, naturally, to survey the existing scholarship. Owing to the relative novelty of the digital as a method in the context of East Asian studies, the relevant literature is still fairly limited, but what is available reveals some interesting trends in how the digital is understood and used by East Asia scholars. I would divide their output into two major categories, practical and theoretical.
The first category relates to the practical application and incorporation of various digital methods in examining specific topics. Within English-language scholarship in this field, China-related publications are somewhat more numerous than in other areas. This is probably due to the wider circle of scholars engaged in such research, as well as the greater number and size of open-access digital corpora and databases. These favorable circumstances may account for a curious phenomenon: the aforementioned premodern/modern divide in terms of digitization and research methods, which, in my view, significantly influences the conduct and outcome of digital projects in the Japanese field, does not seem to affect the China field to the same extent.
The work of Hilde De Weerdt and Paul Vierthaler, for example, uses a variety of digital methods to examine novel questions in historical and literary studies, which would be exceedingly challenging to answer through more traditional approaches. De Weerdt, leader of the multiyear “Chinese Empires” project at Leiden University, has contributed to the study of premodern empires through the combined method of quantitative and qualitative analysis. Using a newly created markup tool named MARKUS, De Weerdt’s team was able to tag the names of provincial officials across the Chinese Empire in the medieval period and examine the frequency of their written correspondence with each other. Synchronizing the tags created in the MARKUS system with a broad range of digitally available notebooks from the Song dynasty, and also with the above-mentioned existing collections of data (the China Biographical Database and the China Historical GIS project), allowed De Weerdt to examine the formation of political networks and the exchange of information between them (De Weerdt, Chu, and Ho, “Chinese Empires in Comparative Perspective”).
On the margins of history and literature, Paul Vierthaler’s recent projects use statistical analysis to create a novel framework that can help to place individual sources in the broader context of late imperial Chinese print culture and the study of genres (Vierthaler, “Analyzing Printing Trends,” “Fiction and History”). Vierthaler’s meticulous explanation of his exploratory analysis, involving hierarchical cluster analysis and principal component analysis, not only sheds light on alternative approaches to long-standing debates in Chinese literature, pertaining for example to the potential correlation between the size and popularity of printed products, but also offers helpful insights into the advantages and challenges of dealing with large corpora.
Besides historical approaches, the field that receives perhaps the highest degree of attention, comparatively speaking, is literary analysis, where the favorable situation as regards accessible online corpora has encouraged the emergence of digital projects. The individual and collaborative contributions of Hoyt Long, for example, dominated by the question of scale, aim to explore novel territories in the conceptualization of Japanese and world literature in the context of modernism (see, for example, Long, “Fog and Steel”; Long and So, “Literary Pattern Recognition,” “Turbulent Flow”).11
One of the most recent manifestations of digital literary engagements is a special issue of the Journal of Chinese Literature and Culture, edited by Thomas Mazanec, Jeffrey Tharsen, and Jing Chen (“Digital Methods and Traditional Chinese Literary Studies”), exclusively dedicated to digital applications in the field of premodern Chinese literature. Because of the otherwise relatively scattered and narrow state of the field, this entire volume concentrating on humanities computing in Chinese literature may well become a milestone in the development of the field and an encouraging example for other areas as well.
In the projects described above, the authors explain how they built certain tools into their textual inquiries. But this snapshot of the taxonomy of extant scholarship would not be complete without the mention of another, narrower category of papers dedicated to the technology itself. These pieces include Thomas Mullaney’s substantial book and paper on the history of the Chinese typewriter (Mullaney, Chinese Typewriter, “Controlling the Kanjisphere”), and a meticulous analysis by Donald Sturgeon, the initiator of the Ctext project, of his experiment to train an OCR engine only compatible with modern Chinese to more effectively recognize classical Chinese characters as well (Sturgeon, “Unsupervised Extraction of Training Data”). Such ventures not only add to our understanding of the origins of specific issues pertinent to the harmonizing of Asian scripts and computing, they also offer useful insight into potential ways to overcome some of these problems.
Another way to explore major trends in digital projects in East Asian studies is through the content of thematic and skill-based workshops and conferences. In recent years, the Japan field in particular has excelled in this regard, with Mark Ravina’s “Japanese Text-Mining” workshop at Emory University in 2017 and “The Impact of the Digital on Japanese Studies” workshop and its subsequent “Redux” event in 2016 and 2018, organized by Hoyt Long at the University of Chicago.12 The former provided a multiday practical training in the R language and topic modeling, whereas the latter two events created a platform for Japan scholars engaged in digital practice to share updates and progress reports on their projects, along with brainstorming sessions toward the formation of a more organized and interdisciplinary community. Moreover, the significant number of Japanese librarians and archivists attending these programs indicated their awareness of and commitment to stepping up the digitization of Japanese materials, which, as I observed above, is the most important foundation and also the greatest obstacle to productive work in the field.
One point that became clear at these events, in my view, was the benefit and challenge of diversity. Digital humanities can serve as an umbrella, an overarching framework with the capacity to attract scholars from a variety of fields—from literature and history through film and media studies to political science—creating an interdisciplinary environment that can also become the source of collaborations. I should not fail to mention that many presentations showcased the successful integration of digital tools, such as ArcGIS and linguistic tools, into the curriculum of history and translation classes, thus drawing attention to another layer of “the digital” in the context of pedagogy.
On the other hand, these events also underlined the existence of the premodern/modern divide. Presenters working on nonmodern corpora emphasized their difficulties with the OCR process much more frequently, and tended to focus more often on manually curated data sets and their visualizations, than those using modern corpora. One reason for this difference, besides problems with digitization, was the differing number of available tools for data processing. Both the available OCR software and the tokenizer tools needed to separate individual words or compounds, such as ChaMame by the National Institute for Japanese Language and Linguistics (NINJAL) and the MeCab text segmentation library, have been trained to recognize modern characters and kana, but have more difficulty handling premodern texts written in kanbun.13 Since most scholars, for the most part self-taught in digital tools, often turn to non-Asia-specific software that recognizes these special characters, the successful data visualizations that these scholars create can, at this point of scholarship, be regarded as major achievements.
Nonetheless, I believe that the long-standing premodern/modern divide, particularly in Japanese studies, could be overcome more effectively by collaborating with China scholars, who have attained more success in premodern projects, and by experimenting with tools developed for Chinese texts. Despite the highly uplifting atmosphere of Japan-specific workshops, a scholar working on kanbun texts can get discouraged realizing the ineffectiveness of a number of methods that work for modern topics. In spite of the minor differences between classical Chinese and Japanese kanbun, I believe that scholars working on these types of corpora could greatly benefit from more interaction with each other.
The DHAsia conference at Stanford University in 2018, and the novel extension of the MARKUS markup software to process Korean texts, can be considered important starting points for this modern hitsudan. The Association for Asian Studies, the premier organization for Asia scholars, has also recognized the importance of this area and added special digital sessions to its annual conference repertoire.14 I myself organized digital roundtables for their meetings in 2020 and 2021 (the earlier of which was canceled owing to the Covid-19 pandemic), featuring both Japan and China scholars and even an information designer, in order to provide a platform for discussion, in this case with a focus on the role of data visualization in East Asia–related digital research.
Such examples are certainly promising, and one hopes they will introduce novel perspectives to the DH landscape in East Asian studies. But more concerted efforts to organize joint workshops, panels, and other programs could not only enhance technical developments (for example, by extending Chinese tools to apply to kanbun) but also serve as the seedbed for future collaborations. Looking yet farther, it could advance thematic and methodological innovation in transnational and interdisciplinary inquiries.
Notes
I would like to thank Jeffrey Tharsen for helping me produce high-quality images of the text for this chapter.
1. The romanization of Japanese terms follows the conventions of the Hepburn system.
2. The romanization of Korean terms follows the conventions of the McCune-Reischauer system.
3. The romanization of Chinese terms follows the conventions of the pinyin system.
4. The page shown in Figure 10.1 is from volume 1 of the Dai Nihonshi, initiated by Tokugawa Mitsukuni in the seventeenth century. For the entire source written in kanbun, see Tokugawa Mitsukuni, Dai Nihonshi.
5. These reading aids are themselves either simple Chinese characters such as 一, or resemble small kana, such as レ.
6. For more information regarding the challenges and development of OCR practices in the context of Japanese handwritten sources, see Yamamoto and Osawa, “Kotenseki honkoku no shōryokuka,” 819–27, and Yukimasa, “Toppan insatsu.”
7. For more information, see University of Virginia Library, “Japanese Text Initiative.”
8. For more information on Asian online dictionaries, see Schneider and Tharsen, “Digital Resources for Sinologists 1.0.”
9. For more information regarding these digitization projects, see Muller et al., “Origins and Current State of Digitization,” and Nagasaki et al., “Bridging the Local and the Global.”
10. Further projects with the goal of transcribing and sharing primary sources in a machine-readable format in the context of East Asian studies include the Japanese “Minna de honkoku” initiative by the Kyoto University Historical Earthquake Study Group, as well as the “Ten Thousand Rooms Project” at Yale University. The latter also offers the opportunity to annotate, translate, and comment on digitized texts. For more information, see Minna de Honkoku, “Kyoto University Historical Earthquake Study Group”; Ten Thousand Rooms Project, “Welcome.”
11. For other relevant examples, see Long and So, “Network Analysis and the Sociology of Modernism” and “Network Science and Literary History,” as well as Long, Detwyler, and Zhu, “Self-Repetition and East Asian Literary Modernity.”
12. For more information on these events, see Emory University, “Japanese Text Mining”; University of Chicago Center for East Asian Studies, “Impact of the Digital on Japanese Studies”; University of Chicago Humanities and Social Sciences, “Impact of the Digital on Japanese Studies.”
13. For more information, see MeCab, “MeCab”; WebChaMame, “WebChaMame NINJAL.”
14. Further relevant scholarly organizations include the Japanese Association for Digital Humanities and the Taiwanese Association for Digital Humanities, among others (see bibliography).
Bibliography
Cha, Javier. “Digital/Humanities: New Media and Old Ways in South Korea.” Asiascape 2, no. 1–2 (2015): 127–48.
Clanuwat, Tarin, Alex Lamb, and Asanobu Kitamoto. “KuroNet: Pre-Modern Japanese Kuzushiji Character Recognition with Deep Learning.” arXiv:1910.09433. October 21, 2019.
De Weerdt, Hilde, Chu Ming-Kin, and Ho Hou-leong. “Chinese Empires in Comparative Perspective: A Digital Approach.” Verge: Studies in Global Asias 2, no. 2 (2016): 58–69.
Emory University, ECDS, QuantTM. “Japanese Text Mining: Handbooks and Guides.” https://scholarblogs.emory.edu/japanese-text-mining/handbooks-and-guides/.
Harvard University Center for Geographic Analysis. “About.” https://gis.harvard.edu/about.
Harvard University China Biographical Database Project (CBDP). “Methodology.” https://projects.iq.harvard.edu/cbdb/methodology.
Hashimoto, Yuta. “Shugōji de yomu rekishi shiryō—SMART-GS ga jikkensuru gurūpu ridingu.” [“Reading historical texts through collective intelligence: Group reading achieved by SMART-GS”]. Jinbun Jōhōgaku Geppō [Digital Humanities Monthly], August 25, 2014. https://www.dhii.jp/DHM/DHM37_smartgs.
Hashimoto, Yuta, Kenro Aihara, Susumu Hayashi, Minao Kukita, and Makoto Ohura. “The SMART-GS Project: An Approach to Image-based Digital Humanities.” DH2014 Digital Humanities Conference Poster, Lausanne, July 9–11, 2014. https://web.archive.org/web/20180831213440/http://dharchive.org/paper/DH2014/Poster-48.xml.
Japanese Association for Digital Humanities. “Purpose.” February 10, 2012. http://jadh.org/purpose.
Long, Hoyt. “Fog and Steel: Mapping Communities of Literary Translation in an Information Age.” Journal of Japanese Studies 41, no. 2 (2015): 281–316.
Long, Hoyt, Anatoly Detwyler, and Yuancheng Zhu. “Self-Repetition and East Asian Literary Modernity, 1900–1930.” Journal of Cultural Analytics 1, no. 1 (2017). https://culturalanalytics.org/article/11040-self-repetition-and-east-asian-literary-modernity-1900-1930.
Long, Hoyt, and Richard Jean So. “Network Analysis and the Sociology of Modernism.” boundary 2 40, no. 2 (2013): 147–82.
Long, Hoyt, and Richard Jean So. “Network Science and Literary History.” Leonardo 46, no. 3 (2013): 247.
Long, Hoyt, and Richard Jean So. “Literary Pattern Recognition: Modernism between Close Reading and Machine Learning.” Critical Inquiry 42, no. 2 (2016): 235–67.
Long, Hoyt, and Richard Jean So. “Turbulent Flow: A Computational Model of World Literature.” Modern Language Quarterly 77, no. 3 (2016): 345–67.
MARKUS. “MARKUS—Is a Reading and Text Analysis Platform with a Wide Range of Functionality.” https://dh.chinese-empires.eu/markus/.
Mazanec, Thomas, Jeffrey Tharsen, and Jing Chen, eds. “Digital Methods and Traditional Chinese Literary Studies.” Special issue, Journal of Chinese Literature and Culture 5, no. 2 (2018).
MeCab. “MeCab: Yet Another Part-of-Speech and Morphological Analyzer.” http://taku910.github.io/mecab/.
Minna de Honkoku. “Minna de Honkoku: The Kyoto University Historical Earthquake Study Group.” https://v1.honkoku.org/index.en.html.
Mullaney, Thomas S. The Chinese Typewriter: A History. Cambridge, MA: MIT Press, 2017.
Mullaney, Thomas S. “Controlling the Kanjisphere: The Rise of the Sino-Japanese Typewriter and the Birth of CJK.” Journal of Asian Studies 75, no. 3 (2016): 725–53.
Muller, A. Charles, Kōzaburō Hachimura, Shoichiro Hara, Toshinobu Ogiso, Mitsuru Aida, Koichi Yasuoka, Ryo Akama, Masahiro Shimoda, Tomoji Tabata, and Kiyonori Nagasaki. “The Origins and Current State of Digitization of Humanities in Japan.” In Digital Humanities 2010: Conference Abstracts, 68–70. King’s College London, 2010.
Nagasaki, Kiyonori, A. Charles Muller, Toru Tomabechi, and Masahiro Shimoda. “Bridging the Local and the Global in DH: A Case Study in Japan.” Paper presented at the DH2014 Digital Humanities Conference, Lausanne, July 9–11, 2014. http://web.archive.org/web/20180831200653/http://dharchive.org/paper/DH2014/Paper-833.xml.
Schneider, Holger, and Jeffrey Tharsen. “Digital Resources for Sinologists 1.0.” Dissertation Reviews, May 27, 2014. http://dissertationreviews.org/archives/9213.
Sturgeon, Donald. “Chinese Text Project: Tools.” 2006. https://ctext.org/tools.
Sturgeon, Donald. “Unsupervised Extraction of Training Data for Pre-Modern Chinese OCR.” In Proceedings of the 30th International Florida Artificial Intelligence Research Society Conference (FLAIRS-30), 613–18. Marco Island, FL: AAAI Press, 2017.
Taiwanese Association for Digital Humanities. “Taiwanese Association for Digital Humanities: Guanyu xuehui.” http://tadh.org.tw/.
Ten Thousand Rooms Project. “Welcome to the Ten Thousand Rooms Project!” https://tenthousandrooms.yale.edu/.
Tokugawa Mitsukuni, ed. Dai Nihonshi [The History of Great Japan]. Tokyo: Dai Nippon Yūbenkai, 1928–1929.
University of Chicago Center for East Asian Studies. “The Impact of the Digital on Japanese Studies Redux: Schedule and Panels.” May 4, 2018. https://ceas.uchicago.edu/news/digital-humanities-workshop-redux.
University of Chicago Humanities and Social Sciences. “The Impact of the Digital on Japanese Studies: Schedule.” University of Chicago Library News, November 9, 2016. http://news.lib.uchicago.edu/blog/2016/11/09/the-impact-of-the-digital-on-japanese-studies/.
University of Virginia Library. “Japanese Text Initiative: About JTI and Online Texts.” http://jti.lib.virginia.edu/japanese/texts/index.html.
Vierthaler, Paul. “Analyzing Printing Trends in Late Imperial China Using Large Bibliometric Datasets.” Harvard Journal of Asiatic Studies 76, no. 1–2 (2016): 87–133.
Vierthaler, Paul. “Fiction and History: Polarity and Stylistic Gradience in Late Imperial Chinese Literature.” Journal of Cultural Analytics 1, no. 1 (2016). https://culturalanalytics.org/article/11059-fiction-and-history-polarity-and-stylistic-gradience-in-late-imperial-chinese-literature.
WebChaMame Morphological Analysis Interface. “WebChaMame NINJAL.” https://chamame.ninjal.ac.jp/.
Yamamoto, Junko, and Tomejiro Osawa. “Kotenseki honkoku no shōryokuka: Kuzushiji wo fukumu shin hōshi OCR gijutsu no kaihatsu” [“Labor saving for reprinting Japanese rare classical books: The development of the new method for OCR technology including kana and kanji characters in cursive style”]. Jōhō kanri 58, no. 11 (2016): 819–27.
Yukimasa, Kazuyoshi. “Toppan insatsu, Edo izen no kuzushiji wo koseido ni OCRsuru gijutsu wo kaihatsu” [“Toppan printing to develop technology for the highly accurate OCR recognition of pre-Edo kuzushiji (Japanese cursive style writing)”]. http://ascii.jp/elem/000/001/025/1025165/.