This essay is part of the second iteration of the anthology. Since public review and commentary help scholars develop their ideas, the editors hope that readers will continue to comment on the already published essay. You may also wish to read the draft essay, which underwent open review in 2015, and the project history.
Digital Representations of Letters
Of the primary forms of editorial work practiced by literary scholars across all fields and periods, the practice of editing modern letters has been among the least theorized.1 This despite a wide range of examples and practices of digital letters collections and editions, from exhibits of an individual correspondence (like the Darwin Correspondence Project) to substantial digital archives (like The Walt Whitman Archive) to major union catalog and visualization efforts (such as Early Modern Letters Online and the Mapping the Republic of Letters project, discussed below). In a period of digital scholarship making the turn toward increasingly collaborative, open, and social paradigms like those proposed by Alan Liu, in his essay in this collection, and Kenneth Price (“Collaborative Work”), letters offer us an important test case for theorizing the larger disciplinary and institutional changes concomitant with the move to digital infrastructures for communication and publication.
Letters have properties of immediate interest for the next wave of digital scholarly communication. Their sources and physical locations are usually far-flung: the letters of D. H. Lawrence, to take a typical example, are located in more than twenty-six archival and personal collections (xx–xxi). The communities of interest for letters are often equally dispersed, because archivists, special collections, scholars, auction houses, and book dealers are all concerned with identification, aggregation, and preservation of the documents. Moreover, printed editions of letters are typically incomplete, unable to contain all the responses, associated letters, private letters, telegrams, inscriptions, and associated social documents that might concern a given scholarly community. Letters thus clearly defy the scholarly dream of a definitive print edition, as reviews of collected letters have long noted (Finneran). The editors of the Letters of Virginia Woolf note that they are unable to print Woolf’s voluminous archive of party invitations and acceptances, for example, though these would surely interest readers of Mrs. Dalloway (ix).
Most important, the primary form in which we now study literary correspondence essentially reduces and thus misrepresents the relational aspect of correspondence. A letter is (almost) always written to someone and from someone and is therefore inherently a relational form that connects persons, groups, institutions, and political bodies; yet we usually study letters in books organized by a proper name (“The Letters of Samuel Johnson”). The phenomenological and bibliographic experience of reading a letter, unlike that of reading a novel or a volume of poems, necessarily differs from the way scholars read and represent letters in codex form. Literary correspondence is usually written within some larger social circle, a group of friends and writers all talking about shared concerns, subjects, and historical events (the Shelley circle, the Bloomsbury group). Letters are written in response to other letters, call for a response in return, or designate the limits of a community (as in the First Epistle to the Corinthians); none of these relations can be completely addressed in conventional editions of letters, though editorial footnotes and supplements serve to bridge the gaps.
For all these reasons, the digital representation of correspondence offers us the chance not just to remediate a central part of our cultural inheritance but also to begin to do justice to the larger social fields in which letters were written and thereby better represent the social dimension of epistolary thinking. We can do this without losing track of the materiality of letters as artifacts: as with all editions, digital letters will always be surrogates of some otherwise inaccessible document, in a form that permits interrogation, aggregation, and preservation. Digital collections of letters thus allow systematic, rigorous, and progressively more accurate representation of the original letters through facsimiles, transcription, or a combination of the two. Editors also have the opportunity to build collections of letter metadata that are supplementary and additive to existing print editions of standard correspondence, editions that will remain essential tools. Crucial for both digital editions and collections of metadata, however, will be shared standards, practices, and platforms for the representation, preservation, and communication of digital forms of literary correspondence. In this essay I outline the history of editorial theory and practice in regard to modern editions of letters, then discuss the best current practices for representing and sharing digital collections of letters. Finally, I point to new horizons for the scholarly knowledge infrastructure concerned with literary correspondence and to the project of connecting disparate yet linked collections of letters.
Editing Editions of Letters: Editorial Theory and Practice
Once we begin representing letters, we are engaged in an editorial task. (Library metadata specialists and archivists more interested in cataloging and preservation should look below, to “Digital Editions of Letters: Standards for Preservation, Access, and Dissemination.”) Theoretical questions concerning the editing, indexing, and representation of literary letters have most often been taken up by practicing editors, as we would expect. Since the practice of editing letters (and the commissions from publishers) usually precedes the stage of explicit theoretical reflection, modern theoretical work on editions of letters is almost inseparable from the considerations entailed in editing particular authors. Robert Halsband’s often cited essay “Editing the Letters of Letter-Writers,” for example, derives many of its proscriptions and procedures from the highly literary letters of Lady Mary Wortley Montagu (1689–1762), even as it sets out, in an admirably transparent, well-exemplified discussion, more general problems of editing letters. The theory and practice of editing letters thus tend to converge: the editions advance explicit or implicit theories of epistolary representation, and the theoretical articles are often supported by a single exemplary case. Nevertheless, a set of common questions, concerns, and problems can be traced in the modern literature on the subject. Though full exposition and evaluation of the wider range of editorial issues involved in literary correspondence must wait, we can outline the issues relevant for the digital editions in particular.
The primary distinction of twentieth-century editions of letters, in the view of Halsband, was that, instead of producing editions oriented toward “a reading public who bought and read,” as did the nineteenth-century editors, representations of letters were constructed for a specialist scholarly audience. This change in audience entails a fundamental change in editorial approach: “In our view letters have become ‘documents’; and the editor, instead of presenting a literary work, is setting up an archive” (26). This shift from the literary reading of letters to a scholarly approach to documents elicits a certain nostalgia in Halsband, as the editor of a writer who was known mainly for her letters and specialized in “the letter as a genre” (27). The distinction between literary reading and scholarly approach still applies today in the division between digital correspondences designed for public exhibition (which are the majority, as discussed below in “Paradigms for Digital Correspondence”) and those edited according to the protocols of documentary or critical editorial theory.
Despite his opening note of nostalgia, Halsband’s extensive summary of mid-twentieth-century editorial practices surveys an enormous diversity of editorial approaches to presentation, selection, transcription, and normalization and also attests to a growing consensus as to audience and goals: letters should be represented, by analogy to documents, in an archive with facticity, accuracy, and completeness. Most twentieth-century editions of letters have focused on documents, not authorial works, though most take a primary author as the basis of their selection. The primary task of most critical editions, determining the authority of variant texts of a work, is thus usually not an issue for editions of letters, as the text of most letters exists in only one version.2 As documentary editors know, a series of editorial decisions remain: any edition of letters offers a representation, a surrogate, of documents, in a form that attempts to balance intelligibility and fidelity with the sense of distant (sometimes lost) material artifacts. The editorial choices to be made extend beyond the requirements of documentary fidelity with regard to selection or transcription, the most significant initial issues. This is true of digital as well as print editions of letters, neither of which transparently stands in for the original artifact (though linked facsimile images and transcripts come closer to this goal).
More recently, twenty-first-century editors of digital letters have moved toward what Elena Pierazzo calls the “digital documentary edition” (DDE), prompted in part by the clear advances afforded documentary editions in a digital environment (“Digital Documentary Editions”). Editors at the William Blake Archive, the Walt Whitman Archive, and elsewhere seek to “establish documentary texts” rather than reconstruct authorial intentions (Editorial Policy Statement), a stance that seems appropriate to documents never intended for revision and publication. Pierazzo defines the digital documentary edition broadly, as “the recording of as many features of the original document as are considered meaningful by the editors, displayed in all the ways the editors consider useful for the readers, including all the tools necessary to achieve such a purpose” (“Rationale” 475). This definition includes the “digital infrastructure (visible to the final user or not) necessary for the publication and exploitation of such content,” so that editors are responsible for the many possible forms of presentation enabled by new codes and scripts (usually XML encoding and XSLT transformations): “DDEs can assume the form of diplomatic, ultra-diplomatic, semi-diplomatic, or reading editions on demand, thanks to digital . . . delivery of an appropriately encoded text” (“Digital Documentary Editions”). In addition, digital documentary editions can combine facsimile images with facing editorial transcriptions in a way that allows the reader to become part of the improvement of a textual transcription.
Some in the critical editing tradition have worried about the potential for digital documentary editing to distance the editor from the reader of the edition, through an exclusive emphasis on the level of the artifact and document (see Robinson 127). As Pierazzo notes above, however, for standardized texts there is no necessary opposition between a clear-text reading edition and a complexly annotated diplomatic edition: both are forms of the same underlying text. To fully embrace the new possibilities, digital documentary editors require a range of technical skills and collaborative engagements. To produce a reading text, specific editorial decisions about methods of transcription, error checking, presentation, and standardization must still be made, both for the documentary editing tradition associated with historians and for the critical editing tradition associated with literary studies. Given the need for interdisciplinary collaboration in the digital humanities more generally, a convergence between these two traditions is highly desirable with regard to standards and tools, if not editorial frameworks. Literary scholars interested in digital editions of letters must learn to collaborate with historians, archivists, and library science specialists if they are to produce solid, long-lasting editions, union catalogs, and digital archives.
Transcription and presentation are the primary editorial problems for both the documentary and critical editing traditions. The central problem for most letters, according to Halsband, is “the fact that the copy-text is a manuscript”; thus the editor has to decide whether to produce a literal (or diplomatic) transcription, however unclear, or normalize spelling and capitalization, expand abbreviations, and clarify stenographic symbols and shorthand (30–32). As Pierazzo indicates (“Digital Documentary Editions”), digital documentary editors can avoid some of the difficulties associated with transcription methods by producing both literal transcriptions and clear reading texts displayable at the reader’s choice. Too much choice, however, can lead to exhaustion and lack of clarity, in both print and digital editions. Editors of digital letters have developed a range of pragmatic approaches to transcription, all of which address the tension between clear presentation to the reader and accurate representation of a documentary text. The Walt Whitman Archive aims to present its digital letters as “an inclusive text representing as nearly as possible a clean, reading version of the letter,” without recording deletions, insertions, authorial “metacommentary” such as “(over)” or the physical layout of the holographs. The editors envision an updated future version in which deletions and insertions would be recorded, and two different methods of viewing the text: as a “clean” reading text or as a diplomatic transcription (Editorial Policy Statement [sec. G: “Correspondence”]). The editors of the Collected Letters of Robert Southey, on the other hand, employ a range of standard editorial conventions to represent deletions, illegible characters, or difficulties with the manuscripts (Bolton and Packer). Names and places are hyperlinked to standard entries, and inserted editorial notes provide essential context throughout, somewhat at the expense of an uninterrupted reading experience. The editors of the Diplomatic Correspondence of Thomas Bodley, 1585–1597 have encoded their letters so that they may be read according to “varying degrees of diplomatic or normalized transcription,” depending on selections made by the reader (Diplomatic Correspondence). These examples show the range of current digital editorial practice as well as a continuing move toward the embrace of multiple presentation styles, for the purposes of multiple audiences, based on a single underlying root document.
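The idea of several presentations drawn from one encoded source can be sketched in a few lines. What follows is a minimal illustration, not the method of any edition named above. The fragment uses the TEI element names `<del>`, `<add>`, and `<choice>` with `<orig>` and `<reg>`, but the sample sentence, the function name `render`, and the bracket conventions of the diplomatic view are assumptions made for the example, and the TEI namespace that a real file would carry is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: one deletion, one insertion, and one original
# spelling paired with a regularized form. Not taken from any real edition.
SAMPLE = (
    "<p>I <del>shall</del><add>will</add> write "
    "<choice><orig>to-morrow</orig><reg>tomorrow</reg></choice>"
    " from Keswick.</p>"
)


def render(node, mode):
    """Return the text of `node` as a 'diplomatic' or a 'reading' view."""
    parts = [node.text or ""]
    for child in node:
        if child.tag == "del":
            # The diplomatic view records the cancelled word; the reading view drops it.
            if mode == "diplomatic":
                parts.append("[deleted: " + render(child, mode) + "]")
        elif child.tag == "add":
            inner = render(child, mode)
            parts.append("[inserted: " + inner + "]" if mode == "diplomatic" else inner)
        elif child.tag == "choice":
            # Pick the original spelling or the regularized one, depending on the view.
            chosen = child.find("orig" if mode == "diplomatic" else "reg")
            if chosen is not None:
                parts.append(render(chosen, mode))
        else:
            parts.append(render(child, mode))
        # Text that follows the child element belongs to the parent in both views.
        parts.append(child.tail or "")
    return "".join(parts)


root = ET.fromstring(SAMPLE)
diplomatic = render(root, "diplomatic")
reading = render(root, "reading")
print(diplomatic)
print(reading)
```

Both views are computed from the same source string, so a correction made once in the encoding reaches every presentation. Production editions typically do this work with XSLT rather than with a script of this kind.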
Regularization, normalization, and silent emendation are a secondary set of linked problems for any documentary edition. In their Guide to Documentary Editing, Mary-Jo Kline and Susan Holbrook Perdue discuss these issues under the heading of the “inclusive text,” the preferred middle ground for many scholarly editors of twentieth-century letters (164–68). In an inclusive text (or expanded transcription), an editor compromises between the full diplomatic transcription of all details of a document (through textual symbols and footnotes) and a reading text, in which all editorial insertions are minimized. As Kline and Perdue note, the Center for Editions of American Authors (CEAA), the predecessor body to the Committee on Scholarly Editions (CSE), specifically named “manuscript letters or journals or notebooks” as appropriate objects for inclusive texts in their Statement of Editorial Principles of 1972 (9). In the ideal version of the inclusive text, most editorial changes and emendations would be recorded in some way, either through a record of emendations separated from the reading text or through an editorial statement on the transcription policy. In practice, few twentieth-century printed editions of letters came close to this ideal: many silently emended malformed letters and punctuation, most regularized the placement of postscripts and addresses, and some omitted major features of the document, such as canceled passages. A digital editing environment offers more flexibility than print, though the fundamental tension between readable text and accurate transcription remains. The editors of the Southey letters note that “Southey’s original spelling, and mis-spelling, grammar, punctuation and any slips of his pen have been retained” and solve the problem of accurate indexing through hyperlinks to standardized entries for people and places noted above (Bolton and Packer). Whatever forms of transcription and regularization are chosen, a documented rule of transcription like that employed by the editors of the print edition of the Letters of D. H. Lawrence or a statement of editorial policy on regularization allows to some degree the critical reconstruction of the original manuscript, even in the absence of facsimile images (Lawrence xviii–xix). Technical mediation of transcriptions and facsimile images also calls for a description of editorial policy. Description of the specific procedures for textual encoding, optical-character-recognition methods, and proofreading procedures would be as welcome in digital as in print editions.3 The questions recorded in the CSE’s Guidelines for Editors of Scholarly Editions pertaining to electronic editions are useful points of consideration here as well (Guidelines).
A third major editorial issue for any correspondence project is the selection of letters: whether the letters to a given writer should be represented alongside the letters from that writer or whether a broader circle of letter writers could be represented (e.g., the Shelley circle). The digital editing environment allows us to aim at a more fully social notion of correspondence, departing from the norm for the last century of editors. Halsband chose to present only the letters of Lady Montagu, quoting from the replies, where relevant, in annotations; the forty-three-volume “Correspondence” of Horace Walpole, organized by selected recipients, presents both sides of a variety of conversations (Halsband 28–29). The Walt Whitman Archive has decided to edit incoming, outgoing, familial, and also institutional or scribal correspondence (e.g., letters written and received by Whitman when he worked as a clerk in the United States Attorney General’s office).4 Digital archives of correspondence offer an obvious practical advantage over print in terms of breadth and flexibility: a given correspondence (say, of Ezra Pound) could be exhaustive at the level of internal content (containing records of all letters sent and received by Pound) yet selective in its external presentation (allowing individual reading editions of Pound’s exchanges with specific correspondents). Nor does a digital scholarly edition exclude the possibility of a print reading edition: one could imagine a volume of a particular correspondence, selected from a larger digital archive, printed on demand for classroom purposes.
Incompleteness is the corollary of editorial selection. Large editions of a correspondence make bibliographic and rhetorical arguments for their completeness, but every correspondence is selected in some way, in a more or less explicit form, as Richard Finneran’s acute review of the Oxford Collected Letters of W. B. Yeats argues at length. Editors overlook manuscripts, miss private collections, and omit items “because they add nothing to our knowledge of Virginia Woolf,” as in the case of Woolf’s omitted party invitations (Woolf ix). Inevitably such omissions lead to the publication of “additions and corrections,” to cite the title of the last volume of the Walpole correspondence. Errors in transcribing autograph documents, in particular, are inevitable and make one strong argument for an ongoing, progressively corrected digital edition. Errors in annotation, as Finneran notes, are also inevitable, producing fundamental misrepresentations of facts and documents, “which again an electronic archive could solve” (51). With digital documentary methods that pair manuscript images with transcriptions, an edition can serve as another stage of the proofreading process.
The process of indexing print volumes inevitably entails elisions, conflations, and missed references; here there is a clear advantage for progressively indexed digital projects. As Bob Rosenberg and others have noted, however, such expanding projects create their own problems: for example, the problem of citations from previous versions of a project, a difficulty when public records of versions and changes are not kept. Unique identifications for letters in an archive should be tied to a record of changes in and improvements to transcription and representation, along with a record of format changes (e.g., from SGML to XML). As Rosenberg’s account of the multidecade development of the Edison Papers shows, difficulties may arise from changing data formats, the gap between established print and archival cultures, a lack of separation between content and presentation, changing digital standards, and idiosyncratic storage and preservation methods (see also Kline and Perdue, Guide [“Initiating an Editorial Project”]). Finally, open data standards and interoperability stand as challenges and opportunities for all digital editions.
Paradigms for Digital Correspondence: Archives, Visualizations, Union Catalogs
Specific digital standards and guidelines for representing letters arise out of digital projects that serve particular archives, communities, and goals. I locate those projects in relation to their mode of representing and interrogating correspondence, in order to disseminate best practices as well as evaluate the utility of the approaches. My account is not committed in advance to any one paradigm among those discussed; rather, I write to further interoperability, accessibility, and interdisciplinary collaboration, which are the goals of the Twentieth-Century Literary Letters project and its collaborators.
Among the major paradigms for editing correspondence in a digital environment, by far the most important and thorough are those building on the standards of the Text Encoding Initiative (TEI), as discussed by Susan Schreibman in this anthology and extended in the work of the Digital Archive of Letters in Flanders (DALF) at the Dutch Centre for Scholarly Editing and Document Studies. Since not all scholars will want to build critical (usually authorial) editions of letters, however, and since the learning curve for TEI encoding tends to be long, I postpone discussing the encoding standards for digital editions of letters, particularly those necessary for aggregation in field-specific, digital, peer-review organizations (NINES, 18thConnect, MESA, and the emerging ModNets); I first review paradigms for representing digital correspondence that are oriented toward goals different from those required for permanent documentary editions.
For many projects and institutions, the primary goal has been the exhibition of existing archives of individual or small-group correspondence for a general audience or public rather than for a scholarly audience. According to the useful report on digital correspondence produced by Jan Broadway at the Centre for Editing Lives and Letters (CELL) at University College London in 2009, the majority (84%) of smaller projects then surveyed encompassed only the letters of an individual or family, while a quarter (24%) contained fewer than fifty letters. Many of these projects were transcriptions of correspondence, without images or statements on editorial procedure. As the CELL report notes, in the absence of explicit statements about transcription policy there is little documentary value in such projects. Visibility and preservation are particular challenges for first-generation projects. Web sites based on a single collection of objects have an understandable tendency to degrade quickly and may be no more visible than their archival counterparts. These problems demonstrate the need for standards for preservation and dissemination.
Quite different are the scholarly digital archives oriented toward the comprehensive presentation of a single author’s correspondence, as in the forthcoming Complete Letters of Willa Cather, in The Walt Whitman Archive, or in the Diplomatic Correspondence of Thomas Bodley, 1585–1597. These newer digital archives are far better supported, standardized, and indexed than the projects surveyed in the CELL report. It is important that they are also associated with academic and library communities committed to their long-term preservation and accessibility.
Not all letters projects take archival display and preservation as their primary goals. Another goal can be visualization as a tool for interpretation and accessibility, best represented by the Mapping the Republic of Letters project at Stanford and its associated tools for visualization and interpretation. Mapping grew out of data from the Electronic Enlightenment project at Oxford and has explored ways to present correspondence metadata from the eighteenth century, focusing on the major philosophes that formed our concept of the Enlightenment’s spread as extended cosmopolitan debate. As in the various case studies that have grown out of Voltaire’s correspondence, mapping these letters lets scholars interrogate the reach, spread, and depth of correspondence between English and French Enlightenment philosophers, for example, in interfaces that allow the reader to query underlying metadata on letters of interest. The project has also developed several platforms and tools for use on any set of letters with usable metadata, notably Palladio, an easy-to-use tool for producing rich social and GIS (geographic information system) visualizations out of structured letter metadata. The ePistolarium project, used for a database of seventeenth-century Dutch letters, offers a parallel set of faceted search tools and visualizations.
It should be emphasized that we are dealing mainly with letter metadata (author, date, recipient, location of recipient, etc.), not a digital representation of the entire content of the letter. That content is located behind the paywall in the Electronic Enlightenment project, whereas in the ePistolarium project, full text is available, with keywords and similarity search capability. Both projects are limited in scope, accessibility, and reusability. In the ePistolarium project, the sources of the letters vary in availability, format, spelling, and language, as the project team has responsibly made clear (Corpus Metadata). None of these projects make original marked-up documents openly available in XML or a similar reusable format, as do the TEI-based projects, such as The Collected Letters of Robert Southey, nor are standards for linked open data clearly articulated (though the Electronic Enlightenment project offers to link its data with partner projects).
A third and related paradigm for representing correspondence is what we might think of as the digital union catalog model, which aggregates and assembles metadata, and potentially also content, from a large range of archival sources and institutions. The foremost example here is the Early Modern Letters Online project (EMLO), which has expanded beyond its initial core collection at the Bodleian Library to encompass eight collections and a range of data ingestion, display, and editing tools for correspondence and has plans to extend its model through a pan-European network over the next six years (see Reassembling). EMLO has become our best model for a large-scale union catalog of correspondence, one that has ambitions to unite a very large set of interlinking correspondences and authors in an easily accessible, information-rich, aesthetically pleasing interface. The digital union catalog model can be limited to the assembly of rich, linked metadata alone, a particularly important possibility for twentieth-century letters, whose contents are embargoed by copyright restrictions or privacy concerns.
Digital Editions of Letters: Standards for Preservation, Access, and Dissemination
Best practices for ensuring access, dissemination, and accurate preservation of letters have been developed primarily in the community of scholars using the TEI guidelines. Usable forms of letter data and metadata can be produced in many contexts. EMLO has developed a range of ingestion approaches for structured data in many formats, from Excel spreadsheets to Access databases. The original entry point for systematic data can thus be a spreadsheet, a FileMaker form, a Zotero bibliography, EMLO’s custom-built EMLO-Edit tool, or a TEI document edited in the oXygen XML Editor. Those data, however, will be usable only if they are well-structured, consistent, and cleanly entered (i.e., not the product of faulty OCR processes). TEI has the advantage of foregrounding the issues of structure, standardization, documentation, and validity: an individual TEI document can be validated during production to see if it conforms to the data standards used by any given project. Though TEI has a somewhat daunting initial learning curve for scholars more interested in letters than in digital archival preservation, other data formats are to some extent translatable to TEI if produced with standard documentation, vocabularies, and name authority control.
The reason to begin with standards, practices, and documentation rather than with a fixed structural language is that much available letter data are now entered in the special collections of libraries in the form of MARC (machine-readable cataloging) data and encoded archival description (EAD) records, or in online collections in a variety of other forms. If we begin by assuming that all letters will be entered under the guidance of scholars or archivists trained in textual studies and TEI, we will exclude potential collaborators and institutions in the community of digital humanities librarians, archivists, and museum specialists. Metadata should be made accessible and searchable for collections that remain largely archival and analog, not just for those that have been thoroughly digitized. Nevertheless, documenting and disseminating standards are needed by the digital scholarly infrastructure, which, although still evolving, possesses a definite history.
General standards for describing letters are now coming into focus. The TEI Special Interest Group on Correspondence has finalized the new <correspDesc> TEI element, providing some clarity on the standard fields for correspondence description (“Correspondence Description”). Their standards separate the description of a letter into two divisions, the event of sending a letter (<correspAction>) and the social context around that letter, including other related correspondence (<correspContext>). This ideal type of description allows the editor to encode not only the location, date, author, and recipient of the letter but also the correspondence that led to the letter and the correspondence that is a response to it.5
The specific standards discussed here are in line with the TEI <correspDesc> standards and with the lean metadata requirements of the Advanced Research Consortium, an umbrella organization originating in literary studies that will aggregate the project-specific metadata of various period-specific nodes (the Networked Infrastructure for Nineteenth-Century Electronic Scholarship [NINES], 18thConnect, the Medieval Electronic Scholarly Alliance, and other emerging peer-review and aggregation sites for modernism and early modern studies). Readers concerned with archival description should look to the standards of the DALF group, which go considerably beyond the scope of my discussion. The audience for these standards is imagined to be not just scholars and editors of letters but also librarians holding manuscripts in special collections and teachers interested in producing original scholarship in a graduate or undergraduate class setting.
Necessary Metadata Fields
¶ 27 Leave a comment on paragraph 27 0 Metadata is data about data: “date,” for example, is a metadata field that tells us how to read “1810-05-07.” Though one needs to employ TEI markup to create a scholarly edition of letters, it takes only a small set of metadata fields to aggregate and disseminate information about letters. This minimal set is a good place to begin, for three reasons. First, it requires little training, and that training can be integrated into even an undergraduate class curriculum. Second, the minimal units necessary for aggregation and access point toward the larger social purposes of letter data as an object of scholarly interest rather than as a peripheral offering of a good critical edition. Third, objects correctly described in terms of metadata allow us to produce more accurate indexes, better search mechanisms, and more legible transcriptions. Metadata entries themselves should conform to the standard vocabularies and name control authorities given below, derived largely from the practices of the archivist and library community.
The minimal metadata fields required for peer review by the Advanced Research Consortium (ARC) and its aggregation software, Collex, are described on the ARC wiki and explained at more length in the examples (Submitting). They employ standard frameworks, such as Dublin Core description, when possible, and have developed mostly from the pioneering work at NINES. The metadata fields are represented here by elements in angled brackets, in accordance with the requirements of XML, so that “<dc:title>Letters of Virginia Woolf</dc:title>,” for example, indicates the standard Dublin Core metadata field “title of object.” The TEI <correspDesc> elements are in the same XML format. The elements are united and converted for aggregation through a resource description framework (RDF) model, as discussed below. What the elements mean and how they are standardized are more important than the specific instantiation in XML or otherwise.
There are eleven standard metadata fields required for aggregation in ARC. Nodes like NINES, MESA, and 18thConnect may have their own additional required fields for peer review. Passing over the fields employed by ARC for aggregation (<collex:federation>), we can briefly note the meaning of the required fields. The title (<dc:title>) refers to the title of the object described; the type (<dc:type>) describes the medium or form (e.g., a signed typescript or manuscript, often signaled through an abbreviation like TLS or MS). The date can be recorded as a year, range of years, or a possible year. For both ARC and TEI, this should have an entry in the form YYYY-MM-DD. The role of author is required (<role:AUT>); for the purposes of letters, we suggest that the roles of editor and publisher be listed as well, when appropriate. Editor would refer to the party responsible for creating the representation of the letter’s text, not necessarily the party remediating that text in digital form. One or more disciplines related to the object are required (e.g., literature, book history, history), listed under the element <collex:discipline>, though these could be added later in the process. The genre of the object (<collex:genre>) will usually be correspondence, but other genres, such as collection and translation, are available.
<correspDesc>
  <correspAction type="sent">
    <persName>Carl Maria von Weber</persName>
    <settlement>Dresden</settlement>
    <date when="1817-06-23">23 June 1817</date>
  </correspAction>
  <correspAction type="deliveredTo">
    <persName>Louis de La Foye</persName>
    <placeName>Caen</placeName>
    <date>unknown</date>
  </correspAction>
</correspDesc>
In this form of description, we have two specific “actions” (sending and receiving) wrapped in one letter description. The first action is the sending of a letter by Carl Maria von Weber from Dresden on 23 June 1817. The second action is the reception of that letter by Louis de La Foye at Caen on an unknown date. Note that these fields do not necessarily describe the physical object itself or the precise way in which the author addresses his recipient. Instead, the person name field (<persName>) is standardized, as are the place and date of address. Several persons may be listed as authors or recipients of any one letter.
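To make the two "actions" concrete, here is a minimal Python sketch that reads the Weber description above and pulls out the sender, recipient, places, and machine-readable date. It is an illustration under stated assumptions, not part of the TEI or ARC tooling: the function name `read_actions` and the list `PLACE_TAGS` are hypothetical, and the snippet is parsed without the TEI namespace that a full TEI document would declare.

```python
import xml.etree.ElementTree as ET

# The correspondence description quoted in the essay, without a TEI namespace.
TEI_SNIPPET = """
<correspDesc>
  <correspAction type="sent">
    <persName>Carl Maria von Weber</persName>
    <settlement>Dresden</settlement>
    <date when="1817-06-23">23 June 1817</date>
  </correspAction>
  <correspAction type="deliveredTo">
    <persName>Louis de La Foye</persName>
    <placeName>Caen</placeName>
    <date>unknown</date>
  </correspAction>
</correspDesc>
"""

# Hypothetical helper list: the example uses two different place elements.
PLACE_TAGS = ("settlement", "placeName")


def read_actions(xml_text):
    """Return one dictionary per <correspAction>, keyed by its type attribute."""
    root = ET.fromstring(xml_text)
    actions = {}
    for action in root.iter("correspAction"):
        date = action.find("date")
        # Take the first place element that is present, if any.
        place = next(
            (action.findtext(tag) for tag in PLACE_TAGS
             if action.find(tag) is not None),
            None,
        )
        actions[action.get("type")] = {
            # Several persons may be listed for any one action.
            "persons": [p.text for p in action.findall("persName")],
            "place": place,
            # Only the standardized @when value counts as a usable date.
            "date": date.get("when") if date is not None else None,
        }
    return actions


actions = read_actions(TEI_SNIPPET)
```

The unknown delivery date comes back as `None` because the second `<date>` carries no `when` attribute, which keeps the human-readable note ("unknown") separate from the standardized value.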
The other mandatory elements are specific to ARC aggregation but are needed in some form for any project that wants to disseminate data in the long term. Thus the archive element (<collex:archive>) refers to the originating project (e.g., Rossetti project) and the federation element (<collex:federation>) refers to the peer-review node to which the project belongs (e.g., NINES, 18thConnect, ModNets). The freeculture element (<collex:freeculture>) indicates whether the following text is open-sourced, open for reuse, or protected under copyright; the default value is TRUE. Aggregation also requires a standard uniform resource identifier (URI), usually the location of a file on a Web site (if already hosted online), which distinguishes among objects in a given project. The document URI need not be meaningful. Some projects (e.g., the Rossetti Archive and the Walt Whitman Archive) have found that making such document URIs intelligible creates more problems than it solves. Other projects, such as the one for the Southey letters, use URIs that are descriptive of the letter they refer to (e.g., southey_letters_letterEEd.26.1729.xml).
¶ 33 Leave a comment on paragraph 33 0 For letters, a few metadata fields beyond the standard aggregation elements are required. I have already mentioned the roles of editor and publisher, necessary when information from already published letters is aggregated. The most important additional element is the recipient (recipients) or addressee (addressees), as in the TEI example above. The location of both, if available, is optional but may be of interest to many audiences. The archival location and standard call number, if the letter is located in a library or archive, are required fields (e.g., UVa Special Collections; MSS 6251—6251-bn Box 1) for archival materials, and metadata from published editions should include the volume and page range of the original (e.g., vol. 1, pp. 35–36). Again, the specific name of these metadata fields is less important than their presence; the form of entries in this field should be in accord with a standard vocabulary, if one is available. For projects more archival and descriptive, the full set of comprehensive metadata elements developed by the DALF project allows for preservation of bibliographic, annotative, and physical data (Vanhoutte and Van den Branden). For most scholarly projects on correspondence, however, the minimal set of elements will likely suffice: author, recipient, date, place of author, place of recipient, archival location, and type (autograph manuscript, typescript, telegram, etc.). The title for the letter, in most collections, derives from the author, recipient, and date (e.g., Henry James to James Whistler, 1868-05-27). Not all collections will have this information available, but it is considered minimal for the purposes of formal description.
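The minimal set of elements can be sketched as a small record type. The Python below is only an illustration: the class name `LetterRecord`, its field names, and the shelfmark "Example Library; MS 1" are hypothetical and do not come from ARC, TEI, or DALF. It derives the title from author, recipient, and date, and it checks only the full YYYY-MM-DD form of the date, so a project that records bare years or ranges would need a looser check.

```python
import datetime
import re
from dataclasses import dataclass
from typing import List, Optional

ISO_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def is_iso_date(value: str) -> bool:
    """True only for a real calendar date written as YYYY-MM-DD."""
    match = ISO_DATE.fullmatch(value)
    if match is None:
        return False
    try:
        datetime.date(*map(int, match.groups()))
    except ValueError:  # e.g. month 13 or day 31 in a 30-day month
        return False
    return True


@dataclass
class LetterRecord:
    """Hypothetical record holding the minimal fields named in the essay."""
    author: str
    recipient: str
    sent_date: str
    archival_location: str
    letter_type: str
    author_place: Optional[str] = None      # optional, per the essay
    recipient_place: Optional[str] = None   # optional, per the essay

    def title(self) -> str:
        # Title derived from author, recipient, and date.
        return f"{self.author} to {self.recipient}, {self.sent_date}"

    def problems(self) -> List[str]:
        issues = []
        for name in ("author", "recipient", "sent_date",
                     "archival_location", "letter_type"):
            if not getattr(self, name).strip():
                issues.append(f"missing {name}")
        if self.sent_date.strip() and not is_iso_date(self.sent_date):
            issues.append("date is not in YYYY-MM-DD form")
        return issues


good = LetterRecord("Henry James", "James Whistler", "1868-05-27",
                    "Example Library; MS 1", "autograph manuscript")
bad = LetterRecord("Henry James", "", "27 May 1868",
                   "Example Library; MS 1", "autograph manuscript")
```

A check of this kind could run at data entry, whether the entry point is a spreadsheet export or a TEI header, so that incomplete records are flagged before aggregation.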
A brief note here on the data model implied by metadata standards. All complex digital data have a specific structural model that tells machines (and human beings) how to read and use them: the suffixes .html, .doc, and .txt designate the most familiar structures for reading data. The minimal fields required for ARC metadata ingestion are described through an RDF model, a semantic network way of describing data that differs from a hierarchical or document-based model (XML) as well as from a relational database model (MySQL or other approaches reliant on primary keys). Those interested in understanding the differences among these models and their approaches in detail can consult useful tutorials (e.g., Tutorial 1) and books; RDF and the semantic web are described in Semantic Web for the Working Ontologist (Allemang and Hendler). The specific technologies will change, but the essential structural distinctions have a good deal of relevance for the study of letters. XML is a hierarchical textual data format, useful for describing the rich data contained in documents (TEI is a variety of XML). MySQL depends on the relational database model, in which data are conceived as separable tables containing a series of relations in row-and-column form. RDF describes graph database relations in which no necessary hierarchy or primary keys exist. Rather, objects are conceived as existing in a variety of semantic relations (Fitzgerald is a FriendOf Hemingway; some of Fitzgerald's letters HaveRecipient Hemingway). As these examples show, RDF description occurs in a series of subjects, predicates, and objects, along the lines of natural language predication, as in English.
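The subject-predicate-object pattern can be shown in a few lines of Python. This is a toy sketch, not an RDF library: the letter identifiers (`letter-001`, `letter-002`), the predicates `HasAuthor` and `HasRecipient`, and the helper `match` are all hypothetical, and real RDF would name each subject, predicate, and object with a URI instead of a bare string.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    """One statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# A tiny graph. No statement is the "parent" of another and there is no
# primary key; each triple is simply one more relation.
GRAPH = [
    Triple("Fitzgerald", "FriendOf", "Hemingway"),
    Triple("letter-001", "HasAuthor", "Fitzgerald"),
    Triple("letter-001", "HasRecipient", "Hemingway"),
    Triple("letter-002", "HasAuthor", "Hemingway"),
    Triple("letter-002", "HasRecipient", "Fitzgerald"),
]


def match(graph: List[Triple],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return the triples that agree with every part that is not None."""
    return [
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Which letters have Hemingway as recipient?
to_hemingway = [t.subject
                for t in match(GRAPH, predicate="HasRecipient", obj="Hemingway")]
# Everything the graph says about one letter.
about_letter_2 = match(GRAPH, subject="letter-002")
```

Because a query is just a pattern with some parts left open, new kinds of relation can be added to the graph without redesigning tables or restructuring a document hierarchy.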
¶ 35 Leave a comment on paragraph 35 0 Relations, as Henry James says in his preface to Roderick Hudson, stop nowhere, “. . . and the exquisite problem of the artist is eternally but to draw, by a geometry of his own, the circle within which they shall happily appear to do so.” What is true of fiction counts for metadata as well. Each project interested in aggregation must decide what RDF relations are allowed and what standard vocabularies (or data ontologies) will be used, a decision formalized in the RDF schema of the project (and appended at the top of every document in that project). Even for those projects not employing TEI encoding for a formal edition, the concept and documentation of a standard vocabulary is essential, because such a vocabulary is needed to permanently record, preserve, and disseminate digital letters.
Standard Vocabularies and Authority Control
The problem with human relations is not only that they stop nowhere but also that they are impossible to fix, define, and unify in a simple way—thus the pleasures and equivocations of ambiguous Jamesian sociability in a novel like The Wings of the Dove (one of his many novels that hinge on the contents of a letter). Machines, alas, require more strict determinations for their forms of procedural reading. Standard vocabularies and authority controls, the tools of information science and library science, allow us to use a term that is “consistently, uniquely, and unambiguously” assigned to particular people, places, subjects, and agents (Cataloguing). Rose is a rose is a rose is a rose, as Gertrude Stein said, but Gertrude Stein is not Gertie is not Stein, Gertrude, 1876–1946. That any given name will appear in multiple forms across different materials is a problem that can to some extent be resolved after data entry with tools like OpenRefine. If a standard way of referring to sources is consistently used and documented, our work is much more likely to be preserved in a useful form.
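A minimal Python sketch of what authority control does to name variants follows. It assumes a small, locally maintained table of variants; the names `normalize_key`, `VARIANTS`, and `authorize` are hypothetical, the mapping of "Gertie" to Stein is a project-level assumption made for illustration, and no call is made to an actual authority service.

```python
import unicodedata

# Authorized heading, in the form quoted in the essay.
AUTHORIZED = "Stein, Gertrude, 1876–1946"


def normalize_key(name: str) -> str:
    """Fold case, Unicode form, and runs of whitespace before lookup."""
    folded = unicodedata.normalize("NFKC", name).casefold()
    return " ".join(folded.split())


# Hypothetical local table: every known variant points to one heading.
VARIANTS = {
    normalize_key(form): AUTHORIZED
    for form in ("Gertrude Stein", "Gertie", "Stein, Gertrude", AUTHORIZED)
}


def authorize(name: str):
    """Return the authorized heading, or None if the variant is unknown."""
    return VARIANTS.get(normalize_key(name))


entered = ["Gertrude  Stein", "gertie", "Stein, Gertrude", "G. Stein"]
resolved = [authorize(name) for name in entered]
# Unknown forms are reported for human review, never guessed at.
unresolved = [name for name in entered if authorize(name) is None]
```

The design choice worth noting is that unmatched forms are surfaced for an editor to decide, which mirrors the kind of review step that cleanup tools support after data entry.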
¶ 37 Leave a comment on paragraph 37 0 In editions prepared with TEI, standard vocabularies and their sources are maintained in the formal header of the document, before the main text. Metadata prepared in other contexts should use these or other documented sources for their vocabulary, where possible. Many standard vocabularies are linked to the required elements already discussed; Dublin Core elements, for example, are standard library metadata fields. Roles are controlled through the Library of Congress MARC codes—for example, “aut” for author. Most important are the names of individuals and places, both of which now can be dynamically consulted during the data-entry process. For works and individuals listed in library systems, the most comprehensive reference point is the Virtual International Authority File, a system that includes foreign names and the Library of Congress name authority files. GeoNames provides place-names through a Web services application programming interface (API) as well as a standard search interface. All these services are available as dynamic Web services in platforms like Collective Access, an open-source museum and library service that allows easy creation of metadata and RDF schemas like those required by the TEI and ARC guidelines.
¶ 38 Leave a comment on paragraph 38 0 Setting up new letters projects that would be in line with these guidelines remains conceptually and practically daunting, even as the means to create and host collections of letters have become far more powerful and accessible. Collaboration and consultation are key here: developing digital letters projects entails working with digital library specialists, hiring Web site specialists for interface and user design, sharing resources on Web hosting and data entry, consulting with experienced digital project managers, and dividing scholarly duties with specialist experts. Experienced editors and technical experts, invaluable here, are easiest to find in institutional contexts like the Cultures of Knowledge project, at digital humanities centers like Nebraska’s Center for Digital Research in the Humanities (and the associated projects focused on Cather and Whitman), and at the various digital humanities training institutions at Maryland, Oxford, Leipzig, and the University of Victoria. Digital Humanities Questions and Answers also provides expert responses to a range of practical questions, along with an archive of vetted responses. On our own Twentieth-Century Literary Letters project blog, we have assembled an evolving annotated bibliography and list of specific resources relevant to digital correspondence. We welcome collaborators interested in specific collections of twentieth-century correspondence, in the development of linked open data for correspondence metadata, and in the process of querying and visualizing correspondence metadata as a form of scholarly analysis and inquiry.
¶ 39 Leave a comment on paragraph 39 0 There are several possible horizons for future scholarly work on letters as objects of study and digital representation. None of them will mean the end of the printed edition of letters or the obsolescence of existing volumes of correspondence. Quite the contrary: developing more extensive digital editions of letters could lead to complementary print editions of selected letters, as the process of creating and printing selected correspondences becomes easier (although outside funding for printed critical editions has become increasingly scarce). The key question is whether the editions we create in any medium will be reliable, readable, and produced according to the standards of the best current exemplars.
¶ 40 Leave a comment on paragraph 40 0 Clearly we must develop digital editions of correspondence according to the TEI guidelines and standardize other edited correspondence in TEI/XML form. As more editions and authors come into the public domain, reliable open-source digital editions will be needed. When encoded with RDF semantic data, they should provide the first steps toward linked open models of correspondence data. In addition, there will continue to be collections of individual and group correspondence for a variety of academic and public audiences. Along with this, we will see a variety of evolving visualization projects that take advantage of the social, geographically specific, and culturally imbricated dimensions of letters, making use of them for the purposes of cultural history, reader-response theory, and the study of material and literary networks.
¶ 41 Leave a comment on paragraph 41 0 Finally, an important area of work will be in union catalogs of letters on the model of EMLO. They will likely become progressively more expansive, inclusive, and multi-institutional in their scope and range. Whether full content will be easily and freely available, or whether it will be largely hidden behind paid subscriber services, the collection of linked metadata will enable new forms of interconnected scholarly arguments and histories. As we move into a new era of interconnected digital editions and groups of correspondence, an important question of permanence and accessibility will be, How do print editions of letters intersect with and complement larger correspondence databases and union catalogs? Essential will be preserving the high production values, depth of annotation, and editorial standards that the best print editions of major correspondence, like recent volumes of letters by Ernest Hemingway and T. S. Eliot, exemplify. New work on digital letters should retain a sense of the value of these well-crafted material objects, even as we explore new ways of representing epistolary ontology. Whether digital or print, letters represent an essential aspect of civilized life for generations of letter writers, an aspect worthy of continued representation in what has become very nearly a postepistolary era.
¶ 42 Leave a comment on paragraph 42 0 1. For the purposes of this essay, let us define “modern letters” as post-seventeenth-century and pre-e-mail, though e-mail presents similar challenges. Early modern letters have been significantly better theorized and studied. Exceptions to the lack of interest in modern letters are discussed below in “Editorial Theory and Practice” and in conjunction with the standards developed by the Digital Archive of Letters in Flanders (DALF) correspondence project. See also Halsband; Berg; Phillips; Kline and Perdue 90–94, 129–32; and Jolly and Stanley.
¶ 45 Leave a comment on paragraph 45 0 4. Editorial Policy Statement, sec. G. For this and other valuable suggestions, I would thank Kenneth Price. Acknowledgments are also due to Marcel Illetschko, Elizabeth Williamson, Anouk Lang, and the participants in the open review process on MLA Commons.
Allemang, Dean, and James Hendler. Semantic Web for the Working Ontologist, Second Edition: Effective Modeling in RDFS and OWL. 2nd ed. Waltham: Morgan, 2011. Print.
Berg, Temma. “Truly Yours: Arranging a Letter Collection.” Eighteenth-Century Life 35.1 (2011): 29–50. Print.
Bolton, Carol, and Ian Packer. “About This Edition.” The Collected Letters of Robert Southey. Romantic Circles, U of Maryland, n.d. Web. 12 Nov. 2015. <http://www.rc.umd.edu/editions/southey_letters/letterEEd.26.about.html>.
Broadway, Jan. Digitizing Correspondence Workshop Report. Centre for Editing Lives and Letters, 10 June 2009. Web. <http://www.livesandletters.ac.uk/downloads/DC_report.pdf>.
Cataloguing Authority Control Policy. Natl. Lib. of Australia, n.d. Web. 15 June 2014. <http://www.nla.gov.au/policy-and-planning/authority-control>.
Corpus Metadata. Circulation of Knowledge and Learned Practices in the Seventeenth-Century Dutch Republic, n.d. Web. 15 June 2014. <http://ckcc.huygens.knaw.nl/?page_id=43>.
“Correspondence Description.” TEI P5: Guidelines for Electronic Text Encoding and Interchange. TEI Consortium, 15 Oct. 2015. Web. 14 Dec. 2015. <http://www.tei-c.org/release/doc/tei-p5-doc/en/html/HD.html#HD44CD>. Version 2.9.1, revision 46ac023.
Diplomatic Correspondence of Thomas Bodley, 1585–1597. Version 5. Ed. Robyn Adams. Centre for Editing Lives and Letters, July 2011. Web. 24 Oct. 2015.
Editorial Policy Statement and Procedures. The Walt Whitman Archive. Ed. Ed Folsom and Kenneth M. Price. Center for Digital Research in the Humanities, U of Nebraska, Lincoln, n.d. Web. 15 Nov. 2014. <http://www.whitmanarchive.org/about/editorial.html>.
Finneran, Richard. “The Collected Letters of W. B. Yeats: A Project in Disarray.” Review 18 (1996): 45–58. Print.
Guidelines for Editors of Scholarly Editions. MLA, 29 June 2011. Web. 12 Nov. 2015. <https://www.mla.org/Resources/Research/Surveys-Reports-and-Other-Documents/Publishing-and-Scholarship/Reports-from-the-MLA-Committee-on-Scholarly-Editions/Guidelines-for-Editors-of-Scholarly-Editions>.
Halsband, Robert. “Editing the Letters of Letter-Writers.” Studies in Bibliography 11 (1958): 25–37. Print.
James, Henry. Prefaces to Volumes of the New York Edition. Henryjames.org, 13 Oct. 2013. Web. 24 Oct. 2015. <http://www.henryjames.org.uk/prefaces/home.htm>.
Jolly, Margaretta, and Liz Stanley. “Letters As/Not a Genre.” Life Writing 2.2 (2005): 75–101. Print.
Kelemen, Erick. Textual Editing and Criticism: An Introduction. New York: Norton, 2009. Print.
Kline, Mary-Jo, and Susan Holbrook Perdue. A Guide to Documentary Editing. 3rd ed. Charlottesville: U of Virginia P, 2008. Print.
Lawrence, D. H. The Letters of D. H. Lawrence. Volume 1: 1901–1913. Ed. James T. Boulton. Cambridge: Cambridge UP, 1979. Print.
Phillips, Siobhan. “Elizabeth Bishop and the Ethics of Correspondence.” Modernism/Modernity 19.2 (2012): 343–63. Print.
Pierazzo, Elena. “Digital Documentary Editions and the Others.” Scholarly Editing 35 (2014): n. pag. Web. 27 Nov. 2014. <http://www.scholarlyediting.org/2014/essays/essay.pierazzo.html>.
———. “A Rationale of Digital Documentary Editions.” Literary and Linguistic Computing 26.4 (2011): 463–77. Print.
Price, Kenneth M. “Collaborative Work and the Conditions for American Literary Scholarship in a Digital Age.” The American Literature Scholar in the Digital Age. Ed. Amy E. Earhart and Andrew Jewell. Ann Arbor: U of Michigan P; U of Michigan Lib., 2011. 9–26. Print.
Reassembling the Republic of Letters, 1500–1800: A Digital Framework for Multi-lateral Collaboration on Europe’s Intellectual History. European Cooperation in Science and Technology, 16 Nov. 2013. Web. 18 Nov. 2015. <http://www.cost.eu/COST_Actions/isch/IS1310>.
Robinson, Peter. “Towards a Theory of Digital Editions.” Variants 10 (2013): 105–32. Print.
Rosenberg, Bob. “Documentary Editing.” Electronic Textual Editing. Ed. Lou Burnard, Katherine O’Brien O’Keeffe, John Unsworth, and G. Thomas Tanselle. New York: MLA, 2006. 92–104. Print.
Southey, Robert. The Collected Letters of Robert Southey. Ed. Lynda Pratt, Tim Fulford, and Ian Packer. Romantic Circles, 1 Feb. 2009. Web. 27 Nov. 2014.
Submitting RDF. Advanced Research Consortium, 17 Sept. 2013. Web. 15 Nov. 2014. <http://wiki.collex.org/index.php/Submitting_RDF>.
Tutorial 1: Introducing Graph Data. LinkedDataTools.com, n.d. Web. 18 Nov. 2015. <http://www.linkeddatatools.com/introducing-rdf>.
Vanhoutte, Edward, and Ron Van den Branden, eds. DALF Guidelines for the Description and Encoding of Modern Correspondence Material, Version 1.0. Gent: Centrum voor Teksteditie en Bronnenstudie, 4 May 2005. Web. 15 June 2014. <http://www.kantl.be/ctb/project/dalf/dalfdoc/>.
Walpole, Horace. The Yale Edition of Horace Walpole’s Correspondence. 48 vols. Ed. W. S. Lewis. New Haven: Yale UP, 1937–83. Print.
Woolf, Virginia. The Letters of Virginia Woolf. Ed. Nigel Nicolson and Joanne Trautmann Banks. Vol. 1. London: Hogarth, 1975. Print.