An Evolving Anthology

Correspondence: Theory, Practice, and Horizons

Gabriel Hankins, Clemson University

This draft essay has been chosen by the editors for open review. You are invited to read this draft and comment on it. The editors may then ask the authors to revise their work in the light of your comments, with the goal of eventually including the revised essay in the anthology.

I. Introduction: Digital Representations of Letters

Of the primary forms of evidence studied and created by digital literary scholars across all fields and periods (letters, diaries, archival documents, editions), modern letters have been among the least theorized.1 This is despite a wide range of examples and practices of digital letter collections and editions, from exhibits of an individual correspondence (e.g. the Darwin Correspondence Project) to substantial digital archives (like the Whitman Archive) to major union catalogue and visualization efforts (such as Early Modern Letters Online and the Mapping the Republic of Letters project, discussed below). In a period of digital scholarship making the turn towards increasingly collaborative, open, and social paradigms like those proposed by Alan Liu and Kenneth Price, letters offer us an important test case for theorizing the larger disciplinary and institutional changes concomitant with the move to digital infrastructures for communication and publication. Letters have properties of immediate interest for the next wave of digital scholarly communication. First, their sources and physical locations are usually far-flung: the letters of D. H. Lawrence, to take a typical example, are located in more than twenty-six archival and personal collections (xx-xxi). The communities of interest for letters are often equally dispersed, with archivists, special collections, scholars, auction-houses, and book-dealers all concerned with identification, aggregation, and preservation of ambiguously related documents. Moreover, printed editions of letters are typically incomplete, unable to contain all the responses, associated letters, private letters, telegrams, inscriptions, and associated social documents that might concern a given scholarly community. Letters thus clearly defy the scholarly dream of a “definitive” print edition, as reviews of “collected letters” have long noted (Finneran). The editors of the quite extensive Letters of Virginia Woolf note that they are unable to print Woolf’s voluminous archive of party invitations and acceptances, for example, though these would surely interest readers of Mrs. Dalloway (ix).

Most importantly, the primary form in which we now study literary correspondence essentially reduces and thus misrepresents the nature of a written letter, what we might think of as its specific epistolary ontology. A letter is (almost) always written to someone(s) and from someone(s), and is therefore inherently a relational form, connecting persons, groups, institutions, and political bodies; yet we usually study letters in bound books, organized by their reference to a proper authorial name (“The Letters of Samuel Johnson”). Unlike that of novels or volumes of poems, the phenomenological and bibliographic experience of reading a letter necessarily differs from the way scholars read and represent letters in codex form. Literary correspondence is usually written within some larger social circle, a group of friends and writers all talking and writing about shared concerns, subjects, and historical events (“The Shelley Circle,” the “Bloomsbury Group”). Letters are written in response to other letters, or call for a response in return, or designate a limit that cannot be crossed (we think of Kafka’s final letter to Max Brod); none of these relations can be completely addressed in conventional editions of letters, though editorial footnotes and supplements serve to bridge the gaps.

For all these reasons, the digital representation of correspondence offers us the chance not just to newly remediate a central part of our cultural inheritance, but also to begin to do justice to the larger social fields within which letters were written, and thereby better represent the essentially social dimension of epistolary thinking. We can do this without losing track of the materiality of letters as artifacts: as with all editions, digital letters will always be “surrogates” of some otherwise inaccessible document, in a form that affords the chance for interrogation, aggregation, and preservation. Digital collections of letters thus offer the chance for a systematic, rigorous, and progressively more accurate representation of their originals through facsimiles, transcription, or a combination of those two practices, while not obviating the need either for good reading editions or good archival research practices. We also have the opportunity to build collections of letter metadata that are supplementary and additive to existing print editions of standard correspondence, editions that will remain essential tools. Essential for both digital editions and collections of metadata, however, will be shared standards, practices, and platforms for the representation, preservation, and communication of digital forms of literary correspondence. In the following I first briefly outline the history of editorial theory and practice with regard to modern editions of letters; I then discuss the best current practices and standards for representing and sharing digital collections of letters. Finally, I point towards new horizons for the scholarly knowledge infrastructure concerned with literary correspondence, and towards the project of interconnecting disparate yet linked collections of letters.

II. Editing Editions of Letters: Editorial Theory and Practice

Once we begin representing letters, we are engaged in an editorial task. Briefly reviewing the theory and practice of editors of letters is thus requisite and instructive. Theoretical questions concerning the editing, indexing, and representation of literary letters have most often been taken up by practicing editors, as we would expect. Since the practice of editing letters (and the commissions from publishers) often begins long before the stage of printed theoretical reflection, modern theoretical work on editions of letters is almost inseparable from the specific considerations entailed in editing particular authors. Robert Halsband’s often-cited essay on “Editing the Letters of Letter-Writers,” for example, derives many of its proscriptions and procedures from the highly literary letters of Lady Mary Wortley Montagu (1689–1762), while setting out more general problems of editing letters in an admirably transparent, well-exemplified discussion. The theory and practice of editing letters thus tend to converge, with the editions advancing explicit or implicit theories of epistolary representation, and the theoretical articles often supported by a single exemplary case. Nevertheless, a set of common questions, concerns, and problems can be traced in the modern literature, if the latter is seen more on the lines of a canon of common-law precedents than a formally developed set of prescriptive positions. Though full exposition and evaluation of the wider range of editorial issues involved in literary correspondence must wait, we can begin to outline the issues relevant for digital editions of literary correspondence in particular.

The primary distinction of twentieth-century editions of letters, in the view of the mid-century editor Robert Halsband, was that rather than producing editions oriented toward “a reading public who bought and read,” as did the nineteenth-century editors, representations of letters were constructed for a specialist scholarly audience (125). This entails a fundamental change in editorial approach: “In our view letters have become ‘documents’; and the editor, instead of presenting a literary work, is setting up an archive” (ibid.). This shift from the literary reading of letters to a scholarly approach to documents elicits a certain nostalgia in Halsband, as the editor of a writer (Lady Mary Wortley Montagu) known mainly for her letters, and one who specialized in “the letter as a genre” (127). The distinction between an approach to editing letters oriented towards a larger reading (and browsing) public, and a “critical” approach oriented towards specialist scholars still applies today in the division between digital correspondences oriented towards public exhibition (still the majority, as discussed below in “Paradigms for Digital Correspondence”) and those edited according to the protocols of documentary or “critical” editorial theory.

Despite that opening note of nostalgia, Halsband’s extensive summary of mid-twentieth century editorial practices both surveys an enormous diversity of editorial approaches to presentation, selection, transcription, and normalization and attests to a growing consensus as to audience and goals: letters are to be represented on the analogy of “documents” within an “archive” (rather than “texts” within a “work”), with facticity, accuracy, and completeness the priorities of its scholarly audience. Most twentieth-century editions of letters, whether produced by scholars in the traditions of documentary editing or “critical editing” (if we mean by the latter the so-called Greg-Bowers-Tanselle line), have aimed at editions of documents, not of authorial works, though the majority take a primary author as the basis of their selection. The key task of most critical editions, determining the authority of variant texts of a work, is thus usually not an issue for editions of letters, as the text of most letters exists in only one version.2 As documentary editors know, a series of editorial decisions remains: any edition of letters offers a representation of some surrogate document(s), in a form which aims to balance intelligibility and fidelity to a distant (sometimes lost) material artifact. The editorial choices to be made extend beyond the requirements of documentary fidelity with regard to selection or transcription, the most significant initial issues. This is true of digital as well as print editions of letters, neither of which transparently stands in for the original artifact (though linked facsimile images and transcripts come closer to this goal).

More recently, twenty-first century editors of digital letters have moved towards what Elena Pierazzo calls the “digital documentary edition” (DDE), prompted in part by the clear advances afforded documentary editions in a digital environment. Editors at the William Blake Archive, the Walt Whitman Archive and elsewhere seek to “establish documentary texts,” rather than to reconstruct authorial intentions (Walt Whitman Archive, “Methodology and Standards”), a stance that seems particularly appropriate to documents never intended for revision and publication. Pierazzo defines the digital documentary edition broadly, as “the recording of as many features of the original document as are considered meaningful by the editors, displayed in all the ways the editors consider useful for the readers, including all the tools necessary to achieve such a purpose.”3 For Pierazzo, this definition includes the “digital infrastructure (visible to the final user or not) necessary for the publication and exploitation of such content,” so that editors are potentially responsible for the many possible forms of presentation enabled by new codes and scripts (usually XML encoding and XSLT transformations): “DDEs can assume the form of diplomatic, ultra-diplomatic, semi-diplomatic, or reading editions on demand, thanks to digital … delivery of an appropriately encoded text” (3). In addition, digital documentary editions can combine facsimile images with facing editorial transcriptions, in a way that allows the reader to potentially become part of the improvement of a textual transcription.

Some in the critical editing tradition have worried about the potential for digital documentary editing to distance the editor from the reader of the edition, through an exclusive emphasis on the level of the artifact and document.4 As Pierazzo notes above, however, for standardized texts there is no necessary opposition between a clear-text reading edition and a complexly annotated diplomatic edition: both are potential forms of the same underlying text. To fully embrace these new possibilities, however, digital documentary editors require a range of technical skills and collaborative engagements. The question of producing a “reading” text still requires specific editorial decisions about methods of transcription, error-checking, presentation, and standardization, questions addressed in both the documentary editing tradition associated with historians and the critical editing tradition associated with literary studies. Given the need for interdisciplinary collaboration within the digital humanities more generally, a convergence between the critical editing and historical editing traditions is highly desirable with regard to standards and tools, if not editorial frameworks. Literary scholars interested in digital editions of letters must learn to actively collaborate with historians, archivists, and library-science specialists if we are to produce solid, long-lasting editions, union catalogues, and digital archives.

Transcription and presentation are the primary editorial problems for both the documentary and critical editing traditions. The central problem for most letters, according to Halsband (working in the Greg-Bowers tradition), is “the fact that the copy-text is a manuscript,” and thus the editor has to decide whether to produce a literal (or diplomatic) transcription, however unclear, or whether to normalize spelling and capitalization, expand abbreviations, and clarify stenographic symbols and shorthand (130-132). For digital documentary editors, as Pierazzo indicates, some of the difficulties associated with transcription methods can be avoided by producing both literal transcriptions and clear reading texts which can be displayed at the reader’s choice. Too much choice, however, can lead to exhaustion and lack of clarity, in either print or digital editions. Editors of digital letters have developed a range of pragmatic approaches to the question of transcription, all of which address the tension between clear presentation to the reader and accurate representation of a documentary text. The Whitman Archive aims to present its digital letters as “an inclusive text representing as nearly as possible a clean, reading version of the letter,” without recording deletions, insertions, authorial “metacommentary” such as “(over),” or the physical layout of the holographs.5 The editors envision an updated future version in which deletions and insertions would be recorded, and two different methods of viewing the text: as a “clean” reading text or as a diplomatic transcription (ibid.). The editors of the Collected Letters of Robert Southey, on the other hand, employ a range of standard editorial conventions to represent deletions, illegible characters, or difficulties with the manuscripts.6 Names and places are hyperlinked to standard entries, and inserted editorial notes provide essential context throughout, somewhat at the expense of an uninterrupted reading experience. The editors of the Diplomatic Correspondence of Thomas Bodley, 1585–1597 have encoded their letters so that they may be read according to “varying degrees of diplomatic or normalized transcription,” depending on selections made by the reader. These examples show the range of current digital editorial practice, as well as a continuing move towards the embrace of multiple presentation styles, for the purposes of multiple audiences, based on a single underlying root document.
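The principle of multiple presentations from a single root document can be illustrated with a minimal sketch. It assumes a simplified, namespace-free fragment that borrows the TEI element names `del`, `add`, and `choice` (with `orig` and `reg`); the sample sentence, the bracket and caret display conventions, and the function names are all invented for illustration and do not reproduce the practice of any project discussed here. Production editions would more typically use XSLT for this step.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified TEI-like fragment (no namespace, invented text).
SAMPLE = (
    "<p>I <del>recieved</del><add>received</add> your "
    "<choice><orig>lettre</orig><reg>letter</reg></choice> on Tuesday.</p>"
)


def render(elem, mode):
    """Return the text content of elem as a 'diplomatic' or 'reading' view."""
    if mode not in ("diplomatic", "reading"):
        raise ValueError(f"unknown mode: {mode}")
    parts = [elem.text or ""]
    for child in elem:
        parts.append(render_child(child, mode))
        # Text following the child belongs to the parent's running text.
        parts.append(child.tail or "")
    return "".join(parts)


def render_child(child, mode):
    """Apply the display rule for one encoded feature."""
    if child.tag == "del":
        # Deletions: bracketed in the diplomatic view, dropped when reading.
        return f"[{render(child, mode)}]" if mode == "diplomatic" else ""
    if child.tag == "add":
        # Insertions: marked with carets diplomatically, silent when reading.
        inner = render(child, mode)
        return f"^{inner}^" if mode == "diplomatic" else inner
    if child.tag == "choice":
        # Original spelling for the diplomatic view, regularized otherwise.
        picked = child.find("orig" if mode == "diplomatic" else "reg")
        return render(picked, mode) if picked is not None else ""
    return render(child, mode)


root = ET.fromstring(SAMPLE)
print(render(root, "diplomatic"))
print(render(root, "reading"))
```

Both views are derived from the same encoded sentence, so a correction to the transcription is made once and propagates to every presentation.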

Regularization, normalization, and silent emendation are a secondary set of linked problems for any documentary edition. Kline and Perdue discuss these issues under the heading of the “inclusive text,” the preferred middle ground for many scholarly editors of twentieth-century letters.7 In an inclusive text (or “expanded transcription”), an editor aims at some median range between the full diplomatic transcription of all details of a document (through textual symbols and footnotes), and a “reading text” in which all editorial insertions are minimized. As Kline and Perdue note, the Center for Editions of American Authors (CEAA), the predecessor body to the Committee on Scholarly Editions (CSE), specifically names “manuscript letters or journals or notebooks” as appropriate objects for inclusive texts in its Statement of Editorial Principles of 1972 (9). In the ideal version of the inclusive text, most editorial changes and emendations would be recorded in some way, either through a “record of emendations” separated from the reading text, or through an editorial statement on the transcription policy. In practice, few twentieth-century printed editions of letters came close to this ideal, with all silently emending malformed letters and punctuation, most regularizing the placement of postscripts and addresses, and some omitting major features of the document, such as cancelled passages. A digital editing environment offers more flexibility here, though the fundamental tension between a readable text and accurate transcription remains. The editors of the Southey letters note that “Southey’s original spelling, and mis-spelling, grammar and punctuation has been retained,” and solve the problem of accurate indexing through the hyperlinks to standardized entries for people and places noted above. Whatever forms of transcription and regularization are chosen, a documented “rule of transcription” like that employed by the editors of the print edition of the Letters of D. H. Lawrence, or a statement of editorial policy on regularization, allows to some degree for the critical reconstruction of the original manuscript, even in the absence of facsimile images (xviii-xix). Technical mediation of transcriptions and facsimile images calls for a description of editorial policy as well. Description of the specific procedures for textual encoding, optical character recognition methods, and proofreading procedures would be as welcome in digital as in print editions.8 The questions recorded in the Committee on Scholarly Editions’ Guidelines for Scholarly Editions pertaining to electronic editions are useful points of consideration here as well.9

A third major editorial issue for any correspondence project is the selection of letters: whether the letters to a given writer should be represented alongside the letters from his or her correspondents, or alternatively whether a broader circle of letter-writers should be represented (e.g. the “Shelley Circle”). As noted above, the digital editing environment allows us to aim at a more fully social notion of correspondence, but this departs from the norm for the last century of editors. Halsband chose to present only the letters of Lady Montagu herself, with citations from the replies, where relevant, in annotations; the forty-three volume “Correspondence” of Horace Walpole, organized by selected recipients, presents both sides of a variety of conversations (Halsband 128-129). The Walt Whitman Archive has decided to edit incoming, outgoing, familial, and also institutional or scribal correspondence (that performed by Whitman while working as a clerk in the US Attorney General’s office, for example).10 Digital archives of correspondence offer an obvious theoretical advantage here in terms of breadth and flexibility: a given “correspondence” (of Ezra Pound, for example) could be exhaustive at the level of internal content (containing records of all letters sent and received by Pound) and selective in its external presentation (allowing individual “reading editions” of Pound’s correspondence with individual figures to be produced through database queries and script transformations). Nor are digital scholarly editions exclusive of print reading editions: one could imagine a printed volume of a particular correspondence selected from a larger digital archive and printed on demand for classroom purposes.
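The idea of an archive that is exhaustive internally but selective in presentation can be sketched as a simple query over letter records. The record fields, the placeholder names (“Writer A” and so on), and the `reading_edition` function are hypothetical; a real project would run an equivalent query against its own database or XML store.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Letter:
    letter_id: str
    sender: str
    recipient: str
    sent: date


def reading_edition(letters, author, correspondent):
    """Select both sides of one exchange, in chronological order."""
    pair = {author, correspondent}
    chosen = [l for l in letters if {l.sender, l.recipient} == pair]
    return sorted(chosen, key=lambda l: l.sent)


# Invented placeholder records standing in for a full archive.
ARCHIVE = [
    Letter("L003", "Writer A", "Writer B", date(1920, 5, 2)),
    Letter("L001", "Writer B", "Writer A", date(1920, 4, 10)),
    Letter("L002", "Writer A", "Writer C", date(1920, 4, 20)),
]

for letter in reading_edition(ARCHIVE, "Writer A", "Writer B"):
    print(letter.letter_id, letter.sender, "->", letter.recipient, letter.sent)
```

The archive itself is untouched by the selection, so any number of such reading editions can be generated from the same records.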

Incompleteness is the inevitable corollary of editorial selection. Though large editions of “complete correspondence” make bibliographic and rhetorical arguments for their own perfection, every correspondence is “selected” in some way, in a more or less explicit form, as Richard Finneran’s acute review of the Oxford Collected Letters of W. B. Yeats argues at length. Editors overlook manuscripts, miss private collections, and omit items “because they add nothing to our knowledge of Virginia Woolf,” as in the case of her omitted party invitations (ix). Inevitably this leads to the publication of further “Additions and Corrections,” to cite the title of the last volume of the Walpole correspondence. Errors in transcribing autograph documents, in particular, are inevitable, and make one strong argument for an ongoing, progressively corrected digital edition. Errors in annotation, as Finneran notes, are also inevitable, producing fundamental misrepresentations of facts and documents “which again an electronic archive could solve” (51). Digital documentary methods that pair manuscript images with transcriptions offer the chance for the presented edition to act as another stage of the proofreading process.

The process of indexing print volumes entails inevitable elisions, conflations, and missed references; here there is a clear advantage for progressively indexed digital projects. As Bob Rosenberg and others have noted, however, progressively expanding digital projects create their own specific problems: among these is the problem of citations from previous versions of a project, a difficulty for projects that do not keep public records of versions and changes. Unique identifiers for letters within an archive should be tied to a record of changes and improvements to transcription and representation, along with a record of changes in format (SGML to XML, for example). As Rosenberg’s account of the multi-decade development of the Edison Papers shows, difficulties may arise from changing data formats, the gap between established print and archival cultures, a lack of separation between content and presentation, changing digital standards, and idiosyncratic storage and preservation methods. The following discussion presents some models for systematically addressing the final two issues in particular.
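One way to tie a unique identifier to a public record of changes is sketched below. The class names, the `id@vN` citation form, and the sample dates and notes are assumptions made for illustration; they are not drawn from the Edison Papers or any other project named here.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    version: int
    date: str   # ISO 8601 date of the change
    note: str   # what changed (transcription fix, format migration, ...)
    text: str   # the transcription as it stood at this version


@dataclass
class LetterRecord:
    letter_id: str                      # stable identifier, never reused
    revisions: list = field(default_factory=list)

    def revise(self, date, note, text):
        """Append a new version; earlier versions are kept, not overwritten."""
        version = len(self.revisions) + 1
        self.revisions.append(Revision(version, date, note, text))
        return version

    def cite(self, version=None):
        """Return a citable string for a given (default: latest) version."""
        if not self.revisions:
            raise ValueError("no versions recorded yet")
        if version is None:
            version = len(self.revisions)
        if not 1 <= version <= len(self.revisions):
            raise ValueError(f"no such version: {version}")
        return f"{self.letter_id}@v{version}"

    def text_at(self, version):
        """Recover the transcription that an older citation referred to."""
        return self.revisions[version - 1].text


record = LetterRecord("letter-0001")
record.revise("2014-01-10", "initial transcription", "Dear Sir")
record.revise("2015-03-02", "corrected salutation", "Dear Sirs")
print(record.cite(), "|", record.cite(1), "->", record.text_at(1))
```

Because each revision is appended rather than overwritten, a citation made against an earlier version still resolves to the text the citing scholar actually saw.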

Finally, open data standards and interoperability stand as both a challenge and an opportunity for all digital editions. We will discuss new methods and organizations dedicated to open and interoperable standards below in “Best Standards for Preservation, Access, Dissemination.”

III. Paradigms for Digital Correspondence: Archives, Visualizations, Union Catalogues

Here I briefly discuss some of the major contemporary standards for editing, visualizing, and standardizing digital representations of correspondence. As with traditional print editions, specific digital standards and guidelines for representing letters stand in parallel to specific projects, which are sometimes built for very specific archives, communities, and goals; I will locate those projects in relation to their mode of representing and interrogating correspondence, in order to disseminate best practices as well as evaluate the utility of specific approaches. My account is not committed in advance to a specific paradigm among those discussed; rather, I write to further the goals of interoperability, accessibility, and interdisciplinary collaboration, the specific goals of the Twentieth Century Literary Letters project and its collaborators.

Among the major paradigms for editing correspondence in a digital environment, by far the most important and thorough are those building on the standards of the Text Encoding Initiative (TEI), as discussed by Susan Schreibman, and extended in the work of the “Digital Archive of Letters in Flanders” (DALF) at the Dutch Centre for Scholarly Editing and Document Studies, as well as in specific TEI editions like the Southey Letters. Not all scholars will want to build critical (usually authorial) editions of letters, however, and the learning curve for TEI encoding is necessarily somewhat steep. I will therefore postpone discussing the encoding standards for digital editions of letters, particularly those necessary for aggregation in the field-specific digital peer review organizations (NINES, 18thConnect, MESA, and the emerging ModNets), and first review paradigms for representing digital correspondence that are oriented towards goals other than those required for permanent critical editions.

For many projects and institutions, the exhibition of existing archives of individual or small group correspondence for a general audience or public, rather than a scholarly audience, has been the primary goal. According to the useful report on Digital Correspondence produced by Jan Broadway at the Centre for Editing Lives and Letters (CELL) at University College London in 2009, the majority (84%) of smaller projects then surveyed encompassed only the letters of an individual or family, while a quarter (24%) included fewer than fifty letters. Many of these projects are transcriptions of correspondence, without images or statements on editorial procedure; as the CELL report notes, in the absence of explicit statements on transcription policy, there is very little documentary value for such projects. Visibility and preservation are questions for smaller projects as well, particularly for first-generation work. Websites based on a single collection of objects have an understandable tendency to degrade quickly, and may be no more visible than their archival counterparts. These problems demonstrate the need for standards oriented towards preservation and dissemination, like those developed by the Text Encoding Initiative discussed below.

Quite different are the scholarly digital archives oriented towards the comprehensive presentation of a single author’s correspondence, as in the forthcoming Complete Letters of Willa Cather, The Walt Whitman Archive, or the Diplomatic Correspondence of Thomas Bodley, 1585–1597. These newer digital archives are far better supported, standardized, and indexed than the projects surveyed in the CELL report. Importantly, these archives are also associated with academic and library communities committed to their long-term preservation and accessibility.

Not all letters projects take archival display and preservation as their primary goals. Another objective is visualization as a tool for interpretation and accessibility, as best represented by the Mapping the Republic of Letters project at Stanford and its associated tools for visualization and interpretation. Mapping the Republic of Letters grew out of data from the Electronic Enlightenment Project at Oxford, and explored ways to visualize correspondence metadata from the eighteenth century, focusing on the major philosophes that formed our concept of the Enlightenment’s spread as extended cosmopolitan debate, through a “republic of letters.” As in the various case studies that have grown out of Voltaire’s correspondence, mapping out these letters lets scholars interrogate the reach, spread, and depth of correspondence between English and French Enlightenment philosophers, for example, in interfaces which allow the reader to query underlying metadata on letters of interest. The project has also developed several platforms and tools for use on any set of letters with useable metadata, notably Palladio, a very easy to use tool for producing rich social and GIS visualizations out of structured letter metadata. The ePistolarium project, used for a database of seventeenth-century Dutch letters, offers a parallel set of faceted search tools and visualizations.

In the former cases above, we should emphasize that we are primarily dealing with letter metadata (author, date, recipient, location of recipient, etc.), not a digital representation of the entire content of the letter, which in the first case is located behind the paywall of the Electronic Enlightenment Project. In the ePistolarium project, full text with keywords and similarity search is available. Both projects have limitations in terms of scope (often the contents of a particular library’s collection are primarily represented, as is also the case in the Early Modern Letters Online discussed next), accessibility, and reusability; in the case of the ePistolarium project, the sources of the letters are varied in availability, format, spelling, and language, as the project team has responsibly made clear (“Corpus metadata”). None of these projects make original marked-up documents openly available in XML or a similar reusable format, as do the TEI-based projects like the Southey Letters, nor are standards for linked open data clearly articulated (though the Electronic Enlightenment offers to link its data with partner projects).
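Letter metadata of this kind (author, recipient, date, place) is already enough to derive the weighted links that a network or map visualization draws. The following sketch uses invented placeholder rows and a hypothetical `edge_weights` function; it does not reproduce the data model of Mapping the Republic of Letters, Palladio, or the ePistolarium.

```python
from collections import Counter

# Invented placeholder metadata rows; no letter content is needed.
ROWS = [
    {"author": "Writer A", "recipient": "Writer B", "date": "1750-01-02", "place": "Paris"},
    {"author": "Writer A", "recipient": "Writer B", "date": "1750-02-11", "place": "Paris"},
    {"author": "Writer B", "recipient": "Writer A", "date": "1750-03-05", "place": "London"},
    {"author": "Writer C", "recipient": "Writer A", "date": "1751-06-30", "place": "Geneva"},
]


def edge_weights(rows):
    """Count letters per directed (author, recipient) pair."""
    return Counter((row["author"], row["recipient"]) for row in rows)


def letters_by_place(rows):
    """Count letters per place of writing, e.g. for a map layer."""
    return Counter(row["place"] for row in rows)


for (author, recipient), count in edge_weights(ROWS).most_common():
    print(f"{author} -> {recipient}: {count}")
print(dict(letters_by_place(ROWS)))
```

The same counts could be computed for metadata-only catalogues of letters whose full text is unavailable or embargoed.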

A third and related paradigm for representing correspondence is what we might think of as the “digital union catalogue” model, which aggregates and assembles metadata, and potentially also content, from a large range of archival sources and institutions. The foremost example here is the Early Modern Letters Online project, which has expanded beyond its initial core collection at the Bodleian Library to encompass eight collections and a range of data ingestion, display, and editing tools for correspondence, and has plans to extend its model through a pan-European network over the next six years.11 EMLO has become our best model for a large-scale “union catalogue” of correspondence, one which has ambitions to unite a very large set of interlinking correspondences and authors within an easily accessible, information-rich, aesthetically pleasing interface. The digital union catalogue model can be limited to the assembly of rich linked metadata alone, a particularly important possibility for twentieth-century letters whose contents are embargoed by copyright restrictions or privacy concerns.

IV. Digital Editions of Letters: Best Standards for Preservation, Access, Dissemination

Now that I have discussed goals for collections of letters concerned primarily with visualization and access, rather than archival preservation, I return to our first topic: best practices for ensuring access, dissemination, and preservation of letters. Those standards have primarily been developed within the community of scholars using the Text Encoding Initiative (TEI) guidelines, as discussed by Susan Schreibman in this anthology. Important to note, however, particularly in light of the major projects discussed above, is the fact that useable forms of letter data and metadata can be produced in many contexts. EMLO, in particular, has developed a range of ingestion approaches for structured data in many formats, from Excel spreadsheets to Access databases. The original entry-point for systematic data can thus be a database form created in Google Sheets, a FileMaker form, a Zotero bibliography, EMLO’s custom-built EMLO-Edit tool, or a TEI document edited in the oXygen XML editor. That data, however, will only be useable if it is well-structured, consistent, and “cleanly” entered (i.e. not the product of faulty OCR processes). TEI has the advantage of foregrounding the issues of structure, standardization, documentation, and validity (i.e. an individual TEI document can be “validated” during production to see if it conforms to the data standards used by any given project). Though TEI has a somewhat daunting initial learning curve for scholars more interested in letters than in digital archival preservation, other data formats are to some extent translatable to TEI if produced with standard documentation, vocabularies, and name authority control.
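As a rough illustration of how spreadsheet-style data might be translated toward TEI, the sketch below converts rows of a CSV export into `correspDesc` elements, the TEI element for correspondence metadata, after a basic completeness check. The column names, the sample rows, and the list of required fields are assumptions; the check is not schema validation in the TEI sense, and the output omits the TEI namespace and surrounding header for brevity.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Invented sample export; the column names are an assumption.
CSV_TEXT = """id,sender,recipient,date,place
L001,Writer A,Writer B,1920-04-10,London
L002,Writer B,,1920-04-20,Paris
"""

REQUIRED = ("id", "sender", "recipient", "date")
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"  # serializes as xml:id


def missing_fields(row):
    """List required columns that are absent or blank in this row."""
    return [name for name in REQUIRED if not (row.get(name) or "").strip()]


def to_corresp_desc(row):
    """Build a namespace-free correspDesc element from one clean row."""
    desc = ET.Element("correspDesc", {XML_ID: row["id"]})
    sent = ET.SubElement(desc, "correspAction", {"type": "sent"})
    ET.SubElement(sent, "persName").text = row["sender"]
    if (row.get("place") or "").strip():
        ET.SubElement(sent, "placeName").text = row["place"]
    ET.SubElement(sent, "date", {"when": row["date"]})
    received = ET.SubElement(desc, "correspAction", {"type": "received"})
    ET.SubElement(received, "persName").text = row["recipient"]
    return desc


def convert(csv_text):
    """Return (converted elements, {row id: missing fields})."""
    good, problems = [], {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        missing = missing_fields(row)
        if missing:
            problems[row.get("id") or "?"] = missing
        else:
            good.append(to_corresp_desc(row))
    return good, problems


elements, problems = convert(CSV_TEXT)
for element in elements:
    print(ET.tostring(element, encoding="unicode"))
print("rows needing attention:", problems)
```

Rows that fail the check are reported rather than silently converted, which keeps inconsistently entered data out of the encoded collection until an editor has reviewed it.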

The reason to begin with standards, practices, and documentation, rather than a fixed structural language, is in part that much available letter data is now entered within the special collections of libraries in the form of MARC (“MAchine-Readable Cataloging”) data and Encoded Archival Description (EAD) records. If we begin by assuming that all letters will be entered under the guidance of scholars or archivists trained in textual studies and TEI, we will exclude potential collaborators and institutions within the community of digital humanities librarians, archivists, and museum specialists. Metadata should be made accessible and searchable for collections that remain largely archival and analog, as well as those which are thoroughly digitized. Standards will also vary across fields and disciplines, a situation made particularly acute in the case of letters (hardly ever confined within the limits of a particular discipline). The letters of members of the Bloomsbury group, to take one example, will be of interest to researchers in art history, literary studies, economics, political science, and of course intellectual history, to name some unusual disciplinary bedfellows. Even so, documenting and disseminating standards stands among the most important current needs of digital scholarly infrastructure, an infrastructure currently evolving but nevertheless possessed of a definite history.

The specific standards discussed here are all in line with the “lean” metadata requirements of the Advanced Research Consortium, an umbrella organization originating in literary studies which will aggregate the project-specific metadata of various period-specific “nodes” like the Networked Infrastructure for Nineteenth-century Electronic Scholarship (NINES), 18th-Connect, the Medieval Electronic Scholarly Alliance, and other emerging peer-review and aggregation sites for modernism (ModNets) and early modern studies (Renaissance English Knowledgebase). Where we are concerned with features specific to archival representation of letters, I refer to the exhaustive extensions developed by the “Digital Archive of Letters in Flanders” (DALF) group, which go considerably beyond the scope of my discussion, as well as the discussion over metadata standards within the editorial board of the Twentieth Century Letters Project (“DALF guidelines”). Where standards, platforms, or tools are discussed more thoroughly elsewhere, I point to the relevant source rather than summarize at length. The audience for these standards is imagined to be not just scholars and editors of letters, but also librarians holding interesting manuscripts in special collections and teachers interested in producing original scholarship within a graduate or undergraduate class setting.

V. Necessary Metadata Fields

Though learning TEI standards will be necessary for a scholarly edition of letters, the minimal set of metadata fields necessary to aggregate and disseminate information about letters is a good place to begin, for three reasons. First, the level of training needed to produce metadata information is quite achievable, and can be integrated into even an undergraduate class curriculum. Second, metadata fields are the minimal units necessary for aggregation and access, and point toward the larger social purposes of letter data as a primary object of scholarly interest, rather than as a secondary benefit of good critical editions. Third, objects correctly described in terms of metadata allow us to produce more accurate indexes, better search mechanisms, and more legible transcriptions. Metadata entries themselves should conform to the standard vocabularies and name control authorities given below, derived largely from the practices of the archivist and library community.

The minimal metadata fields required for peer review by the Advanced Research Consortium and its aggregation software, Collex, are described on the ARC wiki and explained at more length in the examples (“ARC wiki”); they employ standard frameworks, such as Dublin Core description, when possible, and have developed primarily from the pioneering work at NINES. The metadata fields are represented here in terms of “elements” in angled brackets, in accordance with the requirements of XML, so that <dc:title>Letters of Virginia Woolf</dc:title> indicates the standard Dublin Core metadata field “title of object.” The elements are united and converted for aggregation through a Resource Description Framework (RDF) model, as discussed below; importantly, what the elements mean, and how they are standardized, matters more than the specific instantiation in XML or otherwise.

There are eleven standard metadata fields required for aggregation in ARC; “nodes” like NINES, MESA, and 18th-Connect may have their own additional required fields for peer review. The required fields are best exemplified in the simple examples provided on the ARC wiki. Passing over the fields specific to the form employed by ARC and its nodes to aggregate examples (<rdf:RDF>; <collex:archive>; <collex:federation>), we can briefly note the meaning of the required fields. The title (<dc:title>) refers to the title of the object described; the type (<dc:type>) describes the medium or form (e.g. signed Typescript or Manuscript, often signaled through the abbreviations TLS, ALS, or MS). The date can be recorded as a year, range of years, or uncertain field of years, as well as associated with a labeled entry, but should have a formal entry in the form YYYY-MM-DD. Uncertainty at the level of the decade can be indicated, and a separate element for general uncertainty exists as well. The role of “author” is required (<role:AUT>); for the purposes of letters, we suggest that the roles of editor and publisher be listed as well, if appropriate. Editor would refer to the party responsible for creating the representation of the letter’s text, not necessarily the party remediating that text in digital form. One or more disciplines interested in the object are required (e.g. Literature, Book History, History), listed under the element <collex:discipline>, though these could be added late in the process. The genre of the object (<collex:genre>) will usually be “Correspondence,” but other genres such as “Collection” and “Translation” are available as well.

The other mandatory elements are specific to ARC aggregation, but will be necessary in some form for any project that wants to disseminate data in the long term. Thus the “archive” element refers to the originating project (e.g. “Rossetti Project”) and the “federation” element refers to the peer-review “node” to which the project belongs (e.g. NINES, 18th-Connect, ModNets). The “freeculture” element (<collex:freeculture>) indicates whether the following text is open-sourced and open for reuse, or protected under copyright; the default is TRUE. Aggregation also requires a standard uniform resource identifier (URI), usually the location of a file on a website (if already hosted online), that distinguishes between objects in a given project. The important thing to note is that the document URI should be unique, but not necessarily meaningful. Some projects (the Rossetti Archive and the Whitman Archive, for example) have found that making such document URIs “intelligible” creates more problems than it solves. Other projects, such as the Southey Letters, use URIs that are approximations of the letter they describe (e.g. southey_letters_letterEEd.26.1729.xml).
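The fields described in the last two paragraphs can be sketched as a small RDF/XML record. What follows is an illustration under stated assumptions rather than the ARC submission format itself: the rdf and dc namespace URIs are the standard ones, but the collex and role namespace URIs, the generic rdf:Description wrapper, the archive name, and the example.org URI are all placeholders to be checked against the ARC wiki before use.

```python
import xml.etree.ElementTree as ET

# rdf and dc are the standard W3C and Dublin Core namespaces.
# The collex and role URIs are ASSUMPTIONS; verify them on the ARC wiki.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "collex": "http://www.collex.org/schema#",
    "role": "http://www.loc.gov/loc.terms/relators/",
}


def q(prefix, local):
    """Return the namespace-qualified tag name ElementTree expects."""
    return "{%s}%s" % (NS[prefix], local)


def build_record(uri, fields):
    """Wrap a list of ((prefix, element), value) pairs in one RDF record."""
    for prefix, ns_uri in NS.items():
        ET.register_namespace(prefix, ns_uri)
    root = ET.Element(q("rdf", "RDF"))
    # rdf:Description is a generic stand-in; a real project may use its own
    # wrapper element here.
    item = ET.SubElement(root, q("rdf", "Description"), {q("rdf", "about"): uri})
    for (prefix, local), value in fields:
        ET.SubElement(item, q(prefix, local)).text = value
    return root


letter = build_record(
    "http://example.org/letters/0001",  # hypothetical: unique, not meaningful
    [
        (("collex", "archive"), "exampleLetters"),  # hypothetical project name
        (("collex", "federation"), "ModNets"),
        (("dc", "title"), "Henry James to James Whistler, 1868-05-27"),
        (("dc", "type"), "Manuscript"),
        (("role", "AUT"), "James, Henry, 1843-1916"),
        (("dc", "date"), "1868-05-27"),
        (("collex", "discipline"), "Literature"),
        (("collex", "genre"), "Correspondence"),
        (("collex", "freeculture"), "true"),
    ],
)
xml_text = ET.tostring(letter, encoding="unicode")
print(xml_text)
```

The point of the sketch is the one made above: the record is a flat list of standardized fields attached to a single identifier, and the identifier only needs to be unique within the project.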

For letters, other minimal metadata fields beyond the standard aggregation elements are clearly required. I have already mentioned the roles of editor and publisher, necessary when aggregating information from already-published letters. The most important additional element is clearly the recipient(s) or addressee, a field that conveys the unique social relationship essential to the epistolary form. The DALF project employs an <addressee> element for this relationship, which could be represented in a number of other ways as well. More important is that the author and addressee have standardized name entries, as addressed in the next section. The location of both sender and recipient, if available, is optional but will be of interest to most projects, and can be easily visualized if entered alongside standard coordinates (with a standard geocoder service), as in the case of Palladio. The archival location and standard call number, if located in a library or archive, are required fields (e.g. UVa Special Collections; MSS 6251 — 6251-bn Box 1) in the case of archival materials, and metadata from published editions should include the volume and page range of the original (e.g. Vol 1, 35-36). Again, the specific name of these metadata fields is less important than their presence; the form of entries in these fields should be in accord with a standard vocabulary, whenever available (see below).

For more archival-oriented and descriptive projects, the full set of comprehensive metadata elements developed by the DALF project allows for preservation of bibliographic, annotative, and physical data (Vanhoutte and Van den Branden). For most scholarly projects, however, the minimal set of elements will likely be more useful as a starting point. To reiterate, the minimal suggested fields for correspondence metadata (outside the metadata specific to ARC or other project aggregation) are author, recipient, date, place of author, place of recipient, archival location, and type if available (e.g. autograph manuscript, typescript, telegram). The title for the letter, in most collections, derives from the author, recipient, and date (e.g. Henry James to James Whistler, 1868-05-27). Not all collections will have this information available, but these fields constitute an imagined minimum for the purposes of formal description.
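As a rough illustration of that minimum, the sketch below models one letter as a Python record. The class and field names are hypothetical, chosen only to mirror the list above; author, recipient, and date are treated as required here purely for the sake of the example, and the title is derived from them as described.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class LetterRecord:
    """Hypothetical minimal record; names mirror the fields listed above."""

    author: str
    recipient: str
    sent: str  # formal date entry, expected as YYYY-MM-DD
    place_of_author: Optional[str] = None
    place_of_recipient: Optional[str] = None
    archival_location: Optional[str] = None
    letter_type: Optional[str] = None

    def __post_init__(self):
        # Raises ValueError when the entry is not a valid ISO calendar date.
        date.fromisoformat(self.sent)

    def title(self):
        """Derive the conventional title from author, recipient, and date."""
        return f"{self.author} to {self.recipient}, {self.sent}"


record = LetterRecord(
    author="Henry James",
    recipient="James Whistler",
    sent="1868-05-27",
    letter_type="autograph manuscript",
)
print(record.title())
```

Optional fields default to empty because, as noted, not every collection will be able to supply them; a project recording uncertain or partial dates would need a looser check than the one used here.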

A brief note on the “data model” implied by metadata standards is required here. All complex digital data have a specific structural model that tells machines (and humans) how to read and use that data: the suffixes .html, .doc, and .txt designate the most familiar structures for reading data. The minimal fields required for ARC metadata ingestion are described through a metadata model called the Resource Description Framework (RDF), a “semantic network” way of describing data that differs from a hierarchical or document-based model (XML) as well as a relational database model (MySQL or other approaches reliant on primary keys). Those interested in understanding the differences between these models and their approaches in detail can consult useful tutorials and books; RDF and the semantic web are described in Semantic Web for the Working Ontologist (Allemang and Hendler). The specific technologies will change, but the essential structural distinctions have a good deal of relevance for the study of letters. XML/TEI is a hierarchical textual data format, useful for describing the rich data contained in documents; MySQL depends on the relational database model, in which data is conceived as separable tables containing a series of relations in row and column form; RDF is a form of describing graph database relations, in which no necessary hierarchy or primary keys exist, but rather objects are conceived as existing in a variety of semantic relations (Fitzgerald is a FriendOf Hemingway, some of Fitzgerald’s letters HaveRecipient Hemingway). As those examples show, RDF description occurs in the form of a series of subjects, predicates, and objects, along the lines of natural language predication in a subject-verb-object language like English.
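The subject-predicate-object structure can be made concrete with a few lines of plain Python. This is a toy model rather than an RDF library: the letter identifiers, the HasAuthor predicate, and the second recipient are invented for illustration, while FriendOf and HaveRecipient follow the examples just given.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Invented sample data: letter_001 and letter_002 are hypothetical identifiers.
triples = [
    Triple("Fitzgerald", "FriendOf", "Hemingway"),
    Triple("letter_001", "HasAuthor", "Fitzgerald"),
    Triple("letter_001", "HaveRecipient", "Hemingway"),
    Triple("letter_002", "HasAuthor", "Fitzgerald"),
    Triple("letter_002", "HaveRecipient", "Perkins"),
]


def match(store, subject=None, predicate=None, obj=None):
    """Return every triple that agrees with whichever parts are specified."""
    return [
        t
        for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


letters_to_hemingway = [
    t.subject for t in match(triples, predicate="HaveRecipient", obj="Hemingway")
]
print(letters_to_hemingway)
```

Nothing in the list is a table row or a nested element: each statement stands alone, and a query is simply a pattern with some of its three parts left open.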

Relations, as Henry James says in his preface to Roderick Hudson, stop nowhere: “and the exquisite problem of the artist is eternally but to draw, by a geometry of his own, the circle within which they shall happily appear to do so.” What is true of fiction counts for metadata as well. Each project interested in aggregation must decide what RDF relations are allowed, and what standard vocabularies (or “ontologies”) will be used, a decision formalized in the “RDF schema” of the project (and appended at the top of every document within that project). Even for those projects not employing TEI encoding for a formal edition, the concept and documentation of “standard vocabulary” is essential. Those standard vocabularies are the second essential element for recording, preserving, and disseminating digital letters, as I shall discuss below.

VI. Standard Vocabularies and Authority Control

The problem with human relations is not only that they stop nowhere but that they are impossible to fix, define, and unify in a simple way: thus the pleasures and equivocations of ambiguous Jamesian sociability in a novel like The Wings of the Dove (one of his many novels that hinge on the contents of a letter). Machines, alas, are as yet in need of more strict “determinations” for their forms of procedural reading. Standard vocabularies and authority controls, the tools of information science and library science, allow us to use a term which is “consistently, uniquely, and unambiguously” assigned to particular people, places, subjects, and agents (“Cataloguing authority control policy”). If Rose is a rose is a rose is a rose, as Gertrude Stein said, nevertheless Gertrude Stein is not “Mr. Cuddlewuddle” is not Stein, Gertrude, 1876–1946. Any given name will appear in multiple forms across different materials, a problem that can to some extent be resolved after data entry with tools like OpenRefine. If a standard way of referring to sources is consistently used and documented, however, our work is much more likely to be preserved in a useful way.

In the case of editions prepared with TEI, standard vocabularies and their sources are maintained within the formal “header” of the document, before the main text. Metadata prepared in other contexts should use these or other documented sources for their vocabulary, where possible. Many standard vocabularies are linked to the required elements already discussed: Dublin Core elements, for example, are standard library metadata fields. Roles are controlled through the Library of Congress MARC codes, e.g. “aut” for author. Most important are the names of individuals and places, both of which now can be dynamically consulted during the process of data-entry. For works and individuals listed within library systems, the most comprehensive reference point is the Virtual International Authority File (VIAF.org), a system that includes foreign names and the Library of Congress name authority files. GeoNames.org provides placenames through a web services API as well as a standard search interface. All these services are available as dynamic web services within platforms like Collective Access, an open-source museum and library service that allows easy creation of metadata and RDF “schemas” like those required by TEI and ARC guidelines.
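What authority control does to a set of name variants can be sketched briefly. The lookup table below is hand-built and hypothetical; a working project would populate it from a documented source such as VIAF or the Library of Congress name authority files rather than hard-coding it, and the function names are my own.

```python
import unicodedata

# Authorized heading as given in the discussion above.
AUTHORIZED = "Stein, Gertrude, 1876–1946"


def normalize_key(name):
    """Fold case, spacing, and dash style so trivial variants compare equal."""
    folded = unicodedata.normalize("NFKC", name).replace("\u2013", "-")
    return " ".join(folded.casefold().split())


# Hypothetical hand-built table mapping known variants to one heading.
AUTHORITY_TABLE = {
    normalize_key(variant): AUTHORIZED
    for variant in ("Gertrude Stein", "Mr. Cuddlewuddle", AUTHORIZED)
}


def authorize(name, table=AUTHORITY_TABLE):
    """Return the authorized heading, or None if the name needs review."""
    return table.get(normalize_key(name))


print(authorize("  gertrude   STEIN "))
print(authorize("Mr. Cuddlewuddle"))
print(authorize("Alice B. Toklas"))
```

Returning None for an unknown name, instead of guessing, leaves the decision to a human editor, which is usually the safer default when a nickname or signature could belong to more than one correspondent.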

Setting up new letters projects that would be in line with these guidelines remains conceptually and practically daunting, even as the means to create and host collections of letters have become far more powerful and accessible. Collaboration and consultation remain key here: developing digital letters projects entails working with digital libraries specialists, hiring website specialists for interface and user design, sharing resources on web hosting and data entry, consulting with experienced digital project managers, and dividing scholarly duties with specialist experts. Such experts and expert knowledge communities are easiest to find within institutional contexts like the Cultures of Knowledge project, at digital humanities centers like Nebraska’s Center for Digital Research in the Humanities, and at the various digital humanities training institutions at Maryland, Oxford, Leipzig, and the University of Victoria. DH Questions and Answers also provides expert responses to a range of practical questions, along with an archive of vetted responses. On our own Twentieth-Century Literary Letters project blog, we have assembled an evolving annotated bibliography and list of specific resources relevant to digital correspondence. We welcome collaborators interested in specific collections of twentieth-century correspondence, in the development of linked open data for correspondence metadata, and in the process of querying and visualizing correspondence metadata as a form of scholarly analysis and inquiry.

VII. Conclusions

Let us conclude by briefly outlining several possible horizons for future scholarly work on letters as objects of study and digital representation. Note first of all that none of the above will mean the end of the printed edition of letters, nor the “obsolescence” of existing volumes of correspondence. Quite the contrary: developing more extensive digital editions of letters could well lead to complementary editions of selected letters, as the process of creating and printing selected correspondences becomes easier (though outside funding for printed critical editions has become scarce). The key question is whether the editions we create (in any medium) will be reliable, readable, and produced according to the standards of the best current scholarly editions.

One clear future path of work on digital correspondence will be in developing editions according to the TEI guidelines, and in standardizing other edited correspondence into TEI/XML form. As more editions and authors come into the public domain, good open-source digital editions will be needed. When encoded with RDF semantic data, these editions should already provide the first steps towards “linked open” models of correspondence data. Second, there will continue to be collections of individual and group correspondence, some privately held, and some for small groups of researchers interested in full text but not in a full critical edition. Along with this, we will see a variety of evolving visualization projects which take advantage of the social, geographically specific, and culturally imbricated dimensions of letters, making use of them for the purposes of cultural history, reader-response theory, and the study of material and literary networks.

Finally, an important area of work will be in union catalogues of letters on the model of EMLO. These union catalogues will likely become progressively more expansive, inclusive, and multi-institutional in their scope and range. Whether full content will be easily and freely available, or whether it will be largely hidden behind paid subscriber services, the collection of linked metadata will enable new forms of interconnected scholarly arguments and histories, the first of which we are beginning to see. As we move towards a new era of interconnected digital editions and groups of correspondence, an important question of permanence and accessibility will remain: how will print editions of letters intersect with and complement larger correspondence databases and union catalogues? Essential will be preserving the high production values, depth of annotation, and editorial standards that the best print editions of major correspondence, as with recent volumes of letters by Ernest Hemingway and T. S. Eliot, so beautifully exemplify. New work on digital letters should retain a sense of the value of these well-crafted material objects, even as we explore new ways of representing epistolary ontology. Whether digital or print, letters represent an essential aspect of civilized life for generations of letter-writers, and a central part of our cultural inheritance, one worthy of continued representation within what has become very nearly a post-epistolary era.


1. For the purposes of this essay, let us define “modern letters” as post-seventeenth century and pre-email, though email presents similar challenges. Exceptions to this lack of interest are discussed below in “Editorial Theory and Practice,” and in conjunction with the standards developed by the “Digital Archive of Letters in Flanders” (DALF) correspondence project. See also Halsband, Berg, Phillips, Kline and Perdue, and Jolly and Stanley.

2. On the distinctions between “texts,” “works,” and “versions,” as well as other editorial terms, see Kelemen’s glossary.

3. Elena Pierazzo, “A Rationale of Digital Documentary Editions,” Literary and Linguistic Computing 26, no. 4 (2011): 475.

4. See Peter Robinson, “Towards a Theory of Digital Editions,” Variants 10 (2013): 127.

5. See “About the Archive,” section E: “Correspondence.”

6. See “Editorial Methodology” (http://www.rc.umd.edu/editions/southey_letters/letterEEd.26.about.html).

7. See Kline and Perdue, Chapter Five section V, “The Middle Ground: Inclusive Texts and Expanded Transcriptions”: 164-171, <http://gde.upress.virginia.edu/05-gde.html#h2.5>.

8. See again Kline and Perdue for a thorough account of the process of recording procedures for documentary editing in a digital environment.

9. Available online: <http://www.mla.org/cse_guidelines>.

10. See Walt Whitman Archive, “About the Archive,” section E: “Correspondence,” and Kenneth Price, personal communication.

11. See “Reassembling the Republic of Letters, 1500–1800,” <http://www.cost.eu/COST_Actions/isch/Actions/IS1310>.

Works Cited

“ARC Wiki.” Advanced Research Consortium, September 17, 2013. Web. 15 Nov. 2014. <http://wiki.collex.org/index.php/Submitting_RDF>.

Allemang, Dean, and James Hendler. Semantic Web for the Working Ontologist, Second Edition: Effective Modeling in RDFS and OWL. 2nd ed. Waltham, MA: Morgan Kaufmann, 2011.

“DH Questions & Answers.” Association for Computers and the Humanities (ACH), n.d. Web. 10 June 2014. <http://digitalhumanities.org/answers/>.

Berg, Temma. “Truly Yours: Arranging a Letter Collection.” Eighteenth-Century Life 35, no. 1 (2011): 29–50.

Broadway, Jan. “Digitizing Correspondence Workshop Report.” London: Centre for Editing Lives and Letters, Queen Mary, University of London, June 10, 2009. Web. <http://www.livesandletters.ac.uk/downloads/DC_report.pdf>.

“Cataloguing Authority Control Policy.” Canberra: National Library of Australia, n.d. Web. 15 June 2014. <http://www.nla.gov.au/policy-and-planning/authority-control>.

Clement, Tanya. “Knowledge Representation and Digital Scholarly Editions in Theory and Practice.” Journal of the Text Encoding Initiative, no. Issue 1 (June 8, 2011). <doi:10.4000/jtei.203>.

Collective Access. Seth Kaufman, Lead Developer. Accessed 10 June 2014. <http://www.collectiveaccess.org>.

“Corpus Metadata.” Circulation of Knowledge and Learned Practices in the 17th-century Dutch Republic, n.d. Web. 15 June 2014. <http://ckcc.huygens.knaw.nl/?page_id=43>.

Diplomatic Correspondence of Thomas Bodley (1585–1597). “Editorial Policy.” Robyn Adams, ed. London: Centre for Editing Lives and Letters, Queen Mary, University of London. Version 5: July 2011.

“Early Modern Letters Online.” Cultures of Knowledge Project, Bodleian Library, Oxford, n.d. Web. June 20, 2013. <http://emlo.bodleian.ox.ac.uk/>.

“ePistolarium.” Circulation of Knowledge and Learned Practices in the 17th-century Dutch Republic, n.d. Web. 10 June 2014. <http://ckcc.huygens.knaw.nl/epistolarium/>.

Finneran, Richard. “The Collected Letters of W.B. Yeats: A Project in Disarray.” Review 18 (1996): 45–58.

Halsband, Robert. “Editing the Letters of Letter-Writers.” Studies in Bibliography 11 (1958): 25–37.

James, Henry. “Prefaces to the New York Edition: Volume 1.” New York : Scribners, 1907.

Jolly, Margaretta, and Liz Stanley. “Letters As/not a Genre.” Life Writing 2, no. 2 (2005): 75–101.

Kelemen, Erick. Textual Editing and Criticism: An Introduction. New York: W.W. Norton & Co., 2009.

Kline, Mary-Jo, Susan Holbrook Perdue, and Association for Documentary Editing. A Guide to Documentary Editing. Charlottesville: University of Virginia Press, 2008.

Lawrence, D. H. The Letters of D.H. Lawrence. Vol.1: 1901–1913. Edited by James T. Boulton. Cambridge: Cambridge University Press, 1979.

Liu, Alan. “From Reading to Social Computing.” In Literary Studies in the Digital Age, edited by Kenneth M. Price and Ray Siemens. Modern Language Association of America. Accessed June 20, 2013. <http://dlsanthology.mla.hcommons.org/from-reading-to-social-computing/>.

“Mapping the Republic of Letters,” Republic of Letters project, Stanford University, 2013. Web. 15 June 2014. <http://republicofletters.stanford.edu>.

“OpenRefine.” Accessed June 20, 2013. <http://openrefine.org>.

Phillips, Siobhan. “Elizabeth Bishop and the Ethics of Correspondence.” Modernism/Modernity 19, no. 2 (April 2012): 343–63.

Pierazzo, Elena. “Digital Documentary Editions And The Others.” Scholarly Editing: The Annual Of The Association For Documentary Editing 35 (2014). Web. 27 Nov. 2014. <http://www.scholarlyediting.org/2014/essays/essay.pierazzo.html>.

Price, Kenneth M. “Collaborative Work and the Conditions for American Literary Scholarship in a Digital Age.” In The American Literature Scholar in the Digital Age, edited by Amy E. Earhart and Andrew Jewell, 9–26. Ann Arbor, MI: U of Michigan P; University of Michigan Library, 2011.

Southey, Robert. The Collected Letters of Robert Southey. Pratt, Lynda, Tim Fulford, and Ian Packer, eds. Romantic Circles, 1 Feb. 2009. Web. 27 Nov. 2014.

Rosenberg, Bob. “Documentary Editing.” In Electronic Textual Editing, edited by Lou Burnard, Katherine O’Brien O’Keeffe, John Unsworth, and G. Thomas Tanselle, 92–104. New York, NY: Modern Language Association of America, 2006.

Schreibman, Susan. “Digital Scholarly Editing.” In Literary Studies in the Digital Age, edited by Kenneth M. Price and Ray Siemens. Modern Language Association of America, 2013.

Palladio. Stanford Design+Humanities, Stanford University, Version 0.8.0 (November 2014). Web. 10 April 2014. <http://palladio.designhumanities.org>.

SNAC: The Social Networks and Archival Context Project. Inst. for Advanced Technology in the Humanities, U of Virginia, n.d. Web. 13 July 2012.

“The Electronic Enlightenment Project.” Bodleian Libraries, University of Oxford, 2008–2014. Accessed 10 June 2014.

Vanhoutte, Edward, and Ron Van den Branden, eds. “DALF guidelines for the description and encoding of modern correspondence material, Version 1.0.” Gent: Centrum voor Teksteditie en Bronnenstudie, 2003. Web. June 15th, 2014. <http://www.kantl.be/ctb/project/dalf/dalfdoc/>.

Walpole, Horace. The Yale Edition of Horace Walpole’s Correspondence. Edited by W. S Lewis. New Haven: Yale University Press, 1937.

Whitman, Walt. “About the Archive: Editorial Policy Statements and Procedures.” The Walt Whitman Archive. Eds. Ed Folsom and Kenneth M. Price, n.d. Web. 15 Nov. 2014.

Woolf, Virginia. The Letters of Virginia Woolf. Edited by Nigel Nicolson and Joanne Trautmann Banks. Vol. 1. London: Hogarth Press, 1975.

Source: https://hcommons.org/?page_id=353&preview=1&_ppp=2c8b6a9182