The SALit CD-ROM: A virtual library of South African literature
Prof A.J. van Wyk (University of Durban-Westville) and Dr G.D.J. Stewart (M.L. Sultan Technikon)
Centre for the Study of Southern African Literature and Languages: University of Durban-Westville
Department of Library and Information Studies: M.L. Sultan Technikon
The SALit CD-ROM is an artefact, the result of a research and development project conducted in a new discipline that has come to be known as Humanities Computing, described by Schreibman (1999: 55) as seeking to "provide new keys for entering the centuries-old doors of texts". Underlying the SALit research is a theoretical framework that draws on post-colonial literary theory and hypertext theory to deliver an information resource appropriate to the contemporary study of South African literature. In common with other aspects of South African life, the study of literature has undergone a profound post-apartheid transformation that involves a fundamental shift in perspective. The SALit Project originated in an attempt to produce an Encyclopaedia of South African Literature designed to transcend the limiting categories of language and race of the past. Springing as it does from a post-modern imperative to unmask implicit ideological influence, the Encyclopaedia project chimes with the intentions of the much-publicised African Renaissance, championed by Thabo Mbeki and articulated here by Ntuli and Smit (1999):
It is here where a group of dedicated intellectuals can initiate innovative programmes in the quest for groundwork for the African Renaissance. The project, this cultural revolution, could be a catalyst for revolutionary change and development. It will help us to question and problematise the limits of eurocentric ideas, liberate them from their context and re-present them to Europe with an African freshness. Whereas knowledge industries are made to serve Western thought—even as they draw on African (and Eastern) thought to enrich their own cultures, this has to be done from a new angle—an angle that must mainly benefit African knowledge industries.
To liberate South African literary works from their contexts requires firstly that the wealth of suppressed, banned and otherwise marginalised texts from the past be discovered, salvaged and re-published; and secondly, that they take an equal place with long-established works (the "canon") in a South African literary history characterised by diversity and non-linearity.
This paper begins by briefly introducing the theoretical framework underpinning the SALit Project, a framework neatly encapsulated in the title of Landow’s seminal work Hypertext: The Convergence of Contemporary Critical Theory and Technology (1992). The rest of the paper is devoted to a descriptive walk-through of the SALit Web CD-ROM itself, describing its structure and illustrating its principal features. We conclude with a consideration of some of the future developments of the project.
In post-modern literary research, the notion of discourse has increasingly gained importance: "‘Discourse’ has become one [of] those key notions around which all else is constructed" (Bruce 1993: 360). Correspondingly, electronic text provides an ideal analytical tool for the investigation of the explicit and implicit structures of literary texts. The full texts included in the SALit Web Virtual Library provide an extensive and versatile resource for the close study of literature, once the user has mastered the software needed to tag the texts and then to use the marked-up texts for analysis.
Any study of electronic texts in the humanities must of necessity be interdisciplinary. The literature in the field reflects this diversity, showing that research into electronic texts in the humanities has emerged from the collaborative efforts of academics in the Humanities (notably in Classical, Medieval and Biblical Studies), Library Science and Information Technology. Interdisciplinary study is also essential to post-colonial studies in South African literature: Van Wyk lists anthropology, history, psychoanalysis, linguistics and philosophy as important in re-defining South African literary history, serving to "free the researcher from ‘national meaning’" (Van Wyk 1996: 36).
The theoretical framework from which electronic literary analysis has emerged is that of computational linguistics - the attempt to produce a complete linguistic description of the language in which the text is written. But literary scholars have adapted the analytical tools of computational linguistics for the purposes of literary analysis and interpretation - in short, to find answers based on countable features in texts rather than on intuition or impression. Whereas linguistic researchers strive for scientific accuracy and the recognition of verifiable patterns in the language as a whole, literary researchers make use of statistical analysis as a means to supplement literary critical interpretation and hermeneutics.
As Bruce has suggested, the interpretive literary methodology formulated by the influential post-structuralist theorist Michel Foucault implies that computer-based textual analysis has peculiar applicability to post-modern (and, by association, post-colonial) literary investigation. Foucault’s
method of analysis (the quantitative treatment of data, the breaking-down of the material according to a number of assignable features whose correlations are then studied; interpretative decipherment, analysis of frequency and distribution) (Foucault 1972: 10-11)
is eminently achievable through the application of computer-based methods in deconstructing texts. This is not to pretend that literary researchers, even those who subscribe to post-modernist theory, have embraced computer-based investigation with any great enthusiasm. Widespread scepticism and even hostility towards the new technology persist, and have delayed its incorporation into mainstream literary research activity, particularly in South Africa.
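The "analysis of frequency and distribution" that Foucault describes can be illustrated with a minimal Python sketch. It is not drawn from the SALit software: the sample sentence is invented for the purpose, and the function names (tokenize, frequency, distribution) are hypothetical. The sketch counts word forms and then reports how occurrences of one word are spread across equal slices of the text.

```python
import re
from collections import Counter

# Invented sample sentence, used only for illustration (not from any SALit text).
sample = ("the land was dry and the river was low "
          "and the people waited for the rain")

def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def frequency(text):
    """Count how often each word form occurs in the text."""
    return Counter(tokenize(text))

def distribution(text, word, segments=4):
    """Count occurrences of `word` in each of `segments` consecutive slices."""
    tokens = tokenize(text)
    size = max(1, -(-len(tokens) // segments))  # ceiling division
    counts = [0] * segments
    for position, token in enumerate(tokens):
        if token == word:
            counts[min(position // size, segments - 1)] += 1
    return counts

if __name__ == "__main__":
    print(frequency(sample).most_common(3))
    print(distribution(sample, "the"))
```

Counts of this kind do not interpret anything; they supply the countable features on which an interpretation may then draw.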
Bruce (1993) identifies the key weakness in current literary computing as the tendency for researchers to view computer analysis as a "minor adjunct" to the pursuit of conventional literary analysis while ignoring the epistemological status of the electronic text itself. He argues that literary (humanities) computing needs to be developed within the framework of existing models in text and discourse theory "if it is to play a significant role for scholars and teachers" (Bruce 1993: 359). He suggests the use of computerised analysis within the area of textual theory where "text" and "discourse" can be regarded as signifying systems, where discourse implies a set of anonymous, historically-situated rules determined by the time and space that define a specific epoch, and where text is a specific articulation of that discourse. Because discourse emerges through text, analysis of the signifiers and signifieds that make up that text is the most promising use to which the processing strength of computers could be applied. The logical and statistical analyses of textual corpora possible with electronic texts lend themselves to this project. Bruce views the role of computerised textual analysis with a mixture of enthusiasm and irony:
We find ourselves in the rather unusual situation of possessing an exciting methodological tool most of whose practitioners rarely articulate a supporting theoretical model adequate to the phenomenon. (Bruce 1993: 360)
The feature of computer-based analysis that fits best with the semiotic investigation of texts is its ability to be almost exhaustive in its examination of a given text or corpus. Where scholars in the past have had to work with relatively small corpora, basing their conclusions on samples selected by intuitive means and through familiarity with known texts, computer analysis makes it possible to discover discursive regularities by thorough inspection of virtually unlimited corpora.
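One common way of inspecting a text exhaustively is the keyword-in-context concordance, which lists every occurrence of a word together with its immediate surroundings. The Python sketch below is offered only as an illustration of the idea; the sample passage is invented, and the function name concordance and its width parameter are assumptions rather than features of any particular package.

```python
import re

# Invented passage used only to demonstrate the technique.
passage = ("The river rose. By the river they waited; "
           "beyond the river lay the farm.")

def concordance(text, keyword, width=2):
    """Return (left context, keyword, right context) for every occurrence."""
    tokens = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    lines = []
    for i, token in enumerate(tokens):
        if token == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines

if __name__ == "__main__":
    for left, word, right in concordance(passage, "river"):
        print(f"{left:>12} [{word}] {right}")
```

Because every occurrence is returned, none is left to the sampling intuitions of the reader, which is the point made above about exhaustiveness.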
However, the use of electronic texts in the analysis and criticism of modern texts is still relatively undeveloped, with no definitive journal devoted entirely to the subject (although Computers and the Humanities and Literary and Linguistic Computing both publish literary research). Being able to use electronic texts for textual analysis is a key feature of the design of the SALit Virtual Library, and a fundamental assumption in this project has been that the provision of electronic access to primary and secondary material is an appropriate new approach to literary studies (compare Van Wyk’s "free[ing] the researcher from ‘national meaning’" and Ntuli and Smit’s "liberat[ing] them from their contexts", both quoted above). It is also assumed that the ever-increasing familiarity of learners and researchers with computers will salvage what has tended to be a marginal activity carried out by enthusiasts unversed in the issues of contemporary literary theory and bring it into the mainstream.
The common thread running through post-modern literary theory and hypertext theory, which I now go on to consider in relation to the SALit Web project, is the notion of non-linearity. Most commentators emphasise the non-sequential nature of hypertext as its most important distinguishing feature: "Hypertext is non-sequential; there is no single order that determines the sequence in which the text is to be read" (Nielsen 1995: 1). Theorists in both areas emphasise the democratic, non-teleological and relativistic character of their conceptual frameworks.
Expressing the essential characteristic of hypertext, Deleuze and Guattari’s (1987) notion of the rhizome (a chaotically distributed network rather than a regular hierarchy of trunk and branches) is central to the theoretical debate on the nature of hypertext.
Many people have a tree growing in their heads, but the brain itself is much more a grass than a tree. (Deleuze and Guattari, 1987)
The medium of hypertext encourages readers to navigate texts non-sequentially by allowing them the choice of moving on in various directions to other "nodes" or lexias. As hypertext is essentially non-linear, what better medium to express the fluid relationships between the elements of a new South African literary history? Landow (1994) draws a distinction between "axial" hypertexts, in which references, commentary and related texts radiate from a single central text, and a network structure resembling the rhizome. Although elements of the SALit Web resource, such as the Encyclopaedia, retain an axial structure, the hypertext Web as a whole harnesses the potential of hypertext to present the literary information (events, publications, author biographies) in an inclusive, non-canonical manner, foregrounding the literary grass, so to speak, and de-emphasising the tree, to borrow Deleuze and Guattari’s trope.
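The difference between an axial and a network structure can be stated in terms of links. In the Python sketch below, which is an illustration only, a hypertext is reduced to a list of (source, target) link pairs; the node labels are invented and the function name axial_centres is hypothetical. A structure is treated as axial when a single node takes part in every link, and as a network when no such node exists.

```python
# Invented link lists: each pair is one hypertext link between two nodes.
axial_links = [
    ("text", "note"),
    ("text", "commentary"),
    ("text", "related text"),
]
network_links = [
    ("author", "title"),
    ("title", "event"),
    ("event", "author"),
    ("event", "criticism"),
]

def axial_centres(links):
    """Return the nodes that take part in every link (empty for a network)."""
    if not links:
        return set()
    common = set(links[0])
    for pair in links[1:]:
        common &= set(pair)
    return common

if __name__ == "__main__":
    print(axial_centres(axial_links))    # the single central node
    print(axial_centres(network_links))  # empty: no node is central
```

This is a deliberately strict reading of "axial"; a real hypertext will usually mix the two patterns, as the SALit Web itself does.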
It would be naïve to imagine that hypertext will automatically produce profound literary historical insights, especially as our growing familiarity with the Internet (hypertext writ large) reveals a propensity for mindless point-and-click behaviour that has no more to do with critical thought than obsessive television channel-switching. There is no shortage of critics who are sceptical of the medium; see Birkerts (1997), for instance. However, it is difficult to deny hypertext’s unique capacity to offer a value-free virtual landscape dotted with familiar texts alongside previously marginalised texts for re-examination by literary critics and historians. Two pronouncements from Miller (1992) underscore the possibilities of the SALit Web:
The electronic book will be potentially democratic and non-canonical not because of some ideologically motivated decision, but by virtue of its technological nature.
History will much more evidently not be something objectively out there, but something constituted now for some particular purpose, in a transformative act of memory oriented towards the future.
What then of the SALit Web and of the Virtual Library itself?
"Currently there are three broad areas of Humanities Computing: student-centred editions, digital libraries, and scholarly editions" (Schreibman 1999: 57). Schreibman’s categories delineate in a crisp and precise way what is in reality a very large digital portmanteau of references, cross-references, texts and multimedia files that has more than once recalled Bakhtin’s description of the nineteenth-century novel: "a baggy monster". The SALit Web contains all three of Schreibman’s categories, with the last, scholarly editions, being as yet the least developed. But before describing what the SALit Web does contain, it is necessary to dwell briefly on this last category because of its importance as a goal towards which the production of the electronic book in South Africa must inevitably strive.
There will be no South African equivalent of significant electronic editions such as Peter Robinson’s edition of Chaucer’s Wife of Bath’s Prologue on CD-ROM (1996) or the ground-breaking Electronic Beowulf (1996) until hypertext enters our literary academic mainstream. Another reason for the bias, in our own project, towards breadth rather than depth, is our belief that, unlike developed Western nations, our attention should be on fast-tracking the introduction of a previously invisible body of South African literature into the public domain. The West can still afford well-stocked libraries and can choose to ignore, for the time being, the challenge that the new media represents for the preservation of literary heritage and in particular, the voices of the disenfranchised.
With the twin aims of salvaging previously marginalised texts and "liberat[ing] them from their contexts" we believe that literature must stake its claim in the apparently unsympathetic realm of Information Technology. For most of South Africa’s inhabitants, this country qualifies as one of the world’s "bookless places", as this state has been described in at least one debate on electronic texts (Hart et al. 1998). Apart from the economic impossibility of building or maintaining extensive book-based libraries in our new schools or universities, we have a vast untapped resource of previously marginalised literature - novels, poetry, drama, diaries, historical accounts and travel writing - that was denied to us by political or commercial proscriptions. Ironically, at a time when our newly open society is ready to rediscover previously unavailable texts, the expense of book production prevents it. Our library shelves will continue to reflect the narrow collection policies of the past because we simply cannot afford to redress the situation; and even if we could, the titles would often not be there to buy.
Although the specific missing texts might be unique to South Africa, the phenomenon of inaccessibility to marginalised writing is not. Women’s writing, for instance, has been similarly subjected to political, commercial and societal suppression worldwide. Two major electronic texts projects, the Brown University Women Writers Project (1996), and the Victorian Women Writers Project have sought to address the historical devaluing of women’s writing by re-publishing or making available for the first time, works that are essential for a full appreciation of women’s place in literature. The same impulse, but in the context of South African state oppression, motivated the development of the Mayibuye Centre’s Apartheid and the History of the Struggle for Freedom in South Africa CD-ROM.
Even though the production of scholarly editions is not our immediate priority, we cannot leave this topic before briefly addressing the rigorous international standards that have been developed for the encoding or tagging of literary texts. Predominant in this area is the major achievement of the Text Encoding Initiative (TEI) that culminated in the publication of the Guidelines for electronic text encoding and interchange - TEI P3 (Sperberg-McQueen and Burnard) in 1994. The TEI specifies a comprehensive set of descriptive tags within Standard Generalised Markup Language (SGML) and, more recently, eXtensible Markup Language (XML) that ensures the long-term viability and compatibility of electronic texts. The guidelines provided by the Oxford Text Archive for the submission of electronic texts should be the first point of reference for any researcher contemplating a scholarly edition in electronic form. The texts in the SALit Web Virtual Library are presently marked up in HTML, itself an application of SGML, but one with a much reduced tag set, limited mainly to procedural rather than descriptive markup. However, it is probably superfluous to note that HTML is the markup language of the Internet and, as such, effectively provides the accessibility and platform-independent attributes so fundamental to our aim of bringing the South African literary heritage into the public domain.
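The practical consequence of the distinction between descriptive and procedural markup can be shown with a minimal Python sketch. Both fragments below are invented for illustration: the element names in the first are loosely modelled on TEI usage but have not been validated against the Guidelines, the second imitates presentational HTML written as well-formed XML so that the same parser can read it, and the helper texts_of is hypothetical. The descriptive fragment can be queried for place names; the procedural one records only that certain words are italic.

```python
import xml.etree.ElementTree as ET

# Descriptive markup: tags say what each span of text IS (assumed, TEI-like names).
descriptive = """
<lg type="stanza">
  <l>A first placeholder line that names <placeName>Durban</placeName></l>
  <l>A second placeholder line that names <placeName>Essen</placeName></l>
</lg>
"""

# Procedural markup: tags say only how each span should LOOK.
procedural = """
<p>
  A first placeholder line that names <i>Durban</i><br/>
  A second placeholder line that names <i>Essen</i>
</p>
"""

def texts_of(xml_source, tag):
    """Return the full text content of every element called `tag`."""
    root = ET.fromstring(xml_source)
    return ["".join(element.itertext()).strip() for element in root.iter(tag)]

if __name__ == "__main__":
    print(texts_of(descriptive, "placeName"))  # place names can be retrieved
    print(texts_of(descriptive, "l"))          # so can individual verse lines
    print(texts_of(procedural, "placeName"))   # nothing: only italics are marked
```

The same words are present in both fragments, but only the descriptive version allows a program to ask for "all the places" or "all the lines of verse".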
Even within the limitations of an HTML environment (from a scholarly markup point of view) there is much that the present version of the SALit Web CD-ROM can provide in the way of Schreibman’s first two categories, namely, student-centred editions and digital libraries. The opening page of the SALit Web CD-ROM best illustrates its top-level structure.
As a student-centred resource, the SALit Web CD-ROM offers three entry points: the Encyclopaedia, the Timeline, and the Virtual Library. While superficially these appear as separate content, hypertext ensures considerable cross-referencing across the elements. Missing from this version of the CD-ROM are the "Period Tours", specifically designed to lead the user through a richly hyperlinked narrative that graphically demonstrates the inter-connectedness of the events, texts and pictorial elements of a South African literary history seen from a post-modern perspective. Until we have completed the painstaking but essential process of securing copyright permissions for the photographs, reproductions of paintings and some of the texts in the Tours, we cannot include these on a publicly-accessible version of the Web. Similar constraints have precluded us from using the numerous photographs of authors from the Encyclopaedia section. There is still plenty on the present disk to occupy the curious student, however.
The Encyclopaedia section has been developed from the 50,000 author, title, and event entries in the SALit Web database, and is the only resource of its kind to provide such extensive bibliographic and biographical information across the literature of South Africa in all its languages. In addition to references to texts in the official South African languages, Portuguese, German and Dutch are also represented, for instance in the important area of 18th-century travel writing about the country. Because the language medium of the SALit Web is English, some of the African language text references include English synopses: an important step in alerting researchers to themes and preoccupations in these works that have not previously been taken into account in the critical appreciation of South African literature. A related project, being undertaken in collaboration with the English Department at the University of Essen in Germany, aims to develop this aspect through the compilation of an annotated bibliography of African language novels, including short synopses and commentaries in English, which will eventually be incorporated into the SALit Web. The continuing development of the SALit database, which underpins the SALit Web CD-ROM, is one of the projects subsumed under a German-led project, based in Essen, of writing a South African Literary History. Further reference is made later to the scope this project offers for contributing to research in the area of literary history, while simultaneously adding to the database.
The second path open to the user of the SALit Web CD-ROM is the Timeline section, the title of which is to some extent self-explanatory. However, in the context of post-modern theory, the diachronic arrangement of this comprehensive listing of people, places, events and literature has a special significance. As previously mentioned, the notion of "discourse" is central to post-modern theory. To the post-modern theorist, literature is only one of many discourses (amongst politics, art, journalism, etc.) that are not regarded as having a causal or logically unfolding influence on literary output, but need to be discerned as discursive layers (discursive formations) to "determine to what extent they are reflected in the literatures of South Africa, or form part of those literatures, or to what extent literatures have written those formations." (Coetzee 1996: 14). In this light, the Timeline, particularly in its exhaustive coverage of diverse literary and other events detached from their usual associative contexts, provocatively juxtaposes hitherto unrelated "statements" so that events may more easily be identified as reflecting similar discursive formations. This potential increases as more material and references in the African languages are added to the list.
Of course, the Timeline has value on a less theoretical level, by highlighting the synchronous relationships between literary works and events of historical and social importance.
Finally, the Virtual Library section (in this version of the CD-ROM) contains three works:
Sol. T. Plaatje (1900). Boer War Diary.
George McCall Theal (1884). Kaffir Folk-Lore; Or, A Selection From The Traditional Tales Current Among The People Living On The Eastern Border Of The Cape Colony. With Copious Explanatory Notes.
Olive Schreiner (1897). Trooper Peter Halket Of Mashonaland.
Some 97 South African texts are in various stages of completion for electronic re-publication as part of the Encyclopaedia, along with a wide selection of general and title-specific criticism. The completed texts have been prepared in a word-processing program and marked up in HTML. The markup is fairly light, with hypertext contents pages and explanatory notes. A more extensively marked up version of Thomas Pringle’s 1834 African Sketches exists in Folio Views format. This version was prepared as part of an experiment with different software programs during my doctoral project. Although the program very successfully generated a full word index allowing quick searching for words and phrases and allowed cross-referencing of poems, notes and the text of the Narrative, we eventually decided to focus on platform-independent mark-up to allow wider accessibility to the texts.
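The kind of full word index described above, one that supports quick searching for words and phrases across the parts of a work, can be sketched in a few lines of Python. This is not a description of how Folio Views works internally, which we have not examined; the section titles and texts are invented placeholders, and the function names build_index and find_phrase are hypothetical.

```python
import re
from collections import defaultdict

# Invented placeholder sections standing in for the parts of a marked-up work.
sections = {
    "Poems": "placeholder verse about the wide karoo",
    "Narrative": "placeholder prose about the wide karoo and the coast",
}

def build_index(parts):
    """Map each word to the (section title, word position) pairs where it occurs."""
    index = defaultdict(list)
    for title, text in parts.items():
        for position, word in enumerate(re.findall(r"\w+", text.lower())):
            index[word].append((title, position))
    return index

def find_phrase(index, phrase):
    """Return (title, position) wherever the words of `phrase` occur consecutively."""
    words = re.findall(r"\w+", phrase.lower())
    if not words:
        return []
    hits = []
    for title, position in index.get(words[0], []):
        if all((title, position + offset) in index.get(word, [])
               for offset, word in enumerate(words[1:], start=1)):
            hits.append((title, position))
    return hits

if __name__ == "__main__":
    word_index = build_index(sections)
    print(find_phrase(word_index, "wide karoo"))
    print(find_phrase(word_index, "the coast"))
```

Because each hit records the section in which it occurs, the same index can serve as a basis for cross-referencing between poems, notes and narrative.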
The concept of the Pringle text brings us closer to Schreibman’s third category, namely "scholarly editions". While successfully restoring the original format of the work by combining, for the first time since the 19th century, Part 1: Poems Illustrative of South Africa and Part 2: Narrative of a Residence in South Africa into a single (now electronic) text, the Pringle text also offered considerable scope for the development of a wider information web around the original text. Amongst the possibilities of the wider web was the assemblage of primary and secondary information sources relating to Pringle’s involvement with the establishment of South Africa’s first newspaper, The South African Commercial Advertiser, and the subsequent clash between the publishers and the governor of the Cape, Lord Charles Somerset.
Future development of the SALit Web in its various forms has been assured by our collaboration with the South African Studies Centre (SASC) at the University of Essen in Germany and, in South Africa, by the collaboration between the M.L. Sultan Technikon’s Information Management and Multimedia project (IMEM) and the University of Durban-Westville’s Centre for the Study of Southern African Literature and Languages (CSSALL).
Birkerts, S. 1997. The Gutenberg Elegies. New York: Faber.
Bruce, D. 1993. Towards the implementation of text and discourse theory in computer assisted textual analysis. Computers and the Humanities. 27: 357-364.
Coetzee, A. 1996. Rethinking South African National literary history. South Africa? History? Literary History? Rethinking South African Literary History. Smit, J.A., van Wyk, J. and Wade, J-P. Durban: Y Press. 10-19.
Foucault, M. 1972. The archaeology of knowledge. London: Routledge.
Hart, M. 1992. History and philosophy of Project Gutenberg. Project Gutenberg Home Page. Internet. (http://jg.cso.uiuc.edu/pg_home.html)
Landow, G.P. 1992. Hypertext: The Convergence of Contemporary Critical Theory and Technology. Baltimore: Johns Hopkins University Press.
Miller, J.H. 1992. Illustration. Cambridge: Harvard University Press. 30.
Nielsen, J. 1995. Multimedia and Hypertext: The Internet and Beyond. Boston: AP Professional.
Ntuli, P. and Smit, J.A. 1999. Speaking Truth to Power: A Challenge to South African Intellectuals. Alternation: International Journal for the Study of Southern African Literature and Languages. Vol 6. No 1.
Reckwitz, E. 1997. Literary history: a problem sketch. In Reckwitz, E., Reitner, K. and Vennarini, L. (Eds) South African Literary History: Totality and/or Fragment. Essen: Die Blaue Eule. Englischsprachige Literaturen Afrikas, Bd 12: 11-22.
Robinson, P.M.W. 1996. The Wife of Bath's Prologue. By Geoffrey Chaucer. Cambridge: Cambridge University Press. CD-ROM.
Schreibman, S. 1999. Humanities computing: text to hypertext. The European English Messenger. Vol. 8, No. 1: 55-57.
Sperberg-McQueen, C.M. and Burnard, L. (Eds). 1994. Guidelines for electronic text encoding and interchange. TEI P3. Chicago and Oxford.
The British Library Board. 1996. The Electronic Beowulf Project. The Electronic Beowulf Homepage. London: The British Library. Internet. http://www.bl.uk/access/beowulf/electronic-beowulf.html.
Van Wyk, A.J. 1996. Towards a South African Literary History. In Smit, J.A., van Wyk, J. and Wade, J-P. (Eds) Rethinking South African Literary History. Durban: Y Press: 31-39.
Van Wyk, A.J. 1997. Concise historical survey: South African Literature. University of Durban-Westville.