Graham Stewart, M.L. Sultan Technikon, August 2001
This paper describes work in progress on the development of the South African Literature Online Subject Directory (SALO). The paper details the aims of the project within the context of the South African Literary Heritage Project, the methodology employed in the development of the directory, and the organisational design of the website.
The development of the South African Literature Online Subject Directory (hereafter referred to as SALO) arises out of a collaborative research initiative between the School of Languages at the University of Durban-Westville and the Department of Library and Information Studies at the ML Sultan Technikon in Durban. SALO aims to provide a website interface comprising a comprehensive, indexed set of current links to South African literary resources on the Internet. While the primary purpose of the project is to underpin the various research activities of the Literary Heritage Project (of which SALO is a component), the website is also available to the public, so anyone with an interest in South African literature may access the resource. In this way, SALO contributes to the broad aims of the South African Literary Heritage Project, namely to explore digital technologies for the promotion of communication and collaboration amongst researchers, and to increase access to previously marginalised work in all the South African languages:
[The aim is to] develop the archiving of South African literature through using new information technology and in this way to lay the foundations for the intertextual, cross-cultural and interdisciplinary study of South African literature and to make people aware of the importance of this new field of study in terms of knowledge production. As an interdisciplinary approach the project includes the different South African Language Literatures, History, Anthropology, Electronic Programming and Design, Library Science and Translation (CSSALL, 2001).
The development of SALO springs from assumptions both explicit and implicit in the Literary Heritage Project as a whole: that web-based publication and information sources offer increased access to literary and related material; that digital technologies offer an opportunity to bring works in the African languages to a wider public (of readers, scholars and learners) and that online publication and discussion forums offer unique opportunities for dialogue, learning and creativity. The specific focus of the present project, however, is enhanced information retrieval for research, drawing on expertise in the use of digital technologies for searching, finding, evaluating and re-packaging information in the field of South African literature. As most users of the Internet soon discover, an awareness that the information they require is "out there somewhere" in cyberspace is no guarantee that they will find it. Academics, like other users of the Internet, often lack the time, skills or the inclination to retrieve the material they need. Literary researchers (whether they be secondary school learners or professors) are no exception. SALO addresses the need for a quick, effective information retrieval tool as a service to literary scholars at all levels.
A brief synopsis of the SALO development is as follows: A review of the literature pertaining to the planning and structuring of online subject directories (also referred to as "web indexes") was conducted. An evaluative comparison of existing online subject directories and traditional library classification schemes informed the design of the pilot SALO web site. The results of an intensive online search and retrieval exercise were categorised according to a scheme appropriate to South African literary studies, and a pilot web site was published on the Internet.
Our research revealed that a successful subject directory (or web index) must have two key elements: firstly, the widest possible range of links to resources, and secondly, a clear (user-friendly) interface. Although concise and meaningful annotations also make a significant contribution to the quality of the retrieval process, links and interface are paramount. To illustrate this, consider a subject directory that fulfils only the first criterion. Examples of such sites were discovered during our study – "directories" that consist of page after page of hyperlinks, arranged either haphazardly or in alphabetical order. Scouring such a list may uncover a few useful links, but locating resources in an uncategorised or non-indexed directory can be as laborious and time-consuming as searching the full Internet itself. At the other extreme, we found examples of attractive and clearly laid out pages with disappointingly few useful links. For instance, a promising-looking map of Africa offering literature-related hyperlinks to various African countries was found to have very few operational links, and most of the "live" links led to obscure and unrepresentative resources (Agatucci, 2001).
The rest of the paper is structured to provide a review of the literature pertaining to the design of online directories, an account of the methodology used to gather information, a discussion of the development of the organisational structure of the directory, a brief description of the databases that underlie SALO, and finally an invitation to readers to participate online in building and improving the resource.
Research in this area ranges from highly technical accounts of the programming and construction of major bibliographic databases such as that of the Online Computer Library Center (OCLC), to meticulous descriptions of adaptations to machine-readable fields in library classification schemes. With the limited resources at our disposal, we found smaller scale, yet sound, solutions to achieving our goals in articles describing the development of two accessible and popular Internet subject directories: Yahoo! and the Librarians’ Index to the Internet (LII).
Both sites use what could be described as "cascading" menu designs: broad categories at the top level (first page), leading to related sub-categories on lower levels. A straightforward example is Author → Specific author names → Titles by or about the author. Such hierarchical structures are typical of schemes used in traditional libraries. In her discussion of the Yahoo! subject directory, Callery (1996: 1) refers to this design as a "hierarchical subject index".
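As a minimal sketch of such a cascading structure, the Python fragment below models the Author → author name → titles example as nested dictionaries. The author names and titles are placeholders rather than entries from Yahoo!, LII or SALO, and the `browse` helper is a hypothetical name introduced only for illustration.

```python
# Hypothetical three-level directory: category -> author -> titles.
# All names below are placeholders, not real directory entries.
DIRECTORY = {
    "Author": {
        "Author A": ["Title One", "Title Two"],
        "Author B": ["Title Three"],
    },
}


def browse(tree, path):
    """Follow a sequence of menu choices and return what is shown next."""
    node = tree
    for choice in path:
        node = node[choice]  # raises KeyError if the choice is not on the menu
    # A dict is a menu of sub-categories; a list is the final set of entries.
    return sorted(node) if isinstance(node, dict) else list(node)


top_level = browse(DIRECTORY, [])                   # first page
authors = browse(DIRECTORY, ["Author"])             # second level
titles = browse(DIRECTORY, ["Author", "Author A"])  # third level
```

Each step down the path narrows the menu, so the user never has to guess a search term; the trade-off is that every entry must be filed under a category in advance.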
Amongst the advantages of the cascading menu design, according to Callery, are the higher relevancy rate of retrieved items, and the reduced demand it makes on the user to know all the synonyms of a search term to find a topic (as would be the case in an open search engine, like Google, for instance). Callery gives the following example: "The user ... browses a short list of subcategories, selects a subcategory called Organizations, and there are the sites. It’s not necessary for the user to bring these entities together herself; they are already arranged that way." (1996: 1)
Of even more relevance to the SALO project - given the context of library science in which it has been conducted - is Callery’s reference to Yahoo’s practice of employing cataloguers, who examine submissions to the directory, exercise quality control on the sites selected, and index them according to the directory structure. Callery’s observation that "Every site added to Yahoo! is examined by a human being" (1996: 2) drives home the point that - unlike Yahoo! - most major Internet search engines are fully automated, and are unable in their present state of development to distinguish adequately between primary web pages and their often numerous sub-pages - hence the clutter of unwanted search results familiar to Internet users. As will be seen below, we chose a similar procedure for selecting sites for inclusion in the SALO directory: we evaluate each submission posted on our web server, as well as other sites retrieved during our wide-ranging Internet searches.
Yahoo! offers Internet users two distinct search options: one can either browse the subject directory (tree) itself or use what Callery refers to as a "keyword search". The keywords on which this latter option is based comprise all the words included in the "title" and "description" fields of the database, together with any terms that might have been included as metatags in the HTML header by the web page author. In this way the combination of descriptive words in the fields forms a rich (and relevant) source of metadata as a basis for the subsequent automated search. The flexibility offered by these options has also been incorporated into the design of the pilot SALO site.
Beekink (1999: 32-4), in reviewing the professional literature, observed that although much attention had been paid to the growing number of Internet search engines, little mention had been made of subject directories, which usually provide more guidance to users. His observation that subject directories retrieve fewer but more relevant sources of information than search engines recalls Callery’s comments on the Yahoo! directory, and has similarly influenced the design of SALO. In particular, Beekink noted that "Subject directories have fewer problems with the wrong spelling of words and obsolete links, and guide users to related terms".
On the issue of site design, Henninger (1999: 182), echoing advice given by other authorities in the field (Lynch and Horton, 1999; Nielsen, 2000), alerts web designers to the sense of disorientation that can beset a user accustomed to the logical sequence of a print document. Henninger recommends that a good Web index should include sound information design elements to provide this orientation, such as navigation devices that provide immediate access to all the major 'chunks' of information in the 'document' or site. Most importantly, these navigation devices need to be visible, though not intrusive, at all times. Hence, ease of navigation and the reduction of "clutter" informed the design of the SALO site, with links to the major categories clearly and simply signposted. Henninger also stresses the effectiveness of using annotations ("a type of descriptive metadata") as valuable additions to subject gateways, significantly improving the information retrieval process. Combined with the Yahoo!-style "description" field in an underlying database, such annotations create the ideal metadata to drive a free search option in a subject directory, and our inclusion of a similar field in the SALO database has created the potential for enhancing our search options as the site continues to develop.
Two critical questions underpinned the methodology used to develop the subject directory: (a) Why is a subject directory of online South African literature resources necessary? and (b) How would such a subject directory be organised? (together with the related question: What steps are necessary in the development and publication of such a subject directory?)
To answer the first question, we tested our assumption that there were no comprehensive subject directories of South African literature on the Internet by consulting both secondary and primary sources. Published material on subject directory development was reviewed, and primary data on the content and design of existing Internet subject directories was collected. The exercise confirmed that there were no comprehensive resources on South African literature available. Where directories and lists of web links dealing with partial aspects of the subject were found, details were recorded for inclusion in the SALO directory database. Notable amongst these "partial" directories were the excellent list of South African literary resources maintained by Stanford University (Fung, 2001) and the MWeb Litnet site (in particular, De Aar). Another promising but under-developed site was the Brown University Post-colonial Web (Landow, 2001). Comparisons between the links provided by these sites and our own pilot list of links revealed that many valuable resources had been omitted; moreover, the other sites tended either to reflect the literature of only one language (for example, Afrikaans on Litnet, and English on the Stanford site) or contained too few links to be representative of the full range of South African literary output.
Proceeding to our second critical question - How would such a subject directory be organised? - we analysed the design and organisation of existing subject directories dealing with literature (both South African and international). These sites were located by searching the WWW (using search engines and subject directories) and by directly requesting recommendations from members of the South African Literary Heritage research team. As mentioned, we finally settled on Yahoo! and the Librarians’ Index to the Internet (LII) as ideal exemplars for the overall structure of SALO. Secondary sources were also consulted to examine design principles that had been tested elsewhere, and to review suitable models for the structure of SALO.
To decide on the most suitable set of categories for SALO, we tabulated three lists of category headings, drawn from Yahoo! (Literature), the Librarians’ Index to the Internet (Literature) and the Dewey Decimal Classification (DDC) tables (Table 3) (see Figure 1). The inclusion of the DDC categories was influenced by Callery’s (1996) account of the hierarchical development of the Yahoo! directory. While the Yahoo! developers had, after careful consideration, steered away from conventional library classification schemes, deeming them inappropriate for a web subject directory, they acknowledged that they looked to other systems such as the Library of Congress Classification (LCC) for ideas and guidelines for the organisation of certain areas. The decision to include the DDC in selecting categories for SALO was an attempt to take advantage of our likely users’ familiarity with the DDC, while retaining the flexibility of web-based subject directory design.
Occurrences of similar categories in all three authorities were noted and ranked in terms of frequency. At the same time, clearly unsuitable headings (for example, bookstores in California) were deleted, while unlisted but South Africa-specific headings (for example, Technikons and Universities) were added.
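The tabulate, rank, delete and add steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the procedure actually used: the heading lists are invented placeholders (apart from the "Technikons and Universities" addition named in the text), headings are matched by exact string, and `rank_headings` is a hypothetical helper.

```python
from collections import Counter

# Placeholder heading lists; the real lists were drawn from Yahoo!,
# LII and DDC Table 3 and are not reproduced here.
SOURCES = {
    "yahoo": ["Poetry", "Drama", "Authors", "Bookstores"],
    "lii": ["Poetry", "Authors", "Criticism"],
    "ddc": ["Poetry", "Drama", "Criticism"],
}
UNSUITABLE = {"Bookstores"}                        # judged irrelevant to SALO
LOCAL_ADDITIONS = ["Technikons and Universities"]  # South Africa-specific


def rank_headings(sources, unsuitable, additions):
    """Rank headings by how many authorities list them, then apply edits."""
    counts = Counter()
    for headings in sources.values():
        counts.update(set(headings))  # each authority counts at most once
    ranked = sorted(
        (h for h in counts if h not in unsuitable),
        key=lambda h: (-counts[h], h),  # most frequent first, then A-Z
    )
    return ranked + [a for a in additions if a not in ranked]


categories = rank_headings(SOURCES, UNSUITABLE, LOCAL_ADDITIONS)
```

In practice the matching of "similar" headings would need human judgement, since the three authorities rarely use identical wording for the same concept.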
Figure 1: Comparison of category headings table
The question of the selection of first-level category headings was debated by members of the project group. Eventually it was decided to adopt the founding principle of the Centre for the Study of Southern African Literature and Languages as the starting point: the study of South African Literature in all its languages. Thus the equal status accorded to the eleven official languages in the South African Constitution (1996) was used to determine the first level of the subject directory hierarchy. Once work had begun on searching for sources, some problems emerged as a result of this initial choice. For instance, past inequalities have produced under-representation in the literary output in some languages, leading to empty directories in certain second-level categories (forms and genres). There are also a few unavoidable absurdities arising from the imposition of hierarchical links leading from "Language" to "Form" and "Genre" - for instance, no extant "novel" entries in the "San" or "Sign language" categories. Another, unanticipated obstacle to our "equal language" strategy was the lingering existence of apartheid-era configurations in university departmental structures. We found that links to academic and research sites in South Africa led inexorably to "Afrikaans", "English" and "African Languages" departmental structures, frustrating our attempts to reflect a flat language "landscape" in which no one grouping was privileged. Nevertheless, the subject directory has the potential to accommodate future growth and development of literatures across the range of South Africa’s languages.
Generic categories that are not language-specific were listed separately, and appear alongside Languages in the first level, connecting directly to online resources. As discussed above, the second-level categories (under "Language") consist of form, genre and other language-specific descriptors.
At this point in the development of SALO, it was decided to omit certain community languages such as German, Greek, Gujarati, Hindi, Portuguese, Tamil, Telegu and Urdu; and languages used for religious purposes: Arabic, Hebrew and Sanskrit. The three "semi-official" Khoe, San and Sign languages have, however, been retained in Level One.
Figure 2: SALO home page showing first-level categories
Both the review of the literature and the analysis of Internet sites confirmed the two key criteria for a good online subject directory: many useful links, and a user-friendly interface. The SALO user interface aims to meet both criteria. Three search possibilities are presented to the user on the home page: search by language, search by category and key word search (see Figure 2). At the present stage of development, the first two options retrieve all the entries, arranged alphabetically, for the language or category chosen. The key word search option accepts a single key word and retrieves the entries that include it in the relevant metadata field ("Description" in the SALO database).
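The three search options can be sketched as simple filters over a list of records. The fragment below is an illustration only: the records and example.org URLs are placeholders, the function names are hypothetical, and two details are assumptions not stated in the text, namely that results are sorted by title and that the key word is matched as a case-insensitive substring of the Description field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    title: str
    description: str
    url: str
    language: str
    category: str


# Placeholder records, not real SALO entries.
ENTRIES = [
    Entry("Zulu praise poetry", "Archive of izibongo texts",
          "http://example.org/a", "isiZulu", "Poetry"),
    Entry("Afrikaans drama index", "Listing of stage plays",
          "http://example.org/b", "Afrikaans", "Drama"),
    Entry("Afrikaans poets", "Biographies and poetry samples",
          "http://example.org/c", "Afrikaans", "Poetry"),
]


def by_language(entries, language):
    """All entries for one language, alphabetical by title (assumed key)."""
    hits = [e for e in entries if e.language == language]
    return sorted(hits, key=lambda e: e.title.lower())


def by_category(entries, category):
    """All entries for one category, alphabetical by title (assumed key)."""
    hits = [e for e in entries if e.category == category]
    return sorted(hits, key=lambda e: e.title.lower())


def by_keyword(entries, keyword):
    """Single key word, matched against the Description field only."""
    needle = keyword.strip().lower()
    return [e for e in entries if needle and needle in e.description.lower()]


afrikaans = [e.title for e in by_language(ENTRIES, "Afrikaans")]
poetry = [e.title for e in by_category(ENTRIES, "Poetry")]
keyword_hits = [e.title for e in by_keyword(ENTRIES, "poetry")]
```

Because only the Description field is searched, the first placeholder record is not returned for "poetry" even though the word appears in its title, which shows why the quality of the annotations matters for retrieval.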
Further work on the subject directory will test the feasibility of including cascading menus under each of the language categories, and a wider range of search options in the key word search feature.
Although the database contains eighteen fields, only three of these are displayed to the user: Title, Description and URL (the URL field provides a direct hyperlink to the referenced online resource). The other fields (primarily based on the Dublin Core, see below) are hidden from the user and allow for sorting, and for the inclusion of further details regarding the site, e.g. copyright information, and date of publication.
Although largely invisible to users of the site, the systematic description of each resource is a key factor in the delivery of relevant search results (see the discussion of metadata fields above). The underlying SALO database contains a set of database fields based on the Dublin Core (Dublin Core, 2000). The Dublin Core fields were developed by the academic museum community as finding aids for digital material, and have been widely adopted for their versatility and common-sense approach to cataloguing. This flexibility has proved to be ideal for accommodating our SALO data. In order to fast-track the publication of our pilot site, we performed the "reality check" recommended by the Dublin Core working group: "Is the record itself, and each element within that record, useful for resource discovery? If not, leave it out." (CIMI, 2000: 9). We began by concentrating on only two fields: the resource title, and its associated URL. At the time of writing we have stopped adding sites, and are writing annotations (or short descriptions) of each resource to enhance the metadata in the "Description" field.
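The split between stored and displayed fields can be sketched as follows. The fifteen names are the elements of the Dublin Core Metadata Element Set; everything else is an assumption for illustration: the paper does not list SALO's eighteen fields, the mapping of the URL to the "identifier" element is a guess, and `make_record` and `public_view` are hypothetical helpers.

```python
# The fifteen element names of the Dublin Core Metadata Element Set 1.1.
DUBLIN_CORE = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)

# Assumption: the URL is held in "identifier"; only these three are shown.
DISPLAYED = {"title": "Title", "description": "Description", "identifier": "URL"}


def make_record(**fields):
    """Build a record, rejecting names that are not Dublin Core elements."""
    unknown = set(fields) - set(DUBLIN_CORE)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    # "If not [useful], leave it out": empty elements are simply dropped.
    return {name: value for name, value in fields.items() if value}


def public_view(record):
    """Project a full record onto the three fields shown to the user."""
    return {label: record.get(element, "") for element, label in DISPLAYED.items()}


record = make_record(
    title="Example resource",
    identifier="http://example.org/resource",
    description="",  # annotation not yet written
    rights="Copyright held by the publisher",
)
view = public_view(record)
```

Starting with only title and URL, as in the pilot, yields valid records under this scheme; the description can be filled in later without changing the structure.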
In addition to being able to search the SALO site, users are able to submit their own recommendations (and comments on the subject directory itself) via an online submission form labelled "Submit a site". Thirty-five recommendations from remote users were recorded on the submissions database during the first two weeks following the publication of the pilot site. Readers are invited to participate in building our database of links, and to report any errors by accessing the site at http://nymphs.udw.ac.za/sisal.htm.
The assistance of members of the Literary Heritage Project, in particular the project leader, Johan van Wyk, was indispensable in the development of the pilot SALO site. Acknowledgment is also made to the technikon B.Tech research assistants, Judgement Khoza, Portia Rakoma and Gugulethu Mkhize, for their invaluable contribution to the project.
1996. Chapter 1: Founding Provisions, Section 6: Languages. Constitution of the Republic of South Africa. Act 108 of 1996. Retrieved 12 March 2001, from the World Wide Web: http://www.polity.org.za/govdocs/constitution/saconst.html
2001. Literature. Yahoo Subject Directory. Yahoo! Inc. Retrieved 12 March 2001, from the World Wide Web: http://dir.yahoo.com/Arts/Humanities/Literature/
Agatucci, C. 2001. Culture(s) & Literature(s) of Africa. Central Oregon Community College: Home Page. Retrieved 6 June 2001, from the World Wide Web: http://www.cocc.edu/cagatucci/classes/hum211/literarymap.htm
Beall, J. 1997. Cataloging World Wide Web sites consisting mainly of links. Journal of Internet Cataloging. 1 (1): 83-92.
Beekink, M. 1999. Een andere benadering van het web: wetenschappelike subject directories. [A different approach to the Web: academic subject directories.] Informatie Professional. 3 (90): 32-4.
Brummer, A. et al. 1997. The Role of Classification Schemes in Internet Resource and Discovery. DESIRE - Development of a European Service for Information on Research and Education. Retrieved 2 March 2001, from the World Wide Web: http://www.ub.lu.se/desire/radar/reports/D3.2.3/class_v10.html
Callery, A. 1996. Yahoo! Cataloging the Web. Journal of Internet Cataloging. 1 (1): 83-92.
CIMI Dublin Core Metadata Working Group. 2000. Guide to Best Practice: Dublin Core. Version 1.1. Consortium for the Computer Interchange of Museum Information. Retrieved 2 December 2000, from the World Wide Web: http://www.cimi.org
CSSALL. 2001. South African Literary Heritage Project - Developing an electronic archive for a new discipline. Centre for the Study of Southern African Literature and Languages (CSSALL) Home Page. University of Durban-Westville. Retrieved 1 June 2001, from the World Wide Web: http://nymphs.udw.ac.za/lhp.htm
Dewey, M. 1979. Dewey Decimal Classification and relative index. Volume 2, Schedules 000-599. 19th Edition. Albany : Forest Press. 1389-98.
Fung, K. (Ed.). 2001. Africa South of the Sahara: Selected Internet Resources. South African Libraries and Archives. Stanford University Libraries and Academic Information Resources. Retrieved 6 March 2001, from the World Wide Web: http://www-sul.stanford.edu/depts/ssrg/africa/southafrica/rsalit.html
Garlock, K.L., Piontek, S. and Culshaw, J. 1997. Building the Service-Based Library Web Site: A Step-by-Step Guide to Design and Options. Technical services quarterly. 14 (4).
Henninger, M. 1999. What makes a good Web index? Indexer. 21 (4): 182-3.
Landow, G.P. 2001. The Republic of South Africa. Contemporary postcolonial and postimperial literature in English. Brown University Scholarly Technology Group. Retrieved 5 April 2001, from the World Wide Web: http://landow.stg.brown.edu/post/sa/safricaov.html
Leita, C. (Ed.). 2001. Literature and Books. The Librarian's Index to the Internet (LII). Retrieved 15 March 2001, from the World Wide Web: http://www.lii.org/search/file/literature
Lynch, P. and Horton, S. 1999. Web Style Guide : Basic Design Principles for Creating Web Sites. New Haven: Yale Univ Press.
Nielsen, J. 2000. Designing Web usability. The practice of simplicity. Indianapolis: New Riders.
Stewart, G.D.J. et al. 2001. South African Literature Online Subject Directory. Centre for the Study of Southern African Literature and Languages (CSSALL). Retrieved 6 June 2001, from the World Wide Web: http://nymphs.udw.ac.za/sisal.htm
Wallis, J. and Burden, P. 1995. Towards a Classification-based Approach to Resource Discovery on the Web. Proceedings: 4th International W4G Workshop on Design and Electronic Publishing, Abingdon, UK. 20-22 November 1995. Retrieved 17 April 2001, from the World Wide Web: http://www.scit.wlv.ac.uk/wwlib/position.html