8 Institutions and Projects

This section describes several institutions and projects around the world which provide services or hold digital resources relevant to the AHDS. In section 8.1 UK Services, we list some existing UK-based institutions which already offer, to a greater or lesser extent, facilities similar to those proposed for AHDS Service Providers. In section 8.2 Comparable services based outside the UK, we list some comparable institutions in Europe and North America. Finally, in section 8.3 Example projects, we list a few examples of current Arts and Humanities projects likely to be of interest to the AHDS community.

It should be emphasized that these lists are all intended for illustrative purposes only, and are not intended to be comprehensive. The services discussed are not all exclusively concerned with Humanities resources, and range from major well-established Services on the one hand to small-scale pilot projects on the other. Information here was provided by the Services concerned before or during the feasibility study, and has not been checked in detail.

8.1 UK Services

8.1.1 British Atmospheric Data Centre (BADC)

8.1.1.1 Overview

This Centre provides data on the atmosphere and its boundaries to researchers working in the area of atmospheric sciences, global climate change and earth observation. Like others which hold data on other physical sciences (e.g. oceanography, solar-terrestrial physics, etc.), the Centre provides access to experimental data sets, many of which can be very large, and very quickly generated (for example, the digital data from satellite observations). Such data sets are thus rather different from those typically used by humanities disciplines, but other aspects of the service provided by such Centres are very similar.

The emphasis is entirely on research rather than teaching.

8.1.1.2 Key services:

8.1.1.3 Extent

Data relevant to atmospheric science in its widest sense.

8.1.1.4 Source of funding

Natural Environment Research Council.

8.1.2 Computers in Teaching Initiative (CTI) Centres

8.1.2.1 Overview

There are 20 CTI Centres in the UK, of which six are immediately relevant to the Humanities disciplines. The Centres, which are based in universities, promote and support the use of computers in the teaching of higher education subjects.

In some ways the CTI structure may be seen as a model for the proposed AHDS structure, with a number of autonomous discipline-related Centres and a central support organisation, the Computers in Teaching Initiative Support Service (CTISS), based at Oxford.

8.1.2.2 Key services:

8.1.2.3 Extent

Each centre typically consists of two or three persons; their holdings of software and data sets can amount to some hundreds of items. The humanities-related centres cover the following subjects:

8.1.2.4 Source of funding

Higher Education Funding Councils.

8.1.3 ESRC Data Archive (EDA)

8.1.3.1 Overview

The Archive's primary mission is to promote wider and more informed use of data for teaching and research. As well as encouraging current use, there is a strong emphasis on long-term preservation. In January 1993 the Archive set up a History Data Unit, with funding from the ESRC for an initial two years. Some continuation funding will be provided by the British Academy and is being sought from other sources.

A particular feature is the BIRON catalogue. This catalogue covers most of the Archive's holdings, and has been built up over a number of years by trained librarians using a controlled vocabulary. The Archive has relationships with several other institutions, such as:

Much data documentation is paper-based, but work is under way to introduce electronic formats; so far this has been completed for about 850 data sets.

The Archive currently holds the secretariat of the Council of European Social Sciences Data Archives (CESSDA), and the UK membership of the US-based Inter-university Consortium for Political and Social Research (ICPSR). It is due to become a European specialist centre for data archiving methodology, under the CESSDA umbrella.

8.1.3.2 Key services

8.1.3.3 Extent

Disciplines include social sciences, economics, history, geography, and law.

The Archive is one of the largest social science archives in the world. It has over 7,000 data sets, of which nearly 4,000 are fully catalogued and the rest are opinion polls. The number of data sets is increasing at a rate of about 250 per year. [See note 47] Each year more than 2,000 data sets are accessed, and over 1,500 data catalogues are distributed.

8.1.3.4 Source of funding:

ESRC, Essex University, JISC.

8.1.4 Global Environmental Change Data Network Facility (GENIE)

8.1.4.1 Overview

This is a project initiated by the ESRC, which is funding and managing the facility on behalf of the Inter-Agency Committee on Global Environmental Change (IACGEC). It is run by a consortium consisting of the University of Loughborough, University of Nottingham and Genasys II Ltd, and started work in April 1992. It consists of a centralised facility and a decentralised network. Of particular interest in this project is the use of knowledge engineering approaches to provide intelligent concept-based searches of materials held at all the participating institutions.

The primary tasks of the project are:

8.1.4.2 Key services:

8.1.4.3 Extent

The project is still in its initial pilot phase. The aim is to cover all the relevant materials held at all the participating institutions.

8.1.4.4 Source of funding

Inter-Agency Committee on Global Environmental Change and ESRC.

8.1.5 Manchester Information Datasets and Associated Services (MIDAS)

8.1.5.1 Overview

This organisation exists for the installation and management of strategic research and teaching data sets for access and analysis by the UK academic community, and the provision of the services required to facilitate and support this. It specializes in the provision of on-line access to large-scale data sets and the software needed to process them for research and teaching purposes. Principal examples of the data are the 1981 and 1991 census data, and government survey data sets. Data sets are currently stored in software-specific formats. Supported software includes SIR, Ingres, BRS/Search and SASPAC.

Access is available without charge to all bona fide requesters, subject to the copyright restrictions imposed by the depositors.

8.1.5.2 Key services:

8.1.5.3 Extent

A large number of the data sets managed by MIDAS are obtained via the ESRC Data Archive (see section 8.1.3), and support services relating to them are run in conjunction with the Data Archive.

The data sets are relevant to the social sciences and some humanities disciplines.

There are currently some 800 users, largely staff and postgraduates, from about 100 sites. However, MIDAS is a new service, and the number of registrations increases daily. For its predecessor, there were several thousand users registered, from over 150 sites.

8.1.5.4 Source of funding

JISC.

8.1.6 NetEc

8.1.6.1 Overview

``NetEc'' is a term that unites a number of projects for networked interaction in academic Economics. At the time of writing, there are three projects, called BibEc, WoPEc and CodEc. They aim to improve the communication of new research results in Economics via electronic media. Traditionally, new research results have been published in paper documents called ``working papers'' or ``discussion papers''. The aim of BibEc is to enhance the awareness of these papers in the academic community by establishing means to announce the publication of new papers via electronic media. The aim of WoPEc is to build a collection of electronic working papers in PostScript format, for free retrieval by anybody with Internet access. CodEc, which opened in June 1994, aims to build a collection of software routines which are useful in the study of Economics.

The effort is multinational in scope, with people involved in six countries, but has no formal management structure. The project uses an internal bibliographic format referred to as BibEc format. Papers are stored as PostScript files, usually compressed.

8.1.6.2 Key services:

8.1.6.3 Extent

Over 20,000 bibliographic references (late 1993), growing rapidly; small number of papers.

Covers academic economics only.

8.1.6.4 Source of funding

Managed on a voluntary basis.

8.1.7 Office for Humanities Communication (OHC)

8.1.7.1 Overview

This organisation aims to influence, and to provide a centre of expertise on, academics' use of computers in all aspects of their work (compare the CTI Centres discussed in section 8.1.2 above).

8.1.7.2 Key services:

8.1.7.3 Extent

The office, which consists of three people, covers all the Humanities disciplines in Britain.

8.1.7.4 Source of funding

British Library Research and Development Department.

8.1.8 Oxford Text Archive (OTA)

8.1.8.1 Overview

The OTA exists to serve the interests of the academic community by providing archival and dissemination facilities for electronic texts at low cost.

It offers scholars long term storage and maintenance of their electronic texts free of charge. It manages non-commercial distribution of electronic texts and information about them on behalf of its depositors.

The texts vary in format, quality and degree of mark-up. There is a gradual process of converting to standard formats. The texts are available for research and teaching, subject to the copyright limitations imposed by their depositors.

8.1.8.2 Key services:

8.1.8.3 Extent

More than 1,300 texts, mainly in the domains of literature and linguistics, in many languages.

8.1.8.4 Source of funding

Funded by Oxford University Computing Services and a grant from the Text Encoding Initiative (TEI). It also makes a small charge to cover media services when copies are sent other than by Internet.

8.1.9 UK Office for Library and Information Networking (UKOLN)

8.1.9.1 Overview

UKOLN provides services to the library and information community. It conducts, manages and publishes research, and also maintains a number of information resources (on networking issues, the Internet, bulletin boards, discussion groups, ``gopher-space'', the World Wide Web, WAIS, etc). UKOLN also plans to set up a digital archive of library literature in the near future.

8.1.9.2 Key services:

8.1.9.3 Extent

Consists of seven full-time staff; areas of interest are:

8.1.9.4 Source of funding

British Library Research and Development Department and JISC.

8.2 Comparable services based outside the UK

8.2.1 Bibliothèque Nationale de France (BNF)

8.2.1.1 Overview

This library is currently building a data service which will be one of the largest, if not the largest, in Europe. It will consist of a public library, which will succeed the Bibliothèque Nationale as France's major repository of printed books. However, it will also include a vast collection of books, images, sound and motion pictures in digital form. The digital resources will include Frantext and several other substantial collections. It is due to open in 1997.

8.2.1.2 Key services

The library will provide facilities at its site for members of the public and accredited researchers to refer to digital documents. Large numbers of workstations, some specially developed, will be available for consultation of these resources. The number of workstations envisaged by the year 2000 is:

The academic research workstations are particularly interesting. Named the Computer-Aided Reading Environment (CARE), or in French the Poste de Lecture Assistée par Ordinateur (PLAO), these will offer sophisticated facilities for accredited researchers to use text and image documents. The facilities will include features such as navigation, temporary annotation, linking, searching, printing, extracting and temporary saving.

The library has negotiated agreements with French publishers to permit electronic storage and use of works still in copyright, though the initial agreements are for a limited period. Perhaps for this reason, there are no plans to allow remote access to the digital collection; all access will be from the main site in Paris.

8.2.1.3 Extent

The digital collection is due to include books, journals, and secondary literature, extending to:

The intention is to build a strong corpus of French literature and reference texts. The main domains identified for digital holdings are French literature, anthropology, philosophy, history, ethnology, economics, law and the history of science. The corpus is expected to contain both original texts and modern editions which include commentary on the originals.

8.2.1.4 Source of funding

French Government.

8.2.2 Center for Electronic Texts in the Humanities (CETH)

8.2.2.1 Overview

This center was established by Rutgers and Princeton Universities in 1991 and is funded by the National Endowment for the Humanities to act as a national focus within the USA for the creation, dissemination and use of electronic texts in the humanities.

8.2.2.2 Key services:

8.2.2.3 Extent

Bibliographic database; other material is mostly primary source texts in the humanities.

8.2.2.4 Source of funding

Rutgers and Princeton Universities, the National Endowment for the Humanities, and the Andrew W Mellon Foundation. A subscription-paying ``consortium'' of member institutions will start in 1995, and there are also plans for associated grant-funded projects.

8.2.3 Canadian Heritage Information Network (CHIN)

8.2.3.1 Overview

This agency has 21 years of experience acting as an information service for the museums and heritage communities both in Canada and elsewhere. It creates national collection databases for humanities, natural sciences and archaeological sites from data contributed by museums and heritage agencies across Canada. Besides these national databases, CHIN hosts more specialised databases related to the heritage community.

Holdings include bibliographic databases, directories, controlled vocabularies (for example the Getty Art History Information Program's Art and Architecture Thesaurus), indexes to artists, etc. These databases are created by other organisations and distributed and managed by CHIN.

8.2.3.2 Key services:

8.2.3.3 Extent

The range of materials covered includes heritage information, archaeology, humanities, and the natural sciences.

8.2.3.4 Source of funding

Communications Canada, an agency of the Canadian Government.

8.2.4 Dansk Data Arkiv (DDA)

8.2.4.1 Overview

This Archive preserves and disseminates data sets in the domains of social sciences, history and medical research, almost exclusively from questionnaire-based surveys. Early holdings were in social sciences, the other domains having been added since 1985. On an exceptional basis, the archive may accept electronic texts outside the domain of history. It supports both research and teaching functions. The level of use is rising ``rapidly'', and currently stands at approximately 250 requests, representing 750 data sets, per year.

Data is obtained from the following sources:

Note that deposit of data with the DDA is often a condition of grant from Danish research councils.

8.2.4.2 Key services:

8.2.4.3 Extent

Archives data from the following disciplines:

About 1,000 data sets are held with complete documentation.

8.2.4.4 Source of funding

Danish Ministry of Culture.

8.2.5 DANTE InfoFLOW

8.2.5.1 Overview

DANTE was set up in 1993 by several European research networks. Its purpose is to provide advanced international computer network services for the European research community. One of its major activities is the management of EuropaNET. It is today a small organisation employing some ten people (but it is actively recruiting more).

In June 1994, DANTE prepared a proposal to set up an information server called InfoFLOW for the European research community, the main purpose of which would be to assist people to locate valid information sets. It is now recognised that further work is required before such a proposal can be taken forward. Work on the proposal is expected to continue, though there is no firm timetable for it.

If InfoFLOW goes ahead in the shape currently envisaged, it would have, potentially at least, a large area of overlap with AHDS. Specifically, the AHDS and InfoFLOW would each wish to hold metadata about the same data sets.

8.2.5.2 Key services

InfoFLOW would hold metadata in a series of servers in different countries, with the actual data sets being held mainly by the data provider (i.e. on systems not under the control of DANTE). The proposal describes indexing by subject and location.

8.2.5.3 Extent

The proposal envisages a service to serve all disciplines (not just Arts and Humanities). Users would refer to the metadata by WWW and Gopher unless and until other approaches displace these as de facto standards.

It is suggested that accession decisions will be taken by external subject experts contracted for this purpose.

8.2.5.4 Source of funding

Not yet determined; likely to be drawn from individual information providers, national networks and the European Union.

8.2.6 Netherlands Historical Data Archive (NHDA)

8.2.6.1 Overview

This institution, which is located at the University of Leiden, provides services and resources for research and (to a lesser extent) for education. Its central aim is to collect, document, store and disseminate research files created by historians working within academic and official institutions.

It is also a centre of excellence for scanning and OCR of ``difficult'' materials (such as carbon copies); and it administers part of a postgraduate programme at the university.

8.2.6.2 Key services:

Holds about 100 data sets, mainly structured social and economic history files.

8.2.6.3 Source of funding

Approximately 50 per cent from projects of the Dutch Ministry of Education or National Science Association; the balance from other external projects.

8.2.7 Getty Art History Information Program (AHIP)

8.2.7.1 Overview

This is one of seven operating programs of the J. Paul Getty Trust, based in Santa Monica, California. Its purpose is to make computer-based art-historical information accessible to scholars. It collaborates with a number of US and international organisations in a wide range of research and development projects.

Among AHIP's priorities are the development of standards relevant to art historians --- e.g. of terminology, the extension of bibliographic resources, and the development of tools to aid art historical research --- such as The Authority Reference Tool.

8.2.7.2 Key services

Initiating and supporting programs in four major areas:

8.2.7.3 Extent

Amongst major research resources developed by AHIP or with its support are:

8.2.7.4 Source of funding

The J. Paul Getty Trust.

8.2.8 Norwegian Computing Centre for the Humanities (NCCH)

8.2.8.1 Overview

This centre exists to provide information and consultancy services for Norwegian research projects involving the use of computers in the humanities. NCCH is also involved in research activities; it is particularly known for its expertise in corpus building.

Most work at present is in support of research; but it is felt that support for teaching activities will grow.

In the furtherance of its research activities, it has assembled a collection of digital resources. However, the Centre does not get involved with the commercial aspects of marketing and distribution.

8.2.8.2 Key services:

8.2.8.3 Extent

All holdings and activities relate to the humanities.

8.2.8.4 Source of funding

Until 1992, the NCCH was funded directly by the Research Council of Norway. In 1992, it became a part of the University of Bergen, and is now jointly funded 50 per cent by the Research Council and 50 per cent by the University. Funding is guaranteed for the next five years.

8.2.9 Norwegian Social Sciences Data Archive (NSD)

8.2.9.1 Overview

This is the primary archive service in Norway for the preservation of electronic research data. It started as an archive for social sciences, but is now active in other fields.

It supports both research and teaching functions.

Some data sets receive a high level of use; up to 400 accesses per month in one case.

8.2.9.2 Key services:

8.2.9.3 Extent

Archives data from the following disciplines:

8.2.9.4 Source of funding

60 per cent from the Research Council of Norway, balance from special projects and the provision of services.

8.2.10 Research Libraries Group (RLG)

8.2.10.1 Overview

RLG was founded in 1974, as a non-profit alliance of Higher Education institutions, devoted to improving access to information that supports research and learning. It sees its major role as supporting and co-ordinating the development of co-operative solutions to the problems associated with the acquisition, delivery and preservation of research information.

Its structure is relevant to the proposed AHDS structure, in that it has a co-ordination and support role among its constituent organisations, and centrally manages certain key services. One of its new projects is also significant --- the Archival Server Project, which seeks to address most of the issues of resource management, preservation and access which will be fundamental to AHDS.

8.2.10.2 Key services:

8.2.10.3 Extent

RLG has more than 120 member institutions, including universities, archives, historical societies and museums.

The RLIN database holds more than 56 million items drawn from over 200 institutions.

8.2.10.4 Source of funding

There are three revenue streams:

8.2.11 University of Virginia Electronic Text Center (UVA)

8.2.11.1 Overview

An ``Electronic Text Center'' was founded at the University of Virginia in 1992. It exists to create and support a new broadly based user community within the humanities at the University, and to establish the use of electronic texts as a mainstream resource for pedagogy and research in the University.

It has a diverse collection of humanities data sets (see below). Most are in text form, with a few in image form. The texts are encoded with TEI-conformant SGML. A key feature of the Center is that the same access software can be used across the collection, even though some of the documents come from CD-ROMs which included proprietary software.

Because of contractual obligations with the vendors who supply the texts and the search software, access to the on-line text service is restricted to University of Virginia students and staff.

8.2.11.2 Key services:

8.2.11.3 Extent

Originally the Center started up with commonly available humanities resources in English and Latin, such as the Oxford English Dictionary, the Patrologia Latina and a collection of texts from the Oxford Text Archive. This corpus has since been added to by data sets generated by the University's staff and students, and it is expected to grow continually in this way. English is the main language, but there are also texts in many other languages including French, German, Latin, Greek and Hebrew.

8.2.11.4 Source of funding

University of Virginia, Alderman Library.

8.3 Example projects

In this section we list a number of projects in the Arts and Humanities which are concerned with the creation or dissemination of resources of potential relevance to the AHDS community. No attempt has been made here to undertake a systematic overview; instead we have tried only to indicate the range and variety of data sets currently or potentially available. The entries are grouped by broad project type.

8.3.1 Network and metadata projects

8.3.1.1 Georgetown Catalogue of Projects in Electronic Text (CPET)

From 1989 to 1992, the CPET project at Georgetown University Center for Text and Technology compiled a catalogue of projects that create and analyse electronic text in the humanities. The result is a powerful database which includes information on electronic text projects throughout the world, though it is not wholly up to date. It includes a variety of information on many collections of literary works, historical documents, and linguistic data which are available from commercial vendors and scholarly sources.

8.3.1.2 Norwegian Documentation Project

This project at the University of Oslo started in 1991, and is due to last six years. Its objective is the conversion of the paper-based archives of the so-called collection departments at universities in Norway, ranging from the Viking ship museum to the lexicographic departments.

The final product of the project is to be a networked national information system for the humanities, called ``The Norwegian universities' databases for language and culture'', which will integrate these diverse resources to allow (inter alia) multi-disciplinary research. It will also allow for improved public access. An article by C. Ore describing the project is to appear in the Norwegian special issue of Computers and the Humanities, in 1995.

8.3.1.3 Writers And Their Copyright Holders (WATCH)

This project is a collaborative venture between the University of Reading library and the Harry Ransom Humanities Research Center at the University of Texas, Austin. It started in April 1994.

The project aims to build a database of copyright holders, initially for literary works in the English language whose papers are housed in archives and manuscript repositories. The database is intended to be freely accessible via the Internet. Early work suggests that 8,000 authors should be included in the first phase, and that this may be achieved by 1995.

8.3.1.4 InfoServices

This is a co-operative project in the Netherlands between the National Research Network (SURFNET) and the Netherlands Royal Library. Its purpose is to manage a national information server, and in so doing to establish standards for good practice. It aims to develop organised access to Internet resources, taking over the Dutch National Entry Point, and experimenting with a subject-based approach to search hierarchies. It has links with the SURFDOC project, which will make technical and other literature available in a consistent way over linked servers. [See note 50]

8.3.1.5 The Nordic WAIS/World Wide Web Project

This project was funded by NordInfo and involved the Danish Technical Library and the Lund University Library. Its primary goals are to develop tools for automated processing of WAIS source files to produce classified indexes available over World Wide Web, and to develop better gateways between WAIS and World Wide Web. [See note 51]

8.3.1.6 The Allison Research Index of Art and Design (ARIAD)

ARIAD is a comprehensive bibliographic database covering all aspects of research in art and design. It is based at De Montfort University, and its early development was funded by a grant from the DTI. Currently the database is available electronically on discs in MS-DOS and Macintosh formats. However there is great interest in making it available over the networks.

8.3.2 Art and Art History projects

This group is represented because of the increasing importance, to art historians and other scholars, of networked information, including high quality images.

8.3.2.1 Remote Access to Museums and Archives (RAMA)

This is an experimental project which demonstrates network links between participating museums in six European countries. It allows material (primarily images and text) stored in these museums to be found and viewed from terminals in the other locations. One of the museums also contributes video clips with sound. At present, the project is at demonstration stage. It is hoped that permanent connections will be established at a later stage. It is funded by the European Union under its Esprit III initiative.

8.3.2.2 Beazley Archive

This is an archival collection of photographs, notes and drawings relating to ancient Greek art located in the Ashmolean Museum, Oxford.

Currently based on text, the 50,000-record database is being enhanced to include images and video clips. Additionally, work is in progress to allow remote access to these enhancements. The text portion is already available to over 25 remote sites around the world via the Internet.

8.3.2.3 Corpus of Romanesque Sculpture

This is a project at the Courtauld Institute of Art, sponsored by the British Academy, to create a complete record of the surviving heritage of sculpture produced in Britain and Ireland between c.1066 and 1200, by means of scanned monochrome photographs and textual descriptions. It is intended to make the images available to visitors; wider distribution may also be considered.

8.3.2.4 Database of Ancient Greek Sculpture

This project, funded by the Leverhulme Trust, is based at the Ashmole Archive, King's College London. Its purpose is to research and process the complete evidence for all known ancient Greek sculptors, including ancient texts, epigraphy, sculptures, coins, etc and to link this mass of data together in a systematic way in order to create a major new research resource. It will include a large number of images, based largely on the photographs of the Ashmole Archive, but including also material obtained from a number of museums and other institutions.

8.3.2.5 Witt Computer Index

This is a major art history project funded by the Getty Art History Information Program and the Courtauld Institute of Art. It is based on the holdings of the Courtauld's Witt Library --- photographs and illustrations of American and 18th century British paintings and drawings. It is creating a comprehensive database of information about these works, using the Iconclass system to classify and describe each work. This makes it possible for information about the works to be accessed under a wide range of headings --- by artist, date, and subject, and in ways not previously possible. Currently the database is entirely textual, and no images of the works are included.

8.3.3 Text and image projects

This group is represented because of the increasing number of projects which are ``text''-based but where the inclusion of images, in particular of manuscripts, significantly enhances the usefulness of the resource as a research tool.

8.3.3.1 Archivo Digital de Manuscritos y Textos Españoles (ADMYTE)

Based at the Biblioteca Nacional of Madrid, this project has the objective of capturing in digital form a large collection of manuscripts and incunabula. The result will include monochrome and colour images, and a modern interpreted version of the documents in full text, tagged with SGML. The project is running in parallel with the development of a dictionary of 15th century Spanish.

8.3.3.2 Canterbury Tales Project

This is a project, with major funding from the British Academy, to digitize in text and image form all 83 manuscripts and four pre-1500 printed editions of Chaucer's Canterbury Tales, together with some analyses. The project is in its early stages.

The project is taking place at the University of Sheffield and Oxford University.

8.3.3.3 Electronic Peirce Consortium

This is a project to scan and transcribe the manuscript works of Charles Sanders Peirce, the 19th century American philosopher. The project may grow to include up to 100,000 pages (depending on funding), though only 500 pages are available initially. The project is based at a number of institutions in the US, including Georgetown University, Brown University and Texas Tech.

8.3.3.4 Wittgenstein Archive

This project is based at the University of Bergen, Norway. It is preparing all Wittgenstein's unpublished manuscripts (over 20,000 pages) in machine-readable and facsimile form, for which purpose it has also developed a sophisticated mark-up scheme, and software to process it.

8.3.3.5 Hartlib Papers

The Hartlib Papers project, sponsored by the British Academy and the Leverhulme Trust, is based at Sheffield University. It is preparing a text and page-image database of the manuscripts of Samuel Hartlib, who lived in the 17th century. The database will include some 20,000 leaves of manuscript, plus printed material in four languages.

Hartlib had the self-appointed task of collecting and disseminating knowledge; his surviving papers cover every aspect of 17th century intellectual life, including education, language and literature, philosophy, science, religion, politics and agriculture. The database will include hypertextual links between the page images and text transcriptions.

8.3.4 Language corpus projects

8.3.4.1 International Computer Archive of Modern English (ICAME)

This is an informal international organization of linguists and scientists working with English machine-readable texts. Its aims are to collect and distribute information about English language materials available for computer-aided linguistic research, to compile an archive of English text corpora in machine-readable form, and to make material available to research institutions. The Norwegian Computing Centre for the Humanities acts as a distribution centre for ICAME materials.

8.3.4.2 British National Corpus (BNC)

This is a collaborative project jointly funded by the DTI and SERC. It has created a 100 million word representative corpus of modern English of all kinds, spoken as well as written, tagged in SGML with word-class annotations. The corpus will be made freely available for research. The partners in the project are Oxford University Press, Longman UK, Chambers-Larousse, Oxford University Computing Services, the University of Lancaster, and the British Library Research and Development Department.

8.3.4.3 Corpus of Contemporary Spanish

This project, based at King's College London, aims to create a corpus of 20 million words drawn from a wide range of sources and subject areas, and covering both Iberian peninsula and Latin-American Spanish. The first phase of the project (5 million words of Iberian Spanish) is nearing completion. It is intended that the corpus will be published on CD-ROM and over the networks.

8.3.5 Historical, musical and specialist interest projects

There is a very large number of historical projects, a growing number of music and music history projects, and a wide variety of projects concerned with highly specialist interests. One project from each area is described briefly below. The AHDS would help to ensure the wide availability of such resources to their academic audiences, both nationally and internationally.

8.3.5.1 Prosopography of the Byzantine Empire

This is a major project of the British Academy, housed at King's College London. Its goal is to record all surviving information about every individual mentioned in Byzantine sources during the period from 641 to 1261, and every individual mentioned in non-Byzantine sources during the same period who is "relevant" (on a generous interpretation) to Byzantine affairs. It is intended to make the material available, with sophisticated search tools, both on CD-ROM and on-line.

8.3.5.2 Thesaurus Musicarum Latinarum (TML)

This is an evolving database that will eventually contain the entire corpus of Latin music theory written during the Middle Ages and the early Renaissance. The project is run by a consortium of universities; the Project Office is centered at Indiana University, Bloomington. Special provision is made in the database for musical notation, both by encoding in a standard manner and by the inclusion of scanned images compressed in GIF format. The database is freely accessible via the Internet.

8.3.5.3 Syriac Computing Institute

This Institute, at Cambridge University, holds a number of texts about, and in, Syriac. These include the Syriac Electronic Data Retrieval Archive (SEDRA) and four New Testaments in Syriac (versions according to the Sinaiticus and Curetonianus manuscripts, and the Peshitta and Harklean versions), all of which are available for research use.

