Authors: Tony Cargnelutti
Librarian; Electronic Information Resources Group
Tel: +61 2 9385 1531 Fax: +61 2 9662 6309
Email: t.cargnelutti@unsw.edu.au
Fred Piper
Technical Manager; Electronic Information Resources Group
Tel: +61 2 9385 2634 Fax: +61 2 9662 6309
Email: f.piper@unsw.edu.au
University of New South Wales Library
Sydney 2052
Australia
ABSTRACT
Theses are underutilised information resources, due to limitations on information about their content and restrictions on access to them.
This paper describes current developments of a pilot project that has the potential to enhance knowledge about, and access to, Australian theses via the web. The Australian Digital Theses Project (ADT) develops a local model based on work originally carried out in the U.S.A. The model is for the creation of a database of digitised theses, accessible via the web using locally modified software for the self-submission of theses directly into this database.
This Australian project, a collaborative effort by seven Universities, further extends the original concept to develop a prototype nationally distributed database of theses.
THESES IN GENERAL
There are approximately 4,000 research degrees awarded each year in Australia. This means postgraduate theses represent a significant proportion of Australia's research activity.
However, lack of easy access renders theses an underutilised information resource. This lack of access, and the consequent lack of use, can be attributed to a number of factors: lack of knowledge that the thesis exists; lack of information about the contents of the thesis; and lack of ready availability.
The question then arises as to how to ‘unlock’ the valuable information contained in theses and make it easily available to researchers everywhere. A possible solution would be to create a searchable database and deliver the theses in digital format via the web. This is the aim of the Australian Digital Theses (ADT) Project.
ADT PROJECT GENESIS
This pilot project is a collaborative effort involving 7 institutions: The University of New South Wales (lead institution), Australian National University, Curtin University of Technology, Griffith University, University of Melbourne, University of Queensland and University of Sydney. The project is being funded by an ARC-RIEFP grant and the initial stage of the work is being carried out in 1998/1999.
The central aim of the ADT Project is to establish a distributed database of digital theses produced by postgraduate research students at the participating institutions, with the theses delivered across the web. Approximately half of the research theses produced in Australia come from the partner institutions in the ADT Project.
The project has 3 primary objectives:
- to establish standards for electronic theses creation, storage and access
- to create an electronic archive of frequently requested theses
- to establish procedures for the submission of electronic theses by students as part of the conditions of the award of the degree.
The model adopted for the ADT Project is the one developed by the Virginia Polytechnic Institute and State University (commonly known as Virginia Tech or VT). This US based initiative is centred around the Networked Digital Library of Theses and Dissertations (NDLTD), with VT as the leader. All Australian ADT partners are members of the NDLTD initiative.
THE VT MODEL
The Virginia Tech model is based on self submission of theses in digital format using a web-based form. The whole process is automated, with the theses ultimately being accessible via the web. This self submission process is seen as the key to increasing efficiency and to cost effectiveness for both the student and the institution concerned. The model parallels the traditional ‘paper trail’. VT has developed this model over the past 15 years, with electronic self submission being made compulsory 2 years ago. See: http://etd.vt.edu/
THE ADT MODEL
Whilst the local ADT model is a modified version of the VT one, there are significant differences that had to be thought through and ultimately factored into the process and software for the purposes of the project at hand.
The VT model, now compulsory, has been developed for that institution specifically and is based exclusively on self submission. There are also cultural and semantic differences between the 2 countries that needed translating.
The significant differences that the ADT pilot had to deal with and find solutions for were:
- as a collaborative project the web-form used for submission, and the software program behind it, had to be flexible, with changeable variables to work across the 7 institutions, eg able to be re-badged to reflect the local corporate look as well as working within the local IT infrastructure and architecture. In fact, it had to be as generic as possible
- the flexibility had to also allow for the range of options described in the original project proposal, eg – ultimate self submission (as per VT), parallel paper/electronic submission (as will be the case for the foreseeable future) plus retrospective conversion and submission of older theses (for ILL purposes etc)
- generation of Metadata. This was ultimately programmed to be generated automatically from the information in the web form
- creation of a distributed database across the 7 institutions. While the database was to be spread across the sites, the actual theses files were to be held on the local institutions’ servers. The ADT project has developed a model where the automatically generated Metadata forms the distributed database. This will then be gathered and searched using an Australian designed metadata robot/search engine with links back to the theses files at the local institution
- factor in the possibility of using e-commerce options for charging a fee or royalties for access to the full text of theses
- develop a range of standards for all the options described above
- develop standards and processes in keeping with international trends and practices, eg UMI
- liaise with the National Library and use the ADT project to beta-test new processes, eg – Metadata, Universal Resource Name (URN) Resolver service
- maintain the transparency and simplicity of the process
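The automatic generation of Metadata from the web form, as described in the list above, might be sketched as follows. This is a minimal illustration only, not the ADT software itself: the form field names and their mapping to Dublin Core elements are assumptions made for the example, since the project's actual form fields are not listed in this paper. The output uses the common convention of embedding Dublin Core elements in HTML meta tags.

```python
from html import escape

# Hypothetical mapping from deposit-form field names to Dublin Core
# elements. The real ADT form fields are not given in this paper.
FORM_TO_DC = {
    "title": "DC.Title",
    "author": "DC.Creator",
    "abstract": "DC.Description",
    "keywords": "DC.Subject",
    "institution": "DC.Publisher",
    "date": "DC.Date",
    "identifier": "DC.Identifier",
}


def form_to_dc_meta(form):
    """Return one HTML <meta> line per form field that has a value."""
    lines = []
    for field, element in FORM_TO_DC.items():
        value = form.get(field, "").strip()
        if value:  # skip fields the depositor left blank
            content = escape(value, quote=True)
            lines.append('<meta name="%s" content="%s">' % (element, content))
    return lines


if __name__ == "__main__":
    example = {
        "title": "An Example Thesis",
        "author": "Citizen, Jane",
        "institution": "University of New South Wales",
        "date": "1999",
    }
    for line in form_to_dc_meta(example):
        print(line)
```

Because the metadata is derived entirely from what the depositor has already typed, no separate cataloguing step is needed before the record can be gathered by a metadata search engine.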
THE ADT PROJECT DEVELOPMENT
The team at UNSW commenced work on the project in mid 1998. Virginia Tech agreed to release a mirror of their site, which included the submission form and software. When all the necessary changes had been made and the modified software rigorously tested, the ADT site and software were released to the project partners in April 1999. The collaborative ADT site at UNSW is at: http://www.library.unsw.edu.au/thesis/thesis.html. This site will also be maintained as the archive of information on the project’s history and development.
The next phase is for each of the partners to install the ADT software locally and begin putting theses through the process and to collectively develop a body of digitised theses. The creation of a body of theses and the testing and development of the distributed database and search engine will take place during the latter part of 1999.
ADT PROJECT STANDARDS
During the software testing and modification phase the UNSW team proposed a set of draft standards which were ratified, after further augmentation and amendment, at a workshop attended by representatives from all the partners involved.
The core standards are:
- Definition of thesis. The ADT project will only process PhD or Masters Theses by research
- Submission be changed to Deposit, as the word better reflects all the options being investigated by the project, ie not just self submission of actual electronic theses, but also electronic copies of existing theses already in the system
- Metadata (Dublin Core Standard) will be automatically generated out of the ADT Deposit form. This Metadata will form the basis of the database of distributed digitised theses across the 7 participating institutions. The specific tool to gather and search this Metadata is likely to be the HotMeta - Metadata Search Engine developed by the Distributed Systems Technology Centre (DSTC). See: http://www.dstc.edu.au/RDU/HotMeta/
- The specific URL / PURL / URN for ADT theses will be constructed as follows: "adt-" will be immediately followed by the institution code as per the Australian Interlibrary Resource Sharing Directory (ie the NUC code), immediately followed by the year the thesis is deposited, followed by a running 4 digit number, eg "1997.0001", to make ../adt-NUN1997.0001 for the first thesis from UNSW for 1997. Examples for the other institutions would be:
../adt-ANU1997.0001 (Australian National University)
../adt-NU1997.0001 (University of Sydney)
../adt-QCU1997.0001 (Griffith University)
../adt-QU1997.0001 (University of Queensland)
../adt-VU1997.0001 (University of Melbourne)
../adt-WCU1997.0001 (Curtin University)
- The standard document format will be Adobe Acrobat PDF with security set to allow read and print only. These security settings are a minimum, as authors may wish to set more security. PDF is rapidly becoming a worldwide standard, offers high levels of document security and appears to be reasonably stable. Because it has become a standard document format, it will have to be able to be migrated to any future standard/s. It is flexible, easy to use and platform independent, and a high quality reader is readily and freely available
- Filename standard. There will be a minimum of 2 PDF files for each thesis. The first file will always be called front.pdf and will contain title/author information; abstract; acknowledgments; table of contents; introduction; preface and any other introductory text that is not part of the main body of the thesis. The other, or subsequent, files will contain the body of the thesis. It is further recommended that these files have meaningful names, eg chapter1.pdf etc.
- Retrospective conversion. Although the software is best suited for self submission, it can be used for retrospective conversion of theses as well. This will involve obtaining permission from the authors concerned and having a third party, eg a staff member, do the submission. However, the attraction and cost benefit will come from the theses having to be converted only once and then being available, either at cost or free, for any individual or institution to access at any time. One of the partners, Melbourne University, is concentrating on this aspect of the project and will develop guidelines and standards for the project
Further details on issues and standards can be obtained from the ADT site at UNSW: http://www.library.unsw.edu.au/thesis/etd.html
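The identifier scheme set out in the standards above can be illustrated with a short sketch. This is an assumption-laden example rather than the project's own code: the function names are hypothetical, and institution codes are assumed to consist of uppercase letters only, as in the examples given in this paper.

```python
import re

# Pattern for identifiers of the form adt-<NUC code><year>.<4 digit number>.
# Assumes the NUC code is uppercase letters only, as in the paper's examples.
ADT_ID = re.compile(r"adt-([A-Z]+)(\d{4})\.(\d{4})")


def make_adt_id(nuc_code, year, sequence):
    """Build an identifier such as adt-NUN1997.0001."""
    if not re.fullmatch(r"[A-Z]+", nuc_code):
        raise ValueError("institution code must be uppercase letters")
    if not 1000 <= year <= 9999:
        raise ValueError("year must have 4 digits")
    if not 1 <= sequence <= 9999:
        raise ValueError("running number must fit in 4 digits")
    return "adt-%s%d.%04d" % (nuc_code, year, sequence)


def parse_adt_id(identifier):
    """Split an identifier into (institution code, year, running number)."""
    match = ADT_ID.fullmatch(identifier)
    if match is None:
        raise ValueError("not an ADT identifier: %r" % identifier)
    code, year, sequence = match.groups()
    return code, int(year), int(sequence)


if __name__ == "__main__":
    print(make_adt_id("NUN", 1997, 1))
    print(parse_adt_id("adt-ANU1997.0001"))
```

Keeping the institution code, year and running number in fixed positions means an identifier can be generated at deposit time without consulting the other partner sites, and can later be split back into its parts by a resolver.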
CURRENT STATUS OF ADT PROJECT
- HotMeta/Metadata software: installed and being evaluated by the UNSW project team. Issues involved include the degree of customisation required to meet local site specific views as well as for the overall distributed database
- E-commerce: possibilities are being examined by the project team and the Communications Unit at UNSW with a view to making recommendations to project partners intending to charge a fee/royalties for full access. This charge would ‘kick in’ after the front.pdf file has been accessed
- Rules and procedures: partners are at varying stages of having the official requirements changed to accept, at least as an option, the submission of electronic theses instead of paper. With these changes, and their ultimate implementation, will come related issues of training and support for both examiners and students
- Submission/deposit software: is currently being installed and/or being used at each partner institution
- Griffith University is in the process of finalising the necessary approval for the archival library copy to be in digital format only. This would involve a parallel submission path: a paper copy for the examiners and, when the degree has been awarded, a library copy submitted in electronic format using the ADT project software. If all goes to plan, Griffith University Library will become the first in the project to archive in electronic format exclusively, and developments there will be keenly watched by all partners concerned
- UNSW has also developed a local model for a link to the full text of the theses to be available via the web catalogue. Presently this involves some encoding by the cataloguers but ultimately it will become fully automated when true self submission becomes a reality. See: http://alpha.nun.unsw.edu.au/cgi-bin/brief_disp?00771745=on
FURTHER COMMENTS
While there are many issues and concerns specifically relating to digitising theses and their subsequent archiving, that are not canvassed in this paper, the ADT project has demonstrated, and will further demonstrate as it develops, that a successful local model has been established. This model is also flexible, scalable and could become a prototype for a truly national database of electronic theses. As the project further develops, with the possibility of including other interested institutions, more work and research will need to be done on issues such as document integrity, copyright and plagiarism, and the long term archiving of electronic theses.
The project team is aware that not all theses can be easily converted into digital format, or into PDF specifically. However, the team is confident that most could be, and for those that cannot be, there is enough scope within the model by way of the metadata and front.pdf files to at least find their existence, and some information about them. To this end, the project team has always focussed on what has been realistically achievable and tried to work ways around obvious, and probably unsolvable, problems. To follow developments on the project keep an eye on the ADT site - see the address beneath the title.
INFORMATION SOURCES
- Edward A. Fox, John L. Eaton, Gail McMillan, Neill A. Kipp, Laura Weiss, Emilio Arce, and Scott Guyer. "National Digital Library of Theses and Dissertations: A Scalable and Sustainable Approach to Unlock University Resources." D-Lib Magazine, September 1996.
- Tony Cargnelutti, Fred Piper, Karen Kealy. "The Australian Digital Theses (ADT) Pilot Project: the trials, tribulations and (some) successes." EDUCAUSE in Australasia: People and Technology Doing IT Right. Sydney Hilton Hotel, April 1999, Sydney, Australia.
URL: http://www.library.unsw.edu.au/~eirg/cause99.html
- For a collection of papers on digital theses and the NDLTD initiative in particular go to:
- For links to relevant digital theses sites and associated information go to:
URL: http://www.library.unsw.edu.au/thesis/thesite.html