SCOTT AND THE RACE FOR THE SOUTH POLE: THE TRIUMPH OF METADATA
Kerrie Talmacs
Metadata Coordinator
University of New South Wales Library
Sequel to "Scott and the Race for the South Pole: the Heroic Failure of Cataloguing the Web" by Reg Mu Sung
Abstract
Just as many factors determine the success of expeditions, many approaches will be required to overcome the chaos of the World Wide Web. While some people contend that librarians have left their run too late, there are many experiments and projects underway which aim to assist Web searching. One of the new tools of the trade is metadata. Traditional metadata such as MARC cataloguing is not appropriate for all Web resources, and newer forms such as Dublin Core and AGLS are being promoted and used in Web pages and subject gateways. Several University of New South Wales projects are examined to illustrate how metadata might play a role in assisting access to the Web.
Amundsen’s Success
Amundsen, the Norwegian hero, conquered the South Pole on December 14th 1911. Will librarians be like Amundsen and succeed in their bid to tame the Web? How did Amundsen triumph over Scott in the race for the South Pole? Several factors contributed to his success (Ryne 1998):
- a lifetime of polar exploration;
- the realisation that a common failing of polar explorers was their inability to captain a vessel;
- the selection of an Antarctic base at the Bay of Whales, 60 miles closer than Scott’s at Cape Evans;
- a reliance on dog sleds to pull the team who were all excellent skiers;
- a focussed determination after believing that he was beaten to the North Pole.
Although recent evidence indicates that Amundsen was actually first to the North Pole (Norway Now 1998), in 1911 he saw capturing the South Pole as a consolation prize, and took all the necessary steps to beat his rival, Scott.
Many Solutions to Web Access
Many factors contribute to a successful expedition. Similarly, many approaches to providing access to information combine to make sense of the Web. No one solution will fill all the needs of resource creators or searchers. While nobody would argue that the combination of library or union catalogues, and abstracting and indexing services was the perfect solution to providing access to information, the options were at least clear. Now librarians and information professionals are experimenting with projects, gateways and electronic interfaces to determine the best ways for searchers to find the information they need. Some people suggest that librarians have missed the boat. For example, Martin Dillon, Executive Director of the OCLC Institute, said in a May 1998 graduation speech at the University of North Carolina School of Information and Library Science: "It is clear to virtually all who use the Internet that chaos reigns there. What impact is the library profession having on this chaos? I would say it is almost unnoticeable" (1998: 47). While a lot remains to be done in terms of organising or cataloguing the Web, librarians are doing some truly innovative work, often in partnership with other professionals and after learning some new skills.
Recent literature often refers to librarians as navigators. Catlin (1996) says "It has been claimed that in the future, instead of fortresses of knowledge, there will be an ocean of information and libraries, therefore, should be considered as harbours – places where skilled navigators and the latest equipment can be found". A review of the "2020 Vision" report to the Libraries Working Group of the Cultural Ministers’ Council states that it "describes their future role as navigators through the confusions of the new knowledge economy and points out that if they are either unable or unwilling to steer customers through the bewildering profusion of web sites … they can be sure there will be plenty of competitors who will" (Norman 1997: 60).
This paper examines one of the new tools from the librarian’s toolbox – metadata, and suggests that it will be significant in pointing the way to Web resources. It is, nevertheless, too early to say with any certainty that metadata will fully live up to first expectations.
Metadata
Wendler defines metadata in the library context as "the information needed to identify, locate, manage and access materials the library wishes to make available to its users" (1999: 43). This definition covers both traditional MARC cataloguing and new metadata schemas devised for the Web such as Dublin Core. While MARC cataloguing is well suited to certain material such as electronic journals, it is the new style of metadata that is being used in many new projects. Hopkins states:
"It should be emphasized that the Dublin Core is not cataloging; it adheres to no descriptive rules, guidelines, or habits. Taking the Dublin Core Creator or Contributor elements as examples, the Dublin Core has no rules to specify the form in which either personal or corporate names should be given …The result is that two objects to which Dublin Core data elements have been assigned could have Creator fields that present the same individual’s name in different forms; cataloging rules are devised to prevent this type of inconsistency." (1999: 64)
Weibel, one of the leaders of the Dublin Core initiative, acknowledges that metadata is "more informative than an index entry but is less complete than a formal cataloging record" (1995: 1). "The Dublin Core metadata rest on six principles to achieve ease of creation and broad applicability. The Dublin Core data elements are descriptive only of intrinsic properties, eliminating the use of external references (to cataloging rules or authority files), are extendable to include additional specialized information, are syntax independent, are optional as well as repeatable, and can be modified through qualifiers to convey meaning beyond the commonly understood definition" (Weibel in Younger 1997: 467).
Although metadata is less rigorous than MARC cataloguing, it is increasingly recognised as having a role to play in describing Web resources.
Benefits of metadata include:
- Simplicity of creation and maintenance;
- Commonly understood semantics;
- International scope, with a broad consensus;
- Consistency across forms and types of resources;
- Conformity with existing frameworks, e.g. libraries, Web search engines.
Through its simplicity, and with the help of appropriate tools, metadata can be created, in theory, by anyone who has received suitable training. This is one aspect of its appeal, but as most cataloguers realise there are hidden traps, and there is a lot more to resource description and access provision than one might think at first glance. Cataloguers can play a significant role in training others, interpreting standards and formulating these standards.
Many schemas have been developed, the best known in Australia being the Dublin Core. It forms the basis of other metadata used in Australia, e.g. AGLS (Australian Government Locator Service), EdNA (Education Network Australia), and the government recordkeeping metadata.
UNSW Applications
To illustrate how metadata can assist Web resource discovery and management I will describe some projects I’ve been working on in my role as Metadata Coordinator at the University of New South Wales.
- Library Metadata
The Library was selected as a test-bed for metadata creation because of its in-house expertise and knowledge of traditional metadata i.e. MARC cataloguing records. Metadata has been made a priority and is mandatory for all new Web pages. Retrospective work will be completed by the beginning of first session 2000.
Library issues included:
- choice of schema and standards;
- choice of thesaurus, if any;
- who will create metadata.
The Dublin Core was a natural choice, and decisions were made about the elements and core requirements. Some qualifiers are used and nine elements are mandatory. The UNSW Library template (1998) automatically generates the Library as the publisher, and the UNSW copyright statement is also automatic. "text/html" for Format and "en" for Language are defaults, thereby minimising the data to be input. Natural language terms are used in the subject area rather than LCSH, which we use in the catalogue. This was a difficult and not entirely popular choice, but LCSH is not easy to access or use, and its terminology is often inappropriate and out of date. The Library Web Coordinator adapted the Nordic Project template (1998) to suit our purposes, and the Metadata Coordinator added explanations about elements according to current Dublin Core standards. Management preferred distributed metadata creation, with Web page developers responsible for adding metadata to their pages at the time of creation. Training involved two hours of explanation and hands-on work. The librarians in the various departments cut and paste the metadata created by the template into the Head section of Web pages.
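The template behaviour described above can be sketched roughly as follows. This is a minimal illustration in Python, not the UNSW or Nordic Project template itself: the function name `build_meta_tags`, the exact default strings and the sample page values are assumptions made for the example. Only the idea of the four automatic values (Library as publisher, UNSW copyright, "text/html", "en") comes from the description above.

```python
from html import escape

# Values the template fills in automatically, as described in the text.
# The exact wording of each string is an illustrative assumption.
DEFAULTS = {
    "DC.Publisher": "University of New South Wales Library",
    "DC.Rights": "Copyright University of New South Wales",
    "DC.Format": "text/html",
    "DC.Language": "en",
}


def build_meta_tags(fields):
    """Return HTML <meta> lines for the given Dublin Core fields.

    `fields` maps element names (e.g. "DC.Title") to a string or a list of
    strings. A list produces repeated elements, which Dublin Core permits.
    Values supplied by the caller override the defaults.
    """
    merged = {**DEFAULTS, **fields}
    lines = []
    for name, value in merged.items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            lines.append(
                '<meta name="{}" content="{}">'.format(escape(name), escape(v))
            )
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical page details entered by a Web page developer.
    print(build_meta_tags({
        "DC.Title": "Library Opening Hours",
        "DC.Creator": "Information Desk",
        "DC.Subject": ["opening hours", "library services"],
    }))
```

The printed lines are what a page developer would paste into the Head section of the page. Escaping the values keeps characters such as quotation marks and ampersands from breaking the HTML.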
The standard has been kept simple, and the approach conservative due to the unfinished nature of qualified Dublin Core (to be confirmed in October 1999 at the DC Frankfurt meeting). The DC educational working group will be completing their deliberations on educational qualifiers in early 2000.
- Campus Metadata
The University’s search engine, Infoseek Ultra, supports Dublin Core metadata, enabling searchers to run more precise fielded searches and to benefit from the additional information in the metadata. Searches performed using other search engines will benefit from the metadata only if those search engines also support it.
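To make the idea of a fielded search over embedded metadata concrete, the sketch below reads Dublin Core `<meta>` tags from a page and tests one nominated field. It is an illustration only and says nothing about how Infoseek Ultra works internally: the class and function names are hypothetical, and the `DC.` naming of the tags is an assumption based on the common convention for embedding Dublin Core in HTML.

```python
from html.parser import HTMLParser


class DublinCoreExtractor(HTMLParser):
    """Collect <meta name="DC.*" content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.elements = {}  # e.g. {"dc.title": ["Opening Hours"]}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = attr.get("content")
        if name.startswith("dc.") and content:
            self.elements.setdefault(name, []).append(content)


def extract_dc(html_text):
    """Return a dict of lower-cased DC element names to lists of values."""
    parser = DublinCoreExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.elements


def fielded_match(elements, field, term):
    """True if `term` occurs in any value of the nominated field."""
    term = term.lower()
    return any(term in v.lower() for v in elements.get(field.lower(), []))
```

A search restricted to `DC.Title` ignores a term that appears only in the body text or in another element, which is the source of the extra precision.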
The Division of Information Services encourages the use of metadata across campus. Web developers have been surveyed, and although they show strong support for the use of metadata, they cannot in all cases commit resources to the task. Automatic metadata generation, e.g. by using the MetaWeb generator, is an option, and a campus thesaurus of terms, if developed, could assist Web developers. Whatever the eventual outcome, the result will probably be a hybrid model. As mentioned above, the University’s search engine supports metadata, and widespread use by Web developers should make searching more precise.
- National Projects
The UNSW Library is working collaboratively on several projects, which depend on metadata for their database content:
- MetaChem – a catalogue of chemistry resources (1998);
- AVEL – Australasian Virtual Engineering Library: engineering & IT gateway (1999);
- Australian Digital Theses Project (1998).
The databases of the above projects comprise metadata either created by humans in the case of the two gateways, or automatically generated (for the ADT Project). The latter process is determined by specifications drawn up by the Metadata Coordinator.
All the projects use the HotMeta software developed by the Distributed Systems Technology Centre. This software fully supports the metadata used – mainly Dublin Core, with some additions from EdNA and AGLS schemas. AVEL also uses some gateway specific elements – AVEL.Publisher, AVEL.Type and AVEL.Comment and A-Core (for metadata creator details). The DSTC metadata editors Reg and Reggie are used, allowing metadata creators to concentrate on the content without worrying about the syntax. They are also assisted by drop down menus and linked element explanations.
The metadata in all cases provides the index for searching or browsing, with users having the option to search on "All elements" or nominated elements, with or without stemming. The metadata for the projects resides in a database, unlike the Library’s metadata, which is embedded in the Head section of Web pages. The embedded metadata does not appear on the Web page and is only visible when one clicks on "View" then "Source", which displays all the "behind the scenes" HTML data. The difference with the AVEL and MetaChem projects is that metadata creators are not the Web page creators and, unless they gain permission, cannot access the Web servers for those pages. MetaChem and AVEL store the "pointer" data in separate databases. The purpose of the gateways is to aid searchers in their quest for quality resources in their subject area. The sites are thereby promoted through the inclusion of metadata records in the subject gateways.
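The search options mentioned above ("All elements" or nominated elements, with or without stemming) can be illustrated with a small sketch over metadata records held as Python dictionaries. It does not describe how HotMeta is implemented: the function names, the sample records and the example.org URLs are hypothetical, and `crude_stem` merely strips a plural "s" as a stand-in for a real stemming algorithm.

```python
def crude_stem(word):
    """Strip a trailing plural "s"; a stand-in, not a real stemmer."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def tokens(text, stem):
    """Split text into lower-cased words, optionally stemmed."""
    words = [w.strip(".,;:()\"'").lower() for w in text.split()]
    return {crude_stem(w) if stem else w for w in words if w}


def search(records, term, elements=None, stem=False):
    """Return the records whose chosen elements contain `term`.

    elements=None searches all elements of each record; otherwise only
    the nominated element names are consulted.
    """
    wanted = crude_stem(term.lower()) if stem else term.lower()
    hits = []
    for record in records:
        names = record.keys() if elements is None else elements
        for name in names:
            if wanted in tokens(record.get(name, ""), stem):
                hits.append(record)
                break
    return hits


# Hypothetical gateway records pointing at external resources.
SAMPLE_RECORDS = [
    {"Title": "Polymer Chemistry Resources", "Subject": "polymers",
     "Identifier": "http://example.org/a"},
    {"Title": "Bridge Engineering", "Subject": "bridges; structural engineering",
     "Identifier": "http://example.org/b"},
]
```

With these records, a search for "bridge" nominated to the Subject element finds nothing without stemming, because the record says "bridges", and finds the second record once stemming is switched on.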
Based at the UNSW Library, the Australian Digital Theses Project aims to establish a distributed national database of digital theses from the consortium of eight participating universities. It is based on the US initiative of the Networked Digital Library of Theses and Dissertations, led by the Virginia Polytechnic Institute and State University. During the trial, electronic versions volunteered by students are submitted in parallel with paper versions. In the future the trend will be toward library archival copies being in digital format only. It appears that Griffith University may be the first institution in the project to archive in electronic format exclusively. As with the subject gateways, metadata forms the basis of the database and index for searching. In this case, however, the metadata is generated automatically from student submission details. Nine elements are used – Creator, Title, Subject (keywords), Date (date approved), Rights (copyright statement), Description (Abstract), Language, Contributor (School/faculty), and Identifier (URL of thesis). The Metadata Coordinator and project leader drew up metadata specifications with assistance from the National Library’s Metadata Coordinator. Although the metadata is very simple and is limited in application, it should be sufficient to aid searchers looking for theses on specific topics or by particular authors.
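The automatic generation step can be pictured as a simple mapping from submission details to the nine elements listed above. The sketch below is an assumption-laden illustration, not the project's specification: the submission field names (`author`, `date_approved`, `school` and so on), the default language of "en" and the sample values are all hypothetical.

```python
def thesis_metadata(submission):
    """Map a thesis submission (a dict) to the nine elements listed above.

    The keys read from `submission` are hypothetical form field names.
    A missing required field raises KeyError rather than producing an
    incomplete record.
    """
    return {
        "Creator": submission["author"],
        "Title": submission["title"],
        "Subject": list(submission["keywords"]),
        "Date": submission["date_approved"],
        "Rights": submission["copyright_statement"],
        "Description": submission["abstract"],
        "Language": submission.get("language", "en"),  # assumed default
        "Contributor": submission["school"],
        "Identifier": submission["url"],
    }


# Hypothetical submission details, for illustration only.
SAMPLE_SUBMISSION = {
    "author": "Example, A.",
    "title": "An Example Thesis",
    "keywords": ["metadata", "digital theses"],
    "date_approved": "1999-09-27",
    "copyright_statement": "Copyright the author",
    "abstract": "A short abstract.",
    "school": "School of Example Studies",
    "url": "http://example.org/thesis/1",
}
```

Because every element is filled from details the student already supplies, no cataloguer needs to touch the record, which is the point of generating it automatically.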
- Recordkeeping Metadata
UNSW has recently purchased a new electronic recordkeeping system. Integral to its operation is metadata which will, inter alia, identify, authenticate, document use history, assist retrieval, and restrict unauthorised use. The metadata will in fact play a key role in the process of managing and preserving records in the electronic environment. Much progress has been made in the area of recordkeeping metadata standards, with the National Archives of Australia publishing the "Recordkeeping Metadata Standard for Commonwealth Agencies" (1999). The latter takes into account the Australian Government Locator Service (AGLS) and SPIRT (Strategic Partnership with Industry – Research and Training) metadata standards, and it conforms to Australian standard AS4390. The SPIRT project based at Monash University (1999) aims to develop a framework for standardising recordkeeping and archival metadata, and supports interoperability with the generic standards. This is one more example where metadata can assist in Web access in an environment where more records are being stored electronically.
Conclusion
The above projects illustrate how one organisation is using, or plans to use, metadata to organise and access information. While full MARC cataloguing would be inappropriate and uneconomic in these applications, metadata has filled the gap between the two extremes of full cataloguing and search engine access. It has a flexibility and simplicity which has made it attractive. This simplicity, however, brings with it a myriad of problems easily seen by those steeped in the traditions of cataloguing. Many issues remain unresolved, and metadata standards have yet to evolve to a satisfactory level, e.g. Dublin Core qualifiers have not been determined. Nevertheless I think it is safe to say that metadata does have a future.
In 1928 Umberto Nobile, the engineer who had built Amundsen’s airship for a 1926 North Pole flight, crashed in the Arctic. In spite of some enmity between the two men, Amundsen showed his magnanimity and went to the rescue. While Nobile was rescued, Amundsen and his crew were lost, and the plane was never found.
Bibliography
Australian Digital Theses Project. 1998. Australian Digital Theses Project. Available: http://www.library.unsw.edu.au/thesis/thesis.html [1999, September 27]
Catlin, Ivan. 1996. Technolust and Other Sins: the Public Library in the Wired Society. Aplis 9 (3/4) September/December 1996
Dillon, Martin. 1998. In Oder, Norman. Cataloging the Net: Can we do it? Library Journal vol. 23, no.16, Oct 1 1998
MetaChem. 1998. MetaChem – A Catalogue of Chemistry Resources. Available: http://metachem.ch.adfa.edu.au [1999, September 27]
National Archives of Australia. 1999. Recordkeeping Metadata Standard for Commonwealth Agencies. Available: http://www.naa.gov.au/govserv/techpub/rkms/intro.htm [1999, September 27]
Nordic Project. 1998. Dublin Core Metadata Template. Available: http://www.lub.lu.se/cgi-bin/nmdc.pl [1999, September 27]
Norman, Rosemary. 1997. [Review] of 2020 Vision: Towards the Libraries of the Future: a Report Prepared for the Libraries Working Group of the Cultural Ministers’ Council, 1996. Aplis, 10 (1), March 1997.
Strategic Partnership with Industry – Research & Training (SPIRT). 1999. Recordkeeping Metadata. Available: http://www.sims.monash.edu.au/rcrg/research/spirt/index.html [1999, September 27]
University of New South Wales Library. 1998. Dublin Core Metadata Template – UNSW Library. Available: http://www.library.unsw.edu.au/~eirg/metatemp.html [1999, September 27]
Wendler, Robin. 1999. Branching Out: Cataloging Skills and Functions in the Digital Age. Journal of Internet Cataloging. Vol.2 (1)
Younger, Jennifer A. 1997. Resources Description in the Digital Age. Library Trends, vol.45 (3), Winter 1997