Databases Used in Medicine, 1983

DATABASES USED IN MEDICINE

Martin M. Cummings, M.D.
Director, National Library of Medicine

Mr. Chairman, distinguished colleagues. It is a privilege for me to be here today and to have this opportunity to share some thoughts with you regarding the computerized data bases in use today to support the health professions. I am especially pleased to be present at this meeting of the International Pharmaceutical Federation, for much of what we do at the National Library of Medicine has application to the pharmacy profession.

Pharmacy has evolved into a complex professional discipline since its beginning in the Babylonian-Assyrian and Egyptian cultures 4,000 years ago. Drugs derived from plants, animals and minerals became so numerous that they required some degree of classification and documented prescription, centuries before the beginning of formalized medical science. From the several hundred early formulations used by Egyptian practitioners about 1,000 B.C., we have come to a world in which several million chemical compounds have been identified, of which thousands are in common daily medical use.

It is awesome to contemplate the volume of literature which has been generated since those early beginnings. We estimate that the literature of pharmacology/toxicology is growing at a rate of 140,000 articles each year, scattered among some 12,000 journals. Even a cursory inspection of our catalog reveals at least 20,000 monographs on the topic. The seeker of drug and chemical information is faced with a formidable task.

Primary sources of information are found through the bibliographic apparatus developed by the early European pharmacopoeias and treatises, and more recently by institutions such as the National Library of Medicine. Today, we must of necessity turn to computer-based systems to provide both references and data about drugs and chemicals. The National Library of Medicine, with its unrivalled collections of biomedical literature, has pioneered the effort to provide such systems to the medical community.

The Library traces its history back to 1836, when a collection of medical books was established in the Army Surgeon General's Office. It maintained its connection with the military until 1956, when the Congress transferred the Library to the U.S. Department of Health, Education, and Welfare (now, Department of Health and Human Services), and formally renamed it the National Library of Medicine. The Library is located just outside of Washington, D.C., on the grounds of the National Institutes of Health in Bethesda, Maryland. NIH is one of several agencies that make up the Public Health Service, a part of the United States Department of Health and Human Services.

Within the confines of the National Library of Medicine is the world's largest collection of biomedical literature. From 11th century manuscripts to contemporary audiovisual materials, the Library attempts to collect all of the world's substantive medically-related literature, regardless of form. The holdings today number over three million books, journals, technical reports, manuscripts, theses, pamphlets, microfilms, audiovisual materials, prints and photographs. More than 70 languages are represented in this vast and unique collection.

The single most important influence on the early development of the Library was John Shaw Billings, a brilliant Army surgeon, who directed the Library from 1865 to 1895. Under his guidance, the collection became a truly national resource--seeking out and acquiring medical literature published throughout the world and providing services to physicians everywhere.

Billings instituted a vigorous program that resulted in a stream of literature flowing into the Library that had to be brought under bibliographic control. To meet this challenge, he developed an index to the world's biomedical journal literature known as the Index Medicus, which is still a thriving publication. Another of his great achievements, begun one year later, was the Index-Catalogue of the Library of the Surgeon General's Office, which covered all of the world's published literature in the Library's collection. Sonnedecker, in his excellent revision of Kremers and Urdang's History of Pharmacy (1), praised the Index-Catalogue as an unexcelled, massive, and accurate work with entries listed by author, title, and subject.

These two publications continue to this day--the Index Medicus, now a monthly listing of articles from over 3,000 medical journals, and the catalog, now known as the National Library of Medicine Current Catalog. Since 1964, high-speed computers have been producing in a matter of hours the typeset copy for each month's 1000-page Index Medicus--a chore that previously took weeks by conventional methods.

In recent years, the range of NLM's activities has been broadened considerably. In 1965 the Congress passed the Medical Library Assistance Act, giving the NLM a grant authority to support research, training, library resources, publications, and a network of regional medical libraries.

In 1967 the Toxicology Information Program was established within the Library to provide a national focal point of access to information on toxicology. Under this program, the Library has set up a center to provide reference functions and has created several specialized online bibliographic and data retrieval services, with emphasis upon hazardous chemicals and drugs.

Also in 1967, the National Medical Audiovisual Center became a part of NLM. This Center had as its principal goal to improve the quality and use of biomedical audiovisuals in health professions schools and the biomedical community generally. The Center also developed an online data base of audiovisual materials employed in health sciences education. Earlier this year, the Audiovisual Center was merged with another NLM component, the Lister Hill National Center for Biomedical Communications.

The Lister Hill Center, established in 1968, is the research and development component of the Library. Its mission is to explore the uses of advanced computer and communications technology to improve health education, biomedical research, and health care delivery. The Lister Hill Center was instrumental in developing the Library's online retrieval services and has conducted communications experiments using satellites, microwave and cable television, computer-assisted instruction, videodiscs, and other new technologies.

Despite all of the new responsibilities that have been added to the Library, I can assure you that the NLM is not neglecting its basic mission of collecting, organizing, indexing, and making available the scientific and scholarly literature of the biomedical sciences. The Library serves primarily institutional clients throughout the world. It is, in effect, a "library's library."

NLM fulfills this function through two network arrangements. The first is for the provision of photocopied articles and original materials on interlibrary loan. In the United States this is accomplished through the Regional Medical Library Network, supported and coordinated by NLM. More than two million loans are provided annually through this network.

The second network is for the provision of computerized reference retrieval services, called MEDLARS. In the United States this network consists of more than two thousand institutions that have online access to the NLM's databases. These institutions represent hospitals, medical schools, universities, research foundations, government agencies, and commercial health-related firms. I will describe a little later the range of online services made available to them.

Both of these networks have an important international aspect. Thousands of photocopied articles are sent out by NLM each year in response to requests from libraries in other countries. In addition, the Library has literature exchange agreements with some 400 partners in 72 countries.

As to the computerized retrieval services, we have shared our databases with thirteen international centers located in Australia, Brazil, Canada, Colombia, England, France, West Germany, Italy, Japan, Mexico, South Africa, Sweden, and here in Switzerland. Through these major centers, MEDLARS services have been extended to some 140 nations around the globe. Thus, the Library serves as an international resource for practitioners and scientists of the health disciplines.

I would now like to discuss MEDLARS, and some of the 20 specialized databases it supports, because I believe it is of particular interest to you as pharmacists. The main objective of MEDLARS is to provide references to the world's biomedical literature; presently, MEDLARS contains some 6 million references to the literature dating from 1964.

Spurred by increasing demands for bibliographic services and a growing backlog of reference requests, NLM undertook in the late 1960s to develop a responsive and widely accessible computer system. The result, after several years of experimentation with a prototype online system, was the introduction of MEDLINE (MEDLARS Online) in October 1971. Using MEDLINE, it is possible for health professionals throughout the United States and in other countries to have immediate access to a base of about 800,000 citations to the most recent articles. These articles are derived from more than 3000 different biomedical journals, including not only the 2700 indexed for Index Medicus but additional special journals in fields such as nursing and dentistry. Access is through computer terminals that are linked to the National Library of Medicine's computers by telecommunication lines.

That online bibliographic searching was welcomed enthusiastically by the biomedical community is evidenced by the fact that the network of institutional users has grown to more than 2000, and that more than 2.2 million computer searches are being performed annually. These searches are being used to satisfy a diversity of user needs. A recent study has shown that 36% of all searches were for patient care, 26% for research, 16% for education, and the remainder for management and other purposes. By far the largest number of search requests was made by physicians--43%. An important finding was that less than 1% of the searches were not health related.

The increasing use of home computers and the sophistication of users have led to a number of individual health professionals joining NLM's network. Access to MEDLINE is also available through such private vendors as Bibliographic Retrieval Services, Inc. (BRS) and Lockheed/DIALOG, who lease NLM's MEDLARS tapes.

MEDLINE is an interactive system; that is, the user identifies the journal article references he needs by carrying on a dialog with the computer--typing in successive queries on the terminal keyboard. MEDLINE may be accessed by some 14,000 medical subject headings in the controlled MeSH vocabulary, from "Abattoirs" to "Zymosan." It may also be accessed by "free text" terms that appear in the title or abstract of an article, or by any combination of controlled or free text vocabulary. Users may also search by an author's name, a publication date, language, a specific journal title or a combination of these elements.
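The way a search combines controlled headings, free-text terms, and fields such as author or language can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of the Library's actual software: the record layout, the function names (`has_heading`, `has_text`, `all_of`, `any_of`, `search`), and the three sample citations are all invented for the example. Only the headings "Abattoirs" and "Zymosan" come from the text above.

```python
from dataclasses import dataclass, field


@dataclass
class Citation:
    """Toy bibliographic record; the fields are illustrative, not NLM's layout."""
    author: str
    title: str
    journal: str
    year: int
    language: str
    headings: set = field(default_factory=set)  # controlled-vocabulary terms
    abstract: str = ""


def has_heading(heading):
    """Match citations indexed under a controlled-vocabulary heading."""
    wanted = heading.lower()
    return lambda c: wanted in {h.lower() for h in c.headings}


def has_text(word):
    """Match citations whose title or abstract contains a free-text term."""
    wanted = word.lower()
    return lambda c: wanted in (c.title + " " + c.abstract).lower()


def all_of(*tests):
    """Logical AND of several tests."""
    return lambda c: all(t(c) for t in tests)


def any_of(*tests):
    """Logical OR of several tests."""
    return lambda c: any(t(c) for t in tests)


def search(citations, test):
    """Return the citations that satisfy the combined test, in file order."""
    return [c for c in citations if test(c)]


# Invented sample records, for illustration only.
CITATIONS = [
    Citation("Author A", "Zymosan and complement activation", "Journal X",
             1982, "eng", {"Zymosan"}),
    Citation("Author B", "Hygiene inspection in abattoirs", "Journal Y",
             1981, "fre", {"Abattoirs"},
             abstract="A survey of meat inspection practice."),
    Citation("Author A", "An inflammation model in the rat", "Journal X",
             1983, "eng", set(),
             abstract="Inflammation was induced with zymosan."),
]

# (heading "Zymosan" OR free text "zymosan") AND English AND a given author.
query = all_of(
    any_of(has_heading("Zymosan"), has_text("zymosan")),
    lambda c: c.language == "eng",
    lambda c: c.author == "Author A",
)

hits = search(CITATIONS, query)
for c in hits:
    print(f"{c.author}. {c.title}. {c.journal}, {c.year}")
```

The third sample record carries no controlled heading at all and is found only through the free-text term in its abstract, which is the practical reason for allowing the two vocabularies to be combined in one query.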

After pertinent references are located, the computer prints at the terminal the author, title, and journal source for each citation. English abstracts are available in about half of the citations. If a large number of references are retrieved, they may be printed overnight at the National Library of Medicine and mailed to the requester the next day. The entire online search usually takes less than 10 minutes.

The MEDLINE search service has received enthusiastic acceptance by health practitioners, researchers, and educators throughout the world. This acceptance is reflected not only in the growing numbers of terminals and their geographic distribution, but also in additional information services and products which we are making available.

Several data bases are of special importance to your profession. One of these is TOXLINE (Toxicology Information On-Line). TOXLINE contains over 1.4 million bibliographic records and abstracts dealing primarily with the toxicology and pharmacology of drugs, pesticides, environmental pollutants, and hazardous waste and industrial chemicals. While one of the primary sources for TOXLINE information is MEDLARS, there are eleven secondary sources from which the TOXLINE file is built; among these is International Pharmaceutical Abstracts, published by the American Society of Hospital Pharmacists. TOXLINE is updated monthly and is growing at the rate of about 140,000 records per year. Some 100,000 searches are done on this data base annually.

As with MEDLINE, TOXLINE contains references from the most recent years, while older information is carried in backfiles. TOXLINE is of special importance to drug and poison information centers throughout the United States.

Another file, CHEMLINE (Chemical Dictionary On-Line), is the Library's interactive chemical dictionary file, whereby records for more than 550,000 chemical substances can be searched and relevant information retrieved online. CHEMLINE contains such information as Chemical Abstracts Service Registry Numbers, molecular formulae, and generic and trivial names.

The Toxicology Data Bank, or TDB, is an online file composed of over 4,000 comprehensive chemical records. Compounds selected for the TDB include hazardous waste, high production volume or high exposure chemicals and drugs with actual or potential toxicities. Records in the TDB contain up to sixty different data elements which are grouped into eight categories; these include pharmacological and toxicological data, environmental and occupational information, manufacturing and use data, and chemical and physical properties.

In addition to MEDLINE and the toxicology-related data bases, there are several other specialized files I will mention briefly. In cooperation with the National Cancer Institute, NLM has made available a number of cancer-related databases. The largest of these is CANCERLIT (Cancer Literature), which contains more than 300,000 references to articles, books, and reports dealing with various aspects of cancer. HEALTH PLANNING & ADMINISTRATION is another data base that contains references on health planning, organization, financing, management, and related subjects. There are also NLM files on population research, POPLINE; history of medicine, HISTLINE; and ethical questions in health care and research, BIOETHICSLINE.

The three newest data bases on the NLM system are all collaborative undertakings. The first two were developed by the National Cancer Institute: Protocol Data Query, known as PDQ, contains protocol descriptions of some 700 active NCI-sponsored research projects, and CANCEREXPRESS is a selective file of current references on all aspects of cancer, taken from several hundred high quality journals. The third new data base, which became available last month, is Directory of Information Resources Online, or DIRLINE. The DIRLINE data base was developed by the Library of Congress and contains descriptions of some 13,000 organizations which provide specialized information.

In addition to the data bases offered by the National Library of Medicine, there are several other online files useful to the practice of medicine and related health professions. One of these is Excerpta Medica. Excerpta Medica's online system consists of a bibliographic file of abstracts from 3,500 biomedical journals. The online file corresponds to the more than 40 specialty abstract journals and two literature indexes which make up the printed Excerpta Medica, plus an additional 100,000 records annually that do not appear in the printed journals.

One of the newest online services is BIOMED, offered by the Institute for Scientific Information, a commercial firm in Philadelphia (2). BIOMED is a data base that is directly searchable by the user. It consists of some 700,000 recent references from 1400 journals. A unique feature of BIOMED is that the user can search by what the Institute calls "research fronts." A research front is a cluster of journal articles related by overlap in the references the articles cite.

An important multi-purpose data base is the American Medical Association's "AMA/GTE Telenet Medical Information Network," or AMA/NET. This service was introduced last year with the goal of making information accessible to the office-based practitioner. The service will provide clinical, administrative and medical practice information, as well as a bibliographic file on current medical literature. Among the files being made available is a drug information file that contains data on the clinical use of drugs and will eventually include drug alerts, new product bulletins, patient medication instructions, dosage algorithms and adverse drug reaction reporting. Another file of this new service will contain disease and diagnostic information on 3,500 identifiable diseases, plus laboratory data. A third file will include the literature reference and document order service. Still other files will include medical procedure coding and nomenclature, socio-economic bibliographic information, a continuing medical education bulletin board and an electronic mail service.

These and other data base sources are helping to ensure that all health professionals have rapid and efficient access to the world's biomedical literature. But what of the future?

Libraries, by tradition and etymology, have been concerned with books and journals. Increasingly, however, we see our role not as collectors, storers, and disseminators of published works, but as providing access to information. This information may be in books and journals, but it also may be in microform, on videocassettes or discs, and in computer files. At the National Library of Medicine, we have large collections of data that exist only in a computerized form, with no printed counterpart. The reason for this, of course, is that our patrons seek information, in whatever form.

Our past accomplishments in applying computer technology to handling large volumes of biomedical information have aided virtually all our Library operations--from selecting and ordering the literature, to indexing and cataloging it, and to making the resulting bibliographic information available worldwide. However, remote searching of large, centralized, online data bases is "old" technology. During the remaining minutes, I would like to acquaint you with what we think future new technologies may do to improve information services.

In attempting to assess the potential impact of new technology, it is helpful to identify the relevant technologies as well as the specific information services to be affected. The two most salient technologies in the near future are the microcomputer and laser-based technologies. Microcomputers, of course, are already in widespread use and will increasingly be found in every facet of information use. Under the term "laser technologies" are found high resolution digital scanning, optical storage, and computer-driven laser printing. The application of these technologies has tremendous potential not only for biomedical communications but for all library and information services.

I will briefly describe a few applications. The first has to do with that most basic library function--archival storage. It will soon be possible to achieve "prospective preservation"--that is, capturing and storing the printed literature in machine-readable form at the time of publication. And with only one videodisc, 100,000 pages of text or the total annual output of some 100 scientific journals can be stored.

A second application is what is called demand publication, by which we mean the ability to retrieve documents stored in a digital form and to reprint them on demand with a quality comparable to the original. This will be possible by electronically scanning document pages at high resolution and storing the electrical signals on an optical digital disc. Under computer control, the disc may then be accessed to display and retrieve the pages at the same high resolution.

Another is the electronic journal--one that will exist and be delivered exclusively in an electronic format. A number of important scientific publishers are now experimenting with these, driven by the seemingly inexorable rise in publishing costs. As electronic access becomes more common, the number of journals on paper will decrease, behooving users as well as librarians to learn how to search the new electronic publications.

The delivery of randomly accessible graphics via optical videodiscs is another potential application. Such graphics capability will make feasible the concept of online encyclopedias that contain not only text, color plates, and halftones, but also randomly accessible audiovisual motion sequences under the control of the viewer.

There is the exciting possibility of replicating billions of characters of scientific information on digitally encoded optical discs and then providing low cost copies to local, regional, and national information centers. One potential application of this technology would allow the entire MEDLINE file (some 800,000 references from the last three to four years) to be stored on two or three videodiscs. The data bases could then be made available inexpensively for use on equipment at local institutions.

What will be the effect of these new applications on libraries and on users? Regardless of how technology impacts the format of information and the technical operation of libraries, I believe the traditional role of the library as a permanent archive of recorded knowledge will remain basically unchanged. Libraries will continue to perform their traditional functions of organizing information, providing assistance for locating information, and serving as a source of material for scholarly efforts. They will, however, be required to provide increasing access to machine-readable information as well as computer-based tools to aid in the production of derivative products.

The computerized data bases used in medicine today, as extensive and valuable as they are, represent only the beginning. It took 400 years from the introduction of printing in the 15th century to the development of comprehensive medical indexes in the 19th century. Only thirty years after the introduction of computers we have massive online reference data bases available worldwide.

I have tried to give you a glimpse of what the new technologies will make possible in the future, but it is difficult to see very clearly, very far. However, this much I can say with confidence: the only limitations we will face will be imposed not by technology but by our own imagination and creativity. What we require are solutions to the difficult and complex problems of how to use new technology to develop systems that provide answers to questions rather than citations or references. The precision of pharmacy must be matched by the precision of information sources and resources.

Thank you.

References

1. Sonnedecker, G. History of pharmacy. 4th Edition. Philadelphia: J.B. Lippincott Co. 1976. p. 569.
2. Garfield, E. The Institute for Scientific Information. In: Coping with the biomedical literature, Warren, S, ed. New York: Praeger Publishers. 1981.