Article Summary for Lecture #6 – Babb

“Cataloging Spirits and the Spirit of Cataloging”

With tongue slightly in cheek (I think), Nancy M. Babb uses an arcane set of bibliographic rules involving “spirit communications” — works from the dead, communicated to the living via mediums — to show how cataloging rules and popular culture can evolve over time, and perhaps shed light on evolving theories of authorship and bibliographic identity in the changing world of electronic resource cataloging.

Known for rapping on tables and communicating by Ouija board, spirits were prolific poets and novelists in the late 19th century, and were duly recorded as such in most library catalogs, before the advent of bibliographic standardization. An example: Food for the Million (1876) gives main entry to medium Sarah A. Ramsdell, with added author entry for spirit Theodore Parker, who in his previous fleshly life had published his own works.

Babb admits it’s an odd subject, but insists the concept of “spirit authorship” provides insight into broader underlying principles of authorship: It’s not just quaint folklore. Babb calls the field “an exemplar of complex authorship, entailing both joint authorship — as a collaboration between medium and spirit — and ascribed authorship, in which the seemingly obvious author, the medium, makes attribution to another, the spirit.”

The issue has been addressed in different ways at different times. Babb researched a sample of such works and found a hodgepodge of differing treatments, with the relationship between spirit and medium left more or less vague depending on the catalog.

Spirit communications were directly addressed in the 1941 and 1949 cataloging guidelines from ALA, with main entry generally given to the mediating medium and the communicating spirit treated as an interviewee, drawing a clear distinction between spirit communication and other types of joint authorship.

But even after 1949, differences and variations remained in cataloging: “Spirit communications in which the spirits were established historical figures entailed main entry under the medium and added entry under the historical figure. Spirit communications in which the spirits were prolific, well-known, but not of proven historical existence entailed main entry under the spirit, with qualifier added to the heading. Spirit communications in which the spirit was either not prolific [or] of unknown origin…entailed entry only for the medium.”

Seymour Lubetzky’s 1960 Code of Cataloging Rules dictated: “A work…attributed to the spirit of another person, is entered under the person who prepared it, with an added entry under the person to whom it is attributed.” Perhaps surprisingly, the first set of Anglo-American Cataloging Rules in 1967 softened Lubetzky’s “bibliographic banishment of spirit communication,” reverting to pre-20th century treatment, but getting rid of the label “mediumistic writings” in favor of “reporter or person reported.”

Spirits attained further bibliographic prominence in the Anglo-American Cataloguing Rules of 1978, which insisted it was not the cataloger’s job to debate authorship, only to interpret the encoded bibliographic data.

Can a book be ascribed to a spirit if spirits don’t actually exist? Well, given the moral neutrality of modern cataloging, an author need not physically exist to have bibliographic identity. A main entry under the spirit is not an admission of the existence of spirits but merely “to affirm user access and authorial intent.”

Over time, credit for authorship under cataloging rules has passed back and forth across the mortal veil from Spirit, to Medium, and back to Spirit again. Today, AACR reads: “Enter a communication presented as having been received from a spirit under the heading for the spirit.”

Babb intriguingly uses the concept of spirit authorship to throw light on the ever-expanding modern idea of authorship. She smoothly intertwines the halting evolution of the catalog with the cultural context of each period. It’s a fascinating story, if a dense and confusing thicket to untangle, and Babb does her best.

Article Summary for Lecture #5 – Park

“Dublin Core metadata semantics: an analysis of the perspectives of cataloguing and metadata professionals.”

Concerned about the inconsistent and inaccurate application of the Dublin Core (DC) metadata scheme among library professionals, authors Jung-ran Park and Eric Childress studied DC metadata semantics via a web survey, to determine the difficulty those professionals perceived in applying the DC’s metadata elements. Park and Childress found that ambiguity and semantic overlap in some of the metadata elements posed significant problems for library professionals.

Metadata — traditionally housed in the form of MARC (Machine-Readable Cataloguing) bibliographic records — has long been the keystone of resource description and discovery in libraries. Standardized metadata assists in semantic interoperability — that is, the ability of different computer systems in different libraries to exchange data reliably and accurately via a shared vocabulary.

Dublin Core, named after the Ohio city in which it originated in 1995, is one of the most widely used metadata schemes. It aims to facilitate cross-domain information exchange, and thus targets the simplest, lowest common denominator of resource description, to ensure that the information can be understood by all library systems.

But such simplification has also emerged as a weakness, resulting in conceptual fuzziness and definitional overlap — that is, semantic overlap — among some of the Dublin Core’s metadata elements. That has meant inconsistent metadata application, even among cataloguing professionals: the DC’s simplicity costs it the semantic richness of other metadata schemes and promotes inconsistency across institutions.

Park and Childress aimed to fill a void of research on metadata semantics. Their web survey posed two questions:

1. Which Dublin Core metadata elements pose the greatest degree of difficulty to cataloguing and metadata professionals during the metadata creation processes?

2. What are the primary factors engendering such difficulty?

Participants in the web survey pointed out the vague, ambiguous nature of Dublin Core metadata. One complained that “Dublin Core does not provide clear enough descriptions of what is contained in each element.”

The Relation field caused a particularly high degree of difficulty. It’s indeed a complex, perhaps ambiguously defined element, which requires the identification of both formal and informal relationships to other resources — such as a picture in a document, or a cover of a song, or some other derivation or adaptation of the original resource.
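To make the element concrete, here is a minimal sketch of my own, not an example from the article: the same derivative relationship (a cover of a song) expressed first with the bare, unqualified Relation element and then with a qualified DCMI refinement. The dict layout and names are purely illustrative; only the element names (relation, isVersionOf) come from Dublin Core.

```python
# A hypothetical record for a cover version of a song, expressed two ways.
# All titles and performers are placeholders, not bibliographic facts.

unqualified = {
    "title": "Hurt (cover version)",
    "creator": "Example Performer",
    # Unqualified DC: Relation is a bare pointer whose meaning is unstated.
    "relation": "Hurt / Original Performer (1994)",
}

qualified = {
    "title": "Hurt (cover version)",
    "creator": "Example Performer",
    # Qualified DC (DCMI Terms) can name the relationship explicitly.
    "isVersionOf": "Hurt / Original Performer (1994)",
}
```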

The authors turned from ambiguity to the related problem of semantic overlap — the difficulty of defining and distinguishing elements. One study participant confessed to having had to provide extensive documentation to distinguish between Format and Type, which seemed interchangeable. The difference between Format and Description was also hard to tease out.
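For the Format/Type pair specifically, a sketch of my own rather than the authors’: per DCMI usage, Type records the genre or nature of the resource (a DCMI Type Vocabulary term), while Format records its medium or file format, often a MIME media type.

```python
# One hypothetical scanned photograph, showing the distinction survey
# respondents struggled with. The title is a placeholder.

record = {
    "title": "Main Street, ca. 1910 (digitized photograph)",
    "type": "StillImage",    # what the resource is: DCMI Type Vocabulary term
    "format": "image/tiff",  # how it is encoded: an Internet MIME media type
}
```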

In addition to those major problems of ambiguity and semantic overlap, the authors found that the “one-to-one principle” — that is, a single metadata description for each individual resource — was often violated in practice. For instance, given a physical painting and a photograph of that painting, should the photographer or the original painter be indicated in the Creator element field?
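A sketch of how the one-to-one principle would resolve the painting example, again my own illustration with hypothetical names and identifiers: each resource gets its own record, so each Creator is unambiguous, and Relation ties the two together.

```python
# Two separate records, one per resource, per the one-to-one principle.

painting = {
    "identifier": "obj-001",             # hypothetical local identifier
    "title": "Untitled Landscape",
    "creator": "Painter, Example",       # the original artist
    "type": "PhysicalObject",
}

photograph = {
    "identifier": "img-001",
    "title": "Photograph of Untitled Landscape",
    "creator": "Photographer, Example",  # the photographer, not the painter
    "type": "StillImage",
    "relation": "obj-001",               # points to the painting it depicts
}
```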

This was ultimately a clear and convincing study. But ironically, it skimped on supplying specific examples that would have fleshed out the more puzzling elements of DC. That would have made the underlying problem more comprehensible and made the study more effective for the novice metadata scholar. (In other words, “So what exactly is a Relation, anyway?”)

Article Summary for Lecture #4 – O’Neill

In “FRBR: Functional Requirements for Bibliographic Records – Application of the Entity-Relationship Model to Humphry Clinker,” OCLC research scientist Edward T. O’Neill attempted to “FRBRize” a single work, the 1771 English novel The Expedition of Humphry Clinker by Tobias Smollett, as a case study.

First, he explained just what FRBR is: an entity-relationship model for cataloging, proposed by the IFLA study group, built on four “primary entities” — Work, Expression, Manifestation, and Item — and the relationships among them, meant to encompass “products of intellectual or artistic endeavor” and enable superior collocation.
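A minimal sketch of that four-entity hierarchy, assuming simple Python dataclasses; the attribute names are my own shorthand for illustration, not the IFLA attribute set, and the example chain at the bottom is hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Work:                  # the abstract intellectual creation
    title: str
    creator: str

@dataclass
class Expression:            # a realization of the work (e.g., an edited text)
    work: Work
    contributor: str         # e.g., an editor or illustrator

@dataclass
class Manifestation:         # a publication embodying an expression
    expression: Expression
    publisher: str

@dataclass
class Item:                  # one physical copy of a manifestation
    manifestation: Manifestation
    holding_library: str

# Hypothetical chain for the case study's novel:
work = Work("The Expedition of Humphry Clinker", "Tobias Smollett")
expr = Expression(work, contributor="an editor adding notes")
mani = Manifestation(expr, publisher="a hypothetical reprint house")
item = Item(mani, holding_library="a WorldCat member library")
```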

The work Humphry Clinker was chosen for three main reasons: it has been previously studied; it is of midlevel “complexity” (by which he seems to mean a work with multiple expressions or editions); and it is widely held in WorldCat.

O’Neill examined the manifestations (that is, the actual physical embodiments of an expression) of Humphry Clinker held at various libraries. Many manifestations were in fragile condition and/or encased in rare book collections. To capture information about the books while limiting potential damage, he used a digital camera instead of a copying machine (a more significant move in 2001 than today).

Along the way O’Neill made judgment calls, like determining that the long “s” in earlier printings was merely a typeface issue that didn’t qualify as another aesthetic expression, just another physical manifestation. Other, more significant revisions — principally the addition of new material to the original text — were considered “sufficient to create a different expression.” That included textual notes, dedications, and illustrations. “Identifying the illustrators was particularly problematic,” he wrote.

Another problem was the unreliable bibliographies that accompanied some editions. In some cases, O’Neill couldn’t determine whether the differences between bibliographic records represented different manifestations. To confirm those cases, one would have had to examine one or both of the books, but some were in too poor and fragile a condition to be loaned out.

O’Neill’s explanation of the difference between an expression and a manifestation was helpful: “Most of the expressions were created as the result of an editor adding an introduction, notes, or a bibliography; the addition of illustrations, or both…manifestations were the result of the expression either being published by a new publisher or being republished with the type being reset.”

He warned that “bibliographic records simply do not contain sufficient information to reliably identify expressions” and that “Determining if two manifestations embody the same expression proved to be very difficult” via records alone.

He also warned that, in a shared cataloging environment, FRBR could unwittingly create many duplicate records, and that the IFLA study group’s ambiguous definition of an expression may encourage excessive flexibility on the part of cataloguers employing FRBR, who might, for convenience’s sake, wish to avoid birthing a “new expression, no matter how minor the modification may be.”

O’Neill brainstormed some alternatives to expressions, such as replacing them with “additional manifestation attributes.” He used illustrators as an example: one could offer scholars of, say, the cartoonist George Cruikshank a way to identify the manifestations of Humphry Clinker that he illustrated.
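A sketch of what that alternative might look like, my own reading of the proposal rather than O’Neill’s specification: the illustrator becomes a searchable attribute of the manifestation itself, and no new expression is minted.

```python
# Manifestation records carrying an illustrator attribute directly.
# Publisher names here are placeholders, not bibliographic facts.

manifestations = [
    {"work": "The Expedition of Humphry Clinker",
     "publisher": "Example Press A", "illustrator": "George Cruikshank"},
    {"work": "The Expedition of Humphry Clinker",
     "publisher": "Example Press B", "illustrator": None},
]

# A Cruikshank scholar could then collocate his manifestations directly:
cruikshank_eds = [m for m in manifestations
                  if m["illustrator"] == "George Cruikshank"]
```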

O’Neill concluded ruefully, perhaps burying his lead finding: “The irony is that the FRBR model provides minimal benefits to the small works that can be reliably FRBRized, but fails on the large and complex works where it is most needed.” In fact, his conclusion is notably less cheery than the article’s bland title and abstract would indicate.

Article Summary for Lecture #3 – Yee

In “Wholly Visionary,” Martha M. Yee reviews the events that led to the development of the Library of Congress card distribution program, which created a de facto national bibliography and the once-ubiquitous, now nostalgic “card catalog.” Yee’s narrowly focused history eventually expands to incorporate qualms about the Web and ruminations about the free accessibility of our cultural record in today’s political environment.

In 1901, the Library of Congress began to distribute its printed catalogue cards to U.S. libraries. Yee documents the indispensable catalysts that brought that about: men like Melvil Dewey and Herbert Putnam, the actions of the American Library Association and Library of Congress, and the inaction of publishing houses (public over private is a pattern in the article).

There were some false starts, however.

The American Library Association was founded in 1876, and Putnam, an early ALA mover and shaker, claimed a “main purpose” for the organization was “a centralization of cataloguing work.”

In 1877, ALA cofounder Dewey wondered if the Library of Congress could provide a national catalog. Yet it was another generation before the Library began producing cards for the nation’s libraries. Yee asserts the lack of a single set of cataloging rules was a major factor in the delay.

Putnam eventually became Librarian of Congress, and set about creating a national library and bringing the card distribution program to fruition. By 1905, card catalogs were replacing book catalogs, and the Carnegie Foundation was constructing libraries everywhere, often small-town “part-time” libraries that appreciated the help. Yee observes that the card system had a standardizing effect, which in turn had a leveling, democratic one.

Moving forward in time, Yee evinces concerns about jobs for librarians in this newly mechanical world. She notes the threat to librarianship posed by the Internet, calling out Google for privileging convenience over accuracy. But Yee also perceives the Web as a potential force for good, allowing cooperative cataloging to thrive.

Yee questions whether the same “progressive” attitudes operating in the 1900s are still functioning: “Some might argue that, at present, the Library of Congress serves a government that is dominated by those who wish to shrink all aspects of government that are not part of the military industrial complex….” She is concerned that the Online Computer Library Center has replaced the Library’s card program as the distributor of the Library’s cataloging, and that cataloging has become less standardized, given that the Library lacks cataloging copy for audiovisual materials.

Yee posits U.S. culture at a crossroads: We can choose to use our new technology foolishly, and risk misplacing our cultural record, or open it to everyone. She ends with a cri de coeur: “As always, it is up to us to choose the kind of society we want.”

This was a clear, interesting history of a program that has had an unappreciated influence on the lives of countless Americans, from schoolchildren to adults. But Yee’s sub-rosa injection of political views didn’t add much. Yee fears private industry won’t make knowledge accessible to a broad audience, and connects the card program to today’s controversies over public-versus-private-sector activity in the information field. But the transition between her history of the card program and her widescreen commentary on politics and culture reads awkwardly and does not feel “wholly” earned. Yee’s worries about falling standards are perhaps less convincing than she wishes. And she makes only passing mention of the turn-of-the-century Carnegie libraries initiative, which would potentially complicate her public-sector-good, private-sector-bad dichotomy.

Reflection on the Principle of Least Effort

According to Wikipedia, the Principle of Least Effort “postulates that animals, people, even well designed machines will naturally choose the path of least resistance or ‘effort,’” adding significantly that “This is perhaps best known or at least documented among researchers in the field of library and information science.”

The above paragraph is, of course, a demonstration of that very principle — a reference obtained by clicking on the top item in the keyword search engine, Google. Thomas Mann elucidated the principle in his 1993 book Library Research Models, when the Internet age had not quite dawned. (Google itself would not be launched until 1998.)

Mann summarized the concept of the Principle of Least Effort as it relates to the work of librarians: “…as a general rule of thumb, people tend to choose perceived ease of access over quality of content in selecting an information source or channel; that is, they usually follow the slope of the system regardless of whether it is leading them to the best sources.” [Emphasis in original.]

Mann examined various fields of endeavor, both “hard” sciences and social sciences, and found the same effort-avoidance patterns — with ease, speed, and simplicity valued over quality and depth of sources (at least as determined by Mann).

How did library and information studies scholars perform? Were their research habits any more thorough, given that the Principle of Least Effort, or PLE, is a particularly popular topic in the field? Mann hinted in the negative, quoting Robert Fairthorne: “The unwillingness, o[r] inability of information retrieval specialists to retrieve information about information retrieval is notorious.”

Mann concluded that the library profession “must consciously manipulate the ‘slope of the gameboard’ to make the best channels easier for researchers to perceive.” But his solution list is rather vague.

UCLA library scholar Elaine Svenonius, in the Bibliographic Objectives chapter of The Intellectual Foundation of Information Organization (2000), partially and perhaps unconsciously downgraded the significance of Mann’s concern about library users choosing convenience over perceived quality, at least in my reading.

While Mann laid out an impressive case defining what he considered a problem, Svenonius pointed out barriers to solving it, including cost, noting “Any task that requires an organizing intelligence to engage in research is costly.” She also glimpsed bright spots in the rise of the Web and the magic of “keyword access and…the voluntary efforts of individuals who mount information on the web.” Wikipedia was still a year away when Svenonius published, but perhaps such an easily accessible, collaborative encyclopedia of knowledge was something that would have met her approval.

Svenonius conceded that cutting-edge scholars do need a full-featured information system. Then again, those scholars could be expected to be trained in where to look, how to find it, or at least whom to ask (a librarian).

Bowing to the perceived necessity of keeping things basic and simple for average information seekers, Svenonius referenced studies showing “that often users neither need, nor are capable of exploiting, the power of a highly organized database.” Of course, an advocate for Mann-style source quality could raise the rhetorical question: What are library and information specialists for, if not to pass along such skills?

This lack of understanding on both sides of the user-provider line is indirectly touched upon in Marcia J. Bates’ article, “The Invisible Substrate of Information Science,” which appeared in the Journal of the American Society for Information Science in 1999. Bates argued that the theoretical underpinnings of the field are too often unappreciated, not only by the general public and specialists in other fields, but by newcomers to Information Science itself, who may feel they already “know” how to do their work yet miss the “elements below the water line,” in her phrase.

The Principle of Least Effort is based on the sensible premise that life is short. Consider that even the most convenient (read: popular) sources became that way by being reasonably reliable. Google itself marked a massive upgrade in search quality, ranking sites by the number of links they received from other sites, instead of merely by how many times a search word appeared on that site. In other words: the rankings were based on citations. As John Battelle wrote in “The Birth of Google” for Wired magazine in August 2005: “[Google co-founder Larry Page] reasoned that the entire Web was loosely based on the premise of citation — after all, what is a link but a citation?”

Wikipedia has its faults, but its inaccuracies can at least be fixed or earnestly debated in a semi-open forum, whereas mistakes in library catalogs, with no outside feedback available, often lie buried in their sins. Perhaps all roads lead to Rome, even short cuts.
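As a closing illustration of the link-as-citation idea above, a toy sketch in the spirit of PageRank: the four-page link graph is made up, and the damping factor and iteration count are conventional textbook values, not Google’s actual settings.

```python
# A toy, illustrative ranking pass: each page splits its current score
# among the pages it links to (its "citations"), repeatedly.

links = {            # page -> pages it links to
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],      # "d" cites "c" but is cited by no one
}

def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, outlinks in links.items():
            targets = outlinks or pages   # dangling pages share rank evenly
            for t in targets:
                new[t] += damping * rank[page] / len(targets)
        rank = new
    return rank

# The heavily cited page ("c") ends up ranked highest.
print(sorted(pagerank(links).items(), key=lambda kv: -kv[1]))
```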