Wednesday, August 15, 2012

Case Study No. 0503: Tim Spalding (Wannabe Librarian)

Future of Cataloging (1/2)
Part one of my 18-minute screencast on the "future of cataloging" for a panel at the American Library Association's 2008 conference. See thingology/2008/06/ future-of-cataloging-at-ala.php for more information on the panel.
Tags: tagging cataloging classification librarything
Added: 4 years ago
From: librarythingtim
Views: 5,819

The Future of Cataloging (as seen from LibraryThing)
* Tim Spalding
* tim [at]

TIM SPALDING: So, here's what I think about the future of cataloging, and I actually ... Louie did cataloging, not the catalog, although I usually talk about that. So, LibraryThing--
["Warning: Library Science being practiced without a degree." appears on screen]
TIM SPALDING: Oh yeah, so warning! Library science being practiced without a degree ...
[the audience laughs]
TIM SPALDING: Okay, what is LibraryThing?

What is LibraryThing?
* LibraryThing in one sentence
* 450,000 registered users
* 28 million books
* 37 million tags
* 50+ imitators
* LibraryThing is your friend :)

TIM SPALDING: LibraryThing is a social network for bibliophiles. You catalog your books. Um, all you have or just what you're interested in, and then the books you have connect you with other people, okay? That's one sentence because I was using semi-colons ...
[the audience laughs]
TIM SPALDING: Um, we have four hundred and fifty thousand registered users who have cataloged twenty eight million books, so we're soon going to become the largest "library" in the world, quotes. Thirty seven million tags have been applied. Um, the largest library-sponsored tagging, Project PennTags, has accumulated in its lifetime - which is almost as long as LibraryThing - fewer tags than LibraryThing adds every single day. So this is where the tags are happening, for whatever reason.
[he clears his throat]
TIM SPALDING: We've got a lotta imitators, some of them actually fairly important now ... Shelfari, GoodReads, iRead, Visual Bookshelf. However, none of them in any way care about, link to, draw data from, or otherwise are interested in you and in libraries generally, right? They tend to appeal more to younger people, who don't know you exist, so LibraryThing is your friend!
[the audience laughs]

The ladder of use ...
* Personal cataloging
* Social networking
* "Social cataloging"
- implicit
- explicit
* (watch me drive)

TIM SPALDING: Okay, LibraryThing usage often has a sort of ladder to it. You start off with personal cataloging, that's really what the site started off as, and then from that comes this social networking. And from that comes something I call "social cataloging," and I'm gonna talk about two different types - implicit social cataloging that happens in the course of normal business, and explicit where you pretty much know what you're doing. Lemmee start off with explicit.
[the profile page for "ThomasJefferson" from LibraryThing appears on screen]
TIM SPALDING: Here's the catalog of one of our more famous members, Thomas Jefferson.
[he zooms in on Jefferson's profile picture]
TIM SPALDING: Okay? Thomas Jefferson had his books cataloged by Millicent Sowerby some time ago, and members went ahead and added all the books in that catalog, drawing on a large number of libraries that LibraryThing draws from. Coming up with a really excellent catalog of Jefferson's work, which is now available to every student in the country. Easily searched, browsed.
[he zooms in on the section marked "Books you share (27)"]
TIM SPALDING: I can even know what books I share with Thomas Jefferson over here, right? Which is a lot, 'cause I was a classics graduate student and people used to read a lotta classics, so ...
[he clicks on the "library" section ( catalog.php?view= ThomasJefferson)]
TIM SPALDING: Here's a view of his library. Incidentally, someone also took the trouble to take all of the references from his papers to any of the books in this catalog and put them in as comments. So if, in some letter to a Virginian youngster he said "Y'know, you should really read Plato's Republic", then someone's put that in in the comments. Really a cool valuable service.
[the profile page for "JohnAdams" from LibraryThing appears on screen]
TIM SPALDING: Here's John Adams, his opponent in life, but he shares the most books with Thomas Jefferson on LibraryThing.
[the audience laughs]
TIM SPALDING: So, an interesting fact about our mental universes there.
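The "Books you share" count and the Adams/Jefferson observation come down to set overlap between libraries. A minimal sketch, assuming each library is just a set of work titles; the `shared_books` and `most_shared` helpers and the toy data are hypothetical illustrations, not LibraryThing's actual code or records:

```python
def shared_books(library_a, library_b):
    """Return the works that appear in both libraries."""
    return library_a & library_b


def most_shared(target, libraries):
    """Return (member, shared_count) for the other member whose
    library overlaps most with the target member's library."""
    best_member, best_count = None, -1
    for member, books in libraries.items():
        if member == target:
            continue
        count = len(shared_books(libraries[target], books))
        if count > best_count:
            best_member, best_count = member, count
    return best_member, best_count


# Hypothetical toy data; the titles are illustrative only.
libraries = {
    "ThomasJefferson": {"Republic", "The Prince", "Iliad", "Odyssey"},
    "JohnAdams": {"Republic", "Iliad", "Odyssey", "Politics"},
    "2pac": {"The Prince", "The Art of War"},
}

print(most_shared("ThomasJefferson", libraries))
```

With this toy data the closest library to "ThomasJefferson" is "JohnAdams", with three shared works.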
[the profile page for "MarieAntoinette" from LibraryThing appears on screen]
TIM SPALDING: Um, Marie Antoinette ... An interesting one because this is a personal reading library, it's not some official library.
[the profile page for "SamuelJohnsonLibrary" from LibraryThing appears on screen]
TIM SPALDING: Um, Samuel Johnson ...
[the profile page for "SusanBAnthony" from LibraryThing appears on screen]
TIM SPALDING: Susan B. Anthony ...
[the profile page for "MatherFamilyLibrary" from LibraryThing appears on screen]
TIM SPALDING: The Mather family ... This is one of the libraries on LibraryThing that is actually a work of original scholarship. Um, this part of the site's been taken over by one of the archivists for the Mass Historical Society, and he has brought together for the first time a library of all the Mathers and did a lot of original work himself. Tracking down volumes, so this is pretty excellent.
[the profile page for "BelleStewartGardner" from LibraryThing appears on screen]
TIM SPALDING: Uh, Isabella Stewart Gardner ...
[the profile page for "Audenwh" from LibraryThing appears on screen]
[the profile page for "ErnestHemingway" from LibraryThing appears on screen]
TIM SPALDING: Hemingway, seven thousand books. An enormous amount about fishing ...
[the audience laughs]
TIM SPALDING: And bullfighting and so forth ...
[the profile page for "WalkerPercy" from LibraryThing appears on screen]
TIM SPALDING: Uh, Walker Percy ...
[the profile page for "EzraPoundsLibrary" from LibraryThing appears on screen]
TIM SPALDING: Ezra Pound ...
[the profile page for "SylviaPlathLibrary" from LibraryThing appears on screen]
TIM SPALDING: Sylvia Plath ...
[the profile page for "FScottFitzgerald" from LibraryThing appears on screen]
TIM SPALDING: F. Scott Fitzgerald ... All of this work done by regular LibraryThing members in their spare time.
[the profile page for "SamuelRoth" from LibraryThing appears on screen]
TIM SPALDING: Here's Samuel Roth, famous pornographer ...
[the profile page for "2pac" from LibraryThing appears on screen]
TIM SPALDING: And here is Tupac Shakur, famous rapper ... Who shares one book with me, "The Dictionary of Cultural Literacy," and one book with Jefferson, which is Machiavelli's "The Prince."
[the audience laughs, then the profile page for Vonda N. McIntyre's "The Crystal Star" from LibraryThing appears on screen]
TIM SPALDING: Okay, here's a book ... some random Star Wars book to you, and to me as well.
[he scrolls down to the bottom of the page]
TIM SPALDING: Down here, we have a section that LibraryThing calls "Common Knowledge."
[he zooms in on the "Series with order" (Star Wars 14|14 ABY) and "People/Characters" (Luke Skywalker, Leia Organa Solo, Jacen Solo, Jaina Solo, etc.) sections]
TIM SPALDING: And the idea here was, let's catalog the things that aren't in library records and aren't in publisher ONIX records. So, "series" to some extent is, but things like people and characters, right? Y'know, what other books have Jaina Solo in them, right? Whoever that is ... Awards and honors, publishers, editors.
[the Star Wars "series" page ( series/Star+Wars) appears on screen]
TIM SPALDING: I'm gonna skip here to the Star Wars series ... This is the Star Wars series on LibraryThing, four hundred and eighty titles. The covers come from Amazon, or many of them were uploaded by members.
[he scrolls down the page]
TIM SPALDING: Okay, huge number, right?
[he scrolls down to the "Series by title" section]
TIM SPALDING: A complicated numbering system ... "ABY" and so forth. This means something to Star Wars people, right?
[the audience laughs]
TIM SPALDING: Now, I believe ...
[he zooms in on the "Related series" section]
TIM SPALDING: And then over here we've got related series, right? We've got the Star Wars "Young Jedi Knights" series as a subseries of this, "Rogue Squadron", right? This page encapsulates more accurate information about the Star Wars books and how they relate to each other than anyone in this room has, I hope, right?
[the audience laughs]
TIM SPALDING: And then I believe has ever been assembled, okay? So if you want to ... Y'know, who is the expert when it comes to the Star Wars books? It's these guys, right? It's the people who did this.
[the Library of Congress entry on "Love in the asylum" appears on screen]
TIM SPALDING: Um, here is my ... We're doing plugs, so this is my wife's third novel, "Love in the Asylum" by Lisa Carey. Uh, which is out in stores, and is great.
[the audience laughs, then he zooms in on the subject headings (Psychiatric hospital patients -- Fiction, Substance abuse -- Treatment -- Fiction, Drug addicts -- Fiction, Letter writing -- Fiction, Women -- Maine -- Fiction, Women authors -- Fiction, Alcoholics -- Fiction, Maine -- Fiction)]
TIM SPALDING: Um, the LCSHs for this book include alcoholism, and do not include anything about Native Americans. Well, the book has really nothing about alcoholism, and it has a lot about Native Americans. Well, how did this come about? The flap copy talked about alcoholism, and doesn't talk about Native Americans, and the cataloging was fundamentally not done from reading the book. The cataloging was done from reading the flap copy. Which means that the LC record is fundamentally cataloged by some publicist at HarperCollins, okay? Not by someone who knew the book.
[the profile page for "Love in the Asylum" from LibraryThing appears on screen]
TIM SPALDING: On LibraryThing, the tags that people use include ... do not include alcoholism, do include Native Americans.
[he zooms in on the "Member Tags" section, which is a tag cloud where the bigger terms highlighted include "new england" and "mental illness"]
TIM SPALDING: Although rather small ... Okay.
[the profile page for "The Adventures of Huckleberry Finn" from LibraryThing appears on screen]
TIM SPALDING: Here's "The Adventures of Huckleberry Finn" on LibraryThing.
[he zooms in on the top of the page (Members: 9773, Reviews: 101, Popularity: 46)]
TIM SPALDING: You'll notice the tags, you'll notice there's nine thousand seven hundred members who have this book.
[he clicks on "Other copies and editions of this title"]
TIM SPALDING: Here's all of the editions of "Huckleberry Finn" ... I'm just gonna scroll through them here.
[he scrolls down the lengthy list]
TIM SPALDING: Hundreds and hundreds of editions ... How does LibraryThing know to put them all together? Regular users have combined them, okay? On LibraryThing, it helps to combine works. If I had the Finnish edition of "Huckleberry Finn", by combining it with the other editions, I've made new friends, alright? So LibraryThing members all day long are out combining works, more than a thousand work combinations every single day, which is at least two thousand works combined. Okay? That's done, by the way, with authors so Mark Twain includes Samuel Clemens because someone on LibraryThing said that he did. The FBI, by the way, includes a large number of variants, too.
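The talk does not say how LibraryThing stores these combinations. One way to picture pairwise member decisions adding up to a single work is a union-find (disjoint-set) structure, used here purely as an illustration. The `WorkCombiner` class and the edition labels are hypothetical:

```python
class WorkCombiner:
    """Tracks which editions members have combined into the same work."""

    def __init__(self):
        self.parent = {}

    def find(self, edition):
        """Return the representative edition for this edition's work."""
        self.parent.setdefault(edition, edition)
        root = edition
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every edition on the path at the root.
        while self.parent[edition] != root:
            self.parent[edition], edition = root, self.parent[edition]
        return root

    def combine(self, edition_a, edition_b):
        """Record one member-made combination of two editions."""
        root_a, root_b = self.find(edition_a), self.find(edition_b)
        if root_a != root_b:
            self.parent[root_b] = root_a

    def same_work(self, edition_a, edition_b):
        """True if the two editions have ended up in the same work."""
        return self.find(edition_a) == self.find(edition_b)


works = WorkCombiner()
works.combine("Huckleberry Finn (English)", "Huckleberry Finn (Finnish)")
works.combine("Huckleberry Finn (Finnish)", "Huckleberry Finn (audiobook)")
print(works.same_work("Huckleberry Finn (English)", "Huckleberry Finn (audiobook)"))
```

Each `combine` call joins at least two entries, which matches the arithmetic in the talk: a thousand combinations a day means at least two thousand works combined. The same structure would cover author aliases such as Mark Twain and Samuel Clemens.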
[the audience laughs, then the "Books tagged cooking" page from LibraryThing appears on screen]
TIM SPALDING: Here's the tag "cooking" on LibraryThing, okay?
[he zooms in on the top of the page ("Tag and its aliases used 77409 times by 6377 users.")]
TIM SPALDING: What's interesting about this, primarily it's kind of a popularity contest, but what's most interesting is that it is not ...
[the "Books tagged cookery" page from LibraryThing ("Tag and its aliases used 24017 times by 1125 users.") appears on screen]
[the audience laughs]
TIM SPALDING: Okay? Now, I submit that the "cookery" tag on LibraryThing is what you all eat, okay? You guys are really fond of Delia Smith, so ...
[the "Books tagged paranormal romance" page from LibraryThing ("Tag and its aliases used 7768 times by 474 users.") appears on screen]
TIM SPALDING: Um, paranormal romance, right? Something that traditional cataloging will never get into ...
[the audience laughs]
TIM SPALDING: This has been used almost eight thousand times by five hundred users, this is as real as anything in the Library of Congress, okay? Um, and there's an infinitude of these terms. Cyberpunk, steampunk, it goes on and on. Cozy mysteries.
[the "Tagmash" page for "france" and "wwii" from LibraryThing appears on screen]
TIM SPALDING: Okay, here's where I mention my tag mashes. I wanna show you one, so very few people will tag a book "non-fiction" about France during World War II. But they will tag a book "France", "World War II", and then if we cut out the fiction, we get a pretty good list of books that are non-fiction about France during World War II. So you get some of the power of hierarchy ... Um, not all of it. Some of the power of hierarchy from large large quantities of user tags.
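The tagmash idea described here is essentially set logic over tags: keep books that carry every wanted tag and drop those that carry an unwanted one. A minimal sketch, assuming each book maps to the set of tags members have applied; the `tagmash` function, titles, and tags are hypothetical, not LibraryThing's implementation:

```python
def tagmash(book_tags, include, exclude=()):
    """Return titles carrying every tag in `include` and none in `exclude`."""
    include, exclude = set(include), set(exclude)
    return sorted(
        title
        for title, tags in book_tags.items()
        if include <= tags and not (exclude & tags)
    )


# Hypothetical titles and tags, for illustration only.
book_tags = {
    "Occupation memoir": {"france", "wwii", "history"},
    "Resistance novel": {"france", "wwii", "fiction"},
    "Paris travel guide": {"france", "travel"},
}

print(tagmash(book_tags, include=["france", "wwii"], exclude=["fiction"]))
```

Nobody had to tag anything "non-fiction about France during World War II"; the combination falls out of the simpler tags.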
[he tries to go back to the "Books tagged paranormal romance" page]
TIM SPALDING: And notably here, in tags ... Y'know, this is paranormal romance, but these are really really good examples. This is the paranormal romance reading list. Way down at the bottom are things that only a few people think is paranormal romance, so relevancy is built into the system in a way that something like "love stories" on LCSH isn't. Everything is equally a love story on LCSH, okay? Um, or not at all! Okay, so lemmee get back to my slides.
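The "relevancy is built in" point can be sketched as a simple count: rank books by how many distinct members applied the tag. This assumes taggings arrive as (member, book, tag) triples; the `rank_by_tag` function and the sample data are hypothetical, and the real ranking may weigh things differently:

```python
from collections import Counter


def rank_by_tag(taggings, tag):
    """Rank books by how many distinct members applied `tag` to them."""
    members_per_book = Counter()
    seen = set()
    for member, book, applied_tag in taggings:
        # Count each member at most once per book.
        if applied_tag == tag and (member, book) not in seen:
            seen.add((member, book))
            members_per_book[book] += 1
    return members_per_book.most_common()


# Hypothetical taggings, for illustration only.
taggings = [
    ("ann", "Book A", "paranormal romance"),
    ("bob", "Book A", "paranormal romance"),
    ("cat", "Book A", "paranormal romance"),
    ("ann", "Book B", "paranormal romance"),
    ("bob", "Book B", "vampires"),
    ("ann", "Book A", "paranormal romance"),  # repeat, ignored
]

print(rank_by_tag(taggings, "paranormal romance"))
```

A subject heading, by contrast, is either present or absent on a record, so there is nothing to sort by.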
["Declaration: the 'Tag War' is over! Time to come out of the jungle ... " appears on screen]
TIM SPALDING: So, first of all I wanna make a declaration, which is that the tag war is over! It is time to come out of the jungle, and realize that the war is over. I, I do not believe that tags are better than subjects, okay? But anyone who asserts that tags are not useful for finding certain things just hasn't spent the time to look at the evidence. Um, y'know, if you're one of those people, show up afterwards and I'll show you the evidence. Um, and I've posted some screencasts. There are things for which tags are just great at finding stuff. There are things for which tags can't touch subjects, but to some extent ... If what you care about is finding things, not asserting ontological reality, then the war is over.
["What is really going on? (one answer)" appears on screen]
TIM SPALDING: Okay, what is really going on here?
[a painting of the Death Star appears on screen, with the words "The end of the world!" superimposed over it]
[the audience laughs]
TIM SPALDING: You'll note, by the way, that's the ... I added the Death Star during the course of the talk, so--
[the audience laughs]
TIM SPALDING: I was fiddling with Photoshop while, uh ... while people were talking. I actually don't mean the end of the world, I mean ...
["The End of Intellectual Structures Based on and Limited by the Physical World" appears on screen]
TIM SPALDING: The end of intellectual structures based on and limited by the physical world ...

The physical basis of classification
* Hat-tip Weinberger
* A book has 3-6 subjects
* Subjects are equally true
* Subjects never change
* Only librarians get to add subjects
- There is only one answer. Someone "wins." (religion, gender)
- You don't get a say in how books are classified.
* Classification must be hierarchical

TIM SPALDING: So, any of you who know "Everything is Miscellaneous" by David Weinberger, a lot of this descends from his thinking. So, a book has three to six subjects? No, that's how many subjects can fit on a catalog card, right? We continue that because of that limitation.
[he clears his throat]
TIM SPALDING: Subjects are equally true. Everything that's a love story, everything that's a man-woman relationship, is equally that, right? Well, if you look at man-woman relationships ... what percentage of Western literature is about man-woman relationships? Like, eighty percent?
[the audience laughs]
TIM SPALDING: Right? It's just a matter of degree. Um, but you can't have degree on catalog cards, because how would you do that, right? The words are bigger? Alright?
[the audience laughs]
TIM SPALDING: Um, subjects never change ... Right? So, first editions of "Diary of Anne Frank" don't have the word "Holocaust" on them because the word didn't have an LCSH for that yet, right? So you can't, you don't change. Now, of course, you could change them but you don't.
[the audience laughs]
TIM SPALDING: Only librarians, only you get to add subjects, right? That also means that there is one right answer, so ... y'know, whether Sikhism is part of Islam or not, one of you has to decide. Someone has to win. Um, LibraryThing, by the way, added a "gender" field to authors so that members could contribute it, and the first thing that happened was people said "That's not enough!"
[he laughs]
TIM SPALDING: "We need intersexual, we need" ... Y'know, a million other categories. Um, and they have a point.
[he clears his throat]
TIM SPALDING: Okay, then you don't get to say ... Well, actually you do get to say how books are classified. The rest of the world doesn't get to say, right? Well, in the physical world, that makes sense. Y'know, what would it mean if people went into the Boston Public Library and they pulled out the cards and they started writing stuff in the margins and they said "Oh, I don't believe that one!"
[the audience laughs]
TIM SPALDING: Y'know, that would be chaos. In the digital world, that's fine! My scribbling doesn't impact you, unless you want it to, right? Um, and the notion that classification must be hierarchical. The world is not hierarchical, um, ideas are not hierarchical. Um, the relationships between things are much more subtle than hierarchy can contain.

* Only books are cataloged (Diane's granularity consensus)
- Alchemy example
* Cataloging has to be done in the library
- Cataloging can't be done in underpants
- WookieGuy72 can't help you
* Most librarians can't help you, each other, themselves
- Libraries are NOT good at sharing metadata
* Record creating and editing can't be distributed
* Record sharing can't be shared freely

TIM SPALDING: Okay, other examples ... Only books are cataloged. So, I think Diane might mention this idea, the granularity consensus breaking down. On LibraryThing, we've gone seriously into series, obviously, and the next step is to say "What stories do I have in my collection by H.P. Lovecraft?" Right, those are questions that traditional cataloging doesn't answer very well, and that I think will break down over time. Incidentally, the alchemy example is a great one, where you would think ...
[he clears his throat]
TIM SPALDING: So I studied classics, I actually studied alchemy and other divinatory stuff, and you would think that classics would really work in library classification, but it doesn't. That all sorts of ways where works break down and authors break down, can't be represented by traditional classifications, so you lose it. Okay, cataloging has to be done in the library, cataloging can't be done in your underpants.
[the audience laughs]
TIM SPALDING: Uh, "WookieGuy" can't help you.
[the audience laughs]
TIM SPALDING: Um, most librarians can't help each other, either. Right? So Jennifer, libraries are not good at sharing metadata. They are reasonably good at pulling metadata down from a centralized source, and very occasionally putting it back up again, right? But if you have a community of interest, a bunch of California libraries that wanna go one way, they're not sharing with each other. It's not easy ... Record creating and editing can't be distributed, um, I believe it will be. And record sharing can't be shared freely. This is primarily about the physical world, but we also have the world of OCLC to blame for that as well.

Two Futures
* The world ends
- You are paid less
- The programmers still get paid
* You move up the stack
- An IT industry analogy
- Demand increasing
- Low level work and data becomes commoditized, distributed, free
- You move higher, get paid more

TIM SPALDING: Okay, two futures! The world ends, you are paid less, programmers are still paid!
[the audience laughs]
TIM SPALDING: Second future, you move up the stack. This is a term from the world of IT, okay? Over and over again in IT, demand increases while the lowest level of work and data becomes commoditized or free. LibraryThing is built upon a free operating system, a free programming language, a free database system, okay? All of those jobs that might be there have been destroyed, okay? But there's a lot of programmers making a lot of money, okay?
[he laughs]
TIM SPALDING: It happens by moving up the stack of abstraction, moving up the stack of quality, okay? So I think the Starbucks example was good, that librarians in the future may be operating at a higher level of abstraction with respect to their records. Um, and y'know, so ... Move higher, you get paid more!

Concluding tangent: A new shelf order
* Replaces Dewey
- Free (Open Source)
- Modern
- Humble
* Decided socially, level by level
* Tested against the world
* Assignment is distributed
* I write the code
* You be Jimmy Wales (Wikipedia founder)

TIM SPALDING: Now, I just wanna make a concluding tangent, because I wanna search out some person in this audience, which is that I wanna replace Dewey with a free, open source, modern, and humble system. By "modern", I mean that Portuguese is not a dialect of Spanish, and Unitarians are not as significant as Catholics ...
[the audience laughs]
TIM SPALDING: Um, and uh ... Ouch!
[he laughs]
TIM SPALDING: And "humble", I mean that, y'know, the purpose of this is to come up with a shelf order, not to model some reality which can't fully be modelled in a shelf order. Uh, decide this socially, level by level. Let's do the aughts, right? Let's figure out what the aughts are, then let's go to the tens. One by one.
[he clears his throat]
TIM SPALDING: Tested against the world. Once we've gotten to the tens, are books breaking out in ways that make sense? Are books that used to be together in Dewey, really far apart? Why? Are books that LibraryThing thinks are owned by the same people and very closely, are they together? Why? Alright, then one by one, you solidify it. The assignment of it is distributed around the world.
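One possible reading of "tested against the world" is to measure how often books that tend to be owned by the same people land in the same class under a proposed level. A rough sketch under that assumption; the `cohesion` function, the class labels, and the titles are all hypothetical, and the talk does not specify any particular metric:

```python
def cohesion(assignment, co_owned_pairs):
    """Fraction of co-owned book pairs placed in the same class.

    Pairs that include an unclassified book are skipped.
    """
    placed = [
        (a, b) for a, b in co_owned_pairs
        if a in assignment and b in assignment
    ]
    if not placed:
        return 0.0
    together = sum(1 for a, b in placed if assignment[a] == assignment[b])
    return together / len(placed)


# Hypothetical top-level classes for a proposed shelf order.
assignment = {
    "Cookbook 1": "A",
    "Cookbook 2": "A",
    "Sci-fi novel": "B",
}
# Hypothetical pairs of books frequently owned by the same members.
co_owned_pairs = [
    ("Cookbook 1", "Cookbook 2"),
    ("Cookbook 1", "Sci-fi novel"),
]

print(cohesion(assignment, co_owned_pairs))
```

A low score at one level would be the cue to ask "Why?" before solidifying it and moving on to the next.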
[he clears his throat]
TIM SPALDING: I write the code, and you be the Jimmy Wales of this ... You look over it, but fundamentally have no power, so--
[the audience laughs]
TIM SPALDING: No, I mean, Jimmy Wales said that every year ... He's like the Queen of England, every year he has less power, right? And I think that's the ultimate goal here, is to come up with these things socially. So, I hope I can find that person that wants to do this as a cool side project. Thank you very much!
[the audience applauds]



Thursday, June 26, 2008
The Future of Cataloging at ALA

If you're at ALA in Anaheim, have nothing to do Sunday morning and are interested in the future of cataloging—and who isn't?—you might be interested in the following panel:

Creating the Future of the Catalog and Cataloging
ALA Annual Conference
Sunday, June 29, 2008 from 8:00 a.m. - 12:00 noon
Anaheim Convention Center, Rm. 204B

The panelists include Roy Tennant, Jennifer Bowen, Martha Yee, Diane Hillmann—and (gulp) me!

The moderator, Robert Wolven of Columbia, is promising to keep it snappy, with brief presentations and oodles of time to discuss the big issues.

I don't know all the panelists, but I know we include some very different visions of the future. There may be fireworks! (I won't be attacking OCLC as much as I otherwise might. Roy could disarm Rambo.)

My mini-presentation is titled "UGC: The Next Sharp Stick?" UGC is, of course, User Generated Content. And the "Next Sharp Stick?" is a reference to John Hodgman's humorous one-act play "Fire: The Next Sharp Stick?" The play ends with the fire-promoting caveman being killed, of course.

What can I say? They didn't ask me on to be the conservative straight man.
