
Wednesday, January 02, 2013

OCLC Top 50

OCLC recently released a file of 1.2 million metadata records for the most widely held items in its catalog. These are all items with 250 library holdings or more. I created a list on WorldCat of the top 50, mostly out of curiosity. I was quite surprised at the results, however.

Here's how it breaks down:
  • 16 periodicals, with Time and Newsweek being numbers 1 and 2, respectively
  • 29 kid and YA books, four of which (ranking very high even in this small list) are from the Diary of a Wimpy Kid series
  • 5 adult books
The five adult books are:
  1. McCullough, D. G. (1992). Truman. New York: Simon & Schuster. 
  2. Brown, D. (2003). The Da Vinci code: A novel. New York: Doubleday.
  3. Johnson, S. (1998). Who moved my cheese?: An a-mazing way to deal with change in your work and in your life. New York: Putnam. 
  4. Haley, A. (1976). Roots. Garden City, N.Y: Doubleday.  
  5. Peters, T. J., & Waterman, R. H. (1982). In search of excellence: Lessons from America's best-run companies. New York: Harper & Row
This small set gives me many ideas of things to investigate in the full set. First, the monographs in this set all have recent dates, with the oldest being 1976, and most after 2000.
I am hoping to graph the full set by date. What I expect is that the items will be overwhelmingly recent publications because libraries tend to hold what people read, and my guess is that readers are mainly reading new books. Also, libraries buy from the set of things that are in print, so even if they are buying a so-called classic (as they do every time yet another movie is made of Pride and Prejudice) they are buying a current edition which will have a recent date.

The next obvious bit of information would be correlation between holdings and date, which I expect to be high for the very reasons given above.

The overall distribution of holdings is unsurprising, starting high (at almost 7000 holdings), dropping off dramatically, and creating a long tail. (I had managed to coax a chart out of ooCalc but it crashed before I captured it. I am now studying how to deal with large files and visualization. Advice gladly received.) Of course, the tail would be very, very long if you could chart the entire WorldCat database. (Anyone know how many items in WC are held by only one library? I can't find that in the available WC stats.)
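One way to get at the date distribution and the holdings/date correlation without loading the whole file into a spreadsheet is to stream it and keep only running totals. The sketch below assumes the records have first been reduced to a two-column CSV of publication year and holdings count; that layout and the function name `summarize` are hypothetical, not the format OCLC actually released, and the sample rows are made up.

```python
import csv
import io
import math
from collections import Counter


def summarize(rows):
    """Stream (year, holdings) rows; return per-year counts and Pearson's r."""
    per_year = Counter()
    n = sum_x = sum_y = sum_xx = sum_yy = sum_xy = 0
    for year_text, holdings_text in rows:
        try:
            year, holdings = int(year_text), int(holdings_text)
        except ValueError:
            continue  # skip header lines and records without a usable date
        per_year[year] += 1
        n += 1
        sum_x += year
        sum_y += holdings
        sum_xx += year * year
        sum_yy += holdings * holdings
        sum_xy += year * holdings
    # Pearson correlation from running sums, so memory use stays constant
    # apart from the per-year counter.
    cov = n * sum_xy - sum_x * sum_y
    var_x = n * sum_xx - sum_x ** 2
    var_y = n * sum_yy - sum_y ** 2
    r = cov / math.sqrt(var_x * var_y) if var_x > 0 and var_y > 0 else None
    return per_year, r


# Tiny made-up sample standing in for the real file (assumed layout).
sample = io.StringIO(
    "year,holdings\n1976,300\n1992,350\n2003,400\n2003,450\nn.d.,500\n"
)
per_year, r = summarize(csv.reader(sample))
```

For a real file, the same function could be fed `csv.reader(open(path, newline=""))`, and `per_year` is small enough to chart in any tool, since it has one entry per publication year rather than one per record.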

I think it would be interesting to be able to analyze library holdings in correlation with the FRBR-ization that OCLC has done. In fact, I would really like to see the top 1% (or 0.5%) of FRBR-ized items. Related to FRBR, I am mainly wondering if we can estimate how frequently FRBR might fulfill its promise of saving the time of the cataloger. But that's for another day.

Thursday, April 19, 2012

Clarification from Sweden on OCLC negotiations

The National Library of Sweden has issued a short blog post clarifying their objections to the WorldCat Rights and Responsibilities (WCRR) policy. The inability of the two parties to reconcile these issues led the library to break off contract negotiations with OCLC. I find the Library's objections to be logical and undeniable:

1. The relationship with OCLC around record use is asymmetrical, with OCLC having the right to do whatever it wishes with the records while library use is restricted by the policy.

2. The policy actually requires libraries to favor WorldCat over other services, and thus hinder competition, which is not appropriate for a national library. [kc: This may even be illegal for publicly funded libraries in the US.]

3. Open data is of strategic importance for libraries.

They conclude with:

To this end we urge OCLC to allow members to treat downloaded records as their own, including releasing them under any open license such as CC0. We feel that this would strengthen rather than diminish OCLCs strong status as a service provider to the library community.

Thursday, December 22, 2011

National Library of Sweden and OCLC fail to agree

In a blog post entitled "No deal with OCLC" the National Library of Sweden has announced that after five years they have ended negotiations with OCLC to become participants in WorldCat. The point of difference was over the OCLC record use policy. Sweden has declared the bibliographic data in the Swedish National Catalog, Libris, to be open for use without constraints.
"A fundamental condition for the entire Libris collaboration is voluntary participation. Libraries that catalogue in Libris can take out all their bibliographic records and incorporate them instead into another system, or use them in anyway the library finds suitable." (from the blog post)
This is an example of the down-stream constraint issues that we discussed while working on the Open Bibliography Principles for the Open Knowledge Foundation. While open data may appear to be primarily an ideological stance, it in fact has real practical implications. A bibliographic database is made up of records and data elements that can have uses in many contexts. In addition, the same bibliographic data may exist in numerous databases managed by members of entirely different communities. Someone may wish to create a new database or service using data coming from a variety of sources. At times someone will want to use only portions of records and may mix and match individual data elements from different sources. Any kind of constraints on use of the data, including something as seemingly innocuous as allowing all non-commercial use, require the user of the data to keep track of the source of each record or data element. Practically, this means that an application using the mix of data is effectively constrained by the strictest contract in the mix.
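To make the bookkeeping concrete, here is a minimal sketch of what tracking the source of each data element implies. Everything in it is hypothetical: the `Element` class, the source names, and especially the single least-to-most-restrictive scale, since real licenses and contracts do not always line up on one axis.

```python
from dataclasses import dataclass

# Hypothetical ordering from least to most restrictive (an assumption made
# for illustration; actual terms rarely reduce to a single scale).
RESTRICTIVENESS = {"open": 0, "attribution": 1, "non-commercial": 2, "contract": 3}


@dataclass(frozen=True)
class Element:
    field: str   # e.g. "title"
    value: str
    source: str  # where this element came from
    terms: str   # key into RESTRICTIVENESS


def effective_terms(elements):
    """Return the most restrictive terms among the elements of a merged record."""
    return max((e.terms for e in elements), key=RESTRICTIVENESS.__getitem__)


# A record assembled from three made-up sources.
record = [
    Element("title", "Roots", "Source A", "open"),
    Element("author", "Haley, A.", "Source B", "non-commercial"),
    Element("date", "1976", "Source C", "open"),
]
print(effective_terms(record))      # non-commercial
print(effective_terms(record[:1]))  # open
```

One non-commercial element is enough to put the whole merged record under non-commercial terms, and the application has to carry the `source` and `terms` of every element just to know that. With fully open data, none of this tracking is needed.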

The Swedish library was concerned that their participating libraries would be hindered in their future systems and activities if any limitations were placed on data use. In addition, they would not be able to share their data with the Europeana project, as Europeana requires that the data contributed be open precisely because of the complications of managing hundreds or thousands of different sources with different obligations.

As many of us pointed out during the discussions about the OCLC record use policy, the practical problems of controlling down-stream use of data are insurmountable. Some people argue that the record use policy hasn't affected libraries using WorldCat, but my experience is that the policy has a chilling effect on some libraries, and is making it more difficult for libraries to embrace the linked open data model. The Swedish National Library had to make the difficult decision between WorldCat services and future capabilities. It was undoubtedly a hard decision, but it is admirable that the National Library did not give up what it saw as important rights for its users.

Sunday, February 06, 2011

Skyriver Replies

Following up on these early stages of what will probably be an interminable legal case (it's easy to understand why one should avoid going to court whenever possible), SkyRiver has replied to OCLC's Motion to Dismiss.[1] [2] This is the first document I have seen that, to me, clearly lays out SkyRiver's basic contentions. Note that the major part of this document is the usual lawyerly recitation of cases supporting one statement or the other, and I have no idea what the legal arguments mean or whether they are convincing or not. But here are SkyRiver's primary facts as this document lays them out:

1. OCLC has monopolies in the US academic library market
"OCLC is monopolizing three product or service markets—bibliographic data of libraries’ holdings; cataloging service; and interlibrary lending service (ILL). OCLC is attempting to monopolize a fourth service market—integrated library systems (ILS)." p. 1
2. OCLC has used those monopoly positions to prevent competition
"Since at least 1987, OCLC has demanded that its member libraries agree to terms of membership that prohibit sharing the metadata of their own library holdings contributed to OCLC’s bibliographic database known as WorldCat with any for-profit firms for commercial use and require member libraries to use OCLC’s services. OCLC has imposed these membership terms to prevent the development of competing bibliographic databases, cataloging services or ILL services by erecting barriers to entry in these three markets. OCLC is also using its monopoly power in these three markets in its attempt to monopolize the ILS market." p.1
3. OCLC has targeted SkyRiver's business by using punitive pricing for libraries that use SkyRiver's cataloging services
"OCLC’s conduct has injured SkyRiver by deterring libraries from using its service, and has injured libraries that are using SkyRiver to reduce costs by preventing those libraries from uploading their new records into WorldCat at the price charged to everyone except SkyRiver users." p. 2
Beyond that the arguments become more complex. In particular there is the issue of the 20+ years that OCLC has been building up WorldCat under a policy that has prohibited (according to the response, p. 4) libraries from sharing their cataloging data with for-profit entities. With no other non-profit entity providing cataloging services to US academic libraries, the records are essentially locked up in WorldCat and no one else can enter the market.

This brings me to a point that I got wrong in a previous post, which is that SkyRiver is asking for access to the WorldCat database. The argument there, if I read it correctly, is that WorldCat is the only major source of academic library holdings that can be used for an effective ILL service. WorldCat is the result of monopoly practices. To allow for competition, WorldCat (i.e. bibliographic data and holdings) should be made available for a reasonable price to competing ILL providers. While this seems jarring at first, the more I think about it the more sense it makes.

What the response does not say explicitly, and perhaps it would be irrelevant in a legal case, is that one could look on WorldCat as a shared community resource, not the property of OCLC. In fact, OCLC uses this kind of argument in its record use policy, but somehow reaches the conclusion that WorldCat should not be used to foster non-OCLC library services. It seems easy to make the opposite argument, which would be that WorldCat could be the basis for a wide range of services that would benefit libraries, even if they do not come from OCLC. Imagine if OCLC were to set non-discriminatory pricing for use of WorldCat and anyone could make use of the WorldCat data. There could be a "share-alike" clause that would require those users to return pertinent information to the bibliographic collective. WorldCat would grow, and the range of products and services available to libraries would grow. This seems like a GOOD THING.

I realize it may not be easy to do the analysis that would lead to pricing that both fosters sharing and makes it possible even for small businesses* to arise in the library market. It should be possible, given today's technology, to do this efficiently but we know very little about the cost structure of WorldCat. It is clear that there are many activities relating to the care and management of that database, all intertwined with OCLC services and valuable research projects, as well as linked deeply into tens of thousands of library systems around the world. Should the court require OCLC to open WorldCat for use, we need to see a transition that is non-destructive to the library ecology.

* The reason I emphasize small businesses is that I believe that smaller, more nimble vendors could exist to serve the needs of specialized and smaller libraries which are not OCLC members at this time. I see the potential to widen the community of sharing, even to include more non-library institutions and businesses. Another GOOD THING, IMO.

Tuesday, December 14, 2010

OCLC Motion to Dismiss, Pt II

Continuing on...

Rights

Here's a somewhat extended quote from the Motion that quotes the original complaint:
"At other points in the Complaint, without addressing the text of the records use policy, Plaintiffs characterize the policy as placing broad restriction on a library's use of its own records. ([Complaint] paras. 34-36) However, these conclusory allegations are belied by the actual terms of the records use policy pled above.. For example, Plaintiffs claim that 'a member library may not transfer or share records of its own holdings with commercial firms' ([complaint] para 35), but the records use policy states no such thing. Throughout these allegations, moreover, Plaintiffs confuse and obscure the terms 'OCLC records' and 'library records.' In reality, the situation is simple: OCLC does not prohibit a library from sharing its original cataloging records with whomever it pleases; it does, consistent with the fact that the WorldCat database is copyright, claim a legal right to the unique identifier information used to link and make usable records in WorldCat." (Motion, pp 7-8)
"Again, at most, the Complaint pleads only that libraries cannot share OCLC's records, not that they cannot share the records they themselves created." (Motion, p. 14)
This is a very interesting set of statements. First, it plays with the ambiguity in talking about "library records," denying that libraries cannot convey records of their holdings, as stated in the Complaint, then stating that they can share their original cataloging records, which is not what most in the library world would consider equivalent to "library holdings." What it comes down to is the ownership of the records in the library catalogs that represent the holdings of the library. By "the holdings of the library" I understand not just some holdings, but either all of the holdings or some useful set of those holdings. The set of records that were originally cataloged by the library is a somewhat random set, and not useful as "library holdings." OCLC claims ownership in all records in a library's catalog that were not created as original cataloging by that library. Although this is a distinction, it does not relate to any particular functionality or to any useful library project involving holdings. It's useless nonsense, is what it is, nitpicky, and proof that OCLC was boxed into a corner as it tried to claim ownership over the millions of records created by libraries around the world.

OCLC also states in the second quote above that those records in the library data are "OCLC's records" and are not records that the libraries created. Here, "created" is a key verb. Any library that has done significant modification and upgrading to a record can probably claim at least an amount of co-creation with other libraries. The claim that those records belong to OCLC is an insult to the libraries that have put so much effort into the shared pool of bibliographic data. Of course, OCLC would counter that the libraries and OCLC are one and the same. The unilateral actions of OCLC around the record use policy definitively shattered that view.

Equally interesting is the claim of copyright on the database, a claim that has not been challenged and that might not survive a challenge. A database of bibliographic data may just be seen as a compilation of facts, essentially sweat of the brow rather than a creative output. Add to that the fact that much of the sweat was not OCLC's but was on the part of thousands of libraries, and the copyright claim looks thin. Ditto the claim to the OCLC number, which is purely a sequential number assigned to records as they enter the system. The claim that the OCLC identifier makes OCLC records usable is not defensible, IMO, in that every database assigns numbers to things as part of the mechanical database management process. There's nothing new or creative about the fact that OCLC records have OCLC database numbers.

Remember, though, that these statements are not meant for you and me; they are addressed to a court that may have very little knowledge in these matters. Obfuscation of the facts is undoubtedly part of the trial process, and on the part of all parties involved. Unfortunately, OCLC's motion goes beyond obfuscation -- it gets nasty.

Sarcasm and Nastiness

I've only read the legal documents for a few cases that I'm particularly interested in, so my experience here is limited. However, I would assume that a court case would best be won on cleverness, wily strategies and the ability to outwit one's opponent. In this as in other professional and public endeavors, I would expect the participants to affect a tone of detached politeness, even while skewering their rival. The OCLC motion plummets into sarcasm and nastiness. Here are some quotes:
"...Plaintiffs have thrown a plethora of allegations of OCLC's purportedly anticompetitive actions into the Complain to see if any stick..." (Motion, pp. 1-2)
"While OCLC denies that either of these libraries has suffered as the result of anything other than purchasing the Plaintiff's inferior cataloging software..." (Motion, p. 17)

"... vigorous competition against a company offering less expensive, but inferior products, is perfectly lawful." (Motion, p. 1)
"Nevertheless, what is sauce for the goose is sauce for the gander -- having pled a fiction that undercuts the existence of any claims they can pursue, Plaintiffs cannot claim to have been injured..." (Motion, p. 4, footnote)
"Nothing in the antitrust laws requires OCLC to subsidize SkyRiver's inferior product by setting its pricing for registering holdings into WorldCat as low as possible." (Motion, p. 28)
I find these statements to be embarrassingly unprofessional in nature, although for all I know this is the norm in legal arguments.

Separate Realities

I suppose that one of the main skills for legal argumentation is the ability to present "facts" in ways that benefit your client, regardless of the facts. (If I were a judge and had to listen to this stuff, I'm sure I'd be driven to homicide.) Here are some examples from the motion to dismiss:

1. The named libraries, Michigan State and Cal State Long Beach, were not harmed by OCLC, they simply declined to purchase OCLC's record upload service. This is cited as proof that they were not coerced into making a purchase (which appears to be one of the antitrust offenses). (p. 29) There is no mention that the libraries could not afford the price that OCLC offered, that the price changed without warning, etc.

2. WorldCat Local is not a competitor to ILS systems because it exists in addition to the ILS system. The Motion of course completely fails to connect WC Local, its attempt to limit use of the bibliographic data, and the upcoming "in the cloud" library systems platform. Are they worried that it might actually look like improper use of the WorldCat database?

3. SkyRiver does have bibliographic records, so OCLC cannot be accused of having a monopoly on bibliographic records. (As if any bunch of bibliographic records will do.) Elsewhere in the document they boast of having the largest bibliographic database. Are we back to the Goose and the Gander?

_____

These are just a few of the topics in the Motion, and just the ones that I found most interesting. They may not even be the most relevant topics relating to the lawsuit. I suggest that you read the Motion and other documents for yourself.

OCLC Motion to Dismiss, Pt I

OCLC has filed a motion to dismiss in the anti-trust lawsuit brought by SkyRiver/III. I presume that this is Standard Operating Procedure in cases of this type. As someone who is not versed in the complexities of antitrust law, I have no idea if OCLC makes a good case in its motion. My impression is that the OCLC lawyers are quite adept, and that bodes well for OCLC in the case.

I will comment on some interesting text and subtext of the motion. Since this will get long, here is a quick summary of what follows:

  • The motion states that SkyRiver has so far offered little proof of harm due to OCLC's business practices.
  • The motion may play on the court's ignorance of the library world and of OCLC's definitions.
  • OCLC makes some interesting claims to rights.
  • The motion makes claims that twist the words of SkyRiver's complaint.
  • The motion contains some unfortunate use of sarcasm and nastiness.
  • The motion undermines some previous OCLC claims as to the force of the Record Use policy.

Little Proof

The motion claims that the SkyRiver complaint contains few hard facts that could be used to back up the anti-trust claims. (Although I have no idea how detailed such a complaint is supposed to be.) It doesn't explain the library market and OCLC's role in it. What I find particularly lacking is that there is no comparison of pricing for record uploads between the libraries that moved to SkyRiver for cataloging and other libraries that upload records to OCLC. (According to the 2009 annual report, only 12% of records added to WorldCat were added via cataloging on OCLC; the rest were batch loaded.)

Ignorance and Definitions

OCLC plays heavily on the confusion between WorldCat, the database, and the records in libraries' catalogs. This is not an easy concept to grasp, and it is not explained well in the SkyRiver complaint. Wherever SkyRiver's complaint refers to "library records" OCLC counters using "WorldCat" in its place. It makes a huge difference to be talking about the records in a library's catalog vs. the entire WorldCat database. OCLC claims that SkyRiver is demanding that OCLC make all of WorldCat available for free to competitors. What is actually said is:
"Library records should be freely and openly available for use and re-use either in the public domain or by reasonable means of access for all, including for-profit library services firms." (Complaint, para. 76)

But OCLC re-words this in its response as:
"... (a) library records should be free, regardless of OCLC's inestment in aggregating, normalizing, enhancing, maintaing(sic), and delivering services based on them..." (Motion, p. 10)
OCLC also says:
"Plaintiffs pled, at most, only that libraries cannot share OCLC records, not that they are prevented from sharing records they created." (Motion, p. 21)
What is clear here, as it is throughout the motion document, is that SkyRiver is talking about the records that are in library catalogs, and OCLC is talking about "OCLC" or "WorldCat" records. By referring to the records in library catalogs as "OCLC" records, OCLC thus claims ownership to those records. In the former meaning, the libraries are prevented from making use of the records in their catalogs as they wish; in the latter, OCLC is the owner of a database and claims are being made against that database. Unless these definitions are cleared up, the two parties are just talking past each other, and no member of the court is going to make sense of it all. That, of course, would probably be to OCLC's advantage.

Record Use Policy

The original complaint cites the OCLC record use policy as a means by which OCLC maintains
"strict control over its members' access and use of the WorldCat database...". (Complaint, para. 33)
OCLC's motion first complains that SkyRiver did not attach a copy of the Policy to its original filing (though it did attach one to its response to the Motion to Transfer). This is irrelevant to the case, I believe, and therefore is a bit of sniping at SkyRiver's lawyers, hinting that they aren't doing a good job. Anyway, here's how OCLC replies to that:
"The nature of these documents is not pled: it is not claimed that these documents are anything other than 'guidelines' OCLC publishes or that OCLC has ever used these documents to prevent a library from providing its catalog records to Plaintiffs or any other entity." (Motion, p. 7)
There's more, but let's first examine this statement. During the big brouhaha about the policy, Karen Calhoun published "Notes on OCLC's updated Record Use Policy" on the OCLC blog, and stated:
"The updated policy is a legal document. Being a player on the Web, working on behalf of libraries, requires that the policy be a legal document."
That is of course the opposite of what is said in the motion.
(See comment below by Jennifer Younger: "The new 2010 policy is correctly characterized in OCLC's Motion to Dismiss as a code of good practice to guide members' choices about how they share their copies of WorldCat records.")

What is sad, however, is the statement, true as far as I know, that OCLC has never used these documents to prevent libraries from sharing their records. It hasn't had to, because the mere threat has been enough to prevent libraries from acting. The libraries that have released their records have done so unscathed, but they are few. There are of course two ways to interpret this: either libraries are afraid to release their records, fearing retribution, or libraries agree with OCLC's argument that WorldCat would be endangered should library records be openly shared.

I'll pause here and take up again shortly.

Monday, December 06, 2010

Response to JPW


Note: John Price Wilkin of Michigan wrote a post on the Open Knowledge Foundation blog that is very critical of the library linked data movement and the creation of numerous disjoint files of bib data in linked data formats. I admit that it isn't clear to me what he thinks should happen, but it seems to be something like this photo, which I took at the Online 2010 exhibit hall. This is OCLC's booth.

A separate cloud for libraries. Totally the wrong idea.

I must say that I see things quite differently from JPW. Although I agree that a bunch of static bibliographic files do not open library linked data make, my view is:

1) Each file represents a person or group who got interested in transforming library data and went through the learning process of actually doing it. Therefore each file is a contribution to our collective knowledge about linked data. When we add these files to heterogeneous stores like Open Library or Freebase, we exercise that knowledge.

2) These files are the fodder for further experimentation with mixing library data and non-library data, which to me is one of the main points of linked library data. We are in the "training wheels" stage of this change, and like training wheels these early files may end up being discarded when we finally learn to ride. I see no harm in that.

3) This experimentation is taking place primarily outside of the US in places where the OCLC record use policy does not apply. The British Library, the National Library of Sweden, soon the Bibliotheque Nationale, and a handful of German libraries are at the forefront of this. If you cannot release your bibliographic data openly, you cannot participate in the linked data movement.

4) I do think that we will have library systems that make use of a different data format to the one we have today, but those are not the same as linked data, and are definitely not the linked open data that is the main focus of the linked data activity. How we manage our data for ourselves may well be different from how we share it with the world. We do need a well-ordered library data universe where we do our bibliographic work. That should exist in parallel with open sharing that reaches beyond the library cataloging community.

Friday, October 29, 2010

SkyRiver/OCLC suit moved to Ohio court

The judge in the federal district court in San Francisco (Northern District of California) has agreed to OCLC's request to transfer the proceedings in the SkyRiver/OCLC suit to the Southern District of Ohio. In an impressively thoughtful 10-page document, the judge weighs the various arguments by the parties relating to the request to transfer. In the end, the decision was based on two things:
  1. A majority of the potential witnesses that are neither SkyRiver nor OCLC employees (e.g. libraries that can give evidence) are closer to Ohio than to California.
  2. In terms of documentation as evidence, most of this documentation will need to come out of OCLC's file cabinets, since the suit refers to OCLC business practices over a significant period of time.
I was hoping to be able to sit in on some of the action in the San Francisco court, although more experienced folks have told me that it could be deadly dull. Now we need to find possible bloggers in the Ohio area to cover this. Any volunteers?

Tuesday, August 17, 2010

OCLC, SkyRiver, and the slow arm of the law

I suppose one could be gratified to learn that there are institutions that move at least as slowly as libraries, but I'm not happy about the delayed gratification that entails, nor the fact that it means that we will have to try to move forward as a community without having answers for quite a while.

The recent documents that have been filed with the court in the SkyRiver/OCLC case have the following actions and dates in them:

First, OCLC will request that the suit be moved from Northern California to the Southern District of Ohio. Just to cover that motion will take us through October, 2010.

If that does not derail the current calendar (and I presume it could cause this date to be moved back), then the Case Management Conference will be on January 14, 2011 in the San Francisco courtroom.

No, I have no idea what a "case management conference" is but it sounds like something preliminary. I would love it if someone with a legal background could offer some occasional commentary on what some of these steps mean. Right now I presume that all of this is par for the course for lawsuits of this nature, but never having observed such a case before, I really have no idea. Anyone know some law librarians who can chime in?

Friday, July 30, 2010

SkyRiver/III v. OCLC, Part II

In my previous post I covered what I saw as the stronger arguments made in the complaint. In this post I will cover points that either puzzled me or seemed to be off the mark.

The OCLC Number
The complaint states that
"This OCLC number has permitted OCLC to police its members to ensure that their records are not shared with unauthorized users." (p. 5)
Since anyone can add or delete an OCLC number from a MARC record in their own database, I don't see how this could be the case. I would like to see how this claim is supported.

The ILS Market
"OCLC is rapidly gaining market share in the ILS market by leveraging its monopoly power over its bibliographic database... " (p. 6)
Can they supply the figures to support this rapid gain in market share? They do state the number of WorldCat Local installations ("624", p. 22), but WCL is not an ILS (even though it may eventually become the basis for one).

Academic Libraries only
The complaint appears to only address academic libraries. (p.7) This could be because the evidence that they claim to have only relates to academic libraries, but both OCLC and III serve many public libraries. The complaint also states that:
"The relevant geographic market ... is the United States, because academic libraries cannot turn to suppliers of these products in other countries to meet their needs." (p. 10)
This may just be poorly worded, but if it intends to mean that there are no extra-US companies providing the service then it should have said so. The way it is worded it sounds like there are prohibitions on using non-US suppliers that pertain to academic libraries... could that be so?

New Products
In numerous places in the document, the complaint states that OCLC members are required to participate in product development as part of their membership obligation:
"Membership also obligates libraries to assist OCLC in developing new products and services to compete with for-profit firms." (p. 5)

"OCLC developed, and is still developing, WorldCat Local and WorldCat Local "quick start" through pilot programs in which many of its member university libraries have agreed to participate, without compensation, purportedly to meet the requirements of their membership in OCLC." (p. 20)
I have never heard of this requirement, and would be interested in hearing from institutions who did find themselves essentially forced to participate in pilots as part of their membership.

Acquisition of Other Companies
The complaint states that over time OCLC has expanded by acquiring 19 library industry companies, 14 of which were for-profit. (They fail to mention that at least some of those companies magically became non-profit when acquired by OCLC, cf. netLibrary.) The remainder of the sentence reads:
"... either to obtain software and other products that enable it to offer library services in competition with the remaining for-profit providers or simply to eliminate products from the marketplace." (p. 23)
These are strong words that the complainants should be prepared to prove. I'm not saying that it isn't true. However, in the few cases of which I am aware (WLN, netLibrary, RLG) the acquired company was in financial free-fall and OCLC's purchase was viewed at the time as a rescue that benefited the library community as a whole. In the case of netLibrary, OCLC had agreed to be the escrow agent for the ebooks purchased by libraries, to be called upon should netLibrary go out of business. In that case, OCLC was pretty much pre-obligated to rescue netLibrary or provide some service of its own. (I don't know what the monetary arrangements of the escrow were.) As for WLN and RLG, it's hard to know what would have happened if OCLC hadn't purchased those agencies. I suspect that the libraries using those services would have had to become OCLC members in any case in order to continue functioning as libraries. This only covers three of the 19, and may or may not be representative of OCLC's acquisitions.

[Partial list of acquisitions, gleaned from press releases and annual report:
Dewey Decimal System (1988), Information Dimensions (1993) [sold in 1997], Public Affairs Information Service/PAIS (1999), WLN (1998), netLibrary (2002, with MetaText eTextbook Division, a for-profit subsidiary), Openly Informatics (2006, OpenURL services), RLG (2006), EZproxy (2008), Amlib (2008, Australian web-based ILS), PICA (1997), Fretwell-Downing (2005), Sisis Information Systems (2005). Note: these may not be the same companies referred to in the complaint. This is my cobbled-together list, and should only be seen as such.]

Head-hunting
Another strange statement is about OCLC's use of head-hunters to hire staff away from other companies:
"In addition to acquiring for-profit companies, OCLC also uses headhunters to identify and recruit employees from for-profit firms. Plaintiffs are informed and believe and based thereon allege that OCLC is using its tax-free dollars to recruit employees of for-profit vendors of library services to eliminate competition and extend OCLC's monopoly to the ILS market." (p. 26)
There's obviously a story here, but I don't know what it is. Using headhunters is standard industry practice for a well-heeled high-tech organization. Has OCLC engaged in predatory hiring behavior? And can that allegation be proved?

Access to WorldCat
The strangest thing in this complaint is the repeated insistence that OCLC should give access to the WorldCat database to potential competitors.
"...As a result of OCLC's conduct... Innovative [and SkyRiver, in another paragraph] has suffered and will continue to suffer irreparable harm ... unless this Court orders defendant OCLC to provide access to the WorldCat database to Innovative and other competitors, on such terms as are just and reasonable." (p. 31; same but ref. to SkyRiver p. 29)
This argument comes as a surprise to me. I had always assumed that the goal was to allow libraries to provide their bibliographic records freely to anyone they wished, including for-profit companies. I see that as very different from giving competitors direct access to WorldCat. It seems to me that the former goal would be very easy to argue, but direct access to OCLC's own database seems much more difficult to justify. I'm quite puzzled by this, unless I am drawing the wrong conclusion about what it means.




There's a part of me that wants this to go to court so that we can get answers to these intriguing questions. There's another part of me that sees the possibility that this could be a lose-lose proposition. Given the overall stress in the library community, both monetary and technological, in-fighting looks to be the worst thing we could do to ourselves.

There is no doubt that a large, union catalog of library holdings is key to providing the kind of web-scale (sorry, but I couldn't think of another word) services that libraries absolutely must provide today. That said, that database does not have to be WorldCat, although WorldCat performs that function at this moment in time. The main thing is that we must have a union/universal catalog that serves libraries and their users. It shouldn't be a limited access asset that is being fought over for market share. I don't have a solution to offer, but it's clear to me that the solution is: FREE THE DATA.

SkyRiver/III v. OCLC: the lawsuit

I have now had a chance to read the legal complaint that SkyRiver/III have filed against OCLC. Marshall Breeding does a good overview of the complaint in a Library Journal piece. I'm going to focus on highlights and lowlights, what I think works and what I think doesn't. The caveat is that I do not know enough about anti-trust law to understand whether the suit is convincing on that score. So what follows is my reading of the complaint today, and I welcome corrections, other views, and any commentary.

Smoking Guns

The complaint has what I see as two smoking guns:
  • the use of differential pricing to specifically prevent OCLC members from becoming SkyRiver customers
  • the claim that OCLC paid cash "inducements" to university officials and paid for "luxury trips to expensive resorts to obtain their commitments to promote OCLC products..." (p. 21)

Both of these are extremely damaging to OCLC if they are true. The latter is possibly not illegal on OCLC's part, although it may have been illegal on the part of the officials who accepted such favors in exchange for a contract with OCLC. This, however, should come to the attention of OCLC's members, who, if this is proven to be true, will undoubtedly find this activity unacceptable for their organization.

The arguments about differential pricing are less sensational but could be equally damaging. Differential pricing is a normal practice in business, often based on concrete aspects like volume of trade or length of contract. Whether or not it is normal for a non-profit I don't know. Member libraries have accepted that each one forges a contract with OCLC which is considered confidential (although I suspect that librarians informally discuss with each other what they pay to OCLC). SkyRiver/III claims to have proof that OCLC has used this differential pricing to punish libraries that have moved their cataloging activity from OCLC to SkyRiver. (The MSU case, as one example.) They also claim to have proof that OCLC lowered cataloging charges for some libraries that were intending to move to SkyRiver, and thus kept them as customers. (See pp. 14-19) This alone may not be illegal, but in this complaint it is described as an unfair use of OCLC's current monopoly position on cataloging services.

[Note: There appear to be more libraries that batch load their records into OCLC than ones that catalog on OCLC. In the 2008/2009 annual report, OCLC states that it has 11,810 member libraries, and 72,035 participating libraries. (I'm not sure of the difference.) In that same time frame, "the number of items cataloged by batch loading increased to 241.8 million, up from 212.1 the previous year...." They also state (p. 2) that the total of cataloged records plus batch loaded records was 278.3 million, meaning that batch loading accounted for 87% of the records added to OCLC that fiscal year.]
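For anyone who wants to check the 87% figure, here is a minimal sketch of the arithmetic. The two numbers are the ones quoted from the annual report above; the variable names are my own, and the "cataloged online" figure is simply my subtraction, not something OCLC reports in this form.

```python
# Figures in millions of records, as quoted from OCLC's 2008/2009 annual report.
batch_loaded = 241.8   # items added by batch loading
total_added = 278.3    # cataloged records plus batch loaded records

# Share of the year's additions that arrived by batch load.
batch_share = batch_loaded / total_added

# Remainder, assumed here to be records cataloged directly on OCLC.
online_cataloged = total_added - batch_loaded

print(f"Batch loading share: {batch_share:.0%}")
print(f"Cataloged online (by subtraction): {online_cataloged:.1f} million")
```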

Solid Arguments

The complaint has a number of solid arguments about OCLC's behavior that may be significant should this go to court. Briefly, these are:

OCLC does not act like a non-profit or a cooperative. Throughout the document the complaint uses terms like "purported member-based cooperative" when referring to OCLC. In particular, it says:
"Plaintiffs are informed and believe and based thereon allege that OCLC is not a true cooperative in that its members do not share its revenues or control its management, operations or policies. A majority of its Board of Trustees is elected by the Board itself. ... Rather than operating with transparency as a cooperative would be expected to do, OCLC charges different prices to its members for the same services and conceals those differences from its members." (p. 5)

The complaint also speaks to OCLC's revenue:
"An insignificant percentage of OCLC's revenues come from membership, grants or charitable contributions." (p. 26)
This is followed by a table of revenues, expenses and corporate equity (in 9-digit figures).

It isn't clear to me that this is a convincing argument. Non-profits are not required to obtain their revenue through contributions, and there are probably many non-profits that receive considerable income from services. Perhaps OCLC's "mix" of revenues is off the normal curve? That's data that would be interesting to see. However, the degree of competitive behavior against for-profit companies does seem to belie the nonprofit status of the organization.

OCLC competes directly with for-profit companies. This argument is for a large part about OCLC's entry into the ILS market with its web-based services, but also relates to its inter-library loan (ILL) services, which compete with III's ILL. The main thrust, though, is that OCLC has announced that it will go into direct competition with the primary services of commercial vendors who serve the library market with library systems. The argument is that as a non-profit OCLC has an unfair advantage because it does not pay the federal taxes that are required of its for-profit competitors. Repeatedly the complaint refers to OCLC's "tax-free profits." (see p. 2, 9, 21)

OCLC is a monopoly, and is taking advantage of its monopoly position. I believe that the unfair use of a monopoly position is essential to the anti-trust aspect of this lawsuit. I also believe that this is a point that is hard to prove. To begin with, there is nothing illegal about having a monopoly position in a market if one has acquired that position with normal dealing. And some of the accusations in the complaint may not be anything other than regular business practices, such as providing some services for free (WorldCat Local quickstart, as an example) as a way to induce customers to buy into for-fee services, or to reward customers for their loyalty. The use of pricing to make it financially untenable for its own customers to contract for non-OCLC services is probably the most damaging argument in this area.

OCLC has used its position to avoid the public procurement process. As we know, most public institutions have to go through a cumbersome process in order to procure goods and services. This process is designed to make sure that public money is spent fairly and under controlled conditions that are designed to minimize corruption. The complaint claims that OCLC has obtained contracts for WorldCat Local with public institutions without going through that procurement process. (p. 20)

Trustees are also members. There is a claim of conflict of interest in the fact that high-level employees of OCLC member institutions also sit on OCLC's board. What isn't mentioned here, oddly enough, is that some of those members draw salaries from OCLC (in addition to the salaries received from their institutions -- see any recent IRS 990 form from OCLC, which lists salaried officers). The conflict of interest is that these same individuals may have decision-making roles in their institution for the purchase of library vendor services. "By agreeing to advance the interests and products of OCLC they are effectively excluding competitors." (p. 27) This may be an issue for OCLC, but it seems that it should also be an issue for the institutions that employ these folks.

Coming next: Some odd claims, and some misses

Thursday, July 29, 2010

SkyRiver Sues OCLC over Anti-Trust

(Full document now here! Thanks Marshall Breeding!)

The newly created competitor to OCLC's cataloging services, SkyRiver, is suing OCLC in federal court in San Francisco. (Press release, PDF) I have only seen the press release, so until someone figures out how to free up the actual legal document, what we know is:

SkyRiver is claiming that OCLC is attempting to "monopolize the markets for cataloging services, interlibrary lending, and bibliographic data, and attempting to monopolize the market for integrated library systems, by anticompetitive and exclusionary practices." The press release refers to OCLC's "tax-free profits," and claims that OCLC has used those profits to purchase 14 for-profit companies.

The press release quotes Leslie Straus, President of SkyRiver, as saying:
“In the process OCLC has punished its own members who have tried to seek out lower cost alternatives like SkyRiver.”
Which undoubtedly refers to the Michigan State issue, which I reported on here. In that case, OCLC appears to have quoted MSU an unusually large fee for uploading records to WorldCat after MSU began cataloging on SkyRiver instead of OCLC.

Undoubtedly, a good part of the concern here is over OCLC's plans to provide Web services that comprise the full functionality of an integrated library system (ILS), thus competing with current ILS vendors. You probably know that SkyRiver was started by Jerry Kline, owner of Innovative Interfaces. If OCLC successfully launches a full-service option for libraries, Innovative and other ILS vendors will suffer. As the representative of a major ILS company explained to me a few years ago, the library market is a zero-sum game: every time one vendor wins, others must lose, because the number of customers is not growing. The library market is a pie that can be divided into any number of slices, but the pie remains the same. This makes the rise of any one company a threat to all. In the commercial marketplace, the vendors compete over functionality and price. With its non-profit status OCLC has a distinct advantage: it doesn't pay federal income tax on the revenues it brings in. That said, given its size and the depth of its involvement in day-to-day library operations, it is plausible that even without its non-profit status OCLC would be a formidable competitor for ILS vendors.

I cannot comment on the charges of anti-trust because the press release does not give enough information. Hopefully we will get more details about this suit in the near future.

Sunday, July 04, 2010

Catching up: OCLC, GBS, LOD

Some short comments on recurring themes:

OCLC Record Use Policy


OCLC has finalized its record use policy. The content is substantially the same as it was in the previous draft, which I commented on. There is one important improvement, however: the text clarifies OCLC's claims to copyright.
While, on behalf of its members, OCLC claims copyright rights in WorldCat as a compilation, it does not claim copyright ownership of individual records.
Of course, claiming copyright and actually having the right are not the same thing, especially with databases. Here's what BitLaw says:
Databases as Compilations: Databases are generally protected by copyright law as compilations. Under the Copyright Act, a compilation is defined as a "collection and assembling of preexisting materials or of data that are selected in such a way that the resulting work as a whole constitutes an original work of authorship." 17 U.S.C. § 101.
Generally, carefully selected compilations may make the "original work of authorship" cut; I'm not convinced that a union catalog of library holdings does.

Google Books

We are still waiting to hear from the judge in the Google Books case. (Every time I write that I check to see if it hasn't been released in the last hour.) Meanwhile, GBS continues to function in Internet time. Google has many publishers on board with its partners program, enough that GBS is becoming a serious rival to Amazon. It has even announced that it will begin selling e-books. The opening screen is the exact opposite of the Google Search screen -- it loads up many dozens of book covers and requires significant scrolling to browse to the bottom. Google has added personalization options ("my library") and lets you create multiple "shelves" to organize your materials.

Google was first sued in 2005. Five years is a very long time where technology is concerned. In 2005 the ebook was considered dead; now with the Kindle and the iPad, ebooks are alive and well and everyone is trying to get into that game. In that time since 2005, Google has pretty much shown the publishing industry that they can benefit from the online presence that Google is providing. The settlement reads like it was written in another era, trying to solve problems that may not really be considered problems today. The only issue remaining is that of orphan works, and if we could do a decent analysis of copyright holdings, I suspect that the number of orphan works would not be all that large.

Linked Library Data


At ALA there was a one-day preconference on linked data, and a half-day un-conference attended by about 50 people. There are notes from the un-conference, which broke out barcamp-style into 6 groups for discussion.

The World Wide Web Consortium has an incubator group on linked library data. This group is tasked to spend one year figuring out how to jump-start the creation of linked data in the library world.

There are ongoing efforts at the Library of Congress to produce vocabularies, and of course the RDA vocabularies are available (and almost finalized). Ross Singer has announced that some of the MARC codes are available (I presume on his own site). FRBR is being defined in linked data form by IFLA.

We've got just about everything but ... linked data. I'm thrilled that things are moving forward, but frustrated that I still can't see usable results. Deep breath; patience.

Friday, April 09, 2010

OCLC record use policy

OCLC has issued a new draft of its record use policy for member comment. As others have remarked, while better worded and seemingly less draconian than the previous policy (the one that was withdrawn) the substance has not changed one iota. There are many things wrong with the policy itself, but the primary problem with it is not the text of the policy but the way that OCLC has chosen to define the problem it is trying to solve. Here are some of the issues I have with the approach:

1. Pushing the river
The central issue is that OCLC wants to limit downstream use of bibliographic data that is stored in WorldCat. This simply cannot be done. The same data is also stored in individual library catalogs, some union or consortial catalogs, and in bibliographic software used by many hundreds of thousands of researchers around the world. It also often closely resembles data created outside of OCLC's sphere, such as through publisher and retailer channels. Sharing of this data is absolutely necessary for the furtherance of intellectual pursuits and scientific progress, as well as the market for new and used items. Ironically, the policy would restrict use of the data by OCLC members without restricting its use by the multitude of non-members. It would be unacceptable even if it were workable, which it isn't.

2. One-sided
The policy has a section on member rights and responsibilities, but no such section on OCLC's rights and responsibilities. (Nope, I was wrong about that. The section does exist, I must have missed it.) The policy carries the assumption that, if anything, members are the problem, OCLC the solution, and gives no sense of the policy being the result of an agreement between the parties. OCLC can make unilateral decisions about record use, such as its agreement with Google, but members must ask permission of OCLC for many uses. There is nothing here that acknowledges that there could be a situation where the interests of a library and the interests of OCLC are in conflict, nor how that would be resolved. All-in-all, it reads as if the purpose of membership were to sustain OCLC (instead of the purpose of OCLC being to support libraries).

3. Transparency
OCLC, or one of OCLC's governing groups, will make decisions. Yet there are no criteria given for making these decisions, no timelines, no reporting back to members, no mechanism for feedback. Will members know how "their" WorldCat records are being used? Will they have any choice in the matter? Will there be a way to know what requests for use have come in to OCLC, which ones have been accepted, which turned down? If WorldCat is such a "community good" shouldn't the community at least have this information about the use of that good?

4. No options
In most agreements there is some give and take. If you do X, you will get Y. The OCLC record use policy does not give members options. An example of an option would be: if you do your cataloging on OCLC, ILL will cost you $X; if you do not do your cataloging on OCLC, uploading your records will cost you $Y and ILL will cost you $Z. With clear options, libraries can decide what is best for them in their particular situation. Without clear options libraries have no way to make rational decisions about their participation in OCLC. It's not a religion, it's a business relationship, and it should be treated like one.

5. Avoids facing the problem
The problem that OCLC is trying to fix arises, as far as I can tell, because of OCLC's particular mix of revenue and expenses. Most of the revenue comes in to OCLC from its cataloging service, so having members choose to catalog elsewhere is the problem. Exhorting members to keep their records in their databases so that others cannot create a large database of bibliographic data is not a solution to this problem. Large bibliographic databases do and will exist. If their existence is a threat to OCLC, then the jig is already up. Rather than stew about what others are doing with bibliographic data, OCLC needs to find a balance of income and expenses that meets the needs of its member libraries, and that might include making some hard decisions about OCLC services.

6. Ignores market forces
If someone can do it better, cheaper, more conveniently, why should libraries stick with OCLC as their vendor? For the purchase of materials or library systems or other services, libraries move to new vendors when they see advantages. With the economic downturn there is a scramble by libraries to cut costs wherever they can. No amount of loyalty to the "collective" can overcome the economic situation libraries find themselves in today. In a sense, OCLC seems to expect the libraries to act irrationally by sticking with the service even if something more economical comes along. Libraries obviously cannot afford to do this.

I cannot tell what steps OCLC's members can take at this point. The web site points to a community forum where people can post comments, but posting comments on the policy doesn't begin to solve the underlying problems as presented here. If I were a member, I think I would feel like a row boat hitching a ride behind the Titanic, hoping it will get me through the ice floes. Nothing is unsinkable, as we have unfortunately found out in the past.

Thursday, March 04, 2010

The Letters Keep Coming In

Today I received a copy of a letter written by Roman Kochan, Dean and Director of Library Services at the California State University, Long Beach (CSULB). It's the perfect day for this, because today is the national day of protest in support of education. This movement has blossomed (exploded?) over the deep cuts the California state legislature has made to the education budget in the state, cuts which are having a devastating effect on the CSU system, with the libraries extremely hard hit.

The letter is addressed to "Link+™ Member Libraries and ILL Partners." The subject line on Kochan's letter reads: Threat to CSULB Library's ILL Participation. He states that faced with budget cuts, not only this year but foreseeable for many years to come, CSULB decided to move to SkyRiver™ as their cataloging utility, with anticipated significant savings.

The next three paragraphs are worth quoting in their entirety:
"We notified OCLC of this decision, while at the same time advising them of the Library's intent to continue membership in OCLC, to continue to make use of OCLC interlibrary loan services, and to contribute records for our current and future acquisitions to OCLC for batch upload. OCLC's charge for batch upload was (until recently) posted on the OCLC website as 23¢ per record. That is the amount I referred to in my letter to the organization. I have subsequently learned that:
  • The price schedule for batch downloading [sic, read: uploading] that contained the 23¢ charge has suddenly and mysteriously disappeared from the OCLC website
  • Another academic library that chose to displace OCLC with SkyRiver reports that OCLC has quoted a revised charge for downloading their records that amounts to about $2.85 per record; it is a charge that they report would effectively (and one might not think coincidentally) offset the savings accrued from their change to SkyRiver."
The irony in all of this is that CSULB will still be able to have up-to-date ILL services using INN-Reach and Link+, the Innovative Interfaces (III) ILL service. It's ironic because SkyRiver was founded by Jerry Kline, the owner of III. Link+ is undoubtedly of smaller reach than OCLC's ILL services, but may in the long run grow if more III libraries move to SkyRiver.

Offsetting the cost of having a library move to another vendor may make some economic sense, but this is a matter that will need to get cleared up before other libraries move to SkyRiver thinking that they'll be able to upload their records to OCLC for $.23. MSU and CSULB were caught by surprise, which is very unfortunate.

Friday, February 26, 2010

Yet more OCLC

I have in hand a letter from Clifford H. Haka, Director of the Michigan State University Libraries, addressed to "ILL Partners" and dated February 24, 2010. The letter is a response to Larry Alford's document in my previous post. I will try to represent the facts he presents here as accurately as possible, and to distinguish those from my own opinions.

FACTS (from the letter)

MSU libraries chose to move their cataloging from OCLC to SkyRiver in a cost saving effort. They expect to save about $80,000 per year. Because MSU uses OCLC for ILL, they intended to pay to have their records loaded into OCLC. The OCLC service charge list gives the price for this service as $0.23 per record.

However, when MSU requested the upload service, OCLC offered them a price of $54,000 for five months (presumably end of fiscal year?), which would amount to $74,000 per year for 26,000 records, or $2.85 per record. (Some of this would be offset by cataloging credits.)
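To make the gap concrete, here is a rough sketch of the per-record arithmetic. The figures are the ones reported from Haka's letter; the variable names are mine, and the sketch ignores the cataloging credits mentioned above because the letter gives no amount for them.

```python
# Figures as reported from Haka's letter (cataloging credits not included).
quoted_annual_cost = 74_000      # dollars per year, per OCLC's offer
records_per_year = 26_000        # records MSU expected to upload
list_price_per_record = 0.23     # dollars, from the OCLC service charge list

# Effective per-record price implied by the offer.
quoted_per_record = quoted_annual_cost / records_per_year

# What the same upload volume would cost at the published list price.
list_price_annual_cost = list_price_per_record * records_per_year

# How many times the list price the quoted price works out to.
markup = quoted_per_record / list_price_per_record

print(f"Quoted price per record: ${quoted_per_record:.2f}")
print(f"Annual cost at list price: ${list_price_annual_cost:,.0f}")
print(f"Quoted price is about {markup:.1f}x the list price")
```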

MSU has decided that they cannot afford this, and therefore will not be uploading current cataloging into OCLC. Haka says: "While we will continue with OCLC for ILL, I regret that our newer holdings will not be available for others to consult."

Now My Take

I find it astonishing that any corporation would choose to punish customers rather than to work to win them back. I also find it astonishing that OCLC is willing to keep current customers through threats and fear. Essentially, MSU is being made an example: if you move your cataloging to a competitor, we'll cut you out of OCLC services. This is a lesson for anyone else thinking of moving to SkyRiver or some other service.

As Haka points out in his letter, the OCLC database has a huge number of records that were not created through OCLC cataloging services. When the RLIN cataloging service still existed, many libraries that did their cataloging in RLIN uploaded those records to OCLC so that they could use the OCLC ILL service. They paid an amount similar to the $0.23 that Haka quoted from the current price list. This ability to upload (economically, I should add) is directly in support of the stated goal of maintaining WorldCat's value as a union catalog. The more complete the catalog, the more value it has for services like ILL, resource sharing, and collection development. Yet it is OCLC's action that is devaluing WorldCat by deliberately setting an upload price that MSU obviously cannot support economically. This tells me that the real issue is not the "value of WorldCat" but the revenue that OCLC receives from cataloging.

Business 101 would tell you that the existence of a competitor brings prices down in the sector. If you can't meet your competitor's price, then you can try to keep your customers through a superior product and better services, but for some, price will be the main factor. If someone else can provide the same service at a better price, your customers will go there.

It seems to me, and Haka alludes to this, that OCLC's reliance on cataloging revenue may be in trouble, not just because of SkyRiver but also because of the Internet: it is now very easy for anyone to store and move metadata on the public Internet. The number of sites dedicated to the same materials that one finds in libraries is increasing rapidly. We have Amazon, Google Books, LibraryThing, Open Library, IMDB, and on and on. They all have metadata describing the things in their focus. It's not the same as library metadata, but the library catalog is no longer, and not by any means, an exclusive source of description for books, films, or music.

What OCLC has that is unique is not just the quantity of metadata but the library holdings information. And they seem to be aware of this as they load in both records and holdings from many libraries that do not do their cataloging on OCLC. OCLC's value is in the whole package, but it still relies on cataloging as its primary revenue (although shrinking as a percentage of the total income, as you can see in their annual reports).

The services, like ILL, that OCLC provides for libraries are incredibly valuable and it would be a great detriment to the library community to lose them. It does appear, however, that there has been a shift in the marketplace; a shift that has nothing to do with library loyalty to the OCLC collective, but one of changing technology and economics. OCLC is trying to push water upriver, when it should be seeking a new balance in its revenue stream. Instead, OCLC is making a real mess of its relationship with its members -- first with the horribly botched record use policy (which isn't going to solve this problem anyway), and now with acting punitively toward members who make the kinds of economic decisions that we all make every day. I believe the "collective" can be saved, but only if OCLC decides to work with, not against, its members.

More thoughts (added later)

I realize now that I have many other questions about record loading on OCLC. For example, many libraries get some of their records from their book vendors, and those do get loaded into OCLC. Is that charged as cataloging, or as record loading? Are there different fees for loading records if you are doing your cataloging on OCLC vs. if you are not? Are there "load only" libraries who load their records in order to participate in ILL and other services? If so, what are they charged for record loading?

I say this because it makes sense to me that libraries that do not do their cataloging on OCLC would be encouraged to load their records so that they can participate in other services. It also makes sense that the price for this would be commensurate with that of adding your holdings online (or maybe a bit cheaper if it's more economical for OCLC to batch load rather than provide cataloging online). In fact, what difference does it make how you get your records into OCLC? The most important thing is that your records are there as part of WorldCat.

What the MSU letter tells me is that the OCLC economics are such that cataloging on OCLC is paying for other services, like record uploads, which may be under-priced. A different upload charge for non-cataloging libraries makes sense, and if that's the case then OCLC needs to make that clear. However, it wouldn't surprise me if that made alternative cataloging services unmarketable, because as the MSU case shows, the total for cataloging elsewhere plus loading on OCLC would favor doing cataloging on OCLC. This makes perfect sense to me, but it appears that members haven't been informed of this pricing practice. Really, a little more transparency about pricing could go a long way toward avoiding situations like the MSU one.

Thursday, February 25, 2010

OCLC again

Someone slipped me Larry Alford's letter to OCLC members. This is the worst piece of "argument by innuendo" that I have ever seen. The members deserve better, much better.

I am pretty much unable to discern the message in these four pages of insinuations and scores of questions. The document is entirely devoid of facts or information. Still, I'm going to attempt to extract some sense out of it.

First, it's all about threats to WorldCat, in particular as libraries turn to other sources of bibliographic records. What these threats are should be easily quantifiable, but Alford doesn't provide us with any figures. Here's the information that is needed if one wants to make an assessment of the situation:
  1. Are member libraries adding fewer records to WorldCat? How many fewer, and what is the actual loss of revenue to OCLC? Has anyone interviewed them to ask why?
  2. Are former member libraries leaving WorldCat for other services? How many, and what is the actual loss of revenue to OCLC?
  3. What does OCLC charge for its various services? There is no information on the web site, and I've heard it said that contracts between OCLC and libraries are confidential. This makes it very hard to have a discussion about costs and how costs are affecting OCLC's services in the market. Alford makes reference to "alternate service providers" (*cough* SkyRiver) but makes no comparison of costs or services.
There are, of course, a number of red herrings in the text. I say "of course" because it is in the nature of this kind of emotional plea to bring up unsupported statements. As an example, he states that he has asked a series of questions, like
Should the OCLC cooperative create and support software that provides quality control and the ability to make global changes as librarians create new subject headings and revise authority records?

and ends with
I am pleased to note that the response of almost everyone to whom I have posed these questions has been a universal and enthusiastic "yes."

But let's look at those questions. He asks about "supporting" CONSER, NACO and BIBCO without saying the nature or cost of that support. Maybe there is something to think about there. He asks if OCLC should continue maintaining the Dewey classification. Well, what does it cost OCLC, and what revenue does it bring in? And would there be another venue for the community to maintain DDC if members decide that it's not a good activity for OCLC?

He also asks, rhetorically, whether it is better to have a single database for bibliographic and holdings information or
... is it preferable to sequentially search dozens or even hundreds of catalogs around the world to try to find that particular book or article that a researcher needs?

He should know that there are other options, but this document is not about facts but persuasion.

I am often unclear about what he is alluding to. On page three he says that there are libraries who are doing their cataloging elsewhere but "still want to participate in the resource sharing made possible by WorldCat." I don't know what resource sharing he means, but as far as I know anything beyond a search in the open WorldCat database is done for a fee. Is he complaining that some libraries do not contribute records to WorldCat but subscribe to other services? That sounds like a revenue stream to me. He refers to these libraries as consuming more value than they return, but I don't know what the unit of the "value" is. As a matter of fact, throughout the document there are references to value that sometimes seem to be about OCLC's revenue, and at other times seem to be about the completeness of WorldCat. Mixing these two up in the discussion is not helpful, not at all.

The purpose of the mailing that this document was attached to was to let OCLC members know that a new, revised policy will soon be sent to OCLC's Council and Board of Trustees, and eventually to all members. If the policy was developed in the same kind of information vacuum that this document exhibits, I have little hope that it will be any better than the original policy that began this round of member dissatisfaction.

Wednesday, October 14, 2009

OCLC and "Competition"

The announcement of a new company, SkyRiver, providing cataloging services to libraries has sparked a number of comments about competing with OCLC and WorldCat. For a number of reasons, I don't think that the result of such a service is necessarily competitive, although I am very glad to see alternatives enter the marketplace, especially for those who do not use OCLC.

To begin with, OCLC is more than an online cataloging service. Admittedly, revenue from cataloging is OCLC's largest income source, so cataloging is not in any way just an incidental function from OCLC's point of view, but cataloging alone is not the point or purpose of OCLC to its users. I see OCLC as a kind of social network where the "beings" are libraries. The value of OCLC is directly related to the population it encompasses, and the social services it can provide based on that population. Shared cataloging copy is one service, but discovery and delivery options probably motivate OCLC members as much or even more than the cataloging effort. This was evident when RLG still existed, as some RLG member libraries who did their cataloging in RLIN also loaded their records into WorldCat in order to participate in the services that OCLC provided.

The value of the catalog copy on OCLC may be second to the value of the holdings information that OCLC maintains. Catalog copy, if that's all you want, can be found in innumerable library catalogs (including the Library of Congress), and some library systems allow you to export or retrieve a full MARC record that you can then add to your own catalog. Catalog copy can also often be found on the verso of the title page in the form of Cataloging in Publication (CIP), although not in MARC format and not as a complete record. But no one else, and no other service, has the combined holdings of some 60,000 libraries, and that's the main thing that OCLC brings to the table. It is only because of these holdings that WorldCat has value to individual searchers and to the libraries who serve them.

The view of OCLC as "the only game in town" for library cataloging ignores the fact that there are libraries who do not participate in OCLC, for a variety of reasons, but who still need to create bibliographic records. These libraries may not be able to afford OCLC's prices for cataloging services, or they may simply not wish to be bound by the standards of that society of libraries. Some libraries, in particular those in corporate settings, are not able to share their holdings publicly, and therefore are not able to participate in the social life of libraries that WorldCat represents.

There are also non-library providers of library catalog records, in particular the vendors who include catalog data with the products they sell to libraries. These vendors need a source of cataloging copy that is unrelated to particular holdings information.

If we can think further down the line, a database of bibliographic records, like that in SkyRiver or biblios.net could become a resource for anyone who needs to work with bibliographic data. This could include anyone on a research project who wants to provide a quality bibliography with a minimum of effort. Although the bibliography will follow citation standards, the basic data is the same as that found in library records.

Another advantage that these and other bibliographic services may provide to us all in the library profession is that they could be a source of data for experimentation. What with RDA looming on the horizon and much talk about updating our data format from MARC to something else, we'll need data to work with. OCLC has historically been slow to change its data, and not without reason: OCLC is integrated into the workflows of tens of thousands of libraries that depend on it for everyday functionality. Although the OCLC research division comes up with innovative ideas, the OCLC core functionality is essentially the same as it was two or three decades ago. If we want to experiment with radical change, I for one expect it to come from the sidelines, not the center.

Tuesday, June 30, 2009

Even paranoids....

I'm not the most diligent of bloggers, by any means, and the contents of this blog are pretty narrow in terms of topics. Mostly I have written about Google books, about RDA and other library metadata developments, and recently about OCLC. Although each post is probably offensive to someone out there, the total number of enemies that I can make is probably quite small -- and compared to some bloggers nearly infinitesimal.

So imagine my surprise this morning when I received a notice from Google saying that my blog had been marked as Spam, and would be removed if I didn't take action. There are two ways that your blog can get the Spam qualification: 1) if it is caught by Google's automatic spam detectors and 2) if someone clicks on the "flag blog" link and reports it as spam.

Given the technical nature of my posts, I find the first possibility highly unlikely. This means that I must consider the latter. I hope it is only coincidence that my latest post (and one that has lingered here as the latest for a bit too long, perhaps) is a critique of OCLC and its record use policy. I would love to be able to say that I know that OCLC would not stoop to this kind of censorship, but unfortunately I have experience to the contrary.

Earlier this year I arrived in Dublin only to be refused admittance to a meeting that they had agreed that I could attend (and that I had flown all of the way to Ohio to attend). Then, a few months ago, when OCLC was told that I would be writing an article for InfoToday on their "web-scale service," the journal's editor received numerous phone calls from OCLC's press person voicing OCLC management's "concern" that I had been chosen to write the article. What the editor was supposed to do about that concern wasn't articulated, but she kept me on the story and even resisted their request to review the article before it was published. It was a dramatic couple of days, and I'm very grateful to her for her unwavering defense of freedom of the press.

I admit that it is at least equally likely that some random person with a cosmic grudge decided to click on "this is spam," but you may understand why I'm beginning to be a bit paranoid, and wondering if I don't have real enemies.

Wednesday, January 14, 2009

OCLC pushes back policy to fall, 2009

OCLC has just announced that it is pushing back the date on which the new record use and transfer policy will take effect. The actual new date isn't known, but the announcement says:
In order to allow sufficient time for feedback and discussion, implementation of the Policy will be delayed until the third quarter of the 2009 calendar year.
OCLC will form a "review board" to solicit info from members and others, and to advise the OCLC board of trustees about the policy. Jennifer Younger will chair this committee.

This delay is welcome, but I am dubious that a review board would be able to convince the trustees that OCLC must welcome open access to bibliographic data. Minor tweaks to the policy are not going to make much of a difference, and I doubt that any "advice" is going to force the board to do an about-face.

Those of us who promote open access must use this time wisely. First, we need to get some solid legal advice. It's clear that OCLC can propose any kind of conditions in a contract and hope to get signers; it's less clear that OCLC can impose a contract on members 1) without their explicit agreement, 2) that covers data created before the contract becomes valid, and 3) that binds third parties to the contract. Next, anyone who has bibliographic data should release it "into the wild" as quickly as possible. Once the data is circulating, it will not be possible to withdraw it. One solution is to create database dumps and to upload these to the Internet Archive. They will be there for downloading by others, and some of the data may end up in the Open Library. Assuming that bibliographic records cannot be covered by copyright, all of this data ends up in the public domain to fuel innovation and creativity.

Note: if you are preparing a data dump, my advice is:
  • use a standard format (MARC21, MARCXML, UNIMARC, etc.).
Be sure to include in each record fields that give:
  • your local record ID (MARC 001)
  • something that identifies the source of the record (your system or institution) (MARC 003)
  • the version date (either the last date the record was updated, or the date of the data dump) (MARC 005)
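
If your dump is in MARCXML, a quick sanity check before uploading might look something like the sketch below. It is only an illustration, not a finished tool: the function name, the sample records, and the "EXAMPLE-LIB" identifier are all made up for the example, and it assumes the dump is small enough to parse in one go. It uses the standard MARCXML namespace and reports any record that lacks a non-empty 001, 003, or 005 control field.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML ("MARC21 slim") namespace.
MARC_NS = "{http://www.loc.gov/MARC21/slim}"

# The three control fields recommended above: record ID, source, version date.
REQUIRED_TAGS = ("001", "003", "005")


def missing_control_fields(marcxml_text):
    """Return (record_position, missing_tags) for each incomplete record.

    Hypothetical helper for illustration; positions are 1-based and follow
    the order of the records in the file.
    """
    root = ET.fromstring(marcxml_text)
    problems = []
    for position, record in enumerate(root.iter(MARC_NS + "record"), start=1):
        # Collect the tags of control fields that actually carry a value.
        present = {
            field.get("tag")
            for field in record.findall(MARC_NS + "controlfield")
            if (field.text or "").strip()
        }
        missing = [tag for tag in REQUIRED_TAGS if tag not in present]
        if missing:
            problems.append((position, missing))
    return problems


# Made-up sample data: the second record has no 003 or 005.
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <controlfield tag="001">local-0001</controlfield>
    <controlfield tag="003">EXAMPLE-LIB</controlfield>
    <controlfield tag="005">20090114120000.0</controlfield>
  </record>
  <record>
    <controlfield tag="001">local-0002</controlfield>
  </record>
</collection>"""

if __name__ == "__main__":
    for position, missing in missing_control_fields(SAMPLE):
        print(f"record {position} is missing: {', '.join(missing)}")
```

For a real dump you would read the file from disk rather than a string, and for a very large file a streaming parser would be a better fit than loading everything at once.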