Nov 9th – New Webinar: Crossref for Open Access Publishers

Register for our webinar to learn best practices for depositing metadata and ways to help with the dissemination and discoverability of OA content.

New Crossref services are being developed that have particular application to OA publishers. Did you know that our upcoming DOI Event Tracker service was inspired by a group of OASPA publishers asking if there was a way to centrally support the gathering of data that could be analyzed as altmetrics?

A large number of Crossref members classify their content as Open Access, and we’ve been thinking about how our infrastructure can support and communicate this.  In many ways, it already does:

  • Crossref supports the deposit of license and funding information in the DOI metadata.
  • Crossref’s CrossMark Service is useful to OA publishers who need to have the means to update info about their content, no matter where it sits.
  • Crossref’s APIs allow publishers to make it easier for researchers to mine full-text content.

Register for the Crossref Open Access Webinar

Date: November 9, 2015

Time: 8:00 am (San Francisco), 11:00 am (New York), 4:00 pm (London)

Register: https://attendee.gotowebinar.com/register/4198524003003451650

Please join us for this new webinar that gives an overview of Crossref and its network of member publishers, along with information on Crossref services that have specific relevance to OA scholarly content.

Crossref will be joined by two guest speakers – Frontiers will talk about their OA workflows and how Crossref services integrate with these, and James MacGregor from PKP will show participants the Crossref Export/Registration Plugin which journals can enable to deposit DOIs with Crossref and to help them participate in other Crossref services.

There will be time for questions and discussion during the webinar. The webinar will be recorded.

2015 Annual Meeting: Speakers Announced

Curious about who will be speaking at Crossref’s Annual Meeting this year? We have a flock of scholarly communications talent gathering at the Taj Hotel in Boston on November 17-18, 2015. In addition to our line-up of keynote speeches and technical workshops, we will be celebrating Crossref’s 15th Anniversary with a quindecennial fête on Wednesday evening, November 18th. There’s still time to register, so please join us!

Distinguished Guest Speaker Bios:

Marc Abrahams is the father and Master of Ceremonies of the Ig Nobel Prize Ceremony, honoring achievements that make people LAUGH, then THINK. The Prizes are handed out by genuine Nobel Laureates at a gala ceremony held each autumn at Harvard University and broadcast on the internet and on National Public Radio.

Marc is the author of the books The Ig Nobel Prizes, The Man Who Cloned Himself, Why Chickens Prefer Beautiful Humans, This Is Improbable, This Is Improbable Too, and The Ig Nobel Cookbook, volume 1 (co-authored with Corky White and Gus Rancatore). He edited (and wrote much of) the science humor anthologies The Best of Annals of Improbable Research and Sex As a Heap of Malfunctioning Rubble (and other improbabilities).

Marc has a degree in applied mathematics from Harvard College, spent several years developing optical character recognition computer systems (including a reading machine for the blind) at Kurzweil Computer Products, and later founded Wisdom Simulators, a creator of educational software.

  • Juan Pablo Alperin will be a keynote speaker at Crossref’s 2015 Annual Meeting. Juan is an Assistant Professor and a Research Associate with the Public Knowledge Project (PKP) at Simon Fraser University. Juan started working with the PKP in 2007, and has continued to be involved as systems developer, project manager, and researcher. Juan leads and advises on several of PKP’s R&D and Scholarly Inquiry initiatives as a complement to his research and work on scholarly communications more broadly. He can be reached via @juancommander.  ORCID ID: orcid.org/0000-0002-9344-7439.
  • Scott Chamberlain will be a keynote speaker as well as a presenter at Crossref’s 2015 Annual Meeting. Scott is a scientific programmer who contributes to the field of scholarly literature by developing software for accessing open data on the web. He co-founded a developer collective called rOpenSci to help connect open source data into the R environment, a free software environment for statistical computing and graphics that runs on all major platforms. Scott maintains a few clients to work with Crossref APIs, and a text mining client that leverages Crossref’s TDM service. In addition, Scott maintains clients in R, Ruby, and Python to interact with Lagotto, a platform for collecting and delivering altmetric data. A former ecologist, Scott is currently working full time on rOpenSci at the University of California at Berkeley. He can be reached via @recology_/@opensci. ORCID ID: http://orcid.org/0000-0003-1444-9135.
  • John Chodacki will be a presenter at Crossref’s 2015 tech workshops. John is the Product Management Director at PLOS (Public Library of Science). He has spent most of his career leading digital publishing initiatives. Before joining PLOS, he managed the product management team at VIZ Media. Prior to that, he managed cross-functional teams at O’Reilly Media, Safari Books Online, Creative Edge, and Zinio. He holds a BA in Anthropology and African American Studies from Grinnell College and an MBA from San Francisco State University. He can be reached via @chodacki. ORCID ID: orcid.org/0000-0002-7378-2408.
  • Anne Coghill will be a presenter at Crossref’s 2015 Annual Meeting. Anne is Manager, Peer Review Operations, in the American Chemical Society Publications Division. She and her colleagues manage the manuscript submission and peer review environment for ACS’ scholarly journals and books publishing program. Anne holds a Bachelor of Science in chemistry from Illinois State University and a Master of Science in Management Studies from Northwestern University. She is also the co-editor of The ACS Style Guide, third edition. She can be reached via @AnneCoghill. ORCID ID: orcid.org/0000-0002-2773-2282.
  • Helen Duriez will be a presenter at Crossref’s 2015 tech workshops. Helen is the ePublishing Manager at the Royal Society, responsible for developing the Society’s digital journals strategy as well as the day-to-day management of its journal websites. Since digital innovation transcends the traditional boundaries of scholarly publishing, she spends a lot of time pondering a variation of Freud’s musings, ‘what do researchers want?’ Helen can be contacted via @HDuriez and @RSocPublishing.
  • Martin Paul Eve will be a keynote speaker as well as a presenter at Crossref’s 2015 Annual Meeting. Martin is Senior Lecturer in Literature, Technology and Publishing at Birkbeck, University of London and a founder of the Open Library of Humanities. He is the author of three books: Pynchon and Philosophy: Wittgenstein, Foucault and Adorno (Palgrave, 2014); Open Access and the Humanities: Contexts, Controversies and the Future (Cambridge University Press, 2014); and Password [a cultural history] (Bloomsbury, forthcoming 2016), as well as many journal articles. A strong advocate for open access to scholarly material, Martin has given evidence to the UK House of Commons Select Committee Inquiry into Open Access; served on the Jisc OAPEN-UK Advisory Board, the Jisc National Monograph Strategy Group, and the Jisc Scholarly Communications Advisory Board; been a member of the HEFCE Open Access Monographs Expert Reference Group; and is a member of the SCONUL Strategy Group on Academic Content and Communications. Martin is also a qualified computer programmer (Microsoft Professional in C# and the .NET Framework) and is the author of the digital publishing tools meTypeset and CaSSius. He can be reached via @martin_eve. ORCID ID: orcid.org/0000-0002-5589-8511.
  • Ben Hogan will be a presenter at Crossref’s 2015 Annual Meeting.  Ben is a Regional Manager in Wiley’s Peer Review Management team, responsible for leading the North America and Open Access teams. He works with internal and external stakeholders to bring in new work and refine the peer review experience to be as efficient as possible for authors and editorial offices. Ben’s worked in publishing since 2007 in a variety of capacities, including books and journals production, training, and peer review. His interests include user experience and publication ethics.
  • Jure Triglav will be a presenter at Crossref’s 2015 tech workshops. His presentation, Using CrossRef’s API to Make Science Writing Smarter, will explore how continuously talking to CrossRef’s API can help us write better scientific content. Topics will include calling the API from JavaScript, combining CrossRef data with modern web-based text editors, and more. Jure is an open science software developer. Jure graduated from medical school 4 years ago, but started working as a developer for Academia.edu shortly after. Now he focuses on technology issues present in open science and runs several projects in this space: @ScienceGist, @ScienceToolbox and @ScholarNinja. Jure also works with open science organizations like PLOS, working on software that will power the future of scientific publishing. He can be reached via @juretriglav.

Crossref Staff Speaker Bios:

  • Geoffrey Bilder is Director of Strategic Initiatives at CrossRef, where he has led the technical development and launch of a number of industry initiatives including CrossCheck, CrossMark, ORCID and FundRef. He co-founded Brown University’s Scholarly Technology Group in 1993, providing the Brown academic community with advanced technology consulting in support of their research, teaching and scholarly communication. He was subsequently head of IT R&D at Monitor Group, a global management consulting firm. From 2002 to 2005, Geoffrey was Chief Technology Officer of scholarly publishing firm Ingenta, and just prior to joining CrossRef, he was a Publishing Technology Consultant at Scholarly Information Strategies.  He can be reached via @gbilder.  ORCID ID: orcid.org/0000-0003-1315-5960.
  • Ginny Hendricks is Director of Member & Community Outreach for Crossref, and is responsible for Crossref’s communications, business development, member services, and product support initiatives. Before joining Crossref, she ran Ardent Marketing for nine years, where she consulted with publishers to craft multichannel marketing strategies, develop, brand, and launch online products, and build engaged communities. She previously managed Elsevier’s launch of Scopus, the abstract and citation database of peer-reviewed literature.  While at Elsevier, she established advisory boards and outreach programs with library and scientific communities. In 1998, Ginny started an early e-resources help desk for Blackwell’s information Services and later led training and communication programs for Swets’ digital portfolio in Asia Pacific, Middle East, and Africa. She’s lived and worked in many parts of the world, has managed globally dispersed creative, technical, and commercial teams, and co-hosts the Scholarly Social networking events in London.  She can be reached via @GinnyLDN.  ORCID ID: http://orcid.org/0000-0002-0353-2702.
  • Chuck Koscher has been the Director of Technology for Crossref since 2002. His primary responsibility has been the development and operation of Crossref’s core services and technical infrastructure. As a senior staff member he also contributes to the definition of Crossref’s mission and the expansion of its services such as the recent launch of FundRef. His role includes management of technical support and back-end business operations. Chuck and his team interface directly with publisher members in dealing with issues affected by new or evolving industry practices such as those involving non-journal content like books, standards and databases. Chuck has been active within the industry, having served 9 years on the NISO board of directors and participated in initiatives such as the NISO/NFAIS Best Practices in Journal Publishing and NISO’s Supplemental Material Working Group. Prior to Crossref, Chuck had over 20 years of software engineering experience, primarily in the aerospace industry. ORCID ID: orcid.org/0000-0003-2181-9595.
  • Rachael Lammey is a Product Manager on Crossref’s CrossCheck plagiarism screening and Text and Data Mining API initiatives, among other tools that Crossref makes available for publishers to build upon. Rachael has been with CrossRef since March 2012. She previously worked in journals publishing for Taylor & Francis for nearly six years, managing a team who worked with online submission and peer review systems. She has a degree in English Literature from St. Andrews University and an MA in Publishing Studies from the University of Stirling. She can be reached via @rachaellammey. ORCID ID: http://orcid.org/0000-0001-5800-1434.
  • Jennifer Lin is the Director of Product Management at Crossref.  She has worked in product development, project management, community outreach, and change management within the scholarly communications, education, and public sectors since 2000. She spent four years at the Public Library of Science (PLOS) where she oversaw product strategy and development for their data program, article-level metrics initiative, and open assessment activities. Prior to PLOS, she was a consultant with Accenture, working with Fortune 500 companies as well as governments, to develop and deploy new products and services. Jennifer earned her PhD at Johns Hopkins University. Jennifer can be reached via @jenniferlin15.  ORCID ID: http://orcid.org/0000-0002-9680-2328.
  • Ed Pentz is the Executive Director of CrossRef, a not-for-profit membership association of publishers set up to provide a cross-publisher reference linking service to organise publisher metadata, run the infrastructure that makes Digital Object Identifier (DOI) links work, and rally multiple community stakeholders to develop tools and services that enable advancements in scholarly publishing.  Ed was appointed as CrossRef’s first Executive Director when the organization was created in 2000.  Crossref is now the largest DOI registrar in the world with over 75,000,000 DOIs.  Ed is also Chair of the Board of ORCID, a registry of unique identifiers for researchers established in 2010. Prior to joining CrossRef, Ed held electronic publishing, editorial and sales positions at Harcourt Brace in the US and UK and managed the launch of Academic Press’ first online journal, the Journal of Molecular Biology, in 1995. Ed has a degree in English Literature from Princeton University and lives in Oxford, England. He can be reached via @epentz. ORCID ID http://orcid.org/0000-0002-5993-8592.

DOIs in Reddit

Skimming the headlines on Hacker News yesterday morning, I noticed something exciting. A dump of all the submissions to Reddit since 2006. “How many of those are DOIs?”, I thought. Reddit is a very broad community, but has some very interesting parts, including some great science communication. How much are DOIs used in Reddit?

(There has since been a discussion about this blog post on Hacker News)

We have a whole strategy for DOI Event Tracking, but nothing beats a quick hack or is more irresistible than a data dump.

What is a DOI?

If you know what a DOI is, skip this! The DOI system (Digital Object Identifier) is a link redirection service. When a publisher puts some content online they could just hand out the URL. But the URL can change, and within a very short space of time, link-rot happens. DOIs are designed to fight link rot. When a publisher mints a DOI for an article they just published, they can later change the article’s URL and then update the DOI to point to the new place. DOIs are persistent. They are URLs. They’re also identifiers (kind of like ISBNs), and they’re used in scholarly publishing for citations.
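To make the "identifier and URL" point concrete, here is a minimal Python sketch that turns a DOI written in any of a few common forms into a resolvable link. The function name `doi_to_url` and the list of prefixes are our own illustration, not part of any Crossref tool. It assumes the public resolver at https://doi.org/ (dx.doi.org is the older form of the same resolver), and it does not percent-encode unusual characters in the DOI suffix.

```python
RESOLVER = "https://doi.org/"

# Common ways a DOI shows up in the wild (an illustrative, incomplete list).
_PREFIXES = (
    "doi:",
    "http://dx.doi.org/",
    "https://dx.doi.org/",
    "http://doi.org/",
    "https://doi.org/",
)


def doi_to_url(doi):
    """Strip any known prefix and return the DOI as a resolver URL."""
    doi = doi.strip()
    lowered = doi.lower()
    for prefix in _PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Every DOI starts with the directory indicator "10.".
    if not doi.startswith("10."):
        raise ValueError("not a DOI: %r" % doi)
    return RESOLVER + doi
```

Whatever form the DOI was cited in, the link that comes out goes through the resolver, so it keeps working when the publisher moves the article.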

Crossref is the DOI registration agency for scholarly publishing. That means mostly things like journal articles. There are other registration agencies, for example, DataCite, who do DOIs for research datasets. But at this point in time, most DOIs are Crossref’s.

What does finding DOIs in Reddit mean?

It means someone used a DOI to cite something! DOIs can be used for any kind of content, but because of the sheer volume of scientific publishing, lots of DOIs are for science. Having a DOI doesn’t say anything about quality or content. But it does indicate that the person who created the DOI probably intended it to be cited. We care because it means that every time a DOI is used a tiny bit of link-rot doesn’t have the opportunity to take hold. Every time something is discussed on Reddit and the DOI is used, it means that archaeologists using the data dump in 100 years will have identifiers to find the things being discussed, even if the web and URLs have long since crumbled to dust.

Or, more likely, in five years’ time when a few URLs will have shuffled around.

The results

DOIs have been used on Reddit since 2008 (the logs start in 2006). After a rocky start, we see hundreds being used per year.

DOI submissions per month

That’s dozens per month.

DOI submissions per month

The best subreddit to find DOIs is /r/Scholar, followed by /r/science. And then a lot of others with one or two per year.

DOI submissions per subreddit per year

Opportunities

It’s great to see DOIs being used in Reddit. But let’s be honest, it’s not a massive amount.

We have a list of domains that our DOIs point to. They mostly belong to publishers, so every time we see a link to a domain on the list, there’s a chance (not a certainty) that the link could have been made using a DOI. We found a large number of these, orders of magnitude more than DOIs. We’re still crunching the data.
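The domain check can be sketched roughly like this in Python. This is an illustration, not the code we ran: the domain list below is a made-up placeholder (the real list is much longer), and `could_have_used_doi` is a hypothetical name. A match only means the link might have been made with a DOI.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for the real list of domains that DOIs resolve to.
PUBLISHER_DOMAINS = {"journals.plos.org", "example-publisher.org"}


def could_have_used_doi(url, domains=PUBLISHER_DOMAINS):
    """Return True if url's host is a listed domain or a subdomain of one."""
    try:
        host = urlparse(url).hostname
    except ValueError:
        # Malformed URLs (for example a broken IPv6 host) cannot match.
        return False
    if host is None:
        # No scheme or host at all, for example a Reddit self post.
        return False
    return any(host == d or host.endswith("." + d) for d in domains)
```

Matching on the dot-prefixed suffix lets `www.example-publisher.org` count while an unrelated host such as `notexample-publisher.org` does not.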

The data

The data is quite large. It’s a 40 GB download compressed, which comes to about 170 GB uncompressed. It contains the submissions to Reddit between 2006 and 2015, not the comments, so each data point represents a thread of conversation about a DOI.
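For a feel of what the counting involves, here is a small plain-Python sketch (the real analysis ran on Apache Spark, and this is not that code). It assumes one JSON object per line with `url`, `subreddit` and `created_utc` fields, which is our assumption about the dump's layout. The DOI pattern is deliberately loose, so it will also pick up trailing query strings and miss DOIs containing unusual characters.

```python
import json
import re
from collections import Counter
from datetime import datetime, timezone

# Loose DOI pattern: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r'10\.\d{4,9}/[^\s"<>]+')


def find_doi(url):
    """Return the first DOI-like string in url, or None."""
    match = DOI_PATTERN.search(url)
    return match.group(0) if match else None


def count_dois_per_subreddit_year(lines):
    """Count submissions whose URL contains a DOI, keyed by (subreddit, year)."""
    counts = Counter()
    for line in lines:
        try:
            submission = json.loads(line)
        except ValueError:
            continue  # skip lines that are not valid JSON
        if not isinstance(submission, dict):
            continue
        if find_doi(submission.get("url") or "") is None:
            continue
        try:
            # created_utc may be a number or a numeric string (assumed field).
            timestamp = int(float(submission["created_utc"]))
        except (KeyError, TypeError, ValueError):
            continue
        year = datetime.fromtimestamp(timestamp, tz=timezone.utc).year
        counts[(submission.get("subreddit"), year)] += 1
    return counts
```

Because it takes any iterable of lines, you can pass an open file handle and stream the dump without loading 170 GB into memory.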

Reproducibility (updated)

You can find the source code and reproduce the figures at http://github.com/crossref/reddit-dump-experiment. We use Apache Spark for this kind of thing.

The data and methodology are very experimental. You can download all results here:

https://s3-eu-west-1.amazonaws.com/crossref-labs-data/2015-10-06/reddit-dump-experiment.zip

It includes all data for charts in this post, as well as the full list of DOIs, the full list of URLs that could possibly have DOIs, and the full JSON input line for each of these.

More info

Read about our DOI Event Tracking strategy, including our live stream of Wikipedia citations.

Annual Meeting: Join Crossref in Boston this November!

We’d like to invite the scholarly publishing community to get together in Boston this November with the Crossref Annual Meeting as a rally point. This is the event we hold just once a year to get the whole team under one roof, host a lively discussion with the leading voices in scholarly communications, present technical workshops, and offer you the chance to get hands-on with our latest metadata services. Our free two-day event takes place on November 17-18, 2015 in Boston, MA.

Agenda:

  • Tuesday, November 17 – Tech Workshops:

The morning is an opportunity to get into small groups and talk directly with our development and support teams. We will present best practices around using Crossref’s metadata. After lunch, we will feature member case studies with tips on implementation and lessons learned. If you’re on the technical production side of scholarly publishing, you’ll want to be there — and not just for the beer & pretzels afterwards.

  • Wednesday, November 18 – Member Meeting:

A day to hear from thought leaders from the larger scholarly publishing community as well as from inside Crossref. Our keynote speaker will be Dr. Ben Goldacre (Bad Science), and our distinguished speakers include Dr. Scott Chamberlain (rOpenSci), Dr. Juan Pablo Alperin (Public Knowledge Project), and Dr. Martin Eve (Open Library of Humanities). We will share details about the road map for Crossref Labs’ current and future initiatives, hear about the latest organizational developments from new members of our team, and see the debut of our new brand logo and communications strategy. Following the formal discussion, we’ll continue the conversation over cocktails as part of our celebration of Crossref’s milestone 15th Anniversary!

✱ Tickets:

Reserve your free tickets here: https://www.eventbrite.com/e/crossref15-tech-workshops-member-meeting-tickets-17921679225

Who Should Attend?

Scholarly publishers, technology providers, librarians, researchers, academic institutions, funders, journalists, and others who are keen to discuss tools and services to advance scholarly publishing are encouraged to attend.

✱ Venue:

About Crossref

Crossref is a not-for-profit membership organization that wants to improve research communication. We organize publisher metadata, run the infrastructure that makes DOI links work, and rally multiple community stakeholders to develop tools and services that enable advancements in scholarly publishing.

DOI Event Tracker (DET): Pilot progresses and is poised for launch

Publishers, researchers, funders, institutions and technology providers are all interested in better understanding how scholarly research is used. Scholarly content has always been discussed by scholars outside the formal literature and by others beyond the academic community. We need a way to monitor and distribute this valuable information.

The Crossref DOI Event Tracker (DET)

To meet this need, Crossref will be introducing a new service that tracks activity surrounding a research work from potentially any web source where an event is associated with a DOI. Following a successful pilot that started in spring 2014, the service has been approved to move toward production and is expected to launch in 2016. Any party wishing to join this phase is welcome to contact Jennifer Lin. The DOI Event Tracker (DET) registers a wide variety of events such as bookmarks, comments, social shares, citations, and links to other research entities, from a growing list of online sources. DET aggregates these events, then stores and delivers the data in many ways.

Open, portable, and licensed for maximum reuse
Crossref has long served as the citation linking and metadata infrastructure provider for scholarly communication; the new DOI Event Tracker is a natural next step, providing a practical solution as a resource for the whole community. The tracker offers the following features:

  • Data on event activity across a common pool of online channels.
  • Near real-time alerting for select sources with push notifications to the system.
  • Cross-publisher monitoring to enable benchmarking and provide context to the data.
  • Common format for normalizing data results across the diverse set of sources via modern REST API.
  • Secure and regularly refreshed backups of critical data for long term data preservation.
  • Transparency of data collection so as to ensure auditable, replicable, and trustworthy results.
  • Query-initiated retrieval or real-time alerts when an event of interest occurs.
  • CC-0 license for open and flexible propagation of data.

A number of platforms are already confirmed and more parties are welcomed at any stage. So far we have confirmation to track DOI events on the following platforms:

Confirmed DET Platforms Sept 2015

  • Blogs & Reference Works: Research Blogging, ScienceSeeker, Wordpress.com, Wikipedia
  • Social Bookmarks: CiteULike, Mendeley
  • Social Shares & Discussions: Facebook, Reddit
  • Links to Research Entities: ORCiD, DataCite, Europe PMC Database Citations

This set of sources reflects our initial focus on parties willing to allow their data to be redistributed in the common pool. Efforts are underway to expand the source list to include Twitter and MyScienceWork, among others. Publishers can also act as sources by publishing and distributing DOI event data via the DET when an event occurs on their platforms (for example, when a PDF is downloaded, or when a comment mentions a DOI in a locally hosted discussion forum). This would make local DOI activity globally available to funders, researchers, institutions, and others.

DET provides benefits of scale and ease of access as a central point for collecting and propagating data to the community. As a single point of access, it overcomes the business and technical hurdles that are a part of managing multiple online sources where scholarly activity occurs, in a rapidly changing landscape of online channels. This resource covers content across publishers and serves as a strong foundation to support the development of tools and services by any party. DET users will always be able to combine the DET data with those individually collected via negotiated or paid access. DET remains a utility separate from any value-added amenities, such as analytics, presentation, and reporting.

DET Service-Level Agreement

For those who seek the highest level of service and a more flexible range of access options, Crossref will provide a Service-Level Agreement (SLA) service for the DOI Event Tracker. The DET SLA includes the following additional features on top of the common data offering:

  • Access to the complete suite of sources, which includes restricted and/or paid sources in addition to common data, providing the fullest picture of DOI usage activity possible.
  • Guaranteed uptime and response time to the latest raw data on the aggregate activity surrounding a DOI.
  • Guaranteed support response time to questions and issues surrounding data and data delivery.
  • Flexible data access options: on-demand real time data access and scheduled bulk downloads for processing batch analytics.
  • Optimum retrieval rates and accelerated delivery speeds with the dedicated SLA API.
  • Access to a webhook API for events of interest as an alternative to polling DET.
  • Standardized and enhanced linkback service for the difficult-to-track, grey literature.

The DET SLA service has a simple, value-based pricing model based on subscriber size. Register your interest in Crossref’s DOI Event Tracker and the DET SLA service if you would like to stay informed of the upcoming launch. Please contact Jennifer Lin for more information.

Image modified from “Radar” icon by Karsten Barnett from the Noun Project.