Looking at the two most complete researcher workflow portfolios, those owned by Digital Science and Elsevier, one of the glaring differences has been Digital Science’s lack of a citation database to compete with Elsevier’s Scopus. Today, Digital Science announced Dimensions, a new product that includes a citation database, a research analytics suite, and streamlined article discovery and access. An important innovation in a number of ways, Dimensions will offer stiff new competition for Elsevier and Clarivate.

On Friday, I spoke with Digital Science CEO Daniel Hook and Digital Science portfolio company executives Christian Herzog and Robert McGrath. I received a presentation and was able to ask questions about Dimensions. I received a live demo, but I have not used the product directly myself.

One boxer looks on as his competitor stumbles, perhaps down for the count, as Dimensions heats up the competitive landscape
Thomas Eakins, Taking the Count, 1898, Yale University Art Gallery.

The Context

Metrics and analytics are sweeping through academia and scholarly publishing. Of course, the Journal Impact Factor has had a major presence for decades. In more recent years, the shift to managing research and its dissemination on platforms is opening up vast new opportunities for analyzing performance. There is growing interest in measuring not just the performance of journal titles, but increasingly of individual articles, academics themselves, their departments, and their parent universities. University research offices, academic libraries, research funders, and scholarly publishers are among those with interest in research analytics.

A long-standing product category is the citation database, including Clarivate’s Web of Science and Elsevier’s Scopus. Given their importance for research analytics about journal submission, tenure and promotion, and other purposes, citation databases have proven to be a reliable revenue stream for their owners, even as the launch of Scopus in 2004 provided the first competition for Web of Science. Google Scholar has introduced additional competition, but its sustainability is often questioned. More recently, citation databases have become unstable as a product category, since more and more of the citation data is being made available by the not-for-profit CrossRef and others, but the category has remained viable to date.

To support the growing demand for research analytics, entirely new categories of products have developed. For example, research analytics suites have been developed by several companies to allow universities, funders, and publishers to analyze research patterns, identify trends, and conduct comparisons. Notably, Elsevier and Clarivate built their respective research analytics suites, Elsevier’s SciVal and Clarivate’s InCites and Essential Science Indicators, atop their citation databases.

As a result, they have been able to continue selling citation databases through academic library channels while selling analytics suites to other academic customers, including provosts, institutional research, or university research officers. There are also non-university channels for these products, such as funders and publishers. Digital Science has had a product suite in this category, which has been branded Dimensions, and today’s news represents a relaunch of this product and a rethinking of both product categories, the citation database and the research analytics suite.

What is Dimensions?

The new Dimensions product combines a citation database, a research analytics suite, and modern article discovery and access functionality. Compared with its competitors, Dimensions provides expanded data and new integrations, an inclusive approach to content coverage, a “freemium” business model, and a modernized discovery and access experience.

First, Dimensions is built around a series of relationships. Citation databases are built around the relationships between publication units such as journal articles, as well as patents. Dimensions also includes relationships with grant funding and clinical trials, with extensive normalization for fields/disciplines and institutions. It is clear that Digital Science has built a powerful underlying platform that collapses the product categories of citation database and analytics suite into a single new product category. Digital Science will therefore market the same product to university libraries and university research offices, as well as to provosts and other academic channels.

Second, Dimensions is inclusive in terms of content coverage, rather than curated as is the case for Scopus and Web of Science. Of course, what reads to some as more inclusive can be seen by others as less rigorous selection, given the ways that citation databases have been policed to minimize exploitation of bibliometrics. Dimensions claims to include approximately a quarter more items than does Scopus, although curiously Scopus claims to track 60% more citations than does Dimensions (equivalent current figures for Web of Science could not be located on Clarivate websites). Instead of a curated index, Dimensions offers search filters. For example, when searching the scholarly literature, one can limit results only to those journals included in Web of Science, DOAJ, or other sources.
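
To see why those two claims sit oddly together, it helps to work through the arithmetic. The sketch below uses only the two relative figures quoted above and assumes, which neither vendor confirms here, that both products count "items" and "citations" in comparable ways. The variable names are illustrative and not drawn from either product.

```python
# Relative figures from the vendors' claims as quoted above.
# Items are expressed relative to Scopus; citations relative to Dimensions.
# Assumption: both vendors count items and citations comparably.
scopus_items = 1.0
dimensions_items = 1.25       # "approximately a quarter more items" than Scopus
dimensions_citations = 1.0
scopus_citations = 1.60       # "60% more citations" than Dimensions


def citations_per_item(citations: float, items: float) -> float:
    """Average number of tracked citations per indexed item."""
    return citations / items


scopus_density = citations_per_item(scopus_citations, scopus_items)
dimensions_density = citations_per_item(dimensions_citations, dimensions_items)

# How many times denser Scopus's citation graph would be, per indexed item.
density_ratio = scopus_density / dimensions_density
print(f"Implied Scopus-to-Dimensions citations-per-item ratio: {density_ratio:.1f}")
```

If the claims are comparable, Scopus would be tracking roughly twice as many citations per indexed item as Dimensions. That could reflect gaps in the citation data available to Dimensions at launch, differences in counting, or both.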

Third, Dimensions is built around a modern business model. Basic individual access, including powerful searches of the article database and links from it to grants and patents, is free and requires no registration. You will be able to try it here by the time this post goes live. This free version includes much of the functionality of Scopus and Web of Science, enough that it may undercut them at some types of institutions. Digital Science CEO Hook emphasized to me that the underlying citation “data are a commodity and should be made as freely available as possible. Where one should make a margin is by innovating on top.”  Dimensions encourages individuals to register for the free service using their university accounts, so that more advanced features can be automatically enabled if and when their university licenses the institutional model — a smart move not only for that reason, but also because those accounts will no doubt be used in sales efforts to indicate the level of demand among university affiliates.

The institutional model, called Dimensions Plus, has an annual price tiered according to institution type from approximately $10,000-30,000, providing more of the use case for the research analytics tools. It will appeal to universities that want to search and analyze research grants, clinical trials, and patents, and conduct various kinds of analyses, visualizations, and comparisons. There is also a version including tools designed to appeal to a publisher or research funder. For early institutional adopters, the base price will include access to a powerful API that will allow additional services to be built on or integrated with the underlying database.

And finally, as a result of integrations with ReadCube, for much of its article content Dimensions features indexing of article full text for discovery and can offer a seamless reading experience. On the discovery side, full text is indexed for 50 million articles, either from a public source like PubMed Central or arXiv or from a publisher when it has an agreement in place for ReadCube to make its content discoverable. Dimensions is therefore a more powerful discovery tool than the other citation databases, which do not index article full text for discovery. This development raises the interesting question of whether Elsevier could use Mendeley holdings as a way to bring full text searching to Scopus as a kind of “non-consumptive use” (STM’s “voluntary principles for article sharing on scholarly collaboration networks” appear to be silent on the matter).

On the access side, because “we want to replicate the simplicity of SciHub,” Dimensions uses the ReadCube PDF reader to display articles inside the Dimensions service, enabling truly seamless access to them for universities that choose to enable the feature, at an annual price of an additional $3,000-7,000. This integration of all university journal licenses into the Dimensions user experience will work for all content licensed from whatever publisher (including majors like Elsevier and Springer Nature). In addition, over time, every registered ReadCube user will receive a Dimensions account. As a result of the full-text indexing and seamless article access, and if it expands to cover more content and draw in many registered users, it is possible to see Dimensions growing into a primary discovery starting point for scientific researchers. It will be interesting to see whether this possibility draws more publishers to pay for ReadCube integration or instead raises the risk of attrition among them.

Portfolio Strategy

Beyond the features of the product itself, today’s Dimensions news is an indication that competition is continuing to heat up in scientific analytics and literature discovery.

Christian Herzog emphasized to me his view that Dimensions’ value to publishers is stronger than that of Scopus precisely because Digital Science is not tied to a publisher like Elsevier. This caught my attention because I have been so curious if Digital Science would eventually be merged into Springer Nature following its IPO. I asked Hook about the development timeline for Dimensions, and he indicated that development work began about 18 months ago, although the roadmap was in place earlier than that. It may be of interest to industry observers that this timeline syncs up perfectly to July 2016, exactly when BC Partners, part owner of Digital Science sibling Springer Nature, failed in its effort to acquire the businesses now known as Clarivate. The timeline for the development of this Dimensions launch clarifies for me that there almost certainly was a time when a merger of Digital Science into Springer Nature was anticipated, whether or not such a merger will take place in the future. It also underscores that Digital Science entering the citation database product category cannot be a surprise to its competitors.

Dimensions is composed of contributions from six of the Digital Science portfolio companies. This is worthy of note. Dimensions is the strongest indication to date that Digital Science is growing beyond a sector-specific financial investor. Dimensions is a reflection of what deep integration looks like by a sophisticated corporation.

Today’s news provides further evidence that, as competition rises for the discovery and analytics elements of the emerging researcher workflow sector, it does so in a way that strengthens the underlying march towards integration among a very small number of major players.

Of course, we can expect to see others attempt to compete with what is, in one way of thinking, the Clarivate, Digital Science, and Elsevier troika. What can we expect to see from the Center for Open Science? From ResearchGate? From the Chan Zuckerberg Initiative’s Meta? Each of these three is nipping at one or more significant aspects of the workflow and is deserving of continued attention.

For the largest providers, portfolio dynamics are if anything more important than individual products. Will Elsevier be inclined to continue its various partnerships with Digital Science? Will the collaborations through CrossRef, which are closely scrutinized by certain strategists, shift anew? How big a “moat” is the Journal Impact Factor for Clarivate’s other products? Will the significant pricing pressure likely to be experienced by Scopus and Web of Science as a result of Dimensions limit the opportunities for lock-ins with, respectively, Pure and Converis? Will Dimensions create new opportunities for lock-ins within the Digital Science portfolio?

Other publishers that are not in the workflow game have considerations of their own. Will they view Dimensions as a positive development, by stiffening Elsevier’s competition? Or negatively, not least because the discovery and access features of Dimensions may further reduce their opportunity for a direct relationship with authors?

Ultimately, this emerging battle of the research workflow portfolio titans matters tremendously for academia. Universities are faced with the need to develop a strategy for outsourcing core research infrastructure if they are to benefit from the competition in the research workflow platform market. Innovation and competition in this market are ultimately good for scientists themselves, their universities, and science itself.

Roger C. Schonfeld

Roger C. Schonfeld is the vice president of organizational strategy for ITHAKA and of Ithaka S+R’s libraries, scholarly communication, and museums program. Roger leads a team of subject matter and methodological experts and analysts who conduct research and provide advisory services to drive evidence-based innovation and leadership among libraries, publishers, and museums to foster research, learning, and preservation. He serves as a Board Member for the Center for Research Libraries. Previously, Roger was a research associate at The Andrew W. Mellon Foundation.

Discussion

9 Thoughts on "A New Citation Database Launches Today: Digital Science’s Dimensions"

All this should be viewed in the light of Roger’s earlier blog on the question of who owns Digital Science and the response within Digital Science (https://www.digital-science.com/blog/news/owns-digital-science-question/).
The issue appears to me to be the following: who will impose and control the tools to evaluate journals. Maxwell harassed Garfield for decades on this exact point. Now, the struggle continues, but it is shaped by different competitors and contexts. How about a public database of citations?

Fascinating development in the marketplace. Just FYI … Digital Science just confirmed for me on Twitter (https://twitter.com/DSDimensions/status/952926830561984513), that while “Dimensions encourages individuals to register for the free service using their university accounts” is true … that functionality was not ready for launch. No account creation available at this time. Disappointing.

What I find somewhat disconcerting is how Dimensions is paying lip service to the idea of openness, while practicing something very different. When Digital Science CEO Daniel Hook is saying, as quoted here by Roger: “[citation] data are a commodity and should be made as freely available as possible. Where one should make a margin is by innovating on top” this echoes statements in the Dimensions white paper (https://doi.org/10.6084/m9.figshare.5783094.v1):

“Dimensions is an example of the power of making metadata including citations publicly available, in order to stimulate innovation and novel solutions / tools. Dimensions has been developed with the same goal in mind: Making good quality, consistent and linked metadata available to the community not just to ensure access for all but to stimulate creativity. So much can be done with these data and to create innovation that supports research.”

In practice though, Dimensions, while perhaps partly building on publicly available data (e.g. from oaDOI), is not contributing to it. The freely accessible version of Dimensions might be very useful for certain purposes, but it doesn’t allow access, export and (re)use of the underlying (meta)data, thereby remaining a commercial party’s closed silo. This is very different from building on open data and, as one business model, charging for the value of all (in a paid model) or some (in a freemium model) of these functionalities, while ensuring that the underlying data are and will remain publicly available. Then citation data would also no longer be a commodity, but truly a public good.

And of course this is only one possible sustainability model. Following Jean-Claude Guédon’s leading question in the comment above: What would be needed to create an enriched citation database with resources like I4OC, ORCID, Grid.ac (from DigitalScience), and the CrossRef funder registry, and ideally, make this openly available as well (through a different business model than mentioned above, obviously)? What would be the role that Wikidata, Wikicite and Scholia could play in this field?

I do not blame Digital Science for investing in Dimensions the way they did, and positioning it in the market the way they chose. However, it feels uncomfortable to see them doing so while simultaneously extolling the benefits that openly available data can provide.

Thank you for this comment, Bianca. I think we are seeing many efforts to market new products as being “comparatively” open. One element of my interest are cases where a product can be more open than its alternative yet still display powerful forms of lock-in. As usual, evaluating even seemingly friendly marketing claims and seeking the best solutions is essential.

Thanks Roger. One other direction I’m thinking in is to what extent it would be possible for more closed-in solutions to nonetheless contribute to open data and infrastructure and the quality thereof. One example is the non-exclusive integration of oaDOI/Unpaywall into both Web of Science and Dimensions, a partnership that enables oaDOI/Unpaywall in turn to further develop their database and open API. And I wonder whether citation databases could contribute to the quality of citations in I4OC, which currently often still lack DOIs and are otherwise incomplete*, while still leveraging their custom functionality inside their own product? Something for Metadata2020.org?

*also noted by CWTS in their latest post, see last paragraph: https://www.cwts.nl/blog?article=n-r2s234&sthash.lInLf4Uz.mjjo

Dear Bianca,

Thank you for your comment here. I think that it’s important to make a few clarifying statements regarding Digital Science and Dimensions for context.

Digital Science works with its portfolio of companies to provide software to researchers. It is not a publisher. We have built Dimensions as a software tool on three principal routes: Firstly, open data available to all (e.g. PubMed, I4OC, GtR); secondly, data available under permissive licence (e.g. CrossRef) and finally, data available from publishers, funders or other third parties under licence to us.
You know that Digital Science is committed to ‘being as open as possible’ – we made, for example, the GRID database openly available under a CC0 license to help solve the challenge of institutional identifiers.

Concerning the open citation initiative I4OC, we see ourselves as strong advocates. Indeed, Digital Science portfolio companies Altmetric and Figshare have been supporters of the I4OC initiative since the beginning. The I4OC collaboration has allowed us to create Dimensions: A tool that shows the power of the open approach and which showcases tangible benefits to the many publishers contributing to I4OC. But citation data needs to be provided by the publishers who published the articles. We support this wherever we can and hope that Dimensions can be a tool to help in the argument for publishers to engage. It is clear to us that greater academic value can be created from those data by us and by others in the future.

Dimensions is an example of the kind of tool that can be created if the research community works together to make data more open – and we have responded in kind by making the citation and publications search elements of the platform freely available for anyone to use. The resource needed to develop and maintain a tool such as Dimensions, however, required that we find a way to ensure its long-term sustainability. Although this means that we had to put a commercial model in place, we hope that by offering a more comprehensive dataset and an API with fewer restrictions than has been produced thus far by other actors in this area, we will still achieve our goal of empowering research organisations. In addition, we will make any of the data available for research purposes.

On launch, we disabled any individual login for the free version of Dimensions – we didn’t want to increase the barrier to trying it out. We are also in the final stages of testing a single sign-on integration for institutional clients. The option to log in to the Dimensions site, which will be available in the first week of February, will also enable download capabilities.

There are, of course, a lot of things which still need to be done across the platform – this includes a large amount of work on ORCID integration and improving categorisation approaches, to name just two items on the long list. We decided that it was better to release earlier, so that we can take the discussion that we’ve been having with development partners into a broader space and engage more widely.

We have tried to position this product honestly for what it is: An innovation using public and open data where possible, and data that we have managed to licence. It is currently “as open as possible”.

In moving scholarly search closer to open it is difficult (perhaps even disingenuous) not to use language of open research and to extol the values of open data. We are strong supporters of open research and we believe that Dimensions takes us another step closer to that. Much of the data we have used are available for download already from their individual sources, meaning anyone with a mind to innovate on those data is already free to do so.

If we are to achieve an entirely open research environment then it will take many steps, such as the one that we have just taken. We believe that our approach and APIs will help the sector to innovate and to take those next steps. We welcome the next innovation, the next tool and the next open data set that pushes all of us further forward.

The discussion via Twitter and comments is useful and needed. We also would like to offer to have an open conversation either in person or via a video call, to hear any further feedback you might have and explore possibilities to move things forward. Please let us know via info@dimensions.ai if you wish to join such a conversation and we will set it up!
