Friday, 16 April 2021

The next generation discovery citation indexes — a review of the landscape in 2020

 Source: https://medium.com/a-academic-librarians-thoughts-on-open-access/the-next-generation-discovery-citation-indexes-a-review-of-the-landscape-a-2020-i-afc7b23ceb32

The next generation discovery citation indexes — a review of the landscape in 2020 (I)

Oct 7, 2020

Some Discovery Citation Indexes in 2020

In terms of cross-disciplinary citation indexes used for discovery, everyone knows the two incumbents: Web of Science and Scopus (2004). Joined by the large, web-scale Google Scholar (2004), these three reigned as the “Big 3” of citation indexes for roughly a decade, more or less unchallenged.

However, around 2015 and in the years after, roughly 10 years later, a new generation of citation indexes started to emerge to challenge the Big 3 in a variety of ways.

At the time of writing in 2020, some of these new challengers have had a couple of years of development. How do things look now?

First off, using newer techniques and paradigms, we have for-profit companies like Digital Science launching Dimensions (2018), which strikes me as a challenger to Scopus and Web of Science in the arena of citation/bibliometric assessment, just as Scopus itself was a challenger to the older Web of Science back in 2004.

On the other end of the spectrum, we have the rise of more “open” citation indexes. A very important player in this area is the relaunched Microsoft Academic (2016), which not only uses web-crawling technologies like Google Scholar's to scour the web, but also applies the latest in Natural Language Processing (NLP)/“semantic” technologies and makes its dataset, dubbed Microsoft Academic Graph (MAG), available under open licenses.

Semantic Scholar (2015) is yet another project with Microsoft ties (funded by the Allen Institute for AI) that plays in the same arena and releases data under open licenses (S2ORC, the Semantic Scholar Open Research Corpus, is the newer version, with some significant differences from the older Semantic Scholar Open Research Corpus). One of the more “semantic” features of this search engine is that it uses machine learning to classify each citation by whether it cites background, methods or results.

While scite (2018), a new citation index by a startup, does not provide open data, its selling point is the use of NLP to classify citation relationships as “Supporting”, “Disputing” or “Neutral”, which is yet another way of contextualizing research by describing citation relationships.

Besides the two well-funded think-tank projects mentioned above, we also see more grassroots-style movements such as 2017's I4OC (Initiative for Open Citations), an amazingly successful push to get publishers to deposit references in Crossref and make them open. There are also efforts by OpenCitations.net (a founding member of I4OC) to extract citations from open access papers in PMC to produce the OpenCitations Corpus (OCC). Both have served to further increase the pool of scholarly metadata and citations available in the public domain/CC0.
