
Sentiment and Intent Classification of In-Text Citations Using BERT

EasyChair Preprint 7593

17 pages · Date: March 17, 2022

Abstract

When quantifying the quality or impact of research output, methods such as the h-index and the journal impact factor are commonly used by the scientific community. These methods rely primarily on citation frequency, without taking the context of citations into consideration. Furthermore, they weigh each citation equally, ignoring valuable citation characteristics such as citation intent and sentiment. The correct classification of citation intents and sentiments could further improve scientometric impact metrics. In this paper, we evaluate BERT for intent and sentiment classification of in-text citations of articles contained in the database of the Association for Computing Machinery (ACM) library. We analyse various BERT models that are fine-tuned on appropriately labelled datasets for citation sentiment classification and citation intent classification. Our results show that BERT can be used effectively to classify in-text citations. Additionally, we find that shortening the citation context can significantly improve in-text citation classification. Lastly, we evaluate these models on a manually annotated test dataset for sentiment classification and find that BERT-cased and SciBERT-cased perform best.
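As a rough illustration of the kind of fine-tuning pipeline the abstract describes, the sketch below fine-tunes a SciBERT model for three-class citation sentiment classification with the Hugging Face transformers library. The model name, label set, context length, and toy training examples are assumptions made for illustration only; they are not taken from the paper.

# Hypothetical sketch: fine-tuning a BERT-style model for citation sentiment
# classification (0 = negative, 1 = neutral, 2 = positive). Model name, labels,
# and data are illustrative assumptions, not the authors' exact setup.
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)
from datasets import Dataset

MODEL_NAME = "allenai/scibert_scivocab_cased"  # assumed; the paper compares several BERT variants

# Toy in-text citation contexts with sentiment labels.
train_examples = {
    "text": [
        "Their approach [12] fails to scale beyond small corpora.",
        "We follow the evaluation protocol of [7].",
        "The method of [3] achieves state-of-the-art accuracy.",
    ],
    "label": [0, 1, 2],
}

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=3)

def tokenize(batch):
    # Truncating to a short window around the citation reflects the paper's
    # finding that shorter contexts can improve classification.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

train_ds = Dataset.from_dict(train_examples).map(tokenize, batched=True)

args = TrainingArguments(output_dir="citation-sentiment",
                         num_train_epochs=3,
                         per_device_train_batch_size=8)
Trainer(model=model, args=args, train_dataset=train_ds).train()

The same skeleton applies to citation intent classification by swapping in an intent-labelled dataset and adjusting num_labels accordingly.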

Keyphrases: BERT, citation analysis, neural networks, text classification

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:7593,
  author       = {Ruan Visser and Marcel Dunaiski},
  title        = {Sentiment and Intent Classification of In-Text Citations Using BERT},
  howpublished = {EasyChair Preprint 7593},
  year         = {EasyChair, 2022}}