Creating a content-based citation analysis system for English and Polish

Description of the project

The outputs of scholarly communication are essential for disseminating the results of scientific research. When findings are presented, the relationships established between studies are critically important, and citations are the most fundamental element that provides these relationships. Recently, however, citations have often been used to evaluate the performance of research and researchers. Researchers are evaluated by the number of citations indexed in databases such as Web of Science or Scopus.

The most cited authors are identified as the most prominent in their fields, and tenure and awards are granted with citation counts in mind. Many studies in the literature criticize the use of citations as quantitative data for evaluating research and researcher performance. The main reason for this criticism is the “publish or perish” culture in academia. The drastic changes in scholarly communication processes have recently given rise to the concepts of “perfunctory citations”, “citation manipulation” and “citation cartels”. Authors cite papers that they have never read, and they prefer citing their colleagues to citing their competitors. In addition, editors and peer reviewers of scholarly journals coerce authors into adding citations in order to raise their journals’ impact factors, a metric that depends on the number of citations received.

To address these problems and raise awareness among the actors of scholarly communication, the library and information science (LIS) literature offers an approach called “content-based citation analysis”. Content-based citation analysis makes it possible to weight citations (e.g., by distinguishing positive, negative and neutral citations) and to reveal citation motivations. The main aim of my project is to develop a system for content-based citation analysis of English and Polish citations using semantic and syntactic structures. The LIS literature was chosen for analysis because citation styles differ from field to field.

In this context, the main steps of the project are as follows:

  • Creating taxonomic citation schemes based on the meanings, shapes, purposes and arrays of citations.
  • Extracting citation sentences (citances) from full-text documents.
  • Automatically classifying citances into the taxonomic citation classes using natural language processing methods.
  • Comparing citation motivations in the English, Polish and Turkish literatures.
  • Comparing the differences and similarities between the international LIS literature and the regional ones (Poland and Turkey) using social network analysis techniques.
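
To make the extraction and classification steps above more concrete, here is a minimal Python sketch. It is not the project's actual system. The citation pattern (author-year citations in parentheses), the cue-word lists, and the function names `extract_citances` and `classify_citance` are all assumptions made for illustration. A real system would rely on the taxonomic scheme and on proper natural language processing models for English and Polish.

```python
import re

# Assumed pattern: a parenthesised group containing a four-digit year,
# e.g. "(Smith, 2019)" or "Jones et al. (2020)". Numeric styles such as
# "[12]" are deliberately not covered in this sketch.
CITATION_RE = re.compile(r"\([^()]*\b(?:19|20)\d{2}[a-z]?\b[^()]*\)")

# Naive sentence boundary: end punctuation, whitespace, then an uppercase
# letter (English or Polish). This avoids splitting after "et al." when
# the next character is "(".
SENTENCE_RE = re.compile(r"(?<=[.!?])\s+(?=[A-ZĄĆĘŁŃÓŚŹŻ])")

# Illustrative cue-word stems only; not the project's taxonomy.
POSITIVE_CUES = ("confirm", "support", "improv")
NEGATIVE_CUES = ("fail", "contradict", "flaw")


def extract_citances(text):
    """Return the sentences of `text` that contain at least one citation."""
    sentences = SENTENCE_RE.split(text.strip())
    return [s for s in sentences if CITATION_RE.search(s)]


def classify_citance(citance):
    """Label a citance as positive, negative or neutral using cue stems."""
    words = re.findall(r"\w+", citance.lower())
    if any(w.startswith(NEGATIVE_CUES) for w in words):
        return "negative"
    if any(w.startswith(POSITIVE_CUES) for w in words):
        return "positive"
    return "neutral"


if __name__ == "__main__":
    sample = (
        "Citation counts are widely used. "
        "Our results confirm the findings of Smith (2019). "
        "The method of Jones et al. (2020) failed on short texts. "
        "Citations were collected from Scopus (Lee, 2018)."
    )
    for citance in extract_citances(sample):
        print(classify_citance(citance), "|", citance)
```

Cue-word matching is used here only because it is easy to read. It misses negation, irony and context, which is one reason the project turns to semantic and syntactic structures instead.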

The project is financed by NAWA, the Polish National Agency for Academic Exchange. Zehra Taşkın is the Principal Investigator.

Photo by Stanislav Kondratiev on Unsplash