This chapter is intended to inspire ideas for the practical use of the tools discussed in the previous chapter and to help practitioners connect information needs with the types of bibliometric analyses that might help respond to these needs. It is appropriate here to remind ourselves of the very important challenge of using metrics responsibly, as discussed in chapter 1. If you are unsure where to start, the guidance available from the SCOPE Framework (INORMS Research Evaluation Group 2020) challenges us to think first about the “value of the entity” that is being evaluated or measured before using any bibliometric analysis. Using the SCOPE Framework as a starting point reminds us to use bibliometric analysis only if it matches our values and to avoid the streetlight effect (Freedman 2010) of counting things only because they can be counted. The SCOPE Framework also presents a very helpful decision matrix to help identify the amount of risk or impact involved depending on the motivation for using the analysis and what entity level is being evaluated (figure 3.1).
Somewhat aligned with this risk matrix, the applications of bibliometrics that will be discussed here are broken up into four types:
Of course, there are likely infinite ways to organize the different analysis types. However, these groupings also loosely align with the various bibliometric services offered at academic libraries and institutions. In presenting the applications, an attempt has been made to generalize the data and other details so that they can be adapted to unique situations; in some cases, however, there are references to more descriptive materials, and it is recommended that you pursue these for further guidance and detail. Step-by-step instructions cannot be captured here, as these tools are constantly changing and evolving. The idea is to get a sense of the possible.
There are already some excellent sources that define and describe bibliometric indicators. Measuring Research: What Everyone Needs to Know by Cassidy R. Sugimoto and Vincent Larivière (2018) is a particularly succinct, yet thorough, recent review of the main bibliometric indicators, their limitations, and considerations in interpreting the data. Somewhat more dated, yet still very relevant, is Measuring Academic Research: How to Undertake a Bibliometric Study by Ana Andrés (2009), which divides the indicators into several groups: descriptive indicators, author production, journal productivity, collaborations, author citations, and journal citations. These sources and those provided in the reference list are excellent resources for understanding the breadth of indicators that could be used in any bibliometric study. This report takes a more practical approach to the use of these indicators and attempts to describe the use of bibliometrics for specific service-oriented applications that a practitioner may encounter.
With a focus on the practitioner, this report also has to acknowledge that the full spectrum of bibliometric methodologies cannot possibly be covered.1 Certainly, there are highly skilled expert-level practitioners and teams with exceptional experience in a variety of complex analyses, using tools and skills that go beyond the reach of this report, such as data science methods with machine learning algorithms or complex relational databases. Instead, this report focuses on the use of the bibliometric tools discussed in chapter 2, attempting to give the entry-level to mid-level practitioner some guidance on the various applications of these tools. Keep in mind, however, that most bibliometric practitioners will need to develop some comfort with downloading data and analyzing it outside the bibliometric tools themselves. At the very least, developing proficiency with Excel pivot tables is certainly a good start.
A university library’s collection development department wanted to know how many authors from its institution had published with a particular publisher in recent years, to inform an assessment of the potential impact of a recent transformative agreement. Because complete publisher and journal data can be difficult to extract from some of the main bibliometric tools, Crossref was used as the data source, with an API query built from the affiliation name and the publisher’s member code. With this analysis, the library was able to determine the count of affiliated articles published each year in the publisher’s journals, and the analysis could also aid in year-over-year costing predictions if needed (figure 3.2). A minimal sketch of this kind of query appears after the links below.
Tools used: Crossref REST API and supporting documentation on GitHub.
Crossref REST API
GitHub supporting documentation
https://github.com/CrossRef/rest-api-doc#resourcecomponents
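The exact query will depend on local needs, but a minimal sketch of this kind of Crossref pull, assuming a hypothetical affiliation string and publisher member code (member codes can be looked up via the API’s members route), might look like the following. Note that affiliation metadata in Crossref is unevenly populated, so counts should be validated against another source.

```python
# Hypothetical sketch: count affiliated journal articles per year for one
# Crossref member (publisher). The member ID and affiliation string below
# are placeholders -- substitute your own values.
import requests
from collections import Counter

MEMBER_ID = "1234"                     # placeholder publisher member code
AFFILIATION = "University of Example"  # placeholder affiliation string
BASE = f"https://api.crossref.org/members/{MEMBER_ID}/works"

params = {
    "query.affiliation": AFFILIATION,
    "filter": "type:journal-article,from-pub-date:2018-01-01",
    "rows": 1000,
    "cursor": "*",                       # enables deep paging
    "mailto": "your.email@example.edu",  # polite-pool etiquette
}

counts = Counter()
while True:
    message = requests.get(BASE, params=params, timeout=60).json()["message"]
    if not message["items"]:
        break
    for work in message["items"]:
        date_parts = work.get("issued", {}).get("date-parts", [[None]])
        year = date_parts[0][0] if date_parts and date_parts[0] else None
        counts[year] += 1
    params["cursor"] = message["next-cursor"]

for year in sorted(y for y in counts if y):
    print(year, counts[year])
```

The yearly counts can then be dropped into a spreadsheet to support the costing predictions mentioned above.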
Transformative agreements are shifting the way publishers and libraries do business. The not-for-profit Jisc represents the UK higher education sector as a consortium and negotiates deals on behalf of numerous member academic institutions. It has an interest in monitoring the impact of transformative agreements and recently presented the methods used in its investigation at the Bibliometrics and Research Impact Community Conference (Harris 2022). The presentation slides point to several valuable resources Jisc has created that may be of interest to the practitioner.
Jisc
Bibliometrics does not need to be complicated. When trying to understand collection development needs, bibliometrics can be used to tailor a core journal list to a particular research area, a group of researchers or a single author, an institution, or all of these together.
For example:
Galter Library at Northwestern University (Pastva et al. 2020) was interested in using patent literature to help identify highly cited journal publications within the health sciences research domain and in determining whether these journals corresponded with usage within its existing collections (obtained from COUNTER usage statistics) and with the Journal Impact Factor (obtained from InCites Journal Citation Reports). The library used Dimensions as the bibliometric data source and was able to obtain Northwestern author/inventor patent information as well as the journal article information cited within these patents. It found that the Journal Impact Factor did not correspond with the citation data or usage data, calling into question its utility in making collection development decisions; the journals most cited in the patent literature, however, did correspond with the usage data. From this analysis, Northwestern found that its existing collections aligned with the journals identified in its patent-citation analysis, and a different set of top-cited, or core, journals was identified that could be used to help strengthen its collection development decisions. The visualization used to communicate these findings (not reproducible here) plotted each journal along the x-axis in decreasing order of patent citation counts, with secondary plots of the Journal Impact Factor and usage along the y-axis. This work illustrates that patent citations align more closely with usage counts than with the Journal Impact Factor. A sketch of this kind of comparison follows the tools summary below.
Summary of tools: Dimensions (user interface and API), COUNTER, InCites, Excel, Python (and Jupyter), Tableau.
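The heart of such a comparison is a per-journal merge of the three measures followed by a rank correlation. A hypothetical sketch, not the Galter Library code and with made-up file and column names, might look like this:

```python
# Merge per-journal patent-citation counts, COUNTER usage, and Journal
# Impact Factor, then check how similarly the measures rank journals.
import pandas as pd

patent_cites = pd.read_csv("patent_citations_by_journal.csv")  # journal, patent_citations
usage = pd.read_csv("counter_usage.csv")                       # journal, total_item_requests
jif = pd.read_csv("jcr_impact_factors.csv")                    # journal, jif

journals = (
    patent_cites.merge(usage, on="journal", how="inner")
                .merge(jif, on="journal", how="inner")
)

# Spearman rank correlations: do patent citations track usage and/or JIF?
print(journals[["patent_citations", "total_item_requests", "jif"]]
      .corr(method="spearman"))

# Candidate core list: top journals by patent citations, with the other
# measures alongside for context.
print(journals.sort_values("patent_citations", ascending=False).head(20))
```

A weak correlation between patent citations and the Journal Impact Factor alongside a stronger correlation with usage would be consistent with the pattern Galter Library reported.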
The following papers have not been referenced in this section, but they may be of interest to the reader.
Bangani, Siviwe, and Michiel Moll. 2021. “Scattering of Journals Cited in Legal Theses and Dissertations.” Journal of Librarianship and Information Science, OnlineFirst, August 2021. https://doi.org/10.1177/09610006211036725.
Davis, Sarah, and Jan Thomas. 2020. “Bibliometrics in the Library: Beyond Research Evaluation.” In “BibSymp20 Panel: Bibliometric Services 1.” Panel presentation, Bibliometrics and Research Assessment Symposium 2020, online, October 7–9. YouTube video, 40:03–58:18. https://youtu.be/HbRBUDfkRWc?t=2403.
Martindale, Tyler. 2020. “More Than Collection Development: Using Local Citation Analysis to Begin a Career in Business Librarianship.” Collection Management 45, no. 4: 321–34. https://doi.org/10.1080/01462679.2020.1715315.
Pastva, Joelen, Jonathan Shank, Karen E. Gutzman, Madhuri Kaul, and Ramune K. Kubilius. 2018. “Capturing and Analyzing Publication, Citation, and Usage Data for Contextual Collection Development.” Serials Librarian 74, no. 1–4: 102–10. https://doi.org/10.1080/0361526X.2018.1427996.
Stribling, Judy C., Matthew S. Robbins, and Antonio P. DeRosa. 2020. “Mapping the Literature of Guillain–Barre Syndrome to Support Current Awareness among Neurologists.” Journal of Hospital Librarianship 20, no. 2: 111–19. https://doi.org/10.1080/15323269.2020.1738839.
Watwood, Carol L., and Terry Dean. 2019. “Mapping the Literature of Dental Hygiene: An Update.” Journal of the Medical Library Association 107, no. 3: 374–83. https://doi.org/10.5195/jmla.2019.562.
The proliferation of university rankings has captured the attention of academic institutions around the globe, with administrative units contributing to the data submission and validation processes and including their ranking scores within university communications. It is now very common to see a ranking score on the splash page of a university website. However, rankings have been heavily criticized for reducing institutions to a few quantitative measures that mask significant nuance in the actual research and teaching missions of the institutions being evaluated (Gingras 2016). There is also a lack of consistency and transparency within and between the methodologies used by the ranking bodies. As a result, rankings can seldom be compared from year to year and certainly cannot be compared to each other. Despite this, participation in rankings is not slowing down. For example, since Times Higher Education (THE) introduced the UN Sustainable Development Goals as an assessment benchmark in its rankings in 2019, there has been considerable uptake, with participating institutions growing from 467 in 2019 to 1,410 in 2022. This increased attention to rankings means that bibliometric indicators are also gaining greater attention. Access to the methodologies is therefore important if institutions are going to understand, keep up with, and perhaps even push back on how they are being ranked. It is in institutions’ best interest to be able to understand and respond to changes in their rankings.
It is, however, important to understand the motivations behind rankings and, for that matter, the use of any indicator. As Yves Gingras (2016) discusses in his very on-point book Bibliometrics and Research Evaluation: Uses and Abuses, there are lessons to be learned from reviewing the impacts of national exercises such as the UK Research Excellence Framework (REF), the bibliometrics-based university-funding formulas of Australia and Flanders (Belgium), and the French grandes écoles. At best these rankings provide flimsy proxies for more time-consuming qualitative measures such as peer review or the nuanced pursuit of truth via academic rigor; at worst, the specific indicators chosen bias the outcomes toward preconceived notions of rank. For example, a focus on total outputs favors larger, better-funded institutions, which can be clearly seen in the overrepresentation of privately funded American universities in many national and international rankings.
The major ranking organizations are ShanghaiRanking, also known as the Academic Ranking of World Universities (ARWU); the THE World University Rankings; the QS World University Rankings; and the Centre for Science and Technology Studies (CWTS) Leiden Ranking. For a more complete list, the IREG Observatory maintains an inventory of international rankings (IREG Observatory 2021); nationally significant rankings, however, are not covered there.
Each year QS asks submitting institutions to validate the data that is to be included in their ranking. The data shared with the institution for validation is a combination of institutionally submitted data and Scopus source data that QS extracts and analyzes. The bibliometric data includes the gross number of papers, gross number of citations, net number of papers, normalized number of papers, net number of citations including self-citations, net number of citations excluding self-citations, and normalized number of citations. The methodologies QS applies to generate these values are detailed in its methods documentation. Although the methodology cannot be completely replicated, the data is based on institutionally affiliated documents within a specified five-year publication window. To obtain all the necessary bibliographic metadata for each article, the SciVal data set needs to be employed, because information such as the number of affiliations and the All Science Journal Classification scheme used in the methodology is not available directly from Scopus. This validation exercise gives institutions some control over the data being used in the ranking and also provides an opportunity to learn more about the methodology, along with its strengths and weaknesses; it can inform institutions about areas where they may see opportunities for growth or where they may prefer to remain less active.
QS Methods Documentation
https://support.qs.com/hc/en-gb/sections/360005689220-Methods
Even when the ranking organization does not involve the ranked institution in a data validation process, undertaking a data validation may still be of interest to the institution. For example, an institution might want to understand what influenced a recent increase or decrease in its position in the ShanghaiRanking global ranking of academic subjects. Using the methodology information provided by ShanghaiRanking, it could attempt to replicate the indicators used in the ranking, which include the number of publications in journals in the top Journal Impact Factor quartile (Q1), the Category Normalized Citation Impact value, the number of publications with international collaborations, and the number of publications in journals receiving the highest number of votes in the ShanghaiRanking Academic Excellence Survey. Keeping up to date with the institution’s bibliometric indicators may aid in trend analysis or even identify high-performing research areas previously unrecognized internally. A minimal sketch of this kind of replication appears after the links below.
ShanghaiRanking Methodology
https://www.shanghairanking.com/methodology/gras/2022
ShanghaiRanking Academic Excellence Survey
https://www.shanghairanking.com/activities/aes
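As a rough illustration, and not ShanghaiRanking’s actual procedure, three of these indicators could be approximated from an exported, subject-filtered publication list (for example, an InCites download) along the following lines; the survey-based indicator would additionally require the published list of top journals from the Academic Excellence Survey. File and column names here are assumptions.

```python
# Approximate three ShanghaiRanking-style indicators from a publication-level
# export. Assumed columns: jif_quartile, cnci, is_international_collab.
import pandas as pd

pubs = pd.read_csv("institution_subject_publications.csv")

indicators = {
    "Q1 publications": int((pubs["jif_quartile"] == "Q1").sum()),
    "Mean CNCI": round(pubs["cnci"].mean(), 2),
    "International collaborations": int(pubs["is_international_collab"].sum()),
    "Total publications": len(pubs),
}
for name, value in indicators.items():
    print(f"{name}: {value}")
```

Tracking these values each year makes it easier to see which indicator is driving a change in the subject ranking.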
Bibliometrics can be more informative to institutions when they take a multidimensional approach to the data and break out of the confines of university rankings, which reduce the complex organisms of academic institutions into a single rank-ordered list. The analyses outlined here are therefore only a jumping-off point to give the reader some ideas of how bibliometrics can be applied to planning at the university or strategic level. Using bibliometrics internally gives you more control over the data, and a more detailed story can be built. For example, in Canada the mostly public universities are funded partly based on the types of programs they offer, with medical schools not only being the best funded but also benefiting from a large network of affiliated hospitals and publishing in research areas that typically have high output and high citation rates. Universities without medical schools are therefore going to appear to underperform compared to these other universities. But not all universities can have medical schools. It is thus important to ensure that the benchmarking either compares schools with similar characteristics or takes the differences into account. Otherwise, the bibliometric analyses will be hiding the real story behind a charade of misleading numbers.
An institution was interested in benchmarking against a set of its peer institutions in its country. However, this enters territory similar to ranking, where it is easy to reduce an institution to a rank using only a single (or at best a few) oversimplified indicators. As discussed above, benchmarking should be handled with great care. It is ideal to present a breadth of indicators or to use multidimensional analyses and to ensure that the choices of comparator institutions, data filtering, and selected indicators are reasonable and clearly communicated. The institution therefore decided to use the SciVal data set with the field-weighted citation impact (FWCI) as the main indicator, to evaluate the data over a five-year window, and to present the data in two figures: one including all subject classifications (figure 3.3) and another with publications from the medical sciences and related fields filtered out (figure 3.4). This allowed the institution to see clearly the effect of having a medical school, even on a normalized citation index like the FWCI, and that there is an obvious advantage to focusing on the medical sciences. The message is not necessarily that the university should pursue a medical school or even more research within the medical sciences. Rather, it is clear that the university has strengths outside of these research areas. What these strengths are precisely cannot be determined from this unidimensional analysis, so the institution may want to investigate further. A sketch of the filtering step follows the tools note below.
Tools used: SciVal and Excel.
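The filtering behind figures 3.3 and 3.4 can be done in SciVal itself or on an exported publication list. The following is a minimal sketch under assumed file and column names; which top-level ASJC codes count as “medical and related” is a local decision, and the codes below are only an illustration.

```python
# Compare institutions' average FWCI with and without publications in
# medical and related ASJC fields. Assumed columns: institution, fwci,
# asjc_code (one top-level code per row, for simplicity).
import pandas as pd

pubs = pd.read_csv("scival_publications_export.csv")

# Illustrative set: Medicine, Nursing, Dentistry, Health Professions
MEDICAL_FIELDS = {2700, 2900, 3500, 3600}

def mean_fwci(df: pd.DataFrame) -> pd.Series:
    """Average publication-level FWCI per institution."""
    return df.groupby("institution")["fwci"].mean().round(2)

all_subjects = mean_fwci(pubs)                                          # basis for figure 3.3
non_medical = mean_fwci(pubs[~pubs["asjc_code"].isin(MEDICAL_FIELDS)])  # basis for figure 3.4

print(pd.DataFrame({"All subjects": all_subjects, "Excluding medical": non_medical}))
```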
Following the previous analysis, the institution was interested in understanding more about the research areas that make it stand out at the national level. It therefore used a multidimensional approach that plotted its share of the national output against its FWCI for a series of journal-level research areas based on the All Science Journal Classification (ASJC) from Scopus over a five-year period (figure 3.5). Quadrants were created using the average share of the national output for all subjects and the expected FWCI of 1.00. The subjects that fall in the upper right quadrant are research areas that not only are above the national average but also have a higher impact (based on the FWCI) than expected; these are likely important research areas at this institution. But what about the research areas that have a high FWCI but a low share of the national output? Are these areas not of interest to the institution? It is hard to tell from this analysis, but there are some possible explanations: these research areas have a few highly impactful researchers who consistently maintain this level of impact, these research areas happen to have some outlier publications that have been particularly highly cited during this time frame, or the publications in these research areas also fall under more highly cited research areas and benefit from that association. These explanations are all conjecture, of course. The devil is in the details, and further investigation would help fill in the gaps in the story. A plotting sketch of the quadrant figure appears after the tools note below.
Tools used: SciVal, Excel
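Figure 3.5 can be reproduced from a SciVal export of the share of national output and the FWCI per ASJC research area. The following is a hypothetical plotting sketch with made-up file and column names:

```python
# Quadrant plot: share of national output vs. FWCI per ASJC research area.
import pandas as pd
import matplotlib.pyplot as plt

areas = pd.read_csv("asjc_share_and_fwci.csv")  # columns: asjc_name, share_of_national_output, fwci

fig, ax = plt.subplots(figsize=(8, 6))
ax.scatter(areas["share_of_national_output"], areas["fwci"])

# Quadrant lines: average share across all subjects and the expected FWCI of 1.00
ax.axvline(areas["share_of_national_output"].mean(), linestyle="--")
ax.axhline(1.0, linestyle="--")

# Label each point with its research area name
for _, row in areas.iterrows():
    ax.annotate(row["asjc_name"],
                (row["share_of_national_output"], row["fwci"]),
                fontsize=7, xytext=(2, 2), textcoords="offset points")

ax.set_xlabel("Share of national output (%)")
ax.set_ylabel("Field-weighted citation impact (FWCI)")
ax.set_title("Research areas by national share and FWCI (five-year window)")
plt.tight_layout()
plt.savefig("quadrant_plot.png", dpi=300)
```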
The researchers Pablo García-Sánchez and Manuel J. Cobo (2018) wanted to explore the impact of international collaborations involving researchers from universities within the Andalusian region of Spain. They wanted to know whether publications with more geographically diverse authorship collaborations would see higher citation rates. They used the Dimensions API and Python code to export publications from the nine public universities of Andalusia, identified using the Global Research Identifier Database (GRID). They filtered the publications to include only articles as the publication type and the publication years 2010–2015. They were interested in looking at papers authored by only one university in Andalusia, papers where all the authors belonged to Andalusian universities, papers where all the authors were Spanish and at least one was Andalusian, and finally all the Andalusian-authored papers with coauthors from any region of the world. This provided a very interesting perspective on collaboration networks and the progressively wider geographic spread of authorship collaboration types. The papers in the group with the most geographically diverse authorship collaborations were much more likely to receive a high number of citations. Further details of this study, including all the figures from the analysis, are available in the full paper (García-Sánchez and Cobo 2018). On the other hand, it must be kept in mind that confounding factors could affect the citations received by these authorship collaborations, such as the simple effect that a paper with more authors may receive more citations. Controlling for these variables may be needed to get a clearer picture of the real impact of authorship collaborations. A sketch of the categorization logic appears after the tools note below.
Tools used: Dimensions, Python
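The interesting step is assigning each paper to a collaboration scope based on its author affiliations. The following is a hypothetical sketch of that logic, not the authors’ actual code, using placeholder GRID IDs for the Andalusian universities:

```python
# Classify a paper by the geographic scope of its author affiliations.
# ANDALUSIAN_GRIDS would hold the GRID IDs of the nine public universities
# of Andalusia; the IDs below are placeholders.
ANDALUSIAN_GRIDS = {"grid.example-granada", "grid.example-seville"}

def collaboration_scope(affiliations: list[dict]) -> str:
    """affiliations: one dict per author affiliation,
    e.g., {"grid_id": "grid.example-granada", "country": "Spain"}."""
    grids = {a["grid_id"] for a in affiliations}
    countries = {a["country"] for a in affiliations}

    if not grids & ANDALUSIAN_GRIDS:
        return "not Andalusian"  # outside the study set
    if grids <= ANDALUSIAN_GRIDS:
        return ("single Andalusian university"
                if len(grids) == 1 else "all Andalusian universities")
    if countries == {"Spain"}:
        return "all Spanish, at least one Andalusian"
    return "international collaboration"

# Example: one Andalusian author and one German coauthor
paper = [{"grid_id": "grid.example-granada", "country": "Spain"},
         {"grid_id": "grid.example-berlin", "country": "Germany"}]
print(collaboration_scope(paper))  # -> "international collaboration"
```

Citation counts exported alongside each paper can then be summarized per scope to see whether the more geographically diverse groups indeed receive more citations.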
Interdisciplinarity analyses using bibliometrics are too complex to illustrate in a short example here. Instead, consider the variety of methodologies outlined by Vincent Larivière and Yves Gingras in their book chapter “Measuring Interdisciplinarity” (Larivière and Gingras 2014), in which they note that interdisciplinarity has been measured in the following ways:
Analyses at the research group or individual level require access to author-level data from a bibliometric data source. Although it is possible to create publication sets based on author name or author ID searches within any of the bibliometric tools or their associated data sources, only SciVal currently allows the creation and management of author groups and hierarchies within its system. This provides a great advantage, as groups can be created regardless of the accuracy of the underlying affiliation information. For example, a researcher working as an adjunct faculty member may forget to credit that department in a paper and list only their main institution and department as their affiliation. The other institution may still wish to count that paper in its analysis, and it can do so by including the researcher in a group in SciVal. With this in mind, this section illustrates two examples where the SciVal author tool provides an advantage for the analysis. However, similar analysis may be possible with the other bibliometric tools with a bit of creativity or by working through a few more steps, such as creating a search string of author IDs.
The researchers Maxim Kotsemir, Ekaterina Dyachenko, and Alena Nefedova were interested in the impact of mobility on young, early-career researchers at the National Research University Higher School of Economics. Using the researchers’ curricula vitae, they selected researchers based on age (under 39 years) and grouped them according to whether the researcher had past international educational experiences lasting at least three months. With this set of mobile and nonmobile researchers, they uploaded the researchers’ Scopus author IDs into SciVal and organized them into their respective groups. This enabled them to analyze the two groups based on a number of bibliometric indicators, including number of publications, number of publications per researcher in each group, average number of citations per publication, and the field-weighted citation impact, among others. The study found a positive correlation between mobility and a number of indicators, such as the number of papers, the prestige of the journal (based on the CiteScore), and citations (Kotsemir, Dyachenko, and Nefedova 2021).
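Once the two groups are defined in SciVal, the group-level comparison itself is straightforward. A hypothetical sketch, not the authors’ code, over an exported publication list with assumed column names:

```python
# Compare "mobile" and "nonmobile" researcher groups on simple indicators,
# given a publication-level export where each row carries the group label.
# Assumed columns: author_id, group, citations, fwci.
import pandas as pd

pubs = pd.read_csv("young_researchers_publications.csv")

summary = pubs.groupby("group").agg(
    publications=("author_id", "size"),
    researchers=("author_id", "nunique"),
    citations_per_pub=("citations", "mean"),
    mean_fwci=("fwci", "mean"),
).round(2)
summary["pubs_per_researcher"] = (summary["publications"] / summary["researchers"]).round(2)
print(summary)
```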
The researchers Nicola Cucari, Ilaria Tutore, Raffaella Montera, and Sofia Profita wanted to further analyze a list of top-cited authors in the field of corporate social responsibility that they discovered through a topic analysis in SciVal (Cucari et al. 2022). They were interested in understanding more about the collaboration activities of these authors, based on the assumption that authors with more international collaborations do not always have higher publication output or citations.2 They used the author identifier ORCID to create publication sets from the Scopus database that could be uploaded into VOSviewer for network analysis. The resulting visualization of the coauthorship analysis by country illustrated the relative productivity of each country in the field of corporate social responsibility and how strongly connected each country was based on the number of coauthored papers. The strength of the connection was visualized by the closeness of the nodes (countries) and the thickness of the edges (number of coauthored papers). Their analysis showed that countries like the United States, which have high productivity, also have many coauthorship links; however, there are also countries, like Australia, that have a good share of coauthorship links but are less productive. This may support the authors’ assumption.
Tools used: SciVal, Scopus, VOSviewer
Andrés, Ana. 2009. Measuring Academic Research: How to Undertake a Bibliometric Study. Amsterdam, The Netherlands: Elsevier.
Cucari, Nicola, Ilaria Tutore, Raffaella Montera, and Sofia Profita. 2022. “A Bibliometric Performance Analysis of Publication Productivity in the Corporate Social Responsibility Field: Outcomes of SciVal Analytics.” Corporate Social Responsibility and Environmental Management, online version of record. https://doi.org/10.1002/csr.2346.
Freedman, David H. 2010. Wrong: Why Experts Keep Failing Us—and How to Know When Not to Trust Them. New York: Little, Brown and Co.
García-Sánchez, Pablo, and Manuel J. Cobo. 2018. “Measuring the Impact of the International Relationships of the Andalusian Universities Using Dimensions Database.” In Intelligent Data Engineering and Automated Learning—IDEAL 2018: 19th International Conference, Madrid, Spain, November 21–23, 2018, Proceedings, Part II, edited by Hujun Yin, David Camacho, Paulo Novais, and Antonio J. Tallón-Ballesteros, 138–144. Cham, Switzerland: Springer.
Gingras, Yves. 2016. Bibliometrics and Research Evaluation: Uses and Abuses. Cambridge, MA: MIT Press.
Harris, Beth. 2022. “Monitoring Transitional Agreements: The Challenges and Successes of Implementing Article Level Metadata Collection.” Presentation at BRIC 2022: Bibliometrics and Research Impact Community Conference, Hamilton, ON, June 25–15, 2022. https://static1.squarespace.com/static/608c20ab643b700aeaad3d9f/t/62b4c824b0fdd44307d2c2da/1656014884570/bric-2022-Monitoring-TAs.pdf.
INORMS Research Evaluation Group. 2020. The SCOPE Framework: A Five-Stage Process for Evaluating Research Responsibly. INORMS Research Evaluation Group. https://inorms.net/wp-content/uploads/2021/11/21655-scope-guide-v9-1636013361_cc-by.pdf.
IREG Observatory. 2021. “IREG Inventory on International Rankings.” https://ireg-observatory.org/en/initiatives/ireg-inventory-of-international-rankings/.
Kotsemir, Maxim, Ekaterina Dyachenko, and Alena Nefedova. 2021. “Publish More or Publish Differently? New Aspects of Relationship between Scientific Mobility and Performance of Young Researchers.” In 18th International Conference on Scientometrics and Informetrics, ISSI2021, 12–15 July 2021, KU Leuven, Belgium: Proceedings, edited by Wolfgang Glänzel, Sarah Heeffer, Pei-Shan Chi, and Ronald Rousseau, 585–96. Leuven, Belgium: International Society for Scientometrics and Informetrics.
Larivière, Vincent, and Yves Gingras. 2014. “Measuring Interdisciplinarity.” In Beyond Bibliometrics: Harnessing Multidimensional Indicators of Scholarly Impact, edited by Blaise Cronin and Cassidy R. Sugimoto, 187–200. Cambridge, MA: MIT Press. https://doi.org/10.7551/mitpress/9445.001.0001.
Makar, Susan, and Amy Trost. 2018. “Operationalizing Bibliometrics as a Service in a Research Library.” Information Outlook (Online) 22, no. 5: 21–34.
Pastva, Joelen, Bart Davis, Karen Gutzman, Ramune Kubilius, and Aaron Sorensen. 2020. “Compelling Evidence: New Tools and Methods for Aligning Collections with the Research Mission.” Serials Librarian 78, no. 1–4: 219–27. https://doi.org/10.1080/0361526X.2020.1701393.
Sugimoto, Cassidy R., and Vincent Larivière. 2018. Measuring Research: What Everyone Needs to Know. Oxford: Oxford University Press.
Figure 3.1. The SCOPE Framework decision matrix, indicating the level of risk or impact by the motivation for the evaluation and the entity level being evaluated (INORMS Research Evaluation Group 2020).

| Motivation | Country | Institutional | Group | Individual |
| --- | --- | --- | --- | --- |
| Analysis (to understand) | Low impact | Low impact | Medium impact | Medium impact |
| Advocacy (to show off) | Low impact | Low impact | Medium impact | Medium impact |
| Accountability (to monitor) | Low impact | Medium impact | Medium impact | High impact |
| Acclaim (to benchmark) | Medium impact | High impact | High impact | High impact |
| Adaptation (to incentivize) | Medium impact | High impact | High impact | High impact |
| Allocation (to reward) | High impact | High impact | High impact | High impact |
Table 3.1: Top 10 journal titles by scholarly output by affiliated author in the health technology research areas (defined by a keyword search). Titles shaded gray overlap with table 3.2. Data source: Scopus/SciVal.

| Scopus Source | Scholarly Output | Views Count | Field-Weighted Citation Impact | Citation Count |
| --- | --- | --- | --- | --- |
| Lecture Notes in Computer Science | 22 | 394 | 1.29 | 97 |
| JMIR mHealth and uHealth | 19 | 488 | 1.19 | 461 |
| Progress in Biomedical Optics and Imaging—Proceedings of SPIE | 14 | 203 | 4.24 | 51 |
| Journal of Medical Internet Research | 12 | 772 | 1.52 | 246 |
| Scientific Reports | 12 | 361 | 5.42 | 941 |
| Sensors | 9 | 296 | 1.34 | 141 |
| PLoS ONE | 8 | 311 | 4.47 | 521 |
| ACS Applied Materials and Interfaces | 6 | 257 | 1.8 | 182 |
| Annual International Conference of the IEEE Engineering in Medicine and Biology—Proceedings | 6 | 85 | 1.01 | 14 |
| IEEE Access | 6 | 100 | 1.89 | 132 |
Table 3.2: Top 10 journal titles by scholarly output referenced by affiliated author publications in the health technology research areas (defined by a keyword search). Titles shaded gray overlap with table 3.1. Data source: Scopus/SciVal.

| Scopus Source | Scholarly Output | Views Count | Field-Weighted Citation Impact | Citation Count |
| --- | --- | --- | --- | --- |
| Journal of Medical Internet Research | 63 | 6,030 | 5.02 | 5,932 |
| Scientific Reports | 53 | 3,186 | 3.04 | 3,409 |
| JMIR mHealth and uHealth | 46 | 1,217 | 2.37 | 2,279 |
| Lecture Notes in Computer Science | 43 | 1,152 | 14.12 | 3,493 |
| PLoS ONE | 40 | 1,785 | 4.21 | 2,975 |
| Sensors | 35 | 1,847 | 2.72 | 1,887 |
| ACS Applied Materials and Interfaces | 33 | 2,015 | 3.27 | 2,243 |
| Advanced Materials | 27 | 3,186 | 8.51 | 5,011 |
| IEEE Transactions on Neural Systems and Rehabilitation Engineering | 24 | 1,295 | 2.42 | 967 |
| The Lancet | 24 | 28,804 | 457.41 | 86,331 |