Document type assignment by Web of Science, Scopus, PubMed, and publishers to “Top 100” papers


Andy Wai Kan Yeung


Document type (DT) assignment is an important feature of literature databases. This work evaluated how literature databases and publisher websites labeled “Top 100” (T100) papers, a recurring type of study in which researchers identify and analyze the 100 most cited entities (e.g. articles) within a pre-defined literature set. T100 papers concurrently indexed in the Web of Science (WoS), Scopus and PubMed databases were identified. Among the 248 T100 papers analyzed, no general consensus or consistent pattern was found in how the three databases and the publishers’ websites labeled T100 papers. All four sources labeled 30–40% of the T100 papers as reviews. However, PubMed mostly gave no DT label to the rest of the papers, whereas WoS, Scopus, and publisher websites labeled them as articles. The inter-rater agreement was only fair; the labeling decision seemed to be influenced by whether the authors used the word “review”, suggestive of the publication/document type, in the title, abstract or keywords.
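As an illustration of how agreement among four labeling sources could be quantified, the sketch below computes Fleiss' kappa in Python. This is an assumption for illustration only: the abstract does not state which agreement statistic was used, and the label values and toy data here are hypothetical, not taken from the study. On the commonly used Landis and Koch scale, kappa values of 0.21 to 0.40 are described as "fair".

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list of labels (one per rater).

    Every item must be labeled by the same number of raters.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number of ratings")

    categories = sorted({label for item in ratings for label in item})
    counts = [Counter(item) for item in ratings]

    # Observed agreement: proportion of agreeing rater pairs, averaged over items.
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(p_items) / n_items

    # Expected agreement by chance, from the overall share of each category.
    shares = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    p_expected = sum(s ** 2 for s in shares)

    if p_expected == 1:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy data: each row is one paper, with labels in the order
# (WoS, Scopus, PubMed, publisher). "none" stands for a missing DT label.
toy_labels = [
    ["review", "review", "review", "review"],
    ["article", "article", "none", "article"],
    ["review", "article", "none", "article"],
    ["article", "article", "none", "article"],
]

if __name__ == "__main__":
    print(f"Fleiss' kappa on toy data: {fleiss_kappa(toy_labels):.3f}")
```

Treating a missing label as its own category ("none") is a design choice made for this sketch; excluding unlabeled papers instead would change the result.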




How to Cite
Yeung, A. W. K. (2021). Document type assignment by Web of Science, Scopus, PubMed, and publishers to “Top 100” papers. Malaysian Journal of Library and Information Science, 26(3), 97–103.

