An examination of metadata practices for research data reuse: Characteristics and predictive probability of metadata elements
This study explores metadata practices in relation to data reuse in biology. Metadata has long been viewed as a major constituent of research data management and reuse. However, whether metadata is used in a way that encourages data reuse has been understudied. The current study examined the metadata elements used to describe datasets and the predictive probability of those elements for data reuse, under the assumption that citation frequency reflects the frequency of research data reuse. A total of 34,491 cited records from the biology category of the Clarivate Analytics Data Citation Index were analyzed using descriptive comparison and multiple regression analysis to compare usage patterns of metadata elements between data records cited more than twice and those cited only once. Of the five types of metadata elements identified and examined, elements that described the datasets and provided author-related information appeared predominantly across datasets, whereas DOI and ORCID identifiers were scarce. Metadata related to authors and funding resources were found to be positive factors in predicting data reuse, whereas data descriptions and identifiers appeared to have negative influences. This study contributes to a better understanding of metadata needs for data reuse.
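The descriptive comparison mentioned above can be illustrated with a minimal sketch. It assumes each record carries a citation count and a set of metadata elements. The record layout, the element names, and the toy values are hypothetical and are not taken from the Data Citation Index or from the study's data. The grouping follows the abstract's wording (cited only once versus cited more than twice), and the sketch does not reproduce the study's regression analysis.

```python
from collections import defaultdict

# Hypothetical toy records; field names and values are illustrative only
# and are not drawn from the Data Citation Index.
records = [
    {"citations": 1, "elements": {"description", "author"}},
    {"citations": 1, "elements": {"description", "author", "doi"}},
    {"citations": 3, "elements": {"description", "author", "funding"}},
    {"citations": 5, "elements": {"author", "funding"}},
]


def group_of(record):
    """Assign a record to one of the two comparison groups, or None."""
    if record["citations"] == 1:
        return "cited_once"
    if record["citations"] > 2:
        return "cited_more_than_twice"
    # Assumption: anything else (e.g. exactly two citations) is left out,
    # following the grouping as worded in the abstract.
    return None


def element_usage(records):
    """Return {group: {element: share of the group's records using it}}."""
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for record in records:
        group = group_of(record)
        if group is None:
            continue
        totals[group] += 1
        for element in record["elements"]:
            counts[group][element] += 1
    return {
        group: {element: n / totals[group] for element, n in counts[group].items()}
        for group in totals
    }


usage = element_usage(records)
```

Here `usage` maps each group to the share of its records that carry each element, which is the kind of usage pattern a descriptive comparison would set side by side before any regression is fitted.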