DOI: 10.1108/el-08-2025-0329 ISSN: 0264-0473

SLD: a unified framework of linked databases for scholarly Big Data enhancement and scientific innovation

Li Zhang, Ming Yin, Mengting Sun, Jing Shi, Ningyuan Song, Haihua Chen

Purpose

Scholarly databases (SDs) are fragmented and evolve independently, which leaves their metadata incomplete and inconsistent and hinders reuse, trustworthy analytics and knowledge discovery. This study proposes the Scholarly Linked Databases (SLD) framework, which systematically exploits cross-database complementarity for scalable metadata enhancement.

Design/methodology/approach

SLD builds large-scale article linkages across major databases and performs field-level metadata fusion to fill missing values and resolve inconsistencies. We validate SLD on multiple database pairs and evaluate its downstream utility via disambiguation, retrieval, recommendation and bibliometrics, using ratio-based indicators and matched samples to mitigate bias from unequal database sizes.
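
To make the linking and fusion steps concrete, the following is a minimal illustrative sketch and not the SLD implementation. It assumes records are plain dictionaries, that articles are linked by normalized DOI only, and that conflicts are resolved by preferring the first database; the abstract specifies none of these choices. All function names (normalize_doi, link_records, fuse, completeness) are hypothetical.

```python
import re


def is_missing(value):
    """Treat None and empty strings as missing metadata."""
    return value is None or value == ""


def normalize_doi(doi):
    """Lowercase a DOI and strip a resolver prefix; return None if absent."""
    if is_missing(doi):
        return None
    return re.sub(r"^https?://(dx\.)?doi\.org/", "", doi.strip().lower())


def link_records(db_a, db_b):
    """Pair records from two databases that share a normalized DOI.

    Assumption: DOI matching stands in for whatever linkage method
    the framework actually uses.
    """
    index_b = {}
    for rec in db_b:
        key = normalize_doi(rec.get("doi"))
        if key is not None:
            index_b[key] = rec
    pairs = []
    for rec in db_a:
        key = normalize_doi(rec.get("doi"))
        if key is not None and key in index_b:
            # Store the normalized DOI on both copies so it never counts as a conflict.
            pairs.append(({**rec, "doi": key}, {**index_b[key], "doi": key}))
    return pairs


def fuse(rec_a, rec_b):
    """Field-level fusion: fill gaps from either side and report conflicts.

    Assumption: on conflict, the value from rec_a is kept.
    """
    fused, conflicts = {}, []
    for field in sorted(set(rec_a) | set(rec_b)):
        a, b = rec_a.get(field), rec_b.get(field)
        if is_missing(a):
            fused[field] = b
        elif is_missing(b):
            fused[field] = a
        else:
            fused[field] = a
            if a != b:
                conflicts.append(field)
    return fused, conflicts


def completeness(records, field):
    """Ratio-based indicator: share of records with a non-missing value for field."""
    if not records:
        return 0.0
    filled = sum(1 for rec in records if not is_missing(rec.get(field)))
    return filled / len(records)


# Toy data, invented purely for illustration.
db_a = [{"doi": "10.1000/XYZ1", "title": "A Study", "year": 2020, "abstract": ""}]
db_b = [
    {"doi": "https://doi.org/10.1000/xyz1", "title": "A Study", "year": 2021, "abstract": "Text."},
    {"doi": "10.1000/other", "title": "Another Study", "year": 2019, "abstract": "More text."},
]

pairs = link_records(db_a, db_b)
fused_records = [fuse(a, b)[0] for a, b in pairs]
```

Because completeness is a ratio rather than a count, it can be compared before and after fusion even when the two databases differ in size, which is the motivation the abstract gives for ratio-based indicators.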

Findings

Dense cross-database linkages can be established in practice, and SLD improves metadata completeness and consistency. SLD-enhanced data yield measurable gains over non-linked baselines across downstream tasks, indicating better support for Scholarly Big Data enhancement and scientific innovation.

Originality/value

The originality lies in synthesizing and systematically applying existing techniques (linking, cleaning, integration) into a dedicated, scalable framework for the SD ecosystem. SLD also frames inter-database variation as an evolutionary resource that can be selectively combined to create fitter enhanced databases.
