Link prediction accuracy on real-world networks under non-uniform missing-edge patterns. Journal Article

Overview

abstract

  • Real-world network datasets are typically obtained in ways that fail to capture all edges. The patterns of missing data are often non-uniform as they reflect biases and other shortcomings of different data collection methods. Nevertheless, uniform missing data is a common assumption made when no additional information is available about the underlying missing-edge pattern, and link prediction methods are frequently tested against uniformly missing edges. To investigate the impact of different missing-edge patterns on link prediction accuracy, we employ 9 link prediction algorithms from 4 different families to analyze 20 different missing-edge patterns that we categorize into 5 groups. Our comparative simulation study, spanning 250 real-world network datasets from 6 different domains, provides a detailed picture of the significant variations in the performance of different link prediction algorithms in these different settings. With this study, we aim to provide a guide for future researchers to help them select a link prediction algorithm that is well suited to their sampled network data, considering the data collection process and application domain.
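The evaluation idea described in the abstract (hide some edges according to a missing-edge pattern, then check how well a predictor recovers them) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the article: the toy two-group graph, the function names, the degree-biased removal rule, and the use of a common-neighbors score with AUC as the accuracy measure. The study's actual 20 patterns and 9 algorithms are not reproduced here.

```python
import random
from itertools import combinations


def make_two_block_graph(n, p_in, p_out, rng):
    """Toy graph (hypothetical): two equal groups, denser inside than between.

    Returns a set of edges (u, v) with u < v.
    """
    half = n // 2
    edges = set()
    for u, v in combinations(range(n), 2):
        same_group = (u < half) == (v < half)
        if rng.random() < (p_in if same_group else p_out):
            edges.add((u, v))
    return edges


def hold_out_uniform(edges, frac, rng):
    """Uniform pattern: every edge is equally likely to go missing."""
    k = int(frac * len(edges))
    return set(rng.sample(sorted(edges), k))


def hold_out_degree_biased(edges, frac, rng):
    """Illustrative non-uniform pattern: edges between high-degree nodes
    are more likely to go missing (weight = product of endpoint degrees)."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    k = int(frac * len(edges))
    pool = sorted(edges)
    # Weighted sampling without replacement via random keys r ** (1 / weight):
    # a larger weight pushes the key toward 1, so those edges rank first.
    keys = [rng.random() ** (1.0 / (deg[u] * deg[v])) for u, v in pool]
    ranked = sorted(zip(keys, pool), reverse=True)
    return {edge for _, edge in ranked[:k]}


def common_neighbors_auc(observed, held_out, n):
    """AUC of the common-neighbors score: the chance that a held-out edge
    outscores a true non-edge, counting ties as one half."""
    nbrs = [set() for _ in range(n)]
    for u, v in observed:
        nbrs[u].add(v)
        nbrs[v].add(u)
    true_edges = observed | held_out
    pos = [len(nbrs[u] & nbrs[v]) for u, v in held_out]
    neg = [len(nbrs[u] & nbrs[v])
           for u, v in combinations(range(n), 2)
           if (u, v) not in true_edges]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
n = 60
edges = make_two_block_graph(n, p_in=0.3, p_out=0.03, rng=rng)
for name, hold_out in [("uniform", hold_out_uniform),
                       ("degree-biased", hold_out_degree_biased)]:
    missing = hold_out(edges, 0.2, rng)
    auc = common_neighbors_auc(edges - missing, missing, n)
    print(f"{name:>14}: held out {len(missing)} edges, AUC = {auc:.3f}")
```

Both removal functions hide the same number of edges, so any difference in the printed AUC values comes from which edges are missing rather than how many. A single toy graph and a single random seed say nothing about real-world behavior; the sketch only shows the shape of such a comparison.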

publication date

  • January 1, 2024

has restriction

  • gold

Date in CU Experts

  • July 23, 2024 11:29 AM

Full Author List

  • He X; Ghasemian A; Lee E; Schwarze AC; Clauset A; Mucha PJ

author count

  • 6

Other Profiles

Electronic International Standard Serial Number (EISSN)

  • 1932-6203

Additional Document Info

start page

  • e0306883

volume

  • 19

issue

  • 7