Intelligent Web Crawlers with Federated Learning for Search Engine Freshness

Mohammad Abu Kausar, Mohammad Nasar

International Journal of Engineering and Information Systems (IJEAIS)

Volume: 9

Issue: 9

Pages: 150-160

Publication Date: 2025/09/28

Abstract:
The exponential growth of online content makes it difficult for search engines to keep their indexes fresh, relevant, and trustworthy. Traditional crawling strategies and reinforcement learning (RL)-based models improve adaptability but remain centralized, which leads to high latency, communication overhead, and privacy risks. This paper introduces an intelligent crawler driven by federated reinforcement learning that integrates distributed training, freshness-aware scheduling, and privacy-preserving aggregation. In this framework, crawler nodes train local models to predict content changes and prioritize high-value pages, while a secure aggregator combines their model updates without any raw data being shared. Experimental results show that the proposed approach achieves an 18% improvement in freshness and a 40% reduction in communication overhead compared to centralized RL-based crawlers. These findings highlight the potential of federated crawling as a scalable, adaptive, and privacy-preserving paradigm for next-generation search engines.
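
The two mechanisms named in the abstract, aggregating local model updates and prioritizing pages that are likely to have changed, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the aggregation rule, the model, or the scoring function. The sketch assumes plain sample-weighted averaging (FedAvg-style) in place of the paper's secure aggregation and RL training, and every name in it (`federated_average`, `crawl_priority`, `schedule`, the page tuples) is hypothetical.

```python
import heapq
from typing import List, Sequence, Tuple

# (url, predicted change probability, page value) -- hypothetical record layout
Page = Tuple[str, float, float]


def federated_average(updates: Sequence[Tuple[List[float], int]]) -> List[float]:
    """Combine local weight vectors into one global vector.

    Each update is (weights, n_samples). Only weights are exchanged, never
    the crawl data they were trained on. Assumption: simple sample-weighted
    averaging, with no encryption or masking of the individual updates.
    """
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("need at least one local sample")
    dim = len(updates[0][0])
    merged = [0.0] * dim
    for weights, n in updates:
        if len(weights) != dim:
            raise ValueError("weight vectors must share one length")
        for i, w in enumerate(weights):
            merged[i] += w * n / total
    return merged


def crawl_priority(change_prob: float, page_value: float) -> float:
    """Assumed score: expected value gained by re-crawling the page now."""
    return change_prob * page_value


def schedule(pages: Sequence[Page], budget: int) -> List[str]:
    """Return the URLs of the `budget` highest-priority pages, best first."""
    best = heapq.nlargest(budget, pages, key=lambda p: crawl_priority(p[1], p[2]))
    return [url for url, _, _ in best]


if __name__ == "__main__":
    # Two crawler nodes report weights trained on 1 and 3 local samples.
    print(federated_average([([1.0, 2.0], 1), ([3.0, 4.0], 3)]))
    # A node picks two pages to re-crawl from three candidates.
    print(schedule([("a", 0.9, 1.0), ("b", 0.2, 10.0), ("c", 0.5, 1.0)], budget=2))
```

In this sketch, the node with more local samples pulls the global weights toward its own, and a page with low change probability can still be scheduled first if its assumed value is high enough.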
