paper
arXiv cs.LG
November 18th, 2025 at 5:00 AM

SLOFetch: Compressed-Hierarchical Instruction Prefetching for Cloud Microservices

arXiv:2511.04774v2 Announce Type: replace Abstract: Large-scale networked services rely on deep software stacks and microservice orchestration, which increase instruction footprints and create frontend stalls that inflate tail latency and energy. We revisit instruction prefetching for these cloud workloads and present a design that aligns with SLO-driven and self-optimizing systems. Building on the Entangling Instruction Prefetcher (EIP), we introduce a Compressed Entry that captures up to eight destinations around a base using 36 bits by exploiting spatial clustering, and a Hierarchical Metadata Storage scheme that keeps only L1-resident and frequently queried entries on chip while virtualizing bulk metadata into lower levels. We further add a lightweight Online ML Controller that scores prefetch profitability using context features and a bandit-adjusted threshold. On data center applications, our approach preserves EIP-like speedups with smaller on-chip state and improves efficiency for networked services in the ML era.
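
To make the Compressed Entry idea concrete, here is a minimal Python sketch of how up to eight destinations clustered around a base could fit in 36 bits. The abstract does not give the bit layout, so everything specific here is an assumption chosen only so the arithmetic works out: a 4-bit count field plus eight 4-bit signed cache-line deltas (4 + 8 × 4 = 36 bits), and the hypothetical helper names `pack_entry` and `unpack_entry`. It should not be read as the paper's actual encoding.

```python
# Hypothetical layout (NOT from the paper): 4-bit count + 8 x 4-bit signed deltas.
SLOTS = 8                       # up to eight destinations per entry
DELTA_BITS = 4                  # assumed width of each signed delta
COUNT_BITS = 4                  # assumed width of the valid-slot count
DELTA_MIN, DELTA_MAX = -8, 7    # range of a 4-bit two's-complement value
DELTA_MASK = (1 << DELTA_BITS) - 1


def pack_entry(base_line: int, dest_lines: list[int]) -> int:
    """Pack destinations near base_line into one 36-bit integer."""
    if len(dest_lines) > SLOTS:
        raise ValueError("at most eight destinations fit in one entry")
    word = len(dest_lines)  # count lives in the low 4 bits
    for i, dest in enumerate(dest_lines):
        delta = dest - base_line
        if not DELTA_MIN <= delta <= DELTA_MAX:
            raise ValueError(f"destination {dest} is too far from the base")
        # Store the delta as 4-bit two's complement in slot i.
        word |= (delta & DELTA_MASK) << (COUNT_BITS + i * DELTA_BITS)
    return word


def unpack_entry(base_line: int, word: int) -> list[int]:
    """Recover the destination cache lines from a packed entry."""
    count = word & ((1 << COUNT_BITS) - 1)
    dests = []
    for i in range(count):
        raw = (word >> (COUNT_BITS + i * DELTA_BITS)) & DELTA_MASK
        # Sign-extend the 4-bit value back to a Python int.
        delta = raw - (1 << DELTA_BITS) if raw > DELTA_MAX else raw
        dests.append(base_line + delta)
    return dests


if __name__ == "__main__":
    entry = pack_entry(1000, [1001, 998, 1007, 992])
    print(entry.bit_length() <= 36, unpack_entry(1000, entry))
```

Under these assumptions, a destination farther than eight lines from the base cannot be stored in the compressed form. A real design would need a fallback for such cases, which the abstract does not describe.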

#ai

Canonical link: https://arxiv.org/abs/2511.04774