Synergy: A Fast and Scalable Feedback-Driven Scheduler for Datacenter Applications
Publication
The paper "Synergy: A Fast and Scalable Feedback-Driven Scheduler for Datacenter Applications" was accepted at the 25th IFIP Networking Conference (IFIP Networking 2026), held in Lugano, Switzerland, on May 24-27, 2026. It was presented in Technical Session B.2, "Datacenter & Traffic Engineering 1".
The paper is co-authored with Fabrício Carvalho and Ronaldo Ferreira.
Abstract
Microsecond-scale datacenter applications demand strict latency guarantees while operating under high load and variable service times. This environment often involves a mix of extremely short and long requests, where short requests — lasting just a few microseconds — are frequently delayed by longer ones due to Head-of-Line (HOL) blocking, leading to higher latencies, especially at the tail.
However, existing approaches to mitigate HOL blocking, such as centralized dispatching, fine-grained preemption, and resource reservation, face fundamental scalability limitations.
This work introduces Synergy, a cooperative, application-aware scheduling system that uses direct feedback from applications to prioritize short requests, dynamically adapts scheduling parameters, and avoids unnecessary preemptions. Synergy adopts a decentralized architecture with distributed queues, job-aware preemption, and dynamic quantum sizing. By eliminating centralized classification and using real-time application measurements, Synergy effectively mitigates HOL blocking without compromising throughput. Synergy outperforms state-of-the-art systems, achieving up to 43% higher throughput while meeting microsecond-scale service-level objectives.
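To make the HOL blocking problem concrete, here is a toy single-worker simulation. It is an illustrative sketch only, not taken from the paper: the request sizes, the 5 µs quantum, and the zero-cost preemption are all assumptions. It compares run-to-completion FIFO against a simple preemptive round-robin when one long request arrives just ahead of three short ones.

```python
from collections import deque


def fifo_completion_times(service_times_us):
    """Run-to-completion on one worker; all requests arrive at t=0."""
    now = 0
    done = []
    for service in service_times_us:
        now += service
        done.append(now)
    return done


def round_robin_completion_times(service_times_us, quantum_us):
    """Preemptive round-robin on one worker.

    Assumption: preemption itself costs nothing, which real systems
    cannot achieve; this only illustrates the queueing effect.
    """
    queue = deque(enumerate(service_times_us))
    done = [0] * len(service_times_us)
    now = 0
    while queue:
        idx, remaining = queue.popleft()
        ran = min(remaining, quantum_us)
        now += ran
        if remaining > ran:
            # Quantum expired: put the request back at the tail.
            queue.append((idx, remaining - ran))
        else:
            done[idx] = now
    return done


# One 500 us request followed by three 2 us requests (hypothetical workload).
requests_us = [500, 2, 2, 2]
print(fifo_completion_times(requests_us))            # [500, 502, 504, 506]
print(round_robin_completion_times(requests_us, 5))  # [506, 7, 9, 11]
```

Under FIFO, every 2 µs request waits behind the 500 µs one and completes after more than 500 µs. With preemption, the short requests finish within 11 µs while the long request finishes at the same time as the last FIFO completion. In practice preemption has a cost, which is why avoiding unnecessary preemptions matters.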
Key Contributions
- Decentralized architecture — distributed queues replace a centralized dispatcher, removing a scalability bottleneck.
- Application-aware, feedback-driven scheduling — applications send real-time measurements directly to the scheduler, which uses them to prioritize short requests.
- Job-aware preemption — preemption is triggered only when it actually helps, avoiding unnecessary preemptions.
- Dynamic quantum sizing — quantum sizes adapt in real time to the workload.
- HOL blocking mitigation without throughput loss — up to 43% higher throughput compared to state-of-the-art systems, while still meeting microsecond-scale SLOs.
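
The contributions above can be pictured with a minimal per-worker sketch. Everything in it is hypothetical and chosen for illustration: the class name `WorkerScheduler`, the policy of setting the quantum to twice the median of recently reported service times, and the rule of preempting only when another request is waiting. It is not Synergy's actual algorithm or API.

```python
from collections import deque
from statistics import median


class WorkerScheduler:
    """Hypothetical per-worker scheduler; not the Synergy implementation."""

    def __init__(self, initial_quantum_us=5.0, window=64):
        self.queue = deque()                 # local queue, one per worker
        self.quantum_us = initial_quantum_us
        self.samples = deque(maxlen=window)  # recent service times from the app

    def report_service_time(self, service_us):
        """Application feedback: record a measurement and resize the quantum.

        Assumed policy: quantum = 2 x median of the recent samples, so that
        typical short requests finish without being preempted.
        """
        self.samples.append(service_us)
        self.quantum_us = 2 * median(self.samples)

    def should_preempt(self, elapsed_us):
        """Preempt only if the quantum expired and a request is waiting.

        With an empty queue, preempting the running request would add
        overhead without helping anyone.
        """
        return elapsed_us >= self.quantum_us and len(self.queue) > 0


sched = WorkerScheduler()
for observed_us in [2, 2, 2, 500]:
    sched.report_service_time(observed_us)

print(sched.quantum_us)            # quantum adapted from application feedback
print(sched.should_preempt(10))    # queue is empty, so no preemption
sched.queue.append("short-request")
print(sched.should_preempt(10))    # quantum expired and a request is waiting
```

The sketch keeps all state local to one worker, mirroring the idea of distributed queues with no central dispatcher on the critical path.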