Synergy: A Fast and Scalable Feedback-Driven Scheduler for Datacenter Applications

Publication

The paper Synergy: A Fast and Scalable Feedback-Driven Scheduler for Datacenter Applications was accepted at the 25th IFIP Networking 2026 Conference, held in Lugano, Switzerland, on May 24-27, 2026. It was presented in Technical Session B.2: Datacenter & Traffic Engineering 1.

The paper is co-authored with Fabrício Carvalho and Ronaldo Ferreira.

Abstract

Microsecond-scale datacenter applications demand strict latency guarantees while operating under high load and variable service times. This environment often involves a mix of extremely short and long requests, where short requests — lasting just a few microseconds — are frequently delayed by longer ones due to Head-of-Line (HOL) blocking, leading to higher latencies, especially at the tail.
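To make HOL blocking concrete, here is a toy calculation, not taken from the paper. It assumes a single worker, a made-up workload in which one 500 µs request is queued just ahead of four 2 µs requests, and all requests arriving at time 0. The function name and the numbers are illustrative only.

```python
def fcfs_latencies(service_times_us):
    """Latency (wait + service) of each request when one worker serves them in list order.

    Simplifying assumption: every request is already queued at time 0.
    """
    clock = 0.0
    latencies = []
    for service in service_times_us:
        clock += service          # the worker is busy until this request finishes
        latencies.append(clock)   # so the request completes at the current clock
    return latencies


# Hypothetical workload: a 500 us request sits in front of four 2 us requests.
workload = [500.0, 2.0, 2.0, 2.0, 2.0]

blocked = fcfs_latencies(workload)              # arrival order: short requests wait
short_first = fcfs_latencies(sorted(workload))  # same work, short requests served first

print(blocked)      # [500.0, 502.0, 504.0, 506.0, 508.0]
print(short_first)  # [2.0, 4.0, 6.0, 8.0, 508.0]
```

In arrival order, every 2 µs request takes more than 500 µs to complete. Serving the short requests first keeps them under 10 µs, and the long request finishes at 508 µs, only 8 µs later than when it ran first. Both schedules do the same total work.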

However, existing approaches to mitigate HOL blocking, such as centralized dispatching, fine-grained preemption, and resource reservation, face fundamental scalability limitations.

This work introduces Synergy, a cooperative, application-aware scheduling system that uses direct feedback from applications to prioritize short requests, dynamically adapts scheduling parameters, and avoids unnecessary preemptions. Synergy adopts a decentralized architecture with distributed queues, job-aware preemption, and dynamic quantum sizing. By eliminating centralized classification and using real-time application measurements, Synergy effectively mitigates HOL blocking without compromising throughput. Synergy outperforms state-of-the-art systems, achieving up to 43% higher throughput while meeting microsecond-scale service-level objectives.
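As a rough illustration of how application feedback could drive dynamic quantum sizing, the sketch below keeps a moving average of service times reported by the application and derives a quantum from it. This is not Synergy's algorithm: the class name, the exponentially weighted average, the headroom factor, and the bounds are all assumptions made for the example.

```python
class QuantumEstimator:
    """Derives a scheduling quantum from service times reported by the application.

    Hypothetical sketch: the smoothing factor, headroom multiplier, and bounds
    are illustrative choices, not values from the paper.
    """

    def __init__(self, initial_us=5.0, alpha=0.5, headroom=2.0, min_us=1.0, max_us=100.0):
        self.estimate_us = initial_us  # running estimate of a typical service time
        self.alpha = alpha             # weight given to the newest measurement
        self.headroom = headroom       # quantum = headroom * estimate, before clamping
        self.min_us = min_us
        self.max_us = max_us

    def report(self, service_time_us):
        """Feedback path: the application reports one measured service time."""
        self.estimate_us = (
            self.alpha * service_time_us + (1 - self.alpha) * self.estimate_us
        )

    def quantum_us(self):
        """Current quantum, clamped to [min_us, max_us]."""
        return min(self.max_us, max(self.min_us, self.headroom * self.estimate_us))


est = QuantumEstimator(initial_us=4.0)
print(est.quantum_us())  # 8.0
est.report(8.0)          # estimate becomes 6.0
print(est.quantum_us())  # 12.0
est.report(200.0)        # estimate becomes 103.0; the quantum is clamped
print(est.quantum_us())  # 100.0
```

The idea is to size the quantum so that a typical short request finishes without being interrupted. The upper bound stops a burst of long requests from inflating the quantum without limit.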

Key Contributions

  • Decentralized architecture — distributed queues replace a centralized dispatcher, removing a scalability bottleneck.
  • Application-aware, feedback-driven — applications directly signal information that the scheduler uses to prioritize short requests.
  • Job-aware preemption — preemption is triggered only when it actually helps, avoiding unnecessary preemptions.
  • Dynamic quantum sizing — quantum sizes adapt in real time to the workload.
  • HOL blocking mitigation without throughput loss — up to 43% higher throughput compared to state-of-the-art systems, while still meeting microsecond-scale SLOs.
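The job-aware preemption idea can be sketched with a small simulation of one worker draining its own local queue. The rule used here is an assumption made for illustration: a job that has exhausted its quantum is preempted only if another job is waiting. The paper may define "helps" differently. The function name and the workload are hypothetical, and all jobs are assumed to be queued at time 0.

```python
from collections import deque


def run_worker(jobs, quantum_us):
    """Simulate one worker draining its local queue.

    jobs: list of (name, service_time_us), all queued at time 0 (simplification).
    Returns (completion_time_by_name, number_of_preemptions).
    """
    queue = deque(jobs)
    clock = 0.0
    done = {}
    preemptions = 0
    while queue:
        name, remaining = queue.popleft()
        if remaining <= quantum_us or not queue:
            # The job fits in one quantum, or nobody is waiting behind it:
            # preempting would cost a context switch and help no one.
            clock += remaining
            done[name] = clock
        else:
            # The job overruns its quantum while others wait: preempt and requeue.
            clock += quantum_us
            queue.append((name, remaining - quantum_us))
            preemptions += 1
    return done, preemptions


done, preemptions = run_worker([("long", 100.0), ("a", 2.0), ("b", 2.0)], quantum_us=10.0)
print(done)         # {'a': 12.0, 'b': 14.0, 'long': 104.0}
print(preemptions)  # 1
```

The long job is preempted once, which lets both short jobs finish within 14 µs. After that the queue is empty, so the long job runs to completion instead of being interrupted at every quantum boundary. A fuller model would also recheck the queue when new requests arrive during a long run.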
