[NEW] QoS of ae-loop #1789
Comments
This is a good idea, not just for replication but also for admin clients (especially those monitoring the engine). How do you plan to prevent low-priority file descriptors from starving? See also #1596
+1, I think we should prioritize. @JimB123 do you want to document what we did internally for this as an alternative?
Thanks for starting this thread @artikell!
On a high level, we can likely manage with two classes: internal connections (cluster bus, replication, and admin port connections) and external connections (normal clients). I don't foresee a risk of internal connections hogging the main thread. However, for the QoS idea to work, we'll need to limit the number of active external connections processed in each batch. The starvation risk exists here, but a simple solution might be to save the previous Looking forward to a detailed design :)
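To make the two-class idea concrete, here is a rough sketch of how one batch could be selected. Everything in it is hypothetical: `connClass`, `readyConn`, `selectBatch`, and the `cursor` argument are illustration names, not part of Valkey's `ae` API. The wrap-around cursor is just one possible way to keep capped external connections from starving, and treating it as an index into the ready array is a simplification, since the ready set changes between loop iterations.

```c
#include <stddef.h>

/* Hypothetical connection classes, following the two-class split above. */
typedef enum { CONN_INTERNAL, CONN_EXTERNAL } connClass;

typedef struct {
    int fd;
    connClass cls;
} readyConn;

/* Picks the connections to serve in one batch (illustrative only).
 * - Every internal connection is served.
 * - At most max_external external connections are served, scanning from
 *   *cursor and wrapping around, so skipped ones go first next time.
 * Selected fds are written to out (capacity >= n); returns how many. */
size_t selectBatch(const readyConn *ready, size_t n, size_t max_external,
                   size_t *cursor, int *out) {
    size_t count = 0;
    if (n == 0) return 0;

    /* Internal connections are never capped. */
    for (size_t i = 0; i < n; i++)
        if (ready[i].cls == CONN_INTERNAL) out[count++] = ready[i].fd;

    /* External connections: bounded, resuming where the last batch stopped. */
    size_t served = 0;
    size_t start = *cursor % n;
    size_t next = start;
    for (size_t step = 0; step < n && served < max_external; step++) {
        size_t i = (start + step) % n;
        if (ready[i].cls == CONN_EXTERNAL) {
            out[count++] = ready[i].fd;
            served++;
            next = (i + 1) % n;
        }
    }
    *cursor = next;
    return count;
}
```

With four ready connections (one internal, three external) and `max_external = 2`, the first batch serves the internal one plus two externals, and the second batch starts from the external connection that was skipped.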
The problem/use-case that the feature addresses
In high-concurrency scenarios, Valkey's AE event loop may handle some events with high latency, so operations that need timely processing (such as primary-replica replication and serverCron tasks) cannot respond promptly.
In particular, when a large number of low-priority events occupy the event loop, high-priority AE events may be blocked for a long time, hurting the overall performance and user experience of the system.
Description of the feature
Introduce an event priority mechanism in Valkey's AE event processing module that supports configuring a weight for each AE event. Treat the core primary-replica synchronization tasks and serverCron tasks as high priority so that they are executed within a bounded period of time.
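As a minimal sketch of what per-event weights could look like, the ready events returned by one poll could be reordered by weight before their handlers run. The `readyEvent` struct, its `weight` field, and `orderReadyEvents` are assumptions made for illustration and do not exist in Valkey's `ae` module today.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical ready-event record; `weight` is an assumed per-event setting,
 * where a larger value means the event should be handled earlier. */
typedef struct {
    int fd;
    int weight;
} readyEvent;

/* Sort by weight descending; ties are broken by fd ascending so the
 * resulting order is deterministic (qsort itself is not stable). */
static int cmpByWeightDesc(const void *a, const void *b) {
    const readyEvent *x = a;
    const readyEvent *y = b;
    if (x->weight != y->weight)
        return (y->weight > x->weight) - (y->weight < x->weight);
    return (x->fd > y->fd) - (x->fd < y->fd);
}

/* Reorders the events fired in one loop iteration so that high-weight
 * events (e.g. replication links) are dispatched before low-weight ones. */
void orderReadyEvents(readyEvent *events, size_t n) {
    qsort(events, n, sizeof(*events), cmpByWeightDesc);
}
```

Pure ordering only decides who goes first within a single iteration. It does not bound how long low-weight events wait, so a real design would still need some budget or aging rule on top of it.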
Alternatives you've considered
Rate limiting, CPU throttling.
Additional information
There are discussions on this proposal in other places. The proposal comes from: @xbasel @PingXie