[GSoC] Develop a caching library for etcd #19371
Comments
At a high level, I agree with the improvement and direction, since performance should be one of the key areas where we spend more effort. It will definitely help ensure the long-term success of etcd. |
Sounds great. It could be more efficient to make and evaluate changes as an official library. +1 to help, if needed. |
cc @ahrtr @ivanvc any preference on where development should happen? My proposal:
|
It should be OK.
I think all packages in the etcd mono repo should have the same prefix |
OK, I don't think it should be a problem. I expect that at the top level of the hierarchy we will want a client cache and a standalone cache server (like a gRPC proxy, but based on the new cache library, with configurable caching, covering all Range types, and with proper guarantees). Within the client cache we will have a separate watch de-multiplexer and a cache for range requests. |
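To make the "watch de-multiplexer" idea in the comment above more concrete, here is a minimal sketch in Go. It is not the design of the proposed library: `WatchEvent`, `Demux`, `Subscribe`, and `Dispatch` are hypothetical names, the event type is a simplified stand-in for a real etcd watch event, and dropping slow subscribers is only one possible policy, assumed here for brevity. The sketch shows one upstream stream being fanned out to several prefix-scoped subscribers.

```go
package main

import (
	"strings"
	"sync"
)

// WatchEvent is a hypothetical, simplified stand-in for an etcd watch event.
type WatchEvent struct {
	Key      string
	Value    string
	Revision int64
}

// subscriber is one downstream watcher interested in keys under a prefix.
type subscriber struct {
	prefix string
	ch     chan WatchEvent
}

// Demux fans a single upstream event stream out to many subscribers.
type Demux struct {
	mu   sync.Mutex
	subs map[int]subscriber
	next int
}

func NewDemux() *Demux {
	return &Demux{subs: make(map[int]subscriber)}
}

// Subscribe registers a watcher for keys under prefix. The returned
// function removes the subscription and closes its channel.
func (d *Demux) Subscribe(prefix string, buffer int) (<-chan WatchEvent, func()) {
	d.mu.Lock()
	defer d.mu.Unlock()
	id := d.next
	d.next++
	ch := make(chan WatchEvent, buffer)
	d.subs[id] = subscriber{prefix: prefix, ch: ch}
	cancel := func() {
		d.mu.Lock()
		defer d.mu.Unlock()
		// The subscriber may already have been dropped by Dispatch.
		if s, ok := d.subs[id]; ok {
			delete(d.subs, id)
			close(s.ch)
		}
	}
	return ch, cancel
}

// Dispatch delivers ev to every subscriber whose prefix matches and
// returns how many received it. A subscriber whose buffer is full is
// dropped (assumed policy) so it cannot stall the shared stream.
func (d *Demux) Dispatch(ev WatchEvent) int {
	d.mu.Lock()
	defer d.mu.Unlock()
	delivered := 0
	for id, s := range d.subs {
		if !strings.HasPrefix(ev.Key, s.prefix) {
			continue
		}
		select {
		case s.ch <- ev:
			delivered++
		default:
			delete(d.subs, id)
			close(s.ch)
		}
	}
	return delivered
}
```

In a real client, `Dispatch` would presumably be driven by a single upstream etcd watch, and a dropped subscriber would need a way to resynchronize; both are left out of this sketch.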
Can we have a spec & design doc for these? |
No, I was just providing more context. Using |
This sounds exciting, and I’d love to take it up as part of Google Summer of Code. The idea of a standardized caching solution for etcd is impactful and I'd love to implement this as my project. I'm currently exploring how we've implemented |
Hi, I have a question! It seems like the b-tree structure in api-server has recently been introduced. Can I ask what's been encouraging the community to strengthen the etcd caching logic? My intention is to know if there were specific team goals behind the recent activities on api-server and this proposal :) |
@mutokrm the main motivations can be found here in this KEP: https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/4988-snapshottable-api-server-cache |
For folks following along here, here are a few pointers to look at to gain some context:
|
Hello Everyone,
What I need help with is @serathius @MadhavJivrajani
So I will start with the docs and setting up the project locally, and see what I need help with. However, to avoid being overwhelmed by the complexity of the project, I will need guidance on the local setup. @serathius @MadhavJivrajani is there another channel for communication? |
Hello @serathius and @MadhavJivrajani, My name is Sywen, I am an early career software engineer. I found this opportunity through the GSoC page and I am really excited about contributing to build a generic caching library at a lower level. I’ve been diving into both Kubernetes watch cache implementation and etcd codebase, definitely a lot to unpack! I understand that this project is aiming for an eventual goal to replace the K8s builtin library, and I’m very interested in contributing not just during GSoC but potentially as a long term contributor if possible, but I wanted to clarify a few aspects regarding the project scope and design:
Thanks!! |
Small size was based on two factors: there is a reference implementation in K8s that matches one-to-one what we want to do, and the code will be independent from the rest of the codebase, meaning no legacy code to learn or integrate.
Production ready in K8s takes at least 1 year :P |
Hi, I also came across this project through GSoC, and I’m excited about the potential of a generic caching library and proxy for etcd. I’d love to contribute for the long haul and help make it production-ready. @MadhavJivrajani, thanks for the links to the additional context. |
Hi @MadhavJivrajani and @serathius, my name is Bob. I am a software engineer. I found this opportunity through the GSoC page, and it would be my pleasure to contribute to the etcd caching and proxy features. Here is my preparatory work:
I am looking forward to this chance! |
Hi, I'm Jeff. I'm an undergraduate student and am interested in this GSoC project. Assumptions
Questions
I'm looking forward to further discussion on this project! |
Hi all, Please also note that responses may be delayed due to a high volume of queries and KubeCon taking place in the first week of April. You are strongly encouraged to bring your questions to the etcd slack channel in order to get them answered.
We have a slack channel on the Kubernetes slack (slack.k8s.io) called Please also see: https://github.com/kubernetes/community/tree/master/sig-etcd
That is correct. It is completely okay to use dependencies if needed. However, it should not exist as part of the Kubernetes codebase for the reasons mentioned in the issue.
You won't need to wait for these. Ideally in the long run, features like KEP-4568 will simply call into the library that we build and we don't necessarily need to rely on their implementation. |
Hello 👋, I am Ikenna, a senior computer science student with an interest in distributed systems. I have taken classes in distributed systems, networking, and databases, and I enjoy exploring these domains outside the classroom. This will be a fun project to work on. |
Hi everyone, I'm applying for GSoC under CNCF to work on developing a generic watch cache for etcd. My proposal aims to create a caching layer similar to Kubernetes' watch cache but as a standalone package (go.etcd.io/cache) to improve scalability and simplify infrastructure management on etcd. This will help projects relying on etcd, like Cilium and Calico, by providing a standardized solution for caching watch events and list requests. A bit about me—I’m a software engineer primarily working with Go, and I enjoy building scalable backend systems and efficient algorithms. I’ve previously worked on distributed systems concepts like MapReduce and have been exploring geospatial data processing. I’m excited about this project as it aligns with my interest in making infrastructure tools more efficient and developer-friendly. |
Hi, this seems to be a very interesting challenge to take on. A little info about me: I am a newly graduated CS student with some experience working with K8s during an internship and studying distributed systems (KV stores and Paxos, to be specific). I’m particularly interested in the challenge of making the cache reusable without losing efficiency. Balancing generalization (custom indexing) with performance (such as fast watch demultiplexing and list latency) seems important, and I’d love to quantify and help with that. A quick question: given the complexity, are you envisioning a thin compatibility layer over K8s internals, or a ground-up reimplementation guided by its design? Looking forward to contributing and learning through this! |
Hi, @serathius and @marseel, I checked the implementation of kvstore in Cilium. It uses a simple map for its watch cache, so the integration is not hard as long as the new cache library is compatible with etcd client v3. And for Calico, the etcd v3 client is recommended to replace Calico Typha. |
This is a project in the etcd repository, and the mentors' approval rights are limited to etcd. Milestones should not depend on other projects where we don't have merge rights. We might collaborate, might collect feedback, might propose a PoC, but we cannot take the dependency. |
Thanks for your reply. I will first focus on the etcd repository itself and keep the needs of potential users in mind during design. |
Hello everyone, I just learned about GSoC a few days ago, and I don't know if it's too late for me to contribute. I'm a DevOps engineer, and I haven't contributed to open source before, but I'm willing to learn. I'm still going through the whole document to find out where I can contribute. But I can't find the link to the Slack channel, please.
@Lumen-jane please see #19371 (comment) |
Submitted as a project for Google Summer of Code, with @MadhavJivrajani as second mentor.
While etcd is a powerful distributed key-value store, building scalable infrastructure management systems directly on top of it can be challenging. Kubernetes has demonstrated the effectiveness of the reconciliation pattern for managing complex deployments, and its watch cache plays a crucial role in achieving scalability. However, this crucial caching mechanism is tightly coupled with Kubernetes and not readily available for general etcd usage. Projects like Cilium and Calico Typha, while successfully using etcd for control planes, have had to implement custom solutions to address this gap.
This project addresses the need for a standardized, performant caching solution for etcd, enabling easier adoption of the reconciliation pattern and simplifying the development of scalable etcd-based systems. By providing a generic watch cache implementation, we aim to lower the barrier to entry for building robust and efficient infrastructure management tools on etcd.
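As a rough illustration of what "a generic watch cache" could mean in practice, the following Go sketch keeps an in-memory copy of a key space, updates it from watch events, and serves prefix lists locally. It is only a sketch under stated assumptions: `Event`, `KV`, `WatchCache`, `Apply`, and `List` are hypothetical names that do not come from etcd or Kubernetes, and the real library would need to handle compaction, resync, and consistency guarantees that are omitted here.

```go
package main

import (
	"sort"
	"strings"
	"sync"
)

// EventType distinguishes the two kinds of change a watch stream reports.
type EventType int

const (
	Put EventType = iota
	Delete
)

// Event is a hypothetical, simplified stand-in for an etcd watch event.
type Event struct {
	Type     EventType
	Key      string
	Value    string
	Revision int64
}

// KV is a cached key-value pair plus the revision that last modified it.
type KV struct {
	Key         string
	Value       string
	ModRevision int64
}

// WatchCache holds an in-memory copy of a key space, kept current by
// applying watch events in revision order.
type WatchCache struct {
	mu       sync.RWMutex
	items    map[string]KV
	revision int64
}

func NewWatchCache() *WatchCache {
	return &WatchCache{items: make(map[string]KV)}
}

// Apply folds one watch event into the cache. Events older than the
// cache's current revision are ignored; events at the same revision are
// accepted because one transaction can emit several events.
func (c *WatchCache) Apply(ev Event) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if ev.Revision < c.revision {
		return
	}
	switch ev.Type {
	case Put:
		c.items[ev.Key] = KV{Key: ev.Key, Value: ev.Value, ModRevision: ev.Revision}
	case Delete:
		delete(c.items, ev.Key)
	}
	c.revision = ev.Revision
}

// List returns all cached pairs whose key starts with prefix, sorted by
// key, together with the revision the cache has observed so far.
func (c *WatchCache) List(prefix string) ([]KV, int64) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	out := make([]KV, 0)
	for k, kv := range c.items {
		if strings.HasPrefix(k, prefix) {
			out = append(out, kv)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Key < out[j].Key })
	return out, c.revision
}
```

A controller following the reconciliation pattern could then call `List("/pods/")` against this cache instead of issuing a Range request to etcd for each reconcile loop, using the returned revision to judge how fresh the answer is.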
Goals:
Milestones:
I'm proposing to locate the project within the etcd mono repo, but as a separate package that will not be released or tagged until it's ready. Proposed package name: go.etcd.io/cache. The client library would be developed under go.etcd.io/cache/client.
/cc @fuweid @MadhavJivrajani @ahrtr @henrybear327