
ControlPlane node setup failing with "etcdserver: can only promote a learner member" #3152

Open
BernardMC opened this issue Jan 22, 2025 · 8 comments · May be fixed by kubernetes/kubernetes#130782
Labels: area/etcd · kind/feature (Categorizes issue or PR as related to a new feature.) · priority/backlog (Higher priority than priority/awaiting-more-evidence.)
Milestone: v1.33

@BernardMC

Hello,

I am using CAPI to manage setting up new Kubernetes clusters. In my environment I am encountering an issue where I try to set up a 3-node control plane but get stuck after adding the second node.

In the KCP (KubeadmControlPlane) logs, this error repeats:

E0121 13:25:32.061001       1 controller.go:316] "Reconciler error" err="failed to get etcdStatus for workload cluster obj-store-01: failed to create etcd client: etcd leader is reported as bc116290b8a83a52 with name \"obj-store-01-wchl9\", but we couldn't find a corresponding Node in the cluster" controller="kubeadmcontrolplane" controllerGroup="controlplane.cluster.x-k8s.io" controllerKind="KubeadmControlPlane" KubeadmControlPlane="minio-default/obj-store-01" namespace="minio-default" name="obj-store-01" reconcileID="75ed1ba2-fd8c-43a3-bd16-2ced8db56999"

The etcd member table looks like this:

+------------------+---------+--------------------+----------------------------+----------------------------+------------+
|        ID        | STATUS  |        NAME        |         PEER ADDRS         |        CLIENT ADDRS        | IS LEARNER |
+------------------+---------+--------------------+----------------------------+----------------------------+------------+
| 785b2d66a9ca13ed | started | obj-store-01-gd6dn | https://SOME_IP:2380       | https://SOME_IP:2379       |      false |
| bc116290b8a83a52 | started | obj-store-01-wchl9 | https://SOME_OTHER_IP:2380 | https://SOME_OTHER_IP:2379 |      false |
+------------------+---------+--------------------+----------------------------+----------------------------+------------+
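
For reference, a member table like this comes from etcdctl's table output. A typical invocation on a kubeadm control plane node looks something like the following; the certificate paths shown are the standard kubeadm locations and may differ per setup:

etcdctl --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    member list -w table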

If I look at the cloud-init-output log for node obj-store-01-wchl9, it terminates with an error:

[2025-01-21 12:14:31] {"level":"warn","ts":"2025-01-21T12:14:31.940653Z","logger":"etcd-client","caller":"clientv3@<version>/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0000d68c0/SOME_IP:2379","attempt":0,"error":"rpc error: code = Unavailable desc = etcdserver: request timed out, waiting for the applied index took too long"}
[2025-01-21 12:14:37] {"level":"warn","ts":"2025-01-21T12:14:37.958044Z","logger":"etcd-client","caller":"clientv3@<version>/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0000d68c0/SOME_IP:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[2025-01-21 12:14:45] {"level":"warn","ts":"2025-01-21T12:14:45.689035Z","logger":"etcd-client","caller":"clientv3@<version>/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0000d68c0/SOME_IP:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member"}
[2025-01-21 12:14:53] [kubelet-check] Initial timeout of 40s passed.
[2025-01-21 12:14:56] {"level":"warn","ts":"2025-01-21T12:14:56.933405Z","logger":"etcd-client","caller":"clientv3@<version>/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0000d68c0/SOME_IP:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member"}
[2025-01-21 12:15:10] {"level":"warn","ts":"2025-01-21T12:15:10.318514Z","logger":"etcd-client","caller":"clientv3@<version>/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0000d68c0/SOME_IP:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member"}
[2025-01-21 12:15:31] {"level":"warn","ts":"2025-01-21T12:15:31.439807Z","logger":"etcd-client","caller":"clientv3@<version>/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0000d68c0/SOME_IP:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member"}
[2025-01-21 12:16:03] {"level":"warn","ts":"2025-01-21T12:16:03.125611Z","logger":"etcd-client","caller":"clientv3@<version>/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0000d68c0/SOME_IP:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member"}
[2025-01-21 12:16:49] {"level":"warn","ts":"2025-01-21T12:16:49.883513Z","logger":"etcd-client","caller":"clientv3@<version>/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0000d68c0/SOME_IP:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member"}
[2025-01-21 12:18:02] {"level":"warn","ts":"2025-01-21T12:18:02.221299Z","logger":"etcd-client","caller":"clientv3@<version>/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0000d68c0/SOME_IP:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member"}

So it appears that we added the etcd member as a learner and then attempted to promote it.
One of the promote calls timed out on the client side but succeeded on the backend, causing all subsequent promote calls to fail with FailedPrecondition.
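
The failure mode is visible at the etcd API level: once the member has become a voting member, any further promote call fails with FailedPrecondition. A minimal sketch using go.etcd.io/etcd/client/v3 (the endpoint and member ID are placeholders):

package main

import (
    "context"
    "log"
    "time"

    clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
    // Placeholder endpoint; a real client would also need the etcd TLS config.
    cli, err := clientv3.New(clientv3.Config{
        Endpoints:   []string{"https://SOME_IP:2379"},
        DialTimeout: 5 * time.Second,
    })
    if err != nil {
        log.Fatal(err)
    }
    defer cli.Close()

    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()

    const memberID uint64 = 0xbc116290b8a83a52 // placeholder learner member ID

    // First promote: may time out client-side yet still succeed on the backend.
    _, _ = cli.MemberPromote(ctx, memberID)

    // Retrying after that hidden success: the member is no longer a learner,
    // so etcd answers "etcdserver: can only promote a learner member".
    if _, err := cli.MemberPromote(ctx, memberID); err != nil {
        log.Printf("retry failed: %v", err)
    }
}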

I found a recent PR that addresses a similar issue, kubernetes/kubernetes#127491, but it does not address the problem I'm seeing.

In the latest kubeadm, MemberPromote now looks like this:

// MemberPromote promotes a member as a voting member. If the given member ID is already a voting member this method
// will return early and do nothing.
func (c *Client) MemberPromote(learnerID uint64) error {
    isLearner, err := c.isLearner(learnerID)
    if err != nil {
        return err
    }
    if !isLearner {
        klog.V(1).Infof("[etcd] Member %s already promoted.", strconv.FormatUint(learnerID, 16))
        return nil
    }

    klog.V(1).Infof("[etcd] Promoting a learner as a voting member: %s", strconv.FormatUint(learnerID, 16))
    cli, err := c.newEtcdClient(c.Endpoints)
    if err != nil {
        return err
    }
    defer func() { _ = cli.Close() }()

    // TODO: warning logs from etcd client should be removed.
    // The warning logs are printed by etcd client code for several reasons, including
    // 1. can not promote yet(no synced)
    // 2. context deadline exceeded
    // 3. peer URLs already exists
    // Once the client provides a way to check if the etcd learner is ready to promote, the retry logic can be revisited.
    var (
        lastError error
    )
    err = wait.PollUntilContextTimeout(context.Background(), constants.EtcdAPICallRetryInterval, constants.EtcdAPICallTimeout,
        true, func(_ context.Context) (bool, error) {
            ctx, cancel := context.WithTimeout(context.Background(), etcdTimeout)
            defer cancel()

            _, err = cli.MemberPromote(ctx, learnerID) // <=== we need an "are we promoted?" check here as well
            if err == nil {
                klog.V(1).Infof("[etcd] The learner was promoted as a voting member: %s", strconv.FormatUint(learnerID, 16))
                return true, nil
            }
            klog.V(5).Infof("[etcd] Promoting the learner %s failed: %v", strconv.FormatUint(learnerID, 16), err)
            lastError = err
            return false, nil
        })
    if err != nil {
        return lastError
    }
    return nil
}

What I would like to do is add an additional check inside the wait.PollUntilContextTimeout func so that it checks whether the member has already been promoted before every promotion attempt. If the member has already been promoted, it should return early,
i.e. reuse this code from earlier in the function:

isLearner, err := c.isLearner(learnerID)
if err != nil {
    return err
}
if !isLearner {
    klog.V(1).Infof("[etcd] Member %s already promoted.", strconv.FormatUint(learnerID, 16))
    return nil
}
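
Concretely, the poll callback might end up looking something like this (just a sketch against the code above, not a tested patch; whether an error from isLearner should abort the poll or be retried is a judgement call, here it is retried):

err = wait.PollUntilContextTimeout(context.Background(), constants.EtcdAPICallRetryInterval, constants.EtcdAPICallTimeout,
    true, func(_ context.Context) (bool, error) {
        // Re-check learner status before every attempt: a previous
        // MemberPromote call may have timed out client-side but still
        // have succeeded on the backend.
        isLearner, err := c.isLearner(learnerID)
        if err != nil {
            lastError = err
            return false, nil
        }
        if !isLearner {
            klog.V(1).Infof("[etcd] Member %s already promoted.", strconv.FormatUint(learnerID, 16))
            return true, nil
        }

        ctx, cancel := context.WithTimeout(context.Background(), etcdTimeout)
        defer cancel()

        _, err = cli.MemberPromote(ctx, learnerID)
        if err == nil {
            klog.V(1).Infof("[etcd] The learner was promoted as a voting member: %s", strconv.FormatUint(learnerID, 16))
            return true, nil
        }
        klog.V(5).Infof("[etcd] Promoting the learner %s failed: %v", strconv.FormatUint(learnerID, 16), err)
        lastError = err
        return false, nil
    })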

Ideally the etcd cluster wouldn't return a timeout when the operation actually succeeded, but I would like to add this extra check regardless, as I think it's fairly low risk. How would I go about adding it?

@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 22, 2025
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@niranjandarshann

Yes, in my opinion the suggested extra check inside the wait.PollUntilContextTimeout callback would be helpful. The edge scenario where an etcd member promotion succeeds on the backend but the client receives a timeout is manageable this way. By preventing pointless promotion attempts, this check would make the MemberPromote method more reliable.

@niranjandarshann

isLearner, err := c.isLearner(learnerID)
if err != nil {
    return err
}
if !isLearner {
    klog.V(1).Infof("[etcd] Member %s already promoted.", strconv.FormatUint(learnerID, 16))
    return nil
}

This code re-checks whether the member is already promoted before attempting the promotion.
The benefit of adding it is that if the learner is already promoted, the function exits early without making redundant promotion attempts.

If the triage gets accepted, I would like to work on it.

@kannon92

/sig cluster-lifecycle

@k8s-ci-robot k8s-ci-robot added sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jan 23, 2025
@neolit123
Member

/transfer kubeadm

@k8s-ci-robot k8s-ci-robot transferred this issue from kubernetes/kubernetes Jan 27, 2025
@neolit123 neolit123 added priority/backlog Higher priority than priority/awaiting-more-evidence. area/etcd kind/feature Categorizes issue or PR as related to a new feature. and removed sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 27, 2025
@neolit123
Member

i don't think it's a bug per se, so more of a feature / improvement.
PRs welcome.

@neolit123 neolit123 added this to the v1.33 milestone Jan 27, 2025
@neolit123 neolit123 added help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. and removed help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. labels Jan 27, 2025
@HirazawaUi
Contributor

Friendly ping @niranjandarshann: if you'd like to contribute to this, you may want to put the implementation plan on your agenda, as we're nearing the code freeze on March 20th.

@rikatz

rikatz commented Mar 6, 2025

/assign

@neolit123 @HirazawaUi I will work on it with Bernard; I am trying to find some spare time for it. The implementation seems straightforward (famous last words...)
