Update pullsecret checks to include auth tokens and hive checks #653
base: master
Conversation
Skipping CI for Draft Pull Request.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: nephomaniac. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
@nephomaniac: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
emailOCM := o.account.Email()
o.log.Info(o.ctx, "Found email for cluster's OCM account: %s\n", emailOCM)

if !o.hiveOnly {
The logic for the flags is pretty unintuitive; is there a way to structure the code so that it's clearer which operations are performed for each flag combination?
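Not prescribing anything, but one option (the options/validationPlan types below are hypothetical; only the flag semantics come from this PR's help text) is to resolve the flag combinations into an explicit plan up front, so the run path reads as a flat list of steps:

package main

import "fmt"

// Hypothetical mirror of the command's flags:
// --hive-only, --hive, --no-token and --no-regcreds.
type options struct {
	hiveOnly   bool
	checkHive  bool
	noToken    bool
	noRegCreds bool
}

// validationPlan answers "which checks run?" once, instead of nesting
// flag tests throughout the run path.
type validationPlan struct {
	clusterSecret bool // compare the in-cluster pull secret against OCM
	accessToken   bool // compare the OCM AccessToken entry
	regCreds      bool // compare the OCM registry-credential entries
	hive          bool // compare the secret stored on Hive against OCM
}

func buildPlan(o options) validationPlan {
	return validationPlan{
		clusterSecret: !o.hiveOnly,
		accessToken:   !o.hiveOnly && !o.noToken,
		regCreds:      !o.hiveOnly && !o.noRegCreds,
		hive:          o.checkHive || o.hiveOnly,
	}
}

func main() {
	// e.g. --hive-only: only the Hive comparison should run.
	fmt.Printf("%+v\n", buildPlan(options{hiveOnly: true}))
}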
if !o.hiveOnly {
	// get the pull secret in cluster
	err = getPullSecretElevated(o.clusterID, o.reason, pullSecret)
In the current validate-pull-secret-email command we have an integration with managed-scripts to retrieve the email without admin elevation; is there a reason to discard that and go back to generating Compliance Alerts?
It appears this is the same behavior as the existing osdctl command; I believe it always runs as Backplane Cluster Admin when fetching the pull secret on the target cluster?
We did have an integration with managed-scripts in the past (#488), but it was removed due to a bug in the implementation; I was wrongly under the impression that it was still there. I think we can revisit this idea with a new implementation; I opened a ticket for it: https://issues.redhat.com/browse/OSD-28292
if len(userName) <= 0 {
	err = fmt.Errorf("found empty 'username' for account:'%s', needed for accessToken", o.account.HREF())
} else {
	accessToken, err = o.getAccessTokenFromOCM(userName)
This check requires Region Lead permission to pull the secret from OCM; I don't see this stated anywhere in a message to the CLI user or in the help command. Ideally it should require user confirmation if the flag is active, e.g.:
"Retrieving credentials from OCM requires Region Lead permissions, continue? [y/N]"
For test purposes, the command 'may' not always require RL perms. If the cluster is owned by the current OCM user executing this request, the command will not 'impersonate'.
If the command fails due to permission issues, this error is provided:
osdctl/cmd/cluster/validatepullsecret.go, line 847 (c6d6b0c):
"AccessToken ops may require 'region lead' permissions to execute.\n"+
and then the user is prompted whether they'd like to continue:
osdctl/cmd/cluster/validatepullsecret.go, line 293 (c6d6b0c):
fmt.Printf("Error fetching OCM AccessToken:%s'.\nWould you like to continue with validations? ", err)
We could also add a prompt during the prerun() when the CLI args indicate the user will want to fetch the AccessToken?
We could be more upfront and add it to the description and during command runtime. The message could also be less ambiguous:
AccessToken ops may require 'region lead' permissions to execute.
This leaves doubts about when it is required; the user needs to find this information elsewhere.
I suggest something like:
"Fetching the AccessToken from OCM requires Region Lead permissions unless you're the cluster owner. Would you like to continue?"
}

/* Checks against hive to confirm hive is sync'd with OCM */ | ||
if o.checkHive { |
I don't see why we need to check the credentials stored in Hive.
We don't have a formal process to rotate the entire pull secret for managed OpenShift; once the credentials are stored in Hive after cluster creation, AFAIK there's no other process that changes them and would cause them to get out of sync (except owner transfer, but those are manual processes with their own tooling).
Updates to the pull secret are not supported in managed OpenShift, so I don't see why we should complicate the code with this check for a situation we can already detect from the pull secret on the cluster. It seems redundant.
This also generates an additional Compliance Alert ticket.
This part was added to help a user validate the Hive data plus syncsets. The Hive checks do not run by default and can be run independently of the target cluster checks; they would only need to be run if the checks on the target cluster fail, and in that case the compliance ticket is likely justified?
This is currently waiting for feedback on the actual pull secret update process and tooling. If this is not a touch point, then validating it would be misleading.
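For illustration only (the cluster ID and reason are placeholders; the flags are the ones from the help text further down), running just the Hive comparison would look something like:
./osdctl -S cluster validate-pull-secret <CLUSTER_ID> --hive-only --reason "OHSS-XXXX"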
The Hive team confirmed here https://redhat-internal.slack.com/archives/CE3ETN3J8/p1739453387930049 that the secret is not used in any capacity after provisioning.
Having that validation is indeed misleading; it raises doubt about:
- The role of that secret (is it actually important?)
- Follow-up action if it is found not up to date: should it be updated and how?
- When this validation should be run and as part of which investigation
I think we should remove this check, but if you have more doubts, could you please open a thread on the SRE channel so we can gather more feedback from the team?
/label tide/merge-method-squash
Wouldn't it be easier to rename the command we currently have to validate-pull-secret-email and create a new command named validate-pull-secret-credentials? The two checks, even if related, have different steps; it would simplify the code.
Filed https://issues.redhat.com/browse/OSD-28215 for better tracking of the extended validation.
This is related to OSD-21669. It attempts to update the existing pull secret validations to include:
./osdctl -S cluster validate-pull-secret -h
Usage:
osdctl cluster validate-pull-secret [CLUSTER_ID] [flags]
Examples:
Flags:
-h, --help help for validate-pull-secret
--hive Check secret values on Hive against OCM for this target cluster
--hive-ocm string Path to OCM 'prod' config used to connect to hive when target cluster is using 'stage' or 'integration' envs
--hive-only Exclude checks against target cluster. Check Hive only.
--no-regcreds Exclude OCM Registry Credentials checks against cluster secret
--no-token Exclude OCM Access Token checks against cluster secret
--reason string The reason for this command to be run (usually an OHSS or PD ticket).
-v, --verbose int debug=4, (default)info=3, warn=2, error=1 (default 3)
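For illustration (the cluster ID and ticket below are placeholders, not output from a real run), a full validation that also compares Hive against OCM might be invoked as:
./osdctl -S cluster validate-pull-secret <CLUSTER_ID> --hive --reason "OHSS-XXXX"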
Example run...