[SYCL][E2E] Fix DeviceLib/assert-windows.cpp run-time errors #17493

Open
wants to merge 2 commits into
base: sycl

ayylol
Contributor

@ayylol ayylol commented Mar 17, 2025

No description provided.

@ayylol ayylol requested a review from a team as a code owner March 17, 2025 17:59
@ayylol ayylol requested a review from againull March 17, 2025 17:59
@ayylol ayylol temporarily deployed to WindowsCILock March 17, 2025 17:59 — with GitHub Actions Inactive
@ayylol
Contributor Author

ayylol commented Mar 17, 2025

For context: this test was XFAILed 5 years ago. The XFAIL tracker (#16507) pointed to a more recent build failure, and once that was fixed the XFAIL was removed from the test. However, the original reason the test was XFAILed was never resolved, so the test failed in internal testing. We don't test Windows CPU in GitHub CI, so this wasn't caught in pre- or post-commit.

// approach as on Linux - call the test in a subprocess.
//
// RUN: env SYCL_UR_TRACE=2 SYCL_DEVICELIB_INHIBIT_NATIVE=1 CL_CONFIG_USE_VECTORIZER=False %{run} %t.out | FileCheck %s --check-prefix=CHECK-FALLBACK
// RUN: env SHOULD_CRASH=1 SYCL_DEVICELIB_INHIBIT_NATIVE=1 CL_CONFIG_USE_VECTORIZER=False %{run} %t.out | FileCheck %s --check-prefix=CHECK-MESSAGE
Contributor Author

This line failed because the error message is sent to stderr, but here we check stdout for it. With these changes I'm specifically checking stderr instead.
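
To illustrate why the original check could not match, here is a minimal sketch that models the two streams with in-memory buffers. It is not the test itself: `emitLikeCrashingTest`, `streamContains`, `runStreamDemo`, and the message text are hypothetical names and data made up for this example, and the real RUN line pipes a process into FileCheck rather than searching strings.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Which captured stream the checker reads.
enum class Stream { Stdout, Stderr };

// Hypothetical stand-in for the crashing test binary: regular output goes to
// `out`, the assertion message goes to `err`. The message text is illustrative.
void emitLikeCrashingTest(std::ostream &out, std::ostream &err) {
  out << "running kernel\n";
  err << "Assertion `x == 0` failed\n";
}

// Returns true if `needle` appears in the chosen stream's captured text.
// Reading Stream::Stdout models a plain `prog | checker` pipe, which never
// sees anything written to stderr.
bool streamContains(Stream which, const std::string &needle) {
  std::ostringstream out, err;
  emitLikeCrashingTest(out, err);
  const std::string text = (which == Stream::Stdout) ? out.str() : err.str();
  return text.find(needle) != std::string::npos;
}

// Self-check: the message is invisible on stdout and visible on stderr.
void runStreamDemo() {
  assert(!streamContains(Stream::Stdout, "Assertion"));
  assert(streamContains(Stream::Stderr, "Assertion"));
}
```

Calling `runStreamDemo()` from a `main` exercises both cases. The point is only that a checker attached to stdout cannot match text that was written to stderr.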

// explicitly. Since the test is going to crash, we'll have to follow a similar
// approach as on Linux - call the test in a subprocess.
//
// RUN: env SYCL_UR_TRACE=2 SYCL_DEVICELIB_INHIBIT_NATIVE=1 CL_CONFIG_USE_VECTORIZER=False %{run} %t.out | FileCheck %s --check-prefix=CHECK-FALLBACK
Contributor Author

@ayylol ayylol Mar 17, 2025

I wasn't sure what the correct fix for this line would be. Looking at the UR trace, the urProgramLink function never appears, whether or not the test crashes. This seemed suspicious to me given the git history: this line has been changed multiple times since the test was XFAILed, so it has never been confirmed to pass with the changes made since then.

Contributor

@againull againull Mar 17, 2025

I believe the urProgramLink call is supposed to be emitted when a backend doesn't support assert natively and we have to link the fallback assert device library at runtime.
I believe the CPU backend supports assert natively, so I don't understand why this check was added.
I believe such verification belongs in our unit tests, i.e. by mocking UR to report that the backend doesn't support native assert and then verifying that urProgramLink is called in that case.
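
As a rough sketch of the shape such a unit test could take, the following uses a hand-rolled fake instead of the project's actual UR mocking facilities. `MockBackend`, `buildProgramWithAssert`, and `wasCalled` are hypothetical names invented here, and the build logic is a simplified assumption about the runtime (build directly when assert is native, compile and link the fallback library otherwise), not a copy of its implementation.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical fake backend; not the real Unified Runtime mock API.
struct MockBackend {
  bool supportsNativeAssert;      // what the fake "device" reports
  std::vector<std::string> trace; // names of the entry points that were "called"
};

// Simplified, assumed model of program creation: the fallback assert device
// library is linked only when the backend lacks native assert support.
void buildProgramWithAssert(MockBackend &backend) {
  if (backend.supportsNativeAssert) {
    backend.trace.push_back("urProgramBuild");
  } else {
    backend.trace.push_back("urProgramCompile");
    backend.trace.push_back("urProgramLink");
  }
}

// Returns true if the named entry point was recorded in the trace.
bool wasCalled(const MockBackend &backend, const std::string &name) {
  return std::find(backend.trace.begin(), backend.trace.end(), name) !=
         backend.trace.end();
}
```

A test would then construct one `MockBackend` with `supportsNativeAssert` set to `false` and expect `wasCalled(backend, "urProgramLink")` to be true, and another with it set to `true` and expect the opposite. This keeps the link-path check independent of which device happens to run the e2e test.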

Contributor

So I am not opposed to removing that check from this e2e test.

Contributor

Actually, another question is why this test is limited to CPU. I believe it should be enabled for GPU too, or explicitly marked as failing on GPU if there is a problem.

2 participants