-
Notifications
You must be signed in to change notification settings - Fork 760
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][RTC] Cache frontend invocation #16823
Conversation
Signed-off-by: Julian Oppermann <[email protected]>
Signed-off-by: Julian Oppermann <[email protected]>
@sommerlukas @cperkinsintel I'll leave this PR in draft mode until discussions settle, but this is how I'd do the basic mechanism, to be extended with a more sophisticated cache lookup. |
Signed-off-by: Lukas Sommer <[email protected]> Signed-off-by: Julian Oppermann <[email protected]>
Signed-off-by: Julian Oppermann <[email protected]>
Signed-off-by: Julian Oppermann <[email protected]>
Signed-off-by: Julian Oppermann <[email protected]>
Signed-off-by: Julian Oppermann <[email protected]>
Signed-off-by: Julian Oppermann <[email protected]>
Signed-off-by: Julian Oppermann <[email protected]>
Signed-off-by: Julian Oppermann <[email protected]>
Signed-off-by: Julian Oppermann <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly nits and questions.
public: | ||
explicit RTCHashResult(const char *PreprocLog) | ||
: Failed{true}, Hash{}, PreprocLog{PreprocLog} {} | ||
RTCHashResult(const char *Hash, const char *PreprocLog) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this constructor take PreprocLog
? The only use of this constructor I found in the code only passes ""
in that case, might just as well set it to an empty string here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To distinguish the success/failure result variants, and possibly to always take the log when we reuse the preprocessed source in the other action. I pushed an alternate design in ed093ce, but don't love it. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't notice the issue with the same parameter type before.
I think ed093ce isn't bad, but we could further simplify it by only storing a single string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, done!
// Update the cache size file and trigger cache eviction if needed. | ||
if (TotalSize) | ||
updateCacheFileSizeAndTriggerEviction(getRootDir(), TotalSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it correct that this happens outside of the lock?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not my design (CC @cperkinsintel @uditagarwal97), but I'd say yes. The lock we held before is only for a specific entry (directory), and the eviction mechanism employs its own lock for the file containing the total cache size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1. Eviction uses its own lock.
if (isEvictionEnabled()) | ||
saveCurrentTimeInAFile(FileName + CacheEntryAccessTimeSuffix); | ||
} catch (...) { | ||
// If read was unsuccessfull try the next item |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment correct? Looks more like we're just giving up and will compile fresh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cperkinsintel Just realised there's a potential bug in getCompiledKernelFromDisc
(which I based getDeviceCodeIRFromDisc
on): The iteration over the different .bin
files doesn't make sense because for kernel_compiler
as we don't write a source item to distinguish the binaries stored for the same cache key. putCompiledKernelToDisc
does look for an unused filename to store the binary, but getCompiledKernelFromDisc
will always attempt to read the 0.bin
file and use that unless an FS error occurs.
sycl/test-e2e/KernelCompiler/kernel_compiler_sycl_jit_cache.cpp
Outdated
Show resolved
Hide resolved
Signed-off-by: Julian Oppermann <[email protected]>
Signed-off-by: Julian Oppermann <[email protected]>
Signed-off-by: Julian Oppermann <[email protected]>
Signed-off-by: Julian Oppermann <[email protected]>
sycl/source/detail/jit_compiler.cpp
Outdated
std::string SYCLFileName = CompilationID + ".cpp"; | ||
// The filename must be stable, because it is part of the preprocessed output | ||
// and in consequence, the cache key. | ||
std::string SYCLFileName = "rtc.cpp"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is unfortunate. Some day in the future when debugging support is required, this "rt.cpp" filename for everything might become a problem. That day is likely no day soon, so I'm fine with it.
Is there no other (simple) way for a file name that is unique for the kernel, but consistent between invocations (cache read and write )?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, maybe a short hash of the source string only? Or we could come back to the idea of letting the user set the filename (or a generic tag) via a property, which might be more useful for debugging.
In any case, let me check again how much effort it would be to keep the filename out of the cache key calculation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disabling the line markers in the preprocessor output does the trick for now, so I reverted back to the original behaviour, using rtc_<n>.cpp
as file name.
Signed-off-by: Julian Oppermann <[email protected]>
Adds `sycl_jit`-RTC-specific persistent caching to the runtime. The basic idea is to cache only the LLVM module resulting from the device compilation in bitcode format. Device linking and post-link would be run always (even for a cache hit), as invoking the frontend is the most expensive step in the pipeline right now. The cache key is the concatenation of: - the Base64*-encoding of a BLAKE3 hash of the preprocessed source string (i.e. containing all headers included as virtual files per the `kernel_compiler` extension as well as from the local file system), and - the Base64-encoding of a BLAKE3 hash of the user-supplied build options. *) Replacing `/` by `-` to make the string filesystem-friendly. --------- Signed-off-by: Julian Oppermann <[email protected]> Co-authored-by: Lukas Sommer <[email protected]>
Adds
sycl_jit
-RTC-specific persistent caching to the runtime. The basic idea is to cache only the LLVM module resulting from the device compilation in bitcode format. Device linking and post-link would be run always (even for a cache hit), as invoking the frontend is the most expensive step in the pipeline right now.The cache key is the concatenation of:
kernel_compiler
extension as well as from the local file system), and*) Replacing
/
by-
to make the string filesystem-friendly.