[AutoBump] Merge with e7244d86 (Jan 08) (29) #499

Open
wants to merge 480 commits into
base: bump_to_d622b66a

jorickert

No description provided.

nikic and others added 30 commits January 6, 2025 10:18

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Currently hfinkel is listed as the AliasAnalysis maintainer, but I
believe he hasn't been actively working on LLVM in the last couple of
years, so I'd like to update this information.

I'd like to nominate fhahn and myself as the new maintainers for AA.
While here, I'd also like to nominate alinas as the maintainer for
MemorySSA.
If the directive is put inside `#if __cplusplus`, it should reflect the condition instead of being a generic `expected`.

Verified

…-rt (llvm#121625)

This compile time test uses inline asm with `.arch` directives to set
the target feature. It is however broken and always fails, since each
`asm()` construct in LLVM sets up a new AsmParser, and therefore the
`.arch` directive has no effect on later `asm()` contents. To fix this
we need to use a single inline `asm()` call with the entire code chunk
to emit contained inside.

Verified

…ion (llvm#121559)

As part of llvm#112171, support for FEAT_PAuthLR's CFI instructions was added.
However, the CFI instructions are emitted in the incorrect location. This
leads to incorrect CodeGen being generated and possible issues when
running a program. According to the ABI, the CFI instructions should be
emitted before the signing instruction. This is now done properly.

ABI information can be found here:
https://github.com/ARM-software/abi-aa/blob/bf0e2c8047c70987165f3e05e571d7836370ade9/aadwarf64/aadwarf64.rst#44call-frame-instructions

Verified

…vm#121681)

The new line types help to annotate */&/&& in simple requirements as
binary operators.

Fixes llvm#121675.

Verified

Print operations are often used for debugging, immediately before the
compiler aborts. In such cases, it is sometimes possible that the output
isn't fully produced yet. Make sure it is by explicitly flushing the
output.

Verified

…rt for single reductions in ComplexDeinterleavingPass (llvm#112875)"  (llvm#120441)

This reverts commit 76714be, fixing the
build failure that caused the revert.

The failure stemmed from the complex deinterleaving pass identifying a
series of add operations as a "complex to single reduction", so when it
tried to transform this erroneously identified pattern, it faulted. The
fix applied is to ensure that complex numbers (or patterns that match
them) are used throughout, by checking if there is a deinterleave node
amidst the graph.

Verified

… cost model (llvm#120742)

Verified

This PR is in reference to porting LLDB on AIX.

Link to discussions on llvm discourse and github:

1. https://discourse.llvm.org/t/port-lldb-to-ibm-aix/80640
2. llvm#101657
The complete changes for porting are present in this draft PR:
llvm#102601

Added a HostInfoAIX file for the AIX platform.
Most of the common functionality is now handled by the parent
HostInfoPosix, so only some basic functions are implemented here.

Verified

Move the debug output that prints out the selected VF from
selectVectorizationFactor -> computeBestVF. This means that the output
will still be written even after removing the assert for the legacy and
vplan cost models matching.

Verified

… binary operation on input (llvm#120207)

Add codegen for when the input type has 4 times as many elements as the
output type and the input to the partial reduction does not have a
binary operation performed on it.

Verified

…1550)

An instantiated templated function definition may not have a body due to
parsing errors inside the templated function. When serializing, an
assert is triggered inside `ASTRecordWriter::AddFunctionDefinition`.

The instantiation may happen on an intermediate module.

The test case was reduced from `mp-units`.

Verified

This commit adds an NVIDIA-specific lowering of `cf.assert` to
`__assertfail`.

Note: `getUniqueFormatGlobalName`, `getOrCreateFormatStringConstant` and
`getOrDefineFunction` are moved to `GPUOpsLowering.h`, so that they can
be reused.

Verified

)

This is trivially additional support for the existing ALLOCATE
directive, which allows an ALIGN clause.

The ALLOCATE directive is currently not implemented, so this just
adds the necessary parser parts so that the compiler does not say
"Huh? I don't get this" [or "Expected OpenMP construct"] when it
encounters the ALIGN clause.

Some parser tests are updated and a new TODO test is added, in case the
ALIGN clause is not supported by the initial support for ALLOCATE.

Verified

This registers `sincos[f|l]` as a clang builtin and updates CGBuiltin to
emit the `llvm.sincos.*` intrinsic when `-fno-math-errno` is set. Note:
`llvm.sincos.*` is only emitted by `__builtin_sincos[f|l]` functions in
this initial patch.

Verified


Verified

…rn (llvm#119527)

This fixes a regression from llvm#101294 by checking if we might be
clobbering a sh{1,2,3}add pattern.

Only do this if the underlying add isn't going to be folded away into an
address offset.

Verified

…21747)

Replace `bzero` with the standard `memset` so that it is common to all platforms.

Verified

Fix a grammar mistake in Polly docs.

Co-authored-by: hstk30-hw <[email protected]>

Verified

…s" (llvm#121749)

Generalize the SymbolIDs used for SymbolData to all SymExprs and use
these IDs for comparing SymbolRef keys in various containers, such as
ConstraintMap. These IDs are superior to raw pointer values because they
are more controllable and are not randomized across executions (unlike
[pointers](https://en.wikipedia.org/wiki/Address_space_layout_randomization)).

The order of these IDs is stable across runs because SymExprs are allocated in
the same order.
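
To illustrate why allocation-order IDs give a reproducible container order, here is a minimal Python sketch. It is not the analyzer's code: `SymExpr`, `SymbolManager`, `sym_id`, and `ordered_constraints` are hypothetical names chosen for the example, and a plain dictionary stands in for ConstraintMap.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class SymExpr:
    sym_id: int  # hypothetical field: ID assigned in allocation order
    name: str


class SymbolManager:
    """Hands out IDs in allocation order, so they repeat exactly across runs."""

    def __init__(self) -> None:
        self._ids = itertools.count()

    def make(self, name: str) -> SymExpr:
        return SymExpr(next(self._ids), name)


def ordered_constraints(constraints: dict) -> list:
    """Order (symbol, constraint) pairs by stable ID instead of by address."""
    return sorted(constraints.items(), key=lambda kv: kv[0].sym_id)
```

Sorting by `sym_id` depends only on the order in which symbols were created, whereas sorting by address would depend on where the allocator happened to place each object in a given run.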

Stability of the constraint order is important for the stability of the
analyzer results. I evaluated this change on a set of 200+ open-source C
and C++ projects with the total number of ~78 000 symbolic-execution
issues passing Z3 refutation.

This patch reduced the run-to-run churn (flakiness) in SE issues from
80-90 to 30-40 (out of 78K) in our CSA deployment (in our setting flaky
issues are mostly due to Z3 refutation instability).

Note that most of the issue churn (flakiness) is caused by the aforementioned Z3
refutation. With Z3 refutation disabled, issue churn goes down to ~10
issues out of 83K, and this patch has no effect on appearing/disappearing
issues between runs. It does, however, seem to reduce the volatility of the
execution flow: before, we had 40-80 issues with changed execution flow;
after, 10-30.

Importantly, this change is necessary for the next step in stabilizing
analysis results by caching Z3 query outcomes between analysis runs
(work in progress).

Across our admittedly noisy CI runs, I detected no significant effect on
memory footprint or analysis time.

This PR reapplies llvm#121551 with
a fix for a g++ compiler error reported on some build bots.

CPP-5919

Verified

…21338)

Inlining must be disabled for new-ZT0 callees as the callee is required
to save ZT0 and toggle PSTATE.ZA on entry.

Verified


Verified

llvm#120104)

This combine pattern performs the following transformation:

fmul x, select(y, A, B)    -> fldexp (x, select i32 (y, a, b))
fmul x, select(y, -A, -B)  -> fldexp ((fneg x), select i32 (y, a, b))

where A = 2^a and B = 2^b, and a and b are integers.

This is a follow-up PR that implements the above combine for GlobalISel;
the corresponding DAG combine has already been done for SelectionDAG ISel
(llvm#111109).
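
The arithmetic identity behind the combine can be checked with a small Python sketch. This only models the math on scalar floats using `math.ldexp`; it is not the GlobalISel implementation, and the function names are made up for the example.

```python
import math


def fmul_select(x: float, y: bool, A: float, B: float) -> float:
    """Original form: multiply x by one of two constants chosen by y."""
    return x * (A if y else B)


def fldexp_select(x: float, y: bool, a: int, b: int) -> float:
    """Rewritten form: scale x by 2**a or 2**b, with the exponent chosen by y."""
    return math.ldexp(x, a if y else b)


def fldexp_select_neg(x: float, y: bool, a: int, b: int) -> float:
    """Rewritten form for negated constants: negate x, then scale."""
    return math.ldexp(-x, a if y else b)
```

For example, with A = 8 (a = 3) and B = 0.25 (b = -2), `fmul_select(3.5, True, 8.0, 0.25)` and `fldexp_select(3.5, True, 3, -2)` produce the same value, because multiplying by a power of two only adjusts the exponent.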

Verified

Matches the existing horizontal-add tests, with the additional non-commutable constraint

Verified

llvm#118636)

Fixes llvm#117975, a regression introduced by llvm#112521 due to forgetting
to check for `nullptr` before dereferencing in
`CallExpr::getUnusedResultAttr`.

Verified

Also move the -fno-wrapv option definition next to the -fwrapv one while
here.
mshockwave and others added 30 commits January 7, 2025 15:01

Verified

…g model (llvm#122007)

According to llvm-exegesis, they should have around 2 cycles of latency
on P400 cores.

Verified

…sted access (llvm#119102)

Now that we are accepting commit access requests via GitHub issues, we
can keep track of who has recently requested access.

Verified

…122003)

Case analysis:
* EEW=SEW*2, getEMULEqualsEEWDivSEWTimesLMUL(EEW) returns 2 x VLMUL
* EEW=SEW, getEMULEqualsEEWDivSEWTimesLMUL(EEW) returns VLMUL
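
The two cases follow from the relation that the helper's name spells out, EMUL = (EEW / SEW) * LMUL. A minimal Python sketch of that arithmetic, using exact fractions so that fractional LMUL values work too (the `emul` function is a hypothetical stand-in, not the LLVM helper):

```python
from fractions import Fraction


def emul(eew: int, sew: int, lmul: Fraction) -> Fraction:
    """Effective LMUL of an operand: EMUL = (EEW / SEW) * LMUL."""
    return Fraction(eew, sew) * lmul
```

With SEW = 32 and LMUL = 1, an operand with EEW = 64 gets `emul(64, 32, Fraction(1)) == 2`, while an operand with EEW = 32 keeps EMUL equal to LMUL.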

Verified

This adds a new main command-line entry point for hdrgen, in the
new main.py.  This new interface is used for generating a header.
The old ways of invoking yaml_to_classes.py for other purposes
are left there for now, but `--e` is renamed to `--entry-point`
for consistency with the new CLI.

The YAML schema is expanded with the `header_template` key where
the corresponding `.h.def` file's path is given relative to where
the YAML file is found.  The build integration no longer gives
the `.h.def` path on the command line.  Instead, the script now
emits a depfile that's used by the cmake rules to track that.
The output file is always explicit in the script command line
rather than sometimes being derived from a directory path.

Verified

We are missing MSVC C++ functions since the name is quoted in the LLVM IR,
so we don't find them in the generated IR and therefore don't add the test
checks. Additionally, there is an issue with finding functions using NEON
types (see llvm#121800).

Pull Request: llvm#121976

Verified

Since the files have been reorganized, the readme is out of date. This
patch updates it to be more accurate.

Verified


Verified

Optionally (by default) no longer mark callsite nodes as Recursive,
which means they would be automatically skipped during cloning. This was
too conservative as it prevents cloning of any callsite that showed up
in any recursive cycle, even for non-recursive contexts.

While this will enable partial cloning of recursive contexts, the
recursive calls themselves will not be updated to call the correct
clone, possibly leading to some unnecessary but benign cloning and
affecting bytes hinted reporting. To prevent this, optional support
looks for recursive cycles in contexts during cloning and removes
those contexts from cloning. This requires some additional runtime
overhead, so is disabled by default for now.

Support for correct cloning of recursive cycles is WIP.

Verified

…dBundle. (llvm#121846)

Explicitly disable the copy constructor/assignment for SchedBundle to avoid
accidental use of the default versions, which do not handle Nodes copies properly.
A developer will need to implement them once required.

Verified

…#121861)

This patch introduces a new method:

void Vectorizer::mergeEquivalenceClasses(EquivalenceClassMap &EQClasses)
const;

The method is called at the end of
Vectorizer::collectEquivalenceClasses() and is needed to merge
equivalence classes that differ only by their underlying objects (UO1
and UO2), where UO1 is the 1-level-indirection underlying base of UO2. This
situation arises due to the limited lookup depth used when searching
for underlying bases with llvm::getUnderlyingObject(ptr).

Using any fixed lookup depth can result in the creation of multiple
equivalence classes that differ only by 1-level-indirection bases.

The new approach merges equivalence classes if they have adjacent bases
(1-level indirection). If a series of equivalence classes forms a ladder
of 1-step/level indirections, they are all merged into a single
equivalence class. This provides more opportunities for the load-store
vectorizer to generate better vectors.
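
The ladder merging described above can be sketched in Python as follows. This is a simplified illustration and not the patch's algorithm: the class key is reduced to a `(base, key)` tuple, `one_level_base` is a hypothetical map from a base to the base found one lookup level further, and every class is re-keyed to the root of its ladder whether or not a class with that root already exists.

```python
from collections import defaultdict


def merge_equivalence_classes(classes: dict, one_level_base: dict) -> dict:
    """Merge classes whose bases are linked by 1-level indirections.

    classes:        maps (base, key) -> list of accesses
    one_level_base: maps a base to its 1-level-indirection underlying base
    """

    def root(base):
        # Walk the ladder of 1-level indirections up to its top;
        # the 'seen' set guards against accidental cycles in the input.
        seen = set()
        while base in one_level_base and base not in seen:
            seen.add(base)
            base = one_level_base[base]
        return base

    merged = defaultdict(list)
    for (base, key), accesses in classes.items():
        merged[(root(base), key)].extend(accesses)
    return dict(merged)
```

Given bases `p`, `q`, and `r` where `q` is based on `p` and `r` on `q`, three classes that differ only in their base collapse into one class keyed by `p`, which gives the vectorizer a longer run of accesses to consider.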

---------

Signed-off-by: Klochkov, Vyacheslav N <[email protected]>

Verified


Verified

Add a new file to the module map and remove 2 missing files (migrated
from .def to .td).

Verified

The `ptx_kernel` calling convention is a more idiomatic and standard way
of specifying an NVPTX kernel than using metadata, which is not
supposed to change the meaning of the program. Further, checking the
calling convention is significantly faster than traversing the metadata,
improving compile time.

This change updates the clang and mlir frontends as well as the
NVPTXCtorDtorLowering pass to emit kernels using the calling convention.
In addition, this updates all NVPTX unit tests to use the calling
convention as well.

Verified

Custom lowering for s32 G_ADD/SUB to better match SelectionDAG.
Specifically, on RV64 an s32 is produced as an add+sext of the output, which
allows fewer instructions to sign-extend a couple of patterns. It also allows
the generation of addiw, subw, and negw, reducing the instructions required to
load values.

Log2_ceil_i32 in rvzbb.ll shows a more obvious improvement case.
…ug output.

ORC and JITLink debugging output is written to the dbgs() raw_ostream, which isn't
thread-safe. Use -num-threads=0 to force single-threaded linking for tests that
produce debugging output.

The llvm-jitlink tool is updated to suggest -num-threads=0 when debugging
output is enabled.

Verified

…122030)

Add mask store to getOperandInfo since it has the same behavior.

Verified

…lvm#120667)

As mentioned in llvm#118989, all
sanitizers but tsan are converted to just module passes for easier
maintenance.

This patch removes the TySan function pass, converting TySan from a
function+module pass to just a module pass.

Verified

The implementation of ParentMap assumes that the key is absent if it is
mapped to nullptr. This breaks when trying to store a tuple as the value
type. Remove this assumption by explicit uses of `try_emplace()`.

Verified

This patch extends the MachO linker's map file generation to include
branch extension thunk symbols. Previously, thunks were omitted from the
map file, making it difficult to understand the final layout of the
binary, especially when debugging issues related to long branch thunks.
This change ensures thunks are included and correctly interleaved with
other symbols based on their address, providing an accurate
representation of the linked output.

Verified

…y after internal-externc-isystem when nostdlibinc is used (llvm#122035)

Embedded development often needs to use a different C standard library,
replacing the existing one normally passed as -internal-externc-isystem.
This works fine for an apple-macos target, but apple-none-macho doesn't
work because the MachO driver doesn't implement
AddClangSystemIncludeArgs to add the resource directory as
-internal-isystem like most other drivers do. Move most of the search
path logic from Darwin and DarwinClang down into an AppleMachO toolchain
between the MachO and Darwin toolchains.

Also define __MACH__ for apple-none-macho, as Swift expects all MachO
targets to have that defined.

Verified

…#120058)

This just copies the same conservative definition from mayWriteToMemory,
and enables more VPInstructions to be hoisted out in LICM.

I think this should give more accurate costs, and I was able to build
llvm-test-suite without the legacy-vplan cost model assertion going off.

Verified


Verified

Following llvm#120380,
`err_pack_expansion_length_conflict` has one close paren too many.

Remove the extra parenthesis.

Verified

Following on from llvm#115200, disallow the null sgpr as a resource operand
in some instructions that were missed.

Verified

These can generally be emitted using an ext instruction or a mov from the
high half. The high-half extracts can be free depending on the users,
but that is not handled here, just the basic costs. The patch originally
included all subvector extracts, but that was toned down to just
half-vector extracts to help keep the mid end from breaking up high/low
extracts without having the SLP vectorizer create a mess using other
shuffles.

Verified

The patch llvm#102460 already implements separate DT/LI/SE for the parallel
subfunction. Crashes have been reported when the region generator tries to use
the original function's DT while creating a new parallel subfunction, due to
checks in llvm#101198. This patch aims to fix those cases by switching
the DT/LI while generating the parallel function using the region generator.

Fixes llvm#117877

Verified

…121936)

Bolt makes use of add_llvm_library and as such ends up exporting its
libraries from LLVMExports.cmake, which is not correct.

Bolt doesn't have its own exports file, and I assume that there is no
desire to have one either -- Bolt libraries are not intended to be
consumed as a cmake module, right?

As such, this PR adds a NO_EXPORT option to simply exclude these
libraries from the exports file.