Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update LLVM to 14.0.3 #4055

Merged
merged 12 commits into from
May 7, 2022
Merged

Update LLVM to 14.0.3 #4055

merged 12 commits into from
May 7, 2022

Conversation

chalcolith
Copy link
Member

@chalcolith chalcolith commented Mar 15, 2022

This change updates our LLVM submodule to 14.0.3 and updates ponyc to compile with that version.

@chalcolith chalcolith added do not merge This PR should not be merged at this time changelog - changed Automatically add "Changed" CHANGELOG entry on merge labels Mar 15, 2022
@ponylang-main
Copy link
Contributor

Hi @kulibali,

The changelog - changed label was added to this pull request; all PRs with a changelog label need to have release notes included as part of the PR. If you haven't added release notes already, please do.

Release notes are added by creating a uniquely named file in the .release-notes directory. We suggest you call the file 4055.md to match the number of this pull request.

The basic format of the release notes (using markdown) should be:

## Title

End user description of changes, why it's important,
problems it solves etc.

If a breaking change, make sure to include 1 or more
examples what code would look like prior to this change
and how to update it to work after this change.

Thanks.

@ponylang-main ponylang-main added the discuss during sync Should be discussed during an upcoming sync label Mar 15, 2022
@chalcolith
Copy link
Member Author

Ponyc compiles and all tests succeed except for small-finalisers and class-final. All the code changes I made are to optimization passes; when I disable those passes the tests still fail, so I'm assuming that something changed in LLVM that is affecting the generation of finalizers. I'm continuing to investigate.

@SeanTAllen
Copy link
Member

We should also see if Windows debug builds stop hanging with this change.

@SeanTAllen SeanTAllen removed the discuss during sync Should be discussed during an upcoming sync label Mar 15, 2022
@chalcolith
Copy link
Member Author

Investigation so far:

Given a Pony program

use @pony_exitcode[None](code: I32)

class ClassFinal
  fun _final() =>
    @pony_exitcode(1)

actor Main
  new create(env: Env) =>
    ClassFinal

Using LLVM 13, the @Main_Dispatch() function ends with

  %6 = tail call i8* @pony_ctx() #2
  %7 = tail call i8* @pony_alloc_small_final(i8* %0, i32 0) #2
  %8 = bitcast i8* %7 to %ClassFinal_Desc**
  store %ClassFinal_Desc* @ClassFinal_Desc, %ClassFinal_Desc** %8, align 8

This is inlined from @Main_tag_create_oo(). If I understand correctly, we are allocating some memory for our instance of ClassFinal's finalizer, then writing a pointer to the global ClassFinal type descriptor to that memory.

Using LLVM 14 with optimizations on, the @Main_Dispatch() function ends with

  %6 = tail call i8* @pony_ctx() #2
  %7 = tail call i8* @pony_alloc_small_final(i8* %0, i32 0) #2

and there is no store of the ClassFinal type descriptor. So when we eventually get to running the finalizers, we have garbage instead of a pointer to the ClassFinal type descriptor.

If I change line 1365 in src/libponyc/codegen/genopt.cc to be pmb.OptLevel = 1 (or if I build in debug mode), the problem goes away (i.e. the store is now present). So now I'm investigating what optimizations are enabled at level 2 and above that are responsible for getting rid of the store.

@ponylang-main ponylang-main added the discuss during sync Should be discussed during an upcoming sync label Mar 30, 2022
@jemc
Copy link
Member

jemc commented Apr 5, 2022

Discussed during the sync call - I suspect if we link against the runtime bitcode the problem will be fixed, because we likely are tagging wrong attributes on one of our runtime functions. We need to tag the right attributes so that LLVM knows that the descriptor memory will be read from by our garbage collector, so that it won't try to optimize away the store to that memory.

That's what I suspect.

@jemc jemc removed the discuss during sync Should be discussed during an upcoming sync label Apr 12, 2022
@chalcolith
Copy link
Member Author

Looks like marking the store as volatile prevents it being removed.

@ponylang-main ponylang-main added the discuss during sync Should be discussed during an upcoming sync label Apr 20, 2022
@chalcolith
Copy link
Member Author

Also on Linux I had to make copies of some of the LLVM C functions because they were marked as deprecated (I didn't see this at first because we don't error on warning in Windows). They were all expecting non-opaque pointers (which we supply), from which they could extract the LLVM type. LLVM has been deprecating these in favour of new functions (with a "2" added to their name) that can take opaque pointers, so you have to provide their types explicitly. So I made copies of the old functions (with a "_P" for Pony added to their names) that aren't marked deprecated.

Alternatively if we don't like this approach we could add a new C function that calls cast<PointerType>(Val->getType()->getScalarType())->getElementType() on an LLVMValueRef and returns the type, and then use that with the new "2" functions.


LLVMValueRef desc_ptr = LLVMBuildStructGEP_P(c->builder, value, 0, "");
LLVMValueRef store_inst = LLVMBuildStore(c->builder, c_t->desc, desc_ptr);
LLVMSetVolatile(store_inst, true);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where we set the type descriptor store to be volatile.

@chalcolith chalcolith requested review from mfelsche and a team April 20, 2022 17:41
@chalcolith chalcolith changed the title Update LLVM to 14.0.0 Update LLVM to 14.0.1 Apr 20, 2022
@jemc
Copy link
Member

jemc commented Apr 20, 2022

I worry that marking the stores as volatile may be more of a workaround than a "real" solution, and it may potentially have performance effects.

That said, I don't have the answers about what the real solution would be.

@kulibali - did you try linking against the runtime bitcode and seeing if that resolves the issue?

@chalcolith
Copy link
Member Author

Stupid typo in my comment above; using runtime bitcode DOES exhibit the problem. So it seems dead store elimination is a little agressive.

@SeanTAllen
Copy link
Member

This seems like it might be worth opening an LLVM issue?

@chalcolith chalcolith changed the title Update LLVM to 14.0.2 Update LLVM to 14.0.3 May 2, 2022
@chalcolith
Copy link
Member Author

I believe we said last week that when Sean was recovered he would do some performance testing on this branch.

@jemc jemc self-assigned this May 3, 2022
@jemc
Copy link
Member

jemc commented May 3, 2022

I'll spend some time poking around the LLVM git history.

@SeanTAllen SeanTAllen removed the discuss during sync Should be discussed during an upcoming sync label May 3, 2022
@jemc
Copy link
Member

jemc commented May 3, 2022

I'll also try removing attributes from function declarations and call sites.

@ponylang-main ponylang-main added the discuss during sync Should be discussed during an upcoming sync label May 3, 2022
@chalcolith
Copy link
Member Author

chalcolith commented May 5, 2022

I have tracked down the breaking change to llvm/llvm-project@3cef3cf. I'll investigate further what we're doing with alias scopes.

@jemc
Copy link
Member

jemc commented May 5, 2022

Thanks for tracking down the commit.

Perhaps all that's needed is to remove the noalias_attr from our runtime function declarations for the runtime functions that allocate new objects.

Technically it's not true that there is no alias to these new objects, because we hold an alias to each of them in our heap's object map (which is the alias we use for running the finalizer). So removing that attribute is probably the "correct" way to solve this problem.

@chalcolith chalcolith marked this pull request as ready for review May 6, 2022 17:39
@chalcolith
Copy link
Member Author

I have removed the noalias attribute from the runtime functions that allocate memory, and it works now without marking the store as volatile.

@jemc
Copy link
Member

jemc commented May 6, 2022

Glad that solved the issue!

I'm gonna clone this locally because there's one more thing I want to try - keeping the noalias attribute on some but not all of the functions, so we can still get this optimization in places where it's safe.

I'm going to do more reading in the Pony runtime to look for all the places we might dereference this pointer.

@jemc
Copy link
Member

jemc commented May 6, 2022

I've pushed a commit which reinstates most of the noalias attributes - all of them except the ones for pony_alloc_final* functions, as well as the deserialise function, which may include objects with finalisers.

This should help us keep the potential of LLVM being able to optimize away some heap allocations/stores which are not read from / captured, while still keeping correctness for objects with finalisers that may have specific finaliser effects to be run.

Only the `pony_alloc_final*` functions need to avoid the `noalias`
attribute, because running a finaliser is the only case where we
have the issue where the runtime needs to dereference memory with
the alias held by the object map, but LLVM sees that the object
was never read from again, and thus can optimize it away.
Only finaliser effects prevent this LLVM optimization.
jemc added a commit to savi-lang/savi that referenced this pull request May 7, 2022
In LLVM 13.0.0 we had observed a bug in which, when using
the new LLVM PassBuilder, some functions were being generated
without return instructions, causing the instruction pointer
to just blunder off the end of one function definition
into the next one in the executable binary's memory layout,
likely violating stack/argument assumptions of that function,
ultimately leading to some form of memory corruption.

This is why we had disabled use of those optimizations
temporarily in an earlier PR (#203), though it was only
diagnosed in detail this morning.

We update to the latest available LLVM release (14.0.3)
to avoid this issue and also to keep up with the times.

Note that in this update to 14.0.3 an issue was discovered in Pony
with the use of `NoAlias` attributes on allocation functions
for those functions that allocate an object with a finalizer.
These attributes were technically incorrect, and a new optimization
present in 14.0.3 now can optimize away code that shouldn't be
removed if it trusts the erroneous use of that attribute.
See ponylang/ponyc#4055

This bug had not been observed directly in Savi because we do
not yet support allocations with finalizers. But the attributes
are updated here nonetheless, as it's easy to fix this correctness
issue now while it's at top of mind.
jemc added a commit to savi-lang/savi that referenced this pull request May 7, 2022
In LLVM 13.0.0 we had observed a bug in which, when using
the new LLVM PassBuilder, some functions were being generated
without return instructions, causing the instruction pointer
to just blunder off the end of one function definition
into the next one in the executable binary's memory layout,
likely violating stack/argument assumptions of that function,
ultimately leading to some form of memory corruption.

This is why we had disabled use of those optimizations
temporarily in an earlier PR (#203), though it was only
diagnosed in detail this morning.

We update to the latest available LLVM release (14.0.3)
to avoid this issue and also to keep up with the times.

Note that in this update to 14.0.3 an issue was discovered in Pony
with the use of `NoAlias` attributes on allocation functions
for those functions that allocate an object with a finalizer.
These attributes were technically incorrect, and a new optimization
present in 14.0.3 now can optimize away code that shouldn't be
removed if it trusts the erroneous use of that attribute.
See ponylang/ponyc#4055

This bug had not been observed directly in Savi because we do
not yet support allocations with finalizers. But the attributes
are updated here nonetheless, as it's easy to fix this correctness
issue now while it's at top of mind.
jemc added a commit to savi-lang/savi that referenced this pull request May 7, 2022
In LLVM 13.0.0 we had observed a bug in which, when using
the new LLVM PassBuilder, some functions were being generated
without return instructions, causing the instruction pointer
to just blunder off the end of one function definition
into the next one in the executable binary's memory layout,
likely violating stack/argument assumptions of that function,
ultimately leading to some form of memory corruption.

This is why we had disabled use of those optimizations
temporarily in an earlier PR (#203), though it was only
diagnosed in detail this morning.

We update to the latest available LLVM release (14.0.3)
to avoid this issue and also to keep up with the times.

Note that in this update to 14.0.3 an issue was discovered in Pony
with the use of `NoAlias` attributes on allocation functions
for those functions that allocate an object with a finalizer.
These attributes were technically incorrect, and a new optimization
present in 14.0.3 now can optimize away code that shouldn't be
removed if it trusts the erroneous use of that attribute.
See ponylang/ponyc#4055

This bug had not been observed directly in Savi because we do
not yet support allocations with finalizers. But the attributes
are updated here nonetheless, as it's easy to fix this correctness
issue now while it's at top of mind.
@SeanTAllen SeanTAllen removed the do not merge This PR should not be merged at this time label May 7, 2022
@SeanTAllen SeanTAllen merged commit cb824ee into main May 7, 2022
@SeanTAllen SeanTAllen deleted the llvm1400 branch May 7, 2022 18:20
@ponylang-main ponylang-main removed the discuss during sync Should be discussed during an upcoming sync label May 7, 2022
github-actions bot pushed a commit that referenced this pull request May 7, 2022
github-actions bot pushed a commit that referenced this pull request May 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog - changed Automatically add "Changed" CHANGELOG entry on merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants