LLVM 7.0.1 compatibility #2976

Merged
merged 23 commits into master from llvm701
Feb 10, 2019
jemc (Member) commented Jan 12, 2019

This PR is a work in progress to collaborate on LLVM 7.0.1 support.
@kulibali already had some commits on a branch, and I've added some more here as well.

I'll continue working on it, but I wanted to make the combined work public here for collaboration in case @kulibali had any more commits to add here.

chalcolith and others added 15 commits December 27, 2018 09:45
There is a known bug in LLVM 7 that results in a large number of spurious warnings when JITing: https://bugs.llvm.org/show_bug.cgi?id=39577

The `llvm-config --includedir` result only gives one include dir,
but `llvm-config --cflags` sometimes gives multiple dirs,
and all are necessary.

This happens, for example, when working against an LLVM that was
built from source using `cmake`.

The new approach uses `llvm-config --cflags`, extracts only the
flags that add new include dirs, and replaces the `-I` usage with
`-isystem` usage that the original ponyc Makefile approach preferred.
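
The flag rewriting described in this commit message can be sketched in a few lines. This is only an illustration in Python, not the Makefile code from the PR; the function name `isystem_flags` and the sample paths are hypothetical.

```python
import shlex


def isystem_flags(cflags: str) -> list[str]:
    """Keep only the include-dir flags from `llvm-config --cflags` output,
    rewriting each `-I<dir>` (or `-I <dir>`) as `-isystem <dir>`."""
    tokens = shlex.split(cflags)
    out: list[str] = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "-I" and i + 1 < len(tokens):
            # Separate form: "-I /path"
            out += ["-isystem", tokens[i + 1]]
            i += 2
            continue
        if tok.startswith("-I") and len(tok) > 2:
            # Joined form: "-I/path"
            out += ["-isystem", tok[2:]]
        # Every other flag (-D..., -std=..., etc.) is dropped.
        i += 1
    return out


# Hypothetical output of `llvm-config --cflags` for a cmake-built LLVM.
sample = "-I/opt/llvm/include -I/opt/llvm/build/include -D_GNU_SOURCE"
print(" ".join(isystem_flags(sample)))
```

With the sample string above, both include directories survive and the `-D` flag is discarded, which is the behaviour the commit message describes.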

LLVM 7 removed the explicit alignment argument from the memcpy and memmove intrinsics (alignment is now expressed as `align` attributes on the pointer arguments).
@jemc added the "do not merge" label (This PR should not be merged at this time) Jan 15, 2019
jemc (Member, Author) commented Jan 17, 2019

So, when compiling ponyc stdlib tests in release mode, I get an LLVM 7 assert fail related to the MergeFunctions optimisation pass. I've managed to narrow it down to this (rather weird) minimal repro:

actor Main
  new create(env: Env) =>
    apply(env)
  
  fun apply(env: Env) =>
    test[I64](env, (0, true))
    test[U64](env, (0, true))
  
  fun test[A: Stringable #read](env: Env, expected: (A, Bool)) =>
    busywork[A](env, expected._1)
  
  fun busywork[A: Stringable #read](env: Env, expected: A) =>
    "." + "." + "." + "." + "." + "." + "." + "." + "." + "." + "." + "." + "."

The assert fail looks like this:

ponyc: /home/jemc/1/code/tarx/llvm-7.0.1.src/include/llvm/IR/ValueMap.h:279: virtual void llvm::ValueMapCallbackVH<llvm::Function *, std::_Rb_tree_const_iterator<(anonymous namespace)::FunctionNode>, llvm::ValueMapConfig<llvm::Function *, llvm::sys::SmartMutex<false> > >::allUsesReplacedWith(llvm::Value *) [KeyT = llvm::Function *, ValueT = std::_Rb_tree_const_iterator<(anonymous namespace)::FunctionNode>, Config = llvm::ValueMapConfig<llvm::Function *, llvm::sys::SmartMutex<false> >]: Assertion 'isa<KeySansPointerT>(new_key) && "Invalid RAUW on key of ValueMap<>"' failed.

The backtrace (with LLVM 7 debugging symbols included) looks like this:

* thread #1, name = 'ponyc', stop reason = signal SIGABRT
  * frame #0: 0x00007ffff6476eab libc.so.6`__GI_raise + 267
    frame #1: 0x00007ffff64615b9 libc.so.6`__GI_abort + 291
    frame #2: 0x00007ffff6461491 libc.so.6`__assert_fail_base.cold.0 + 15
    frame #3: 0x00007ffff646f612 libc.so.6`__GI___assert_fail + 66
    frame #4: 0x0000000004af7855 ponyc`llvm::ValueMapCallbackVH<llvm::Function*, std::_Rb_tree_const_iterator<(anonymous namespace)::FunctionNode>, llvm::ValueMapConfig<llvm::Function*, llvm::sys::SmartMutex<false> > >::allUsesReplacedWith(this=0x000000000662f330, new_key=0x0000000006404e38) at ValueMap.h:278
    frame #5: 0x0000000005c93b2a ponyc`llvm::ValueHandleBase::ValueIsRAUWd(Old=0x0000000006408fd8, New=0x0000000006404e38) at Value.cpp:927
    frame #6: 0x0000000005c936b1 ponyc`llvm::Value::doRAUW(this=0x0000000006408fd8, New=0x0000000006404e38, NoMetadata=false) at Value.cpp:417
    frame #7: 0x0000000005c93cef ponyc`llvm::Value::replaceAllUsesWith(this=0x0000000006408fd8, New=0x0000000006404e38) at Value.cpp:440
    frame #8: 0x0000000004afcb83 ponyc`(anonymous namespace)::MergeFunctions::mergeTwoFunctions(this=0x0000000006541d90, F=0x0000000006401498, G=0x0000000006408fd8) at MergeFunctions.cpp:773
    frame #9: 0x0000000004afa6fb ponyc`(anonymous namespace)::MergeFunctions::insert(this=0x0000000006541d90, NewFunction=0x0000000006408fd8) at MergeFunctions.cpp:854
    frame #10: 0x0000000004af6ec5 ponyc`(anonymous namespace)::MergeFunctions::runOnModule(this=0x0000000006541d90, M=0x00000000063e2630) at MergeFunctions.cpp:421
    frame #11: 0x0000000005bfc548 ponyc`(anonymous namespace)::MPPassManager::runOnModule(this=0x00000000064f56f0, M=0x00000000063e2630) at LegacyPassManager.cpp:1669
    frame #12: 0x0000000005bfbffa ponyc`llvm::legacy::PassManagerImpl::run(this=0x00000000064f5250, M=0x00000000063e2630) at LegacyPassManager.cpp:1774
    frame #13: 0x0000000005bfcaa1 ponyc`llvm::legacy::PassManager::run(this=0x00007fffffffc800, M=0x00000000063e2630) at LegacyPassManager.cpp:1805
    frame #14: 0x00000000028be353 ponyc`optimise(c=0x00007fffffffc938, pony_specific=true) at genopt.cc:1426
    frame #15: 0x00000000028bda87 ponyc`::genopt(c=0x00007fffffffc938, pony_specific=true) at genopt.cc:1435
    frame #16: 0x00000000028f1142 ponyc`genexe(c=0x00007fffffffc938, program=0x00007ffff63ffd40) at genexe.c:557
    frame #17: 0x000000000288744f ponyc`codegen(program=0x00007ffff63ffd40, opt=0x00007fffffffce28) at codegen.c:875
    frame #18: 0x00000000028f5033 ponyc`generate_passes(program=0x00007ffff63ffd40, options=0x00007fffffffce28) at pass.c:360
    frame #19: 0x000000000285e41d ponyc`compile_package(path=".", opt=0x00007fffffffce28, print_program_ast=false, print_package_ast=false) at main.c:67
    frame #20: 0x000000000285e240 ponyc`main(argc=1, argv=0x00007fffffffcfe8) at main.c:109
    frame #21: 0x00007ffff646311b libc.so.6`__libc_start_main + 235
    frame #22: 0x000000000285e02a ponyc`_start + 42

The MergeFunctions::mergeTwoFunctions function is an LLVM optimization used to merge code for equivalent functions and redirect all callers to the merged function. It fails when trying to merge this F and G:

(lldb) p F->dump()

; Function Attrs: nounwind
define private fastcc nonnull %None* @Main_ref_test_U64_val_o2Wbo(%Main* nocapture readnone dereferenceable(256) %this, %Env* noalias nocapture readonly dereferenceable(64) %env, %t2_U64_val_Bool_val %expected) unnamed_addr #2 !dbg !98 !pony.abi !4 {
entry:
  call void @llvm.dbg.value(metadata %Main* %this, metadata !97, metadata !DIExpression()), !dbg !200
  call void @llvm.dbg.value(metadata %Env* %env, metadata !107, metadata !DIExpression()), !dbg !201
  %0 = extractvalue %t2_U64_val_Bool_val %expected, 0
  %1 = tail call fastcc %None* @Main_box_busywork_I64_val_owo(%Main* nonnull %this, %Env* nonnull %env, i64 %0), !dbg !202
  ret %None* @None_Inst, !dbg !203
}
(lldb) p G->dump()

; Function Attrs: nounwind
define private fastcc nonnull %None* @Main_val_test_I64_val_o2wbo(%Main* nocapture readnone dereferenceable(256) %this, %Env* noalias nocapture readonly dereferenceable(64) %env, %t2_I64_val_Bool_val %expected) unnamed_addr #2 !dbg !204 !pony.abi !4 {
entry:
  call void @llvm.dbg.value(metadata %Main* %this, metadata !205, metadata !DIExpression()), !dbg !206
  call void @llvm.dbg.value(metadata %Env* %env, metadata !207, metadata !DIExpression()), !dbg !208
  %0 = extractvalue %t2_I64_val_Bool_val %expected, 0
  %1 = tail call fastcc %None* @Main_box_busywork_I64_val_owo(%Main* nonnull %this, %Env* nonnull %env, i64 %0), !dbg !209
  ret %None* @None_Inst, !dbg !210
}

Note from the LLVM IR that, by this point, the busywork function has already been merged: both callers point to the I64 specialization, even though they originally pointed to the U64 and I64 specializations (respectively). So the merge worked for the busywork function, but it isn't working for the test function.

It fails when trying to run G->replaceAllUsesWith(BitcastF), where BitcastF is an LLVM ConstantExpr value that casts F to the type of G:

frame #8: 0x0000000004afcb83 ponyc`(anonymous namespace)::MergeFunctions::mergeTwoFunctions(this=0x0000000006541d90, F=0x0000000006401498, G=0x0000000006408fd8) at MergeFunctions.cpp:773
   772 	        Constant *BitcastF = ConstantExpr::getBitCast(F, G->getType());
-> 773 	        G->replaceAllUsesWith(BitcastF);

The replaceAllUsesWith feature has a system that calls any registered callbacks for things that are holding the value that is to be replaced. One of these things is a ValueMap<Function *, std::_Rb_tree_const_iterator<(anonymous namespace)::FunctionNode>>, which complains at runtime that the ConstantExpr value from the bitcast doesn't match the required type of the key (Function).
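
To make the failure mode concrete, here is a toy Python model of the mechanism described above. It is not LLVM's API: the classes `Function`, `ConstantExpr` and `FunctionKeyedMap` and the helper `replace_all_uses_with` are hypothetical stand-ins that only mirror the idea of a map keyed by functions being notified when one of its keys is replaced.

```python
class Function:
    """Stand-in for an LLVM function value."""
    def __init__(self, name: str):
        self.name = name


class ConstantExpr:
    """Stand-in for a bitcast expression wrapping a function."""
    def __init__(self, operand: Function):
        self.operand = operand


class FunctionKeyedMap:
    """Toy map keyed by Function objects that is told when a key is replaced."""
    def __init__(self):
        self._entries = {}

    def insert(self, fn: Function, value) -> None:
        self._entries[fn] = value

    def all_uses_replaced_with(self, old_key, new_key) -> None:
        if old_key not in self._entries:
            return
        # The map can only be re-keyed by another Function; anything else
        # (such as a bitcast expression) is rejected, like the assert in the log.
        if not isinstance(new_key, Function):
            raise TypeError("Invalid replacement for a Function key")
        self._entries[new_key] = self._entries.pop(old_key)


def replace_all_uses_with(old, new, watchers) -> None:
    """Notify every registered watcher that `old` is being replaced by `new`."""
    for watcher in watchers:
        watcher.all_uses_replaced_with(old, new)


f, g = Function("F"), Function("G")
fn_map = FunctionKeyedMap()
fn_map.insert(g, "node for G")

try:
    # Mirrors G->replaceAllUsesWith(BitcastF) while G is still a key in the map.
    replace_all_uses_with(g, ConstantExpr(f), [fn_map])
except TypeError as err:
    print("rejected:", err)
```

In this model the replacement only fails while G is still registered as a key, which matches the observation that something must normally remove the entry before the bitcast replacement happens.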

At this point, I'm confused as to how the bitcasting code could have ever been expected to work correctly at all, but it's been around in LLVM for some time, so there must be some other factor at play here.

I will continue to investigate this.

jemc (Member, Author) commented Jan 18, 2019

Okay, I figured out that this LLVM assert fail only happens when the MergeFunctions pass runs more than once - the second time it has this problem, but only for some particular programs, like the one I showed above.

I'm still not entirely sure how or why this bug happens, but I'm able to prevent it by keeping the MergeFunctions pass from running more than once, in the latest commit. I think a "real" fix would probably have to happen as a bugfix upstream.
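
One way to picture "keeping a pass from running more than once" is a pipeline builder that refuses to schedule certain passes a second time. The sketch below is a hypothetical Python illustration of that idea only; it is not the change made in the commit, and `PassPipeline` is an invented name.

```python
class PassPipeline:
    """Toy pass list where selected passes may be scheduled at most once."""

    def __init__(self, run_once=()):
        self._run_once = set(run_once)
        self._passes = []

    def add(self, name: str) -> bool:
        """Append a pass; return False if it was skipped as a repeat."""
        if name in self._run_once and name in self._passes:
            return False
        self._passes.append(name)
        return True

    @property
    def passes(self):
        return list(self._passes)


pipeline = PassPipeline(run_once={"mergefunc"})
for name in ["mergefunc", "inline", "mergefunc", "inline"]:
    pipeline.add(name)

# Only the first "mergefunc" is kept; "inline" may still repeat.
print(pipeline.passes)
```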

Note that the issue can be reproduced without ponyc by obtaining the unoptimized LLVM IR for the above minimal program (by running ponyc in debug mode to get it), then running LLVM's optimizer tool (called opt) with these options (causing the mergefunc pass to be run twice):

# (from within the LLVM 7 build tree, where tools like `opt` have already been built)
bin/opt -mergefunc -mergefunc -S --debug-pass=Executions /path/to/pony-example.ll

As of now, stdlib tests compile and run successfully for me in ponyc release mode compilation (optimizations enabled).

jemc (Member, Author) commented Jan 22, 2019

@kulibali - when you get a chance, could you look at the appveyor failure? Something in the waf script is amiss, I think.

Otherwise, this PR is pretty much working as-is - there was one CircleCI failure on an ARM build, but I've kicked the CI to run again and see if it was spurious.

chalcolith (Member) commented

I neglected to update appveyor.yml for the new version:

diff --git a/.appveyor.yml b/.appveyor.yml
index f8467b57..1766c241 100644
--- a/.appveyor.yml
+++ b/.appveyor.yml
@@ -10,8 +10,8 @@ branches:
 environment:
   matrix:
   - llvm: 3.9.1
-  - llvm: 5.0.1
-  - llvm: 6.0.0
+  - llvm: 6.0.1
+  - llvm: 7.0.1

 configuration:
   - release

chalcolith (Member) commented Jan 23, 2019

Seems to be an additional problem on Windows. For sufficiently complex programs (e.g. the stdlib tests; helloworld works fine), the generated object file seems to be corrupt, and the Microsoft linker says fatal error LNK1143: invalid or corrupt file: no symbol for COMDAT section 0x4.

This was theoretically fixed in July (https://reviews.llvm.org/rL337950), but at least one other person has encountered it (https://stackoverflow.com/questions/53811108/kaleidoscope-chapter-8-linker-command-failed-with-exit-code-1143).

I will investigate further if I have time.

mfelsche (Contributor) commented

@kulibali I also gave it a try on Windows 10.

My ponyc is:

0.25.0-7fe0d935 [release]
compiled with: llvm 7.0.1 -- msvc-15-x64
Defaults: pic=false ssl=openssl_0.9.0

I tried to compile: https://github.com/mfelsche/pony-appdirs
It worked using ponyc with --debug, but a release build failed with the error you described.
I am somehow unable to get hold of the .obj file on Windows. If I can help by giving you any more hints, let me know.

chalcolith (Member) commented

You have to run with --pass=obj to get an obj file; otherwise it gets cleaned up when linking fails.

mfelsche (Contributor) commented

@kulibali thank you. I made a silly typo.

Here is some info on the headers. It seems it contains some packaged functions (COMDAT stuff) without a symbol (is that legit?):

Microsoft (R) COFF/PE Dumper Version 14.13.26131.1
Copyright (C) Microsoft Corporation.  All rights reserved.


Dump of file .\selftest_nodebug.obj

File Type: COFF OBJECT

FILE HEADER VALUES
            8664 machine (x64)
               A number of sections
               0 time date stamp
           1B696 file pointer to symbol table
              C5 number of symbols
               0 size of optional header
               0 characteristics

SECTION HEADER #1
   .text name
       0 physical address
       0 virtual address
    FD15 size of raw data
     1A4 file pointer to raw data (000001A4 to 0000FEB8)
    FEB9 file pointer to relocation table
       0 file pointer to line numbers
     561 number of relocations
       0 number of line numbers
60500020 flags
         Code
         16 byte align
         Execute Read

SECTION HEADER #2
   .data name
       0 physical address
       0 virtual address
       0 size of raw data
   13484 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
C0300040 flags
         Initialized Data
         4 byte align
         Read Write

SECTION HEADER #3
    .bss name
       0 physical address
       0 virtual address
       0 size of raw data
       0 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
C0300080 flags
         Uninitialized Data
         4 byte align
         Read Write

SECTION HEADER #4
  .xdata name
       0 physical address
       0 virtual address
     9CC size of raw data
   13484 file pointer to raw data (00013484 to 00013E4F)
   13E50 file pointer to relocation table
       0 file pointer to line numbers
      24 number of relocations
       0 number of line numbers
40300040 flags
         Initialized Data
         4 byte align
         Read Only

SECTION HEADER #5
  .rdata name
       0 physical address
       0 virtual address
      20 size of raw data
   13FB8 file pointer to raw data (00013FB8 to 00013FD7)
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
40501040 flags
         Initialized Data
         COMDAT (no symbol)
         16 byte align
         Read Only

SECTION HEADER #6
  .rdata name
       0 physical address
       0 virtual address
      A0 size of raw data
   13FD8 file pointer to raw data (00013FD8 to 00014077)
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
40601040 flags
         Initialized Data
         COMDAT (no symbol)
         32 byte align
         Read Only

SECTION HEADER #7
  .rdata name
       0 physical address
       0 virtual address
      A0 size of raw data
   14078 file pointer to raw data (00014078 to 00014117)
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
40601040 flags
         Initialized Data
         COMDAT (no symbol)
         32 byte align
         Read Only

SECTION HEADER #8
  .rdata name
       0 physical address
       0 virtual address
      40 size of raw data
   14118 file pointer to raw data (00014118 to 00014157)
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
40601040 flags
         Initialized Data
         COMDAT (no symbol)
         32 byte align
         Read Only

SECTION HEADER #9
  .rdata name
       0 physical address
       0 virtual address
    4AEF size of raw data
   14158 file pointer to raw data (00014158 to 00018C46)
   18C47 file pointer to relocation table
       0 file pointer to line numbers
     2EF number of relocations
       0 number of line numbers
40500040 flags
         Initialized Data
         16 byte align
         Read Only

SECTION HEADER #A
  .pdata name
       0 physical address
       0 virtual address
     3B4 size of raw data
   1A9A0 file pointer to raw data (0001A9A0 to 0001AD53)
   1AD54 file pointer to relocation table
       0 file pointer to line numbers
      ED number of relocations
       0 number of line numbers
40300040 flags
         Initialized Data
         4 byte align
         Read Only

  Summary

           0 .bss
           0 .data
         3B4 .pdata
        4C8F .rdata
        FD15 .text
         9CC .xdata
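
The `flags` values in the dump above can be decoded by hand, which shows exactly which sections carry the COMDAT bit. The following Python sketch uses a subset of the `IMAGE_SCN_*` characteristic bits from the PE/COFF format; the function names are hypothetical and the table is deliberately incomplete.

```python
# Subset of PE/COFF section characteristics (IMAGE_SCN_* bits).
FLAG_BITS = {
    0x00000020: "Code",
    0x00000040: "Initialized Data",
    0x00000080: "Uninitialized Data",
    0x00001000: "COMDAT",
    0x02000000: "Discardable",
    0x20000000: "Execute",
    0x40000000: "Read",
    0x80000000: "Write",
}


def decode_section_flags(flags: int) -> list[str]:
    """Return human-readable names for the bits set in a section's flags."""
    names = [name for bit, name in FLAG_BITS.items() if flags & bit]
    # Bits 20-23 encode the alignment: field value n means 2**(n-1) bytes.
    align_field = (flags >> 20) & 0xF
    if align_field:
        names.append(f"{1 << (align_field - 1)} byte align")
    return names


def is_comdat(flags: int) -> bool:
    """True when the section is marked as a COMDAT section."""
    return bool(flags & 0x00001000)


# Flags taken from the dump: .text (#1), the first COMDAT .rdata (#5), plain .rdata (#9).
for value in (0x60500020, 0x40501040, 0x40500040):
    print(f"{value:08X}", decode_section_flags(value), "COMDAT" if is_comdat(value) else "")
```

Sections #5 through #8 differ from the ordinary `.rdata` section #9 only by the `0x1000` bit (and, for #6 to #8, the alignment), which is the bit the linker then expects a symbol for.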

This seems to be the result of optimization, as the debug build does not list these:

C:\"Program Files (x86)\Microsoft Visual Studio"\2017\BuildTools\VC\Tools\MSVC\14.13.26128\bin\HostX64\x64\dumpbin.exe /HEADERS .\selftest.obj
Microsoft (R) COFF/PE Dumper Version 14.13.26131.1
Copyright (C) Microsoft Corporation.  All rights reserved.


Dump of file .\selftest.obj

File Type: COFF OBJECT

FILE HEADER VALUES
            8664 machine (x64)
               8 number of sections
               0 time date stamp
           682D4 file pointer to symbol table
             3CC number of symbols
               0 size of optional header
               0 characteristics

SECTION HEADER #1
   .text name
       0 physical address
       0 virtual address
   1B1DA size of raw data
     154 file pointer to raw data (00000154 to 0001B32D)
   1B32E file pointer to relocation table
       0 file pointer to line numbers
     EE1 number of relocations
       0 number of line numbers
60500020 flags
         Code
         16 byte align
         Execute Read

SECTION HEADER #2
   .data name
       0 physical address
       0 virtual address
       0 size of raw data
   247F8 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
C0300040 flags
         Initialized Data
         4 byte align
         Read Write

SECTION HEADER #3
    .bss name
       0 physical address
       0 virtual address
       0 size of raw data
       0 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
C0300080 flags
         Uninitialized Data
         4 byte align
         Read Write

SECTION HEADER #4
  .xdata name
       0 physical address
       0 virtual address
    1E20 size of raw data
   247F8 file pointer to raw data (000247F8 to 00026617)
   26618 file pointer to relocation table
       0 file pointer to line numbers
      24 number of relocations
       0 number of line numbers
40300040 flags
         Initialized Data
         4 byte align
         Read Only

SECTION HEADER #5
  .rdata name
       0 physical address
       0 virtual address
    6CDF size of raw data
   26780 file pointer to raw data (00026780 to 0002D45E)
   2D45F file pointer to relocation table
       0 file pointer to line numbers
     3B6 number of relocations
       0 number of line numbers
40500040 flags
         Initialized Data
         16 byte align
         Read Only

SECTION HEADER #6
.debug$S name
       0 physical address
       0 virtual address
   200AC size of raw data
   2F97C file pointer to raw data (0002F97C to 0004FA27)
   4FA28 file pointer to relocation table
       0 file pointer to line numbers
    1058 number of relocations
       0 number of line numbers
42300040 flags
         Initialized Data
         Discardable
         4 byte align
         Read Only

SECTION HEADER #7
.debug$T name
       0 physical address
       0 virtual address
    6988 size of raw data
   59D98 file pointer to raw data (00059D98 to 0006071F)
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
42300040 flags
         Initialized Data
         Discardable
         4 byte align
         Read Only

SECTION HEADER #8
  .pdata name
       0 physical address
       0 virtual address
    2358 size of raw data
   60720 file pointer to raw data (00060720 to 00062A77)
   62A78 file pointer to relocation table
       0 file pointer to line numbers
     8D6 number of relocations
       0 number of line numbers
40300040 flags
         Initialized Data
         4 byte align
         Read Only

  Summary

           0 .bss
           0 .data
       200AC .debug$S
        6988 .debug$T
        2358 .pdata
        6CDF .rdata
       1B1DA .text
        1E20 .xdata

Gordon Tisher and others added 2 commits February 8, 2019 16:26
In LLVM 7 it seems that the output of `LLVMGetDefaultTargetTriple()` is no longer normalized.  On Win32 it now returns `x86_64-pc-win32`, whereupon the COFF object writer assumes that COMDAT sections don't need global names.  This fix makes the target triple `x86_64-pc-win32-msvc` for MSVC builds, and the COFF object writer correctly writes symbol names for the COMDAT sections.
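
The triple adjustment described in this commit message amounts to a small string rewrite. The sketch below is a Python illustration under stated assumptions, not the code from the commit: the function name `msvc_target_triple` and the `endswith("-win32")` check are hypothetical, and only the two triple strings come from the message above.

```python
def msvc_target_triple(default_triple: str, is_msvc_build: bool) -> str:
    """Append the `msvc` environment to an un-normalized Win32 triple.

    Assumes the default triple ends in "-win32" when the environment
    component is missing, as in the example from the commit message.
    """
    if is_msvc_build and default_triple.endswith("-win32"):
        return default_triple + "-msvc"
    return default_triple


print(msvc_target_triple("x86_64-pc-win32", is_msvc_build=True))
```

Triples that already name an environment, or builds that are not MSVC builds, pass through unchanged in this sketch.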
chalcolith (Member) commented

@jemc do you think this is ready to merge?

@jemc added the "changelog - added" label (Automatically add "Added" CHANGELOG entry on merge) and removed the "do not merge" label (This PR should not be merged at this time) Feb 10, 2019
@jemc changed the title from "[WIP] LLVM 7.0.1 compatibility" to "LLVM 7.0.1 compatibility" Feb 10, 2019
@jemc merged commit b404644 into master Feb 10, 2019
@jemc deleted the llvm701 branch February 10, 2019 15:57
ponylang-main added a commit that referenced this pull request Feb 10, 2019
winksaville added a commit to winksaville/ponyc that referenced this pull request Feb 21, 2019
The add lines are from PR ponylang#824 which were changed in ponylang#2976.
winksaville added a commit to winksaville/ponyc that referenced this pull request Feb 21, 2019
This fixes ponylang#3016 by adding lines back introduced in PR ponylang#824. This code
conditionally added llvm include directory only if it wasn't already
included. These lines were a small part of PR ponylang#2976 which adds LLVM
7.0.1 compatibility.