Running corefreqd causes kernel bug in Enable_ACPI_CPPC #531

Open
archiecarrot123 opened this issue Feb 22, 2025 · 13 comments

@archiecarrot123

Running corefreqd (CoreFreq version 2.0.0) as root causes a kernel bug and hangs.

This is the trace:

[  482.676678] vma ffff88828a0d9cc0 start 00007f17b6c11000 end 00007f17b6c18000 mm ffff888172db0000
               prot 8000000000000025 anon_vma 0000000000000000 vm_ops ffffffff8a24aac0
               pgoff 100 file ffff888340f4f300 private_data 0000000000000000
               flags: 0xb9(read|shared|mayread|maywrite|mayshare)
[  482.676710] ------------[ cut here ]------------
[  482.676711] kernel BUG at include/linux/mm.h:737!
[  482.676716] invalid opcode: 0000 [#1] SMP NOPTI
[  482.676719] CPU: 4 PID: 10493 Comm: corefreqd Tainted: G           OE      6.6.74-gentoo-x86_64 #1
[  482.676721] Hardware name: Micro-Star International Co., Ltd. MS-7C84/MAG X570 TOMAHAWK WIFI (MS-7C84), BIOS 1.B0 08/11/2022
[  482.676722] RIP: 0010:Enable_ACPI_CPPC+0x1d7da/0x3d6e0 [corefreqk]
[  482.676727] Code: 8b 33 4c 8b 43 18 48 83 c4 08 48 89 df 48 01 c2 5b 48 c1 ea 0c 5d e9 55 d2 b4 c8 e8 70 ba b3 c8 0f 0b 48 89 df e8 d6 b9 b3 c8 <0f> 0b b8 f5 ff ff ff e9 4c ff ff ff 48 8b b8 d0 59 00 00 be c0 0c
[  482.676729] RSP: 0018:ffffc90010c77d00 EFLAGS: 00010282
[  482.676731] RAX: 000000000000010d RBX: ffff88828a0d9cc0 RCX: 00000000fffeffff
[  482.676733] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 0000000000000003
[  482.676734] RBP: 00000000000000b9 R08: 0000000000000000 R09: ffffc90010c77b50
[  482.676735] R10: ffff88880e7fffe8 R11: 0000000000000003 R12: ffff888340f4f300
[  482.676736] R13: ffff888172db0000 R14: ffff888384b1f088 R15: ffff88828a0d9cc0
[  482.676738] FS:  00007f17b693c740(0000) GS:ffff88880eb00000(0000) knlGS:0000000000000000
[  482.676739] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  482.676741] CR2: 00007f17b6ace8d0 CR3: 00000003d16d0000 CR4: 0000000000b50ee0
[  482.676742] Call Trace:
[  482.676745]  <TASK>
[  482.676746]  ? die+0x43/0xb0
[  482.676750]  ? do_trap+0x116/0x150
[  482.676753]  ? Enable_ACPI_CPPC+0x1d7da/0x3d6e0 [corefreqk]
[  482.676757]  ? do_error_trap+0x87/0xc0
[  482.676759]  ? Enable_ACPI_CPPC+0x1d7da/0x3d6e0 [corefreqk]
[  482.676762]  ? exc_invalid_op+0x53/0x70
[  482.676766]  ? Enable_ACPI_CPPC+0x1d7da/0x3d6e0 [corefreqk]
[  482.676769]  ? asm_exc_invalid_op+0x16/0x20
[  482.676773]  ? Enable_ACPI_CPPC+0x1d7da/0x3d6e0 [corefreqk]
[  482.676776]  mmap_region+0x34e/0x9c0
[  482.676781]  do_mmap+0x31d/0x5e0
[  482.676783]  ? srso_alias_return_thunk+0x5/0xfbef5
[  482.676786]  vm_mmap_pgoff+0x124/0x210
[  482.676790]  ? handle_mm_fault+0x189/0x300
[  482.676793]  ksys_mmap_pgoff+0x1c0/0x220
[  482.676795]  do_syscall_64+0x35/0xb0
[  482.676798]  entry_SYSCALL_64_after_hwframe+0x78/0xe2
[  482.676801] RIP: 0033:0x7f17b6a40056
[  482.676804] Code: 00 00 00 90 f3 0f 1e fa 41 f7 c1 ff 0f 00 00 75 33 55 89 cd 53 48 89 fb 48 85 ff 74 47 41 89 ea 48 89 df b8 09 00 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 7e 00 00 00 5b 5d c3 66 66 2e 0f 1f 84 00
[  482.676805] RSP: 002b:00007ffddb7eeec8 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
[  482.676807] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f17b6a40056
[  482.676809] RDX: 0000000000000001 RSI: 0000000000007000 RDI: 0000000000000000
[  482.676810] RBP: 0000000000000001 R08: 0000000000000003 R09: 0000000000100000
[  482.676811] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000001000
[  482.676812] R13: 00007ffddb7ef0b8 R14: 00007f17b6c4f000 R15: 0000555ddd6b5be0
[  482.676815]  </TASK>
[  482.676816] Modules linked in: snd_seq_dummy snd_hrtimer fuse nft_masq nft_ct nft_reject_ipv4 nft_reject act_csum cls_u32 sch_htb nft_chain_nat nf_nat nf_tables nfnetlink bridge rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs sunrpc autofs4 ccm 8021q algif_aead garp mrp stp llc des_generic libdes algif_skcipher cmac md4 algif_hash af_alg ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt ipt_REJECT nf_reject_ipv4 xt_LOG nf_log_syslog xt_multiport xt_limit xt_addrtype xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter ip6_tables iptable_filter ip_tables bpfilter ext4 mbcache jbd2 hid_logitech_hidpp btusb btrtl btintel btbcm bluetooth hid_logitech_dj uas ftdi_sio ecdh_generic usb_storage usbserial ecc vfat fat intel_rapl_msr intel_rapl_common edac_mce_amd joydev kvm_amd amdgpu pkcs8_key_parser kvm iwlmvm dm_multipath irqbypass crct10dif_pclmul crc32_pclmul i2c_algo_bit dm_mod crc32c_intel drm_exec drm_suballoc_helper sr_mod mac80211 amdxcp mfd_core cdrom drm_buddy
[  482.676871]  ghash_clmulni_intel ledtrig_timer gpu_sched drm_ttm_helper sha512_ssse3 sha256_ssse3 libarc4 sha1_ssse3 wmi_bmof corefreqk(OE) ttm iwlwifi aesni_intel crypto_simd drm_display_helper cryptd cec rc_core cfg80211 video pcspkr backlight firewire_ohci xhci_pci ahci ccp firewire_core sp5100_tco xhci_hcd rfkill libahci crc_itu_t i2c_piix4 wmi efivarfs ipv6 crc_ccitt
[  482.676895] ---[ end trace 0000000000000000 ]---
[  482.799903] RIP: 0010:Enable_ACPI_CPPC+0x1d7da/0x3d6e0 [corefreqk]
[  482.799912] Code: 8b 33 4c 8b 43 18 48 83 c4 08 48 89 df 48 01 c2 5b 48 c1 ea 0c 5d e9 55 d2 b4 c8 e8 70 ba b3 c8 0f 0b 48 89 df e8 d6 b9 b3 c8 <0f> 0b b8 f5 ff ff ff e9 4c ff ff ff 48 8b b8 d0 59 00 00 be c0 0c
[  482.799914] RSP: 0018:ffffc90010c77d00 EFLAGS: 00010282
[  482.799917] RAX: 000000000000010d RBX: ffff88828a0d9cc0 RCX: 00000000fffeffff
[  482.799918] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 0000000000000003
[  482.799920] RBP: 00000000000000b9 R08: 0000000000000000 R09: ffffc90010c77b50
[  482.799921] R10: ffff88880e7fffe8 R11: 0000000000000003 R12: ffff888340f4f300
[  482.799922] R13: ffff888172db0000 R14: ffff888384b1f088 R15: ffff88828a0d9cc0
[  482.799924] FS:  00007f17b693c740(0000) GS:ffff88880eb00000(0000) knlGS:0000000000000000
[  482.799925] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  482.799927] CR2: 00007f17b6ace8d0 CR3: 00000003d16d0000 CR4: 0000000000b50ee0
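
The `flags: 0xb9(...)` line in the VMA dump above can be checked by hand. The sketch below decodes the low-order `VM_*` bits, using the bit values from mainline `include/linux/mm.h`, and computes how many 4 KiB pages the dumped range covers. The helper names `decode_vm_flags` and `vma_pages` are made up for illustration and are not part of CoreFreq or the kernel.

```python
# Low-order VM_* bits as defined in mainline include/linux/mm.h.
VM_FLAGS = {
    0x01: "read",
    0x02: "write",
    0x04: "exec",
    0x08: "shared",
    0x10: "mayread",
    0x20: "maywrite",
    0x40: "mayexec",
    0x80: "mayshare",
}

PAGE_SIZE = 4096  # x86-64 base page size


def decode_vm_flags(flags: int) -> list[str]:
    """Return the names of the low-order VM_* bits set in flags."""
    return [name for bit, name in sorted(VM_FLAGS.items()) if flags & bit]


def vma_pages(start: int, end: int) -> int:
    """Number of base pages spanned by the half-open range [start, end)."""
    return (end - start) // PAGE_SIZE


# Values copied from the "vma ... start ... end ..." and "flags:" lines above.
flags = decode_vm_flags(0xB9)
pages = vma_pages(0x00007F17B6C11000, 0x00007F17B6C18000)
print("|".join(flags), pages)
```

This prints `read|shared|mayread|maywrite|mayshare 7`, matching the kernel's own decoding: a shared, currently read-only mapping of seven pages. `maywrite` only means that write access could be granted later, not that it is granted now.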
@cyring
Owner

cyring commented Feb 23, 2025

@archiecarrot123 Hello,

Can you try the hotfix_optimizations branch, which attempts to fix some inline functions?

@archiecarrot123
Author

archiecarrot123 commented Feb 24, 2025

Using hotfix_optimizations, I get a kernel bug in CoreFreqK_mmap instead. I did try using the master branch but I didn't actually read the call trace.
It seems strange to me that it is calling asm_exc_invalid_op. Is this because the processor encounters an invalid opcode? To me, this would suggest a problem during compilation (generating an invalid opcode) or linking (jumping to a place with an invalid opcode).
edit: looking at the module in ghidra the execution of an invalid instruction appears to be intentional, right after calling dump_vma(). Still, I don't know why it does this, and there are three different conditional jumps there.
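
The observation that the invalid instruction is intentional can be cross-checked against the `Code:` line of the oops, where the byte at the faulting RIP is wrapped in `<...>`. On x86 the kernel's `BUG()` is built around the `ud2` instruction, encoded as `0f 0b`. The sketch below pulls out the marked bytes and compares them with that encoding; `faulting_bytes` is a hypothetical helper name, and the byte string is an excerpt of the `Code:` line from this report.

```python
import re

UD2 = (0x0F, 0x0B)  # encoding of the x86 UD2 instruction used by BUG()


def faulting_bytes(code_line: str, count: int = 2) -> tuple[int, ...]:
    """Return `count` bytes starting at the byte marked <xx> in a Code: line."""
    tokens = code_line.split()
    for i, tok in enumerate(tokens):
        m = re.fullmatch(r"<([0-9a-fA-F]{2})>", tok)
        if m:
            # The marked byte first, then the bytes that follow it.
            rest = [m.group(1)] + tokens[i + 1 : i + count]
            return tuple(int(b, 16) for b in rest)
    raise ValueError("no <xx> marker found in Code: line")


# Tail of the Code: line from the CoreFreqK_mmap oops.
code = "48 89 df e8 76 3c 3c cb <0f> 0b b8 f5 ff ff ff e9 4c ff ff ff"
is_bug_trap = faulting_bytes(code) == UD2
print(is_bug_trap)
```

A `True` result is consistent with a deliberate `BUG()` trap rather than a miscompiled or mislinked instruction, which fits the `kernel BUG at include/linux/mm.h:737!` line in the log.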

This is the call trace:

[  129.994048] vma ffff88813b44e8a0 start 00007faed3601000 end 00007faed3608000 mm ffff8881121b3180
               prot 8000000000000025 anon_vma 0000000000000000 vm_ops ffffffff8e24aac0
               pgoff 100 file ffff888269f99900 private_data 0000000000000000
               flags: 0xb9(read|shared|mayread|maywrite|mayshare)
[  129.994062] ------------[ cut here ]------------
[  129.994063] kernel BUG at include/linux/mm.h:737!
[  129.994070] invalid opcode: 0000 [#1] SMP NOPTI
[  129.994073] CPU: 4 PID: 7186 Comm: corefreqd Tainted: G           OE      6.6.74-gentoo-x86_64 #1
[  129.994076] Hardware name: Micro-Star International Co., Ltd. MS-7C84/MAG X570 TOMAHAWK WIFI (MS-7C84), BIOS 1.B0 08/11/2022
[  129.994077] RIP: 0010:CoreFreqK_mmap+0x2ba/0x330 [corefreqk]
[  129.994092] Code: 8b 33 4c 8b 43 18 48 83 c4 08 48 89 df 48 01 c2 5b 48 c1 ea 0c 5d e9 f5 54 3d cb e8 10 3d 3c cb 0f 0b 48 89 df e8 76 3c 3c cb <0f> 0b b8 f5 ff ff ff e9 4c ff ff ff 48 8b b8 d0 59 00 00 be c0 0c
[  129.994094] RSP: 0018:ffffc900101efd00 EFLAGS: 00010282
[  129.994097] RAX: 000000000000010d RBX: ffff88813b44e8a0 RCX: 00000000fffeffff
[  129.994099] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 0000000000000003
[  129.994100] RBP: 00000000000000b9 R08: 0000000000000000 R09: ffffc900101efb50
[  129.994102] R10: ffff88880e7fffe8 R11: 0000000000000003 R12: ffff888269f99900
[  129.994103] R13: ffff8881121b3180 R14: ffff888108c68c38 R15: ffff88813b44e8a0
[  129.994105] FS:  00007faed332c740(0000) GS:ffff88880eb00000(0000) knlGS:0000000000000000
[  129.994107] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  129.994109] CR2: 00007faed34be8d0 CR3: 00000002abbb0000 CR4: 0000000000b50ee0
[  129.994110] Call Trace:
[  129.994113]  <TASK>
[  129.994115]  ? die+0x43/0xb0
[  129.994119]  ? do_trap+0x116/0x150
[  129.994123]  ? CoreFreqK_mmap+0x2ba/0x330 [corefreqk]
[  129.994134]  ? do_error_trap+0x87/0xc0
[  129.994137]  ? CoreFreqK_mmap+0x2ba/0x330 [corefreqk]
[  129.994148]  ? exc_invalid_op+0x53/0x70
[  129.994152]  ? CoreFreqK_mmap+0x2ba/0x330 [corefreqk]
[  129.994163]  ? asm_exc_invalid_op+0x16/0x20
[  129.994167]  ? CoreFreqK_mmap+0x2ba/0x330 [corefreqk]
[  129.994178]  mmap_region+0x34e/0x9c0
[  129.994184]  do_mmap+0x31d/0x5e0
[  129.994187]  ? srso_alias_return_thunk+0x5/0xfbef5
[  129.994191]  vm_mmap_pgoff+0x124/0x210
[  129.994195]  ? handle_mm_fault+0x189/0x300
[  129.994199]  ksys_mmap_pgoff+0x1c0/0x220
[  129.994202]  do_syscall_64+0x35/0xb0
[  129.994205]  entry_SYSCALL_64_after_hwframe+0x78/0xe2
[  129.994208] RIP: 0033:0x7faed3430056
[  129.994210] Code: 00 00 00 90 f3 0f 1e fa 41 f7 c1 ff 0f 00 00 75 33 55 89 cd 53 48 89 fb 48 85 ff 74 47 41 89 ea 48 89 df b8 09 00 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 7e 00 00 00 5b 5d c3 66 66 2e 0f 1f 84 00
[  129.994212] RSP: 002b:00007ffe69515088 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
[  129.994215] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007faed3430056
[  129.994216] RDX: 0000000000000001 RSI: 0000000000007000 RDI: 0000000000000000
[  129.994218] RBP: 0000000000000001 R08: 0000000000000003 R09: 0000000000100000
[  129.994219] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000001000
[  129.994221] R13: 00007ffe69515278 R14: 00007faed363f000 R15: 000055ef8b8a1be0
[  129.994224]  </TASK>
[  129.994225] Modules linked in: corefreqk(OE) snd_seq_dummy snd_hrtimer fuse nft_masq nft_ct nft_reject_ipv4 nft_reject act_csum cls_u32 sch_htb nft_chain_nat nf_nat nf_tables nfnetlink bridge rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs sunrpc autofs4 ccm 8021q garp algif_aead mrp stp llc des_generic libdes algif_skcipher cmac md4 algif_hash af_alg ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt ipt_REJECT nf_reject_ipv4 xt_LOG nf_log_syslog xt_multiport xt_limit xt_addrtype xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter ip6_tables iptable_filter ip_tables bpfilter ext4 mbcache jbd2 hid_logitech_hidpp btusb btrtl btintel btbcm bluetooth hid_logitech_dj ftdi_sio ecdh_generic ecc usbserial uas usb_storage vfat fat intel_rapl_msr intel_rapl_common edac_mce_amd iwlmvm kvm_amd joydev amdgpu mac80211 i2c_algo_bit drm_exec drm_suballoc_helper amdxcp kvm libarc4 mfd_core pkcs8_key_parser drm_buddy irqbypass crct10dif_pclmul gpu_sched dm_multipath crc32_pclmul drm_ttm_helper
[  129.994299]  sr_mod crc32c_intel ttm cdrom dm_mod ghash_clmulni_intel iwlwifi drm_display_helper sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel crypto_simd ledtrig_timer wmi_bmof cec cryptd rc_core video cfg80211 backlight pcspkr sp5100_tco firewire_ohci xhci_pci firewire_core ccp ahci crc_itu_t xhci_hcd libahci i2c_piix4 rfkill wmi efivarfs ipv6 crc_ccitt
[  129.994329] ---[ end trace 0000000000000000 ]---
[  130.021346] pstore: backend (efi_pstore) writing error (-5)
[  130.021349] RIP: 0010:CoreFreqK_mmap+0x2ba/0x330 [corefreqk]
[  130.021363] Code: 8b 33 4c 8b 43 18 48 83 c4 08 48 89 df 48 01 c2 5b 48 c1 ea 0c 5d e9 f5 54 3d cb e8 10 3d 3c cb 0f 0b 48 89 df e8 76 3c 3c cb <0f> 0b b8 f5 ff ff ff e9 4c ff ff ff 48 8b b8 d0 59 00 00 be c0 0c
[  130.021365] RSP: 0018:ffffc900101efd00 EFLAGS: 00010282
[  130.021367] RAX: 000000000000010d RBX: ffff88813b44e8a0 RCX: 00000000fffeffff
[  130.021369] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 0000000000000003
[  130.021370] RBP: 00000000000000b9 R08: 0000000000000000 R09: ffffc900101efb50
[  130.021371] R10: ffff88880e7fffe8 R11: 0000000000000003 R12: ffff888269f99900
[  130.021372] R13: ffff8881121b3180 R14: ffff888108c68c38 R15: ffff88813b44e8a0
[  130.021374] FS:  00007faed332c740(0000) GS:ffff88880eb00000(0000) knlGS:0000000000000000
[  130.021375] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  130.021377] CR2: 00007faed34be8d0 CR3: 00000002abbb0000 CR4: 0000000000b50ee0
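
The user-space half of the trace also shows which `mmap` call triggered the crash. On x86-64 Linux the syscall number is in `ORIG_RAX` and the six arguments are in `RDI`, `RSI`, `RDX`, `R10`, `R8` and `R9`. The sketch below decodes the register values printed above; the function name `decode_mmap` is made up for illustration, and only the `PROT_*` and mapping-type constants needed here are listed.

```python
SYS_MMAP = 9          # x86-64 syscall number for mmap
PAGE_SHIFT = 12       # 4 KiB pages
PROT = {0x1: "PROT_READ", 0x2: "PROT_WRITE", 0x4: "PROT_EXEC"}
MAP_TYPE_MASK = 0x0F
MAP_TYPE = {0x1: "MAP_SHARED", 0x2: "MAP_PRIVATE", 0x3: "MAP_SHARED_VALIDATE"}


def decode_mmap(regs: dict[str, int]) -> dict[str, object]:
    """Interpret saved user registers as mmap(addr, length, prot, flags, fd, offset)."""
    if regs["orig_rax"] != SYS_MMAP:
        raise ValueError("not an mmap syscall")
    prot = [name for bit, name in sorted(PROT.items()) if regs["rdx"] & bit]
    return {
        "addr": regs["rdi"],
        "length": regs["rsi"],
        "prot": "|".join(prot) or "PROT_NONE",
        "map_type": MAP_TYPE.get(regs["r10"] & MAP_TYPE_MASK, "unknown"),
        "fd": regs["r8"],
        "offset": regs["r9"],
        "pgoff": regs["r9"] >> PAGE_SHIFT,  # page offset, as shown in the VMA dump
    }


# Register values copied from the user-space part of the trace above.
call = decode_mmap({
    "orig_rax": 0x9, "rdi": 0x0, "rsi": 0x7000,
    "rdx": 0x1, "r10": 0x1, "r8": 0x3, "r9": 0x100000,
})
print(call)
```

The decoded call is a `PROT_READ`, `MAP_SHARED` mapping of `0x7000` bytes on file descriptor 3 at byte offset `0x100000`, which corresponds to the `pgoff 100` (hexadecimal) and the seven-page range in the VMA dump.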

@cyring
Owner

cyring commented Feb 24, 2025

Thank you for your feedback.

So there is no way to check whether the changes in the branch help?

Have you ever run CoreFreq successfully on Gentoo?
I have been told that a package exists and works for this distribution.

I'm also reviewing the CoreFreqK_mmap function. The only issue I can think of is that the read-only or read-write page protection I'm setting (to guard against user-space privilege escalation) is not supported.

Perhaps your kernel is not compatible.
For reference, the config of my Arch Linux kernel is attached.
config.txt

It could also be a driver incompatibility, especially with drivers accessing the SMU. Here's my boot command line.

initrd=\EFI\Linux\amd-ucode.img initrd=\EFI\Linux\initramfs-linux.img root=/dev/disk/by-label/root rw quiet break=n kfence.sample_interval=0 kasan=off add_efi_memmap nmi_watchdog=0 selinux=0 loglevel=3 rd.systemd.show_status=auto rd.udev.log-priority=3 consoleblank=0 vt.color=0x03 modprobe.blacklist=pcspkr,joydev,mousedev,k10temp,sp5100_tco,acpi_cpufreq,eeepc_wmi,mxm_wmi,wmi_bmof,asus_wmi,wmi,intel_rapl_msr,intel_rapl_common,rapl amd_pstate.shared_mem=0 amd_pstate=disable idle=halt cpu0_hotplug audit=0 nowatchdog watchdog_thresh=0 tsc=nowatchdog acpi_no_watchdog watchdog.handle_boot_enabled=0 initcall_blacklist=watchdog_init mitigations=off nokaslr sysrq_always_enabled msr.allow_writes=on amdgpu.ppfeaturemask=0xffffffff retbleed=off spectre_v2=off kvm_amd.sev=1 kvm_amd.sev_es=1 acpi_osi=Linux

However, I don't get a crash when the command line is nearly empty either.

Looking at this trace

[  129.994076] Hardware name: Micro-Star International Co., Ltd. MS-7C84/MAG X570 TOMAHAWK WIFI (MS-7C84), BIOS 1.B0 08/11/2022

I see an X570 chipset, the same as mine (ASUS Crosshair VIII). My processor is a Ryzen 9 3950X; what is yours?

@cyring
Owner

cyring commented Feb 24, 2025

Meanwhile, looking at both of your logs:

[  129.994063] kernel BUG at include/linux/mm.h:737!

This leads to the function vma_assert_write_locked,
and I believe your kernel build is incompatible with the page protection (see my previous post).

Do you have CONFIG_PER_VMA_LOCK=y?

EDIT: Also, let me know whether adding ibt=off to the command line helps.

@cyring cyring removed the bugfix label Feb 24, 2025
@archiecarrot123
Author

I just checked and CONFIG_PER_VMA_LOCK=y is set on my kernel.
config.gz

Also, my processor is a Ryzen 7 5800X.

It does seem very strange that it crashes on asserting that the VMA is write protected.

@cyring
Owner

cyring commented Feb 24, 2025

I just checked and CONFIG_PER_VMA_LOCK=y is set on my kernel. config.gz

Also, my processor is a Ryzen 7 5800X.

It does seem very strange that it crashes on asserting that the VMA is write protected.

Some time ago, the 5800X was tested OK.

To determine whether this is a Gentoo or kernel issue, would you mind booting the CoreFreq Live CD?

@archiecarrot123
Author

I tried booting the live CD this morning, and CoreFreq worked fine, so this is presumably an issue with my kernel.

CoreFreq worked previously, but at some point stopped working. I noticed this bug first with an older version of CoreFreq installed from an ebuild, then made the report when it persisted for a CoreFreq 2.0.0 ebuild.

I am going to test setting ANON_VMA_NAME=y as this is set in your config but not in mine (I grepped for VMA in the output of diffconfig to find it).
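
The comparison described above (filtering a config diff for `VMA`) can be sketched as follows. This is a minimal stand-in for `diffconfig | grep VMA`, not a replacement for it. The two config fragments are made-up samples shaped like the situation in this thread, not the real attached files, and `parse_config` and `diff_configs` are hypothetical helper names.

```python
import re


def parse_config(text: str) -> dict[str, str]:
    """Parse kernel .config text into {option: value}; unset options map to 'n'."""
    opts: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        m = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if m:
            opts[m.group(1)] = "n"
            continue
        m = re.fullmatch(r"(CONFIG_\w+)=(.*)", line)
        if m:
            opts[m.group(1)] = m.group(2)
    return opts


def diff_configs(a: dict[str, str], b: dict[str, str], needle: str = "") -> dict[str, tuple[str, str]]:
    """Return options containing `needle` whose values differ between a and b.

    An option missing from one side is treated like 'is not set' ('n').
    """
    keys = sorted(k for k in a.keys() | b.keys() if needle in k)
    return {
        k: (a.get(k, "n"), b.get(k, "n"))
        for k in keys
        if a.get(k, "n") != b.get(k, "n")
    }


# Illustrative fragments only (assumed, not taken from the attachments).
failing = parse_config("CONFIG_PER_VMA_LOCK=y\n# CONFIG_ANON_VMA_NAME is not set\n")
working = parse_config("CONFIG_PER_VMA_LOCK=y\nCONFIG_ANON_VMA_NAME=y\n")
delta = diff_configs(failing, working, needle="VMA")
print(delta)
```

With real files, the text could be read from an extracted `/proc/config.gz` on each machine. Running the same comparison with `needle="DEBUG_VM"` might also be worthwhile: in mainline, `VM_BUG_ON_VMA()` checks are compiled in only when `CONFIG_DEBUG_VM` is enabled. That option is not mentioned anywhere in this thread, so treat it as an untested guess.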

@cyring
Owner

cyring commented Feb 26, 2025

I am going to test setting ANON_VMA_NAME=y as this is set in your config but not in mine (I grepped for VMA in the output of diffconfig to find it).

Thank you for your help.
I'm definitely interested in the results of ANON_VMA_NAME=y, which would then be added to the prerequisites listed in the Readme.

corefreqk.c has various conditional statements that test whether optional CONFIG_ directives are present in the current kernel source code.
However, there may remain other cases where implicit CONFIG_ options are somehow not included...


EDIT: Meanwhile, I'm also testing CONFIG_ANON_VMA_NAME against the latest kernel source code, built and installed as below:

make menuconfig ## to disable CONFIG_ANON_VMA_NAME
make -j32 pacman-pkg
pacman -U linux-upstream-6.14.0_rc4+-66-x86_64.pkg.tar.zst linux-upstream-headers-6.14.0_rc4+-66-x86_64.pkg.tar.zst
  • Now rebooting on that kernel
uname -a
Linux RYZEN 6.14.0-rc4+ #66 SMP PREEMPT_DYNAMIC Wed Feb 26 10:41:00 CET 2025 x86_64 GNU/Linux

zgrep ANON_VMA /proc/config.gz 
# CONFIG_ANON_VMA_NAME is not set
  • And loading CoreFreq
cd src/CoreFreq
make clean
make -j
insmod build/corefreqk.ko Register_ClockSource=1 Register_Governor=1 Register_CPU_Idle=1 Register_CPU_Freq=1 Override_SubCstate="1,1,1,1,1,1,0,0" C3U_Enable=1 C2U_Enable=1 ServiceProcessor=12; \
echo "corefreq_tsc" > /sys/devices/system/clocksource/clocksource0/current_clocksource; \
./build/corefreqd -d; \
rmmod corefreqk
CoreFreq Daemon 2.0.2  Copyright (C) 2015-2025 CYRIL COURTIAT

  Processor [AMD Ryzen 9 3950X 16-Core Processor]
  Architecture [Zen2/Matisse] 32/32 CPU Online.
  SleepInterval(1000), SysGate(2000), 2335 tasks

    CPU #000 @ 3500.11 MHz
    CPU #001 @ 3500.11 MHz
    Thread [7ffff74206c0] Init CYCLE 000
    CPU #002 @ 3500.11 MHz
    Thread [7ffff6c1f6c0] Init CYCLE 001
    CPU #003 @ 3500.14 MHz
    CPU #004 @ 3500.11 MHz
    CPU #005 @ 3500.11 MHz
    CPU #006 @ 3500.11 MHz
    CPU #007 @ 3500.11 MHz
    Thread [7ffff74206c0] Init CHILD 000
    Thread [7ffff641e6c0] Init CHILD 002
    Thread [7ffff5c1d6c0] Init CHILD 003
    Thread [7ffff541c6c0] Init CHILD 004
    Thread [7ffff4c1b6c0] Init CHILD 005
    CPU #008 @ 3500.14 MHz
    Thread [7ffff5c1d6c0] Init CYCLE 003
    Thread [7ffff1c156c0] Init CHILD 011
    Thread [7ffff6c1f6c0] Init CHILD 001
    CPU #009 @ 3500.11 MHz
    Thread [7ffff3c196c0] Init CHILD 007
    Thread [7ffff24166c0] Init CHILD 010
    Thread [7ffff34186c0] Init CHILD 008
    Thread [7ffff2c176c0] Init CHILD 009
    CPU #010 @ 3500.11 MHz
    Thread [7ffff14146c0] Init CHILD 012
    Thread [7fffeffff6c0] Init CYCLE 006
    Thread [7ffff541c6c0] Init CYCLE 004
    Thread [7fffeeffd6c0] Init CYCLE 008
    Thread [7ffff441a6c0] Init CHILD 006
    Thread [7ffff4c1b6c0] Init CYCLE 005
    CPU #011 @ 3500.11 MHz
    Thread [7ffff641e6c0] Init CYCLE 002
    Thread [7fffef7fe6c0] Init CYCLE 007
    Thread [7ffff0c136c0] Init CHILD 013
    CPU #012 @ 3500.14 MHz
    Thread [7fffdbfff6c0] Init CHILD 014
    Thread [7fffee7fc6c0] Init CYCLE 009
    CPU #013 @ 3500.11 MHz
    Thread [7fffdaffd6c0] Init CHILD 016
    Thread [7fffedffb6c0] Init CYCLE 010
    Thread [7fffdb7fe6c0] Init CHILD 015
    Thread [7fffed7fa6c0] Init CYCLE 011
    CPU #014 @ 3500.14 MHz
    CPU #015 @ 3500.11 MHz
    CPU #016 @ 3500.11 MHz
    CPU #017 @ 3500.11 MHz
    CPU #018 @ 3500.11 MHz
    CPU #019 @ 3500.18 MHz
    CPU #020 @ 3500.11 MHz
    CPU #021 @ 3500.11 MHz
    CPU #022 @ 3500.11 MHz
    CPU #023 @ 3500.11 MHz
    CPU #024 @ 3500.11 MHz
    CPU #025 @ 3500.11 MHz
    Thread [7fffda7fc6c0] Init CHILD 017
    CPU #026 @ 3500.11 MHz
    Thread [7fffd9ffb6c0] Init CHILD 018
    Thread [7fffcbfff6c0] Init CYCLE 013
    Thread [7fffd97fa6c0] Init CHILD 019
    Thread [7fffcb7fe6c0] Init CYCLE 014
    Thread [7fffd8ff96c0] Init CHILD 020
    CPU #027 @ 3500.11 MHz
    Thread [7fffa7fff6c0] Init CHILD 021
    CPU #028 @ 3500.11 MHz
    Thread [7fffca7fc6c0] Init CYCLE 016
    Thread [7fffc9ffb6c0] Init CYCLE 017
    Thread [7fffa77fe6c0] Init CHILD 022
    Thread [7fffecff96c0] Init CYCLE 012
    Thread [7fffa6ffd6c0] Init CHILD 023
    Thread [7fffc97fa6c0] Init CYCLE 018
    Thread [7fffcaffd6c0] Init CYCLE 015
    CPU #029 @ 3500.11 MHz
    Thread [7fffa67fc6c0] Init CHILD 024
    CPU #030 @ 3500.14 MHz
    Thread [7fffc8ff96c0] Init CYCLE 019
    Thread [7fffa5ffb6c0] Init CHILD 025
    CPU #031 @ 3500.11 MHz
    Thread [7fffa57fa6c0] Init CHILD 026
    Thread [7fffb8ff96c0] Init CYCLE 026
    Thread [7fffb37fe6c0] Init CYCLE 028
    Thread [7fffa4ff96c0] Init CHILD 027
    Thread [7fffa47f86c0] Init CHILD 028
    Thread [7fffbbfff6c0] Init CYCLE 020
    Thread [7fffb2ffd6c0] Init CYCLE 029
    Thread [7fffb3fff6c0] Init CYCLE 027
    Thread [7fffa3ff76c0] Init CHILD 029
    Thread [7fffbb7fe6c0] Init CYCLE 021
    Thread [7fffa37f66c0] Init CHILD 030
    Thread [7fffb27fc6c0] Init CYCLE 030
    Thread [7fffba7fc6c0] Init CYCLE 023
    Thread [7fffa2ff56c0] Init CHILD 031
    Thread [7fffbaffd6c0] Init CYCLE 022
    Thread [7fffb1ffb6c0] Init CYCLE 031
    Thread [7fffb9ffb6c0] Init CYCLE 024
    Thread [7fffb97fa6c0] Init CYCLE 025
	NTFY || ....
	RING[1](c60d,0)(0:0,0:a87)
  • The UI then starts with no issue

@archiecarrot123
Author

archiecarrot123 commented Feb 26, 2025

Setting CONFIG_ANON_VMA_NAME=y did not fix the crash, and produces the same call trace as before (with some different offsets).

Any other ideas, apart from using your (or some other known good) config and performing a bisection?

@cyring
Owner

cyring commented Feb 26, 2025

Setting CONFIG_ANON_VMA_NAME=y did not fix the crash, and produces the same call trace as before (with some different offsets).

Any other ideas, apart from using your (or some other known good) config and performing a bisection?

No other ideas yet. I tried to build against kernel 6.6, presuming some incompatibility, but that version is now too old to integrate nicely with Arch Linux.
I also reviewed the kernel source code again for the prerequisites of remap_pfn_range with read-only or read-write VMA protection.
I don't see what is so special about your Gentoo setup that makes it stop working.

@cyring
Owner

cyring commented Mar 2, 2025

To reproduce the bug, is there a Gentoo live image with a similar environment (kernel, compiler)?

@cyring
Owner

cyring commented Mar 17, 2025

looking at the module in ghidra the execution of an invalid instruction

Hello,
What is ghidra?

@cyring
Owner

cyring commented Mar 20, 2025

Setting CONFIG_ANON_VMA_NAME=y did not fix the crash, and produces the same call trace as before (with some different offsets).

Any other ideas, apart from using your (or some other known good) config and performing a bisection?

asm_exc_invalid_op has been reported to be triggered on some kernel versions. Can you upgrade?
