Hi,
I'm wondering if anyone here would be interested in helping to debug a segfault that's occurring with this fork on the Power9 architecture. I can't prove that this is LuaJIT's fault, but I'm tearing my hair out trying to figure out what's going wrong. The backtrace for the crash looks like:
Program received signal SIGSEGV, Segmentation fault.
0x0000000012f73600 in lj_BC_CALLT ()
Missing separate debuginfos, use: debuginfo-install libgcc-4.8.5-28.el7.ppc64le libstdc++-4.8.5-28.el7.ppc64le zlib-1.2.7-17.el7.ppc64le
(gdb) where
#0 0x0000000012f73600 in lj_BC_CALLT ()
#1 0x0000000012f1b2b4 in lua_pcall (L=0x200000070378, nargs=10, nresults=-1, errfunc=1) at lj_api.c:1129
#2 0x00000000103cdd48 in docall (L=0x200000070378, narg=10, clear=0) at src/main.cpp:332
#3 0x00000000103cceb0 in main (argc=13, argv=0x7fffffffd3c8) at src/main.cpp:109
(Shown in a debugger for clarity, though of course the crash happens without the debugger too.)
The program below is fully minimized, i.e. removing any part of the program causes it to stop reproducing:
https://github.com/stanfordhpccenter/soleil-x/blob/minimize-ppc64-crash/src/dom.rg
Furthermore, introducing print statements can cause the crash to move or to disappear entirely.
Based on these symptoms, it sure seems like there has to be some sort of memory corruption going on, but because I can't printf-debug, I'm not even sure where the crash is occurring! (Aside from the backtrace above that seems to indicate it's going somewhere into Lua code.)
You'll note that there are two other languages thrown in the mix here: Terra and Regent. Unfortunately I can't remove these, so I just have to debug around them.
I'd be happy to respond with complete instructions for reproducing, if that would be helpful.
Thanks in advance for any help or advice.