Initially, CI ran into a segfault in https://github.com/maflcko/bitcoin-core-nightly/actions/runs/19882806964/job/56984144564#step:6:4040:
3/152 Test #3: mptest ...............................***Failed 0.05 sec
[ TEST ] test.cpp:117: Call FooInterface methods
[ PASS ] test.cpp:117: Call FooInterface methods (18106 μs)
[ TEST ] test.cpp:209: Call IPC method after client connection is closed
[ PASS ] test.cpp:209: Call IPC method after client connection is closed (744 μs)
[ TEST ] test.cpp:226: Calling IPC method after server connection is closed
[ PASS ] test.cpp:226: Calling IPC method after server connection is closed (819 μs)
[ TEST ] test.cpp:243: Calling IPC method and disconnecting during the call
[ PASS ] test.cpp:243: Calling IPC method and disconnecting during the call (793 μs)
[ TEST ] test.cpp:263: Calling IPC method, disconnecting and blocking during the call
[ PASS ] test.cpp:263: Calling IPC method, disconnecting and blocking during the call (1329 μs)
[ TEST ] test.cpp:312: Make simultaneous IPC calls to trigger 'thread busy' error
*** Received signal #11: Segmentation fault
stack: 588ddc67 587839bb 5877e6ef 588667a8 58858975 58861e3d 58784940 58780828 5868c5fb 5868c0c8 f5b62d20 f57b0136 f5844c05
When trying to reproduce it locally, I instead saw mptest hang when it was run in a loop while the CPUs on the machine were kept busy via other means. To reproduce in a fresh podman run -it --rm --platform=linux alpine:3.23 container:
apk update && apk add rsync screen python3 git bash vim && git clone --depth=1 https://github.com/bitcoin/bitcoin ./b-c-ci && cd ./b-c-ci
RUN_UNIT_TESTS=false RUN_FUNCTIONAL_TESTS=false CCACHE_DIR=/ccache_dir CCACHE_MAXSIZE=5500M USER=dummy_user DANGER_RUN_CI_ON_HOST="1" MAKEJOBS="-j$(nproc)" FILE_ENV="./ci/test/00_setup_env_native_alpine_musl.sh" ./ci/test_run_all.sh
while /ci_container_base/ci/scratch/build-*/src/ipc/libmultiprocess/test/mptest ; do true ; done
After some time it just hangs forever:
...
6 test(s) passed
[ TEST ] test.cpp:117: Call FooInterface methods
[ PASS ] test.cpp:117: Call FooInterface methods (83659 μs)
[ TEST ] test.cpp:209: Call IPC method after client connection is closed
[ PASS ] test.cpp:209: Call IPC method after client connection is closed (6884 μs)
[ TEST ] test.cpp:226: Calling IPC method after server connection is closed
[ PASS ] test.cpp:226: Calling IPC method after server connection is closed (9762 μs)
[ TEST ] test.cpp:243: Calling IPC method and disconnecting during the call
[ PASS ] test.cpp:243: Calling IPC method and disconnecting during the call (8612 μs)
[ TEST ] test.cpp:263: Calling IPC method, disconnecting and blocking during the call
[ PASS ] test.cpp:263: Calling IPC method, disconnecting and blocking during the call (13144 μs)
[ TEST ] test.cpp:312: Make simultaneous IPC calls to trigger 'thread busy' error
mp/proxy.cpp:45: error: Uncaught exception in daemonized task.; exception = (unknown):-1: failed: std::exception: std::future_error: Promise already satisfied
[ PASS ] test.cpp:312: Make simultaneous IPC calls to trigger 'thread busy' error (16464 μs)
6 test(s) passed
[ TEST ] test.cpp:117: Call FooInterface methods
[ PASS ] test.cpp:117: Call FooInterface methods (53144 μs)
[ TEST ] test.cpp:209: Call IPC method after client connection is closed
[ PASS ] test.cpp:209: Call IPC method after client connection is closed (7757 μs)
[ TEST ] test.cpp:226: Calling IPC method after server connection is closed
[ PASS ] test.cpp:226: Calling IPC method after server connection is closed (7552 μs)
[ TEST ] test.cpp:243: Calling IPC method and disconnecting during the call
[ PASS ] test.cpp:243: Calling IPC method and disconnecting during the call (7087 μs)
[ TEST ] test.cpp:263: Calling IPC method, disconnecting and blocking during the call
[ PASS ] test.cpp:263: Calling IPC method, disconnecting and blocking during the call (11442 μs)
[ TEST ] test.cpp:312: Make simultaneous IPC calls to trigger 'thread busy' error
[ PASS ] test.cpp:312: Make simultaneous IPC calls to trigger 'thread busy' error (11951 μs)
6 test(s) passed
[ TEST ] test.cpp:117: Call FooInterface methods
[ PASS ] test.cpp:117: Call FooInterface methods (57813 μs)
[ TEST ] test.cpp:209: Call IPC method after client connection is closed
[ PASS ] test.cpp:209: Call IPC method after client connection is closed (6301 μs)
[ TEST ] test.cpp:226: Calling IPC method after server connection is closed
[ PASS ] test.cpp:226: Calling IPC method after server connection is closed (7153 μs)
[ TEST ] test.cpp:243: Calling IPC method and disconnecting during the call
[ PASS ] test.cpp:243: Calling IPC method and disconnecting during the call (8374 μs)
[ TEST ] test.cpp:263: Calling IPC method, disconnecting and blocking during the call
[ PASS ] test.cpp:263: Calling IPC method, disconnecting and blocking during the call (11935 μs)
[ TEST ] test.cpp:312: Make simultaneous IPC calls to trigger 'thread busy' error
[ PASS ] test.cpp:312: Make simultaneous IPC calls to trigger 'thread busy' error (15253 μs)
6 test(s) passed
[ TEST ] test.cpp:117: Call FooInterface methods
[ PASS ] test.cpp:117: Call FooInterface methods (98583 μs)
[ TEST ] test.cpp:209: Call IPC method after client connection is closed
[ PASS ] test.cpp:209: Call IPC method after client connection is closed (6840 μs)
[ TEST ] test.cpp:226: Calling IPC method after server connection is closed
[ PASS ] test.cpp:226: Calling IPC method after server connection is closed (6811 μs)
[ TEST ] test.cpp:243: Calling IPC method and disconnecting during the call
[ PASS ] test.cpp:243: Calling IPC method and disconnecting during the call (6656 μs)
[ TEST ] test.cpp:263: Calling IPC method, disconnecting and blocking during the call
[ PASS ] test.cpp:263: Calling IPC method, disconnecting and blocking during the call (12740 μs)
[ TEST ] test.cpp:312: Make simultaneous IPC calls to trigger 'thread busy' error
[ PASS ] test.cpp:312: Make simultaneous IPC calls to trigger 'thread busy' error (14853 μs)
6 test(s) passed
[ TEST ] test.cpp:117: Call FooInterface methods
[ PASS ] test.cpp:117: Call FooInterface methods (45695 μs)
[ TEST ] test.cpp:209: Call IPC method after client connection is closed
[ PASS ] test.cpp:209: Call IPC method after client connection is closed (7465 μs)
[ TEST ] test.cpp:226: Calling IPC method after server connection is closed
[ PASS ] test.cpp:226: Calling IPC method after server connection is closed (5967 μs)
[ TEST ] test.cpp:243: Calling IPC method and disconnecting during the call
[ PASS ] test.cpp:243: Calling IPC method and disconnecting during the call (7727 μs)
[ TEST ] test.cpp:263: Calling IPC method, disconnecting and blocking during the call
[ PASS ] test.cpp:263: Calling IPC method, disconnecting and blocking during the call (9699 μs)
[ TEST ] test.cpp:312: Make simultaneous IPC calls to trigger 'thread busy' error
... (hang)
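One of the passing iterations in the log above prints std::future_error: Promise already satisfied. As background, that is the error the C++ standard library raises when a std::promise is fulfilled a second time. The following is a minimal, standalone sketch of that behaviour only; it is not taken from libmultiprocess, the function name second_set_value_error is made up for illustration, and it makes no claim about which promise in mptest is affected or whether this is related to the hang.

```cpp
#include <future>
#include <optional>
#include <system_error>

// Hypothetical helper (not part of libmultiprocess): fulfil the same
// std::promise twice and return the error code raised by the second attempt.
// Returns an empty optional if, unexpectedly, nothing was thrown.
std::optional<std::error_code> second_set_value_error()
{
    std::promise<void> promise;
    std::future<void> future = promise.get_future();

    promise.set_value(); // First call stores the result; the future becomes ready.
    try {
        promise.set_value(); // Second call finds the shared state already satisfied.
    } catch (const std::future_error& e) {
        return e.code(); // Expected: std::future_errc::promise_already_satisfied
    }
    return std::nullopt;
}
```

Comparing the returned code against std::future_errc::promise_already_satisfied distinguishes this case from the other future errors (broken promise, no state, future already retrieved).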
The backtrace of the hung process looks like this:
(gdb) thread apply all bt
Thread 3 (LWP 106816 "mptest"):
#0 __cp_end () at src/thread/x86_64/syscall_cp.s:29
#1 0x00007f21f68bb868 in __syscall_cp_c (nr=281, u=<optimized out>, v=<optimized out>, w=<optimized out>, x=<optimized out>, y=0, z=8) at src/thread/pthread_cancel.c:33
#2 0x00007f21f689896c in epoll_pwait (fd=3, ev=0x7f21f6449ae0, cnt=16, to=-1, sigs=0x0) at src/linux/epoll.c:28
#3 0x00005638a3067426 in kj::UnixEventPort::wait (this=0x7f21f68fb4b8) at /usr/src/kj/io.h:284
#4 0x00005638a2fd0228 in kj::EventLoop::wait (this=this@entry=0x7f21f68fb590) at /usr/src/kj/async.c++:1850
#5 0x00005638a2fd594c in kj::_::waitImpl (node=..., result=..., waitScope=..., location=...) at /usr/src/kj/async.c++:1984
#6 0x00005638a2ed3eb6 in kj::Promise<unsigned long>::wait (this=0x7f21f644a070, waitScope=..., location=...) at /ci_container_base/depends/x86_64-pc-linux-musl/include/kj/async-inl.h:1357
#7 mp::EventLoop::loop (this=this@entry=0x7f21f644a7b0) at ./ipc/libmultiprocess/src/mp/proxy.cpp:240
#8 0x00005638a2dbcf58 in mp::test::TestSetup::TestSetup(bool)::{lambda()#1}::operator()() const (__closure=<optimized out>) at ./ipc/libmultiprocess/test/mp/test/test.cpp:99
#9 0x00005638a2dbd1a9 in std::__invoke_impl<void, mp::test::TestSetup::TestSetup(bool)::{lambda()#1}>(std::__invoke_other, mp::test::TestSetup::TestSetup(bool)::{lambda()#1}&&) (__f=...) at /usr/include/c++/15.2.0/bits/invoke.h:63
#10 std::__invoke<mp::test::TestSetup::TestSetup(bool)::{lambda()#1}>(mp::test::TestSetup::TestSetup(bool)::{lambda()#1}&&) (__fn=...) at /usr/include/c++/15.2.0/bits/invoke.h:98
#11 std::thread::_Invoker<std::tuple<mp::test::TestSetup::TestSetup(bool)::{lambda()#1}> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) (this=<optimized out>) at /usr/include/c++/15.2.0/bits/std_thread.h:303
#12 std::thread::_Invoker<std::tuple<mp::test::TestSetup::TestSetup(bool)::{lambda()#1}> >::operator()() (this=<optimized out>) at /usr/include/c++/15.2.0/bits/std_thread.h:310
#13 std::thread::_State_impl<std::thread::_Invoker<std::tuple<mp::test::TestSetup::TestSetup(bool)::{lambda()#1}> > >::_M_run() (this=<optimized out>) at /usr/include/c++/15.2.0/bits/std_thread.h:255
#14 0x00007f21f6695680 in ?? () from /usr/lib/libstdc++.so.6
#15 0x00007f21f68bc573 in start (p=<optimized out>) at src/thread/pthread_create.c:207
#16 0x00007f21f68bdec1 in __clone () at src/thread/x86_64/clone.s:22
Backtrace stopped: frame did not save the PC
Thread 2 (LWP 106817 "mptest"):
#0 __cp_end () at src/thread/x86_64/syscall_cp.s:29
#1 0x00007f21f68bb868 in __syscall_cp_c (nr=202, u=<optimized out>, v=<optimized out>, w=<optimized out>, x=<optimized out>, y=0, z=0) at src/thread/pthread_cancel.c:33
#2 0x00007f21f68bad24 in __futex4_cp (addr=0x7f21f63d27a4, op=0, val=2, to=<optimized out>) at src/thread/__timedwait.c:24
#3 __timedwait_cp (addr=addr@entry=0x7f21f63d27a4, val=val@entry=2, clk=clk@entry=0, at=at@entry=0x0, priv=128, priv@entry=1) at src/thread/__timedwait.c:52
#4 0x00007f21f68bbc05 in __pthread_cond_timedwait (c=0x7f21f659a708, m=0x7f21f659a6e0, ts=0x0) at src/thread/pthread_cond_timedwait.c:100
#5 0x00007f21f668b39d in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /usr/lib/libstdc++.so.6
#6 0x00005638a2ed3548 in std::condition_variable::wait<mp::Waiter::wait<mp::ProxyServer<mp::ThreadMap>::makeThread(mp::ThreadMap::Server::MakeThreadContext)::<lambda()>::<lambda()> >(mp::Lock&, mp::ProxyServer<mp::ThreadMap>::makeThread(mp::ThreadMap::Server::MakeThreadContext)::<lambda()>::<lambda()>)::<lambda()> > (this=0x7f21f659a708, __lock=..., __p=...) at /usr/include/c++/15.2.0/condition_variable:107
#7 mp::Waiter::wait<mp::ProxyServer<mp::ThreadMap>::makeThread(mp::ThreadMap::Server::MakeThreadContext)::<lambda()>::<lambda()> > (this=0x7f21f659a6e0, lock=..., pred=...) at ./ipc/libmultiprocess/include/mp/proxy-io.h:343
#8 operator() (__closure=<optimized out>) at ./ipc/libmultiprocess/src/mp/proxy.cpp:419
#9 0x00005638a2ed3733 in std::__invoke_impl<void, mp::ProxyServer<mp::ThreadMap>::makeThread(mp::ThreadMap::Server::MakeThreadContext)::<lambda()> > (__f=...) at /usr/include/c++/15.2.0/bits/invoke.h:63
#10 std::__invoke<mp::ProxyServer<mp::ThreadMap>::makeThread(mp::ThreadMap::Server::MakeThreadContext)::<lambda()> > (__fn=...) at /usr/include/c++/15.2.0/bits/invoke.h:98
#11 std::thread::_Invoker<std::tuple<mp::ProxyServer<mp::ThreadMap>::makeThread(mp::ThreadMap::Server::MakeThreadContext)::<lambda()> > >::_M_invoke<0> (this=<optimized out>) at /usr/include/c++/15.2.0/bits/std_thread.h:303
#12 std::thread::_Invoker<std::tuple<mp::ProxyServer<mp::ThreadMap>::makeThread(mp::ThreadMap::Server::MakeThreadContext)::<lambda()> > >::operator() (this=<optimized out>) at /usr/include/c++/15.2.0/bits/std_thread.h:310
#13 std::thread::_State_impl<std::thread::_Invoker<std::tuple<mp::ProxyServer<mp::ThreadMap>::makeThread(mp::ThreadMap::Server::MakeThreadContext)::<lambda()> > > >::_M_run(void) (this=<optimized out>) at /usr/include/c++/15.2.0/bits/std_thread.h:255
#14 0x00007f21f6695680 in ?? () from /usr/lib/libstdc++.so.6
#15 0x00007f21f68bc573 in start (p=<optimized out>) at src/thread/pthread_create.c:207
#16 0x00007f21f68bdec1 in __clone () at src/thread/x86_64/clone.s:22
Backtrace stopped: frame did not save the PC
Thread 1 (LWP 106803 "mptest"):
#0 __cp_end () at src/thread/x86_64/syscall_cp.s:29
#1 0x00007f21f68bb868 in __syscall_cp_c (nr=202, u=<optimized out>, v=<optimized out>, w=<optimized out>, x=<optimized out>, y=0, z=0) at src/thread/pthread_cancel.c:33
#2 0x00007f21f68bad24 in __futex4_cp (addr=0x7ffd73a69134, op=0, val=2, to=<optimized out>) at src/thread/__timedwait.c:24
#3 __timedwait_cp (addr=addr@entry=0x7ffd73a69134, val=val@entry=2, clk=clk@entry=0, at=at@entry=0x0, priv=128, priv@entry=1) at src/thread/__timedwait.c:52
#4 0x00007f21f68bbc05 in __pthread_cond_timedwait (c=0x7f21f659a608, m=0x7f21f659a5e0, ts=0x0) at src/thread/pthread_cond_timedwait.c:100
#5 0x00007f21f668b39d in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /usr/lib/libstdc++.so.6
#6 0x00005638a2da84ac in std::condition_variable::wait<mp::Waiter::wait<mp::test::TestCase312::run()::<lambda()> >(mp::Lock&, mp::test::TestCase312::run()::<lambda()>)::<lambda()> > (this=0x7f21f659a608, __lock=..., __p=...) at /usr/include/c++/15.2.0/condition_variable:107
#7 mp::Waiter::wait<mp::test::TestCase312::run()::<lambda()> > (this=0x7f21f659a5e0, lock=..., pred=...) at ./ipc/libmultiprocess/include/mp/proxy-io.h:343
#8 mp::test::TestCase312::run (this=<optimized out>) at ./ipc/libmultiprocess/test/mp/test/test.cpp:373
#9 0x00005638a2ede23b in kj::TestRunner::run()::{lambda()#1}::operator()() const (__closure=<optimized out>) at /usr/src/kj/test.c++:318
#10 kj::runCatchingExceptions<kj::TestRunner::run()::{lambda()#1}>(kj::TestRunner::run()::{lambda()#1}&&) (func=...) at /usr/src/kj/exception.h:371
#11 kj::TestRunner::run (this=0x7ffd73a6a040) at /usr/src/kj/test.c++:318
#12 0x00005638a2edf2e3 in kj::TestRunner::getMain()::{lambda(auto:1&, (auto:2&&)...)#7}::operator()<kj::TestRunner>(kj::TestRunner&) (__closure=<optimized out>, s=...) at /usr/src/kj/test.c++:217
#13 kj::_::BoundMethod<kj::TestRunner&, kj::TestRunner::getMain()::{lambda(auto:1&, (auto:2&&)...)#7}, kj::TestRunner::getMain()::{lambda(auto:1&, (auto:2&&)...)#8}>::operator()<>() (this=<optimized out>) at /usr/src/kj/function.h:263
#14 kj::Function<kj::MainBuilder::Validity ()>::Impl<kj::_::BoundMethod<kj::TestRunner&, kj::TestRunner::getMain()::{lambda(auto:1&, (auto:2&&)...)#7}, kj::TestRunner::getMain()::{lambda(auto:1&, (auto:2&&)...)#8}> >::operator()() (this=<optimized out>) at /usr/src/kj/function.h:142
#15 0x00005638a308cd48 in kj::Function<kj::MainBuilder::Validity()>::operator() (this=<optimized out>) at /usr/src/kj/function.h:119
#16 kj::MainBuilder::MainImpl::operator() (this=<optimized out>, programName=..., params=...) at /usr/src/kj/main.c++:623
#17 0x00005638a3091b95 in kj::Function<void(kj::StringPtr, kj::ArrayPtr<kj::StringPtr const>)>::Impl<kj::MainBuilder::MainImpl>::operator() (this=<optimized out>, params#0=..., params#1=...) at /usr/src/kj/function.h:142
#18 0x00005638a308d1cb in kj::Function<void(kj::StringPtr, kj::ArrayPtr<kj::StringPtr const>)>::operator() (this=0x7ffd73a6a050, params#0=..., params#1=...) at /usr/src/kj/function.h:119
#19 operator() (__closure=<optimized out>) at /usr/src/kj/main.c++:228
#20 kj::runCatchingExceptions<kj::runMainAndExit(ProcessContext&, MainFunc&&, int, char**)::<lambda()> > (func=...) at /usr/src/kj/exception.h:371
#21 kj::runMainAndExit (context=..., func=..., argc=<optimized out>, argc@entry=1, argv=argv@entry=0x7ffd73a6a0f8) at /usr/src/kj/main.c++:228
#22 0x00005638a2edc4c9 in main (argc=1, argv=0x7ffd73a6a0f8) at /usr/src/kj/test.c++:381
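In the backtrace above, thread 1 (the test body at test.cpp:373) and thread 2 (the thread created in makeThread) are both blocked in a predicate-based std::condition_variable::wait with no timeout, while thread 3 sits idle in the event loop. The sketch below shows that wait pattern in isolation, with a timeout added so that a predicate which never becomes true shows up as a failed wait instead of an indefinite hang. It is an illustration under assumptions, not the mp::Waiter implementation: TimedWaiter, signal, wait_for_ready and the two helper functions are hypothetical names, and the sketch does not identify the root cause.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a waiter object: a mutex, a condition variable,
// and the flag that the wait predicate checks.
struct TimedWaiter {
    std::mutex mutex;
    std::condition_variable cv;
    bool ready = false;

    // Set the flag while holding the mutex, then wake any waiting thread.
    void signal()
    {
        {
            std::lock_guard<std::mutex> lock(mutex);
            ready = true;
        }
        cv.notify_all();
    }

    // Wait until the flag is set or the timeout expires.
    // Returns true if the flag was set, false on timeout.
    bool wait_for_ready(std::chrono::milliseconds timeout)
    {
        std::unique_lock<std::mutex> lock(mutex);
        return cv.wait_for(lock, timeout, [this] { return ready; });
    }
};

// A wait that is signalled from another thread completes.
bool signalled_wait_completes()
{
    TimedWaiter waiter;
    std::thread notifier([&waiter] { waiter.signal(); });
    const bool ok = waiter.wait_for_ready(std::chrono::seconds(5));
    notifier.join();
    return ok;
}

// A wait whose predicate is never made true reports a timeout
// instead of blocking forever.
bool unsignalled_wait_times_out()
{
    TimedWaiter waiter;
    return !waiter.wait_for_ready(std::chrono::milliseconds(50));
}
```

Because the predicate is re-checked under the mutex, the signalled case succeeds even if signal() runs before the wait starts; only a predicate that is never satisfied leads to the timeout path.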