c++ 进程死锁排查
c++ 进程死锁排查
内存泄漏是比较常见的问题,下面记录一下我在遇到内存泄漏时的排查思路以及如何降低内存泄漏的风险。
排查示例
假如下面这样一个 c++ 程序出现了死锁
demo.cpp
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
#include <iostream>
#include <mutex>
#include <thread>
std::mutex mutex;
void func2() {
mutex.lock();
std::cout << "func2\n";
mutex.unlock();
}
void func1() {
mutex.lock();
std::cout << "func1\n";
func2();
mutex.unlock();
}
int main() {
std::thread([] { func1(); }).detach();
while (true) {
std::this_thread::sleep_for(std::chrono::seconds(1));
}
return 0;
}
程序输出 func1 打印信息之后,永远不回输出 func2,陷入了死锁; 这个时候可以使用 valgrind
来排查,排查方法如下:
- 编译代码
1
g++ -g demo.cpp -o demo
- 使用 valgrind 运行代码
1
valgrind --tool=helgrind ./demo
- 观察结果 线程进入死锁之后按下 crtl + c 终止进程,并查看输出信息
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
==8063== Helgrind, a thread error detector
==8063== Copyright (C) 2007-2017, and GNU GPL'd, by OpenWorks LLP et al.
==8063== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==8063== Command: ./demo
==8063==
func1
^C==8063==
==8063== Process terminating with default action of signal 2 (SIGINT)
==8063== at 0x4C0CADF: clock_nanosleep@@GLIBC_2.17 (clock_nanosleep.c:78)
==8063== by 0x4C19A26: nanosleep (nanosleep.c:25)
==8063== by 0x109AB6: void std::this_thread::sleep_for<long, std::ratio<1l, 1l> >(std::chrono::duration<long, std::ratio<1l, 1l> > const&) (this_thread_sleep.h:80)
==8063== by 0x1093FE: main (demo.cpp:23)
==8063== ---Thread-Announcement------------------------------------------
==8063==
==8063== Thread #2 was created
==8063== at 0x4C49A23: clone (clone.S:76)
==8063== by 0x4C49BA2: __clone_internal_fallback (clone-internal.c:64)
==8063== by 0x4C49BA2: __clone_internal (clone-internal.c:109)
==8063== by 0x4BBC54F: create_thread (pthread_create.c:297)
==8063== by 0x4BBD1A4: pthread_create@@GLIBC_2.34 (pthread_create.c:836)
==8063== by 0x4854975: ??? (in /usr/libexec/valgrind/vgpreload_helgrind-amd64-linux.so)
==8063== by 0x4960EB0: std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.33)
==8063== by 0x1094BC: std::thread::thread<main::{lambda()#1}, , void>(main::{lambda()#1}&&) (std_thread.h:164)
==8063== by 0x1093C0: main (demo.cpp:21)
==8063==
==8063== ----------------------------------------------------------------
==8063==
==8063== Thread #2: Exiting thread still holds 1 lock
==8063== at 0x4BB8F70: futex_wait (futex-internal.h:146)
==8063== by 0x4BB8F70: __lll_lock_wait (lowlevellock.c:49)
==8063== by 0x4BC0100: lll_mutex_lock_optimized (pthread_mutex_lock.c:48)
==8063== by 0x4BC0100: pthread_mutex_lock@@GLIBC_2.2.5 (pthread_mutex_lock.c:93)
==8063== by 0x4851274: ??? (in /usr/libexec/valgrind/vgpreload_helgrind-amd64-linux.so)
==8063== by 0x109814: __gthread_mutex_lock(pthread_mutex_t*) (gthr-default.h:749)
==8063== by 0x1098D1: std::mutex::lock() (std_mutex.h:113)
==8063== by 0x10930B: func2() (demo.cpp:8)
==8063== by 0x10936B: func1() (demo.cpp:16)
==8063== by 0x10938E: main::{lambda()#1}::operator()() const (demo.cpp:21)
==8063== by 0x1097C3: void std::__invoke_impl<void, main::{lambda()#1}>(std::__invoke_other, main::{lambda()#1}&&) (invoke.h:61)
==8063== by 0x109786: std::__invoke_result<main::{lambda()#1}>::type std::__invoke<main::{lambda()#1}>(main::{lambda()#1}&&) (invoke.h:96)
==8063== by 0x109733: void std::thread::_Invoker<std::tuple<main::{lambda()#1}> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) (std_thread.h:292)
==8063== by 0x109707: std::thread::_Invoker<std::tuple<main::{lambda()#1}> >::operator()() (std_thread.h:299)
==8063==
==8063==
==8063== Use --history-level=approx or =none to gain increased speed, at
==8063== the cost of reduced accuracy of conflicting-access information
==8063== For lists of detected and suppressed errors, rerun with: -s
==8063== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
从 Thread #2: Exiting thread still holds 1 lock
可以看出,我们创建的那个线程里出现了死锁。 并且从 by 0x10930B: func2() (demo.cpp:8), by 0x10936B: func1() (demo.cpp:16)
可以知道在 demo.cpp 的第 8 行和第 16 行都用到了锁,再去这里检查代码即可定位问题。
本文由作者按照 CC BY 4.0 进行授权