Windows Program Memory Leak Analysis with Windbg
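- Practice shows that when a program is complex and allocates and frees memory frequently, the files that UMDH compares become very large, and it is hard to spot the leak directly from them. UMDH also needs symbol files while it collects information, so it is poorly suited to running on a customer's machine.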
Sample code
This sample code calls a memory-leaking function in a loop:
#include ...
#include ...
#include ...
class TestClass
{
public:
    char m_str[100];
};
void MemoryLeakObj()
{
    TestClass * pObj = new TestClass;
    strcpy_s(pObj->m_str, 100, "Memory Leak Sample");
    std::cout << pObj->m_str << ...
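The listing above breaks off partway through, so here is a minimal, self-contained sketch of what such a leaking sample could look like. It is not the author's exact program: the header list, the `RunLeakLoop` driver, and its `iterations` parameter are assumptions made for illustration, and `std::snprintf` stands in for the MSVC-specific `strcpy_s` so the sketch builds with any compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <iostream>

class TestClass
{
public:
    char m_str[100];
};

// Allocates a TestClass and never deletes it, so every call leaks
// sizeof(TestClass) bytes (100 bytes, i.e. 0x64).
void MemoryLeakObj()
{
    TestClass* pObj = new TestClass;
    // std::snprintf replaces strcpy_s here (an assumption for portability).
    std::snprintf(pObj->m_str, sizeof(pObj->m_str), "%s", "Memory Leak Sample");
    std::cout << pObj->m_str << std::endl;
    // Deliberately missing: delete pObj;
}

// Hypothetical driver: calls the leaking function `iterations` times and
// returns how many bytes were requested and never freed.
std::size_t RunLeakLoop(int iterations)
{
    assert(iterations >= 0);
    for (int i = 0; i < iterations; ++i)
    {
        MemoryLeakObj();
    }
    return static_cast<std::size_t>(iterations) * sizeof(TestClass);
}
```

Calling `RunLeakLoop` from `main` with a large count, ideally with a pause between calls so there is time to attach a debugger, would produce the steadily growing number of 0x64-byte blocks that the rest of the article looks for.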
Background
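This section covers some basics about heaps. A process can have several heaps: malloc in the CRT allocates from a heap, and you can also create your own heap with the Windows API HeapCreate. In windbg, !heap -s lists all heaps; a memory leak is usually identified mainly by looking at the committed (Commit) memory.
0:008> !heap -s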
*****************************************************************************************************
                                              NT HEAP STATS BELOW
*****************************************************************************************************
NtGlobalFlag enables following debugging aids for new heaps:
    tail checking
    free checking
    validate parameters
LFH Key                   : 0x3f0f03d02e6012eb
Termination on corruption : ENABLED
          Heap     Flags   Reserv  Commit  Virt   Free  List   UCR  Virt  Lock  Fast
                            (k)     (k)    (k)     (k) length      blocks cont. heap
-------------------------------------------------------------------------------------
0000026349b50000 40000062    2040   1088   2040      2    26     2    1      0
00000263499d0000 40008060      64      4     64      2     1     1    0      0
0000026349b30000 40001062      60     20     60      2     2     1    0      0
000002634b440000 40001062    1080     88   1080      2     4     2    0      0
-------------------------------------------------------------------------------------
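On Windows, a heap is not necessarily a single contiguous region; it can be made up of several contiguous regions, each of which is called a Segment. Let's pick one heap and look at its segments. The output shows that this heap currently consists of two segments and lists the address range of each.
0:008> !heap 0000026349b50000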
Index Address Name Debugging options enabled
1: 26349b50000
Segment at 0000026349b50000 to 0000026349c4f000 (000ff000 bytes committed)
Segment at 000002634bef0000 to 000002634bfef000 (00011000 bytes committed)
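As a quick cross-check, the committed sizes of the two segments can be added up and compared with the Commit column that !heap -s reported for this heap (1088 KB). The sketch below does only that arithmetic with the numbers shown above; the constant and function names are invented for illustration, and the match is an observation about this particular output rather than a documented guarantee.

```cpp
#include <cstdint>

// Committed byte counts copied from the two "Segment at ..." lines above.
constexpr std::uint64_t kSegment0Committed = 0x000ff000;
constexpr std::uint64_t kSegment1Committed = 0x00011000;

// Converts a byte count to whole kilobytes, the unit used by !heap -s.
constexpr std::uint64_t BytesToKb(std::uint64_t bytes)
{
    return bytes / 1024;
}

// 1020 KB + 68 KB = 1088 KB, the Commit value listed for heap 0000026349b50000.
static_assert(BytesToKb(kSegment0Committed + kSegment1Committed) == 1088,
              "segment commit total should match the !heap -s Commit column");
```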
Use !heap -a to see the allocations inside each Segment. Every allocation we make occupies one Entry, which is sometimes also called a block.
0:008> !heap -a 26349b50000
Index Address Name Debugging options enabled
1: 26349b50000
Segment at 0000026349b50000 to 0000026349c4f000 (000ff000 bytes committed)
Segment at 000002634bef0000 to 000002634bfef000 (00011000 bytes committed)
Flags: 40000062
ForceFlags: 40000060
Granularity: 16 bytes
Segment Reserve: 00200000
Segment Commit: 00002000
DeCommit Block Thres: 00000100
DeCommit Total Thres: 00001000
Total Free Size: 0000009f
Max. Allocation Size: 00007ffffffdefff
Lock Variable at: 0000026349b502a0
Next TagIndex: 0000
Maximum TagIndex: 0000
Tag Entries: 00000000
PsuedoTag Entries: 00000000
Virtual Alloc List: 26349b50110
000002634ba79000: 00100000 [commited 101000, unused 1000] - busy (b)
Uncommitted ranges: 26349b500f0
2634bf01000: 000ee000 (974848 bytes)
FreeList[ 00 ] at 0000026349b50150: 000002634bf00a30 . 0000026349bd9fb0
0000026349bd9fa0: 00050 . 00020 [104] - free
0000026349bd4670: 00050 . 00020 [104] - free
0000026349bd8630: 000b0 . 00020 [104] - free
0000026349bd80c0: 00050 . 00020 [104] - free
0000026349bd60b0: 00060 . 00020 [104] - free
0000026349bd53f0: 000b0 . 00020 [104] - free
0000026349b5f4c0: 00060 . 00020 [104] - free
0000026349b5dea0: 00050 . 00020 [104] - free
0000026349b61860: 00090 . 00020 [104] - free
0000026349b57ae0: 00080 . 00020 [104] - free
0000026349b53990: 00080 . 00020 [104] - free
0000026349b6a800: 00050 . 00030 [104] - free
0000026349b629c0: 00050 . 00030 [104] - free
0000026349b5f610: 00070 . 00030 [104] - free
0000026349b60a90: 00070 . 00030 [104] - free
0000026349b62390: 00070 . 00030 [104] - free
0000026349b5f940: 000c0 . 00030 [104] - free
0000026349b668b0: 00070 . 00030 [104] - free
0000026349b65230: 00040 . 00030 [104] - free
0000026349b65ad0: 00040 . 00030 [104] - free
0000026349b57e70: 00080 . 00030 [104] - free
0000026349b57cb0: 00070 . 00030 [104] - free
0000026349b57930: 00050 . 00030 [104] - free
0000026349bd9c70: 000a0 . 00040 [104] - free
0000026349bd9ea0: 00040 . 00070 [104] - free
000002634bf00a20: 000a0 . 005a0 [104] - free
Segment00 at 49b50000:
Flags: 00000000
Base: 26349b50000
First Entry: 49b50720
Last Entry: 26349c4f000
Total Pages: 000000ff
Total UnCommit: 00000000
Largest UnCommit:00000000
UnCommitted Ranges: (1)
Heap entries for Segment00 in Heap 0000026349b50000
address: psize . size flags state (requested size)
0000026349b50000: 00000 . 00720 [101] - busy (71f)
0000026349b50720: 00720 . 00130 [107] - busy (12f), tail fill Internal
0000026349b50850: 00130 . 00130 [107] - busy (100), tail fill
.......
0000026349c4ede0: 000a0 . 000a0 [107] - busy (64), tail fill
0000026349c4ee80: 000a0 . 000a0 [107] - busy (64), tail fill
0000026349c4ef20: 000a0 . 000a0 [107] - busy (64), tail fill
0000026349c4efc0: 000a0 . 00040 [111] - busy (3d)
0000026349c4f000: 00000000 - uncommitted bytes.
Segment01 at 4bef0000:
Flags: 00000000
Base: 2634bef0000
First Entry: 4bef0070
Last Entry: 2634bfef000
Total Pages: 000000ff
Total UnCommit: 000000ee
Largest UnCommit:00000000
UnCommitted Ranges: (1)
Heap entries for Segment01 in Heap 0000026349b50000
address: psize . size flags state (requested size)
000002634bef0000: 00000 . 00070 [101] - busy (6f)
000002634bef0070: 00070 . 000a0 [107] - busy (64), tail fill
.......
000002634bf00700: 000a0 . 000a0 [107] - busy (64), tail fill
000002634bf00840: 000a0 . 000a0 [107] - busy (64), tail fill
000002634bf008e0: 000a0 . 000a0 [107] - busy (64), tail fill
000002634bf00980: 000a0 . 000a0 [107] - busy (64), tail fill
000002634bf00a20: 000a0 . 005a0 [104] free fill
000002634bf00fc0: 005a0 . 00040 [111] - busy (3d)
000002634bf01000: 000ee000 - uncommitted bytes.
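However, the address of an Entry is not the same as the address returned by malloc. For example, use !heap -x to look at one of the entries listed above: the Entry address differs from the User address (the address our malloc call actually received), because the heap manages each Entry through a _HEAP_ENTRY structure at the start of the Entry.
0:008> !heap -x 000002634bf00980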
Entry User Heap Segment Size PrevSize Unused Flags
-------------------------------------------------------------------------------------------------------------
000002634bf00980 000002634bf00990 0000026349b50000 000002634bef0000 a0 a0 3c busy extra fill
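The relationship between the columns of this !heap -x output can be checked with a little arithmetic. The sketch below uses only the values printed above; the names are invented for illustration, and treating Size minus Unused as the requested size is an assumption that happens to agree with the (64) shown for these entries in the !heap -a listing.

```cpp
#include <cstdint>

// Values copied from the !heap -x 000002634bf00980 output.
constexpr std::uint64_t kEntry  = 0x000002634bf00980; // Entry column
constexpr std::uint64_t kUser   = 0x000002634bf00990; // User column
constexpr std::uint64_t kSize   = 0xa0;               // Size column, in bytes
constexpr std::uint64_t kUnused = 0x3c;               // Unused column, in bytes

// Bytes between the start of the entry and the pointer handed to the caller.
constexpr std::uint64_t HeaderBytes(std::uint64_t entry, std::uint64_t user)
{
    return user - entry;
}

// Bytes the caller asked for, assuming Size - Unused is the requested size.
constexpr std::uint64_t RequestedBytes(std::uint64_t size, std::uint64_t unused)
{
    return size - unused;
}

// The user pointer sits 0x10 bytes past the entry in this output.
static_assert(HeaderBytes(kEntry, kUser) == 0x10, "header occupies 16 bytes here");
// 0xa0 - 0x3c = 0x64, i.e. the 100 bytes of a TestClass object.
static_assert(RequestedBytes(kSize, kUnused) == 0x64, "requested size is 100 bytes");
```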
So, suppose we know the address of the leaked memory: how do we find the call stack that allocated it? Before running the program, use gflags to enable recording of call stack information:
"C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\gflags" -i MemoryLeakAnalysisViaWindbg.exe +ust
Then run !heap -p -a to see the call stack for the leaked address. Next, let's look at how to analyze a memory leak.
Memory leak analysis with Windbg
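Step 1
As with the UMDH analysis, run the following command so that the call stack is recorded whenever MemoryLeakAnalysisViaWindbg.exe allocates memory on the heap:
"C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\gflags" -i MemoryLeakAnalysisViaWindbg.exe +ust
Step 2
Let the program run for a while and check the current heap usage, looking mainly at the Commit size. Then resume with the g command, let it run a while longer, and see which heap's Commit size grows fastest. Here that narrows it down to heap 000001471ba50000.
0:006> !heap -s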
************************************************************************************************************************
                                              NT HEAP STATS BELOW
************************************************************************************************************************
NtGlobalFlag enables following debugging aids for new heaps:
    stack back traces
LFH Key                   : 0xe82e55f3a47de176
Termination on corruption : ENABLED
          Heap     Flags   Reserv  Commit  Virt   Free  List   UCR  Virt  Lock  Fast
                            (k)     (k)    (k)     (k) length      blocks cont. heap
-------------------------------------------------------------------------------------
000001471ba50000 08000002    1220    820   1020     48    25     1    1      0   LFH
000001471a110000 08008000      64      4     64      2     1     1    0      0
000001471bd50000 08001002     260     36     60      7     2     1    0      0   LFH
000001471bd10000 08001002    1280    112   1080      4     3     2    0      0   LFH
-------------------------------------------------------------------------------------
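Comparing the Commit column by eye works for four heaps, but the same comparison can be scripted once the !heap -s output from two moments has been saved as text. The following is a rough sketch under that assumption; `ParseHeapSummary` and `FastestGrowingHeap` are hypothetical helper names, and the parser simply keeps any line whose first four fields look like "hex hex decimal decimal".

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Maps heap address -> committed size in KB, read from saved !heap -s text.
using CommitMap = std::map<std::uint64_t, std::uint64_t>;

CommitMap ParseHeapSummary(const std::string& text)
{
    CommitMap result;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line))
    {
        std::istringstream row(line);
        std::uint64_t heap = 0, flags = 0, reserveKb = 0, commitKb = 0;
        // Heap and Flags are hexadecimal; Reserv and Commit are decimal KB.
        // Banner, header, and separator lines fail to parse and are skipped.
        if (row >> std::hex >> heap >> flags >> std::dec >> reserveKb >> commitKb)
        {
            result[heap] = commitKb;
        }
    }
    return result;
}

// Returns the heap whose Commit grew the most between two snapshots,
// or 0 if no heap grew.
std::uint64_t FastestGrowingHeap(const CommitMap& before, const CommitMap& after)
{
    std::uint64_t bestHeap = 0;
    std::uint64_t bestGrowth = 0;
    for (const auto& kv : after)
    {
        const auto it = before.find(kv.first);
        const std::uint64_t old = (it == before.end()) ? 0 : it->second;
        if (kv.second > old && kv.second - old > bestGrowth)
        {
            bestGrowth = kv.second - old;
            bestHeap = kv.first;
        }
    }
    return bestHeap;
}
```

Feeding it the text captured before and after a g run (for example via .logopen) would return the address of the heap to investigate further.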
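Use the command !heap -stat [-h Handle [-grp GroupBy [MaxDisplay]]] to gather statistics. Here we sort by block count and show the top 5. Note that the size with the most blocks is not necessarily the leak; if the program has run long enough, the -grp S option can be used instead to sort by the total memory allocated for each block size.
0:006> !heap -stat -h 000001471ba50000 -grp B 5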
heap @ 000001471ba50000
group-by: BLOCKCOUNT max-display: 5
size #blocks total ( %) (percent of totalblocks)
64 1fa - c5a8 (30.43)
30 12c - 3840 (18.04)
48 d1 - 3ac8 (12.57)
20 7f - fe0 (7.64)
10 3c - 3c0 (3.61)
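Step 3
Let the program run until the memory growth is clearly noticeable, then break into the debugger and sort by block count again. Blocks of size 0x64 have grown from 0x1fa to 0x849, that is, 1615 additional allocations. Growth on this scale marks the size to investigate; if you were watching with -grp S instead, look for the Entry Size whose total memory grew the most.
0:009> !heap -stat -h 000001471ba50000 -grp B 5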
heap @ 000001471ba50000
group-by: BLOCKCOUNT max-display: 5
size #blocks total ( %) (percent of totalblocks)
64 849 - 33c84 (64.14)
30 12c - 3840 (9.07)
48 d1 - 3ac8 (6.32)
20 7e - fc0 (3.81)
10 3c - 3c0 (1.81)
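The comparison between the two !heap -stat outputs can also be automated. The sketch below parses saved rows of the form "size blocks - total (percent)" and reports the size whose block count grew the most; `ParseHeapStat` and `LargestGrowth` are hypothetical names, and the check that size times blocks equals total is just a sanity filter that the rows shown here satisfy.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <utility>

// Maps allocation size -> number of blocks, read from !heap -stat -grp B rows.
using BlockCounts = std::map<std::uint64_t, std::uint64_t>;

BlockCounts ParseHeapStat(const std::string& text)
{
    BlockCounts counts;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line))
    {
        std::istringstream row(line);
        std::uint64_t size = 0, blocks = 0, total = 0;
        char dash = 0;
        // A data row looks like "64 1fa - c5a8 (30.43)"; all numbers are hex.
        if ((row >> std::hex >> size >> blocks >> dash >> total) &&
            dash == '-' && size * blocks == total)
        {
            counts[size] = blocks;
        }
    }
    return counts;
}

// Returns {size, growth} for the size whose block count grew the most,
// or {0, 0} if nothing grew.
std::pair<std::uint64_t, std::uint64_t> LargestGrowth(const BlockCounts& before,
                                                      const BlockCounts& after)
{
    std::pair<std::uint64_t, std::uint64_t> best(0, 0);
    for (const auto& kv : after)
    {
        const auto it = before.find(kv.first);
        const std::uint64_t old = (it == before.end()) ? 0 : it->second;
        if (kv.second > old && kv.second - old > best.second)
        {
            best = std::make_pair(kv.first, kv.second - old);
        }
    }
    return best;
}
```

With the two outputs above as input, the result is size 0x64 with a growth of 1615 blocks (0x849 - 0x1fa), matching the figure quoted in Step 3.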
Step 4
Next, list all entries of that particular size. There may be a great many of them; to save the list, windbg provides .logopen and .logclose for writing command output to a file.
0:009> !heap -flt s 64
_HEAP @ 1471ba50000
HEAP_ENTRY Size Prev Flags UserPtr UserSize - state
000001471ba61790 0009 0000 [00] 000001471ba617c0 00064 - (busy)
000001471ba66d80 0009 0009 [00] 000001471ba66db0 00064 - (busy)
000001471bafaa80 0009 0009 [00] 000001471bafaab0 00064 - (busy)
000001471bafab10 0009 0009 [00] 000001471bafab40 00064 - (busy)
......
000001471df9fd10 0009 0009 [00] 000001471df9fd40 00064 - (busy)
000001471df9fda0 0009 0009 [00] 000001471df9fdd0 00064 - (busy)
000001471df9fe30 0009 0009 [00] 000001471df9fe60 00064 - (busy)
000001471df9fec0 0009 0009 [00] 000001471df9fef0 00064 - (busy)
000001471df9ff50 0009 0009 [00] 000001471df9ff80 00064 - (busy)
000001471df9ffe0 0009 0009 [00] 000001471dfa0010 00064 - (busy)
_HEAP @ 1471a110000
_HEAP @ 1471bd50000
_HEAP @ 1471bd10000
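When the filtered list is long, a small helper can pick a sample of entries from the saved log and produce the !heap -p -a commands to paste back into the debugger. This is only a sketch: `ParseFilteredEntries`, `BuildStackQueries`, and the `step` parameter are hypothetical, and the parser assumes rows shaped like the ones printed above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// One row of !heap -flt s <size> output.
struct FilteredEntry
{
    std::uint64_t entry;    // HEAP_ENTRY column
    std::uint64_t userPtr;  // UserPtr column
    std::uint64_t userSize; // UserSize column
};

std::vector<FilteredEntry> ParseFilteredEntries(const std::string& text)
{
    std::vector<FilteredEntry> entries;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line))
    {
        std::istringstream row(line);
        FilteredEntry e{};
        std::uint64_t size = 0, prev = 0;
        std::string flags; // e.g. "[00]"
        // Row: HEAP_ENTRY Size Prev Flags UserPtr UserSize - state (hex values).
        if (row >> std::hex >> e.entry >> size >> prev >> flags >> e.userPtr >> e.userSize)
        {
            entries.push_back(e);
        }
    }
    return entries;
}

// Builds one "!heap -p -a <entry>" command for every step-th entry.
std::vector<std::string> BuildStackQueries(const std::vector<FilteredEntry>& entries,
                                           std::size_t step)
{
    assert(step > 0);
    std::vector<std::string> commands;
    for (std::size_t i = 0; i < entries.size(); i += step)
    {
        std::ostringstream cmd;
        cmd << "!heap -p -a " << std::hex << std::setw(16) << std::setfill('0')
            << entries[i].entry;
        commands.push_back(cmd.str());
    }
    return commands;
}
```

Sampling entries from different parts of the list, rather than only the first few, makes it more likely that the call stacks come from the allocations that are still accumulating.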
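Step 5
Pick a few Entry addresses and look at their call stacks, for example 000001471df9ff50 here. This makes it fairly easy to locate the code that allocated the memory. Note that the stack shows main rather than MemoryLeakObj; that is because the build was optimized by the compiler, but it does not stop us from finding the problem.
0:009> !heap -p -a 000001471df9ff50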
    address 000001471df9ff50 found in
    _HEAP @ 1471ba50000
              HEAP_ENTRY Size Prev Flags            UserPtr UserSize - state
        000001471df9ff50 0009 0000  [00]   000001471df9ff80    00064 - (busy)
        7ff8350fbe47 ntdll!RtlpCallInterceptRoutine+0x000000000000003f
        7ff8350baa6f ntdll!RtlpAllocateHeapInternal+0x000000000009192f
        7ff8315b9686 ucrtbase!_malloc_base+0x0000000000000036
        7ff6558613a3 MemoryLeakAnalysisViaWindbg!operator new+0x000000000000001f
        7ff65586102d MemoryLeakAnalysisViaWindbg!main+0x000000000000002d
        7ff6558615b0 MemoryLeakAnalysisViaWindbg!__scrt_common_main_seh+0x000000000000010c
        7ff834e84034 KERNEL32!BaseThreadInitThunk+0x0000000000000014
        7ff835083691 ntdll!RtlUserThreadStart+0x0000000000000021
Summary
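- The approach described in this article targets leaks caused by allocations of one fixed size. In large projects a leak may well involve variable-sized allocations, and this method does not suit that case. That is also why two articles on memory leaks still have not covered the topic: leaks come in many forms, and pinpointing one in a customer environment is harder still. Later articles are planned on vmmap, DebugDialog, and some other methods that do not rely on tools.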
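- The example above is what the author saw after attaching to the process for debugging. If a problem like this occurs in a customer environment, live debugging is obviously unlikely to be an option; instead, enable ust with gflags and collect two dumps to track down the problem (the interval between the two dumps must be long enough for the leak to be observable, which depends on the actual situation).
- When writing code, prefer the smart pointers unique_ptr and shared_ptr wherever possible. Introducing a leak is easy, but finding its cause can take longer than writing the code did.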
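As a small illustration of the smart-pointer advice, here is a sketch of the sample function rewritten with std::unique_ptr. It is not taken from the article: the `NoLeakObj` name and the `s_live` counter are hypothetical and exist only to make the automatic cleanup observable, and `std::snprintf` again stands in for `strcpy_s`.

```cpp
#include <cassert>
#include <cstdio>
#include <iostream>
#include <memory>

class TestClass
{
public:
    TestClass() { ++s_live; }
    ~TestClass() { --s_live; }
    char m_str[100];
    static int s_live; // hypothetical count of live objects, for illustration only
};
int TestClass::s_live = 0;

// Same work as the leaking sample, but ownership is held by std::unique_ptr,
// so the object is deleted automatically when pObj goes out of scope.
void NoLeakObj()
{
    auto pObj = std::make_unique<TestClass>();
    std::snprintf(pObj->m_str, sizeof(pObj->m_str), "%s", "Memory Leak Sample");
    std::cout << pObj->m_str << std::endl;
    assert(TestClass::s_live == 1); // exactly one live object inside the function
} // pObj is destroyed here, bringing s_live back to 0
```

Calling `NoLeakObj` in a loop leaves `TestClass::s_live` at zero after every call, so the count of 0x64-byte blocks seen in !heap -stat would no longer keep climbing.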