Bug 29785 - memory bloat in version 2.39
Summary: memory bloat in version 2.39
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: binutils
Version: 2.39
Importance: P2 normal
Target Milestone: 2.43
Assignee: Alan Modra
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-11-15 14:26 UTC by lijunlong
Modified: 2024-02-21 13:10 UTC

See Also:
Host:
Target:
Build:
Last reconfirmed: 2022-11-23 00:00:00


Attachments
Proposed patch (952 bytes, patch)
2023-05-19 09:14 UTC, Steinar H. Gunderson

Description lijunlong 2022-11-15 14:26:53 UTC
I think commit b43771b045fb5616da396 introduced this problem.
In version 2.37, addr2line uses less memory and runs faster.

I ran the following command:

time /usr/local/bin/addr2line -a -f -i -e mysqld.debug a77563 b3cabf 4f7eaa 9be700 a3dee8 b05b87 afecba aa8cd2 51f53e 52d0e4 9c0da9 9bad63 9bfb0a 9c0c26 a3b8e0 a3a7bf a3c0d9 99fe9f 9c1253 52bc4f


Version 2.37 runs for 8s, while version 2.39 runs for 1m58s.
Version 2.37 uses about 8G of memory, while version 2.39 uses 16.8G.


The hot backtrace for version 2.39, taken from a flamegraph, is:


_start
__libc_start_main
main@binutils/addr2line.c:579
process_file@binutils/addr2line.c:470
translate_addresses@binutils/addr2line.c:337
bfd_map_over_sections@bfd/section.c:1369
find_address_in_section@binutils/addr2line.c:177
find_address_in_section@binutils/addr2line.c:197
_bfd_elf_find_nearest_line@bfd/elf.c:9309
_bfd_dwarf2_find_nearest_line@./dwarf2.c:5854
comp_unit_find_nearest_line@./dwarf2.c:4598
comp_unit_maybe_decode_line_info@./dwarf2.c:4631
decode_line_info@./dwarf2.c:2863
arange_add@./dwarf2.c:2213
insert_arange_in_trie@./dwarf2.c:2191
insert_arange_in_trie@./dwarf2.c:2191
insert_arange_in_trie@./dwarf2.c:2191
insert_arange_in_trie@./dwarf2.c:2191
insert_arange_in_trie@./dwarf2.c:2191
insert_arange_in_trie@./dwarf2.c:2191
insert_arange_in_trie@./dwarf2.c:2191
insert_arange_in_trie@./dwarf2.c:2191
insert_arange_in_trie@./dwarf2.c:2076
Comment 1 lijunlong 2022-11-15 14:41:47 UTC
The debug file is larger than 10M, so it cannot be uploaded as an attachment.

I put the debug file on GitHub:

https://github.com/zhuizhuhaomeng/sharedfiles/blob/master/mysqld.debug.tar.gz
Comment 2 Alan Modra 2022-11-23 01:48:35 UTC
Confirmed. One commit before b43771b045fb gives:
$ time binutils/addr2line -a -e /tmp/mysqld.debug a77563
0x0000000000a77563
binutils/addr2line: DWARF error: could not find variable specification at offset 0xdfdf
binutils/addr2line: DWARF error: could not find variable specification at offset 0x1dee4
binutils/addr2line: DWARF error: could not find variable specification at offset 0x1def2
binutils/addr2line: DWARF error: could not find variable specification at offset 0x1de3a
binutils/addr2line: DWARF error: could not find variable specification at offset 0x1de48
binutils/addr2line: DWARF error: could not find variable specification at offset 0x1de56
binutils/addr2line: DWARF error: could not find variable specification at offset 0x1de2c
binutils/addr2line: DWARF error: could not find variable specification at offset 0x1ded6
/usr/src/debug/mariadb-10.3.35-1.module+el8.6.0+1005+cdf19c22.x86_64/storage/innobase/ut/ut0wqueue.cc:162

real	0m3.454s
user	0m2.229s
sys	0m1.224s

Commit b43771b045fb gives:
$ time binutils/addr2line -a -e /tmp/mysqld.debug a77563
0x0000000000a77563
binutils/addr2line: DWARF error: could not find variable specification at offset 0xdfdf
binutils/addr2line: DWARF error: could not find variable specification at offset 0x1dee4
binutils/addr2line: DWARF error: could not find variable specification at offset 0x1def2
binutils/addr2line: DWARF error: could not find variable specification at offset 0x1de3a
binutils/addr2line: DWARF error: could not find variable specification at offset 0x1de48
binutils/addr2line: DWARF error: could not find variable specification at offset 0x1de56
binutils/addr2line: DWARF error: could not find variable specification at offset 0x1de2c
binutils/addr2line: DWARF error: could not find variable specification at offset 0x1ded6
/usr/src/debug/mariadb-10.3.35-1.module+el8.6.0+1005+cdf19c22.x86_64/storage/innobase/ut/ut0wqueue.cc:162

real	0m46.450s
user	0m43.108s
sys	0m3.336s
Comment 3 Steinar H. Gunderson 2023-05-19 08:47:35 UTC
It seems this binary is somehow broken; it has a _huge_ number of duplicate ranges in different units. E.g. the range [0x51f600, 0x52dad0) shows up 764 times! And there are lots and lots of such ranges (approximately 30000 of them). Since we try to merge ranges together, this causes O(n²) behavior, which is probably related to the slowness. (Since they are duplicates, a binary search wouldn't help us either.) I also wonder if the huge number of duplicates causes us to try to make the tree deeper in an attempt to distinguish them--which obviously doesn't work, but creates tons of new nodes that we need to insert into.

I guess the most reasonable thing to do here would be fixing the debug info in the binary so that it's not as pathological?
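
To make the "deeper tree doesn't distinguish them" point concrete, here is a minimal sketch in C. It is not the dwarf2.c code: the struct, the function names, the fan-out of 256 and the bucket parameters are all assumptions chosen for illustration. It counts how many ranges each child would hold if a bucket were split into equal-sized children.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative fan-out only: 256 children per interior node is an
   assumption of this sketch, not something stated in this report.  */
#define NUM_CHILDREN 256u

struct range
{
  uint64_t low;   /* inclusive */
  uint64_t high;  /* exclusive */
};

/* Number of ranges that overlap child IDX of a bucket starting at
   BUCKET_LOW whose children each cover CHILD_SIZE bytes.  */
size_t
child_load (const struct range *r, size_t n, uint64_t bucket_low,
            uint64_t child_size, unsigned int idx)
{
  uint64_t child_low = bucket_low + (uint64_t) idx * child_size;
  uint64_t child_high = child_low + child_size;
  size_t count = 0;

  for (size_t i = 0; i < n; i++)
    if (r[i].low < child_high && r[i].high > child_low)
      count++;
  return count;
}

/* Largest number of ranges any single child holds after a split.  */
size_t
max_child_load (const struct range *r, size_t n, uint64_t bucket_low,
                uint64_t child_size)
{
  size_t worst = 0;

  for (unsigned int idx = 0; idx < NUM_CHILDREN; idx++)
    {
      size_t load = child_load (r, n, bucket_low, child_size, idx);
      if (load > worst)
        worst = load;
    }
  return worst;
}
```

With 764 copies of [0x51f600, 0x52dad0) in a hypothetical bucket starting at 0x500000 with 0x1000-byte children, every child the range touches still receives all 764 entries, so the split creates new nodes without making any of them smaller.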
Comment 4 Steinar H. Gunderson 2023-05-19 09:14:54 UTC
Created attachment 14889 [details]
Proposed patch

Something like this would probably mitigate the worst splitting issues. There's still going to be O(n²) behavior almost no matter what we do, though, so the best thing would really be to stop emitting such huge amounts of duplicate ranges.
Comment 5 Sourceware Commits 2024-02-15 03:23:12 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=7bd1e04a3532ed3f833a79a40bd7bc0bd48706ad

commit 7bd1e04a3532ed3f833a79a40bd7bc0bd48706ad
Author: Steinar H. Gunderson <steinar+sourceware@gunderson.no>
Date:   Fri May 19 09:14:54 2023 +0000

    PR29785, memory bloat after b43771b045fb
    
    Pathological cases of dwarf info with overlapping duplicate memory
    ranges can cause splitting of trie leaf nodes, which in the worst case
    will cause memory to increase without bounds.
    
            PR 29785
            * dwarf2.c (insert_arange_in_trie): Don't split leaf nodes
            unless that reduces number of elements in at least one node.
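
One plausible reading of the rule in this ChangeLog entry, sketched in C. This is not the committed code: `split_can_help`, the struct and the inclusive bucket bounds are hypothetical names and conventions for this sketch. The idea is that a range covering the whole bucket lands in every child, so splitting can only shrink a child if some stored range does not cover the whole bucket.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct range
{
  uint64_t low;   /* inclusive */
  uint64_t high;  /* exclusive */
};

/* Return true if at least one of the N stored ranges does not cover
   the whole bucket [BUCKET_LOW, BUCKET_LAST] (both bounds inclusive).
   A range that covers the whole bucket would be copied into every
   child, so only a narrower range gives a split any chance of
   reducing a child's size.  This is a heuristic sketch.  */
bool
split_can_help (const struct range *r, size_t n,
                uint64_t bucket_low, uint64_t bucket_last)
{
  for (size_t i = 0; i < n; i++)
    if (r[i].low > bucket_low || r[i].high <= bucket_last)
      return true;
  return false;
}
```

A full leaf for which this returns false would be left unsplit (for example by letting it grow instead), which avoids the unbounded chain of splits described in the commit message.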
Comment 6 Alan Modra 2024-02-15 03:25:35 UTC
Patch pushed.  It also fixes https://oss-fuzz.com/testcase-detail/4564110830272512
Comment 7 Sourceware Commits 2024-02-21 13:10:59 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=f96127310144d360eac93444c1b6efe80497d163

commit f96127310144d360eac93444c1b6efe80497d163
Author: Alan Modra <amodra@gmail.com>
Date:   Wed Feb 21 21:59:40 2024 +1030

    Re: PR29785, memory bloat after b43771b045fb
    
    Commit 7bd1e04a3532 introduced "dwarf2.c:2152:29: runtime error: shift
    exponent 64 is too large".  This is on the bucket_high_pc calculation
    which was moved to the top of insert_arange_in_trie where previously
    it was later, at a point where the overflow could not occur.  Move it
    back and arrange for a duplicate calculation of bucket_high_pc which
    is also protected from overflow.
    
            PR 29785
            * dwarf2.c (insert_arange_in_trie): Split bucket_high_pc.
            Move trie_pc_bits < VMA_BITS into splitting_leaf_will_help.
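
The overflow described above is the usual C rule that shifting a 64-bit value by 64 is undefined. A minimal sketch of a guarded calculation follows, assuming the bucket's last address is computed as the prefix plus an all-ones mask shifted right by the number of prefix bits. That formula, the name `bucket_last_pc` and the value of VMA_BITS are assumptions of this sketch, not quotes from dwarf2.c.

```c
#include <stdint.h>

/* Assumed width of an address for this sketch.  */
#define VMA_BITS 64u

/* Last address (inclusive) of the bucket whose prefix TRIE_PC has
   TRIE_PC_BITS significant bits.  When all VMA_BITS bits are already
   fixed the bucket is a single address, and the shift must be
   skipped because a shift by 64 is undefined behavior.  */
uint64_t
bucket_last_pc (uint64_t trie_pc, unsigned int trie_pc_bits)
{
  if (trie_pc_bits >= VMA_BITS)
    return trie_pc;
  return trie_pc + (UINT64_MAX >> trie_pc_bits);
}
```

Checking trie_pc_bits < VMA_BITS before the shift, as the ChangeLog entry describes, has the same effect as the early return here.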