glibc(2.26) malloc原理及源码分析

glibc2.26开始引入了tcache的技术,提升了堆管理的性能。本文根据glibc2.26的源码介绍含tcache技术的Linux堆分配原理。

基本概念

arena

以前的Linux使用的dlmalloc只支持单线程的堆分配,现在的ptmalloc2支持多线程的堆分配:它会给每个线程分配一块内存区,这个内存区就叫做arena。(其实arena的数量是有上限的,当线程数超过arena的数量上限时就会有多个线程共用一个arena)

bin

为了更有效率地分配空闲的堆内存,ptmalloc2把空闲的chunk进行了分类,并用链表把不同类的chunk串起来就形成了bin。不同类的chunk组成的bin有不同的名字:fast bin, unsorted bin, small bin, large bin,这些bin的不同之处将在后面源码分析的部分说明。

基本结构

malloc_chunk

malloc_chunk结构用来表示一个chunk,代码中的注释对其解释得很详细,总体意思是:当当前chunk被使用时,只有mchunk_prev_size和mchunk_size正常使用,其余部分全部用来存储用户数据,下一个chunk的mchunk_prev_size也用来存用户数据;当当前chunk空闲时,fd指向链表中的下一个chunk,bk指向链表中的上一个chunk,物理相邻的下一个chunk的mchunk_prev_size存的是当前chunk的size。另外,因为mchunk_size的低三位被用来存储是否不属于主arena(A)、是否由mmap分配(M)、上一个chunk是否被使用(P)等信息,所以把低三位清零(即mchunk_size & ~0x7)才是当前chunk的真实大小。这样的结构设计充分利用了空间:chunk被使用时不在链表中,不需要fd/bk指针,与它物理相邻的下一个chunk也不需要通过mchunk_prev_size来找到它,所以下一个chunk的mchunk_prev_size的空间也被用来存储用户数据。

/*
This struct declaration is misleading (but accurate and necessary).
It declares a "view" into memory allowing access to necessary
fields at known offsets from a given base. See explanation below.
*/

struct malloc_chunk {

INTERNAL_SIZE_T mchunk_prev_size; /* Size of previous chunk (if free). */
INTERNAL_SIZE_T mchunk_size; /* Size in bytes, including overhead. */

struct malloc_chunk* fd; /* double links -- used only if free. */
struct malloc_chunk* bk;

/* Only used for large blocks: pointer to next larger size. */
struct malloc_chunk* fd_nextsize; /* double links -- used only if free. */
struct malloc_chunk* bk_nextsize;
};


/*
malloc_chunk details:

(The following includes lightly edited explanations by Colin Plumb.)

Chunks of memory are maintained using a `boundary tag' method as
described in e.g., Knuth or Standish. (See the paper by Paul
Wilson ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps for a
survey of such techniques.) Sizes of free chunks are stored both
in the front of each chunk and at the end. This makes
consolidating fragmented chunks into bigger chunks very fast. The
size fields also hold bits representing whether chunks are free or
in use.

An allocated chunk looks like this:


chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of previous chunk, if unallocated (P clear) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of chunk, in bytes |A|M|P|
mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| User data starts here... .
. .
. (malloc_usable_size() bytes) .
. |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| (size of chunk, but used for application data) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of next chunk, in bytes |A|0|1|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Where "chunk" is the front of the chunk for the purpose of most of
the malloc code, but "mem" is the pointer that is returned to the
user. "Nextchunk" is the beginning of the next contiguous chunk.

Chunks always begin on even word boundaries, so the mem portion
(which is returned to the user) is also on an even word boundary, and
thus at least double-word aligned.

Free chunks are stored in circular doubly-linked lists, and look like this:

chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of previous chunk, if unallocated (P clear) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
`head:' | Size of chunk, in bytes |A|0|P|
mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Forward pointer to next chunk in list |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Back pointer to previous chunk in list |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Unused space (may be 0 bytes long) .
. .
. |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
`foot:' | Size of chunk, in bytes |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of next chunk, in bytes |A|0|0|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The P (PREV_INUSE) bit, stored in the unused low-order bit of the
chunk size (which is always a multiple of two words), is an in-use
bit for the *previous* chunk. If that bit is *clear*, then the
word before the current chunk size contains the previous chunk
size, and can be used to find the front of the previous chunk.
The very first chunk allocated always has this bit set,
preventing access to non-existent (or non-owned) memory. If
prev_inuse is set for any given chunk, then you CANNOT determine
the size of the previous chunk, and might even get a memory
addressing fault when trying to do so.

The A (NON_MAIN_ARENA) bit is cleared for chunks on the initial,
main arena, described by the main_arena variable. When additional
threads are spawned, each thread receives its own arena (up to a
configurable limit, after which arenas are reused for multiple
threads), and the chunks in these arenas have the A bit set. To
find the arena for a chunk on such a non-main arena, heap_for_ptr
performs a bit mask operation and indirection through the ar_ptr
member of the per-heap header heap_info (see arena.c).

Note that the `foot' of the current chunk is actually represented
as the prev_size of the NEXT chunk. This makes it easier to
deal with alignments etc but can be very confusing when trying
to extend or adapt this code.

The three exceptions to all this are:

1. The special chunk `top' doesn't bother using the
trailing size field since there is no next contiguous chunk
that would have to index off it. After initialization, `top'
is forced to always exist. If it would become less than
MINSIZE bytes long, it is replenished.

2. Chunks allocated via mmap, which have the second-lowest-order
bit M (IS_MMAPPED) set in their size fields. Because they are
allocated one-by-one, each must contain its own trailing size
field. If the M bit is set, the other bits are ignored
(because mmapped chunks are neither in an arena, nor adjacent
to a freed chunk). The M bit is also used for chunks which
originally came from a dumped heap via malloc_set_state in
hooks.c.

3. Chunks in fastbins are treated as allocated chunks from the
point of view of the chunk allocator. They are consolidated
with their neighbors only in bulk, in malloc_consolidate.
*/

heap_info

heap_info结构用来表示一个堆,主要用于非主arena,因为主arena只有一个堆,而非主arena可以调用mmap创建多个堆,这些堆通过prev指针串起来。

/* A heap is a single contiguous memory region holding (coalesceable)
malloc_chunks. It is allocated with mmap() and always starts at an
address aligned to HEAP_MAX_SIZE. */

typedef struct _heap_info
{
mstate ar_ptr; /* Arena for this heap. */
struct _heap_info *prev; /* Previous heap. */
size_t size; /* Current size in bytes. */
size_t mprotect_size; /* Size in bytes that has been mprotected
PROT_READ|PROT_WRITE. */
/* Make sure the following data is properly aligned, particularly
that sizeof (heap_info) + 2 * SIZE_SZ is a multiple of
MALLOC_ALIGNMENT. */
char pad[-6 * SIZE_SZ & MALLOC_ALIGN_MASK];
} heap_info;

malloc_state

malloc_state用来表示一个arena,所以每个arena中只有一个malloc_state结构。

struct malloc_state
{
/* Serialize access. */
__libc_lock_define (, mutex);//用于线程间的互斥,同一个arena同时只能被一个线程访问

/* Flags (formerly in max_fast). */
int flags;//表示arena是否存在fastbin或者内存是否连续等信息

/* Set if the fastbin chunks contain recently inserted free blocks. */
/* Note this is a bool but not all targets support atomics on booleans. */
int have_fastchunks;

/* Fastbins */
//一个用来存放所有fastbin链表的数组,最多10个fastbin链表
mfastbinptr fastbinsY[NFASTBINS];

/* Base of the topmost chunk -- not otherwise kept in a bin */
mchunkptr top;//指向top chunk

/* The remainder from the most recent split of a small request */
mchunkptr last_remainder;//指向last_remainder

/* Normal bins packed as described above */
mchunkptr bins[NBINS * 2 - 2];//存放除fastbin的所有bin

/* Bitmap of bins */
unsigned int binmap[BINMAPSIZE];

/* Linked list */
struct malloc_state *next;//指向下一个arena

/* Linked list for free arenas. Access to this field is serialized
by free_list_lock in arena.c. */
struct malloc_state *next_free;//指向下一个空闲的arena

/* Number of threads attached to this arena. 0 if the arena is on
the free list. Access to this field is serialized by
free_list_lock in arena.c. */
INTERNAL_SIZE_T attached_threads;//使用本arena的线程的数量

/* Memory allocated from the system in this arena. */
//分配给本arena的内存范围
INTERNAL_SIZE_T system_mem;
INTERNAL_SIZE_T max_system_mem;
};

tcache_entry

当chunk空闲并被放入tcache时,tcache_entry就覆盖在chunk的用户数据部分,其next指向链中下一个chunk的用户数据起始处;由于tcache_entry位于用户数据的开头,next字段正好与chunk的fd指针重合

/* We overlay this structure on the user-data portion of a chunk when
the chunk is stored in the per-thread cache. */
typedef struct tcache_entry
{
struct tcache_entry *next;
} tcache_entry;

tcache_perthread_struct

tcache_perthread_struct用于存放所有的entries链:counts表示每条entries链中tcache_entry的数量,entries指向每条链的第一个tcache_entry

/* There is one of these for each thread, which contains the
per-thread cache (hence "tcache_perthread_struct"). Keeping
overall size low is mildly important. Note that COUNTS and ENTRIES
are redundant (we could have just counted the linked list each
time), this is for performance reasons. */
typedef struct tcache_perthread_struct
{
char counts[TCACHE_MAX_BINS];
tcache_entry *entries[TCACHE_MAX_BINS];
} tcache_perthread_struct;

关键函数

__libc_malloc

malloc的入口函数,调用了tcache_init, tcache_get, _int_malloc等关键函数,这些函数将在后面讲解

void *
__libc_malloc(size_t bytes)
{
mstate ar_ptr;
void *victim;
//在malloc之前先调用__malloc_hook函数,如果__malloc_hook不为空的话
void *(*hook)(size_t, const void *) = atomic_forced_read(__malloc_hook);
if (__builtin_expect(hook != NULL, 0))
return (*hook)(bytes, RETURN_ADDRESS(0));
#if USE_TCACHE
/* int_free also calls request2size, be careful to not pad twice. */
//tbytes为真实申请的字节数
//checked_request2size调用request2size对tbytes进行了赋值,并进行了检查
size_t tbytes;
checked_request2size(bytes, tbytes);
size_t tc_idx = csize2tidx(tbytes);//csize2tidx返回索引
/* With rounding and alignment, the bins are...
idx 0 bytes 0..24 (64-bit) or 0..12 (32-bit)
idx 1 bytes 25..40 or 13..20
idx 2 bytes 41..56 or 21..28
etc. */

MAYBE_INIT_TCACHE();//调用tcache_init进行tcache的初始化

DIAG_PUSH_NEEDS_COMMENT;
//如果索引合法并且对应entries里有chunk则调用tcache_get返回chunk
if (tc_idx < mp_.tcache_bins
/*&& tc_idx < TCACHE_MAX_BINS*/ /* to appease gcc */
&& tcache && tcache->entries[tc_idx] != NULL)
{
return tcache_get(tc_idx);
}
DIAG_POP_NEEDS_COMMENT;
#endif

//如果为单线程直接调用_int_malloc在main_arena中分配chunk
if (SINGLE_THREAD_P)
{
victim = _int_malloc(&main_arena, bytes);
assert(!victim || chunk_is_mmapped(mem2chunk(victim)) ||
&main_arena == arena_for_chunk(mem2chunk(victim)));
return victim;
}

//寻找可用的arena并分配chunk
arena_get(ar_ptr, bytes);
victim = _int_malloc(ar_ptr, bytes);

/* Retry with another arena only if we were able to find a usable arena
before. */
//如果找到了可用的arena但chunk分配不成功则重新寻找和分配
if (!victim && ar_ptr != NULL)
{
LIBC_PROBE(memory_malloc_retry, 1, bytes);
ar_ptr = arena_get_retry(ar_ptr, bytes);
victim = _int_malloc(ar_ptr, bytes);
}

//解锁arena
if (ar_ptr != NULL)
__libc_lock_unlock(ar_ptr->mutex);
//安全检查
assert(!victim || chunk_is_mmapped(mem2chunk(victim)) ||
ar_ptr == arena_for_chunk(mem2chunk(victim)));
return victim;
}

在glibc2.26之前的版本中(如glibc2.25),没有#if USE_TCACHE...#endif和判断单线程的那些代码。另外在这些旧版本中可以通过修改__malloc_hook的值来getshell,但glibc2.34已经将__malloc_hook、__realloc_hook、__memalign_hook、__free_hook等hook变量删除。

tcache_init

tcache的初始化函数,主要就是复用malloc的内存分配流程给tcache_perthread_struct分配内存,再清零完成初始化

static void
tcache_init(void)
{
//本部分与__libc_malloc的后半段基本一致,就是给tcache分配sizeof(tcache_perthread_struct)大小的内存,完成初始化
mstate ar_ptr;
void *victim = 0;
const size_t bytes = sizeof (tcache_perthread_struct);

if (tcache_shutting_down)
return;

arena_get (ar_ptr, bytes);
victim = _int_malloc (ar_ptr, bytes);
if (!victim && ar_ptr != NULL)
{
ar_ptr = arena_get_retry (ar_ptr, bytes);
victim = _int_malloc (ar_ptr, bytes);
}


if (ar_ptr != NULL)
__libc_lock_unlock (ar_ptr->mutex);

/* In a low memory situation, we may not be able to allocate memory
- in which case, we just keep trying later. However, we
typically do this very early, so either there is sufficient
memory, or there isn't enough memory to do non-trivial
allocations anyway. */
if (victim)
{
tcache = (tcache_perthread_struct *) victim;
memset (tcache, 0, sizeof (tcache_perthread_struct));
}

}

tcache_get

__libc_malloc中判断索引为tc_idx的entries中有可用的chunk后调用本函数取出对应的chunk

/* Caller must ensure that we know tc_idx is valid and there's
available chunks to remove. */
static __always_inline void *
tcache_get (size_t tc_idx)
{
tcache_entry *e = tcache->entries[tc_idx];//找到索引为tc_idx的entries,并暂存给e
assert (tc_idx < TCACHE_MAX_BINS);//确保tc_idx小于TCACHE_MAX_BINS
assert (tcache->entries[tc_idx] > 0);//确保entries非空
tcache->entries[tc_idx] = e->next;//把e踢出链表
--(tcache->counts[tc_idx]);//链表里的chunk数减一
return (void *) e;//e即要找的chunk,因为e指向的就是chunk的用户数据,所以直接返回e
}

_int_malloc

本函数就是Linux堆分配的主要内容,由于代码较长,所以分段分析。

开始先把申请的字节数转换成实际分配的chunk大小:按2*SIZE_SZ对齐(SIZE_SZ在32位下为4字节、64位下为8字节,即对齐到8或16字节),且不小于最小chunk大小MINSIZE;然后判断当前的arena是否可用,如果不可用则调用sysmalloc向操作系统申请内存

static void *
_int_malloc (mstate av, size_t bytes)
{
INTERNAL_SIZE_T nb; /* normalized request size */
unsigned int idx; /* associated bin index */
mbinptr bin; /* associated bin */

mchunkptr victim; /* inspected/selected chunk */
INTERNAL_SIZE_T size; /* its size */
int victim_index; /* its bin index */

mchunkptr remainder; /* remainder from a split */
unsigned long remainder_size; /* its size */

unsigned int block; /* bit map traverser */
unsigned int bit; /* bit map traverser */
unsigned int map; /* current word of binmap */

mchunkptr fwd; /* misc temp for linking */
mchunkptr bck; /* misc temp for linking */

#if USE_TCACHE
size_t tcache_unsorted_count; /* count of unsorted chunks processed */
#endif

/*
Convert request size to internal form by adding SIZE_SZ bytes
overhead plus possibly more to obtain necessary alignment and/or
to obtain a size of at least MINSIZE, the smallest allocatable
size. Also, checked_request2size traps (returning 0) request sizes
that are so large that they wrap around zero when padded and
aligned.
*/
//将申请的字节数bytes根据2*SIZE_SZ对齐转换成实际分配的字节数nb
//并做了一些安全检查,确保不会溢出
checked_request2size (bytes, nb);

/* There are no usable arenas. Fall back to sysmalloc to get a chunk from
mmap. */
//如果当前的arena不可用,就用调用sysmalloc向操作系统直接申请nb大小的内存返回
if (__glibc_unlikely (av == NULL))
{
void *p = sysmalloc (nb, av);
if (p != NULL)
alloc_perturb (p, bytes);
return p;
}
...
}

fast bin

fast bin存储在fastbinsY数组中,一共有10个,每个fast bin都是一个单链表,只用fd将各个chunk串起来。每个单链表中的chunk大小是一样的,这样在找特定大小的chunk的时候就不用挨个遍历,只需要计算出对应链表的索引即可,提高了效率。如fastbinsY[0]的链表里都是16字节大小的chunk(以32位系统为例),之后每个链表的chunk大小依次增加8个字节,所以最后一个fast bin中存储的是88字节的chunk。另外,fast bin使用的是LIFO算法(即后入先出),这样刚被释放的chunk最先被分配,利用了局部性原理。

fast bin是除tcache外优先级最高的,如果fast bin中有满足需求的chunk,就不需要再到small bin和large bin中寻找。当在fast bin中找到需要的chunk后,还会把该链表中与它大小相同的其余chunk放入tcache,目的就是利用局部性原理提高下一次内存分配的效率。

/*
If the size qualifies as a fastbin, first check corresponding bin.
This code is safe to execute even if av is not yet initialized, so we
can try it without checking, which saves some time on this fast path.
*/

#define REMOVE_FB(fb, victim, pp) \
do \
{ \
victim = pp; \
if (victim == NULL) \
break; \
} while ((pp = catomic_compare_and_exchange_val_acq(fb, victim->fd, victim)) != victim);
//判断实际分配的chunk大小是否满足fast bin的要求
//即不超过get_max_fast()(默认值为64*SIZE_SZ/4,32位下为64字节、64位下为128字节;
//编译期上限MAX_FAST_SIZE为80*SIZE_SZ/4,即80和160字节)
if ((unsigned long)(nb) <= (unsigned long)(get_max_fast()))
{
idx = fastbin_index(nb);//寻找在fastbin数组中对应的索引
mfastbinptr *fb = &fastbin(av, idx);//链表头指针
mchunkptr pp;
victim = *fb;//将链表的第一个chunk作为victim取出,插入时也插入在链表头,即LIFO

if (victim != NULL)
{
if (SINGLE_THREAD_P)
*fb = victim->fd;//即将victim从链表中取出
else
REMOVE_FB(fb, pp, victim);//也是将victim从链表中取出,fb指向下一个chunk
//确保分配的chunk属于刚才找到的fastbin链表
if (__glibc_likely(victim != NULL))
{
size_t victim_idx = fastbin_index(chunksize(victim));
if (__builtin_expect(victim_idx != idx, 0))
malloc_printerr("malloc(): memory corruption (fast)");
check_remalloced_chunk(av, victim, nb);
//将大小为nb的chunk放到tcache里
#if USE_TCACHE
/* While we're here, if we see other chunks of the same size,
stash them in the tcache. */
size_t tc_idx = csize2tidx(nb);
if (tcache && tc_idx < mp_.tcache_bins)//mp_.tcache_bins=64
{
mchunkptr tc_victim;

/* While bin not empty and tcache not full, copy chunks. */
//当对应tcache链没有满并且要放入tcache的chunk(即刚才被拿出去的chunk的下一个chunk)不为空
while (tcache->counts[tc_idx] < mp_.tcache_count
&& (tc_victim = *fb) != NULL)//mp_.tcache_count=7,即tcache的链表最多7个chunk
{
//同上,从链表中取出tc_victim
if (SINGLE_THREAD_P)
*fb = tc_victim->fd;
else
{
REMOVE_FB(fb, pp, tc_victim);
if (__glibc_unlikely(tc_victim == NULL))
break;
}
tcache_put(tc_victim, tc_idx);//将取出的chunk放入对应的tcache链表
}
}
#endif
//返回从fastbin中找到的chunk
void *p = chunk2mem(victim);
alloc_perturb(p, bytes);
return p;
}
}
}

small bin

small bin存储在bins数组中,下标为2到63,一共62个。和fast bin一样,每个small bin中的chunk都是相同的大小:下标为2的small bin中存储的是16字节的chunk,之后依次增加8个字节,所以下标为63的small bin中存储的是504字节的chunk(以32位系统为例)。与fast bin不同的是,small bin是双向循环链表,且采用FIFO(先进先出)算法。

  /*
If a small request, check regular bin. Since these "smallbins"
hold one size each, no searching within bins is necessary.
(For a large request, we need to wait until unsorted chunks are
processed to find best fit. But for small ones, fits are exact
anyway, so we can check now, which is faster.)
*/

if (in_smallbin_range(nb))//即小于MIN_LARGE_SIZE(在32位下为512,在64位下为1024)
//所以small bin中最大的chunk为504字节和1008字节
{
idx = smallbin_index(nb);//算出对应的small bin索引
bin = bin_at(av, idx);//找到对应的small bin链表
//如果small bin链表头的上一个chunk(即链表的最后一个chunk,small bin链表是双向循环链表)
//不等于它自己,即该small bin链表非空
//取链表的最后一个chunk作为victim,即FIFO
if ((victim = last(bin)) != bin)
{
if (victim == 0) /* initialization check */
malloc_consolidate(av);//合并fastbin到unsortedbin
else
{
bck = victim->bk;//倒数第二个chunk
//确保倒数第二个chunk的fd指向最后的chunk
if (__glibc_unlikely(bck->fd != victim))
malloc_printerr("malloc(): smallbin double linked list corrupted");
//将victim下一个chunk的PREV_INUSE位置1,即标记victim为被使用状态
set_inuse_bit_at_offset(victim, nb);
//将victim从链表中取出
bin->bk = bck;
bck->fd = bin;

//如果不是main_arena将NON_MAIN_ARENA位置1
if (av != &main_arena)
set_non_main_arena(victim);
check_malloced_chunk(av, victim, nb);
#if USE_TCACHE
//与fast bin中那部分相似,都是将所有大小为nb的chunk放入tcache中
//不同的是在本部分中chunk放入tcache之前将该chunk标记为使用中
//而fast bin的那部分没有
//说明tcache和fast bin中的chunk尽管空闲还是都被标记为使用中
//目的是防止相邻的chunk合并
/* While we're here, if we see other chunks of the same size,
stash them in the tcache. */
size_t tc_idx = csize2tidx(nb);
if (tcache && tc_idx < mp_.tcache_bins)
{
mchunkptr tc_victim;

/* While bin not empty and tcache not full, copy chunks over. */
while (tcache->counts[tc_idx] < mp_.tcache_count && (tc_victim = last(bin)) != bin)
{
if (tc_victim != 0)
{
bck = tc_victim->bk;
set_inuse_bit_at_offset(tc_victim, nb);
if (av != &main_arena)
set_non_main_arena(tc_victim);
bin->bk = bck;
bck->fd = bin;

tcache_put(tc_victim, tc_idx);
}
}
}
#endif
void *p = chunk2mem(victim);
alloc_perturb(p, bytes);
return p;
}
}
}

large bin1

这部分并不是真正地使用large bin,只是计算出了对应的large bin的索引,并调用malloc_consolidate函数合并fast bin,目的是减少堆中的碎片。

/*
If this is a large request, consolidate fastbins before continuing.
While it might look excessive to kill all fastbins before
even seeing if there is space available, this avoids
fragmentation problems normally associated with fastbins.
Also, in practice, programs tend to have runs of either small or
large requests, but less often mixtures, so consolidation is not
invoked all that often in most programs. And the programs that
it is called frequently in otherwise tend to fragment.
*/

else
{
idx = largebin_index(nb);//算出对应的large bin的索引
//合并 fast bin 到 unsorted bin
if (atomic_load_relaxed(&av->have_fastchunks))
malloc_consolidate(av);
}

unsorted bin

unsorted bin也存储在bins数组中,下标为1,所以只有一条unsorted bin链表。unsorted bin也是双向循环链表,但是链表里的chunk大小没有限制,任何大小的chunk都可以放到unsorted bin链表中,也没有顺序,采用的算法也是FIFO。

遍历unsorted bin的所有chunk:如果unsorted bin中只有一个chunk且它就是last_remainder,而且申请的大小在small bin范围内,就拆分last_remainder并返回;否则,如果当前chunk大小正好是所需大小,则先尝试放入tcache(tcache未满时不直接返回);否则就把当前chunk放入它该在的bin中。若已有chunk被放入tcache且处理过的chunk数超过上限,则直接从tcache返回chunk。遍历结束后,再判断是否有chunk被放入过tcache,如果有则从tcache中取出并返回。

  /*
Process recently freed or remaindered chunks, taking one only if
it is exact fit, or, if this a small request, the chunk is remainder from
the most recent non-exact fit. Place other traversed chunks in
bins. Note that this step is the only place in any routine where
chunks are placed in bins.

The outer loop here is needed because we might not realize until
near the end of malloc that we should have consolidated, so must
do so and retry. This happens at most once, and only when we would
otherwise need to expand memory to service a "small" request.
*/
//tcache变量定义及初始化
#if USE_TCACHE
INTERNAL_SIZE_T tcache_nb = 0;
size_t tc_idx = csize2tidx(nb);//nb对应的tcache索引
if (tcache && tc_idx < mp_.tcache_bins)
tcache_nb = nb;
int return_cached = 0;//用于标记是否有合适大小chunk在下面的过程中被放入tcache

tcache_unsorted_count = 0;//被处理过的unsorted chunk的数量
#endif

for (;;)//大循环
{
int iters = 0;
//unsorted bin非空
while ((victim = unsorted_chunks(av)->bk) != unsorted_chunks(av))
{
bck = victim->bk;//倒数第二个chunk
//victim的大小非法
if (__builtin_expect(chunksize_nomask(victim) <= 2 * SIZE_SZ, 0) || __builtin_expect(chunksize_nomask(victim) > av->system_mem, 0))
malloc_printerr("malloc(): memory corruption");
size = chunksize(victim);//victim的大小

/*
If a small request, try to use last remainder if it is the
only chunk in unsorted bin. This helps promote locality for
runs of consecutive small requests. This is the only
exception to best-fit, and applies only when there is
no exact fit for a small chunk.
*/

if (in_smallbin_range(nb) && //如果请求的大小在small bin的范围内
bck == unsorted_chunks(av) && //并且unsorted bin中只有一个chunk
victim == av->last_remainder &&//并且该chunk就是last remainder
//并且在该chunk中拿走nb的内存后依然可以成为一个chunk
(unsigned long)(size) > (unsigned long)(nb + MINSIZE))
{
/* split and reattach remainder */
remainder_size = size - nb;//剩余部分的大小
remainder = chunk_at_offset(victim, nb);//剩余部分的起始地址
//将剩余部分取代原来的chunk放入unsorted bin链表中
unsorted_chunks(av)->bk = unsorted_chunks(av)->fd = remainder;
av->last_remainder = remainder;
remainder->bk = remainder->fd = unsorted_chunks(av);
//如果剩余部分不在small bin范围内(即属于large bin大小)则将fd_nextsize和bk_nextsize指针置空
if (!in_smallbin_range(remainder_size))
{
remainder->fd_nextsize = NULL;
remainder->bk_nextsize = NULL;
}

//设置victim的头部
set_head(victim, nb | PREV_INUSE |
(av != &main_arena ? NON_MAIN_ARENA : 0));
//设置剩余部分的头部
set_head(remainder, remainder_size | PREV_INUSE);
//设置剩余部分的尾部
set_foot(remainder, remainder_size);

//返回victim
check_malloced_chunk(av, victim, nb);
void *p = chunk2mem(victim);
alloc_perturb(p, bytes);
return p;
}

/* remove from unsorted list */
//将victim移出链表
unsorted_chunks(av)->bk = bck;
bck->fd = unsorted_chunks(av);

/* Take now instead of binning if exact fit */

if (size == nb)//如果victim的大小正好是要求的大小
{
//设置标识位
set_inuse_bit_at_offset(victim, size);
if (av != &main_arena)
set_non_main_arena(victim);
#if USE_TCACHE
/* Fill cache first, return to user only if cache fills.
We may return one of these chunks later. */
//如果tcache没满就优先把victim放入tcache,而不是返回给用户
if (tcache_nb && tcache->counts[tc_idx] < mp_.tcache_count)
{
tcache_put(victim, tc_idx);
return_cached = 1;
continue;
}
else
{
#endif
//返回victim
check_malloced_chunk(av, victim, nb);
void *p = chunk2mem(victim);
alloc_perturb(p, bytes);
return p;
#if USE_TCACHE
}
#endif
}

/* place chunk in bin */

//如果victim的大小在small bin的范围内就放入small bin中
if (in_smallbin_range(size))
{
victim_index = smallbin_index(size);
bck = bin_at(av, victim_index);
fwd = bck->fd;
}
else //否则就放入large bin中
{
victim_index = largebin_index(size);
bck = bin_at(av, victim_index);
fwd = bck->fd;

/* maintain large bins in sorted order */
if (fwd != bck)//如果large bin链表不为空
{
/* Or with inuse bit to speed comparisons */
//将PREV_INUSE置1,加速比较
size |= PREV_INUSE;
/* if smaller than smallest, bypass loop below */
//确保bck->bk在main arena中
assert(chunk_main_arena(bck->bk));
//如果victim比链表中的最后一个也就是最小的一个还小,就直接插入最后
if ((unsigned long)(size) < (unsigned long)chunksize_nomask(bck->bk))
{
fwd = bck;
bck = bck->bk;

victim->fd_nextsize = fwd->fd;
victim->bk_nextsize = fwd->fd->bk_nextsize;
fwd->fd->bk_nextsize = victim->bk_nextsize->fd_nextsize = victim;
}
else
{
assert(chunk_main_arena(fwd));
//找到第一个不大于victim的chunk,即找victim的位置
while ((unsigned long)size < chunksize_nomask(fwd))
{
fwd = fwd->fd_nextsize;
assert(chunk_main_arena(fwd));
}

//如果victim和找到的chunk一样大就不用插入nextsize循环链表
if ((unsigned long)size == (unsigned long)chunksize_nomask(fwd))
/* Always insert in the second position. */
fwd = fwd->fd;//victim插在与之相同大小的chunk的后面
else //如果victim和找到的chunk不一样大就需要插入nextsize循环链表
{
victim->fd_nextsize = fwd;
victim->bk_nextsize = fwd->bk_nextsize;
fwd->bk_nextsize = victim;
victim->bk_nextsize->fd_nextsize = victim;
}
bck = fwd->bk;
}
}
//如果large bin链表为空,直接插入,让victim的fd_nextsize和bk_nextsize都指向自己
else
victim->fd_nextsize = victim->bk_nextsize = victim;
}

//利用之前找到的bck和fwd的值将victim插入循环链表中
mark_bin(av, victim_index);
victim->bk = bck;
victim->fd = fwd;
fwd->bk = victim;
bck->fd = victim;

//如果已经将足够多的chunk放入tcache就直接从tcache中找到chunk并返回
#if USE_TCACHE
/* If we've processed as many chunks as we're allowed while
filling the cache, return one of the cached ones. */
++tcache_unsorted_count;
if (return_cached && mp_.tcache_unsorted_limit > 0 && tcache_unsorted_count > mp_.tcache_unsorted_limit)
{
return tcache_get(tc_idx);
}
#endif

#define MAX_ITERS 10000
//while循环最多执行10000次
if (++iters >= MAX_ITERS)
break;
}

//while循环结束后判断是否有chunk被放进tcache,如果有就从tcache中取出chunk并返回
#if USE_TCACHE
/* If all the small chunks we found ended up cached, return one now. */
if (return_cached)
{
return tcache_get(tc_idx);
}
#endif

在这个遍历unsorted bin的过程中,不仅找了所需大小的chunk,还对unsorted bin中的chunk进行了整理,把他们分别分配到了各自该在的bin中,这也是给chunk找家的唯一代码。

到这里,用到tcache的地方就结束了,可以看到放入tcache中的所有chunk都是和被需要的chunk相同大小的,这是因为在很多情况下,刚被需要的内存块大小有更大的概率被继续需要,所以在tcache链未满的情况下把所有与被需要的内存块大小相同的chunk放入tcache,并在分配时优先从tcache中找,可以有效地提高内存分配的效率。

large bin2

large bin也在bins数组中,一共有63个,不同的是每个large bin中的chunk并不是大小相同的,而是存储一定范围的chunk,每个large bin存储的范围都不同。

查找完unsorted bin后就真正开始从large bin中找,因为large bin是有序链表,第一个最大,依次减小,所以就依次扫描链表,直到找到一个符合要求的最小chunk。

/*
If a large request, scan through the chunks of current bin in
sorted order to find smallest that fits. Use the skip list for this.
*/

if (!in_smallbin_range(nb))
{
bin = bin_at(av, idx);//idx在之前被赋值,idx = largebin_index(nb)

/* skip scan if empty or largest chunk is too small */
//如果large bin非空并且large bin中最大的chunk不小于nb
if ((victim = first(bin)) != bin && (unsigned long)chunksize_nomask(victim) >= (unsigned long)(nb))
{
victim = victim->bk_nextsize;
//反向遍历链表,从最小的开始找,找到第一个不小于nb的chunk
while (((unsigned long)(size = chunksize(victim)) <
(unsigned long)(nb)))
victim = victim->bk_nextsize;

/* Avoid removing the first entry for a size so that the skip
list does not have to be rerouted. */
//如果victim不是链表的最后一个,并且victim与之普通链上的下一个chunk大小相同
//就使用下一个chunk,因为该chunk不在nextsize链上,就不用改动nextsize指针
//注:前面找到的第一个不小于nb的chunk是在nextsize链上找到的,所以该chunk普通链上
//的下一个chunk是有可能与之大小相同的,正如前面所说,如果插入的时候大小相同就插在后面
if (victim != last(bin) && chunksize_nomask(victim) == chunksize_nomask(victim->fd))
victim = victim->fd;

remainder_size = size - nb;//剩余部分的大小
unlink(av, victim, bck, fwd);//将victim从链表中取出

/* Exhaust */
//如果剩余部分不够一个chunk
if (remainder_size < MINSIZE)
{
set_inuse_bit_at_offset(victim, size);
if (av != &main_arena)
set_non_main_arena(victim);
}
/* Split */
else //否则就进行拆分
{
remainder = chunk_at_offset(victim, nb);
/* We cannot assume the unsorted list is empty and therefore
have to perform a complete insert here. */
bck = unsorted_chunks(av);
fwd = bck->fd;
if (__glibc_unlikely(fwd->bk != bck))
malloc_printerr("malloc(): corrupted unsorted chunks");
//将剩余部分插到unsorted bin的第一个
remainder->bk = bck;
remainder->fd = fwd;
bck->fd = remainder;
fwd->bk = remainder;
//如果属于large bin的范畴就将nextsize指针置空
if (!in_smallbin_range(remainder_size))
{
remainder->fd_nextsize = NULL;
remainder->bk_nextsize = NULL;
}
//设置标识位
set_head(victim, nb | PREV_INUSE |
(av != &main_arena ? NON_MAIN_ARENA : 0));
set_head(remainder, remainder_size | PREV_INUSE);
set_foot(remainder, remainder_size);
}
//检查并返回victim
check_malloced_chunk(av, victim, nb);
void *p = chunk2mem(victim);
alloc_perturb(p, bytes);
return p;
}
}

block

一个block就是连续的32个bin,binmap中每个block对应一个int(即map),用其中每一位标记对应bin里是否有空闲的chunk。通过这种机制可以批量地查看bin的状态,避免了逐个遍历,提高了检索速度。

   /*
Search for a chunk by scanning bins, starting with next largest
bin. This search is strictly by best-fit; i.e., the smallest
(with ties going to approximately the least recently used) chunk
that fits is selected.

The bitmap avoids needing to check that most blocks are nonempty.
The particular case of skipping all bins during warm-up phases
when no chunks have been returned yet is faster than it might look.
*/

++idx;//下一个large bin
bin = bin_at(av, idx);
// #define idx2block(i) ((i) >> BINMAPSHIFT)
//BINMAPSHIFT = 5
//binmap用一个整数的每一个位来标识每个bin中是否有空闲的chunk
//因为一个int是32位,所以一个map能标识32个bin的状态,所以idx右移5位就是map的索引
//如100010表示第2和第6个bin中有空闲的chunk
block = idx2block(idx);
map = av->binmap[block];
// #define idx2bit(i) ((1U << ((i) & ((1U << BINMAPSHIFT) - 1))))
//实际上就是把1左移idx位,但是如果idx超过31就越了int的界
//但是在实际执行时是1<<(idx%32),如1<<35 == 1<<3 == 8
bit = idx2bit(idx);

for (;;)
{
/* Skip rest of block if there are no more set bits in this block. */
//bit > map means every bin at position >= (idx % 32) in this block is empty
//and by the bin ordering rule, bins with larger indices hold larger chunks
//so bit > map means the current block has no usable bin
if (bit > map || bit == 0)
{
//if the current block has nothing usable, advance to the next block until its binmap is nonzero, i.e. it contains a usable bin
do
{
if (++block >= BINMAPSIZE) /* out of bins */
goto use_top; //past the last bin: fall back to the top chunk
} while ((map = av->binmap[block]) == 0);

//shifting block left by 5 inverts idx2block, locating the first bin of this block
bin = bin_at(av, (block << BINMAPSHIFT));
bit = 1; //start from the first bin of the block
}

/* Advance to bin with set bit. There must be one. */
//(bit & map) == 0 means the current bin is empty
while ((bit & map) == 0)
{
bin = next_bin(bin);
bit <<= 1; //shift left by 1 to move to the next bin
assert(bit != 0);
}

/* Inspect the bin. It is likely to be non-empty */
//start from the last, i.e. the smallest, chunk in the bin
victim = last(bin);

/* If a false alarm (empty bin), clear the bit. */
//if the bin turns out to be empty, clear its bit in the map and try the next bin
if (victim == bin)
{
av->binmap[block] = map &= ~bit; /* Write through */
bin = next_bin(bin);
bit <<= 1;
}

//otherwise take the chunk, then split and relink it just as in the large bin case
//this bin's index is larger than the one originally derived from nb, so chunksize is guaranteed to exceed nb
else
{
size = chunksize(victim);

/* We know the first chunk in this bin is big enough to use. */
assert((unsigned long)(size) >= (unsigned long)(nb));

remainder_size = size - nb;

/* unlink */
unlink(av, victim, bck, fwd);

/* Exhaust */
if (remainder_size < MINSIZE)
{
set_inuse_bit_at_offset(victim, size);
if (av != &main_arena)
set_non_main_arena(victim);
}

/* Split */
else
{
remainder = chunk_at_offset(victim, nb);

/* We cannot assume the unsorted list is empty and therefore
have to perform a complete insert here. */
bck = unsorted_chunks(av);
fwd = bck->fd;
if (__glibc_unlikely(fwd->bk != bck))
malloc_printerr("malloc(): corrupted unsorted chunks 2");
remainder->bk = bck;
remainder->fd = fwd;
bck->fd = remainder;
fwd->bk = remainder;

/* advertise as last remainder */
if (in_smallbin_range(nb))
av->last_remainder = remainder;
if (!in_smallbin_range(remainder_size))
{
remainder->fd_nextsize = NULL;
remainder->bk_nextsize = NULL;
}
set_head(victim, nb | PREV_INUSE |
(av != &main_arena ? NON_MAIN_ARENA : 0));
set_head(remainder, remainder_size | PREV_INUSE);
set_foot(remainder, remainder_size);
}
check_malloced_chunk(av, victim, nb);
void *p = chunk2mem(victim);
alloc_perturb(p, bytes);
return p;
}
}

top chunk

The top chunk is a single chunk sitting at the very top of an arena. When none of the bins can satisfy a request, a chunk is split off the top chunk to serve it; if even the top chunk is too small, sysmalloc is called to grow the heap.

  use_top:
/*
If large enough, split off the chunk bordering the end of memory
(held in av->top). Note that this is in accord with the best-fit
search rule. In effect, av->top is treated as larger (and thus
less well fitting) than any other available chunk since it can
be extended to be as large as necessary (up to system
limitations).

We require that av->top always exists (i.e., has size >=
MINSIZE) after initialization, so if it would otherwise be
exhausted by current request, it is replenished. (The main
reason for ensuring it exists is that we may need MINSIZE space
to put in fenceposts in sysmalloc.)
*/

victim = av->top;
size = chunksize(victim);

//if the top chunk, minus nb, can still stand as an independent chunk,
//split an nb-sized chunk off the top chunk
if ((unsigned long)(size) >= (unsigned long)(nb + MINSIZE))
{
remainder_size = size - nb;
remainder = chunk_at_offset(victim, nb);
av->top = remainder; //the remainder stays on as the top chunk
//set the chunk headers
set_head(victim, nb | PREV_INUSE |
(av != &main_arena ? NON_MAIN_ARENA : 0));
set_head(remainder, remainder_size | PREV_INUSE);

//return victim
check_malloced_chunk(av, victim, nb);
void *p = chunk2mem(victim);
alloc_perturb(p, bytes);
return p;
}

/* When we are using atomic ops to free fast chunks we can get
here for all block sizes. */
//if the top chunk is too small but fast chunks exist, consolidate the fast bins
else if (atomic_load_relaxed(&av->have_fastchunks))
{
malloc_consolidate(av); //merge fast bin chunks, possibly back into the top chunk
/* restore original bin index */
//recompute the bin index that nb belongs to
if (in_smallbin_range(nb))
idx = smallbin_index(nb);
else
idx = largebin_index(nb);
}

/*
Otherwise, relay to handle system-dependent cases
*/
//otherwise the heap is exhausted: call sysmalloc to request memory from the OS
else
{
void *p = sysmalloc(nb, av);
if (p != NULL)
alloc_perturb(p, bytes);
return p;
}
} //end of the big loop
}

Summary

The allocator first checks whether the tcache holds a usable chunk; if so, it is returned directly, otherwise allocation continues in _int_malloc. There, if the arena is unusable, sysmalloc is called to request memory from the OS and the result is returned; if the arena is usable, the search through the bins begins. The fast bins come first: if a fast bin holds a chunk of the requested size, that chunk is returned, and the remaining chunks on the same fast bin list (i.e. of the same size) are moved into the tcache until it is full. Otherwise the small bins are searched, following the same procedure as the fast bins. If the small bins also fail, execution enters a big loop that runs until a chunk is found.

Inside the loop, the unsorted bin is traversed first: a chunk of exactly the requested size is preferentially stashed in the tcache, and only returned directly once the tcache is full; chunks of other sizes are sorted into the small or large bins. After the traversal, if any chunk was stashed in the tcache along the way, one is fetched from the tcache and returned; otherwise the search moves on to the large bins. If the best-fitting chunk exists in the corresponding large bin, it is split and returned; if not, the binmap is used to scan the larger bins in bulk, and any usable chunk found there is split and returned. Failing all that, the top chunk is the last resort. If even the top chunk is too small, the allocator first checks for fast chunks: if there are any, the fast bins are consolidated and the big loop restarts; if there are none, sysmalloc is called to ask the operating system for help. The flow chart is as follows:

malloc.png
