This patch amounts to a mostly complete rewrite of the k_mem_pool allocator, which had been the source of historical complaints vs. the one easily available in newlib. The basic design of the allocator is unchanged (it's still a 4-way buddy allocator), but the implementation has made different choices throughout. Major changes:

Space efficiency: The old implementation required ~2.66 bytes per "smallest block" in overhead, plus 16 bytes per log4 "level" of the allocation tree, plus a global tracking struct of 32 bytes and a very surprising 12 byte overhead (in struct k_mem_block) per active allocation on top of the returned data pointer. This new allocator uses a simple bit array as the only per-block storage and places the free list into the freed blocks themselves, requiring only ~1.33 bits per smallest block, 12 bytes per level, 32 bytes globally and only 4 bytes of per-allocation bookkeeping. And it puts more of the generated tree into BSS, slightly reducing binary sizes for non-trivial pool sizes (even as the code size itself has increased a tiny bit).

IRQ safe: atomic operations on the store have been cut down to be at most "4 bit sets and dlist operations" (i.e. a few dozen instructions), reducing latency significantly and allowing us to lock against interrupts cleanly from all APIs. Allocations and frees can be done from ISRs now without limitation (well, obviously you can't sleep, so "timeout" must be K_NO_WAIT).

Deterministic performance: there is no more "defragmentation" step that must be manually managed. Block coalescing is done synchronously at free time and takes constant time (strictly log4(num_levels)), as the detection of four free "partner bits" is just a simple shift and mask operation.

Cleaner behavior with odd sizes: the old code assumed that the specified maximum size would be a power of four multiple of the minimum size, making use of non-standard buffer sizes problematic. This implementation re-aligns the sub-blocks at each level and can handle situations where alignment restrictions mean fewer than 4x will be available. If you want precise layout control, you can still specify the sizes rigorously; it just doesn't break if you don't.

More portable: the original implementation made use of GNU assembler macros embedded inline within C __asm__ statements. Not all toolchains are actually backed by a GNU assembler, even when they support the GNU assembly syntax. This is pure C, albeit with some hairy macros to expand the compile-time-computed values.

Related changes that had to be rolled into this patch for bisectability:

* The new allocator has a firm minimum block size of 8 bytes (to store the dlist_node_t). It will "work" with smaller requested min_size values, but obviously makes no firm promises about layout or how many will be available. Unfortunately many of the tests were written with very small 4-byte minimum sizes and to assume exactly how many they could allocate. Bump the sizes to match the allocator minimum.

* The mbox and pipes APIs made use of the internals of k_mem_block and had to be ported to the new scheme. Blocks no longer store a backpointer to the pool that allocated them (it's an integer ID in a bitfield), so if you want to "nullify" them you have to use the data pointer.

* test_mbox_api had a bug where it was prematurely freeing k_mem_blocks that it sent through the mailbox. This worked in the old allocator because the memory wouldn't be touched when freed, but now we stuff list pointers in there and the bug was exposed.

* Remove test_mpool_options: the options (related to defragmentation behavior) tested no longer exist.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
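For illustration only, a minimal sketch of an ISR-side allocation under the new rules; the pool name `isr_pool` and the handler are hypothetical, and the only point being shown is that K_NO_WAIT is the required timeout in interrupt context:

    /* Hypothetical example: isr_pool is assumed to be defined elsewhere with
     * K_MEM_POOL_DEFINE().  In an ISR the timeout must be K_NO_WAIT, since
     * sleeping is not possible; frees are also legal from ISR context.
     */
    #include <kernel.h>

    void my_isr(void *arg)
    {
        struct k_mem_block block;

        if (k_mem_pool_alloc(&isr_pool, &block, 64, K_NO_WAIT) == 0) {
            /* ... fill block.data and hand the block off ... */
            k_mem_pool_free(&block);
        }
        /* else: no block available right now; an ISR cannot wait for one */
    }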
.. _memory_pools_v2:

Memory Pools
############

A :dfn:`memory pool` is a kernel object that allows memory blocks
to be dynamically allocated from a designated memory region.
The memory blocks in a memory pool can be of any size,
thereby reducing the amount of wasted memory when an application
needs to allocate storage for data structures of different sizes.
The memory pool uses a "buddy memory allocation" algorithm
to efficiently partition larger blocks into smaller ones,
allowing blocks of different sizes to be allocated and released efficiently
while limiting memory fragmentation concerns.

.. contents::
    :local:
    :depth: 2

Concepts
********

Any number of memory pools can be defined. Each memory pool is referenced
by its memory address.
A memory pool has the following key properties:

* A **minimum block size**, measured in bytes.
  It must be at least 4X bytes long, where X is greater than 0.

* A **maximum block size**, measured in bytes.
  This should be a power of 4 times larger than the minimum block size.
  That is, "maximum block size" must equal "minimum block size" times 4^Y,
  where Y is greater than or equal to zero.

* The **number of maximum-size blocks** initially available.
  This must be greater than zero.

* A **buffer** that provides the memory for the memory pool's blocks.
  This must be at least "maximum block size" times
  "number of maximum-size blocks" bytes long.

The memory pool's buffer must be aligned to an N-byte boundary, where
N is a power of 2 larger than 2 (i.e. 4, 8, 16, ...). To ensure that
all memory blocks in the buffer are similarly aligned to this boundary,
the minimum block size must also be a multiple of N.
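As a concrete illustration of these constraints, the following sketch computes
the buffer size needed for the pool used in the examples later in this
document (a 64-byte minimum block, a 4096-byte maximum block, and 3
maximum-size blocks); the macro names are illustrative only and are not part
of the kernel API:

.. code-block:: c

    /* Illustrative sizing only; these macros are not kernel APIs.
     * Minimum block size 64 (a multiple of the 4-byte alignment),
     * maximum block size 4096 = 64 * 4^3, and 3 maximum-size blocks.
     */
    #define MY_MIN_BLOCK_SIZE   64
    #define MY_MAX_BLOCK_SIZE   (MY_MIN_BLOCK_SIZE * 4 * 4 * 4)        /* 4096 */
    #define MY_NUM_MAX_BLOCKS   3
    #define MY_BUFFER_SIZE      (MY_MAX_BLOCK_SIZE * MY_NUM_MAX_BLOCKS) /* 12288 bytes */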
A thread that needs to use a memory block simply allocates it from a memory
pool. Following a successful allocation, the :c:data:`data` field
of the block descriptor supplied by the thread indicates the starting address
of the memory block. When the thread is finished with a memory block,
it must release the block back to the memory pool so the block can be reused.

If a block of the desired size is unavailable, a thread can optionally wait
for one to become available.
Any number of threads may wait on a memory pool simultaneously;
when a suitable memory block becomes available, it is given to
the highest-priority thread that has waited the longest.

Unlike a heap, more than one memory pool can be defined, if needed. For
example, different applications can utilize different memory pools; this
can help prevent one application from hijacking resources to allocate all
of the available blocks.
Internal Operation
==================

A memory pool's buffer is an array of maximum-size blocks,
with no wasted space between the blocks.
Each of these "level 0" blocks is a *quad-block* that can be
partitioned into four smaller "level 1" blocks of equal size, if needed.
Likewise, each level 1 block is itself a quad-block that can be partitioned
into four smaller "level 2" blocks in a similar way, and so on.
Thus, memory pool blocks can be recursively partitioned into quarters
until blocks of the minimum size are obtained,
at which point no further partitioning can occur.
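For example, in a pool whose maximum block size is 4096 bytes and minimum
block size is 64 bytes, the block size at partitioning level n is
4096 / 4^n, i.e. 4096, 1024, 256, and 64 bytes. A small sketch of that
relationship (illustrative only, not a kernel API):

.. code-block:: c

    #include <stddef.h>

    /* Illustrative only: block size at partitioning level 'level' is the
     * maximum block size divided by 4^level (e.g. 4096, 1024, 256, 64).
     */
    static size_t level_block_size(size_t max_block_size, int level)
    {
        while (level-- > 0) {
            max_block_size /= 4;
        }
        return max_block_size;
    }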
A memory pool keeps track of how its buffer space has been partitioned
using an array of *block set* data structures. There is one block set
for each partitioning level supported by the pool, or (to put it another way)
for each block size. A block set keeps track of all free blocks of its
associated size using an array of *quad-block status* data structures.
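A simplified, purely illustrative representation of this bookkeeping might
look like the following; the type and field names are hypothetical and do not
correspond to the kernel's actual definitions:

.. code-block:: c

    /* Hypothetical illustration only -- not the kernel's real structures. */
    #include <stddef.h>
    #include <stdint.h>

    struct quad_block_status {
        void *mem_base;        /* address of this quad-block's memory   */
        uint32_t free_mask;    /* which of its four sub-blocks are free */
    };

    struct block_set {
        size_t block_size;                 /* block size tracked by this set */
        struct quad_block_status *quads;   /* one entry per quad-block       */
        unsigned int num_quads;
    };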
When an application issues a request for a memory block,
the memory pool first determines the size of the smallest block
that will satisfy the request, and examines the corresponding block set.
If the block set contains a free block, the block is marked as used
and the allocation process is complete.
If the block set does not contain a free block,
the memory pool attempts to create one automatically by splitting a free block
of a larger size or by merging free blocks of smaller sizes;
if a suitable block can't be created, the allocation request fails.
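In rough pseudocode, the search described above might look like the following
sketch; the function and array names are illustrative, not part of the kernel,
and the merging of smaller free blocks mentioned above is omitted for brevity:

.. code-block:: c

    /* Illustrative sketch only.  free_count[l] is the number of free blocks
     * tracked at partitioning level l (level 0 = largest blocks).  A block
     * taken from a larger level l is then split (want_level - l) times to
     * produce a block of the requested size.
     */
    static int find_alloc_level(const int *free_count, int want_level)
    {
        for (int l = want_level; l >= 0; l--) {
            if (free_count[l] > 0) {
                return l;
            }
        }
        return -1;    /* no free block of this size or larger: request fails */
    }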
The memory pool's merging algorithm cannot combine adjacent free
blocks of different sizes, nor can it merge adjacent free blocks of
the same size if they belong to different parent quad-blocks. As a
consequence, memory fragmentation issues can still be encountered when
using a memory pool.

When an application releases a previously allocated memory block, it is
combined synchronously with its three "partner" blocks if possible,
and recursively so up through the levels. This is done in constant
time, and quickly, so no manual "defragmentation" management is
needed.
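The following sketch illustrates one way such bounded coalescing can be
expressed; the bit layout and helper names are assumptions for illustration
and are not the kernel's actual implementation:

.. code-block:: c

    /* Illustrative sketch only, not the kernel's implementation.  Assume each
     * level keeps one "free" bit per block in an array of 32-bit words, and
     * that a block and its three partners occupy the aligned group of four
     * bits (i & ~3) .. (i & ~3) + 3.
     */
    #include <stdbool.h>
    #include <stdint.h>

    static bool partners_free(const uint32_t *free_bits, int block)
    {
        int first = block & ~3;                 /* first of the four partners */
        uint32_t mask = 0xfu << (first % 32);   /* their four status bits     */

        /* All four free <=> all four bits set: one shift and one mask. */
        return (free_bits[first / 32] & mask) == mask;
    }

    static void free_block(uint32_t **free_bits, int level, int block)
    {
        /* Mark the block free, then merge upward while all four partners
         * are free; at most one merge per level, so the work is bounded
         * by the number of levels in the pool.
         */
        while (level > 0) {
            free_bits[level][block / 32] |= 1u << (block % 32);
            if (!partners_free(free_bits[level], block)) {
                return;
            }
            /* Clear the partner bits; their parent block becomes free. */
            free_bits[level][(block & ~3) / 32] &= ~(0xfu << ((block & ~3) % 32));
            block /= 4;
            level--;
        }
        free_bits[0][block / 32] |= 1u << (block % 32);
    }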
Implementation
**************

Defining a Memory Pool
======================

A memory pool is defined using a variable of type :c:type:`struct k_mem_pool`.
However, since a memory pool also requires a number of variable-size data
structures to represent its block sets and the status of its quad-blocks,
the kernel does not support the run-time definition of a memory pool.
A memory pool can only be defined and initialized at compile time
by calling :c:macro:`K_MEM_POOL_DEFINE`.

The following code defines and initializes a memory pool that has 3 blocks
of 4096 bytes each, which can be partitioned into blocks as small as 64 bytes
and is aligned to a 4-byte boundary.
(That is, the memory pool supports block sizes of 4096, 1024, 256,
and 64 bytes.)
Observe that the macro defines all of the memory pool data structures,
as well as its buffer.

.. code-block:: c

    K_MEM_POOL_DEFINE(my_pool, 64, 4096, 3, 4);
Allocating a Memory Block
=========================

A memory block is allocated by calling :cpp:func:`k_mem_pool_alloc()`.

The following code builds on the example above, and waits up to 100 milliseconds
for a 200 byte memory block to become available, then fills it with zeroes.
A warning is issued if a suitable block is not obtained.

Note that the application will actually receive a 256 byte memory block,
since that is the closest matching size supported by the memory pool.
.. code-block:: c

    struct k_mem_block block;

    if (k_mem_pool_alloc(&my_pool, &block, 200, 100) == 0) {
        memset(block.data, 0, 200);
        ...
    } else {
        printf("Memory allocation time-out");
    }
Releasing a Memory Block
========================

A memory block is released by calling :cpp:func:`k_mem_pool_free()`.

The following code builds on the example above, and allocates a 75 byte
memory block, then releases it once it is no longer needed. (A 256 byte
memory block is actually used to satisfy the request.)

.. code-block:: c

    struct k_mem_block block;

    k_mem_pool_alloc(&my_pool, &block, 75, K_FOREVER);
    ... /* use memory block */
    k_mem_pool_free(&block);
Suggested Uses
**************

Use a memory pool to allocate memory in variable-size blocks.

Use memory pool blocks when sending large amounts of data from one thread
to another, to avoid unnecessary copying of the data.

APIs
****

The following memory pool APIs are provided by :file:`kernel.h`:

* :c:macro:`K_MEM_POOL_DEFINE`
* :cpp:func:`k_mem_pool_alloc()`
* :cpp:func:`k_mem_pool_free()`