zephyr/doc/kernel/memory/pools.rst
Andy Ross 73cb9586ce k_mem_pool: Complete rework
This patch amounts to a mostly complete rewrite of the k_mem_pool
allocator, which had been the source of historical complaints vs. the
one easily available in newlib.  The basic design of the allocator is
unchanged (it's still a 4-way buddy allocator), but the implementation
has made different choices throughout.  Major changes:

Space efficiency: The old implementation required ~2.66 bytes per
"smallest block" in overhead, plus 16 bytes per log4 "level" of the
allocation tree, plus a global tracking struct of 32 bytes and a very
surprising 12 byte overhead (in struct k_mem_block) per active
allocation on top of the returned data pointer.  This new allocator
uses a simple bit array as the only per-block storage and places the
free list into the freed blocks themselves, requiring only ~1.33 bits
per smallest block, 12 bytes per level, 32 bytes globally and only 4
bytes of per-allocation bookkeeping.  And it puts more of the generated
tree into BSS, slightly reducing binary sizes for non-trivial pool
sizes (even as the code size itself has increased a tiny bit).
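
As a sketch of where the ~1.33 bits figure comes from (illustrative
arithmetic, not code from this patch): one status bit per block at every
level sums as a geometric series over the tree.

```c
#include <assert.h>

/* n smallest blocks imply n + n/4 + n/16 + ... status bits in total,
 * i.e. roughly (4/3) bits per smallest block. */
static unsigned tracking_bits(unsigned n_smallest)
{
	unsigned total = 0;

	for (unsigned n = n_smallest; n > 0; n /= 4) {
		total += n;
	}
	return total;
}
```

For example, tracking_bits(256) is 341, about 1.33 bits per smallest block.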

IRQ safe: atomic operations on the store have been cut down to be at
most "4 bit sets and dlist operations" (i.e. a few dozen
instructions), reducing latency significantly and allowing us to lock
against interrupts cleanly from all APIs.  Allocations and frees can
be done from ISRs now without limitation (well, obviously you can't
sleep, so "timeout" must be K_NO_WAIT).

Deterministic performance: there is no more "defragmentation" step
that must be manually managed.  Block coalescing is done synchronously
at free time and takes constant time (strictly log4(num_levels)), as
the detection of four free "partner bits" is just a simple shift and
mask operation.
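
That detection can be sketched as follows, assuming one free bit per
block with the four partners occupying an aligned group of bits
(illustrative, not the patch's exact code):

```c
#include <assert.h>
#include <stdint.h>

/* Partners of block i are blocks (i & ~3) .. (i & ~3) + 3; all four
 * free means the aligned 4-bit group in the bitmap reads 0xF. */
static int partners_all_free(uint32_t free_bits, int i)
{
	return ((free_bits >> (i & ~3)) & 0xFu) == 0xFu;
}
```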

Cleaner behavior with odd sizes.  The old code assumed that the
specified maximum size would be a power of four multiple of the
minimum size, making use of non-standard buffer sizes problematic.
This implementation re-aligns the sub-blocks at each level and can
handle situations where alignment restrictions mean fewer than 4x will
be available.  If you want precise layout control, you can still
specify the sizes rigorously.  It just doesn't break if you don't.

More portable: the original implementation made use of GNU assembler
macros embedded inline within C __asm__ statements.  Not all
toolchains are actually backed by a GNU assembler even when they
support the GNU assembly syntax.  This is pure C, albeit with some
hairy macros to expand the compile-time-computed values.

Related changes that had to be rolled into this patch for bisectability:

* The new allocator has a firm minimum block size of 8 bytes (to store
  the dlist_node_t).  It will "work" with smaller requested min_size
  values, but obviously makes no firm promises about layout or how
  many will be available.  Unfortunately many of the tests were
  written with very small 4-byte minimum sizes and to assume exactly
  how many they could allocate.  Bump the sizes to match the allocator
  minimum.

* The mbox and pipes API made use of the internals of k_mem_block and
  had to be ported to the new scheme.  Blocks no longer store a
  backpointer to the pool that allocated them (it's an integer ID in a
bitfield), so if you want to "nullify" them you have to use the
  data pointer.

* test_mbox_api had a bug where it was prematurely freeing k_mem_blocks
  that it sent through the mailbox.  This worked in the old allocator
  because the memory wouldn't be touched when freed, but now we stuff
  list pointers in there and the bug was exposed.

* Remove test_mpool_options: the options (related to defragmentation
  behavior) tested no longer exist.

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
2017-05-13 14:39:41 -04:00


.. _memory_pools_v2:

Memory Pools
############

A :dfn:`memory pool` is a kernel object that allows memory blocks
to be dynamically allocated from a designated memory region.
The memory blocks in a memory pool can be of any size,
thereby reducing the amount of wasted memory when an application
needs to allocate storage for data structures of different sizes.
The memory pool uses a "buddy memory allocation" algorithm
to efficiently partition larger blocks into smaller ones,
allowing blocks of different sizes to be allocated and released efficiently
while limiting memory fragmentation concerns.

.. contents::
    :local:
    :depth: 2

Concepts
********

Any number of memory pools can be defined. Each memory pool is referenced
by its memory address.

A memory pool has the following key properties:

* A **minimum block size**, measured in bytes.
  It must be at least 4X bytes long, where X is greater than 0.

* A **maximum block size**, measured in bytes.
  This should be a power of 4 times larger than the minimum block size.
  That is, "maximum block size" must equal "minimum block size" times 4^Y,
  where Y is greater than or equal to zero.

* The **number of maximum-size blocks** initially available.
  This must be greater than zero.

* A **buffer** that provides the memory for the memory pool's blocks.
  This must be at least "maximum block size" times
  "number of maximum-size blocks" bytes long.

The memory pool's buffer must be aligned to an N-byte boundary, where
N is a power of 2 larger than 2 (i.e. 4, 8, 16, ...). To ensure that
all memory blocks in the buffer are similarly aligned to this boundary,
the minimum block size must also be a multiple of N.
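
These constraints can be tied together with a small sketch (the helper
names are illustrative, not kernel APIs): the number of block sizes
follows from the 4^Y relationship, and the buffer must hold every
maximum-size block.

```c
#include <assert.h>
#include <stddef.h>

/* One block size ("level") per power of four between min and max */
static int pool_levels(size_t min_size, size_t max_size)
{
	int n = 1;

	while (max_size > min_size) {
		max_size /= 4;
		n++;
	}
	return n;
}

/* Minimum buffer length: every byte comes from a maximum-size block */
static size_t pool_buffer_size(size_t max_size, size_t n_max)
{
	return max_size * n_max;
}
```

For a pool with 64-byte minimum and 4096-byte maximum blocks,
pool_levels(64, 4096) is 4 (block sizes of 64, 256, 1024, and 4096 bytes),
and three maximum-size blocks need a buffer of 4096 * 3 = 12288 bytes.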
A thread that needs to use a memory block simply allocates it from a memory
pool. Following a successful allocation, the :c:data:`data` field
of the block descriptor supplied by the thread indicates the starting address
of the memory block. When the thread is finished with a memory block,
it must release the block back to the memory pool so the block can be reused.

If a block of the desired size is unavailable, a thread can optionally wait
for one to become available.
Any number of threads may wait on a memory pool simultaneously;
when a suitable memory block becomes available, it is given to
the highest-priority thread that has waited the longest.

Unlike a heap, more than one memory pool can be defined, if needed. For
example, different applications can use different memory pools; this
can help prevent one application from monopolizing memory by allocating
all of the available blocks.

Internal Operation
==================

A memory pool's buffer is an array of maximum-size blocks,
with no wasted space between the blocks.
Each of these "level 0" blocks is a *quad-block* that can be
partitioned into four smaller "level 1" blocks of equal size, if needed.
Likewise, each level 1 block is itself a quad-block that can be partitioned
into four smaller "level 2" blocks in a similar way, and so on.
Thus, memory pool blocks can be recursively partitioned into quarters
until blocks of the minimum size are obtained,
at which point no further partitioning can occur.

A memory pool keeps track of how its buffer space has been partitioned
using an array of *block set* data structures. There is one block set
for each partitioning level supported by the pool, or (to put it another way)
for each block size. A block set keeps track of all free blocks of its
associated size using an array of *quad-block status* data structures.

When an application issues a request for a memory block,
the memory pool first determines the size of the smallest block
that will satisfy the request, and examines the corresponding block set.
If the block set contains a free block, the block is marked as used
and the allocation process is complete.

If the block set does not contain a free block,
the memory pool attempts to create one automatically by splitting a free block
of a larger size or by merging free blocks of smaller sizes;
if a suitable block can't be created, the allocation request fails.
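
The "smallest block that will satisfy the request" step above can be
sketched like this (an illustrative helper, not a kernel API), where
level 0 holds the maximum-size blocks and each deeper level holds
quarter-size blocks:

```c
#include <assert.h>
#include <stddef.h>

/* Descend from the maximum block size while the next (quarter-size)
 * level still covers the request; return -1 if nothing is big enough. */
static int level_for_request(size_t max_size, int n_levels, size_t req)
{
	size_t size = max_size;
	int level = 0;

	if (req > max_size) {
		return -1;
	}
	while (level + 1 < n_levels && size / 4 >= req) {
		size /= 4;
		level++;
	}
	return level;
}
```

For a pool with 64-byte minimum and 4096-byte maximum blocks (four
levels), a 200-byte request lands on level 2, whose 256-byte blocks are
the closest matching size.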

The memory pool's merging algorithm cannot combine adjacent free
blocks of different sizes, nor can it merge adjacent free blocks of
the same size if they belong to different parent quad-blocks. As a
consequence, memory fragmentation issues can still be encountered when
using a memory pool.

When an application releases a previously allocated memory block, it is
combined synchronously with its three "partner" blocks if possible,
and recursively so up through the levels. This is done in constant
time, and quickly, so no manual "defragmentation" management is
needed.
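
The release-time coalescing described above can be sketched with
per-level free bitmaps (illustrative code, not the kernel's internal
implementation):

```c
#include <assert.h>
#include <stdint.h>

/* free_bits[lvl] holds one "free" bit per block at that level.
 * After marking block i free, merge any fully-free quad-block
 * (four aligned partner bits) into its parent, up to level 0. */
static void mark_free_and_coalesce(uint32_t free_bits[], int lvl, int i)
{
	for (;;) {
		int quad = i & ~3;

		free_bits[lvl] |= 1u << i;
		if (lvl == 0 || ((free_bits[lvl] >> quad) & 0xFu) != 0xFu) {
			return;
		}
		free_bits[lvl] &= ~(0xFu << quad);
		i = quad / 4;	/* parent block index, one level up */
		lvl--;
	}
}
```

The loop runs at most once per level, which is the constant-time bound
described above.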

Implementation
**************

Defining a Memory Pool
======================

A memory pool is defined using a variable of type :c:type:`struct k_mem_pool`.
However, since a memory pool also requires a number of variable-size data
structures to represent its block sets and the status of its quad-blocks,
the kernel does not support the run-time definition of a memory pool.
A memory pool can only be defined and initialized at compile time
by calling :c:macro:`K_MEM_POOL_DEFINE`.

The following code defines and initializes a memory pool that has 3 blocks
of 4096 bytes each, which can be partitioned into blocks as small as 64 bytes
and is aligned to a 4-byte boundary.
(That is, the memory pool supports block sizes of 4096, 1024, 256,
and 64 bytes.)

Observe that the macro defines all of the memory pool data structures,
as well as its buffer.

.. code-block:: c

    K_MEM_POOL_DEFINE(my_pool, 64, 4096, 3, 4);

Allocating a Memory Block
=========================

A memory block is allocated by calling :cpp:func:`k_mem_pool_alloc()`.

The following code builds on the example above, and waits up to 100 milliseconds
for a 200 byte memory block to become available, then fills it with zeroes.
A warning is issued if a suitable block is not obtained.

Note that the application will actually receive a 256 byte memory block,
since that is the closest matching size supported by the memory pool.

.. code-block:: c

    struct k_mem_block block;

    if (k_mem_pool_alloc(&my_pool, &block, 200, 100) == 0) {
        memset(block.data, 0, 200);
        ...
    } else {
        printf("Memory allocation time-out");
    }

Releasing a Memory Block
========================

A memory block is released by calling :cpp:func:`k_mem_pool_free()`.

The following code builds on the example above, and allocates a 75 byte
memory block, then releases it once it is no longer needed. (A 256 byte
memory block is actually used to satisfy the request.)

.. code-block:: c

    struct k_mem_block block;

    k_mem_pool_alloc(&my_pool, &block, 75, K_FOREVER);
    ... /* use memory block */
    k_mem_pool_free(&block);

Suggested Uses
**************

Use a memory pool to allocate memory in variable-size blocks.

Use memory pool blocks when sending large amounts of data from one thread
to another, to avoid unnecessary copying of the data.

APIs
****

The following memory pool APIs are provided by :file:`kernel.h`:

* :c:macro:`K_MEM_POOL_DEFINE`
* :cpp:func:`k_mem_pool_alloc()`
* :cpp:func:`k_mem_pool_free()`