Custom C++ Memory Pool for Fast Allocation from Heap
Today I will present custom memory pool design which can cut significant allocation time since performance is the greatest concern in C++ programs. The idea is pre-allocating a large block and giving fixed-size pieces to consumers later. This especially comes handy when your objects are of same size. Stay tuned for the details 😎
Memory Manager Design
The idea is simple. First, allocate space to accomodate N objects. Second, allocate them to consumers in order. Last, use a stack and push the deallocated blocks there for fast retrieval of empty blocks next time.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 |
template<class T, size_t MaxSize = 1000> class MemManager { private: T* m_data; std::array<T*, MaxSize> m_stack; size_t m_availableSlots; size_t m_maxAllocatedSlot; public: /* * @brief Allocates memory pool depending on the template arguments. */ MemManager() { // Allocate pool memory. m_data = reinterpret_cast<T*>(std::malloc(MaxSize * sizeof(T))); // Make all slots available. m_availableSlots = 0; m_maxAllocatedSlot = 0; } /* * @brief Allocates memory for one object of type T. * @return Returns the allocated pointer. */ void* Allocate() { // Check for slot availability. assert(m_maxAllocatedSlot < MaxSize || 0 < m_availableSlots);// TODO // Check if stack has available slots if (m_availableSlots) { // Return the next empty slot from the stack. return reinterpret_cast<void*>(m_stack[--m_availableSlots]); } else { // Make new slot available and return it. return m_data + m_maxAllocatedSlot++; } } /* * @brief Deallocates previously allocated memory slot. * @return Pointer to the memory to deallocate. */ void Deallocate(void* ptr) { // Check for pointer validity. assert(m_data <= ptr && ptr < m_data + MaxSize); // Place the pointer back to the stack. m_stack[m_availableSlots++] = reinterpret_cast<T*>(ptr); } /* * @brief Deallocates all resources. */ ~MemManager() { // Check if all slots are freed. assert(m_availableSlots == m_maxAllocatedSlot); // Deallocate pool memory. std::free(m_data); } }; |
MemManager is a template class so that you can use it well-tailored for your class.
In constructor, a fixed size block is reserved from heap memory. It is freed later in the destructor.
Note that m_maxAllocatedSlot represents the number of slots that are ever reserved. It starts with 0 and increases but never goes back. Check line 36 for reservation of a block for use. It reserves a block for the first time. When a block is returned as of line 49, it is pushed on top of the stack. When an allocation is requested next time, line 33 gives the recently deallocated line.
Don’t worry, those details will be clearer later.
Test Drive
Let’s create a class that has no use other than keeping an integer.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
class Dummy { int value; public: Dummy(int x) : value(x) { std::cout << "Dummy(" << x << ") @" << this << std::endl; } ~Dummy() { std::cout << "~Dummy(" << value << ") @" << this << std::endl; } void* operator new(size_t sz, MemManager<Dummy>& pool) { auto mem = pool.Allocate(); std::cout << "new(" << mem << ")" << std::endl; return mem; } void operator delete(void *mem, MemManager<Dummy>& pool) { std::cout << "delete(" << mem << ")" << std::endl; pool.Deallocate(mem); } }; |
Take a look at it. There are two special methods: operator new(...) and operator delete(...) for ease of use with our custom allocator class. They both use the provided allocator for finding a space.
Now the real test takes place below.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 |
int main() { MemManager<Dummy> pool; std::vector<Dummy*> allocated; allocated.reserve(4); // Allocate 4 elements in the pool. for (int i = 0; i < 4; ++i) { auto dummy = new(pool) Dummy(i); allocated.push_back(dummy); } // Deallocate 2 elements. for (int i = 0; i < 2; ++i) { allocated[i]->~Dummy(); Dummy::operator delete(allocated[i], pool); } // Reallocate 2 elements. for (int i = 0; i < 2; ++i) { auto dummy = new(pool) Dummy(i); allocated[i] = dummy; } // Free all allocated elements. for (int i = 0; i < 4; ++i) { allocated[i]->~Dummy(); Dummy::operator delete(allocated[i], pool); } return 0; } |
To visualize the behavior, I first allocated 4 elements, freed first 2 elements, then reallocated them back. In the end I deallocated everything. Please check the output created by this fuction below.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
new(012CF0F0) Dummy(0) @012CF0F0 new(012CF0F4) Dummy(1) @012CF0F4 new(012CF0F8) Dummy(2) @012CF0F8 new(012CF0FC) Dummy(3) @012CF0FC ~Dummy(0) @012CF0F0 delete(012CF0F0) ~Dummy(1) @012CF0F4 delete(012CF0F4) new(012CF0F4) Dummy(0) @012CF0F4 new(012CF0F0) Dummy(1) @012CF0F0 ~Dummy(0) @012CF0F4 delete(012CF0F4) ~Dummy(1) @012CF0F0 delete(012CF0F0) ~Dummy(2) @012CF0F8 delete(012CF0F8) ~Dummy(3) @012CF0FC delete(012CF0FC) |
Conclusion
For performance comparison between custom allocated pool and system managed pool, I allocated 10M elements and released them. Check the table below.
Dynamic Allocation Times(s) | Custom Allocation Times(s) |
7.869 | 5.182 |
7.943 | 4.997 |
7.813 | 5.028 |
~7.875 | ~5.069 |
On average 36% time is saved for this simple use case. This would vary depending on the usage pattern especially when there is huge fragmentation in memory. Use your RAM wisely 😉