Python Memory Management
How CPython manages memory with PyMalloc, object pools, and reference counting
Python Memory Architecture
CPython uses a hierarchical memory management system optimized for the allocation patterns of typical Python programs.
PyMalloc Memory Pools
PyMalloc groups small allocations into size classes, each served from pools of fixed-size blocks. An example snapshot of pool occupancy:

| Block size | Example occupancy | Typical objects |
|-----------|-------------------|-----------------|
| 8 B   | 45/64 blocks | int, bool |
| 16 B  | 32/64 blocks | float |
| 24 B  | 28/64 blocks | small str |
| 32 B  | 20/64 blocks | tuple |
| 40 B  | 15/64 blocks | small list |
| 48 B  | 10/64 blocks | small dict |
| 64 B  | 8/64 blocks  | object |
| 128 B | 4/32 blocks  | medium str |
| 256 B | 2/16 blocks  | large list |
| 512 B | 1/8 blocks   | large dict |
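The occupancy numbers above are illustrative. In a real 4 KB pool, the block count follows from the size class; a rough sketch of the arithmetic (ignoring the small pool header that CPython reserves at the start of each pool):

```python
POOL_SIZE = 4 * 1024  # pymalloc pools are 4 KB

def size_class(size: int) -> int:
    """Round a request up to the next multiple of 8, pymalloc's granularity."""
    return (size + 7) & ~7

for request in (1, 8, 13, 24, 100, 512):
    cls = size_class(request)
    print(f"request {request:>3} B -> class {cls:>3} B, "
          f"~{POOL_SIZE // cls} blocks per pool")
```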
Memory Optimization Tips
- Use `__slots__` to reduce per-instance memory overhead for classes
- Reuse objects when possible (especially small integers and strings)
- Use generators for large datasets to avoid loading everything into memory
- Profile memory usage with tools like memory_profiler or tracemalloc
Memory Management Layers
1. System Allocator
- Used for large objects (greater than 512 bytes)
- Direct calls to the system `malloc()`/`free()`
- No Python-specific optimizations
2. PyMalloc (Object Allocator)
- Handles small objects (512 bytes or smaller)
- Reduces fragmentation
- Faster than system malloc for small allocations
3. Object-Specific Allocators
- Specialized allocators for ints, lists, dicts
- Free lists for common types
- Object pools for frequently created/destroyed objects
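CPython can report the state of these layers directly: the interpreter-private helper `sys._debugmallocstats()` dumps arena, pool, and block statistics to stderr. A minimal sketch (CPython-only; the output format varies by version):

```python
import sys

# Allocate a burst of small objects so pymalloc has something to show.
junk = [object() for _ in range(10_000)]

# CPython-only: prints arena/pool/block statistics to stderr.
sys._debugmallocstats()
```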
PyMalloc Architecture
Memory Hierarchy
```
Arena (256 KB)
├── Pool 1 (4 KB) - 8-byte blocks
├── Pool 2 (4 KB) - 16-byte blocks
├── Pool 3 (4 KB) - 24-byte blocks
└── ... up to 512-byte blocks
```
Allocation Strategy
```python
# Illustrative pseudocode - names like find_pool_for_size_class are
# placeholders, not real CPython APIs.

# Small allocation (<= 512 bytes): served from a pymalloc pool
def allocate_small(size):
    size_class = round_up_to_multiple_of_8(size)
    pool = find_pool_for_size_class(size_class)
    if pool.has_free_block():
        return pool.allocate_block()
    return create_new_pool(size_class).allocate_block()

# Large allocation (> 512 bytes): handed straight to the system allocator
def allocate_large(size):
    return system_malloc(size)
```
Reference Counting
How It Works
Every Python object has a reference count:
```python
a = []        # refcount = 1
b = a         # refcount = 2
c = [a, a]    # refcount = 4 (two references held inside c)
del b         # refcount = 3
c = None      # refcount = 1 (both references in c are dropped)
del a         # refcount = 0 -> object freed
```
Checking Reference Counts
```python
import sys

obj = "hello"
print(sys.getrefcount(obj))  # Shows count + 1 (getrefcount's own temporary reference)

# Reference count changes
x = [1, 2, 3]
print(sys.getrefcount(x))    # 2 (x + temporary)

y = x
print(sys.getrefcount(x))    # 3 (x + y + temporary)

container = [x, x, x]
print(sys.getrefcount(x))    # 6 (x + y + 3 in container + temporary)
```
Object Caching
Small Integer Cache
```python
# Integers from -5 to 256 are cached
a = 256
b = 256
print(a is b)  # True - same cached object

c = 257
d = 257
print(c is d)  # Often False - outside the cache range
               # (constant folding within one script can still make it True)

# Any value inside the cache range always refers to the same object
e = 100
f = 100
print(e is f)  # True - cached
```
String Interning
```python
import sys

# Short, identifier-like strings are often interned
a = "hello"
b = "hello"
print(a is b)  # True - interned

# Longer strings (or ones containing spaces) may not be
c = "hello world this is a long string"
d = "hello world this is a long string"
print(c is d)  # May be False (e.g., when entered on separate REPL lines)

# Force interning
e = sys.intern("long string to intern")
f = sys.intern("long string to intern")
print(e is f)  # True - forced interning
```
Free Lists
CPython maintains free lists for common types:
```python
# Lists reuse memory
lists = []
for i in range(1000):
    lists.append([1, 2, 3])

# Delete all - the list objects go back to the free list
del lists

# New lists reuse the freed memory
new_lists = []
for i in range(1000):
    new_lists.append([4, 5, 6])  # Faster allocation
```
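The reuse can sometimes be observed directly: when a list is freed and another is allocated right away, CPython often hands back the same memory, so the new object gets the same `id()`. This is an implementation detail, not a guarantee:

```python
a = [1, 2, 3]
old_id = id(a)
del a          # the list object returns to the free list

b = [4, 5, 6]  # often served from the same free-list slot
print(id(b) == old_id)  # Frequently True on CPython, but not guaranteed
```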
Memory Profiling
Using tracemalloc
```python
import tracemalloc

# Start tracing
tracemalloc.start()

# Your code here
data = [i for i in range(1000000)]

# Get current memory usage
current, peak = tracemalloc.get_traced_memory()
print(f"Current: {current / 1024 / 1024:.1f} MB")
print(f"Peak: {peak / 1024 / 1024:.1f} MB")

# Get top memory users
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')
for stat in top_stats[:3]:
    print(stat)

tracemalloc.stop()
```
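tracemalloc can also diff two snapshots, which helps attribute growth between two points in a program. A short sketch using the standard `Snapshot.compare_to` API:

```python
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()

suspect = [object() for _ in range(100_000)]  # allocations to attribute

after = tracemalloc.take_snapshot()
for stat in after.compare_to(before, 'lineno')[:3]:
    print(stat)  # size delta plus the allocating file and line

tracemalloc.stop()
```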
Using memory_profiler
```python
# Install: pip install memory-profiler
from memory_profiler import profile

@profile
def memory_hungry_function():
    a = [1] * (10 ** 6)
    b = [2] * (2 * 10 ** 7)
    del b
    return a

# Run with: python -m memory_profiler script.py
```
Memory Optimization Techniques
1. Use `__slots__`
```python
# Without __slots__ - each instance carries a __dict__
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
# Memory per instance: ~296 bytes

# With __slots__ - fixed attribute layout, no __dict__
class PointOptimized:
    __slots__ = ('x', 'y')
    def __init__(self, x, y):
        self.x = x
        self.y = y
# Memory per instance: ~56 bytes (roughly 5x less!)
```
2. Use Generators
```python
import sys

# Bad: Creates the entire list in memory
def get_squares(n):
    return [x**2 for x in range(n)]

# Good: Generates values on demand
def get_squares_gen(n):
    return (x**2 for x in range(n))

# Memory comparison
list_squares = get_squares(1000000)
print(sys.getsizeof(list_squares))  # ~8.5 MB (the list itself; the int objects are extra)

gen_squares = get_squares_gen(1000000)
print(sys.getsizeof(gen_squares))   # ~120 bytes
```
3. String Operations
```python
items = range(100)  # any iterable of items

# Bad: Creates many intermediate strings
result = ""
for item in items:
    result += str(item) + ", "

# Good: Single allocation
result = ", ".join(str(item) for item in items)
```
4. Reuse Objects
```python
# Bad: Creates a new list on every call
def process_data(data):
    temp = []
    for item in data:
        temp.append(transform(item))  # transform() is a placeholder
    return temp

# Good: Reuse a persistent buffer with clear()
temp_buffer = []

def process_data_optimized(data):
    temp_buffer.clear()
    for item in data:
        temp_buffer.append(transform(item))
    return temp_buffer.copy()
```
Memory Leaks in Python
Common Causes
- Circular References (handled by GC)
- Global Caches that grow indefinitely
- Unclosed Resources (files, connections)
- Mutable Default Arguments that accumulate state across calls
```python
# Memory leak example: a module-level cache that grows without bound
cache = {}

def cached_computation(x):
    if x not in cache:
        cache[x] = expensive_computation(x)
    return cache[x]

# Fixed version with a bounded LRU cache
from functools import lru_cache

@lru_cache(maxsize=128)
def cached_computation_fixed(x):
    return expensive_computation(x)
```
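For the circular-reference case listed above, reference counting alone can never reach zero, so CPython's cycle collector reclaims the objects. A minimal demonstration with the standard `gc` module:

```python
import gc

class Node:
    def __init__(self):
        self.partner = None

# Build a reference cycle: a -> b -> a
a, b = Node(), Node()
a.partner = b
b.partner = a

del a, b  # refcounts stay above zero because of the cycle

unreachable = gc.collect()  # the cycle collector finds and frees them
print(f"collected {unreachable} objects")
```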
Best Practices
- Profile Before Optimizing: Use tools to find actual bottlenecks
- Prefer Built-in Types: They're optimized in C
- Use Context Managers: Ensure cleanup with `with` statements
- Limit Cache Sizes: Use `lru_cache` or similar
- Consider Data Types: `array.array` for homogeneous data (see the sketch below)
- Lazy Loading: Don't load data until needed
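To make the `array.array` point concrete: for large homogeneous numeric data, an array stores raw machine values instead of one Python object per element. A quick size comparison (a sketch; exact numbers vary by platform):

```python
import array
import sys

n = 1_000_000
as_list = list(range(n))               # 1M pointers, each to a separate int object
as_array = array.array('i', range(n))  # 'i' = C signed int, 4 bytes per element

print(f"list:  {sys.getsizeof(as_list):,} bytes (plus ~28 bytes per int object)")
print(f"array: {sys.getsizeof(as_array):,} bytes total")
```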
Key Takeaways
- PyMalloc optimizes small object allocation
- Reference counting is the primary memory-management mechanism, with the cycle collector as backup
- Object caching improves performance for common values
- Free lists reduce allocation overhead
- `__slots__` can significantly reduce memory usage
- Generators provide memory-efficient iteration
- Profile to find real memory issues