Python Memory Management


How CPython manages memory with PyMalloc, object pools, and reference counting


Python Memory Architecture

CPython uses a hierarchical memory management system optimized for the allocation patterns of typical Python programs.

PyMalloc Memory Pools

[Interactive visualization: PyMalloc pools by size class, with live block usage and a "Simulate Allocation" control]

Each size class serves a characteristic kind of object:

  • 8 B — int, bool
  • 16 B — float
  • 24 B — small str
  • 32 B — tuple
  • 40 B — small list
  • 48 B — small dict
  • 64 B — object
  • 128 B — medium str
  • 256 B — large list
  • 512 B — large dict
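In place of the interactive allocation demo, here is a toy sketch of the same idea, assuming the 4 KB pools and 8-byte size-class steps described in the architecture section below. `Pool` and `round_up_to_size_class` are illustrative names, not CPython internals:

class Pool:
    """Toy model of a pymalloc pool: fixed-size blocks of one size class."""
    def __init__(self, block_size, pool_size=4096):
        self.block_size = block_size
        self.capacity = pool_size // block_size
        self.used = 0

    def allocate(self):
        if self.used < self.capacity:
            self.used += 1
            return True
        return False  # pool full; pymalloc would fall back to another pool

def round_up_to_size_class(size):
    """Round a request up to the next multiple of 8 (the size-class step)."""
    return (size + 7) // 8 * 8

pools = {size: Pool(size) for size in range(8, 513, 8)}

request = 20                                   # e.g. a 20-byte object
size_class = round_up_to_size_class(request)   # -> 24
pools[size_class].allocate()
print(f"{request}B request -> {size_class}B pool "
      f"({pools[size_class].used}/{pools[size_class].capacity} blocks used)")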

Memory Optimization Tips

  • Use `__slots__` to reduce memory overhead for classes
  • Reuse objects when possible (especially small integers and strings)
  • Use generators for large datasets to avoid loading everything into memory
  • Profile memory usage with tools like memory_profiler or tracemalloc

Memory Management Layers

1. System Allocator

  • Used for large objects (greater than 512 bytes)
  • Direct calls to system malloc()/free()
  • No Python-specific optimizations

2. PyMalloc (Object Allocator)

  • Handles small objects (512 bytes or less)
  • Reduces fragmentation
  • Faster than system malloc for small allocations

3. Object-Specific Allocators

  • Specialized allocators for ints, lists, dicts
  • Free lists for common types
  • Object pools for frequently created/destroyed objects
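To peek at these layers on CPython, there is a private, CPython-only helper: `sys._debugmallocstats()` dumps pymalloc's arena, pool, and block statistics to stderr (the output format varies between versions):

import sys

# CPython-only private helper: prints pymalloc arena/pool/block
# statistics to stderr. Output format varies across versions.
junk = [object() for _ in range(10_000)]  # populate some small-object pools
sys._debugmallocstats()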

PyMalloc Architecture

Memory Hierarchy

Arena (256 KB)
├── Pool 1 (4 KB) - 8-byte blocks
├── Pool 2 (4 KB) - 16-byte blocks
├── Pool 3 (4 KB) - 24-byte blocks
└── ... up to 512-byte blocks
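A quick back-of-the-envelope check of what this hierarchy implies for capacity (real pools hold slightly fewer blocks, because pymalloc reserves a small pool header):

POOL_SIZE = 4 * 1024  # 4 KB per pool, as in the hierarchy above

for block_size in (8, 16, 24, 32, 64, 128, 256, 512):
    print(f"{block_size:>3}-byte blocks: up to {POOL_SIZE // block_size} per pool")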

Allocation Strategy

# Small allocation (<= 512 bytes)
def allocate_small(size):
    size_class = round_up_to_multiple_of_8(size)
    pool = find_pool_for_size_class(size_class)
    if pool.has_free_block():
        return pool.allocate_block()
    else:
        return create_new_pool(size_class).allocate_block()

# Large allocation (> 512 bytes)
def allocate_large(size):
    return system_malloc(size)

Reference Counting

How It Works

Every Python object has a reference count:

a = []        # refcount = 1
b = a         # refcount = 2
c = [a, a]    # refcount = 4
del b         # refcount = 3
c = None      # refcount = 1
del a         # refcount = 0 → object freed

Checking Reference Counts

import sys

obj = "hello"
print(sys.getrefcount(obj))  # Shows count + 1 (temporary ref)

# Reference count changes
x = [1, 2, 3]
print(sys.getrefcount(x))    # 2 (x + temporary)
y = x
print(sys.getrefcount(x))    # 3 (x + y + temporary)
container = [x, x, x]
print(sys.getrefcount(x))    # 6 (x + y + 3 in container + temporary)

Object Caching

Small Integer Cache

# Integers from -5 to 256 are cached
a = 256
b = 256
print(a is b)  # True - same cached object

c = 257
d = 257
print(c is d)  # Often False - outside the cache; but the compiler may
               # fold equal constants in one code object, so don't rely on it

e = 100
f = 100
print(e is f)  # True - within the cached range (-5 to 256)

String Interning

# Short strings are often interned
a = "hello"
b = "hello"
print(a is b)  # True - interned

# Longer strings may not be
c = "hello world this is a long string"
d = "hello world this is a long string"
print(c is d)  # May be False

# Force interning
import sys
e = sys.intern("long string to intern")
f = sys.intern("long string to intern")
print(e is f)  # True - forced interning

Free Lists

CPython maintains free lists for common types:

# Lists reuse memory
lists = []
for i in range(1000):
    lists.append([1, 2, 3])

# Delete all - memory goes to the free list
del lists

# New lists reuse the freed memory
new_lists = []
for i in range(1000):
    new_lists.append([4, 5, 6])  # Faster allocation
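You can sometimes observe free-list reuse directly: deleting a list and then creating a new one often yields the same address. This is a CPython implementation detail, not a guarantee:

a = [1, 2, 3]
old_addr = id(a)
del a                     # the list object goes back on the free list

b = [4, 5, 6]             # often reuses the block 'a' occupied
print(id(b) == old_addr)  # frequently True on CPython, but not guaranteed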

Memory Profiling

Using tracemalloc

import tracemalloc

# Start tracing
tracemalloc.start()

# Your code here
data = [i for i in range(1000000)]

# Get current memory usage
current, peak = tracemalloc.get_traced_memory()
print(f"Current: {current / 1024 / 1024:.1f} MB")
print(f"Peak: {peak / 1024 / 1024:.1f} MB")

# Get top memory users
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')
for stat in top_stats[:3]:
    print(stat)

tracemalloc.stop()

Using memory_profiler

# Install: pip install memory-profiler
from memory_profiler import profile

@profile
def memory_hungry_function():
    a = [1] * (10 ** 6)
    b = [2] * (2 * 10 ** 7)
    del b
    return a

# Run with: python -m memory_profiler script.py

Memory Optimization Techniques

1. Use `__slots__`

# Without __slots__ - uses dict
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

# Memory per instance: ~296 bytes

# With __slots__ - fixed attributes
class PointOptimized:
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y

# Memory per instance: ~56 bytes (5x less!)
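The exact figures vary by Python version; a rough way to compare the two classes above yourself, keeping in mind that `sys.getsizeof` is shallow, so the instance's `__dict__` must be counted separately:

import sys

p = Point(1, 2)
po = PointOptimized(1, 2)

# Plain instance plus its attribute dict (shallow sizes; version-dependent)
print(sys.getsizeof(p) + sys.getsizeof(p.__dict__))
# The slotted instance has no __dict__ at all
print(sys.getsizeof(po))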

2. Use Generators

# Bad: Creates entire list in memory
def get_squares(n):
    return [x**2 for x in range(n)]

# Good: Generates values on demand
def get_squares_gen(n):
    return (x**2 for x in range(n))

# Memory comparison
import sys

list_squares = get_squares(1000000)
print(sys.getsizeof(list_squares))  # ~8.5 MB (the list itself; element ints add more)

gen_squares = get_squares_gen(1000000)
print(sys.getsizeof(gen_squares))   # ~120 bytes, regardless of n

3. String Operations

items = [1, 2, 3]  # any iterable of values

# Bad: Creates many intermediate strings
result = ""
for item in items:
    result += str(item) + ", "

# Good: Single allocation
result = ", ".join(str(item) for item in items)

4. Reuse Objects

# 'data' and 'transform' are assumed to be defined elsewhere

# Bad: Creates new list each time
def process_data():
    temp = []
    for item in data:
        temp.append(transform(item))
    return temp

# Good: Reuse with clear()
temp_buffer = []

def process_data_optimized():
    temp_buffer.clear()
    for item in data:
        temp_buffer.append(transform(item))
    return temp_buffer.copy()

Memory Leaks in Python

Common Causes

  1. Circular References (handled by GC)
  2. Global Caches that grow indefinitely
  3. Unclosed Resources (files, connections)
  4. Large or Mutable Default Arguments (evaluated once and kept alive for the life of the function; see the example below)
# Memory leak example
cache = {}

def cached_computation(x):
    if x not in cache:
        # expensive_computation: assumed to be defined elsewhere
        cache[x] = expensive_computation(x)
    return cache[x]
# Cache grows without bound!

# Fixed version with LRU cache
from functools import lru_cache

@lru_cache(maxsize=128)
def cached_computation_fixed(x):
    return expensive_computation(x)
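The fourth cause deserves its own illustration: a mutable default argument is created once, at function definition time, and then shared (and grown) across every call:

# Bad: the default list is created once and shared across calls
def append_log(entry, log=[]):
    log.append(entry)
    return log

append_log("a")
print(append_log("b"))        # ['a', 'b'] - the default keeps growing

# Good: use None as a sentinel and allocate per call
def append_log_fixed(entry, log=None):
    if log is None:
        log = []
    log.append(entry)
    return log

print(append_log_fixed("b"))  # ['b']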

Best Practices

  1. Profile Before Optimizing: Use tools to find actual bottlenecks
  2. Prefer Built-in Types: They're optimized in C
  3. Use Context Managers: Ensure cleanup via `with` statements
  4. Limit Cache Sizes: Use lru_cache or similar
  5. Consider Data Types: array.array for homogeneous data (see the sketch below)
  6. Lazy Loading: Don't load data until needed
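For practice 5, a quick comparison shows why array.array helps with homogeneous data: it stores packed machine values instead of pointers to full Python int objects.

import array
import sys

n = 1_000_000
as_list = list(range(n))
as_array = array.array('i', range(n))  # 'i': signed 32-bit C ints

print(sys.getsizeof(as_list))   # ~8 MB of pointers (the int objects cost extra)
print(sys.getsizeof(as_array))  # ~4 MB of packed 32-bit values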

Key Takeaways

  • PyMalloc optimizes small object allocation
  • Reference counting is the primary memory management mechanism
  • Object caching improves performance for common values
  • Free lists reduce allocation overhead
  • `__slots__` can significantly reduce memory usage
  • Generators provide memory-efficient iteration
  • Profile to find real memory issues

If you found this explanation helpful, consider sharing it with others.
