Python Object Model

Understanding PyObject, type system, and how Python objects work internally

Everything is a PyObject

In CPython, every Python object is represented as a C structure that starts with a common header. This unified object model enables Python's dynamic typing and introspection capabilities.

Python Code

x = 42

Memory Layout

Field         Value          Meaning
-----------   ------------   ----------------
ob_refcnt     1              Reference count
ob_type       &PyLong_Type   Type pointer
ob_size       1              Number of digits
ob_digit[0]   42             Actual value

Total size: 28 bytes

C Implementation

PyLongObject
  • Every Python object starts with PyObject_HEAD
  • ob_refcnt tracks the reference count for memory management
  • ob_type points to the object's type object (for 42, that is int)
  • Additional fields store the actual data

Common PyObject Operations

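Reference Counting
    Py_INCREF(obj);    /* take a new strong reference */
    Py_DECREF(obj);    /* release it; the object is deallocated at zero */
Type Checking
    PyLong_Check(obj); /* nonzero if obj is an int or an int subclass */
    Py_TYPE(obj);      /* the object's ob_type pointer */
Object Creation
    PyLong_FromLong(42);            /* new reference, or NULL on failure */
    PyUnicode_FromString("hello");  /* expects a UTF-8 encoded C string */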

The PyObject Structure

Base Object Header

    typedef struct _object {
        _PyObject_HEAD_EXTRA      // Debug builds only
        Py_ssize_t ob_refcnt;     // Reference count
        PyTypeObject *ob_type;    // Pointer to type object
    } PyObject;

Every Python object contains:

  1. Reference count: For memory management
  2. Type pointer: Points to the object's type
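
The header can be observed from Python itself. The following is a minimal sketch, assuming a standard CPython release build (not a Py_TRACE_REFS or free-threaded build) in which the type pointer sits directly after a pointer-sized reference count field. The helper name type_pointer_of is hypothetical and exists only for this illustration.

```python
import ctypes

def type_pointer_of(obj):
    """Read the ob_type field straight out of obj's header (CPython only)."""
    # In CPython, id(obj) is the object's memory address.
    address = id(obj)
    # Assumption: ob_type follows a pointer-sized ob_refcnt field.
    offset = ctypes.sizeof(ctypes.c_ssize_t)
    return ctypes.c_void_p.from_address(address + offset).value

values = [42, "hello", [1, 2, 3], None]
for value in values:
    # The raw pointer should equal the address of the value's type object.
    print(type(value).__name__, type_pointer_of(value) == id(type(value)))
```

Reading memory through ctypes like this is only suitable for exploration; ordinary code should use type(obj).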

Variable-Size Objects

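    typedef struct {
        PyObject ob_base;      // Standard header
        Py_ssize_t ob_size;    // Number of items
    } PyVarObject;

Used for objects whose size varies per instance, such as lists, tuples, and bytes. Strings are an exception: PyUnicodeObject starts with the plain PyObject header and keeps its own length field.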
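
The split between a fixed header and a per-item part is visible through a type's __basicsize__ and __itemsize__ attributes. This is a small sketch, assuming CPython, where each tuple slot holds one pointer; tuple_growth is a hypothetical helper name.

```python
import struct
import sys

pointer_size = struct.calcsize("P")  # size of a C pointer on this build

# __basicsize__ covers the fixed part, __itemsize__ each element slot.
print("basicsize:", tuple.__basicsize__, "itemsize:", tuple.__itemsize__)

def tuple_growth(n):
    """Extra bytes an n-item tuple reports compared with the empty tuple."""
    return sys.getsizeof(tuple(range(n))) - sys.getsizeof(())

for n in (1, 2, 5):
    print(n, "items ->", tuple_growth(n), "extra bytes")
```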

Common Object Types

Integer Objects (PyLongObject)

    x = 42

    # In C (CPython before 3.12; later versions fold the digit count
    # and sign into a single tag field):
    # PyLongObject {
    #     ob_refcnt: 1,
    #     ob_type: &PyLong_Type,
    #     ob_size: 1,
    #     ob_digit: [42]
    # }
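
The digit array explains why larger integers take more memory. A brief sketch, assuming CPython, using sys.int_info to find the digit width instead of hard-coding it:

```python
import sys

bits = sys.int_info.bits_per_digit       # bits stored per internal digit
digit_bytes = sys.int_info.sizeof_digit  # bytes used by one digit

one_digit = sys.getsizeof(1)
two_digits = sys.getsizeof(2 ** bits)          # smallest value needing a second digit
three_digits = sys.getsizeof(2 ** (2 * bits))  # smallest value needing a third

print(bits, digit_bytes)
print(one_digit, two_digits, three_digits)
```

Each additional digit adds sizeof_digit bytes to the reported size.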

String Objects (PyUnicodeObject)

    s = "hello"

    # Complex structure with:
    # - UTF-8 cached representation
    # - Compact storage for ASCII
    # - Hash value caching
    # - Interning support

List Objects (PyListObject)

    lst = [1, 2, 3]

    # PyListObject {
    #     ob_refcnt: 1,
    #     ob_type: &PyList_Type,
    #     ob_size: 3,        # Current size
    #     ob_item: [...],    # Pointer to array
    #     allocated: 4       # Allocated capacity
    # }
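
The gap between ob_size and allocated shows up when a list grows: its reported size changes in steps rather than on every append. A minimal sketch, with growth_steps as a hypothetical helper; the exact sizes depend on the CPython version and platform, so none are asserted here.

```python
import sys

def growth_steps(appends):
    """Return the distinct sizes a list reports while being appended to."""
    items = []
    sizes = [sys.getsizeof(items)]
    for i in range(appends):
        items.append(i)
        size = sys.getsizeof(items)
        if size != sizes[-1]:
            # The backing array was reallocated with spare capacity.
            sizes.append(size)
    return sizes

steps = growth_steps(64)
print(steps)
```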

Type Objects

The Type Type

    # Everything has a type
    type(5)       # <class 'int'>
    type(int)     # <class 'type'>
    type(type)    # <class 'type'> (metaclass)
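
Following type() repeatedly always ends at type, which is its own type. The sketch below walks that chain; type_chain is a hypothetical helper written for this example.

```python
def type_chain(obj):
    """Follow type() from obj until it stops changing."""
    chain = []
    current = type(obj)
    while not chain or current is not chain[-1]:
        chain.append(current)
        current = type(current)
    return chain

print(type_chain(5))
print(type_chain(int))

# type and object refer to each other: each is an instance of the other.
print(isinstance(object, type), isinstance(type, object))
```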

Type Object Structure

    typedef struct _typeobject {
        PyObject_VAR_HEAD
        const char *tp_name;          // "int", "str", etc.
        Py_ssize_t tp_basicsize;      // Instance size

        // Destructor
        destructor tp_dealloc;

        // Protocol methods
        reprfunc tp_repr;
        PyNumberMethods *tp_as_number;
        PySequenceMethods *tp_as_sequence;
        PyMappingMethods *tp_as_mapping;

        // Attribute access
        getattrofunc tp_getattro;
        setattrofunc tp_setattro;

        // ... many more fields
    } PyTypeObject;

Object Creation and Destruction

Creating Objects

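    # Python code
    obj = MyClass()

    # Internally, type.__call__ does the following:
    # 1. Call __new__ (tp_new)
    # 2. Inside tp_new, allocate memory (tp_alloc)
    # 3. Initialize the object header (refcount and type pointer)
    # 4. Call __init__ (tp_init) if __new__ returned an instance of MyClass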
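
The order of the two Python-level hooks can be checked directly. This is a small sketch using a hypothetical class Tracked that records each call; the manual sequence at the end only approximates what type.__call__ does.

```python
events = []

class Tracked:
    def __new__(cls, *args):
        events.append("__new__")
        return super().__new__(cls)  # object.__new__ allocates the instance

    def __init__(self, value):
        events.append("__init__")
        self.value = value

obj = Tracked(10)
print(events)

# Roughly what calling the class does, written out by hand:
manual = Tracked.__new__(Tracked, 20)
if isinstance(manual, Tracked):
    manual.__init__(20)
print(events)
```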

Object Lifecycle

    import sys

    # Creation
    x = [1, 2, 3]
    print(sys.getrefcount(x))  # 2 (x + temporary)

    # Reference increase
    y = x
    print(sys.getrefcount(x))  # 3

    # Reference decrease
    del y
    print(sys.getrefcount(x))  # 2

    # Destruction (when refcount reaches 0)
    del x  # Object deallocated
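
Deallocation itself can be observed with a weak reference callback. A minimal sketch, assuming CPython's reference counting, where an object is reclaimed as soon as its last reference disappears; Resource is a hypothetical class (lists do not support weak references).

```python
import weakref

class Resource:
    pass

log = []
r = Resource()
# The callback runs when the object is reclaimed.
weakref.finalize(r, log.append, "deallocated")

alias = r
del r
after_first_del = list(log)   # alias still keeps the object alive

del alias                     # last reference gone
after_second_del = list(log)

print(after_first_del, after_second_del)
```

Other Python implementations may delay the callback until a garbage collection pass.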

Special Methods and Slots

Mapping Python to C

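Python Method                   C Slot        Purpose
-----------------------------   -----------   --------------------------------------
__init__                        tp_init       Initialize instance
__new__                         tp_new        Create instance
__del__                         tp_finalize   Finalizer, run before tp_dealloc frees the object
__repr__                        tp_repr       String representation
__str__                         tp_str        Human-readable string
__getattribute__, __getattr__   tp_getattro   Attribute access
__setattr__, __delattr__        tp_setattro   Attribute setting and deletion
__call__                        tp_call       Make callable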
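
One visible consequence of this mapping is that built-ins consult the type's slot, not the instance's attributes. The sketch below uses a hypothetical class Box to show the difference.

```python
class Box:
    def __len__(self):
        return 3

box = Box()
# Assigning to the instance does not touch the type's slot.
box.__len__ = lambda: 99

via_builtin = len(box)         # goes through the type's slot
via_attribute = box.__len__()  # plain attribute lookup finds the instance value

# Replacing the method on the class updates the slot.
Box.__len__ = lambda self: 7
after_class_patch = len(box)

print(via_builtin, via_attribute, after_class_patch)
```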

Protocol Support

    # Number protocol
    class MyNumber:
        def __add__(self, other):       # tp_as_number->nb_add
            pass

        def __mul__(self, other):       # tp_as_number->nb_multiply
            pass

    # Sequence protocol
    class MySequence:
        def __len__(self):              # tp_as_sequence->sq_length
            pass

        def __getitem__(self, key):     # tp_as_sequence->sq_item
            pass
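
Implementing only __len__ and __getitem__ is enough for iteration and membership tests, because Python falls back to indexing from 0 until IndexError is raised. A short sketch with a hypothetical Squares class:

```python
class Squares:
    """Hypothetical sequence of the first n square numbers."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, index):
        if not 0 <= index < self.n:
            raise IndexError(index)
        return index * index

squares = Squares(4)
as_list = list(squares)   # iteration falls back to __getitem__
has_nine = 9 in squares   # membership falls back to iteration

print(len(squares), as_list, has_nine)
```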

Attribute Access

Descriptor Protocol

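    class Descriptor:
        def __set_name__(self, owner, name):
            # Store the value per instance, under a private name
            self._name = "_" + name

        def __get__(self, obj, objtype=None):
            if obj is None:
                return self  # Accessed on the class itself
            print(f"Getting from {obj}")
            return getattr(obj, self._name)

        def __set__(self, obj, value):
            print(f"Setting on {obj}")
            setattr(obj, self._name, value)

    class MyClass:
        attr = Descriptor()

    obj = MyClass()
    obj.attr = 42     # Calls __set__
    val = obj.attr    # Calls __get__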
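
Ordinary methods rely on the same protocol: a function stored on a class is a descriptor whose __get__ produces the bound method. A minimal sketch with a hypothetical Greeter class:

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self):
        return f"hello, {self.name}"

g = Greeter("Ada")

# On the class, greet is stored as a plain function object.
raw_function = Greeter.__dict__["greet"]

# Calling its __get__ by hand yields the bound method that g.greet gives.
bound = raw_function.__get__(g, Greeter)

print(bound())
print(bound.__self__ is g, bound.__func__ is raw_function)
```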

Method Resolution Order (MRO)

    class A: pass
    class B(A): pass
    class C(A): pass
    class D(B, C): pass

    print(D.__mro__)
    # (<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>,
    #  <class '__main__.A'>, <class 'object'>)
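
Python computes this order with the C3 linearization algorithm. The following is an illustrative sketch of the merge step, not CPython's actual implementation; c3_linearize is a hypothetical helper, and its result is compared against the built-in __mro__ for the same diamond hierarchy.

```python
def c3_linearize(cls):
    """Compute a method resolution order for cls using the C3 merge."""
    if cls is object:
        return [object]
    # Merge each base's linearization, followed by the list of bases itself.
    sequences = [c3_linearize(base) for base in cls.__bases__]
    sequences.append(list(cls.__bases__))
    result = [cls]
    while True:
        sequences = [seq for seq in sequences if seq]
        if not sequences:
            return result
        for seq in sequences:
            head = seq[0]
            # A head is acceptable if it is in no other sequence's tail.
            if not any(head in other[1:] for other in sequences):
                break
        else:
            raise TypeError("inconsistent hierarchy")
        result.append(head)
        sequences = [[c for c in seq if c is not head] for seq in sequences]

class A: pass
class B(A): pass
class C(A): pass
class D(B, C): pass

print([cls.__name__ for cls in c3_linearize(D)])
```

C comes before A because A appears in the tail of C's own linearization, so A cannot be chosen until C has been placed.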

Memory Layout Examples

Simple Object

    class Point:
        def __init__(self, x, y):
            self.x = x
            self.y = y

    p = Point(3, 4)

    # Memory layout (64-bit, approximate; details vary by CPython version):
    # PyObject header: 16 bytes
    # __dict__: 8 bytes (pointer)
    # __weakref__: 8 bytes (pointer)
    # Total: ~32 bytes + dict overhead

Optimized with __slots__

    class PointOptimized:
        __slots__ = ('x', 'y')

        def __init__(self, x, y):
            self.x = x
            self.y = y

    p = PointOptimized(3, 4)

    # Memory layout (64-bit, approximate):
    # PyObject header: 16 bytes
    # x: 8 bytes (pointer to the value)
    # y: 8 bytes (pointer to the value)
    # Total: ~32 bytes (no dict!)
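
The difference can be measured with sys.getsizeof, keeping in mind that it reports the instance only, so the attribute dictionary has to be added separately. This is a rough sketch; the class names are hypothetical and the exact byte counts vary by CPython version, so only the comparison is checked.

```python
import sys

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class PointSlots:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

regular = Point(3, 4)
slotted = PointSlots(3, 4)

# A regular instance also pays for its attribute dictionary.
regular_total = sys.getsizeof(regular) + sys.getsizeof(vars(regular))
slotted_total = sys.getsizeof(slotted)
print(regular_total, slotted_total)

try:
    slotted.z = 5   # no slot named z and no __dict__ to fall back on
    rejected = False
except AttributeError:
    rejected = True
print(rejected)
```

The trade-off is flexibility: a slotted instance cannot gain attributes that were not declared in __slots__.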

Type Checking and Introspection

Runtime Type Checking

    # Using isinstance
    isinstance(42, int)             # True
    isinstance(42, (int, float))    # True

    # Using type
    type(42) == int                 # True

    # Check subclass
    issubclass(bool, int)           # True (bool inherits from int)

Object Introspection

    # Get all attributes
    dir(obj)

    # Get object's dict (raises TypeError if obj has no __dict__)
    vars(obj)

    # Get type information
    obj.__class__
    obj.__class__.__name__
    obj.__class__.__bases__
    obj.__class__.__mro__

Special Objects

None, True, False

    # Singletons - only one instance exists
    a = None
    b = None
    print(a is b)  # True - same object

    # In C:
    # Py_None, Py_True, Py_False are global objects

NotImplemented

    class MyClass:
        def __eq__(self, other):
            if not isinstance(other, MyClass):
                return NotImplemented
            return self.value == other.value
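
Returning NotImplemented tells Python to try the other operand's method, and for == to fall back to an identity comparison if both sides decline. A runnable sketch with a hypothetical Money class:

```python
class Money:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Money):
            return NotImplemented
        return self.value == other.value

same = Money(5) == Money(5)
different_type = Money(5) == 5     # both sides decline, so identity decides
direct_call = Money(5).__eq__(5)   # a direct call exposes the sentinel

print(same, different_type, direct_call)
```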

Ellipsis

    # Used in slicing and type hints
    arr[..., 0]  # NumPy advanced indexing (arr is a NumPy array)

    def func(x: tuple[int, ...]) -> None:  # Type hint: tuple of any length
        pass

Performance Implications

Attribute Access Speed

    # Typical ordering; the gaps depend on the CPython version, so measure

    # Slowest: Dynamic attribute
    obj.x    # Dict lookup

    # Faster: Slots
    obj.x    # Direct memory access with __slots__

    # Fastest: Local variable
    x        # Array index access

Method Calls

    # Typical ordering; the gaps depend on the CPython version, so measure

    # Slow: Attribute lookup + call
    obj.method()

    # Faster: Bound method looked up once and reused
    method = obj.method
    method()

    # Fastest: Function call
    function(obj)
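
Claims like these are worth checking on the interpreter in use, since the differences depend on the CPython version. The following is a rough timing sketch with timeit; Counter, measure, and the two loop functions are hypothetical names, and no particular result is assumed.

```python
import timeit

class Counter:
    def __init__(self):
        self.count = 0

    def bump(self):
        self.count += 1

def call_via_attribute(counter, n):
    for _ in range(n):
        counter.bump()        # attribute lookup on every iteration

def call_via_cached_method(counter, n):
    bump = counter.bump       # look the bound method up once
    for _ in range(n):
        bump()

def measure(func, n=10_000, repeat=5):
    """Best-of-repeat wall time in seconds for func(Counter(), n)."""
    return min(timeit.repeat(lambda: func(Counter(), n), number=1, repeat=repeat))

timings = {f.__name__: measure(f) for f in (call_via_attribute, call_via_cached_method)}
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.3f} ms")
```

Taking the minimum over several repeats reduces noise from other processes.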

Key Takeaways

  1. Everything is a PyObject with refcount and type pointer
  2. Type objects define behavior and structure
  3. Special methods map to C-level slots
  4. __slots__ can significantly reduce memory usage
  5. MRO determines attribute resolution order
  6. Introspection is powerful due to unified object model
  7. Understanding internals helps optimize performance
