Introduction
Linking is where separate object files unite to form an executable. It's where "undefined reference" errors lurk, where static meets dynamic, and where symbols find their definitions. This article demystifies the linking process with interactive visualizations.
The Linking Process Overview
Linker command (simplified; a real g++ link also adds startup files such as crt1.o):
```bash
ld -o program main.o math.o utils.o -lm -lc --dynamic-linker /lib64/ld-linux-x86-64.so.2
```
The linker performs several critical tasks:
- Symbol Resolution: Matching undefined symbols with definitions
- Relocation: Adjusting addresses to final locations
- Section Merging: Combining similar sections from different objects
- Library Handling: Including required functions from libraries
Understanding Object Files
Before linking, let's understand what object files contain:
```bash
# Examine object file sections
objdump -h main.o

# Typical sections:
# .text    - Machine code
# .data    - Initialized global variables
# .bss     - Uninitialized global variables
# .rodata  - Read-only data (string literals, const)
# .symtab  - Symbol table
# .strtab  - String table
# .rela.*  - Relocation entries
```
Object File Structure
```cpp
// main.cpp
#include <iostream>

int global_var = 42;          // → .data section
int uninit_var;               // → .bss section
const char* msg = "Hello";    // pointer → .data; the literal "Hello" → .rodata

void function() {             // → .text section
    std::cout << msg;
}

int main() {                  // → .text section
    function();
    return 0;
}
```
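To make the idea of "what an object file contains" concrete, here is a minimal sketch that reads the ELF file header and reports whether a file is a relocatable object, an executable, or a shared object. It assumes a 64-bit Linux system with the glibc `<elf.h>` header; `read_elf_header` and `elf_type_name` are hypothetical helper names, not part of any toolchain.

```cpp
#include <cassert>
#include <cstring>
#include <elf.h>      // Elf64_Ehdr, ELFMAG, ET_REL, ... (Linux header)
#include <fstream>
#include <string>

// Reads the ELF file header of `path` into `out`.
// Returns false if the file cannot be read or is not a 64-bit ELF file.
bool read_elf_header(const std::string& path, Elf64_Ehdr& out) {
    std::ifstream in(path, std::ios::binary);
    if (!in.read(reinterpret_cast<char*>(&out), sizeof out)) return false;
    if (std::memcmp(out.e_ident, ELFMAG, SELFMAG) != 0) return false;
    return out.e_ident[EI_CLASS] == ELFCLASS64;
}

// Maps the e_type field to a short description.
const char* elf_type_name(const Elf64_Ehdr& h) {
    switch (h.e_type) {
        case ET_REL:  return "relocatable object (.o)";
        case ET_EXEC: return "executable";
        case ET_DYN:  return "shared object or PIE executable";
        default:      return "other";
    }
}
```

Calling `read_elf_header("main.o", h)` and then `elf_type_name(h)` should report a relocatable object for the output of `g++ -c`, while a linked program reports one of the other two types.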
Symbol Resolution
The linker's primary job is matching undefined symbols with their definitions.
Symbol Types
```cpp
// Strong symbols (definitions)
int x = 10;                        // Strong symbol
void func() { }                    // Strong symbol

// Weak symbols
int y;                             // C with -fcommon: common symbol, merged like a weak one
                                   // C++: an ordinary strong definition
__attribute__((weak)) int z = 5;   // Explicitly weak symbol

// Undefined symbols (references)
extern int external_var;           // Undefined symbol
void external_func();              // Undefined symbol
```
Symbol Resolution Rules
- Multiple strong symbols: Error
- One strong, multiple weak: Choose strong
- Multiple weak symbols: Choose any (usually first)
- No definition found: Undefined reference error
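The four rules above can be sketched as a small function. This is a toy model, not how any real linker is implemented: `Definition`, `Binding`, and `resolve` are hypothetical names, and "choose any" for multiple weak symbols is modeled as "choose the first".

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

enum class Binding { Strong, Weak };

struct Definition {
    std::string symbol;   // symbol name
    Binding binding;      // strong or weak
    std::string object;   // object file that provides the definition
};

// Returns the object file whose definition of `name` wins.
// Throws std::runtime_error for multiple strong definitions or no definition.
std::string resolve(const std::string& name, const std::vector<Definition>& defs) {
    const Definition* strong = nullptr;
    const Definition* first_weak = nullptr;
    for (const Definition& d : defs) {
        if (d.symbol != name) continue;
        if (d.binding == Binding::Strong) {
            if (strong) throw std::runtime_error("multiple definition of '" + name + "'");
            strong = &d;
        } else if (!first_weak) {
            first_weak = &d;   // remember only the first weak candidate
        }
    }
    if (strong) return strong->object;          // a strong definition beats any weak one
    if (first_weak) return first_weak->object;  // only weak definitions: take the first
    throw std::runtime_error("undefined reference to '" + name + "'");
}
```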
Common Symbol Resolution Errors
```cpp
// Error: Multiple definitions
// file1.cpp
int global = 1;

// file2.cpp
int global = 2;  // Error: multiple definition of 'global'

// Solution: Use static or namespace
static int global = 1;           // File-local
// or
namespace { int global = 1; }    // Anonymous namespace
```
Static Linking
Static linking copies all required code into the final executable.
Creating Static Libraries
```bash
# Compile object files
g++ -c math_utils.cpp -o math_utils.o
g++ -c string_utils.cpp -o string_utils.o

# Create static library (archive)
ar rcs libutils.a math_utils.o string_utils.o

# View library contents
ar t libutils.a
nm libutils.a

# Link with static library
g++ main.cpp -L. -lutils -o program
# or
g++ main.cpp libutils.a -o program
```
Advantages of Static Linking
- Self-contained executable
- No runtime dependencies
- Predictable performance
- Easier distribution
Disadvantages
- Larger executable size
- Memory duplication (each program has its own copy)
- Updates require relinking (a library fix means rebuilding every executable that uses it)
- License implications (LGPL)
Dynamic Linking
Dynamic linking defers symbol resolution to runtime.
[Interactive comparison: Static vs Dynamic Linking. The static panel shows the pipeline: compile (create object files), archive (bundle into a .a library), link (copy code into the executable), run (everything loaded at once). Listed advantages of static linking: self-contained executable, no runtime dependencies, faster startup time, predictable performance. Listed disadvantages: larger executable size, memory duplication, no shared updates, longer link time.]
Creating Shared Libraries
```bash
# Compile with Position Independent Code (PIC)
g++ -fPIC -c math_utils.cpp
g++ -fPIC -c string_utils.cpp

# Create shared library
g++ -shared -o libutils.so math_utils.o string_utils.o

# Or in one step
g++ -fPIC -shared math_utils.cpp string_utils.cpp -o libutils.so

# Link with shared library
g++ main.cpp -L. -lutils -o program

# Set library path for runtime
export LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH
./program
```
SONAME Versioning
```bash
# Create versioned library
g++ -shared -Wl,-soname,libutils.so.1 -o libutils.so.1.2.3 *.o

# Create symlinks
ln -s libutils.so.1.2.3 libutils.so.1   # SONAME link
ln -s libutils.so.1 libutils.so         # Development link

# Link against library
g++ main.cpp -lutils -o program
```
Position Independent Code (PIC)
PIC allows code to execute at any memory address without modification.
Relocation Process
Relocation Entries
Type | Symbol | Offset | Before | After |
---|---|---|---|---|
R_X86_64_32 | global_var | 0x1000 | 00 00 00 00 | 00 50 40 00 |
R_X86_64_PC32 | function_call | 0x2000 | FC FF FF FF | 3C 20 00 00 |
R_X86_64_GOT32 | external_func | 0x3000 | 00 00 00 00 | 00 60 40 00 |
R_X86_64_PLT32 | printf | 0x4000 | FC FF FF FF | 5C 10 00 00 |
Before relocation, the fields hold placeholder values: zeros or PC-relative addends. After relocation, they hold final addresses, here relocated against base 0x400000.
Relocation formula for the 32-bit absolute relocation (R_X86_64_32): S + A, where S is the symbol value and A is the addend.
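As a worked illustration of the formula, the sketch below patches a 32-bit little-endian field in a byte buffer. `apply_abs32` computes S + A as stated above; `apply_pc32` uses S + A - P (P being the address of the field being patched), which is the x86-64 psABI definition of R_X86_64_PC32. The function names are hypothetical, and the symbol addresses in the usage note are assumed values chosen only to reproduce the bytes shown in the table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reads a 32-bit little-endian value from buf at offset off.
std::uint32_t read_u32(const std::vector<std::uint8_t>& buf, std::size_t off) {
    assert(off + 4 <= buf.size());
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i) v |= static_cast<std::uint32_t>(buf[off + i]) << (8 * i);
    return v;
}

// Writes a 32-bit little-endian value into buf at offset off.
void write_u32(std::vector<std::uint8_t>& buf, std::size_t off, std::uint32_t v) {
    assert(off + 4 <= buf.size());
    for (int i = 0; i < 4; ++i) buf[off + i] = static_cast<std::uint8_t>(v >> (8 * i));
}

// R_X86_64_32: field = S + A (absolute address, truncated to 32 bits).
void apply_abs32(std::vector<std::uint8_t>& buf, std::size_t off,
                 std::uint64_t S, std::int64_t A) {
    write_u32(buf, off, static_cast<std::uint32_t>(S + A));
}

// R_X86_64_PC32: field = S + A - P, where P is the address of the field itself.
void apply_pc32(std::vector<std::uint8_t>& buf, std::size_t off,
                std::uint64_t S, std::int64_t A, std::uint64_t P) {
    write_u32(buf, off, static_cast<std::uint32_t>(S + A - P));
}
```

With an assumed S = 0x405000 and A = 0, `apply_abs32` writes the bytes 00 50 40 00. With an assumed S = 0x404040, A = -4, and P = 0x402000, `apply_pc32` writes 3C 20 00 00.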
Why PIC Matters
```cpp
// Without PIC (absolute addressing)
int global = 42;
int* get_global() {
    return &global;  // Address fixed at link time
}

// With PIC (relative addressing)
// Compiler generates:
// - PC-relative addressing for code
// - GOT-relative addressing for data
```
PIC Performance Impact
```bash
# Non-PIC (slightly faster, not shareable)
g++ -c file.cpp

# PIC (shareable, required for .so)
g++ -fPIC -c file.cpp

# PIE (Position Independent Executable)
g++ -fPIE -pie main.cpp -o program
```
GOT and PLT
The Global Offset Table (GOT) and Procedure Linkage Table (PLT) enable dynamic linking.
How GOT/PLT Works
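- First call: Goes through the PLT stub
- PLT stub: Jumps indirectly through the function's GOT entry
- GOT entry: Initially points back into the PLT, at the code that invokes the resolver
- Dynamic linker: Resolves the symbol and writes its address into the GOT entry
- Subsequent calls: Still enter the PLT stub, which now jumps straight to the function via the GOT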
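The steps above can be mimicked in plain C++ with a function pointer standing in for the GOT entry. This is only an analogy under simplifying assumptions: the real mechanism is implemented in machine code by the toolchain and dynamic linker, and every name here (`got_slot`, `resolve_stub`, `square_via_plt`, `real_square`) is hypothetical.

```cpp
#include <cassert>

using Fn = int (*)(int);

int real_square(int x) { return x * x; }   // stands in for a library function

int resolver_calls = 0;                    // counts trips through the slow path
int resolve_stub(int x);                   // forward declaration

// "GOT entry": initially points at the resolver path, not at the real function.
Fn got_slot = resolve_stub;

// "Dynamic linker": find the real target, patch the slot, then complete the call.
int resolve_stub(int x) {
    ++resolver_calls;
    got_slot = real_square;
    return got_slot(x);
}

// "PLT stub": every call is an indirect jump through the slot.
int square_via_plt(int x) { return got_slot(x); }
```

The first call to `square_via_plt` goes through `resolve_stub` and patches the slot; later calls reach `real_square` directly through the slot, so `resolver_calls` stays at 1.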
Examining GOT/PLT
```bash
# View PLT entries
objdump -d -j .plt program

# View GOT contents (hex dump; the GOT holds data, not code)
objdump -s -j .got program

# View dynamic relocations
readelf -r program

# Trace dynamic linking
LD_DEBUG=bindings ./program
```
Lazy vs Immediate Binding
```bash
# Lazy binding (the traditional default; some distributions link with -z now by default)
./program

# Immediate binding (resolve all symbols at startup)
LD_BIND_NOW=1 ./program

# Link with immediate binding
g++ -Wl,-z,now main.cpp -o program
```
Relocation
Relocation adjusts addresses when the final memory layout is determined.
Types of Relocations
```cpp
// R_X86_64_64: Absolute 64-bit relocation
void* ptr = &global_var;

// R_X86_64_PC32: PC-relative 32-bit
//   call function

// R_X86_64_GOT32: GOT-relative
//   mov rax, variable@GOT

// R_X86_64_PLT32: PLT-relative
//   call function@PLT
```
Viewing Relocations
```bash
# Relocations in object file
readelf -r main.o

# Dynamic relocations in executable
readelf -r program

# Relocation processing
LD_DEBUG=reloc ./program
```
Link Order Matters
The order of libraries on the command line can affect linking success.
Current link order: main.o, liba.a, libb.a

Input | Needs | Provides |
---|---|---|
main.o | func_a, func_b | main |
liba.a | func_b | func_a |
libb.a | None | func_b |

Why Order Matters
The linker processes libraries left-to-right and only pulls in object files that resolve currently undefined symbols. Libraries should be ordered from most dependent to least dependent.
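The left-to-right behavior described above can be modeled with a short simulation. It is a sketch under simplifying assumptions: each archive is treated as a single member, and `Input`, `link_inputs`, and the `app.o` example (which needs only `func_a`) are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

struct Input {
    std::string name;
    bool is_archive;                   // archives are pulled in only on demand
    std::set<std::string> provides;    // symbols this input defines
    std::set<std::string> needs;       // symbols this input references
};

// Single left-to-right pass. Returns the symbols still undefined at the end.
std::set<std::string> link_inputs(const std::vector<Input>& inputs) {
    std::set<std::string> defined, undefined;
    for (const Input& in : inputs) {
        bool wanted = !in.is_archive;  // plain object files are always linked
        for (const std::string& s : in.provides)
            if (undefined.count(s)) wanted = true;
        if (!wanted) continue;         // archive resolves nothing right now: skipped
        for (const std::string& s : in.provides) {
            defined.insert(s);
            undefined.erase(s);
        }
        for (const std::string& s : in.needs)
            if (!defined.count(s)) undefined.insert(s);
    }
    return undefined;
}
```

With `app.o` needing `func_a`, `liba.a` providing `func_a` and needing `func_b`, and `libb.a` providing `func_b`, the order `app.o liba.a libb.a` leaves nothing undefined. The order `app.o libb.a liba.a` leaves `func_b` undefined, because `libb.a` was scanned before anything needed it.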
Dependency Resolution
```bash
# Wrong order (may fail)
g++ -lB -lA main.cpp  # If A depends on B

# Correct order
g++ main.cpp -lA -lB  # Objects first, then dependencies

# Circular dependencies
g++ main.cpp -lA -lB -lA  # Repeat if necessary

# Or use groups
g++ main.cpp -Wl,--start-group -lA -lB -Wl,--end-group
```
Link Order Rules
- Object files before libraries
- Libraries in dependency order
- More specific before more general
- Static before shared (when mixing)
Solving Common Linking Errors
Undefined Reference
```cpp
// undefined reference to `function()'

// Causes:
// 1. Missing implementation
void function();                  // Declaration only

// 2. Name mangling mismatch
extern "C" void c_function();     // C linkage

// 3. Template instantiation
template<typename T>
void tmpl_func(T t) { }
// Need explicit instantiation or definition in header

// 4. Missing library
// Solution: Add -llibrary flag
```
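When a name mangling mismatch is suspected, it helps to see what a mangled symbol actually stands for. The sketch below wraps `abi::__cxa_demangle` from `<cxxabi.h>`, which GCC and Clang provide for the Itanium C++ ABI; the wrapper name `demangle` is hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Returns the demangled form of an Itanium-ABI mangled name,
// or the input unchanged if demangling fails.
std::string demangle(const std::string& mangled) {
    int status = 0;
    char* out = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) return mangled;
    std::string result(out);
    std::free(out);   // the buffer is allocated with malloc
    return result;
}
```

For example, `demangle("_Z8functionv")` yields `function()`. A function declared `extern "C"` is emitted under its plain name, with no `_Z` prefix, which is why a C++ caller that forgets `extern "C"` ends up looking for a symbol that does not exist.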
Multiple Definition
```cpp
// multiple definition of `variable'

// header.h
int var = 10;  // Wrong: Definition in header

// Solutions:
// 1. Use extern declaration
extern int var;        // In header
int var = 10;          // In one .cpp file

// 2. Use inline (C++17)
inline int var = 10;   // In header

// 3. Use static (file-local)
static int var = 10;   // Each translation unit gets its own
```
Library Not Found
```bash
# cannot find -llibrary

# Solutions:
# 1. Specify library path
g++ main.cpp -L/path/to/library -llibrary

# 2. Use full path
g++ main.cpp /path/to/library/liblibrary.a

# 3. Update library path
export LIBRARY_PATH=/path/to/library:$LIBRARY_PATH
g++ main.cpp -llibrary

# 4. For runtime
export LD_LIBRARY_PATH=/path/to/library:$LD_LIBRARY_PATH
```
Advanced Linking Topics
Weak Symbols
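```cpp
// default_impl.cpp (hypothetical file name): provide a default implementation
#include <iostream>

__attribute__((weak)) void optional_feature() {
    std::cout << "Default implementation\n";
}

// custom_impl.cpp (a different translation unit): a strong definition overrides the weak one
#include <iostream>

void optional_feature() {
    std::cout << "Custom implementation\n";
}
```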
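A related use of weak symbols is the weak declaration: the program links even if nothing defines the function, and the code checks for its presence at run time. The sketch below assumes GCC or Clang on an ELF platform such as Linux; `optional_hook`, `hook_available`, and `run_hook_if_present` are hypothetical names.

```cpp
#include <cassert>

// Weak declaration: if no object file or library defines optional_hook,
// the link still succeeds and the symbol's address is null.
extern "C" void optional_hook() __attribute__((weak));

// True only if some linked object or library supplied a definition.
bool hook_available() { return optional_hook != nullptr; }

// Calls the hook when present, otherwise does nothing.
void run_hook_if_present() {
    if (hook_available()) optional_hook();
}
```

Built on its own, `hook_available()` returns false. Linking in any object that defines `optional_hook` flips it to true without changing this code.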
Symbol Visibility
```cpp
// Control symbol visibility in shared libraries

// Default visibility (exported)
__attribute__((visibility("default"))) void public_function();

// Hidden visibility (not exported)
__attribute__((visibility("hidden"))) void internal_function();

// Compile with default hidden
// g++ -fvisibility=hidden -fPIC shared.cpp
```
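In practice the attributes are often wrapped in macros so that a library's public interface is marked in one consistent way. The following is a minimal sketch of that pattern; `UTILS_API`, `UTILS_LOCAL`, `scale`, and `public_answer` are hypothetical names, and the non-GCC branch simply disables the annotations.

```cpp
#include <cassert>

#if defined(__GNUC__)
#  define UTILS_API   __attribute__((visibility("default")))
#  define UTILS_LOCAL __attribute__((visibility("hidden")))
#else
#  define UTILS_API
#  define UTILS_LOCAL
#endif

// Internal helper: callable inside the library, not exported from it.
UTILS_LOCAL int scale(int x) { return x * 2; }

// Part of the library's public interface.
UTILS_API int public_answer() { return scale(21); }
```

When this is built into a shared library, `nm -D` on the result should list `public_answer` but not `scale`.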
Linker Scripts
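```
/* custom.ld - Custom linker script (illustrative) */
SECTIONS
{
    . = 0x400000;        /* Start address */
    .text : { *(.text) }
    .data : { *(.data) }
    .bss  : { *(.bss) }
}

/* Use: g++ main.cpp -T custom.ld
   Note: -T replaces the default linker script, so a real C++ program
   needs more sections (.rodata, .init_array, ...) than shown here. */
```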
Link-Time Optimization (LTO)
Optimization Pipeline
- Dead Code Elimination: Remove unused functions and variables
- Function Inlining: Inline small functions across modules
- Devirtualization: Convert virtual calls to direct calls
- Constant Propagation: Propagate constants across modules
Best Practices
- Use LTO for release builds to maximize performance
- Consider thin LTO for faster build times with good optimization
- Profile-guided optimization (PGO) works well with LTO
- May increase build time significantly for large projects
```bash
# Enable LTO
g++ -flto -c file1.cpp
g++ -flto -c file2.cpp
g++ -flto file1.o file2.o -o program

# Whole program optimization
g++ -flto -fwhole-program main.cpp lib.cpp -o program

# Parallel LTO
g++ -flto=auto -O3 *.cpp -o program
```
Library Dependencies
Understanding and managing library dependencies is crucial.
Dependency Management
Use tools like pkg-config, CMake, or package managers to handle complex dependency chains and version conflicts automatically.
Viewing Dependencies
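```bash
# All shared libraries the loader would map (direct and transitive)
ldd program

# Direct dependencies only (DT_NEEDED entries)
readelf -d program | grep NEEDED

# Include symbol version information
ldd -v program

# Unused direct dependencies
ldd -u program

# Undefined symbols (to be resolved by libraries)
nm -u program

# Library search order
LD_DEBUG=libs ./program
```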
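The same information is available from inside a running process. As a rough sketch, assuming Linux with glibc, `dl_iterate_phdr` from `<link.h>` visits every object currently loaded; `collect` and `loaded_objects` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <link.h>     // dl_iterate_phdr, dl_phdr_info (Linux/glibc)
#include <string>
#include <vector>

// Callback invoked once per loaded object; appends its name to the
// vector passed through `data`.
static int collect(dl_phdr_info* info, std::size_t, void* data) {
    auto* names = static_cast<std::vector<std::string>*>(data);
    const bool has_name = info->dlpi_name && *info->dlpi_name;
    names->push_back(has_name ? info->dlpi_name : "(main program)");
    return 0;   // returning 0 continues the iteration
}

// Returns the names of all objects mapped into the current process.
std::vector<std::string> loaded_objects() {
    std::vector<std::string> names;
    dl_iterate_phdr(collect, &names);
    return names;
}
```

Printing the result of `loaded_objects()` in a dynamically linked program should show roughly the same libraries that `ldd` reports for it, plus any loaded later with `dlopen`.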
Managing Dependencies
```bash
# RPATH (built into executable)
g++ main.cpp -Wl,-rpath,/custom/lib/path -lutils

# Check RPATH or RUNPATH (many toolchains emit RUNPATH by default)
readelf -d program | grep -E 'RPATH|RUNPATH'

# RUNPATH (can be overridden by LD_LIBRARY_PATH)
g++ main.cpp -Wl,--enable-new-dtags,-rpath,/path

# Remove unnecessary dependencies
g++ main.cpp -Wl,--as-needed -lutil1 -lutil2
```
Debugging Linking Issues
Verbose Linking
```bash
# GCC/G++ verbose
g++ -v main.cpp -lutils

# Show linker invocation
g++ -### main.cpp

# Linker verbose
g++ -Wl,--verbose main.cpp

# Trace library search
g++ -Wl,--trace main.cpp -lutils

# Show link map
g++ -Wl,-Map=output.map main.cpp
```
Useful Tools
```bash
# nm - List symbols
nm -C program          # Demangled C++ symbols
nm -D libshared.so     # Dynamic symbols only
nm -u program          # Undefined symbols

# objdump - Display object information
objdump -t program     # Symbol table
objdump -T program     # Dynamic symbol table
objdump -p program     # Private headers

# readelf - Display ELF information
readelf -s program     # Symbol table
readelf -d program     # Dynamic section
readelf -r program     # Relocations

# c++filt - Demangle symbols
nm program | c++filt

# patchelf - Modify ELF executables
patchelf --set-rpath /new/path program
patchelf --add-needed libneeded.so program
```
Platform-Specific Considerations
Linux (ELF)
```bash
# ELF-specific tools (from the elfutils package)
eu-readelf -a program    # Alternative readelf
eu-nm program            # One of the other eu-* utilities
```
macOS (Mach-O)
```bash
# macOS-specific
otool -L program    # Like ldd
otool -t program    # Text section
install_name_tool -change old.dylib new.dylib program
```
Windows (PE)
```bash
# Windows/MinGW
objdump -p program.exe | grep DLL    # List DLLs
dumpbin /dependents program.exe      # MSVC
```
Best Practices
- Minimize shared library dependencies
  - Reduces startup time
  - Improves portability
- Use symbol visibility
  - Hide internal symbols
  - Reduce symbol table size
- Version your libraries
  - Use SONAME for ABI compatibility
  - Semantic versioning
- Prefer static linking for distributions
  - Simpler deployment
  - No dependency hell
- Use LTO for release builds
  - Better optimization
  - Often smaller binaries
- Avoid circular dependencies
  - Restructure code
  - Use forward declarations
- Be careful with global constructors
  - Initialization order issues
  - Use lazy initialization
- Test with sanitizers
  - AddressSanitizer for memory issues
  - UndefinedBehaviorSanitizer for UB
- Document library requirements
  - Minimum versions
  - Optional features
- Use pkg-config for libraries
  - Put sources before the library flags: g++ main.cpp $(pkg-config --cflags --libs gtk+-3.0)
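For the advice on global constructors, one common form of lazy initialization is a function-local static. The sketch below shows the idea; `Registry`, `registry`, and `init_count` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <string>

int init_count = 0;   // counts how many times a Registry is constructed

struct Registry {
    std::string name;
    Registry() : name("default") { ++init_count; }
};

// The instance is constructed on the first call (thread-safe since C++11),
// not at program startup, so callers in other translation units never
// see it before it has been initialized.
Registry& registry() {
    static Registry instance;
    return instance;
}
```

Every call to `registry()` returns the same object, and the constructor runs exactly once, whenever the first caller happens to need it.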
Performance Considerations
Dynamic Linking Overhead
- Startup cost: Symbol resolution
- Runtime cost: PLT indirection
- Memory cost: GOT entries
Optimization Strategies
```bash
# Prelinking (deprecated but instructive)
prelink -a    # Prelink all system libraries

# Use -fno-plt to call through the GOT without a PLT stub
g++ -fno-plt main.cpp

# Combine multiple .so into one
g++ -shared obj1.o obj2.o obj3.o -o combined.so

# Use static linking for hot paths
# Dynamic for rarely used features
```
Conclusion
Linking transforms separate object files into working programs. Understanding this process helps you:
- Debug linking errors effectively
- Choose between static and dynamic linking
- Optimize program startup and runtime
- Create robust, portable software
The journey from object files to executable involves sophisticated symbol resolution, relocation, and platform-specific mechanisms. Master these concepts to build better C++ applications.