Text Encoding Comparison
ASCII vs UTF-8 vs UTF-16 vs UTF-32
ASCII
(7-bit, 1 byte)
Letter "A"
Bits (7 to 0): 0 1 0 0 0 0 0 1
Hex: 41
• Unaccented English letters, digits, and basic punctuation only
• 128 characters (95 printable, 33 control)
• Fixed length
• Compact
• Not multilingual
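The 7-bit limit above can be checked directly. The following is a minimal Python sketch, not part of the original graphic; the helper names `ascii_bits` and `is_ascii` are made up for illustration.

```python
def ascii_bits(ch: str) -> str:
    """Return the 8-bit pattern of a single ASCII character.

    Hypothetical helper: str.encode("ascii") raises UnicodeEncodeError
    for any code point above 127, which is the 7-bit limit in action.
    """
    data = ch.encode("ascii")
    return format(data[0], "08b")


def is_ascii(text: str) -> bool:
    """True when every character fits in 7 bits (code points 0 to 127)."""
    return all(ord(c) < 128 for c in text)


if __name__ == "__main__":
    print(ascii_bits("A"))   # 01000001
    print(is_ascii("€"))     # False
```

The top bit of every ASCII byte is 0, which is what lets UTF-8 reuse the same byte values unchanged.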
UTF-8
(1-4 bytes)
Letter "A"
Bits (7 to 0): 0 1 0 0 0 0 0 1
Hex: 41
Euro "€"
Byte 1 bits (7 to 0): 1 1 1 0 0 0 1 0
Byte 2 bits (7 to 0): 1 0 0 0 0 0 1 0
Byte 3 bits (7 to 0): 1 0 1 0 1 1 0 0
Hex: E2 82 AC
• ASCII compatible: code points 0-127 encode to the same single byte
• Variable length: 1 to 4 bytes per code point
• Dominant encoding on the web
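The byte patterns shown for "A" and "€" follow from UTF-8's prefix scheme: a lead byte whose high bits announce the sequence length, followed by continuation bytes of the form 10xxxxxx. Below is a hand-rolled Python sketch of that scheme. The function name `utf8_encode` is an illustrative assumption, and real code should simply call `str.encode("utf-8")`.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode code point as UTF-8 (illustrative sketch)."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if cp < 0x80:                      # 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:                     # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp < 0x10000:                   # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


if __name__ == "__main__":
    print(utf8_encode(ord("A")).hex())   # 41
    print(utf8_encode(ord("€")).hex())   # e282ac
```

For "€" (U+20AC) the 16 payload bits are split 4 + 6 + 6 across the three bytes, which gives E2 82 AC.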
UTF-16
(2 or 4 bytes)
Letter "A"
Byte 1 bits (7 to 0): 0 0 0 0 0 0 0 0
Byte 2 bits (7 to 0): 0 1 0 0 0 0 0 1
Hex: 00 41 (big-endian byte order)
Japanese "あ"
Byte 1 bits (7 to 0): 0 0 1 1 0 0 0 0
Byte 2 bits (7 to 0): 0 1 0 0 0 0 1 0
Hex: 30 42 (big-endian byte order)
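• Native string encoding of the Windows API, Java, and JavaScript
• Variable length: 2 bytes inside the Basic Multilingual Plane, 4 bytes (a surrogate pair) beyond it
• Most CJK characters: 2 bytes (versus 3 in UTF-8)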
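UTF-16 stores a code point in one 16-bit unit when it fits and otherwise splits it into a surrogate pair. The Python sketch below illustrates that rule. The name `utf16_units` is hypothetical and the output is a list of 16-bit code units, independent of byte order.

```python
def utf16_units(cp: int) -> list[int]:
    """Return the UTF-16 code units for one code point (illustrative sketch)."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if cp < 0x10000:
        return [cp]                    # one 16-bit unit (2 bytes)
    v = cp - 0x10000                   # 20 bits left to encode
    high = 0xD800 | (v >> 10)          # top 10 bits -> high surrogate
    low = 0xDC00 | (v & 0x3FF)         # bottom 10 bits -> low surrogate
    return [high, low]                 # two units (4 bytes)


if __name__ == "__main__":
    print([hex(u) for u in utf16_units(ord("あ"))])   # ['0x3042']
    print([hex(u) for u in utf16_units(0x1F600)])     # ['0xd83d', '0xde00']
```

Written big-endian, the single unit 0x3042 becomes the bytes 30 42. An emoji such as U+1F600 needs both surrogates, so it takes 4 bytes.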
UTF-32
(Fixed 4 bytes)
Letter "A"
Byte 1 bits (7 to 0): 0 0 0 0 0 0 0 0
Byte 2 bits (7 to 0): 0 0 0 0 0 0 0 0
Byte 3 bits (7 to 0): 0 0 0 0 0 0 0 0
Byte 4 bits (7 to 0): 0 1 0 0 0 0 0 1
Hex: 00 00 00 41 (big-endian byte order)
• Fixed length
• Constant-time indexing by code point
• Simple code-point slicing and counting
• Space inefficient
• Every code point: 4 bytes
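Because every code point occupies exactly 4 bytes, the n-th code point in a UTF-32 buffer sits at byte offset 4 × n. A small Python sketch of that lookup follows. The helper name `nth_code_point` is an assumption, and the buffer is taken to be big-endian with no byte order mark.

```python
def nth_code_point(buf: bytes, n: int) -> str:
    """Fetch the n-th code point from a UTF-32BE buffer without scanning."""
    if len(buf) % 4 != 0:
        raise ValueError("UTF-32 data must be a multiple of 4 bytes")
    if not 0 <= n < len(buf) // 4:
        raise IndexError("code point index out of range")
    chunk = buf[4 * n : 4 * n + 4]     # fixed-width slot, no decoding of neighbours
    return chr(int.from_bytes(chunk, "big"))


if __name__ == "__main__":
    data = "Aあ😀".encode("utf-32-be")
    print(len(data))                   # 12
    print(nth_code_point(data, 2))     # 😀
```

The same lookup in UTF-8 or UTF-16 has to walk the buffer from the start, since earlier characters may have different widths.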
Size vs Performance Tradeoff
UTF-8: smallest for ASCII-heavy text • UTF-16: middle ground, smaller than UTF-8 for most CJK text • UTF-32: largest, but constant-time code-point indexing
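The size side of the tradeoff is easy to measure. This Python sketch compares encoded lengths with the standard codecs. The function name `encoded_sizes` and the sample characters are chosen for illustration, and the big-endian codec variants are used so that no byte order mark is counted.

```python
def encoded_sizes(text: str) -> dict[str, int]:
    """Byte length of the same text under each encoding (no BOM)."""
    return {enc: len(text.encode(enc))
            for enc in ("utf-8", "utf-16-be", "utf-32-be")}


if __name__ == "__main__":
    for sample in ("A", "€", "あ", "😀"):
        print(sample, encoded_sizes(sample))
    # A  {'utf-8': 1, 'utf-16-be': 2, 'utf-32-be': 4}
    # €  {'utf-8': 3, 'utf-16-be': 2, 'utf-32-be': 4}
    # あ {'utf-8': 3, 'utf-16-be': 2, 'utf-32-be': 4}
    # 😀 {'utf-8': 4, 'utf-16-be': 4, 'utf-32-be': 4}
```

Which encoding is smallest depends on the text: UTF-8 wins for ASCII-heavy content, while UTF-16 is smaller for text dominated by characters such as "あ".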