Understanding MessagePack
JSON shape, binary size — the wire format that drops the quotes.
Why MessagePack is often 30-50 % smaller than JSON, how the type tags work, and where to choose Protobuf or CBOR instead.
JSON's data model, binary encoding.
MessagePack supports the same primitives as JSON (nil, bool, int, float, string, array, map), plus a raw binary type and extension types, but encodes them as binary tag-length-value sequences instead of text. A small integer like 42 fits in one byte. A short string (up to 31 bytes) takes 1 + n bytes. There is no quoting, no escaping, no whitespace, and no base64 detour for binary data. Result: typical JSON payloads shrink by roughly 30-50 %.
The tag byte.
Each value starts with a tag byte that identifies its type and, for small values, carries the payload or length as well. 0x00-0x7F: positive int 0-127 inline (one byte, no payload). 0xE0-0xFF: negative int -32 to -1 inline. 0xA0-0xBF: fixstr (up to 31 bytes of UTF-8). 0x90-0x9F: fixarray (up to 15 elements). 0x80-0x8F: fixmap (up to 15 pairs). 0xC0: nil. 0xC2: false. 0xC3: true. The inline forms keep tiny values cheap; larger values use a tag followed by explicit length or payload bytes, such as 0xD9 (str 8) or 0xCA (float 32).
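The ranges above can be turned into a small lookup. This is a minimal sketch, not a full decoder: the function name classify_tag is made up for illustration, and it only names the tags listed in the paragraph, lumping everything else together.

```python
def classify_tag(tag: int) -> str:
    """Name the MessagePack format selected by a leading tag byte."""
    if not 0 <= tag <= 0xFF:
        raise ValueError("a tag is a single byte")
    if tag <= 0x7F:
        return f"positive fixint {tag}"          # value is the tag itself
    if tag <= 0x8F:
        return f"fixmap, {tag & 0x0F} pairs"     # low 4 bits = pair count
    if tag <= 0x9F:
        return f"fixarray, {tag & 0x0F} elements"
    if tag <= 0xBF:
        return f"fixstr, {tag & 0x1F} bytes"     # low 5 bits = byte length
    if tag >= 0xE0:
        return f"negative fixint {tag - 0x100}"  # two's complement, -32..-1
    # 0xC0-0xDF: single-purpose tags; only the ones named in the text are listed.
    named = {0xC0: "nil", 0xC2: "false", 0xC3: "true",
             0xCA: "float 32", 0xD9: "str 8"}
    return named.get(tag, "other format (see the spec)")


for byte in (0x2A, 0x83, 0xA5, 0xFF, 0xC3):
    print(f"0x{byte:02X}: {classify_tag(byte)}")
```

Because the type is decided by the first byte alone, a decoder can dispatch with a handful of range checks and never needs to scan ahead for a closing quote or bracket.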
A worked encode.
JSON: {"name":"Alice","age":30,"admin":true} is 38 characters. MessagePack: 0x83 (fixmap with 3 pairs), 0xA4 "name" (fixstr 4), 0xA5 "Alice" (fixstr 5), 0xA3 "age" (fixstr 3), 0x1E (positive int 30), 0xA5 "admin" (fixstr 5), 0xC3 (true). Total: 24 bytes (7 tag bytes plus 17 bytes of string data; 30 and true are carried inside their tags). That is a 37 % size reduction without any actual compression, just a smarter tag scheme. The shape is preserved exactly; a decoder reconstructs the same object.
[Figure: the example object as JSON text and as MessagePack bytes (83 A4 6E 61 6D 65 A5 41 6C 69 63 65 A3 61 67 65 1E...). Drop quotes, encode types as tags.]
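The byte count for the example object can be checked by encoding it directly. The following is a hand-rolled sketch that uses only the standard library and covers only the inline formats plus str 8; the function name pack is an assumption for illustration, and a real project would use an existing MessagePack library instead.

```python
import json
import struct


def pack(value) -> bytes:
    """Encode a small subset of Python values as MessagePack bytes."""
    if value is None:
        return b"\xc0"
    if isinstance(value, bool):  # must come before int: bool subclasses int
        return b"\xc3" if value else b"\xc2"
    if isinstance(value, int):
        if 0 <= value <= 0x7F:
            return bytes([value])            # positive fixint
        if -32 <= value < 0:
            return struct.pack("b", value)   # negative fixint, 0xE0-0xFF
        raise ValueError("sketch handles only fixint-range integers")
    if isinstance(value, str):
        raw = value.encode("utf-8")
        if len(raw) <= 31:
            return bytes([0xA0 | len(raw)]) + raw      # fixstr
        if len(raw) <= 0xFF:
            return b"\xd9" + bytes([len(raw)]) + raw   # str 8
        raise ValueError("sketch handles strings up to 255 bytes")
    if isinstance(value, list):
        if len(value) > 15:
            raise ValueError("sketch handles only fixarray")
        return bytes([0x90 | len(value)]) + b"".join(pack(v) for v in value)
    if isinstance(value, dict):
        if len(value) > 15:
            raise ValueError("sketch handles only fixmap")
        body = b"".join(pack(k) + pack(v) for k, v in value.items())
        return bytes([0x80 | len(value)]) + body
    raise TypeError(f"unsupported type: {type(value).__name__}")


record = {"name": "Alice", "age": 30, "admin": True}
as_json = json.dumps(record, separators=(",", ":"))  # compact, no whitespace
as_msgpack = pack(record)

print(len(as_json), len(as_msgpack))  # 38 24
print(as_msgpack.hex(" "))
```

Run against the example object, this gives 38 JSON characters and 24 MessagePack bytes. The saving comes entirely from dropping quotes, colons, commas and braces and folding 30 and true into single tag bytes.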
vs JSON.
MessagePack is typically faster to parse (no quote/escape state machine), smaller on the wire, and preserves binary data without base64 encoding. JSON is human-readable, debuggable with any text tool, and native to browsers. Use MessagePack when the wire format isn't read by humans and you care about size or parse speed: Redis serialisation, mobile API payloads, IoT messages. Use JSON when the format is consumed by browsers, CLI tools, or logs.
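The binary-data point is easy to quantify. The sketch below compares a 300-byte blob stored as a base64 string in JSON with the same blob in MessagePack's bin family (tags 0xC4, 0xC5 and 0xC6 for 8-, 16- and 32-bit lengths). The helper name pack_bin and the 300-byte payload are assumptions chosen for illustration.

```python
import base64
import json
import struct


def pack_bin(data: bytes) -> bytes:
    """Encode raw bytes with the smallest MessagePack bin format that fits."""
    n = len(data)
    if n <= 0xFF:
        return b"\xc4" + struct.pack(">B", n) + data   # bin 8
    if n <= 0xFFFF:
        return b"\xc5" + struct.pack(">H", n) + data   # bin 16
    if n <= 0xFFFFFFFF:
        return b"\xc6" + struct.pack(">I", n) + data   # bin 32
    raise ValueError("payload too large for a single bin value")


payload = bytes(range(256)) + bytes(44)  # 300 arbitrary bytes

# JSON has no byte type, so the blob travels as a quoted base64 string.
json_form = json.dumps(base64.b64encode(payload).decode("ascii"))
msgpack_form = pack_bin(payload)

print(len(json_form), len(msgpack_form))  # 402 303
```

Base64 inflates the payload by a third before the quotes are even added, while the bin format costs a fixed three bytes of header at this size.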
vs Protobuf and CBOR.
Protobuf is schema-driven: tiny on the wire because field names are replaced with numeric tags, but both ends need the .proto schema. MessagePack is schemaless: like JSON, the keys travel with the data. CBOR (RFC 7049, obsoleted by RFC 8949) is essentially the same idea as MessagePack with slight format differences and IETF standardisation; it underlies COSE (CBOR Object Signing and Encryption), which FIDO2/WebAuthn uses to encode credential public keys. For pure-Python or Node services swapping structured data, MessagePack and CBOR are roughly equivalent choices, though not wire-compatible with each other; for cross-language performance-critical pipelines, Protobuf usually wins.
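To make the "keys travel with the data" cost concrete, the sketch below hand-encodes the Alice record in Protobuf's wire format and compares it with the MessagePack bytes. It assumes a hypothetical schema in which name is field 1 (string), age is field 2 (integer) and admin is field 3 (bool); those field numbers are not from any real .proto file, and the encoding is done by hand rather than through the protobuf library.

```python
def varint(n: int) -> bytes:
    """Protobuf base-128 varint for a non-negative integer."""
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)
            return bytes(out)


VARINT, LEN = 0, 2  # Protobuf wire types


def field_key(number: int, wire_type: int) -> bytes:
    """A Protobuf field key: field number and wire type packed into a varint."""
    return varint((number << 3) | wire_type)


name = "Alice".encode("utf-8")
protobuf_bytes = (
    field_key(1, LEN) + varint(len(name)) + name  # name  = 1 (assumed)
    + field_key(2, VARINT) + varint(30)           # age   = 2 (assumed)
    + field_key(3, VARINT) + varint(1)            # admin = 3 (assumed), true
)

# The same record in MessagePack, keys included.
msgpack_bytes = bytes.fromhex("83a46e616d65a5416c696365a36167651ea561646d696ec3")
key_bytes = sum(1 + len(k) for k in ("name", "age", "admin"))  # tag + text

print(len(protobuf_bytes), len(msgpack_bytes), key_bytes)  # 11 24 15
```

Under these assumptions, 15 of the 24 MessagePack bytes are spent on key names, which is the part a schema lets Protobuf replace with one-byte field keys.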
Extension types.
The spec assigns tags 0xC7-0xC9 (ext 8/16/32) and 0xD4-0xD8 (fixext 1/2/4/8/16) to extension types. Each extension value carries a one-byte signed type code after the tag: -1 (byte 0xFF) is reserved for the standard Timestamp, encoded as seconds plus nanoseconds, while codes 0-127 are free for applications. User-defined extension types let you embed custom binary data (UUIDs, decimals, vector embeddings) without resorting to base64. The receiver needs to know the extension's meaning, so extensions are best left in-house. Cross-organisation MessagePack should stick to the built-in types.
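As an illustration of the layout, the sketch below packs the standard Timestamp in its 32-, 64- and 96-bit forms, and a UUID as a 16-byte user extension. The helper names are made up for this example, and the UUID type code 1 is a hypothetical in-house choice, not something the spec defines.

```python
import struct
import uuid

TIMESTAMP_TYPE = -1  # reserved by the MessagePack spec
UUID_TYPE = 1        # hypothetical application-defined code (0-127)


def pack_timestamp(seconds: int, nanoseconds: int = 0) -> bytes:
    """Encode a Timestamp using the shortest of the three standard layouts."""
    if not 0 <= nanoseconds < 1_000_000_000:
        raise ValueError("nanoseconds must be in 0..999999999")
    if nanoseconds == 0 and 0 <= seconds < 2**32:
        # timestamp 32: fixext 4 (0xD6), type -1, uint32 seconds
        return struct.pack(">BbI", 0xD6, TIMESTAMP_TYPE, seconds)
    if 0 <= seconds < 2**34:
        # timestamp 64: fixext 8 (0xD7), 30-bit nanoseconds above 34-bit seconds
        return struct.pack(">BbQ", 0xD7, TIMESTAMP_TYPE,
                           (nanoseconds << 34) | seconds)
    # timestamp 96: ext 8 (0xC7), length 12, uint32 nanoseconds, int64 seconds
    return struct.pack(">BBbIq", 0xC7, 12, TIMESTAMP_TYPE, nanoseconds, seconds)


def pack_uuid(value: uuid.UUID) -> bytes:
    """Encode a UUID as fixext 16 (0xD8) under the assumed type code."""
    return struct.pack(">Bb", 0xD8, UUID_TYPE) + value.bytes


print(pack_timestamp(1).hex(" "))        # d6 ff 00 00 00 01
print(len(pack_timestamp(1, 500)))       # 10
print(len(pack_uuid(uuid.UUID(int=0))))  # 18
```

A UUID costs 18 bytes this way, against 38 for its quoted 36-character text form in JSON, but only a receiver that knows what type code 1 means can interpret it.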