MessagePack

MessagePack is an efficient binary serialization format. It lets you exchange data among multiple languages like JSON. But it’s faster and smaller. Small integers are encoded into a single byte, and typical short strings require only one extra byte in addition to the strings themselves.

Specification

MessagePack has two concepts: type system and formats.

Types

  • Integer represents an integer
  • Nil represents nil
  • Boolean represents true or false
  • Float represents a IEEE 754 double precision floating point number including NaN and Infinity
  • Raw
    • String extending Raw type represents a UTF-8 string
    • Binary extending Raw type represents a byte array
  • Array represents a sequence of objects
  • Map represents key-value pairs of objects
  • Extension represents a tuple of type information and a byte array where type information is an integer whose meaning is defined by applications or MessagePack specification
    • Timestamp represents an instantaneous point on the time-line in the world that is independent from time zones or calendars. Maximum precision is nanoseconds.

Formats

Overview

format namefirst byte (in binary)first byte (in hex)
positive fixint0xxxxxxx0x00 - 0x7f
fixmap1000xxxx0x80 - 0x8f
fixarray1001xxxx0x90 - 0x9f
fixstr101xxxxx0xa0 - 0xbf
nil110000000xc0
(never used)110000010xc1
false110000100xc2
true110000110xc3
bin 8110001000xc4
bin 16110001010xc5
bin 32110001100xc6
ext 8110001110xc7
ext 16110010000xc8
ext 32110010010xc9
float 32110010100xca
float 64110010110xcb
uint 8110011000xcc
uint 16110011010xcd
uint 32110011100xce
uint 64110011110xcf
int 8110100000xd0
int 16110100010xd1
int 32110100100xd2
int 64110100110xd3
fixext 1110101000xd4
fixext 2110101010xd5
fixext 4110101100xd6
fixext 8110101110xd7
fixext 16110110000xd8
str 8110110010xd9
str 16110110100xda
str 32110110110xdb
array 16110111000xdc
array 32110111010xdd
map 16110111100xde
map 32110111110xdf
negative fixint111xxxxx0xe0 - 0xff

Notation in diagrams

one byte:
+--------+
|        |
+--------+

a variable number of bytes:
+========+
|        |
+========+

variable number of objects stored in MessagePack format:
+~~~~~~~~~~~~~~~~~+
|                 |
+~~~~~~~~~~~~~~~~~+

X, Y, Z and A are the symbols that will be replaced by an actual bit.

nil format

Nil format stores nil in 1 byte.

nil:
+--------+
|  0xc0  |
+--------+

bool format family

Bool format family stores false or true in 1 byte.

false:
+--------+
|  0xc2  |
+--------+

true:
+--------+
|  0xc3  |
+--------+

int format family

Int format family stores an integer in 1, 2, 3, 5, or 9 bytes.

positive fixint stores 7-bit positive integer
+--------+
|0XXXXXXX|
+--------+

negative fixint stores 5-bit negative integer
+--------+
|111YYYYY|
+--------+

* 0XXXXXXX is 8-bit unsigned integer
* 111YYYYY is 8-bit signed integer

uint 8 stores a 8-bit unsigned integer
+--------+--------+
|  0xcc  |ZZZZZZZZ|
+--------+--------+

uint 16 stores a 16-bit big-endian unsigned integer
+--------+--------+--------+
|  0xcd  |ZZZZZZZZ|ZZZZZZZZ|
+--------+--------+--------+

uint 32 stores a 32-bit big-endian unsigned integer
+--------+--------+--------+--------+--------+
|  0xce  |ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|
+--------+--------+--------+--------+--------+

uint 64 stores a 64-bit big-endian unsigned integer
+--------+--------+--------+--------+--------+--------+--------+--------+--------+
|  0xcf  |ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|
+--------+--------+--------+--------+--------+--------+--------+--------+--------+

int 8 stores a 8-bit signed integer
+--------+--------+
|  0xd0  |ZZZZZZZZ|
+--------+--------+

int 16 stores a 16-bit big-endian signed integer
+--------+--------+--------+
|  0xd1  |ZZZZZZZZ|ZZZZZZZZ|
+--------+--------+--------+

int 32 stores a 32-bit big-endian signed integer
+--------+--------+--------+--------+--------+
|  0xd2  |ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|
+--------+--------+--------+--------+--------+

int 64 stores a 64-bit big-endian signed integer
+--------+--------+--------+--------+--------+--------+--------+--------+--------+
|  0xd3  |ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|
+--------+--------+--------+--------+--------+--------+--------+--------+--------+

float format family

Float format family stores a floating point number in 5 bytes or 9 bytes.

float 32 stores a floating point number in IEEE 754 single precision floating point number format:
+--------+--------+--------+--------+--------+
|  0xca  |XXXXXXXX|XXXXXXXX|XXXXXXXX|XXXXXXXX|
+--------+--------+--------+--------+--------+

float 64 stores a floating point number in IEEE 754 double precision floating point number format:
+--------+--------+--------+--------+--------+--------+--------+--------+--------+
|  0xcb  |YYYYYYYY|YYYYYYYY|YYYYYYYY|YYYYYYYY|YYYYYYYY|YYYYYYYY|YYYYYYYY|YYYYYYYY|
+--------+--------+--------+--------+--------+--------+--------+--------+--------+

where
* XXXXXXXX_XXXXXXXX_XXXXXXXX_XXXXXXXX is a big-endian IEEE 754 single precision floating point number.
  Extension of precision from single-precision to double-precision does not lose precision.
* YYYYYYYY_YYYYYYYY_YYYYYYYY_YYYYYYYY_YYYYYYYY_YYYYYYYY_YYYYYYYY_YYYYYYYY is a big-endian
  IEEE 754 double precision floating point number

str format family

Str format family stores a byte array in 1, 2, 3, or 5 bytes of extra bytes in addition to the size of the byte array.

fixstr stores a byte array whose length is upto 31 bytes:
+--------+========+
|101XXXXX|  data  |
+--------+========+

str 8 stores a byte array whose length is upto (2^8)-1 bytes:
+--------+--------+========+
|  0xd9  |YYYYYYYY|  data  |
+--------+--------+========+

str 16 stores a byte array whose length is upto (2^16)-1 bytes:
+--------+--------+--------+========+
|  0xda  |ZZZZZZZZ|ZZZZZZZZ|  data  |
+--------+--------+--------+========+

str 32 stores a byte array whose length is upto (2^32)-1 bytes:
+--------+--------+--------+--------+--------+========+
|  0xdb  |AAAAAAAA|AAAAAAAA|AAAAAAAA|AAAAAAAA|  data  |
+--------+--------+--------+--------+--------+========+

where
* XXXXX is a 5-bit unsigned integer which represents N
* YYYYYYYY is a 8-bit unsigned integer which represents N
* ZZZZZZZZ_ZZZZZZZZ is a 16-bit big-endian unsigned integer which represents N
* AAAAAAAA_AAAAAAAA_AAAAAAAA_AAAAAAAA is a 32-bit big-endian unsigned integer which represents N
* N is the length of data

bin format family

Bin format family stores an byte array in 2, 3, or 5 bytes of extra bytes in addition to the size of the byte array.

bin 8 stores a byte array whose length is upto (2^8)-1 bytes:
+--------+--------+========+
|  0xc4  |XXXXXXXX|  data  |
+--------+--------+========+

bin 16 stores a byte array whose length is upto (2^16)-1 bytes:
+--------+--------+--------+========+
|  0xc5  |YYYYYYYY|YYYYYYYY|  data  |
+--------+--------+--------+========+

bin 32 stores a byte array whose length is upto (2^32)-1 bytes:
+--------+--------+--------+--------+--------+========+
|  0xc6  |ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|  data  |
+--------+--------+--------+--------+--------+========+

where
* XXXXXXXX is a 8-bit unsigned integer which represents N
* YYYYYYYY_YYYYYYYY is a 16-bit big-endian unsigned integer which represents N
* ZZZZZZZZ_ZZZZZZZZ_ZZZZZZZZ_ZZZZZZZZ is a 32-bit big-endian unsigned integer which represents N
* N is the length of data

array format family

Array format family stores a sequence of elements in 1, 3, or 5 bytes of extra bytes in addition to the elements.

fixarray stores an array whose length is upto 15 elements:
+--------+~~~~~~~~~~~~~~~~~+
|1001XXXX|    N objects    |
+--------+~~~~~~~~~~~~~~~~~+

array 16 stores an array whose length is upto (2^16)-1 elements:
+--------+--------+--------+~~~~~~~~~~~~~~~~~+
|  0xdc  |YYYYYYYY|YYYYYYYY|    N objects    |
+--------+--------+--------+~~~~~~~~~~~~~~~~~+

array 32 stores an array whose length is upto (2^32)-1 elements:
+--------+--------+--------+--------+--------+~~~~~~~~~~~~~~~~~+
|  0xdd  |ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|    N objects    |
+--------+--------+--------+--------+--------+~~~~~~~~~~~~~~~~~+

where
* XXXX is a 4-bit unsigned integer which represents N
* YYYYYYYY_YYYYYYYY is a 16-bit big-endian unsigned integer which represents N
* ZZZZZZZZ_ZZZZZZZZ_ZZZZZZZZ_ZZZZZZZZ is a 32-bit big-endian unsigned integer which represents N
* N is the size of an array

map format family

Map format family stores a sequence of key-value pairs in 1, 3, or 5 bytes of extra bytes in addition to the key-value pairs.

fixmap stores a map whose length is upto 15 elements
+--------+~~~~~~~~~~~~~~~~~+
|1000XXXX|   N*2 objects   |
+--------+~~~~~~~~~~~~~~~~~+

map 16 stores a map whose length is upto (2^16)-1 elements
+--------+--------+--------+~~~~~~~~~~~~~~~~~+
|  0xde  |YYYYYYYY|YYYYYYYY|   N*2 objects   |
+--------+--------+--------+~~~~~~~~~~~~~~~~~+

map 32 stores a map whose length is upto (2^32)-1 elements
+--------+--------+--------+--------+--------+~~~~~~~~~~~~~~~~~+
|  0xdf  |ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|   N*2 objects   |
+--------+--------+--------+--------+--------+~~~~~~~~~~~~~~~~~+

where
* XXXX is a 4-bit unsigned integer which represents N
* YYYYYYYY_YYYYYYYY is a 16-bit big-endian unsigned integer which represents N
* ZZZZZZZZ_ZZZZZZZZ_ZZZZZZZZ_ZZZZZZZZ is a 32-bit big-endian unsigned integer which represents N
* N is the size of a map
* odd elements in objects are keys of a map
* the next element of a key is its associated value

ext format family

Ext format family stores a tuple of an integer and a byte array.

fixext 1 stores an integer and a byte array whose length is 1 byte
+--------+--------+--------+
|  0xd4  |  type  |  data  |
+--------+--------+--------+

fixext 2 stores an integer and a byte array whose length is 2 bytes
+--------+--------+--------+--------+
|  0xd5  |  type  |       data      |
+--------+--------+--------+--------+

fixext 4 stores an integer and a byte array whose length is 4 bytes
+--------+--------+--------+--------+--------+--------+
|  0xd6  |  type  |                data               |
+--------+--------+--------+--------+--------+--------+

fixext 8 stores an integer and a byte array whose length is 8 bytes
+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+
|  0xd7  |  type  |                                  data                                 |
+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+

fixext 16 stores an integer and a byte array whose length is 16 bytes
+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+
|  0xd8  |  type  |                                  data                                  
+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+
+--------+--------+--------+--------+--------+--------+--------+--------+
                              data (cont.)                              |
+--------+--------+--------+--------+--------+--------+--------+--------+

ext 8 stores an integer and a byte array whose length is upto (2^8)-1 bytes:
+--------+--------+--------+========+
|  0xc7  |XXXXXXXX|  type  |  data  |
+--------+--------+--------+========+

ext 16 stores an integer and a byte array whose length is upto (2^16)-1 bytes:
+--------+--------+--------+--------+========+
|  0xc8  |YYYYYYYY|YYYYYYYY|  type  |  data  |
+--------+--------+--------+--------+========+

ext 32 stores an integer and a byte array whose length is upto (2^32)-1 bytes:
+--------+--------+--------+--------+--------+--------+========+
|  0xc9  |ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|ZZZZZZZZ|  type  |  data  |
+--------+--------+--------+--------+--------+--------+========+

where
* XXXXXXXX is a 8-bit unsigned integer which represents N
* YYYYYYYY_YYYYYYYY is a 16-bit big-endian unsigned integer which represents N
* ZZZZZZZZ_ZZZZZZZZ_ZZZZZZZZ_ZZZZZZZZ is a big-endian 32-bit unsigned integer which represents N
* N is a length of data
* type is a signed 8-bit signed integer
* type < 0 is reserved for future extension including 2-byte type information

Timestamp extension type

Timestamp extension type is assigned to extension type -1. It defines 3 formats: 32-bit format, 64-bit format, and 96-bit format.

timestamp 32 stores the number of seconds that have elapsed since 1970-01-01 00:00:00 UTC
in an 32-bit unsigned integer:
+--------+--------+--------+--------+--------+--------+
|  0xd6  |   -1   |   seconds in 32-bit unsigned int  |
+--------+--------+--------+--------+--------+--------+

timestamp 64 stores the number of seconds and nanoseconds that have elapsed since 1970-01-01 00:00:00 UTC
in 32-bit unsigned integers:
+--------+--------+--------+--------+--------+------|-+--------+--------+--------+--------+
|  0xd7  |   -1   | nanosec. in 30-bit unsigned int |   seconds in 34-bit unsigned int    |
+--------+--------+--------+--------+--------+------^-+--------+--------+--------+--------+

timestamp 96 stores the number of seconds and nanoseconds that have elapsed since 1970-01-01 00:00:00 UTC
in 64-bit signed integer and 32-bit unsigned integer:
+--------+--------+--------+--------+--------+--------+--------+
|  0xc7  |   12   |   -1   |nanoseconds in 32-bit unsigned int |
+--------+--------+--------+--------+--------+--------+--------+
+--------+--------+--------+--------+--------+--------+--------+--------+
                    seconds in 64-bit signed int                        |
+--------+--------+--------+--------+--------+--------+--------+--------+
  • Timestamp 32 format can represent a timestamp in [1970-01-01 00:00:00 UTC, 2106-02-07 06:28:16 UTC) range. Nanoseconds part is 0.
  • Timestamp 64 format can represent a timestamp in [1970-01-01 00:00:00.000000000 UTC, 2514-05-30 01:53:04.000000000 UTC) range.
  • Timestamp 96 format can represent a timestamp in [-292277022657-01-27 08:29:52 UTC, 292277026596-12-04 15:30:08.000000000 UTC) range.
  • In timestamp 64 and timestamp 96 formats, nanoseconds must not be larger than 999999999.