A primer on VLQ
Variable-Length Quantity (VLQ) is an way of representing arbitrarly
large integers in bytes. In this blog post we will explore the
details of how this is achieved.
1 Representation Problem
Lets say you are representing a music score. You need to represent the
duration of various notes. For each note you will tell how many
milliseconds it lasts. Although this story is apocryphal it offers a
nice conundrum: how do you represent numbers on an open ended scale?
One way of doing this is Variable-Length Quantity, a code that uses a
a variable amount of bytes and a such is amendable to use in a
VLQ is composed of a number of octets, i.e. eight bits. The most
significant bit of an octet is used to signal if an other octet
follows. The least significant bits of all octets together form the
Lets say I want to represent the number
3435 in VLQ.
110101101011. We can not fit this in a byte. So we will
chop it up from the end in 7-bit blocks
Now we prepend all but the last with a 1-bit to indicate that an octet
follows and prepend a 0-bit to the last, signalling the final octet.
Finally we concatenate them, most significant octet first, into
1001101001101011. An implementation in java is provided.
There are a few variations on VLQ possible. First and foremost is the
order of octets. We agreed to send the most significant octet first,
but you can switch the order around. You would have to change which
groups get a continuation bit prepended, but all in all it is not
A more drastic change is to represent both positive and negative
numbers. In the current code there is no room to encode a sign for
the number. So we need to free up a bit to encode the sign. One way
of doing this is to keep the most significant bit as a continuation
bit, i.e. it signals if an other octet follows. Then use the next
most significant bit to encode the sign of the number.
Under this representation
-3435 would be represented as