This portion of the architecture guide describes the octet stream form of data types understood by the LWMsg marshaller. All data encodings follow several common rules unless otherwise specified:
This section describes the core data types understood by the marshaller, and does not include extended types added by the association and connection abstractions, nor common type aliases which may be reduced to combinations of core types.
Integers are arbitrary-length integral values (although the length must be a multiple of 8 bits). They are encoded starting with the most significant byte and ending with the least. Integers may be signed or unsigned. Signed integers are encoded using two's complement representation. Although the data representation does not impose a limit on the size of integers, LWMsg only supports integers as wide as intmax_t on a given platform. The largest size which is guaranteed to be supported is 64 bits.
Width | Value |
8 | Most significant byte |
... | |
8 | Least significant byte |
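The integer rules above can be sketched in a few lines of Python. This is an illustration only, not LWMsg code: the function names `encode_int` and `decode_int` are hypothetical, and the width is passed in explicitly because the octet stream itself does not carry it.

```python
def encode_int(value: int, width_bits: int, signed: bool) -> bytes:
    """Encode an integer most significant byte first (two's complement if signed)."""
    if width_bits <= 0 or width_bits % 8 != 0:
        raise ValueError("width must be a positive multiple of 8 bits")
    # int.to_bytes raises OverflowError when the value does not fit the width.
    return value.to_bytes(width_bits // 8, byteorder="big", signed=signed)


def decode_int(data: bytes, signed: bool) -> int:
    """Decode a big-endian integer that occupies all of `data`."""
    return int.from_bytes(data, byteorder="big", signed=signed)
```

For example, `encode_int(-2, 16, True)` yields the two octets `ff fe`, which decode to -2 when read as signed and to 65534 when read as unsigned. This is why the decoder must know the signedness from the type description rather than from the stream.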
Pointers represent a potentially-null reference to zero or more contiguous, homogeneous elements of a particular type. If a pointer is not null, it must be unique: no two pointers in an encoded LWMsg octet stream can share a referent.
There are three elements in the octet representation of a pointer:
The first byte of a pointer representation is a flag which indicates whether the pointer is null:
0x00: the pointer is null
0xff: the pointer is non-null
Pointer types may be decorated with an attribute that requires them to be non-null, in which case the indicator byte is omitted entirely.
The number of elements may be determined in three ways:
In the first case, the length of the referent is well-known and is not encoded in the octet stream. In the second case, the length already appears previously in the stream and is not repeated. In the third case, the length is encoded explicitly as a 32-bit unsigned integer. In all three cases, the length specifies the number of elements, not the size in bytes.
Finally, each element of the referent is encoded in order according to the rules of that type. In the case of a zero-terminated referent, the zero element is not encoded in the stream and is not counted in the transmitted length. The decoder implicitly adds it back.
Width | Value |
8 | Indicator flag (omitted for non-nullable pointers) |
32 | Length of referent (omitted for static or correlated length) |
w | Representation of 1st element |
w | Representation of 2nd element |
... | |
w | Representation of nth element |
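As a rough sketch of the pointer layout in the table above, the following Python function assembles the indicator flag, the optional length, and the elements. The name `encode_pointer` and its parameters are hypothetical. The sketch also assumes that nothing follows the indicator byte of a null pointer, which the text implies but does not state outright.

```python
from typing import Any, Callable, Optional, Sequence


def encode_pointer(
    elements: Optional[Sequence[Any]],
    encode_element: Callable[[Any], bytes],
    nullable: bool = True,
    explicit_length: bool = True,
) -> bytes:
    """Encode a pointer referent; `elements` of None stands for a null pointer."""
    out = bytearray()
    if nullable:
        # Indicator flag: 0x00 for null, 0xff for non-null.
        out.append(0x00 if elements is None else 0xFF)
        if elements is None:
            return bytes(out)  # assumption: a null pointer has no further octets
    elif elements is None:
        raise ValueError("a non-nullable pointer cannot be null")
    if explicit_length:
        # Element count (not byte size) as a 32-bit unsigned big-endian integer.
        out += len(elements).to_bytes(4, "big")
    for element in elements:
        out += encode_element(element)
    return bytes(out)
```

With 16-bit unsigned elements, a nullable pointer to `[1, 2]` with an explicit length encodes as `ff 00 00 00 02 00 01 00 02`. Passing `explicit_length=False` models the static and correlated cases, where the count is known from elsewhere.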
Arrays share many characteristics with pointers but can never be null, because they are laid out contiguously in memory within their containing type. Their octet encoding is therefore identical to that of a non-nullable pointer. Arrays support the same set of length determination methods as pointers.
Width | Value |
32 | Length of array (omitted for static or correlated length) |
w | Representation of 1st element |
w | Representation of 2nd element |
... | |
w | Representation of nth element |
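The decoding side of the array layout might look like the sketch below. It is a simplified illustration that assumes fixed-width unsigned integer elements. `decode_array` is a hypothetical name, and `known_length` stands in for a static or correlated length obtained outside the array's own octets.

```python
from typing import List, Optional, Tuple


def decode_array(
    data: bytes, element_width: int, known_length: Optional[int] = None
) -> Tuple[List[int], int]:
    """Return (elements, octets consumed) for an array at the start of `data`."""
    offset = 0
    if known_length is None:
        # Explicit length: 32-bit unsigned element count precedes the elements.
        if len(data) < 4:
            raise ValueError("truncated array length")
        count = int.from_bytes(data[:4], "big")
        offset = 4
    else:
        count = known_length
    end = offset + count * element_width
    if end > len(data):
        raise ValueError("truncated array elements")
    elements = [
        int.from_bytes(data[i:i + element_width], "big")
        for i in range(offset, end, element_width)
    ]
    return elements, end
```

Checking the computed end against the available octets before reading the elements guards against a corrupt or hostile length field.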
Some encodings which are possible in theory are not allowed in practice because they cannot be decoded to a usable in-memory structure. In particular, an array with a variable length cannot occur in the middle of a structure or another array – it must come at the end. This is known as a flexible array member.
Structures are heterogeneous tuples of zero or more members, each of a specific type. Members in structures may be correlated:
The last member of a structure may optionally be an array with a non-static (variable) length. This is known as a flexible array member. A flexible array may not appear in any other position in a structure. A structure with a flexible array member must always be reached through a pointer with a static length of 1 – that is, it may not be a direct member of another structure, of an array, or of a pointer referent with more than 1 element.
The encoding of a structure is merely the encoding of its members in order.
Width | Value |
w1 | Representation of the 1st member |
w2 | Representation of the 2nd member |
... | |
wn | Representation of the nth member |
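To make the member-by-member rule concrete, here is a sketch that encodes a made-up structure with three members: a 16-bit unsigned identifier, a 32-bit unsigned count, and a nullable pointer to 8-bit elements whose length is correlated with the count. The structure, its field widths, and the name `encode_record` are all invented for illustration.

```python
def encode_record(record_id: int, values: bytes) -> bytes:
    """Encode a hypothetical {id, count, values} structure as its members in order."""
    members = [
        record_id.to_bytes(2, "big"),    # 1st member: 16-bit unsigned integer
        len(values).to_bytes(4, "big"),  # 2nd member: 32-bit unsigned count
        # 3rd member: non-null pointer flag, then the elements. The length is
        # correlated with the count member, so it is not repeated here.
        b"\xff" + values,
    ]
    return b"".join(members)
```

Under these assumptions, `encode_record(7, b"\x01\x02")` produces `00 07 00 00 00 02 ff 01 02`: the members simply follow one another in declaration order.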
Unions are a combination of one or more heterogeneous arms, only one of which is present for any given instance. Each arm is associated with a unique integer tag which identifies it. Every instance of a union must be correlated with an integer member of its containing structure. This integer is known as a discriminator and distinguishes which arm of the union instance is active. Only the representation of this active arm is encoded in the octet stream.
Width | Value |
wa | Representation of the active arm |
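The relationship between a discriminator and its union can be sketched as follows. The containing structure here is hypothetical: an 8-bit discriminator member followed by a union with two arms (tag 1 selects a 32-bit unsigned integer, tag 2 a 16-bit signed integer). The names `ARM_ENCODERS` and `encode_tagged` are invented for this example.

```python
# Hypothetical arms, keyed by their integer tags.
ARM_ENCODERS = {
    1: lambda v: v.to_bytes(4, "big", signed=False),  # arm 1: 32-bit unsigned
    2: lambda v: v.to_bytes(2, "big", signed=True),   # arm 2: 16-bit signed
}


def encode_tagged(tag: int, value: int) -> bytes:
    """Encode a structure holding a discriminator and a union it selects."""
    if tag not in ARM_ENCODERS:
        raise ValueError(f"no union arm with tag {tag}")
    # The discriminator is an ordinary integer member of the structure.
    discriminator = tag.to_bytes(1, "big")
    # The union itself contributes only the octets of the active arm.
    return discriminator + ARM_ENCODERS[tag](value)
```

Note that the tag octet belongs to the structure, not to the union: the union's own encoding carries no indication of which arm it holds, so a decoder needs the discriminator to interpret it.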
The following data types are extensions built on the core set using the custom type mechanism (see LWMSG_CUSTOM) and included as part of the standard LWMsg software package.
Handles are opaque, persistent pointers which allow peers joined by an association to reference each other's objects without transmitting them. Handles are the recommended means of maintaining connection state.
A handle's representation consists of its locality and handle ID. The locality is an 8-bit value which specifies the side of an association – local or remote – where the physical object represented by the handle resides. Alternatively, it may indicate that the handle is null. The handle ID is a 32-bit integer distinguishing the handle from all other possible active handles in the session. Handle IDs are arbitrarily assigned by the peer which first creates the handle. Both peers may by chance pick the same handle ID for handles they create; this is allowed because the locality of a handle is also taken into account when resolving the handle ID to an object in memory.
Width | Value |
8 | Locality |
32 | Handle ID (omitted if locality is NULL) |
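A minimal sketch of the handle layout follows, using the locality values 0x00 (null), 0x01 (local to the encoder), and 0x02 (remote to the encoder). The constant and function names are hypothetical, and the handle ID is assumed to be unsigned, which the text does not specify.

```python
from typing import Optional, Tuple

LOCALITY_NULL = 0x00
LOCALITY_LOCAL = 0x01   # local from the encoder's perspective
LOCALITY_REMOTE = 0x02  # remote from the encoder's perspective


def encode_handle(locality: int, handle_id: int = 0) -> bytes:
    """Encode a handle as a locality octet plus an optional 32-bit ID."""
    if locality == LOCALITY_NULL:
        return bytes([LOCALITY_NULL])  # the ID is omitted for null handles
    if locality not in (LOCALITY_LOCAL, LOCALITY_REMOTE):
        raise ValueError("illegal locality value")
    return bytes([locality]) + handle_id.to_bytes(4, "big")


def decode_handle(data: bytes) -> Tuple[int, Optional[int], int]:
    """Return (locality, handle ID or None, octets consumed)."""
    if not data:
        raise ValueError("truncated handle")
    locality = data[0]
    if locality == LOCALITY_NULL:
        return locality, None, 1
    if locality not in (LOCALITY_LOCAL, LOCALITY_REMOTE):
        raise ValueError("illegal locality value")
    if len(data) < 5:
        raise ValueError("truncated handle ID")
    return locality, int.from_bytes(data[1:5], "big"), 5
```

Because the locality is expressed from the encoder's point of view, a receiving peer would presumably interpret 0x01 as referring to an object on the other side of the association and 0x02 as referring to one of its own.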
The locality field has three legal values:
0x00: The handle is null
0x01: The handle is local from the perspective of the encoder
0x02: The handle is remote from the perspective of the encoder
The file descriptor type allows LWMsg applications communicating over UNIX domain sockets to exchange UNIX file descriptors between processes. Because the mechanism to achieve this involves passing special ancillary data to the kernel, the actual file descriptor is not encoded into the representation. Instead, an 8-bit flag is sent indicating whether the file descriptor was valid.
Width | Value |
8 | Validity flag |
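The split between the in-band flag and the out-of-band descriptor can be illustrated with Python's `socket.send_fds` and `socket.recv_fds` (Python 3.9 or later, UNIX only), which wrap the kernel's ancillary data mechanism. This is not how LWMsg is implemented; it is a sketch that assumes the flag octet and the descriptor travel in the same message, with 0x00 marking an invalid descriptor and 0xff a valid one. The names `send_fd` and `recv_fd` are hypothetical.

```python
import socket

FD_INVALID = 0x00
FD_VALID = 0xFF


def send_fd(sock: socket.socket, fd: int) -> None:
    """Send the validity flag in-band; pass a valid descriptor as ancillary data."""
    if fd < 0:
        sock.sendall(bytes([FD_INVALID]))
    else:
        socket.send_fds(sock, [bytes([FD_VALID])], [fd])


def recv_fd(sock: socket.socket) -> int:
    """Receive the flag and return the passed descriptor, or -1 if invalid."""
    data, fds, _msg_flags, _address = socket.recv_fds(sock, 1, 1)
    if not data:
        raise ConnectionError("peer closed the connection")
    if data[0] == FD_INVALID:
        return -1
    if data[0] != FD_VALID:
        raise ValueError("illegal validity flag")
    if not fds:
        raise ValueError("flag says valid but no descriptor arrived")
    return fds[0]
```

The descriptor number returned on the receiving side generally differs from the one that was sent, since the kernel installs a new descriptor in the receiving process. This is why the number itself would be meaningless in the octet stream.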
The flag field has two legal values:
0x00: the file descriptor was invalid (-1)
0xff: the file descriptor was valid and was transmitted as ancillary data
Likewise Message Library, part of the Likewise platform
Copyright © 2018 Likewise Software. All rights reserved.