Part 1 of this series presented three types of delays that constitute latency measurements in messaging systems. The first of these, packetization delay, is the subject of this post. All references to packets below use the IPv4 definition of a packet.
What is Packetization Delay?
This delay refers to the time it takes a system to create and fill packets of data for transmission over Internet Protocol (IP) networks. A packet is the fundamental unit of data in IP technologies, and the delay consists of the time it takes to create the packet’s headers plus the time it takes to fill the packet’s upper data layer, or payload, with application-specific data.
A packet’s header is structured with 20 bytes of fixed fields plus an optional, variable-length Options field, as shown in the diagram below. The first component of packetization delay is the time it takes the sending system to populate this header information. Generally speaking, this time is negligible compared to the time it takes to populate the upper layer data portion of the packet. The Maximum Transmission Unit, or MTU, is the largest packet size that any given layer of the IP protocol stack can pass to another layer. The maximum Ethernet frame size is 1518 bytes, which includes 18 bytes of Ethernet framing (header plus frame check sequence). This leaves a 1500-byte MTU at the IP layer, which is generally the largest packet size for IP-related technology. Subtracting the 20 bytes of a minimal IP header yields a maximum payload of 1480 bytes; at the other extreme, the minimum 64-byte Ethernet frame carries at most a 26-byte IP payload.
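The payload arithmetic above can be sketched in a few lines of Python. The constants are the standard Ethernet and minimal IPv4 values; this assumes no IP options are present:

```python
# Standard Ethernet and minimal IPv4 sizes (assumes no IP options).
ETHERNET_OVERHEAD = 18   # 14-byte Ethernet header + 4-byte frame check sequence
MAX_FRAME = 1518         # maximum Ethernet frame size in bytes
MIN_FRAME = 64           # minimum Ethernet frame size in bytes
IP_HEADER = 20           # minimal IPv4 header, no options

ip_mtu = MAX_FRAME - ETHERNET_OVERHEAD                    # MTU seen by the IP layer
max_payload = ip_mtu - IP_HEADER                          # largest upper-layer payload
min_payload = MIN_FRAME - ETHERNET_OVERHEAD - IP_HEADER   # payload in a minimum frame

print(ip_mtu, max_payload, min_payload)  # 1500 1480 26
```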
The second component of packetization delay is the time it takes to ‘hydrate’, or fill, the upper data layer portion of the packet. This time is governed by the size of the upper data layer, the rate of message creation on the sending system, and the message batching algorithm used in the sender’s protocol implementation. Batching multiple messages into a single packet may increase overall message latency, as the batching algorithm waits for additional messages before sending the packet. To put this into context, a real-time streaming application that sends 100 50-byte messages per second generates 5,000 bytes of application data per second; if the minimum Ethernet frame size of 64 bytes is used, each frame carries only 26 bytes of upper data layer information (the remaining 38 bytes are Ethernet framing and the IP header), so the stream requires roughly 192 packets per second. Disabling the batching of messages could improve latency, as packets are sent upon message arrival; however, network and CPU resources may become saturated processing the flood of smaller packets.
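The trade-off can be made concrete with a small back-of-the-envelope calculation. The functions below are illustrative helpers (not from any real library) that estimate the packet rate for a batched message stream, and the worst-case time the first message in a packet spends waiting for the payload to fill:

```python
def packet_rate(msgs_per_sec: int, msg_bytes: int, payload_bytes: int) -> float:
    """Packets per second needed when messages are batched
    back-to-back into fixed-size payloads."""
    return (msgs_per_sec * msg_bytes) / payload_bytes

def max_batching_delay(msgs_per_sec: int, msg_bytes: int, payload_bytes: int) -> float:
    """Worst-case seconds the first message in a packet waits while
    the batching algorithm fills the rest of the payload."""
    msgs_per_packet = payload_bytes / msg_bytes
    return msgs_per_packet / msgs_per_sec

# Figures from the example: 100 messages/sec, 50 bytes each.
print(packet_rate(100, 50, 26))           # ~192 pkts/sec with 26-byte payloads
print(packet_rate(100, 50, 1480))         # ~3.4 pkts/sec with full 1480-byte payloads
print(max_batching_delay(100, 50, 1480))  # ~0.3 s spent waiting to fill a full packet
```

Note how filling full-MTU packets cuts the packet rate by two orders of magnitude but adds nearly 300 ms of batching wait at this message rate.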
Packet Size and Latency
While message size and message rate are directly related to the system’s functional requirements, packet size can be configured to suit the system’s non-functional requirements. One can hypothesize that smaller packet sizes introduce inefficiencies, as network and CPU utilization increase to process the larger number of small packets. Conversely, larger packet sizes mean more time spent waiting to fill each packet, although efficiencies can be gained because network and CPU resources process significantly fewer packets. Determining the optimal packet size for your messaging application requires thorough testing that highlights the impact this size has on latency.
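The two opposing effects can be sketched with a toy model. Everything here is hypothetical, in particular the fixed per-packet processing cost `per_packet_cost_us`; real numbers must come from the testing described above:

```python
def tradeoff(msgs_per_sec: int, msg_bytes: int, payload_bytes: int,
             per_packet_cost_us: float = 10.0):
    """Toy model of the packet-size trade-off.
    per_packet_cost_us is a hypothetical fixed CPU/NIC cost per packet."""
    # Worst-case ms the first message waits for the payload to fill.
    fill_wait_ms = (payload_bytes / msg_bytes) / msgs_per_sec * 1000
    # Packets per second needed to carry the stream.
    pkts_per_sec = msgs_per_sec * msg_bytes / payload_bytes
    # Total per-packet processing cost, in ms of CPU per second.
    cpu_ms_per_sec = pkts_per_sec * per_packet_cost_us / 1000
    return fill_wait_ms, pkts_per_sec, cpu_ms_per_sec

# Sweep payload sizes for a 1,000 msg/sec stream of 50-byte messages.
for payload in (64, 256, 1480):
    print(payload, tradeoff(1000, 50, payload))
```

As the payload grows, the batching wait rises while the packet rate (and hence per-packet overhead) falls, which is exactly the tension the testing needs to resolve.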
The MTU can differ from node to node along the message path. When a packet is larger than the MTU of the next link, it must be fragmented into smaller packets. This fragmentation negatively impacts message latency for two reasons. First, routers must perform the fragmentation operation, which costs time and router resources. Second, downstream nodes must process more packets, which reintroduces the potential inefficiencies described in the section above.
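The cost of crossing a smaller-MTU link can be quantified. The helper below is an illustrative sketch of the IPv4 fragmentation rule, under which every fragment except the last must carry a payload that is a multiple of 8 bytes; it assumes a minimal 20-byte IP header:

```python
import math

def fragment_count(packet_len: int, mtu: int, ip_header: int = 20) -> int:
    """Number of IPv4 fragments needed to forward a packet over a
    link with a smaller MTU. Assumes a minimal 20-byte IP header;
    non-final fragment payloads must be multiples of 8 bytes."""
    if packet_len <= mtu:
        return 1  # fits as-is, no fragmentation
    payload = packet_len - ip_header
    # Largest 8-byte-aligned payload that fits in one fragment.
    frag_payload = (mtu - ip_header) // 8 * 8
    return math.ceil(payload / frag_payload)

print(fragment_count(1500, 1500))  # 1: no fragmentation needed
print(fragment_count(1500, 576))   # 3: a full-size packet over a 576-byte MTU link
```

A single full-size packet crossing a 576-byte MTU link thus becomes three packets, each carrying its own header overhead for downstream nodes to process.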
In part 3 of this series, I’ll cover the second of the three latency delays, namely serialization delay.