Part 4 presented propagation delay, the delay incurred as packets travel between sending and receiving nodes. The Measuring Latency series has illustrated some of the low-level but nonetheless key contributors to messaging latency, and understanding them can help information technology customers decide between competing low-latency technologies. Many other factors contribute to latency along the message path, including those described below.
Application delay refers to the time an application takes to route, transform, enrich, or apply any other business rules before sending messages to downstream applications. The application's architectural characteristics are key to minimizing this latency. Threading, pipelining, caching, and direct memory access are just a few of the performance design techniques that can reduce it.
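As a sketch of one of these techniques, the example below uses an in-process cache (Python's `functools.lru_cache`) so that repeat messages skip a slow lookup. The `enrich` function, its fields, and its 1 ms lookup cost are hypothetical stand-ins, not part of any real messaging product.

```python
import time
from functools import lru_cache

@lru_cache(maxsize=1024)
def enrich(symbol: str) -> dict:
    """Hypothetical enrichment rule: look up static reference data.

    The first call for a symbol pays the full lookup cost; repeat
    calls are served from the in-process cache."""
    time.sleep(0.001)  # stand-in for a 1 ms reference-data lookup
    return {"symbol": symbol, "lot_size": 100}

# First call misses the cache; the second avoids the lookup entirely.
t0 = time.perf_counter(); enrich("ACME"); cold = time.perf_counter() - t0
t0 = time.perf_counter(); enrich("ACME"); warm = time.perf_counter() - t0
```

The same idea applies whatever the rule is: any per-message work whose inputs repeat is a candidate for caching rather than recomputation.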
Routers and switches can add anywhere from 30 microseconds to 1,000 milliseconds to a message's overall latency, and configuration options on these devices can add even more. Switches, for example, can forward frames with either store-and-forward or cut-through semantics. With store-and-forward, a switch waits until it has received the entire frame before forwarding it. A cut-through switch, on the other hand, operates at wire speed by forwarding a frame as soon as the destination address has been read.
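The difference between the two forwarding modes can be approximated with simple serialization arithmetic. The frame size and link rate below are illustrative assumptions, not figures from any particular switch:

```python
# Per-hop delay a store-and-forward switch adds by buffering a whole
# frame, versus a cut-through switch that forwards once the 6-byte
# destination MAC address has been read. Figures are illustrative.
def serialization_delay_us(size_bytes: int, link_bps: float) -> float:
    """Time to clock size_bytes onto a link, in microseconds."""
    return size_bytes * 8 / link_bps * 1e6

FRAME = 1500     # full-size Ethernet frame, bytes (assumed)
DST_MAC = 6      # bytes a cut-through switch must read before forwarding
GIGABIT = 1e9    # 1 Gb/s link

store_and_forward = serialization_delay_us(FRAME, GIGABIT)  # ~12 us
cut_through = serialization_delay_us(DST_MAC, GIGABIT)      # ~0.05 us
```

At 1 Gb/s the full-frame wait costs roughly 12 microseconds per hop, which is why cut-through forwarding is attractive in latency-sensitive networks.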
Before You Start Measuring
You must choose your endpoints wisely. Your source and destination endpoints will incorporate some or all of the delays presented in this series; if you don't know which delays fall within your message path, it will be almost impossible to act on the results.
Before you measure and collect timestamps, keep in mind the need to synchronize the clocks of your message-processing nodes. Without clock synchronization, or at least an understanding of the variability between those clocks, your measurements lose their integrity. Some other considerations:
- Precision of your timestamps (i.e., will millisecond precision suffice?)
- Latency of your measuring tools (i.e., how much of the overall latency belongs to the testing tools themselves)
- Relevancy of your configuration (i.e., do the software/hardware specs reflect your target environment)
- Network congestion (i.e., have other applications/users been locked out of the testing network)
- Message rate (i.e., measure at a message rate that reflects your target environment)
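One common way to reason about clock variability between two nodes is an NTP-style exchange of four timestamps. The sketch below shows the standard offset and round-trip formulas; the timestamp values are made up for illustration.

```python
# NTP-style estimate of the offset between two unsynchronized clocks,
# from four timestamps in one request/response exchange:
#   t1: client send, t2: server receive, t3: server send, t4: client receive
# All values are in seconds and are invented for this example.
def clock_offset(t1, t2, t3, t4):
    """Estimated offset of the server clock relative to the client clock."""
    return ((t2 - t1) + (t3 - t4)) / 2

def round_trip_delay(t1, t2, t3, t4):
    """Network round-trip time, excluding server processing time."""
    return (t4 - t1) - (t3 - t2)

# Server clock runs ~250 ms ahead; one-way network delay is ~5 ms.
t1, t2, t3, t4 = 0.000, 0.255, 0.256, 0.011
offset = clock_offset(t1, t2, t3, t4)   # ~0.25 s
rtt = round_trip_delay(t1, t2, t3, t4)  # ~0.01 s
```

If the estimated offset is of the same order as the latencies you are trying to measure, one-way timestamps between those nodes are not trustworthy until the clocks are disciplined.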
Organize Your Measurements
Depending on the frequency of measurements, you may find yourself with large amounts of timestamp data between your chosen endpoints. As you work through this data, use the following statistics to summarize its latency characteristics:
- mean – The mean is the sum of the observations divided by the number of observations. When referring to the average, we’re referring to the arithmetic mean.
- median – The number separating the higher half of a sample, a population, or a probability distribution, from the lower half.
- standard deviation – Describes the spread of the data around the mean; it is defined as the square root of the variance.
- percentile – The value of a variable below which a certain percent of observations fall.
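Python's standard `statistics` module covers the first three directly, and a percentile can be taken with a nearest-rank calculation. The latency samples below are invented for illustration; note how a single 350 µs outlier pulls the mean well above the median.

```python
import math
import statistics

# Invented one-way latency samples, in microseconds.
samples = [105, 98, 110, 102, 99, 350, 101, 104, 97, 103]

mean = statistics.mean(samples)      # 126.9 -- dragged up by the outlier
median = statistics.median(samples)  # 102.5
stdev = statistics.stdev(samples)    # sample standard deviation

# 99th percentile by the nearest-rank method.
p99 = sorted(samples)[math.ceil(0.99 * len(samples)) - 1]  # 350
```

Reporting the mean alone would hide exactly the tail behavior that percentiles expose.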
To accurately communicate the latency characteristics of your application, report not only the arithmetic mean but also the median, standard deviation, and percentiles, which describe the spread of the measurements. Some high-performance computing environments, such as electronic trading on Wall Street, are especially sensitive to latencies that fall far from the mean.
I hope you’ve found this series helpful and I look forward to your comments.