Designing for Performance on Wall Street – The Storage Dilemma

What and How Much?

Prediction, transparency and compliance all come with a heavy storage price these days. Electronic trading applications, and specifically the algorithms that drive them, depend on access to high-quality historical data both at runtime and at design time, when the data is mined to identify new patterns that can drive future trading strategies. Risk analysis, a discipline that attempts to value a firm’s investments in real time, relies on the ability to detect and predict patterns found by mining the same high-quality historical data. The SEC’s Regulation NMS, which requires that trades execute at the best available price, mandates that firms retain records showing a history of trading at the best price. These records should include both the trades the firm executed and the stock quotes that inspired them.

Previously I showed how electronic trading, regulation and innovation have indirectly resulted in the bandwidth problem and the need for speed. The storage dilemma is closely related. It is a dilemma because deciding what to store, how much of it to store, and how to store it is an inexact science, and each choice can have enormous consequences for storage requirements. Storing every level-1 quote disseminated each day from any of the ECNs, exchanges, and OTC markets can cost as much as 60GB a day (January 2008 conditions). Knowing how to resolve the storage dilemma requires a balanced view of the constraints in the problem and solution domains, as well as a little bit of luck.
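To get a feel for how the daily figure compounds, here is a back-of-the-envelope sketch in Python. It takes the 60GB-a-day estimate quoted above and multiplies it out over a retention period. The 252 trading days per year is my own rough assumption, the three-year period is only an example, and the function name is made up for illustration.

```python
def retained_volume_gb(daily_gb, years, trading_days_per_year=252):
    """Raw storage (in GB) needed to keep every day's capture for `years` years.

    trading_days_per_year=252 is an approximation, not an exact calendar count.
    Compression, indexing overhead and backup copies are deliberately ignored.
    """
    return daily_gb * trading_days_per_year * years


if __name__ == "__main__":
    # 60 GB/day is the level-1 quote estimate (January 2008 conditions).
    raw_gb = retained_volume_gb(daily_gb=60, years=3)
    print(f"Raw quote history for 3 years: {raw_gb / 1000:.1f} TB")
```

Under these assumptions the raw quote history alone lands in the tens of terabytes before a single replica, index or backup is counted, which is why the decision about what to keep matters so much.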

Compliance History

The storage dilemma was initially driven, and in many ways continues to be driven, by regulatory compliance. The US financial accounting scandals that spilled into the start of this century and the resulting debacle, together with the terrorist attacks of 2001, led regulators to write, and in some cases rewrite, the rules surrounding digital communications (voice, chat, email) within and between firms, including the requirement to retain the communication history of every individual within a firm for a predetermined period of time. The general regulations I’m referring to here include the Sarbanes-Oxley Act of 2002, NASD 3010 & 3110, NYSE 342/440/472, and SEC Rule 17a-4.

These regulations are as ambitious as the technological innovations needed to support them. Requiring that firms store all digital communications for a period of three years, for example, is largely based on the fact that with today’s technology you can. From a regulatory standpoint the capacity and capability exist on paper, but from an implementation standpoint it’s not so easy. For example, you can back up anything these days, but can you easily restore from that backup? The same applies to email archiving. Sure, you can store all emails for all members of your organization as far back as you want, but when regulators come asking for specific email records dating back 5 years, the true test begins. Amazingly, and I say amazingly because compliance here is largely determined by the design and implementation of the storage/archival system, incredibly large fines, on the order of hundreds of millions of dollars, are being issued for failure to comply.

The storage dilemma surrounding these regulatory requirements is further fueled by willing and paranoid compliance departments, whose job it is to ensure a firm’s compliance with all applicable regulations, and by sometimes unwilling IT departments, who are fully aware of the financial, technological, and temporal constraints surrounding these ambitious regulations. You’re not supposed to meet halfway on these requirements, but there are many reasons to try to compromise. Finding a balance between the regulatory, financial, temporal and technological requirements is just one example of the storage dilemma.

It Gets Worse

Electronic trading has introduced incredible efficiencies in the markets, resulting in lower per-order profit margins. Simultaneously, the structure of the securities themselves (think mortgage-backed securities and credit default swaps) has become so complex that it is almost impossible to value them. How can a firm design the most intelligent, empirically backed electronic trading algorithm, or the most sophisticated, empirically backed risk analysis model? The answer is access to high-quality historical data. Deriving intelligence from the mining of market data is a key differentiator in the electronic trading and risk analysis space. When you add to this the exponentially increasing data volumes we’ve shown in the bandwidth problem, you have another example of the storage dilemma. There are no easy answers for which market data to store, and how much of it to store. The need is clear, and is even clearer if you consider that Regulation NMS requires that trading firms capture sufficient amounts of quote and trade data to show they are executing trades at the best prices across all market centers.
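To see why both quotes and trades have to be kept, consider a deliberately simplified Python sketch of the kind of check a stored history makes possible: given the last quote from each market center at the time of a trade, was the execution at or better than the best price available? The `Quote` and `Trade` records and both function names are hypothetical, and the logic ignores quote sizes, latency, and every exception the actual rule allows, so treat it as an illustration of the data dependency rather than a compliance test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    venue: str    # market center that published the quote
    symbol: str
    bid: float
    ask: float
    ts: float     # timestamp, seconds


@dataclass(frozen=True)
class Trade:
    symbol: str
    side: str     # "buy" or "sell"
    price: float
    ts: float


def best_bid_ask(quotes, symbol, ts):
    """Best bid and ask across venues, using each venue's latest quote at or before ts."""
    latest = {}
    for q in quotes:
        if q.symbol == symbol and q.ts <= ts:
            prev = latest.get(q.venue)
            if prev is None or q.ts >= prev.ts:
                latest[q.venue] = q
    if not latest:
        return None
    return (max(q.bid for q in latest.values()),
            min(q.ask for q in latest.values()))


def executed_at_or_better(trade, quotes):
    """True/False if the stored quotes support the trade price, None if no quotes exist."""
    best = best_bid_ask(quotes, trade.symbol, trade.ts)
    if best is None:
        return None  # without stored quotes nothing can be demonstrated
    best_bid, best_ask = best
    if trade.side == "buy":
        return trade.price <= best_ask
    return trade.price >= best_bid
```

The `None` branch is the point of the exercise: if the quotes surrounding a trade were never stored, there is nothing to compare the execution against, no matter how good the price was.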

If you build it, they will saturate it…

As we’ve shown at the start of this series, the convergence of regulation, innovation and electronic trading has redefined the magnitude of problems and solutions in the capital markets. Technological innovation, however, is the primary driver. Just as mobile communication devices have inspired increasing amounts of email and chat conversations between related and unrelated (i.e. junk mail) parties, innovations in technology have creatively inspired regulators, investors, brokers, investment banks, exchanges, and hedge funds to stay ahead of their objectives.

Regulation NMS, in particular, implicitly addresses the sophistication of today’s technology, and shows that regulators can be as innovative as for-profit firms in demanding transparency and fairness. Algorithmic trading’s thirst for predictive intelligence is driven by the necessity to be accurate and fast, as we’ve shown in the need for speed. Both transparency and prediction require data, and if you thought email archiving was a lot, wait until you need to store and efficiently mine terabytes upon terabytes of market data, assuming you’ve found the budget or technology to store it all.
