Jesus Rodriguez is the CEO of IntoTheBlock, a market intelligence platform for crypto assets. He has held leadership roles at major technology companies and hedge funds. He is an active investor, speaker, author and guest lecturer at Columbia University in New York.
The terms “crypto” and “quant” seem to go perfectly together. Bitcoin and crypto assets were born during one of the most exciting times in capital markets, coinciding with the golden era of quantitative finance. The technological acceleration brought by cloud computing and big data, together with the renaissance of machine learning, has created the perfect storm in favor of the quant revolution. Billions of dollars are shifting every year from discretionary funds into quant vehicles, and Wall Street cannot hire mathematicians and machine learning experts fast enough.
Being a completely digital asset class, crypto seems like the perfect target for quant models. And yet, quant strategies remain constrained to relatively simple techniques such as statistical arbitrage (a pair trade strategy that looks to exploit market inefficiencies in a pair of securities) and we still haven’t seen the emergence of large dominant quant desks in the market. Despite the attractive characteristics of crypto assets for quant strategies, crypto poses unique challenges for quant models and the reality is that most quant strategies in crypto fail. In this article, I would like to explore some of the fundamental but not obvious reasons that can cause the failure of most quant strategies in the crypto space.
By claiming that most quant strategies in crypto fail, I am referring mostly to machine learning strategies. Statistical arbitrage has proven to be an effective mechanism to develop algorithmic strategies, but we should expect those opportunities to disappear as the market increases in size and efficiency. In traditional capital markets, we have seen an explosion in the implementation of machine learning-based quant models and the body of research in the space is growing exponentially.
However, most of the quant strategies proven effective in traditional capital markets are unlikely to work as well when applied to crypto assets. Based on our recent experience at IntoTheBlock working on predictive models and quant strategies, I’ve listed some of the factors that I believe can cause quant models for crypto assets to fail.
1. Small datasets
Many of the machine learning-based quant strategies you find in research papers are trained on decades of data from capital markets. The trading history of most crypto assets can be counted in months, and, even for vehicles like Bitcoin and Ethereum, the datasets remain relatively small. Many machine learning models will have a hard time generalizing any knowledge from such small datasets. Let’s say that you are trying to build a predictive model for the price of an asset like ChainLink (LINK), which has been red-hot in recent days. It turns out LINK has a very short trading history, which is insufficient to train most machine learning models in quant finance.
2. Regular ‘outlier’ events
Although the terms “regular” and “outlier” should not be used in the same sentence, I can’t think of a better term to describe what we experience in crypto assets: massive price crashes or sudden spikes that, in the span of a few hours, change the momentum of a crypto asset. These “outlier” events happen quite frequently with many crypto assets.
From a machine learning perspective, most models will be puzzled by these price movements because they haven’t seen anything similar during training. It’s not surprising that many machine learning quant models got decimated during the flash crash of mid-March or failed to capitalize on the sudden increase in volatility of the last few weeks. It is hard to capture knowledge about those types of events during the training of the model.
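To make the idea of an “outlier” move concrete, here is a minimal Python sketch that flags price moves that are extreme relative to the rest of a series. It uses a robust z-score based on the median absolute deviation (MAD), which is my choice for illustration and not a method described in this article. The function name, the threshold of 5 and the sample prices are all assumptions.

```python
from statistics import median


def flag_outlier_returns(prices, threshold=5.0):
    """Return the indices of prices that completed an outlier move.

    A move is an outlier when the robust z-score of its simple return
    exceeds `threshold` in absolute value. The threshold is an
    illustrative assumption, not a recommended setting.
    """
    returns = [(b - a) / a for a, b in zip(prices, prices[1:])]
    med = median(returns)
    mad = median(abs(r - med) for r in returns)
    if mad == 0:
        # No dispersion at all: nothing can be ranked as an outlier.
        return []
    flagged = []
    for i, r in enumerate(returns):
        # 0.6745 rescales the MAD so the score is comparable to a
        # standard z-score when returns are roughly normal.
        robust_z = 0.6745 * (r - med) / mad
        if abs(robust_z) > threshold:
            flagged.append(i + 1)  # index of the price after the move
    return flagged
```

For example, `flag_outlier_returns([100, 101, 100, 101, 100, 60, 61, 60, 61])` returns `[5]`, the position of the 40% drop. Flagging such moves is the easy part. The harder problem described above is that a model trained before the event has no comparable examples to learn from.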
3. Propensity to overfit
A side effect of the small market datasets in crypto assets is the propensity of most machine learning quant models to overfit or to “optimize for the training dataset.” We constantly see quant models that perform incredibly well during backtesting only to fail when applied to real market conditions.
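The backtest trap described above can be reproduced with a toy experiment. The sketch below is an illustration under stated assumptions, not a real trading setup: the “market” is pure Gaussian noise, the candidate “strategies” are random long/short signals, and every name and parameter (200 candidates, 250 periods, 4% volatility, the seed) is hypothetical. It picks the candidate with the best in-sample result and then scores it on data it has never seen.

```python
import random


def strategy_return(signals, returns):
    """Average per-period return of a +1/-1 signal applied to returns."""
    return sum(s * r for s, r in zip(signals, returns)) / len(returns)


def best_in_sample_vs_out_of_sample(n_strategies=200, n_train=250,
                                    n_test=250, seed=7):
    """Pick the best of many random strategies in-sample, then test it.

    Returns (in_sample_return, out_of_sample_return) for the winner.
    """
    rng = random.Random(seed)
    # Pure-noise market: by construction no strategy has a real edge.
    returns = [rng.gauss(0.0, 0.04) for _ in range(n_train + n_test)]
    # Each candidate is a random sequence of long (+1) / short (-1) bets.
    candidates = [[rng.choice((-1, 1)) for _ in returns]
                  for _ in range(n_strategies)]
    in_sample = [strategy_return(c[:n_train], returns[:n_train])
                 for c in candidates]
    # Selecting on in-sample performance is where overfitting creeps in.
    best = max(range(n_strategies), key=lambda i: in_sample[i])
    out_of_sample = strategy_return(candidates[best][n_train:],
                                    returns[n_train:])
    return in_sample[best], out_of_sample
```

Because the winner is chosen from many candidates, its in-sample return looks attractive even though no candidate has any edge. Its out-of-sample return is just another draw from noise and will typically sit much closer to zero. The smaller the training window, the easier it is for luck to pass as skill.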
4. The regular retraining dilemma
Think about this scenario: You have created a predictive model trained on a few years of Bitcoin trading history, then you experience weeks of almost no volatility followed by a few crazy volatile days (not that it has ever happened before). You would like to retrain the model to capture that knowledge, but how? If you simply retrain the model on the most recent data, there is a strong chance of overfitting, while if you wait, the knowledge might no longer be relevant.
This retraining dilemma is a direct consequence of the “regular outlier events” phenomenon. If you train a model on a dataset from the last 10 years of the S&P 500, you can design a strategy to retrain the model regularly, as it is unlikely the index will deviate too much from its traditional behavior in short periods of time. This regular retraining of models, which has been well adopted in traditional quant strategies, goes out the window when it comes to crypto.
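For reference, the kind of regular retraining schedule used in traditional markets is often expressed as a rolling, walk-forward split: train on a fixed window, evaluate on the period that follows, slide forward and repeat. The Python sketch below shows one way to generate those windows. The function name and parameters are hypothetical, and the sketch does not resolve the dilemma. It only makes the trade-off visible: a short `train_size` reacts quickly but invites overfitting, while a long one dilutes recent events.

```python
def walk_forward_splits(n_samples, train_size, test_size, step=None):
    """Yield (train, test) index ranges in chronological order.

    Each test range starts right after its train range, so the model
    is never evaluated on data that precedes its training window.
    """
    if step is None:
        step = test_size  # by default, test windows do not overlap
    if step <= 0:
        raise ValueError("step must be positive")
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += step
```

With 10 observations, `list(walk_forward_splits(10, 4, 2))` produces three splits: train on indices 0-3 and test on 4-5, train on 2-5 and test on 6-7, then train on 4-7 and test on 8-9.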
5. Data quality and reliability
One of the biggest drawbacks of designing machine learning quant models for crypto assets is the poor quality and reliability of datasets. It is not a secret that many exchange order book datasets are full of records that indicate fake volumes, wash trades or spoofing behavior. Obviously, training a machine learning model using those datasets won’t produce any relevant results. Additionally, almost every week we hear about exchange APIs having outages and shutting down for hours. When was the last time you heard about a Nasdaq API crash? It definitely happens, but not that frequently. That lack of reliability can kill the accuracy of the most robust quant models.
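A basic defense against the reliability problem is to check a dataset for missing intervals before training on it. The sketch below is a minimal Python example that assumes candle data arrives as numeric timestamps at a fixed interval; the function name and the input format are assumptions for illustration. It only detects outage-style gaps and does nothing about fake volume, wash trades or spoofing, which are much harder to detect.

```python
def find_gaps(timestamps, expected_interval):
    """Return (previous_ts, next_ts, missing_bars) for each gap.

    A gap is any pair of consecutive timestamps that are further
    apart than `expected_interval` (same unit as the timestamps).
    `missing_bars` is an estimate based on integer division.
    """
    ordered = sorted(timestamps)
    gaps = []
    for prev, nxt in zip(ordered, ordered[1:]):
        delta = nxt - prev
        if delta > expected_interval:
            missing = delta // expected_interval - 1
            gaps.append((prev, nxt, missing))
    return gaps
```

For one-minute candles with timestamps in seconds, `find_gaps([0, 60, 120, 360, 420], 60)` returns `[(120, 360, 3)]`: three bars are missing between 120 and 360.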
6. Anonymous blockchain records
Blockchain datasets remain one of the richest sources of alpha for quant strategies in the crypto space. But the anonymity of blockchain records makes it really challenging to design meaningful quant models. Let’s say, for instance, that one of the features in a quant strategy leverages the address count in the Ethereum blockchain. Well, addresses that are part of exchanges are fundamentally different from addresses of individual wallets and those are different from miners’ addresses. Labeling blockchain records is essential to design meaningful quant models based on blockchain datasets and, unfortunately, those efforts are still in the very early stages.
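The address-count example can be sketched in a few lines of Python. Assuming a labeling source exists (here a plain dictionary, with made-up addresses and label names), a single raw address count becomes one feature per category. Everything in this sketch is hypothetical, including the "unlabeled" bucket, which in practice is likely to be the largest one given how early labeling efforts are.

```python
from collections import Counter


def address_counts_by_label(active_addresses, labels, default="unlabeled"):
    """Count unique active addresses per label.

    `labels` maps an address to a category such as "exchange" or
    "miner". Addresses without a known label fall into `default`.
    """
    unique = set(active_addresses)  # an address counts once per period
    return Counter(labels.get(addr, default) for addr in unique)


# Hypothetical labels and activity, for illustration only.
known_labels = {"0xaaa": "exchange", "0xbbb": "miner"}
active = ["0xaaa", "0xbbb", "0xccc", "0xaaa"]
counts = address_counts_by_label(active, known_labels)
```

Here `counts` holds one exchange address, one miner address and one unlabeled address, instead of a single undifferentiated total of three.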
7. Factor strategies out the window
Factor models have been at the center of some of the most successful quant strategies in the last two decades. Entire mega funds like AQR were built on the promise of factor investing quant strategies. From the original factors like value, momentum, or quality, factor strategies have grown to hundreds of factors that model relevant behaviors in financial asset classes.
At least until today, most factor strategies have proven to be ineffective in the context of crypto assets. When it comes to crypto, factors like value and quality are not clearly defined and the behavior of others such as momentum defies conventional patterns. This causes many crypto quant desks to spend numerous hours trying to recreate factor-based strategies that are highly unlikely to perform in the crypto space.
8. Simple model fallacy
The field of quantitative finance is rapidly gravitating towards large and complex models that regularly outperform simpler and more specialized models. This trend is a reflection of what’s happening in the entire machine learning space. The advent of deep learning showed us it’s possible to create highly complex neural networks that acquire knowledge in the most unthinkable ways.
Funds like Two Sigma and WorldQuant are actively pushing deep learning research and incorporating ideas coming out of the AI labs of tech giants like Google, Microsoft, or Facebook. Yet, in the world of crypto, most quant strategies still rely on very basic machine learning paradigms like linear regression or decision trees.
Simpler models are unquestionably attractive given that they are easy to understand, but they can have a hard time generalizing knowledge from a complex environment such as the crypto markets. As a machine learning environment, crypto combines the complexity of a financial market with the inefficiencies and uncertainty of a new asset class. Definitely not the best fit for simple quant strategies.
9. Basic quant infrastructures
Complementing the previous point, most quant infrastructures in the crypto space are relatively nascent. A robust quant infrastructure goes beyond good strategies and includes elements such as risk management, backtesting, portfolio management, strategy execution, error recovery and many others. In the crypto space, the quant infrastructure of most hedge funds remains relatively simple, which makes it difficult to operate certain types of strategies.
For instance, suppose that you have designed a beautiful deep learning quant strategy that forecasts the price of Bitcoin based on blockchain datasets. To operate that strategy, a fund would need an infrastructure that collects blockchain records regularly, provides the computing capacity to run deep learning models, includes the appropriate retraining tools, and so on.
Today’s technology has certainly reduced the time and cost required to build a quant infrastructure to run machine learning models, but quant desks remain relatively basic compared to those operating in traditional capital markets.
10. Talent availability
I left the most controversial point for the end. As a financial market, crypto is still failing to attract top quant talent with relevant experience in traditional capital markets. We are still tackling incredibly complex problems, such as forecasting the behavior of an asset class, with relatively simple models, basic infrastructure and poor processes. Talent is a very important, and often overlooked, aspect of growing quant investment as a discipline in the crypto space. There are incredibly talented quant teams in crypto, but they are the exception, not the rule.
These are some points that might cause us to reflect on the current state of quant investment in the crypto space. Crypto is an ideal asset class for quant strategies and, in the long run, quant funds should be the dominant investment vehicle in crypto. The path includes many challenges, but also fascinating opportunities.