Trading Platform
Overview
My role: End to end, from architecture to deployed system.
Challenge: Run any number of trading algorithms in parallel against over a thousand trading pairs across several centralised crypto exchanges.
Solution: Built and deployed a distributed, scalable, banking-grade streaming platform with high throughput and low latency from exchange data receipt through decisioning to order placement.
Technologies: AWS (EC2, VPC), GitHub, Apache Kafka, Rust, Python, WireGuard
Limitations of Bots
There is an inflection point at around 5-10 trading pairs (sometimes fewer) where monolithic bots break down. They hit practical limits: exchange rate limits, I/O and network saturation, and insufficient parallelism to process and act on high-volume market data quickly enough.
These limits are compounded by the underlying operating system and network infrastructure, both of which can be saturated by the volume of real-time market data.
Worse still, updating or reconfiguring a monolithic bot requires stopping it. That means either leaving open positions unmanaged or waiting an unpredictable amount of time for them to close.
Moving up from the bot concept to the platform concept, therefore, becomes necessary for any reasonably complex, multi-algorithm, multi-pair use case.
Platform to the Rescue
A trading platform is a distributed system of specialised components that collectively provide a robust, high-throughput, low-latency environment for running trading strategies and algorithms.
The cloud made setting up and scaling platforms viable and relatively low-cost compared to dedicated, high-speed internet connections and internal networks serving large servers. It is now quite possible to spin up many servers in an AWS data centre close to the exchanges they need to communicate with.
Specialised instances can then initiate separate connections to the exchanges to receive and process the real-time market data feed, spreading it out across several endpoints, staying within rate limits and network capacity.
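As a minimal sketch of how pairs might be spread across connections, the snippet below splits a list of trading pairs into fixed-size shards, one per ingestion connection. The function name `assign_pairs`, the per-connection limit of 50 and the placeholder pair names are illustrative assumptions, not details of the deployed platform; real limits depend on each exchange.

```python
def assign_pairs(pairs, max_pairs_per_connection):
    """Split trading pairs into shards, one shard per exchange connection.

    Hypothetical helper: the limit per connection is an assumption and
    would in practice come from the exchange's documented rate limits.
    """
    if max_pairs_per_connection < 1:
        raise ValueError("max_pairs_per_connection must be at least 1")
    ordered = sorted(set(pairs))  # deterministic, duplicate-free assignment
    return [
        ordered[i:i + max_pairs_per_connection]
        for i in range(0, len(ordered), max_pairs_per_connection)
    ]


# Placeholder pair names purely for illustration.
all_pairs = [f"PAIR{i:04d}/USDT" for i in range(1000)]
shards = assign_pairs(all_pairs, max_pairs_per_connection=50)
print(len(shards), "connections needed")
```

Each shard can then be handed to its own ingestion instance, so no single connection or host carries the whole feed.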
High-Throughput Event Streaming
Processing large amounts of data in real time requires a fast, reliable, high-capacity core. I used Apache Kafka, a banking-grade event streaming solution, as the core of the platform.
It let every component receive, process and forward data without unnecessary latency, so the algorithms could decide quickly when to trade.
Because Kafka decouples the components, each could be built to its own requirements: Rust for the most demanding components, which handle the bulk of the data flow, and Python for the algorithms, which consume only pre-digested data.
Delivery and Technical Approach
Architecture Overview
The platform was built as a set of specialised components connected via Kafka:
- Ingestion: horizontally scalable market-feed clients using WebSocket connections to exchanges.
- Normalisation: transforms exchange-specific payloads into internal canonical structures.
- Processing: computes indicators and derived streams from the canonical feed.
- Decisioning: runs trading strategies and capital allocation strategies over the derived streams.
- Execution: submits orders back to exchanges and tracks order lifecycle.
Market data and derived outputs were also persisted to databases for later analysis and reuse.
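The normalisation step can be illustrated with a small sketch. The two payload shapes below (`exchange_a`, `exchange_b`), their field names and the `Trade` structure are hypothetical stand-ins chosen for illustration; they do not describe any real exchange's API or the platform's actual canonical schema.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Trade:
    """Hypothetical canonical trade record shared by downstream components."""
    exchange: str
    pair: str
    price: Decimal
    quantity: Decimal
    timestamp_ms: int


def from_exchange_a(raw):
    # Assumed shape: {"s": "BTCUSDT", "p": "50000.10", "q": "0.5", "T": <ms>}
    return Trade("exchange_a", raw["s"], Decimal(raw["p"]),
                 Decimal(raw["q"]), int(raw["T"]))


def from_exchange_b(raw):
    # Assumed shape: {"symbol": "BTC-USDT", "price": 50000.1, "size": 0.5, "ts": <s>}
    return Trade("exchange_b", raw["symbol"].replace("-", ""),
                 Decimal(str(raw["price"])), Decimal(str(raw["size"])),
                 int(raw["ts"]) * 1000)


NORMALISERS = {"exchange_a": from_exchange_a, "exchange_b": from_exchange_b}


def normalise(exchange, raw):
    """Dispatch an exchange-specific payload to its normaliser."""
    return NORMALISERS[exchange](raw)


a = normalise("exchange_a", {"s": "BTCUSDT", "p": "50000.10", "q": "0.5", "T": 1700000000000})
b = normalise("exchange_b", {"symbol": "BTC-USDT", "price": 50000.1, "size": 0.5, "ts": 1700000000})
print(a.pair == b.pair, a.price == b.price, a.timestamp_ms == b.timestamp_ms)
```

Once every feed is reduced to one structure, the processing and decisioning components need no exchange-specific logic.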
Latency and Geography
To minimise network latency, ingestion and execution nodes were deployed close to exchange infrastructure, typically in the same AWS region. The design used many low-cost EC2 instances rather than a single large server, spreading load across multiple independent connections ("many pipes") to prevent ingestion congestion and to stay within exchange rate limits.
Reliability and Scaling
Each node was designed to recover independently. If a node fell behind or missed a signal, it could be restarted quickly, and the system would resume with minimal disruption.
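One way such a restart could work is by resuming from the last committed position in the stream, which is how Kafka consumer groups generally operate. The sketch below models this with an in-memory list standing in for a Kafka partition; the `Worker` class and its methods are illustrative assumptions, not the platform's actual code or the Kafka client API.

```python
class Worker:
    """Hypothetical consumer that reads a log from a committed offset."""

    def __init__(self, log, committed_offset=0):
        self.log = log                  # stand-in for a Kafka partition
        self.offset = committed_offset  # next record to read
        self.processed = []

    def poll(self, max_records):
        """Process up to max_records, then advance the committed offset."""
        batch = self.log[self.offset:self.offset + max_records]
        for record in batch:
            self.processed.append(record)
        # Commit only after the batch is processed, so a crash mid-batch
        # means records are re-read rather than lost (at-least-once).
        self.offset += len(batch)
        return len(batch)


log = list(range(10))

first = Worker(log)
first.poll(4)
committed = first.offset  # in Kafka this would be stored by the broker

# Simulate a crash and a fresh node taking over from the committed offset.
replacement = Worker(log, committed_offset=committed)
while replacement.poll(4):
    pass

print(first.processed + replacement.processed == log)
```

Because the position lives outside the worker, a replacement node needs no state from the failed one to carry on.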
Impact
Compared to a prior monolithic deployment, the distributed platform reduced end-to-end latency to ~300 ms round-trip, measured from market data ingestion to order issuance, while improving throughput and resilience under multi-pair, multi-algorithm load.