Market-Making Bot

This write-up is in its 2nd draft but still has some way to go to become fully clear – send feedback!

My favourite aspect of programming is design, meaning abstracting problems and coming up with creative + novel takes on existing concepts to solve them. The projects that have become my specialty are backend services, like bots, that have narrow scopes, simple overall design, and extremely high reliability – I expect to be able to leave them running without monitoring indefinitely once finished. So to put those two sentences together, I think it’d be cool to write up a bot I made, the one with the most creative design, to show off some of my ideas and philosophies.

The Problem

The goal of the project is to make a market-making bot. The thing we’re trading (some crypto derivative) has a market that’s comprised of limit orders, which name prices they’re willing to buy or sell some quantity of the thing at, and market orders, which meet those demands. Market-making is the provision of limit orders, which are a merit for the exchange because they provide information and stability to the market; this is why limit orders earn commission from the exchange, whereas market orders pay commission to it.

The market has 3 prices, ordered as bid price < market price < ask price. These are respectively the

highest price of any existing offer to buy the product,
the price of the product when it was traded most recently,
the lowest price of any offer to sell.

The strategy is to aggressively chase down the bid and ask prices, i.e. have orders in at those prices, so that any trades in the market happen through us, so we earn lots of commission on the volume. This means always quoting the prices on either buy or sell sides that are the worst for us, so that they hit the trades that happen in the market, with the aim that the commission outweighs the losses on following price trends so closely. The amount of the product we hold - the position – should come out to zero, so our order quantities are always set to balance that, and orders are typically only active on one side at a time. This strategy works well if prices are somewhat stable.

Therefore, we’ll call the bid and ask prices the best prices, since the goal of the algorithm is to optimise either towards the market price. In order to make this clearer, for the rest of the write-up, I’ll only be talking about prices of sell orders – the lowest possible is the ask price, which is always above the market price, and the closer (= lower) the price on our own order is to the ask price, the better.

In order for this algorithm to perform best, we want to track prices as closely as possible, as well as getting in on them as fast as possible – trades are hit first-in-first-out at each price level. This makes the problem a fairly straightforward case of low-latency coding – react as fast as possible to place orders at the ever-changing best price.

Coding Philosophy

I call the core idea of the bot being awaitless. Information comes from two sources – updates on market prices tend to be faster than updates on your orders – and we can exploit that by allowing state to not be strongly-synced. A lot of the algorithm is gonna rely on predicting state from price updates, and resyncing based on order updates. Thus, it needs to be reacting to events (price or order), so have a lightweight event loop and asynchronous event handling, and have no state encoded by where in the execution it is currently.

The way to make sense of info from lots of sources affecting lots of bits of state is by defining explicit contracts on each part of the state, specifying the transitions it can go through, and which part of the algorithm is responsible for them. Combining them all gives emergent behaviour that is tractable, rather than chaotic, since each part of the state is easily understood in isolation.

Finally, the way to ensure the algo is working as intended is to derive strict assertions from the state contracts, and enforce them with background checks when the algo is idle. Error-handling needs to be simple and not bloat the code, so we go for a design that can exit unexpectedly at any time, persisting its state properly, and we throw fatal errors whenever the slightest unexpected behaviour happens, so we the devs can catch it and fix it quickly.

Event Loop

A good basic design of a low-latency trading algorithm is to use two threads, one to read a socket with the exchange, and the other to do logic and send down the socket. This is a good compromise of cutting latency (caused by multithreading) while still having truly asynchronous code. The receiver thread runs handlers that send price updates by editing shared-memory integers, and order updates by writing structs into a ring-buffer, which is the optimal implementation of a data channel from one thread to another, i.e. a lock-free queue. Design-wise, the onus is on the main thread.

Aside – Unique Pointers

A trade-off in C++ is whether to put data structs in the ring-buffer directly, or to replace those structs with a unique pointer. Large blocks of data would use a unique pointer – then they’re only allocated once, but the trouble is that reading from the pointer is indirected and ends up in parts of memory that aren’t contiguous with the active program memory, i.e. the main-thread stack frame. This increases CPU operations and worsens cache/prefetch performance, so latency increases. Since the structs had size 16B, I opted to copy them over the ring-buffer – around 3 allocations, but the data is more readily available to the program once it’s been received.

The main thread is split into 3 event-handlers, which have different latency strata, and a live-loop is polling them over and over again (because electricity is assumed free in low-latency trading lol). For this to work, the handlers have to be guarded by trigger conditions that can be checked instantly. They are:

crit (low-latency): this handler reads the state and acts accordingly, i.e. sends instructions to the exchange. So-called because for most situations, only it is on the critical path. Guarded by checking if the price changed.
chan (medium-latency): this handler syncs order updates to the state, reading them off the ring-buffer. (Order updates are heavier than price updates, the latter of which is just edited into shared memory since it’s small enough to be atomic by default). Guarded by checking for a non-zero difference in ring buffer read/write indices.
steady (high-latency): this handler runs if nothing else is running, verifies the state, and fixes it or exits the algorithm if it can’t. Guarded by a 5s activity timeout.

Aside – Cache Invalidation Triggers

One pattern that I invented, which was weird the first time but a Thing once I’d used it again in a different context, is forcing an action by invalidating the piece of data that guards it. Known as re-crit in this context, there are some situations when a crit run should directly follow a chan or steady, because the state likely changed, so in the async context, we force it to happen by deleting the cached price (which is compared to incoming prices before triggering crit). The cached price exists to save the algo from running the crit logic unnecessarily, to keep the event loop nimble. As for the other context: when I made a spreadsheet formatting algorithm, it was important that the cell that triggered it was forcibly recoloured, alongside others where the colour was now wrong, but I couldn’t recolour everything because each reformat was expensive; hence, I invalidated the “cached” colour of that cell – its current colour, having been read in to check if the colours were wrong – so that it triggered the gate. Same idea.

State #1 – Orders

The algo is most easily understood by looking at the code not procedurally, but by its state contracts. So first, let’s think about orders. These are mostly governed by a state number, and the following transitions.

0 non-existent                  [crit] 0 -> 1: place
1 being placed                  [chan] 1 -> 2: ack new order
                                [chan] 1 -> 0: ack instant fill
2 up                            [crit] 2 -> 2: edit
                                [crit] 2 -> 3: predicted fill
                                [crit] 2 -> 4: cancel
                                [chan] 2 -> 2: ack edit
                                [chan] 2 -> 0: ack unpredicted fill   (usually partial; some full)
3 predicted filled (in full)    [chan] 3 -> 0: ack predicted fill     (most full fills)
4 being cancelled               [chan] 4 -> 0: ack cancel

States 1/3/4 are transient; they only exist while there is some information on what happened, but not all of it. If steady sees a transient state, it kills the algorithm (it also performs checks on stable, i.e. state-0/2, orders). There’s a call-and-response dynamic, whereby the faster part of the code (crit) moves orders into transient states, and the slower (chan) resolves them into stable states.

An order’s lifecycle is typically 0 → 1 → 2 → (3/4 →) 0. State 0 signifies that there’s no order; the order object has stale data (since this is low-latency, we don’t signal this with null pointers but rather in-place). Once crit decides to place an order, it progresses it to state 1. But we can’t manage the order until the exchange gives us an ID for it, so we wait for chan to give us that confirmation, which moves it to state 2. These orders are the main objects; we edit state-2 orders to chase prices without waiting for chan by predicting things. There’s a criterion (explained in the next paragraph) that allows us to predict an order fill (fulfilment) just from a price update – recall that crit is tasked with handling price updates because they’re faster than order updates (which chan handles). This moves the order to state 3. Rarely, crit may cancel the order instead, which moves it to state 4, or it gets filled without crit noticing (so crit had left it at 2). Chan resolves all of these options to state 0.

A key illustration of the point of this system is the fill-prediction mechanic, which is one of the two awaitless aspects of the algorithm. If the market price moves to higher than our sell order’s price, then our order’s price is lower/better than the ask, so crit can assume it was fully filled. Predicted fills give crit information on position and price more quickly, so it can put in an order on the other (buy) side without waiting for chan to acknowledge the fill. Chan is still required for a new order to go in on the same side, in order to resolve the old order, because enforcing a one-order-per-side cap in the exchange enforces safety from leaking orders and simplifies the algorithm (only 2 order objects in memory!).

State #2 – Positions and Prices

These variables have known variants, which reflect what’s been confirmed by the absolute record of sequential order updates as read by chan, and expected variants, where crit modifies them directly based on predictions, and chan resyncs them to equal the known variant.

Position:
  pos_known is only updated from confirmed fills (in chan)
  pos_exp is updated from predicted fills (in crit) and confirmed unpredicted fills (in chan)
  pos_exp is set to pos_known during steady

Position tracking is needed to make sure the quantities in our placed orders are always balancing, so we don’t get a residual position, which is risky. The expected variant of position reacts to predicted fills, one of the two awaitless mechanics, described in the last section.

price_exp:
  a temporary variable tracking the last price we attempted to place a (state-2) order at
  - chan sets it back to the known value whenever an order is confirmed after being placed/edited
  - crit converges it towards the market price (in tandem with order edits)

The other awaitless mechanic is predictive price chasing – the ability to edit orders (of state 2) to track the best price from price updates, without regard to order updates (chan), which are slower. This lets us adapt to improvements in price faster, which is a key aim. For our sell orders (for example), we only edit orders to lower prices, even if we’d rather be offering a different quantity at the same price, because an edit always loses us our queue position at that price, and putting the new quantity in a separate order would violate the simplicity/safety of one-order-per-side – it’s an improvement to contemplate. This shows the same call-and-response style between crit and chan; crit lowers price_exp from the known (= last confirmed) price; chan resets it to the known one once it gets new confirmatory info.

Predictive price chasing lets us react to the market price getting lower by chasing it, while fill prediction lets us interpret the market price getting higher and react by placing an order on the other side.

Emergence

To give some idea of how emergence works, I’ll describe an unanticipated behaviour that led me to introduce price_exp – the state aspect with the most interesting dynamics. The moral of the story is that simple state contracts let us make sense of complicated emergent patterns.

Before price_exp, a pattern emerged whereby a bid/ask would improve, but then start alternating with its original value. This would cause a race – here’s an example that actually happened:

ask = 2; order = 2
ask = 1; we edit order to 1
ask = 2; we edit order to 2
ask = 1; we edit order to 1
first edit confirmed; order = 1
(second edit processed by exchange but we don’t yet know; order is now at price 2)
(fills at price 1 are processed by the exchange; ask moves to 2)
ask = 2; we predict fill (incorrectly)
second edit confirmed; order = 2
third edit confirmed; order = 2

What’s happened here is we caught the first move of the ask price from 2 to the better price of 1, but before it could get hit, we yanked it back to 2 thanks to receiving info that the ask had returned. The ask then oscillated again, after we’d received confirmation of getting into the better price, which caused us to predict it had gotten hit when it once-again returned. At the same time, we tried to follow this second oscillation, unsuccessfully as evidenced by the third edit confirmation bearing the worse price – this is because the exchange snaps orders that go in at better-than-best prices back to the best price. The unresolved transient-state (predicted-fill) order led to steady crashing the algorithm 5s later.

My response was to impose the restriction that order edits should only be made if a price improves (gets lower, in our sell example), and price_exp is there to track our most recent attempted price, which blocks any edits to prices worse than itself. In this case, this means we’d only have done the first edit, so the fill would’ve happened and been predicted correctly. The idea is sound theoretically because price_exp, though it may not equal the price that gets confirmed for that order in the end, is always lower than it (in the sell example), so since fills are predicted from the known (last-confirmed) price, the prediction will be correct regardless of whether the edited order was placed before or after the price-movement and corresponding fills. It’s only in the really-precise hypothetical whereby the fill happens while the order is removed but before it can be replaced by the exchange that a fill is mispredicted and the algorithm errors out.