Archetype ECS in TypeScript: why I didn't use bitECS or miniplex

Everyone says "use an ECS." Nobody tells you the real decision is how it stores components. That choice shapes the API, the performance envelope, and the limits you'll hit two years later.

The TypeScript ecosystem has two well-built options. bitECS goes struct-of-arrays: each component field is a TypedArray indexed by entity id. It's the throughput pick - designed for high entity counts (tens of thousands and up), and it serializes efficiently because the bytes are contiguous typed arrays. miniplex goes object-based: entities are plain objects, queries are archetype buckets that re-index on add/remove, and the TypeScript integration is the friendliest I've seen. If you want React bindings out of the box, miniplex 2.x is the one.

I used neither for @flare-engine/ecs. I wrote a third approach - bitmask archetypes plus per-component Maps - and accepted one hard limit to get there. This post is about that trade-off.

What an archetype ECS implementation in TypeScript actually decides

An ECS is three things: entities (integer IDs), components (plain data attached to entities), and systems (functions that run over "every entity with components X and Y").

The part libraries differ on is the last word: how do you find all entities with components X and Y?

The naive answer is to scan every entity and check. That is O(n) in entity count and dies at scale. The productive answer is to maintain an index so "give me all entities with Position and Velocity" is nearly free. How you build that index is the design decision.

There are three ways to build that index:

Struct-of-arrays (bitECS). Every field of every component is a typed array indexed by entity ID. Position.x[3] and Position.y[3] are the position of entity 3. CPU cache loves this - you iterate a contiguous block of floats. The cost: your data is parallel arrays, not objects. You can't console.log an entity and see its components. Serialization is a typed-array snapshot. This is the right shape when you're at 10k+ entities and throughput is the primary constraint.
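To make the struct-of-arrays layout concrete, here is a minimal sketch using plain TypedArrays. It is not the bitECS API; MAX_ENTITIES, the array names, and moveAll are hypothetical and exist only to show the shape of the data.

```typescript
// Struct-of-arrays sketch with plain TypedArrays. This is NOT the bitECS API;
// MAX_ENTITIES, positionX, moveAll, etc. are hypothetical names.
const MAX_ENTITIES = 1024;

// One typed array per component field, indexed by entity id.
const positionX = new Float32Array(MAX_ENTITIES);
const positionY = new Float32Array(MAX_ENTITIES);
const velocityX = new Float32Array(MAX_ENTITIES);
const velocityY = new Float32Array(MAX_ENTITIES);

// Move every entity in the id list. When ids are dense and ascending,
// each array is walked in memory order.
function moveAll(entityIds: Uint32Array, count: number, dt: number): void {
  for (let i = 0; i < count; i++) {
    const id = entityIds[i];
    positionX[id] += velocityX[id] * dt;
    positionY[id] += velocityY[id] * dt;
  }
}

const movingIds = new Uint32Array([3]);
positionX[3] = 10;
velocityX[3] = 2;
velocityY[3] = -4;
moveAll(movingIds, movingIds.length, 0.5);
```

Entity 3 has no object anywhere in this sketch: its "position" is just index 3 in two arrays, which is exactly why it is fast to iterate and awkward to inspect.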

Object archetype buckets (miniplex). Entities are plain objects. miniplex indexes them with an archetype model (explicit archetypes in 1.x, reworked into the .with() / .without() query API in 2.x): entities with the same component set land in the same bucket, so a query iterates one array of objects rather than all entities. Great DX, strong TypeScript narrowing, first-class React integration. The cost: you're iterating object heap references, which is harder for the CPU to prefetch, and GC pressure increases with churn. At hundreds of entities this is invisible. At 50k with frequent add/remove it starts to matter.
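The bucket idea can be sketched in a few lines of plain TypeScript. This is not the miniplex API; GameEntity, Mover, moverBucket, and addEntity are hypothetical names, and the sketch only indexes at insertion time rather than re-indexing on component add/remove.

```typescript
// Object-bucket sketch. NOT the miniplex API; every name here is hypothetical.
type Vec2 = { x: number; y: number };
type GameEntity = { position?: Vec2; velocity?: Vec2 };
type Mover = { position: Vec2; velocity: Vec2 };

const allEntities: GameEntity[] = [];
const moverBucket: Mover[] = []; // entities that have position AND velocity

// Type guard: narrows a GameEntity to a Mover when both fields are present.
function isMover(e: GameEntity): e is Mover {
  return e.position !== undefined && e.velocity !== undefined;
}

function addEntity(e: GameEntity): void {
  allEntities.push(e);
  if (isMover(e)) moverBucket.push(e); // index once, at insertion time
}

addEntity({ position: { x: 0, y: 0 }, velocity: { x: 1, y: 0 } });
addEntity({ position: { x: 5, y: 5 } }); // no velocity, stays out of the bucket

// The system iterates the bucket only, never the full entity list.
moverBucket.forEach((m) => {
  m.position.x += m.velocity.x;
});
```

The type guard is where the "strong TypeScript narrowing" comes from: inside the bucket, position and velocity are known to exist.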

Bitmask archetypes plus per-component Maps (flare-engine). Each component type gets a unique integer ID and a bit (1 << id). An entity's current component set is stored as a single number - its archetype bitmask. Component data lives in Map<entityId, data> per component type. "Does entity 3 have Position and Velocity?" is (archetype3 & mask) === mask - one bitwise AND. The cost: Map indirection is not cache-local, and because JavaScript's bitwise operators work on 32-bit integers, a single-number mask limits you to 32 component types.
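The membership check described above fits in one function. A minimal sketch, with hypothetical bit constants standing in for the bits the engine assigns through defineComponent:

```typescript
// Hypothetical component bits; in the engine these come from defineComponent.
const POSITION_BIT = 1 << 0;
const VELOCITY_BIT = 1 << 1;
const HEALTH_BIT = 1 << 2;

// True when every bit set in `mask` is also set in `archetype`.
function hasAll(archetype: number, mask: number): boolean {
  return (archetype & mask) === mask;
}

const movingMask = POSITION_BIT | VELOCITY_BIT;
const archetype3 = POSITION_BIT | VELOCITY_BIT | HEALTH_BIT; // entity 3
const archetype4 = POSITION_BIT | HEALTH_BIT; // entity 4: no Velocity

const entity3Moves = hasAll(archetype3, movingMask); // true
const entity4Moves = hasAll(archetype4, movingMask); // false
```

Extra bits on the entity (Health here) do not affect the result; only the bits named in the mask are tested.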

The flare-engine code

The whole ECS is eight files. Here are the three that carry the bulk of the design.

Component definition

// packages/ecs/src/component.ts
let nextComponentId = 0;
 
export function defineComponent<T = undefined>(name: string, defaults?: () => T): ComponentDef<T> {
  const id = nextComponentId++;
  if (id >= 32) {
    throw new Error(`Maximum 32 component types exceeded (defining "${name}")`);
  }
  return {
    id,
    bit: 1 << id,
    name,
    create: defaults ?? (() => undefined as T),
  };
}

defineComponent hands out sequential IDs and computes a bit. When id hits 32, 1 << 32 wraps to 1 in JavaScript - a silent collision - so the throw is load-bearing, not defensive. Tag components (no data, just presence) pass no defaults; the returned create() returns undefined.
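The wrap is easy to confirm in any JavaScript runtime. This short sketch shows why id 32 has to throw: shift counts are taken modulo 32, so bit 32 aliases bit 0.

```typescript
// Shift counts are masked to 5 bits (taken modulo 32) in JavaScript.
const bit0 = 1 << 0; // 1
const bit31 = 1 << 31; // -2147483648: the sign bit, still a distinct bit
const bit32 = 1 << 32; // 1: identical to bit0

// A 33rd component would share its bit with the first one.
const collides = bit32 === bit0;

// Bit 31 is negative as a number but still works in a mask test.
const bit31Usable = (bit31 & bit31) !== 0;
```

Bit 31 prints as a negative number, which looks alarming in a debugger but is harmless: the mask tests only ever compare bit patterns.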

World - storage and mutation

// packages/ecs/src/world.ts
export class World {
  private archetypes = new Map<number, number>();       // entityId → bitmask
  private componentStores = new Map<number, Map<number, unknown>>(); // componentId → (entityId → data)
 
  add<T>(entityId: number, def: ComponentDef<T>, data?: T): void {
    const arch = this.archetypes.get(entityId);
    if (arch === undefined) return;
 
    let store = this.componentStores.get(def.id);
    if (!store) {
      store = new Map();
      this.componentStores.set(def.id, store);
    }
    store.set(entityId, data !== undefined ? data : def.create());
 
    const newArch = arch | def.bit;          // OR in the bit
    this.archetypes.set(entityId, newArch);
 
    for (const query of this.queries) {      // notify live queries
      query._check(entityId, newArch);
    }
  }
}

The JSDoc on World says explicitly: "Archetype bitmasks enable fast query matching." add ORs the component's bit into the entity's archetype, then calls _check on every registered query. remove ANDs the inverted bit out: arch & ~def.bit. has is (arch & def.bit) !== 0 - a single expression.
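The remove and has operations are not in the excerpt, so here is a sketch of the bit arithmetic they are described as using. It is written as free functions over a plain Map; hasBit, removeBit, and the two bit constants are hypothetical, and the query notification the real World performs is left out.

```typescript
// Sketch of the remove/has bit operations as free functions over a Map.
// Names are hypothetical; query notification and data-store cleanup are omitted.
const archetypesById = new Map<number, number>(); // entityId -> bitmask
const HEALTH_FLAG = 1 << 0;
const POISONED_FLAG = 1 << 1;

function hasBit(entityId: number, bit: number): boolean {
  const arch = archetypesById.get(entityId);
  return arch !== undefined && (arch & bit) !== 0;
}

function removeBit(entityId: number, bit: number): void {
  const arch = archetypesById.get(entityId);
  if (arch === undefined) return; // unknown entity: nothing to do
  archetypesById.set(entityId, arch & ~bit); // clear only this component's bit
}

archetypesById.set(7, HEALTH_FLAG | POISONED_FLAG);
removeBit(7, POISONED_FLAG);
```

After the call, entity 7 still has HEALTH_FLAG set; ~bit flips every bit except the one being removed, so the AND leaves the rest of the archetype untouched.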

Query - the bitmask match

// packages/ecs/src/query.ts
export class Query {
  readonly requiredMask: number;
  readonly excludedMask: number;
 
  private entities: number[] = [];
  private entitySet = new Set<number>();  // O(1) membership
 
  readonly onEntityAdded = new Signal<number>();
  readonly onEntityRemoved = new Signal<number>();
 
  matches(archetype: number): boolean {
    return (
      (archetype & this.requiredMask) === this.requiredMask &&
      (archetype & this.excludedMask) === 0
    );
  }
 
  _check(entityId: number, archetype: number): void {
    const isMatch = this.matches(archetype);
    const wasMatch = this.entitySet.has(entityId);
 
    if (isMatch && !wasMatch) {
      this.entitySet.add(entityId);
      this.entities.push(entityId);
      this.onEntityAdded.emit(entityId);
    } else if (!isMatch && wasMatch) {
      this.entitySet.delete(entityId);
      this.entities.splice(this.entities.indexOf(entityId), 1);
      this.onEntityRemoved.emit(entityId);
    }
  }
}

matches is the whole archetype-query engine. Everything else is bookkeeping.

_check keeps entities[] and entitySet in sync and fires onEntityAdded / onEntityRemoved Signals. Those signals are what let @flare-engine/react subscribe to query membership changes without polling - the React side subscribes to the signal inside a useEffect and gets notified when enemies spawn or despawn, no manual diffing.
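The subscription side can be sketched without React. The engine's Signal class is not shown in this post, so MiniSignal below is a hypothetical stand-in: only emit() is confirmed by the Query excerpt, while connect() and its unsubscribe return value are assumptions.

```typescript
// Hypothetical stand-in for the engine's Signal. Only emit() appears in the
// Query excerpt; connect() and the returned unsubscribe function are assumed.
class MiniSignal<T> {
  private listeners = new Set<(value: T) => void>();

  connect(listener: (value: T) => void): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  emit(value: T): void {
    this.listeners.forEach((listener) => listener(value));
  }
}

const onEnemyAdded = new MiniSignal<number>();
const seenEnemyIds: number[] = [];

const disconnect = onEnemyAdded.connect((id) => {
  seenEnemyIds.push(id);
});
onEnemyAdded.emit(12); // delivered
disconnect();
onEnemyAdded.emit(13); // no listener left, dropped
```

With a shape like this, a React effect would connect in its body and return the disconnect function as its cleanup, so the subscription ends when the component unmounts.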

world.query(Position, Velocity) builds a requiredMask by ORing the bits of the passed components and constructs a Query(mask, 0). The .queryBuilder().with(A).without(B).build() path composes requiredMask and excludedMask separately.
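Mask construction is a fold over the component bits. A rough sketch, where BitDef, maskOf, and matchesMasks are hypothetical names and only the OR-the-bits behaviour is taken from the description above:

```typescript
// Sketch of mask construction. BitDef and both helpers are hypothetical;
// the engine's ComponentDef carries more fields than just the bit.
interface BitDef {
  bit: number;
}

// OR together the bits of every component in the list.
function maskOf(defs: BitDef[]): number {
  let mask = 0;
  for (let i = 0; i < defs.length; i++) mask |= defs[i].bit;
  return mask;
}

// Same two-part test as Query.matches in the excerpt.
function matchesMasks(archetype: number, required: number, excluded: number): boolean {
  return (archetype & required) === required && (archetype & excluded) === 0;
}

const PositionDef: BitDef = { bit: 1 << 0 };
const VelocityDef: BitDef = { bit: 1 << 1 };
const FrozenDef: BitDef = { bit: 1 << 2 };

const requiredMask = maskOf([PositionDef, VelocityDef]); // 0b011
const excludedMask = maskOf([FrozenDef]); // 0b100
```

An entity with archetype 0b011 matches, 0b111 is rejected by the excluded mask, and 0b001 is rejected for missing Velocity.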

Why this shape for this engine

Three reasons.

Readability. world.query(Position, Velocity) reads like English. The underlying data - Map<number, {x: number, y: number}> - prints as a plain object in any debugger or console.log. I spent a significant portion of the engine's early development reading entity state in the Bun REPL. Parallel typed arrays are opaque there.

Reactive queries. The Signal-based onEntityAdded / onEntityRemoved pair means the React layer and the ECS layer share membership information without a polling bridge. When Pan Tvardowski's bullet-spawner adds a Bullet component, every query that requires Bullet reacts immediately - the lifetime/cleanup sweep, the bullet alpha-pulse pass, the devtools snapshot counter. No per-frame scan.

Simplicity of the match. Two ANDs and two comparisons cover "has all required components and none of the excluded ones." Every query - no matter how many with/without clauses - reduces to that same fixed test.

The numbers - measured, off-device

The ecs.step microbench runs world.step(dt) with a single move system (read Position + Velocity, write Position) on Bun, off-device. These numbers measure the ECS step in isolation - not on-device frame time, not the full game loop.

Entities	Step cost	Throughput
500	11.51 µs	86.9K ops/sec
1,000	23.7 µs	42.2K ops/sec
2,000	46.70 µs	21.4K ops/sec
5,000	119.68 µs	8.4K ops/sec

The curve is linear - roughly 23 to 24 ns per entity at every size, no cliff. Pan Tvardowski runs on the order of hundreds of live entities at peak. At that scale the ECS step is well under a millisecond.
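The per-entity figure is just the step cost divided by the entity count. This snippet redoes that arithmetic from the four table rows; the variable names are mine, the numbers are the ones reported above.

```typescript
// Per-entity cost derived from the table: microseconds per step -> ns per entity.
const benchRows = [
  { entities: 500, stepMicros: 11.51 },
  { entities: 1000, stepMicros: 23.7 },
  { entities: 2000, stepMicros: 46.7 },
  { entities: 5000, stepMicros: 119.68 },
];

const nsPerEntity = benchRows.map((row) => (row.stepMicros * 1000) / row.entities);

// Linear scaling shows up as a near-constant per-entity cost across sizes.
const minNs = Math.min(...nsPerEntity);
const maxNs = Math.max(...nsPerEntity);
```

All four values fall between 23 and 24 ns, which is what "linear, no cliff" means in practice: the cost per entity does not grow as the count goes up tenfold.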

The more honest benchmark is the full integration frame: 1,000 entities plus 300 particles, measured off-device on Bun, costs 9.01 ms - and the ECS step is a small slice of that. On a Samsung Galaxy A54 5G under heavy load (50 enemies, 300 bullets), the game drops to around 48 fps. That cliff is not the ECS. The engine's own analysis points to CollisionWorld.step - physics broadphase, not entity queries - as the bottleneck at that load (~350 bodies: 50 enemies + 300 bullets), where the off-device physics microbench crosses its 500-AABB threshold and the per-step cost extrapolates to most of the frame budget. The ECS was never my bottleneck, so I optimized it for readability rather than for a throughput crown I didn't need.

Caveat on the external numbers. The noctjs/ecs-benchmark suite reports bitECS at ~335K ops/sec (packed_5 scenario) and miniplex at ~109K ops/sec. Those numbers come from a different harness measuring a different operation - I cannot put "23.7 µs" and "335K ops/sec" in the same sentence as if they're comparable. What they do tell you: struct-of-arrays is built for throughput at scale. I wasn't at scale.

Limits - where this choice is wrong

The 32-component ceiling is real and hard. JavaScript numbers are 64-bit floats, but the bitwise operators coerce their operands to 32-bit integers, so 1 << id has only 32 distinct values. Past 32 component types you would get a silent bit collision - the throw in defineComponent is the only guard. If your game or simulation has 40+ component types, this architecture forces you to either pack flags into a single component, move to a BigInt multi-word mask, or switch to bitECS. For Pan Tvardowski, the current count is well under 32; I've never approached the wall. But I know where it is.
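For a sense of what the BigInt escape hatch looks like, here is a sketch. It is not something the engine ships; wideBit and hasAllWide are hypothetical names, and BigInt arithmetic is generally slower than 32-bit integer operations, so this trades match speed for headroom.

```typescript
// Sketch of a BigInt mask. Not part of the engine; names are hypothetical.
// BigInt shifts do not wrap at 32, so ids past 31 get their own bits.
function wideBit(id: number): bigint {
  return BigInt(1) << BigInt(id);
}

function hasAllWide(archetype: bigint, mask: bigint): boolean {
  return (archetype & mask) === mask;
}

const wideBit0 = wideBit(0);
const wideBit32 = wideBit(32);
const wideBit40 = wideBit(40);

// An entity carrying component 0 and component 40.
const wideArchetype = wideBit0 | wideBit40;
```

Here component 32 no longer collides with component 0, and the match expression keeps the same shape as the number-based version.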

No cache locality. Map<entityId, data> indirection is the opposite of bitECS's whole point. Each world.get(entityId, Position) is a Map lookup - a hash, a pointer follow, a heap read. At 10k+ entities with a tight system loop this compounds into measurable latency. The envelope for this architecture is hundreds-to-low-thousands of entities. The engine's design notes float a struct-of-arrays path as a possible next step if I outgrow Maps - but it doesn't ship yet. Per-component Maps are the only storage path today. SoA is what I'd reach for if the benchmarks started showing a cliff.

No serialization story. bitECS can snapshot a game state by cloning typed arrays - the bytes are contiguous and typed. Maps of plain objects require a custom serializer. Pan Tvardowski sidesteps this differently: it uses a seeded deterministic RNG for reproducible benchmark runs rather than serializing live state. If you need save-state or netcode out of the box, that's a point for bitECS.
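A custom serializer for Map-based stores does not have to be large. The sketch below is hypothetical (snapshotStores and restoreStores are not engine functions) and assumes every component's data is JSON-safe.

```typescript
// Sketch of a serializer for Map-based component stores. Hypothetical names;
// assumes component data survives a JSON round trip.
type Stores = Map<number, Map<number, unknown>>; // componentId -> (entityId -> data)
type PlainStores = [number, [number, unknown][]][];

function snapshotStores(stores: Stores): string {
  const plain: PlainStores = [];
  stores.forEach((store, componentId) => {
    plain.push([componentId, Array.from(store.entries())]);
  });
  return JSON.stringify(plain);
}

function restoreStores(json: string): Stores {
  const plain = JSON.parse(json) as PlainStores;
  const stores: Stores = new Map();
  for (let i = 0; i < plain.length; i++) {
    const [componentId, entries] = plain[i];
    stores.set(componentId, new Map(entries));
  }
  return stores;
}

const liveStores: Stores = new Map();
const positionStore = new Map<number, unknown>();
positionStore.set(3, { x: 1, y: 2 });
liveStores.set(0, positionStore);

const restoredStores = restoreStores(snapshotStores(liveStores));
```

Two caveats with this approach: tag components stored as undefined come back as null after the JSON round trip, and a full save-state would also need the archetype bitmasks, which this sketch leaves out.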

No multi-threading. becsy is the ECS being built around multi-threading - though as of 0.15.x that multi-threaded execution is still on its roadmap, not yet shipped. This one is single-threaded, and makes no attempt otherwise.

The pattern beyond games

Archetype queries are not game-specific. They are indexed sets of things matching a shape. If you have a kanban board with thousands of cards that can be "blocked", "overdue", "assigned", "flagged" - a bitmask-per-card plus a query for "overdue AND assigned AND NOT blocked" is an O(1) per-card match plus a push-based reactive update. The same pattern applies to chat messages with read/pinned/flagged states, to map markers with type/visible/selected/clustered states, to live feeds where items enter and leave "above-the-fold" visibility.
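The kanban case, sketched with hypothetical flag names and card ids. This version does a one-shot scan to keep it short; a live version would maintain the result list incrementally, the way Query._check does.

```typescript
// Card-state flags as a bitmask. All names are hypothetical.
const BLOCKED = 1 << 0;
const OVERDUE = 1 << 1;
const ASSIGNED = 1 << 2;
const FLAGGED = 1 << 3;

const cardFlags = new Map<string, number>();
cardFlags.set("card-1", OVERDUE | ASSIGNED);
cardFlags.set("card-2", OVERDUE | ASSIGNED | BLOCKED);
cardFlags.set("card-3", ASSIGNED | FLAGGED);

// "overdue AND assigned AND NOT blocked"
const requiredFlags = OVERDUE | ASSIGNED;
const excludedFlags = BLOCKED;

const needsAttention: string[] = [];
cardFlags.forEach((flags, cardId) => {
  if ((flags & requiredFlags) === requiredFlags && (flags & excludedFlags) === 0) {
    needsAttention.push(cardId);
  }
});
```

Only card-1 passes: card-2 is blocked and card-3 is not overdue.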

The ECS framing is the cue: when you're managing thousands of homogeneous items and the interesting logic is "do something to all items matching these conditions," archetype queries are a legitimate state model regardless of whether you're rendering sprites or rendering a React list.

The allocation discipline that makes this safe in a hot loop is a separate piece: the zero-alloc game loops post covers why object storage stays harmless only when the move system pools and mutates in place and never allocates inside step. And the Skia Atlas post shows the other end of the frame: the ECS query for renderable entities feeds the SpriteBatch that draws them in a single draw call.