A long-form frontend case study built around a kanban board: nested versus normalised data, API sketches (REST snapshot reads, intent-shaped moves, idempotency and 409 reconciliation), client replica and command-queue thinking, ordering trade-offs from dense integers through LexoRank-style strings, SSE as a transport example, a performance slice on code-splitting and lazy card details (plus perceived performance and where virtualisation and caching belong), local Pragmatic drag-and-drop demos plus a separate accessibility demo (keyboard path, menu, live region), reliability failure modes, and an end-of-article testing rubric. Interactives are mostly browser-local teaching aids—no shipped backend walkthrough—and the piece does not compare state-management or data-fetching libraries; it stays at shapes, contracts, and how things fail under concurrency and bad networks.
This guide walks through a kanban-style board application: work items are cards, grouped into columns like Backlog / In progress / Done (as in Trello or Jira). We'll discuss common challenges, patterns, and strategies that address them—and the trade-offs behind those choices.
We'll focus mainly on frontend concerns: how to manage state, how to shape data and APIs, and where things break in production—written for depth. You can use it to prepare for a system design interview, but it also helps you develop a feel for how systems are designed and how to make decisions when you have many options.
Introducing the board application
In practice, a board is a shared place where work is represented as cards in columns. People create and edit cards, and they move them—reordering within a lane or jumping across columns. When several people share a board, you also have to think about concurrent edits, how moves propagate, and what “correct order” means when two gestures land close together.
Products like Jira or Trello made this shape familiar; the ideas transfer, but the hard parts are rarely the rectangle layout. They show up in ordering, retries, collaboration, and the gap between what the UI shows and what the server will accept.
Let's start with something we can touch. The widget below shows columns and ordered cards loaded from a JSON-shaped fixture—drag cards between Backlog, In progress, and Done. It uses Pragmatic drag and drop and local state only (no server), so you see the core UX before we talk about APIs and persistence.
A simple local-only board built with Pragmatic drag and drop; drag cards between columns to see how it works.
Backlog
Setup project structure
Unassigned
Install dependencies
Unassigned
Configure ESLint & Prettier
Charlie Moore
In Progress
Implement user list API
Hannah Smith
Create Board UI
Diana Lopez
Done
Add TailwindCSS setup
Quentin Davis
From here on, it helps to split what you're building into two buckets: functionality (the features users see) and cross-functional requirements (how the system behaves under pressure).
Functionality covers the main features of the application. For our case, we need to render a board, let users create and edit cards, and move cards across columns. We also need to support multiple users working on the same board at the same time.
Cross-functional requirements include reliability, performance (both actual speed and perceived performance), security, accessibility, internationalization, and more. They're just as important as functionality, because they determine whether the product feels solid in the real world.
For that, your implementation should stay correct when the network is slow or flaky (reliability), when the same move is retried (maybe by another user), when two people move cards close together in time, and when users rely on the keyboard for some interactions instead of only drag-and-drop (accessibility).
If you’re preparing for a system design interview, it helps to practice clarifying questions that narrow scope before you draw boxes: what you must ship first, how big the product can get, and which non-functional bar you’re held to. That framing makes the strategies in later sections easier to pick—you’re choosing for a stated problem, not a generic “board app.”
How this guide walks it
There is no single architecture that is “best” for every product and team. Here I use one path I like because it is easy to understand and to follow—especially on the web: thin UI, clear data shape, then persistence—without pretending it is the only respectable approach.
Complicated systems are easier if you do not start from “everything at once.” The same idea is often called a steel thread: the thinnest slice that still threads through the real boundaries of the system for one important use case, shipped so it holds in production—then you grow from that spine. Jade Rubick’s article is the clearest write-up I know of the term.
For this board application, a reasonable split can be as follows:
Start from fixture data (for example a JSON snapshot in the frontend, or assume you can always fetch it), and render the board faithfully—columns, card order, and the fields the UI needs.
Add moving cards with ordering kept in client state only, so you learn how updates and drag-and-drop boundaries behave before you layer on distributed-system concerns.
Persist moves to the server next, one user at a time, until reload shows the same board—and defer notifications, realtime, and conflict UX until that baseline is boring.
The above approach is simple and focuses on the frontend first. You can work safely on the UI (design + code) and iterate quickly. The benefit is you always have something to show a product owner or user, so you can get early feedback and adjust accordingly. You can design with mock data—avatars, default fallbacks, empty states—assuming the backend will eventually return the shape the UI consumes.
The drawback, on the other hand, is you’re delaying integration, which can bite you later. If the backend can’t handle a case efficiently (or the API contract can’t support the UX you designed), you might have wasted time on UI work. You can also make assumptions that you won’t realize are wrong until much later.
Another split, usually my favorite when I'm the one implementing it, can be:
Add a card (only with a title), and call a backend service (the backend can decide to use a mock or actually persist to a database)
Refresh the board and you should now see the new card
Edit the card details (e.g. change the title), and call the update endpoint (which could simply update an in-memory array in the backend)
See the changes applied
And implement more end-to-end integrations (realtime updates, multi-user collaboration, etc.)
The second approach follows the same principle as the first one: we delay architecture decisions until we have to. The reason is simple—those decisions are normally hard to reverse. If you start with a NoSQL database, ship with some data already inside it, and then later realize a relational database is a better fit, you end up dealing with downtime and migration work.
So try to delay the decision until later, when you have a better understanding of what you're building, what the constraints are, and what options you have—so you can make better decisions.
Again, there isn't a right or wrong split here, but the goal stays the same: get a thin, working slice through the system first. After that (when you need them): ack / rollback, idempotent operations, reload that matches the server, two clients converging, WIP limits, automation, and the rest.
Designing the UI
For most frontend engineers, UI is the easiest place to start because it’s concrete and close to what we do every day. So let’s begin with the board page and sketch the happy path first: the board has a few columns, each column already has some cards, and the initial fetch succeeds.
From that snapshot, a reasonable first component list looks like:
Component
Role
Board
Name, meta (e.g. total card count), and the overall layout.
Filters
Status, assignee, labels, etc. (if in scope).
Search
Query input + results behavior (if in scope).
Column
Title, card count / WIP badge, drop target.
Card
Title + lightweight metadata (assignee, labels, due date).
UserAvatar
Consistent avatar + fallback rendering.
But that list is still read-only. Once you ask “where does this data come from?” the UI immediately grows:
Can users add/rename/reorder columns?
Are filters built-in, or can users create saved views?
When a card is edited, is it inline, a side panel, or a modal?
And then there’s the part people forget when they only design the happy path.
Loading
Initially, when the page starts loading, it’s safe to assume not every user has perfect network conditions. They might be on 4G, travelling, or switching networks—so a blank page is rarely the right experience. Design for this state explicitly: use a spinner or a skeleton to make it feel fast while you fetch the snapshot.
A loading skeleton for the board. In production, you’d show this while fetching the initial snapshot.
Errors
In the real world, things go wrong—and the UX here is part of the product. Decide how the UI behaves on timeouts, 5xx, and partial failures (retry, toast, inline error, status page).
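One way to make those decisions explicit is a tiny classifier from failure to UI state. The names and buckets below are assumptions for illustration, not a prescribed contract:

```typescript
type FetchUiState = "retry" | "not-found" | "error-inline";

// Map a failed board fetch to the UI treatment you decided to ship.
// The buckets here are illustrative product rules, not a fixed taxonomy.
function classifyBoardFetchFailure(status: number | "timeout"): FetchUiState {
  if (status === "timeout") return "retry"; // flaky network: offer a retry
  if (status === 429 || status >= 500) return "retry"; // throttled or server fault
  if (status === 404) return "not-found"; // board gone, or no access
  return "error-inline"; // other 4xx: explain inline, do not auto-retry
}
```

The value of writing it down is that timeouts, 429, and 5xx each get a deliberate answer instead of a generic toast.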
Error-state simulator
Toggle a “fetch result” to see the UI state you’d ship.
A simple local-only board built with Pragmatic drag and drop; drag cards between columns to see how it works.
Backlog
Setup project structure
Unassigned
Install dependencies
Unassigned
Configure ESLint & Prettier
Charlie Moore
In Progress
Implement user list API
Hannah Smith
Create Board UI
Diana Lopez
Done
Add TailwindCSS setup
Quentin Davis
Empty state
When there are no cards yet, no columns yet, or no results for a search—what should the user do next (create a card, create a column, start from a template)?
Sprint board
Columns are ready — add the first card to get started.
No cards yet. Start by adding one in To do, then move it across columns.
To do
Tip: create cards here first. You can always move them later.
In Progress
Drop cards here
(Once cards exist, this becomes a real drag target.)
Done
Drop cards here
(Once cards exist, this becomes a real drag target.)
Data modeling
Once we have a solid picture of the UI—what we are including, how it loads, and which edge states we actually own—we can reason much more clearly about data modeling. The question becomes: what shape stays correct when cards move, assignees change, and events arrive out of order?
It’s common to first render from a nested snapshot (often what a read API returns), then introduce a normalized client shape when duplication and partial updates become painful.
Entities and relationships
At minimum:
Entity
Role
Board
Id, title, settings (e.g. column ids in order).
Column
Id, board id, title, order among columns.
Card
Id, board id, column id, position (index or sort key), title and other fields.
User
Id, display name, avatar URL, etc.; cards reference an assignee by user id (or null).
Cardinality: a board has many columns; a column has many cards. In the classic model each card belongs to one column at a time. A user may be the assignee on many cards; each card has at most one assignee in the usual model.
ER sketch with attributes (PK underlined) and associations: each Card may reference a User as assignee (0..1 : 1 on the link). Drag entities to rearrange; pan the canvas by dragging empty space (or scroll / controls).
Board
id (PK)
title
workspaceId
columnIds[] (order)
version?
Column
id (PK)
boardId (FK → Board)
title
order
Card
id (PK)
boardId (FK → Board)
columnId (FK → Column)
position | rank
title
assigneeId? (FK → User)
User
id (PK)
name
avatarUrl
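The entity table above can be written down as TypeScript types. This is a sketch: the optional version field and the position-or-rank choice mirror the ER sketch, not a fixed schema:

```typescript
type BoardId = string;
type ColumnId = string;
type CardId = string;
type UserId = string;

interface Board {
  id: BoardId;
  title: string;
  columnIds: ColumnId[]; // column order lives on the board
  version?: number; // optional optimistic-concurrency counter
}

interface Column {
  id: ColumnId;
  boardId: BoardId;
  title: string;
  order: number; // order among columns
}

interface Card {
  id: CardId;
  boardId: BoardId;
  columnId: ColumnId;
  position: number; // or a string rank (LexoRank-style), per the ordering discussion
  title: string;
  assigneeId: UserId | null; // at most one assignee in the classic model
}

interface User {
  id: UserId;
  name: string;
  avatarUrl: string;
}
```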
Nested snapshot
The following is a typical response from the server side: columns at the top level, each column embedding its cards array with the fields the UI needs.
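A sketch of such a snapshot, using the cards and user from the demos in this article (values are illustrative). Note that the same user object is embedded once per assigned card:

```typescript
interface NestedCard {
  id: string;
  title: string;
  assignee: { id: number; name: string; avatar_url: string } | null;
}

// A nested read response: columns at the top level, cards embedded in order.
const boardSnapshot: {
  id: string;
  title: string;
  columns: { id: string; title: string; cards: NestedCard[] }[];
} = {
  id: "board-1",
  title: "Board 1",
  columns: [
    {
      id: "col-1",
      title: "Backlog",
      cards: [
        { id: "card-1", title: "Setup project structure", assignee: null },
        {
          id: "card-3",
          title: "Configure ESLint & Prettier",
          assignee: { id: 2, name: "Charlie Moore", avatar_url: "/avatars/2.png" },
        },
      ],
    },
    {
      id: "col-2",
      title: "In Progress",
      cards: [
        {
          id: "card-2",
          title: "Install dependencies",
          assignee: { id: 2, name: "Charlie Moore", avatar_url: "/avatars/2.png" },
        },
      ],
    },
    { id: "col-3", title: "Done", cards: [] },
  ],
};
```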
Such a data structure is easy to use on the frontend, because we can render components by traversing the tree in a straightforward way: walk the columns array, and for each column walk its cards. The payload lines up naturally with a Board → Column → Card component tree.
JSX
function Board({ data }) {
  return data.columns.map((col) => <Column key={col.id} data={col} />);
}

function Column({ data }) {
  return data.cards.map((card) => <Card key={card.id} data={card} />);
}

function Card({ data }) {
  return (
    <article>
      <h2>{data.title}</h2>
    </article>
  );
}
On the other hand, a nested structure like this makes it harder to update a particular piece of data when the same entity is copied in more than one place. In the example above, both card-2 and card-3 are assigned to Charlie Moore, and Charlie’s profile (including avatar_url) appears twice in the tree. If the user changes their avatar on a profile screen, how do we bring the board back in sync? We can refresh the whole board and re-render everything, or we can search the tree and apply a precise update to every node that embeds that user—but either approach takes extra effort, and as the board grows it is easy to miss a nested copy or to do redundant work.
Stale object identity (“zombie card”)
Flattening is not only about avoiding duplicated embeds (like the same avatar_url on two cards). Another failure mode appears when two parts of the UI think they are editing the same card, but a move uses object spread ({ ...card }) so the column now holds a new object while a detail panel or inspector still holds a reference to the old one. That old object is no longer in your React state tree for the board, yet mutating it (or binding a form to it) looks like it should work — classic identity fork.
When the detail drawer edits a stale object
If a detail view keeps a reference to a card object nested in column state, and the move replaces that slot with { ...card }, typing in the drawer can update a detached object while the board renders the clone — a "zombie" card. Two identical columns: click the title to open the drawer, and drag the grip to move between lanes (same Pragmatic drag and drop setup as the other board demos in this article).
Nested + captured ref
To do
Doing
—
Drop here
Click the card title to open the detail drawer. Use the grip to drag between columns (same Pragmatic DnD stack as the board demo).
Normalized client state
A better approach is to flatten the data into a format where we only need to update one place for a given entity, and every usage of that entity is derived from that single source of truth. References between entities are ids, not duplicated objects. In a database or in traditional backend design this is usually called normalisation; on the frontend you will see the same idea in write-ups about normalized state and client caches (for example Redux-style stores). A normalised snapshot that carries the same board might look like this:
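Concretely, the normalised snapshot might look like this. The field names follow the fixture used in this article (users, cards, columns, columnIds; assignee is a user id or null, with users keyed by that id as a string); values are illustrative:

```typescript
// The same board, normalised: one record per entity, references by id,
// and each column holding an ordered list of card ids instead of card objects.
const boardNormalised = {
  columnIds: ["col-1", "col-2", "col-3"], // column order
  columns: {
    "col-1": { id: "col-1", title: "Backlog", cards: ["card-1", "card-3"] },
    "col-2": { id: "col-2", title: "In Progress", cards: ["card-2"] },
    "col-3": { id: "col-3", title: "Done", cards: [] as string[] },
  },
  cards: {
    "card-1": { id: "card-1", title: "Setup project structure", assignee: null as number | null },
    "card-2": { id: "card-2", title: "Install dependencies", assignee: 2 as number | null },
    "card-3": { id: "card-3", title: "Configure ESLint & Prettier", assignee: 2 as number | null },
  },
  users: {
    "2": { id: 2, name: "Charlie Moore", avatar_url: "/avatars/2.png" },
  },
};
```

Charlie's profile now exists exactly once; changing the avatar means updating one record.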
Same data as snippets/board-1.json → board-1-normalised.json. Before is true nesting (boxes inside boxes — no link lines). After is flat maps; shared layoutIds let Framer Motion carry columns, cards, and the user into their map rows.
Before — nested DOM
Board
Board 1
Columns and cards are physically nested inside this box
Backlog
card-1
Setup project structure
card-3
Configure ESLint & Prettier
User
Charlie Moore
id 2
In Progress
card-2
Install dependencies
Done
empty
The interactive issue view below runs the same gestures as the problem demo above — open the drawer from the card title, drag the grip between columns — but the client keeps a single cards[id] record and only moves ids between lanes, so the drawer and the board cannot drift apart.
Same flow with a normalised cards[id] map
Keep one record per card id and only move ids between columns; the drawer always edits cards[id], so the lane and the panel stay aligned after a drag.
Normalized + card id
To do
Doing
—
Drop here
Click the title to open the drawer; drag by the grip to change columns. Title edits stay aligned with the lane chip.
This is not an argument that nested trees are always wrong: if every surface always derives from the latest state (for example by stable id and a single lookup), you stay consistent. The bug is real when identity is split — often after a clone — and something kept the pre-fork reference.
Server canonical model vs client view
Even when the client keeps a normalised map for rendering, persistence still belongs to the server. The server decides what is actually stored: which column a card lives in, in what order, and what happens when two moves race. The client is a mirror—it applies patches, shows optimistic UI, and must be able to reconcile when the server says “here is the truth now.”
The normalised JSON above already hints at two different jobs hiding inside “show the board”:
Look up a card by id — In the example, every full card body lives under cards, keyed by id ("card-1", "card-2", …). When a realtime payload or a mutation response says “something changed on card-3,” you jump straight to cards["card-3"] and update one object. You do not hunt through col-1, then col-2, to find which nested array still holds that card. In TypeScript people often write that table as Record<CardId, Card> or cardsById.
Know the order inside each column — In the same example, columns["col-1"].cards is not an array of full card objects anymore; it is ["card-1", "card-3"]: an ordered list of ids for Backlog. To paint that column in the UI, you walk that list in order, and for each id you read the title and assignee from cards[id]. That list is your source of truth for ordering on the client (until the server sends a correction). People often name that structure columnId → orderedCardId[].
Why keep both instead of only the map? If you only had cards and each card carried columnId, you could still render, but every frame you would be asking “which cards belong to Backlog, and in what order?”—usually by scanning all cards. The per-column id list answers that in one pass: you render at most the cards that appear in ["card-1", "card-3"] for Backlog, in exactly that sequence, without touching card-2 in In Progress. When an event names a single cardId, the map still lets you jump to the record in one step and update assignee, title, or flags without walking the whole board.
So: server owns the canonical order and persistence; client holds a table of cards by id plus, for each col-*, an ordered list of those ids—the same split you already see in the normalised fixture, just spelled out as why it is there.
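That split can be exercised in a few lines. This is a sketch against the normalised shape described above; the function names are mine:

```typescript
interface NormalisedBoard {
  columns: Record<string, { id: string; title: string; cards: string[] }>;
  cards: Record<string, { id: string; title: string; assignee: number | null }>;
}

// Paint one column: walk the ordered id list, look up each card in one step.
function selectColumnCards(state: NormalisedBoard, columnId: string) {
  return state.columns[columnId].cards.map((id) => state.cards[id]);
}

// Apply a targeted event: jump straight to cards[id], no tree walk.
function applyCardUpdated(
  state: NormalisedBoard,
  cardId: string,
  patch: Partial<{ title: string; assignee: number | null }>,
): NormalisedBoard {
  return {
    ...state,
    cards: { ...state.cards, [cardId]: { ...state.cards[cardId], ...patch } },
  };
}
```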
Designing the API
At some point the board has to talk to a real backend, and that is when API design stops being a tidy diagram and turns into a conversation. I have never seen it work well as a solo act—backend and frontend both have constraints worth listening to—but I still think the shape the UI actually consumes (nested vs normalised, field names, how you represent ordering) ought to win when there is tension, because that is what you render and patch every day. We already sketched that side in Data modeling; the API should line up with it instead of fighting it.
Traditionally—and I mean the ordinary, mainstream history of our industry, not an academic detour—most teams still reach for REST-flavoured HTTP + JSON: GET for reads, POST or PATCH for writes, routes that read like resources. Every ecosystem has a DSL that makes the first draft look easy.
Those paths mirror the /api/boards/... style used in the Read and Write reference panels later in this section (including POST /api/boards/:boardId/operations for intent-shaped moves).
What I like to do next is take the operations you already named in Collect constraints—what must exist for v1, how big things can get, whether multiple people are in the same board at once—and map them straight onto endpoints. The interesting work is not the first happy-path sketch; it is what happens when retries, concurrency, and a suddenly enormous backlog show up and your innocent GET is no longer innocent.
Transports
HTTP is still the spine: load the snapshot, submit mutations, and read status codes in DevTools without needing a second mental model. When the product truly needs other people to see moves without hammering refresh, you add something parallel—usually a WebSocket, sometimes server-sent events (SSE)—where the server can push “this operation landed” or a small patch after the fact.
Polling on a timer can be enough when changes are rare and latency is loose. For a shared board, you usually do not know when another collaborator will change something, so you want a push subscription instead of guessing with setInterval. If the client mostly listens and still uses HTTP for writes, SSE is a simple first step; reach for WebSockets when you need symmetric, low-latency messaging both ways.
TypeScript
const es = new EventSource(`/api/boards/${boardId}/events`);

es.addEventListener("card-assigned", (event) => {
  // … handle assignee changes
});

es.addEventListener("card-updated", (event) => {
  // … handle card field updates
});

es.onerror = () => {
  // Reconnect with backoff, surface a toast, or fall back to polling — production paths vary.
};
One mental model: one browser tab performs the REST mutation; the server commits it, then every other tab that holds an open SSE connection receives a push. The diagram below runs a simplified timeline—solid line for the HTTP write, dashed listeners for long-lived streams, then a pulse when the fan-out fires.
REST write, then SSE to everyone else
Topology styled like a dark dashboard graph: card nodes, jewel icon chips, dashed links. POST uses a dedicated bottom port and a U-shaped wire (clear vertical legs) so it never overlaps the flat SSE links. Cyan pulse on write, green on push (muted for the writer echo).
Dashboard-style graph: U-shaped POST from client one to server bottom port, flat horizontal SSE side links, cyan and green pulse dots
Server
Hub · 3 SSE (side) + REST write (U path)
POST in · Online
Client 3
EventSource listener
Listening
Client 2
EventSource listener
Listening
Client 1
REST writer · board tab
POST /api/boards/:boardId/operations
Listening
● HTTP · cyan pulse ● SSE · green pulse ● Writer SSE · dim (ignored)
Press play: cyan U-path for POST, green on flat SSE wires (muted on the writer tab).
Behind the scenes, these push-style APIs line up with the observer idea: subscribe once, react when the source emits. For a longer walkthrough (with diagrams), see Understanding the observer pattern on this site.
Read: board snapshot
Picture the first load of the board page. A natural first contract is a single read that returns enough to paint columns and cards in one round trip.
GET/api/boards/:boardId
Board snapshot for first paint: metadata, columns, and card rows aligned with your client model.
{ "code": "BOARD_NOT_FOUND", "message": "Board does not exist or you do not have access."}
For 200, the body should resemble what you already committed to in Data modeling. Returning a nested columns-with-cards snapshot early on is not a moral failure; it is often the fastest way to ship, as long as you know how you might evolve toward maps and ordered id lists later.
I would treat errors with the same care as loading and empty states in Designing the UI: decide what timeouts, 5xx, 429, missing resources, and validation failures look like in JSON, not only as bare status codes, so the client can retry or explain instead of showing a generic “something went wrong.” A stable code is worth more than a clever English message you will regret string-matching in six months. The 404 block in the reference above is the pattern—pair each status with a body the UI can branch on.
In an interview, showing that you think past 200—including throttle and dependency failures—is a strong signal.
Pagination and large columns
Once the basic fetch contract is clear, the next lens is scalability: the happy and unhappy paths are handled, but volume forces different tactics.
Scale is a separate dimension from correctness. It assumes core behaviour (including failure modes) is solid, yet the amount of data is large enough that loading everything at once stops being acceptable.
If you are somewhere under “a few dozen to low hundreds” of cards, one round trip that pulls the whole board is often totally reasonable—I would not paginate for the sake of doing it. If you are building for a huge backlog—years of cards, a Done column that never shrinks, or many teams on one enormous surface—you want enough to paint the screen quickly, then load more as the user scrolls or drills into a lane.
A dedicated interactive note on pagination patterns and trade-offs is planned for the patterns section; until then, the sketch below stays focused on how infinite scroll fits a column.
That said, for a board application, cursor-based pagination fits better than offset-based pagination, because new items may keep landing at the top of a column while someone is paging, which shifts offsets under the reader. When one column can hold thousands of cards, I would seriously consider splitting reads behind a dedicated route:
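For example, a cursor-paged column read and the client-side merge might look like this. The route, field names, and the dedupe rule are assumptions for illustration:

```typescript
// GET /api/boards/:boardId/columns/:columnId/cards?cursor=<opaque>&limit=50
// The cursor is opaque to the client; the server encodes "after this card/rank".
interface CardPage {
  items: { id: string; title: string }[];
  nextCursor: string | null; // null means no more pages
}

// Append a page to the already-loaded list, deduplicating by id:
// even with cursors, a card can appear twice if it moved between requests.
function appendPage(loaded: { id: string; title: string }[], page: CardPage) {
  const seen = new Set(loaded.map((c) => c.id));
  return [...loaded, ...page.items.filter((c) => !seen.has(c.id))];
}
```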
As the user scrolls, you typically watch a sentinel node with IntersectionObserver (or a debounced scroll handler) so the next cursor-backed batch loads when the sentinel enters the column viewport.
Column viewport + sentinel (IntersectionObserver)
A fixed-height column clips a longer list. Top and bottom fades hint at overflow; the thin scrollbar shows you can scroll. After you scroll once, bringing the sentinel into view triggers the next “page” of cards.
Scrollable column demonstrating infinite scroll with IntersectionObserver
card 01
card 02
card 03
card 04
card 05
sentinel
Scroll inside the column first; when the bottom sentinel meets the viewport, the observer requests the next batch.
Writes, idempotency, and retries
Reads are socially cheap and easy to implement; writes, on the other hand, are where you discover you were never alone in the system—other users, your own double submits, flaky Wi‑Fi, multiple tabs, and all the ways a response can vanish after the server already committed.
You still need a story for conflicts: editing something someone else is changing, or acting on a row another client already deleted. Those resolutions are often product decisions first; engineering implements the contract you agree on.
Start with the simplest case: a user commits a change, the network drops before the response arrives (even though the server already persisted), and they retry the same action. That retry is still the same intent—you do not want a duplicate irreversible effect just because the acknowledgement was lost.
Idempotency is mainly a backend contract, but the UI can still disable or debounce a control until an in-flight mutation settles so you are not spamming retries before the first response returns.
Let's now talk about the move card operation in our board application. There are many cases for such a seemingly simple operation:
move a card within a column (e.g. drag one to the top)
move a card into an empty column
move a card into another column that already has cards
We still care about stable ordering; the next section goes deeper. For now, stick with the simplest mental model: an integer position per card within a column.
For example, suppose there are 5 cards in the column "To do", and we want to move the bottom one into the second position.
This is only an example to show how a seemingly simple gesture can imply a lot of server work. The same idea applies to other domains too: these domain-specific challenges are often the core of system design and good algorithms can make your product materially more efficient than a naive approach.
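With integer positions, that one gesture renumbers most of the column. A sketch, using 0-based indices:

```typescript
// Move the card at fromIndex so it lands at toIndex, then renumber.
// With dense integer positions, every card between the two slots shifts,
// so the server may have to rewrite many rows for one drag.
function moveWithinColumn(cardIds: string[], fromIndex: number, toIndex: number): string[] {
  const next = [...cardIds];
  const [moved] = next.splice(fromIndex, 1);
  next.splice(toIndex, 0, moved);
  return next; // position of card i is now simply i, but most rows changed
}
```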
Moves as operations
The resource-based API above works, but it has limits. The frontend can get complicated maintaining that order list; think of a "Testing" column with 200 cards. Another tricky scenario: if User A moves a card to the top and User B moves a different card to the bottom at the same time, whichever request arrives last completely overwrites the other person's move.
In comparison, we could shift perspective a bit: instead of modelling the change as a resource, we can model it as an operation, like a command in the Command Pattern.
You aren't telling the server what the state is; you are telling it what the user did. That means we can design the move (or any other user action) as a command that states intent—often with an operationId so retries stay idempotent, plus fromColumnId / toColumnId when the card crosses lanes. A tiny payload might be only type, cardId, and afterCardId; richer bodies look like this:
POST/api/boards/:boardId/operations
Append-only intents; server applies against current DB state and returns truth (version / patch).
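Spelled out as a type, a richer body might look like this. The field names follow the prose (operationId, fromColumnId, toColumnId, afterCardId); the exact set is an assumption:

```typescript
interface MoveCardOperation {
  operationId: string; // client-generated, stable across retries (idempotency key)
  type: "move-card";
  cardId: string;
  fromColumnId: string; // where the client believed the card was
  toColumnId: string;
  afterCardId: string | null; // null means "place at the top of the column"
  clientVersion?: number; // last board version the client had seen
}

const op: MoveCardOperation = {
  operationId: "op-123", // in practice a client-generated UUID
  type: "move-card",
  cardId: "card-2",
  fromColumnId: "col-2",
  toColumnId: "col-3",
  afterCardId: null,
};
```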
By sending an operationId, you enable the server to be idempotent: if the client retries after a lost response, the server recognizes the ID and avoids double-applying the move. The afterCardId carries intent; the server is allowed to adjust if the neighbour card moved, and it should return enough truth—final order, version—that the client is not guessing in the dark.
When the client posts to that route, it enters a state of Optimistic Uncertainty. To provide a "snappy" experience, the UI moves the card immediately, but it must track this action as pending (via a local queue or status flag) until the server acknowledges it.
On Success (200 OK), the server returns the new version (or seq) of the board. To keep the client in sync, the response should include a minimal patch—the updated order array and any affected ranks—rather than a full, expensive board snapshot.
On Conflict (409 Conflict), the server rejects the intent because the "world moved" while the user was dragging (e.g., the afterCardId was deleted). Instead of a generic error, the 409 response should carry the current server truth for that column. This allows the client to perform a smart reconciliation: it can roll back the optimistic move or "re-base" the card to a valid position without forcing the user to refresh the entire page.
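When the 409 body carries the column's fresh card order, re-basing the optimistic move can be a pure function. A sketch; the fallback rule (top of the column when the anchor is gone) is an assumption:

```typescript
// Re-apply the user's intent on top of the server's fresh order:
// keep "after anchor" if the anchor survived, else fall back to the top.
function rebase(serverOrder: string[], cardId: string, afterCardId: string | null): string[] {
  const rest = serverOrder.filter((id) => id !== cardId);
  const anchorIdx = afterCardId ? rest.indexOf(afterCardId) : -1;
  const insertAt = anchorIdx === -1 ? 0 : anchorIdx + 1;
  return [...rest.slice(0, insertAt), cardId, ...rest.slice(insertAt)];
}
```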
State management
Once data modeling has settled the shapes you render, and API design has settled how you fetch and mutate on the wire, the next slice of frontend architecture is state management: how the client holds a working copy of that data while the UI is open.
Think of it as a small subsystem under the interface. It owns the source of truth the UI reads: which cards exist, where they sit in column order, what is still optimistic or in-flight, and so on. Alongside that read model you expose operations—functions that transition state in response to gestures, timers, or inbound events (SSE, EventSource, focus refresh, etc.). The UI calls those operations; consumers re-render when the store updates.
A familiar starting point is React Context: a provider holds a store value, descendants subscribe via hooks, and any change (someone moved a card, edited a title, or an assignee’s avatar arrived from the server) flows through context so the tree stays aligned.
A context API for a normalised board can mirror the same field names as the snapshot earlier: users, cards, columns, and columnIds — each card still references an assignee by id (assignee: number | null in the fixture, with users keyed by that id as a string).
Keeping the board normalised in the client (maps by id, column order as id lists) avoids duplicating entities in nested trees—the stale identity problem we walked through with the zombie-card demos—and keeps avatars and assignee labels consistent everywhere.
A thin consumer might resolve the card and assignee from the same tables as the JSON example:
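A sketch of such a consumer, written as a plain selector so the hook itself stays thin. The state shape follows the normalised fixture, with assignee ids stringified to index users:

```typescript
interface BoardState {
  cards: Record<string, { id: string; title: string; assignee: number | null }>;
  users: Record<string, { id: number; name: string; avatar_url: string }>;
}

// Resolve one card plus its assignee from the same tables the snapshot uses.
// A React consumer would call this inside a useBoard()-style hook.
function resolveCard(state: BoardState, cardId: string) {
  const card = state.cards[cardId];
  const assignee = card.assignee === null ? null : state.users[String(card.assignee)];
  return { ...card, assignee };
}
```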
You might implement that store with Redux, Zustand, Jotai, or plain React state—this article stays with principles rather than library shopping; a separate walkthrough can compare stacks another day.
The harder part, once the shape and mutators exist, is behaviour under the network: overlapping requests, retries, and optimistic UI that must not stay wrong for long. A useful pattern is to treat the client as holding not only a snapshot of the board but also a log of in-flight intents—a command queue (or “pending operations” layer) that sits between gestures and the API.
Introducing the command queue
On a busy board, a user is not firing isolated clicks; they produce a stream of intents—three quick moves, a rename, an assignee change. If each handler fires an async request and blindly applies whichever response returns last, network reordering can paint an old outcome on top of a newer one: the UI can snap back because a slow request finished after a fast one.
A queue makes each intent a discrete command (id, type, payload, timestamps, status). The rendered state is then something like: last acknowledged server truth, plus deterministic application of pending commands on the client—so you can define rules for ordering (FIFO for the same resource, merge by version / sequence, or block until a prior command settles). The exact policy is a product and protocol choice; the point is that responses are not applied naïvely by arrival time.
That lines up with the operation id and idempotent POST .../operations idea from API design: the queue can reuse one logical id across retries so the server deduplicates instead of seeing a retry storm of duplicate side effects.
Rollback without guessing
When an optimistic move fails, ad-hoc “put it back” code is easy to get wrong—you get ghost cards in the wrong column or half-updated order. A command object can carry an explicit compensating update for the client (or enough data to derive one): if the API rejects the move, the queue runs the inverse for that command instead of reconstructing state from memory.
In practice, if the server world has moved (someone else deleted the anchor card, or a 409 returned new truth), a frozen inverse may be invalid. Then the right move is reconciliation: apply the server patch or snapshot from the error body / SSE, then re-base or drop pending commands that no longer apply. The queue gives you a single place to implement that policy instead of scattering special cases across handlers.
Framing: snapshot plus pending log
One compact mental model is “UI as a transaction log”: the screen shows the result of base state + pending operations, not only a mutable blob. Undo, retry, and optimistic updates become properties of how commands enter, apply, sync, and settle—not three unrelated features.
A coarse lifecycle runs Capture → Apply → Buffer → Sync → Settle → Reconcile. The diagram below is a left-to-right pipeline (pan and zoom on a narrow viewport); the ordered list under the canvas repeats each step in prose.
Command lifecycle (client pipeline)
Read-only sketch — pan empty space or use controls to zoom. Arrows follow the usual order of operations between gesture and settled server truth.
Capture
Gesture → command object (stable id for retries)
Apply
Reducer / mutator updates local state (optimistic)
Buffer
Pending queue + in-flight affordance
Sync
API drain — operationId, backoff, single-flight
Settle
Archive or compensating / reconcile from response
Reconcile
Server or SSE contradicts client → trim pending
Capture — gesture → command object (stable id for retries).
Apply — reducer / mutator updates local state optimistically.
Buffer — command joins pending queue (in-flight UI on card or column).
Sync — queue drains to the API (operationId, backoff, single-flight per command).
Settle — success removes or archives the command; failure runs compensating logic or reconciliation from the response.
Reconcile — when the server or a push channel contradicts the client, refresh the base and trim or rewrite invalid pending commands.
A command can be as small as a typed object like the one below. The undo field (or an equivalent inverse / snapshot) supports client rollback if the API fails before you have fresh server truth—subject to reconciliation if the board has moved underneath you.
TypeScript
type MoveCardCommand = {
  id: string;
  type: "MOVE_CARD";
  operationId: string;
  payload: {
    cardId: string;
    toColumnId: string;
    afterCardId: string | null;
  };
  /** For client-side rollback if the API fails before server truth arrives */
  undo: {
    cardId: string;
    toColumnId: string;
    afterCardId: string | null;
  };
  status: "pending" | "syncing" | "failed" | "settled";
};
Production systems often persist pending commands (IndexedDB, localStorage) so a refresh mid-flight does not silently drop user intent—that is another step up in complexity worth mentioning only when you need offline or crash recovery.
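As a sketch of the Sync and Settle steps, a drain loop might look like the following. The `QueuedCommand` shape and the injected `send` callback are assumptions: `send` stands in for the idempotent POST to the operations endpoint, and statuses mirror the command object sketched above.

```typescript
type QueuedCommand = {
  id: string;
  operationId: string; // stable across retries so the server can deduplicate
  status: "pending" | "syncing" | "failed" | "settled";
  attempts: number;
};

// Drain the pending queue FIFO, one command in flight at a time (single-flight).
// `send` abstracts the transport; it resolves to the server's verdict.
async function drain(
  queue: QueuedCommand[],
  send: (cmd: QueuedCommand) => Promise<"ok" | "conflict">
): Promise<void> {
  for (const cmd of queue) {
    if (cmd.status === "settled") continue;
    cmd.status = "syncing";
    cmd.attempts += 1; // note: the retry reuses the SAME operationId
    const outcome = await send(cmd);
    if (outcome === "ok") {
      cmd.status = "settled";
    } else {
      cmd.status = "failed"; // run the compensating update / reconcile
      break; // stop draining until server truth is merged
    }
  }
}
```

The `break` on conflict is one deliberate policy choice: later commands may depend on state the server just contradicted, so the queue pauses until reconciliation rewrites or drops them.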
Optimization strategies
Optimisation on a board is a wide topic; ordering is the piece that usually becomes expensive first. We already touched moves and positions in API design—here we zoom in on how you store order so one drag does not rewrite half the database. Accessibility for the same moves (keyboard, menus, screen readers) sits with performance and reliability in the next major section—this block stays focused on rank representation and the demos below.
If you have shipped drag-and-drop before—Kanban like Jira or Trello, a playlist, or a sortable list in React—the gesture feels trivial: drag, drop, the UI updates. Persistence is where the real question appears: how do you store positions efficiently when the list gets large?
Sequential integer indexing
The most direct approach is a numeric position per item: 1, 2, 3, 4, 5. Render by sorting on that field; easy to implement and to reason about.
Now drag the last item between the first and the second. Its new position becomes 2, so the old 2 becomes 3, then 4, then 5—cascade reindexing. With five items you might touch four rows; with five thousand you might touch 4,999. The logic is correct; the problem is scalability: write cost grows linearly with list size. That can be fine for small surfaces; in large, collaborative boards it is usually unacceptable.
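A tiny model of that cost, under the assumption that an item's stored position is simply its index plus one (the `moveDense` helper is illustrative):

```typescript
// Dense integer positions: moving one card renumbers everything after the
// insertion point. Returns the new order plus the ids a per-row UPDATE touches.
function moveDense(
  order: string[],
  id: string,
  toIndex: number
): { order: string[]; touched: string[] } {
  const next = order.filter((x) => x !== id);
  next.splice(toIndex, 0, id);
  // Any id whose index (= position) changed would need a row write.
  const touched = next.filter((x, i) => order[i] !== x);
  return { order: next, touched };
}
```

With five cards, moving the last one into slot 2 touches four rows; with n cards the write cost is O(n), which is exactly the cascade described above.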
Naive integer positions — cascade reindex
One narrow column, five cards. Positions are implicit in sort order (like 1…5). After a drop, any card that moved to a new slot is highlighted—what a naive per-row position update would touch.
Single column with five draggable cards; dropped order shows naive position rewrites
To do
pos 1 · Card 1
pos 2 · Card 2
pos 3 · Card 3
pos 4 · Card 4
pos 5 · Card 5
Drop to append
Try dragging Card 5 onto Card 2 — every card whose slot changes flashes red briefly.
Sparse indexing (gaps between numbers)
A common mitigation is to stop using consecutive integers. Store something like 1000, 2000, 3000, 4000, 5000. To insert between 1000 and 2000, assign a value in the gap—e.g. 1500—and only one row changes. No full cascade.
This is often called sparse indexing or gap-based ordering: leave room so future inserts stay cheap. Many real systems stop here and do well.
The catch is local exhaustion. A common tactic is to keep picking a value between two neighbours—often the integer midpoint: between 1000 and 2000 you might use 1500, then between 1000 and 1500 you get 1250, then 1125, and so on. In one busy stretch the ranks look like 1000, …, 1125, 1250, 1500, …, 2000 rather than a long run of +1 steps. Integers are still discrete, so after enough inserts in the same gap two neighbours become consecutive (e.g. 1500 and 1501) and there is no integer strictly between them anymore. Then you rebalance that segment—e.g. spread back to 1000, 2000, 3000, 4000—which is a batch update. Frequent hot spots mean frequent rebalancing, so you are trading cascades on every move for periodic segment work.
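The midpoint rule and its failure case fit in a few lines; this `midpointRank` helper is a sketch (the name and the null-on-exhaustion convention are assumptions):

```typescript
// Pick an integer strictly between two sparse ranks.
// Returns null when the gap is exhausted (neighbours are consecutive),
// which is the signal to rebalance the segment.
function midpointRank(before: number, after: number): number | null {
  if (after - before < 2) return null; // e.g. 1500 and 1501: no integer fits
  return before + Math.floor((after - before) / 2);
}
```

Repeated inserts into the same gap walk exactly the sequence described above: 1500, then 1250, then 1125, until two neighbours touch and the helper returns null.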
Sparse integer ranks — one row update
Wide pos gaps (1000, 2000, …) so a move usually picks an integer between two neighbours. Same shell as the dense demo—only the moved card flashes when a midpoint exists.
Sparse integer rank ordering demo with drag and drop
To do
pos 1,000 · Card 1
pos 2,000 · Card 2
pos 3,000 · Card 3
pos 4,000 · Card 4
pos 5,000 · Card 5
Drop to append
Try dragging Card 5 (5000) before Card 2 (2000)—you should land on a single midpoint (e.g. 1500). Moving Card 4 between Card 2 and Card 3 only changes Card 4 (e.g. to 2500)—Card 5 stays at 5000 because the list order is still valid. Keep splitting the same gap until no integer fits to see the exhaustion message.
The gap demo lets you subdivide until a midpoint no longer exists. When neighbours are consecutive integers, the usual fix is a segment rebalance: rewrite every rank in that slice in one batch so gaps are wide again—same card order, multiple pos updates.
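The rebalance itself is the simple part; a sketch with assumed defaults (start at 1000, gaps of 1000) makes the batch nature obvious:

```typescript
// Rebalance one tight segment: same card order, fresh wide gaps.
// Every entry in the returned map is a row write, done as one batch.
function rebalance(
  ids: string[],
  start = 1000,
  gap = 1000
): Map<string, number> {
  const next = new Map<string, number>();
  ids.forEach((id, i) => next.set(id, start + i * gap));
  return next;
}
```

The trade is visible in the return type: one move normally touches one row, but a rebalance touches every id in the slice, so you want hot spots to be rare.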
Rebalancing a tight sparse segment
Same five cards, same visual order. When neighbours are consecutive integers, there is no integer strictly between them—so you reassign the whole slice to fresh wide gaps (one batch of writes).
Sparse rank rebalance: consecutive positions then evenly spaced after rebalance
Column slice
pos 1,000 · Card 1
pos 1,001 · Card 2
pos 1,002 · Card 3
pos 1,003 · Card 4
pos 1,004 · Card 5
Example: no integer strictly between 1000 and 1001—rebalance (or switch to string keys) to keep inserts cheap.
LexoRank-style string keys (demo)
Integers, even spaced out, still live in a finite space per gap. Another step is to grow the ordering space itself: store a lexicographic rank string per item (often with a bucket prefix and a variable-length fractional part), sort by plain string order, and assign a new token between two neighbours when something moves—usually one row update, like sparse integers, but with much more headroom before you must rebalance.
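To make the idea concrete without reaching for the library, here is a compact midpoint-string routine over the lowercase alphabet, adapted from a classic "find a string between two strings" approach. It is illustrative only: real LexoRank adds bucket prefixes and markers, and this sketch assumes `prev < next` and keys without trailing "a" (the zero digit).

```typescript
const BELOW_A = 96; // char code just below "a"
const ABOVE_Z = 123; // char code just above "z"

// Return a string strictly between prev and next in lexicographic order.
function midString(prev: string, next: string): string {
  let p = 0;
  let n = 0;
  let pos = 0;
  // Walk past the common prefix.
  for (pos = 0; p === n; pos++) {
    p = pos < prev.length ? prev.charCodeAt(pos) : BELOW_A;
    n = pos < next.length ? next.charCodeAt(pos) : ABOVE_Z;
  }
  let str = prev.slice(0, pos - 1);
  if (p === BELOW_A) {
    // prev ran out: copy next's leading "a"s, step down before a "b".
    while (n === 97) {
      n = pos < next.length ? next.charCodeAt(pos++) : ABOVE_Z;
      str += "a";
    }
    if (n === 98) {
      str += "a";
      n = ABOVE_Z;
    }
  } else if (p + 1 === n) {
    // Consecutive digits: keep prev's digit, then skip its trailing "z"s.
    str += String.fromCharCode(p);
    n = ABOVE_Z;
    while ((p = pos < prev.length ? prev.charCodeAt(pos++) : BELOW_A) === 122) {
      str += "z";
    }
  }
  return str + String.fromCharCode(Math.ceil((p + n) / 2));
}
```

One row update per move, as with sparse integers, but the key space grows with string length, so a busy gap takes far longer to exhaust.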
The interactive demo below uses @dalet-oss/lexorank—an open-source LexoRank-style TypeScript implementation (fork of the older lexorank-ts project). It follows the same idea as Jira’s approach (lexicographic keys, between / genNext-style operations); it is not Atlassian’s shipping binary, so treat it as educational, not a compatibility guarantee.
In the demo, starter ranks are labelled a–e; the real stored values look like 0|…: (bucket, body, sentinel). After you subdivide, the full string is shown.
LexoRank-style keys — one row update
Sort order follows lexicographic pos strings (a…e label the five starter ranks; after a move you see the real key). Same pattern Jira popularised; other products use different encodings (e.g. Figma's fractional indexing).
LexoRank-style ordering demo with drag and drop
To do
pos a · Card 1
pos b · Card 2
pos c · Card 3
pos d · Card 4
pos e · Card 5
Drop to append
Try dragging Card 5 (pos e) before Card 2 (pos b)—the library computes a new rank between its neighbours (you will see the full bucket|decimal: string). Implemented with @dalet-oss/lexorank (open-source LexoRank-style algorithm; not Atlassian's binary).
As rank strings grow, the space of possible keys grows quickly; rebalance is still needed when a region gets tight or keys get unwieldy—Jira’s buckets exist partly for that. The demo does not model bucket migration or server-side conflict policy.
When LexoRank strings get crowded
A local region can still run out of comfortable midpoints—neighbours too close under the numeral rules, or strings longer than you want. Then you rebalance that segment (in Jira, often involving bucket moves): assign a fresh spread of ranks in a bounded batch. The exact mechanics are implementation-specific; the intent is to open room again without rewriting the whole board.
Compared with sparse integers, LexoRank-style strings usually buy more inserts before that pain, but they do not remove rebalance entirely.
System design takeaway
From a system perspective this is not a party trick—it is write amplification under insert-heavy UIs. Boards are constantly “insert between” operations.
Dense integers — simplest, but cascade reindexing hurts at scale.
Sparse integers — fewer writes per insert, but gaps close and rebalance returns.
LexoRank-style strings (this demo: @dalet-oss/lexorank) — more logic in client and server, but most moves stay O(1) writes and rebalance is comparatively rare. Other string-key schemes (e.g. fractional indexing in Figma’s write-up) trade different details on interleaving, key length, and implementation cost.
The trade-off is familiar: simpler storage versus ranking machinery in exchange for scalability when lists and collaboration grow.
Performance, reliability, and accessibility
This section picks up cross-cutting concerns: shipping less JavaScript up front, how loading states feel, pointer-independent moves, and what breaks when the network or tabs lie to you. Ordering and rank keys stayed in Optimisation strategies above.
Performance
One practical performance pattern for a board is to avoid shipping heavy “card detail” UI in the initial bundle.
The board surface is read-mostly and list-heavy; the detail view is often form-heavy (comments, activity, attachments, rich inputs). That makes it a good candidate for code splitting and lazy loading: fetch the snapshot and render cards fast, then load the details UI only when the user opens a card.
Preload on hover (or focus) is a natural next step after that: same lazy chunk, but you start fetching it when the user shows intent — not instead of splitting.
Two layers of the same idea
Preload on hover is not a replacement for code-splitting — it sits on top. Ship heavy UI in a separate chunk first; then, if you want snappier opens, start loading that chunk (and optional data) when the user shows intent.
1 · Code-split + load on demand
Heavy card-details UI lives in its own chunk; the board stays lean. Users who never open a card never pay for that JS.
Baseline win: smaller initial bundle, work starts when they commit (e.g. click).
then
2 · Preload on intent (e.g. hover)
Same chunk — you only change when you kick off import() (and any data prefetch). Intent often overlaps with idle time, so the open feels instant.
Optional enhancement on top of step 1 — not a substitute for splitting.
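Both layers can share one tiny wrapper around the chunk loader. The sketch below is framework-agnostic and the names are assumptions: `loader` stands in for something like `() => import("./CardDetails")`, and in React the same factory would typically also feed `React.lazy` behind a `Suspense` fallback.

```typescript
type Loader<T> = () => Promise<T>;

// Wrap a chunk loader so "preload" and "open" share a single promise:
// hover/focus warms the cache, click awaits the same in-flight load.
function createPreloadable<T>(loader: Loader<T>) {
  let pending: Promise<T> | null = null;
  const load = () => (pending ??= loader());
  return {
    preload: () => {
      void load(); // fire and forget on intent (hover, focus)
    },
    open: () => load(), // on commit (click); instant if preloaded
  };
}
```

Because `preload` and `open` resolve the same cached promise, double-hovering or hover-then-click never fetches the chunk twice, which is the whole point of layering intent on top of splitting.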
Lazy-load card details on demand
The board stays lightweight until you open a card. The first open shows a deliberate skeleton (teaching aid), and the header pill flips to show when the chunk is actually loaded.
Details code not loaded · Chunk evaluated: —
Backlog
In Progress
Done
Tip: with preload enabled, hover a card first — the pill should flip to “loaded” before you click, so the modal opens instantly.
Perceived performance
Actual performance (smaller bundles, fewer round trips, cheaper work per frame) is only half of what users experience. Perceived performance is the other half: does the UI look like it is keeping up—structure in place, motion purposeful, wait states explained—while data and code still arrive?
That is why the loading skeleton in Designing the UI → Loading belongs in the same story as code-splitting above. We sketched the board shell before the snapshot lands so the first paint is not an empty void; the same idea shows up again when a lazy-loaded details panel opens: a short skeleton (or placeholder) bridges the gap until the chunk and data are ready. One pattern, two places: initial board load and on-demand heavy UI.
Treat perceived performance as a first-class requirement in reviews: if a feature is “fast enough” on your machine but still feels broken on a slow network, the product story is incomplete.
Accessibility
Optimisation is not only write patterns and frame budget. Menu moves should call the same moveCard(cardId, fromColumnId, toColumnId, afterCardId) path as pointer drag (see State management). Pragmatic DnD’s accessibility guidelines expect drag to be paired with explicit controls—not replaced by a fragile “keyboard drag” simulation alone. For the menu UI, a headless primitive such as @radix-ui/react-dropdown-menu buys arrow keys, Enter, Escape, and focus handling without reimplementing it.
A common starter wraps each card in a plain <li> and attaches the drag sensor to an inner <div>. That works for the pointer, but Tab never lands on the card and assistive tech gets little context:
The enhanced version labels the list, makes the draggable surface focusable, and gives it an accessible name (title + column). role="group" is one reasonable choice for a card row that also contains a menu button; listbox / option is another pattern if you are modelling selectable items. focus-visible keeps keyboard focus obvious without ringing every mouse click:
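One way to keep those attributes consistent is a small helper that derives them from the card's data. The helper name and label wording here are assumptions; the attribute choices follow the pattern described above:

```typescript
// Derive the a11y attributes for a card's draggable surface:
// focusable, grouped, and named with title plus column context.
function cardA11yProps(title: string, columnTitle: string) {
  return {
    role: "group" as const, // or listbox/option if modelling selection
    tabIndex: 0, // Tab lands on the card surface
    "aria-label": `${title}, in ${columnTitle}`,
  };
}
```

In JSX you would spread the result onto the draggable surface (`<div {...cardA11yProps(card.title, column.title)}>`), with a focus-visible ring in CSS so keyboard focus stays obvious.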
After a menu action, we also want to mirror the visual update for screen readers: mount a visually hidden role="status" with aria-live="polite" and aria-atomic="true", set a short string, then clear it so the next change can announce.
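The set-then-clear dance is easy to centralise. In this sketch, `setMessage` stands in for the state setter behind that hidden `role="status"` element, and the scheduler is injected so the policy stays testable; the names and the one-second default are assumptions:

```typescript
// Announce a status message, then clear it so the next (possibly identical)
// update re-announces instead of being ignored by the live region.
function announce(
  setMessage: (msg: string) => void,
  text: string,
  clearAfterMs = 1000,
  schedule: (fn: () => void, ms: number) => void = setTimeout
): void {
  setMessage(text);
  schedule(() => setMessage(""), clearAfterMs);
}
```

A menu handler would call something like `announce(setStatus, "Card moved to In Progress")` right after the same `moveCard` call the pointer path uses.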
Same fixture as the intro board, with keyboard focus on each card, a ⋯ menu (Move to… + Delete card), and a screen-reader status region after actions. Still Pragmatic drag and drop for pointer moves.
Backlog
Setup project structure
Unassigned
Install dependencies
Unassigned
Configure ESLint & Prettier
Charlie Moore
In Progress
Implement user list API
Hannah Smith
Create Board UI
Diana Lopez
Done
Add TailwindCSS setup
Quentin Davis
Without focus and names, Tab skips your cards. Without a menu wired to moveCard, keyboard users have no reliable parallel path. Without a live region, the DOM updates silently for screen reader users.
Reliability
Once you have optimistic UI and a command queue, reliability stops being only “show an error toast.” The board is a replica: the user should either see truth, see explicit pending state, or be nudged to recover—not get stuck believing a move landed when it did not. The patterns below are the ones that tend to separate a polished product from a demo; they line up with operationId, 409, and version from API design and State management.
When the UI moves but the server never did
Ghost moves happen when the client paints a new column for a card, then the request fails silently, times out, or the tab goes away before an ack. The user thinks the work is filed; the backlog on the server still says otherwise. A practical response is a pending timeout: if no success arrives within a bounded window (often on the order of 5–10 seconds, tuned to your p95), snap the card back to its pre-move position and offer Retry—toast or inline—so intent is not lost. Retries must reuse the same operationId where the protocol allows idempotency; minting a new key for the same gesture is how duplicate cards slip in.
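A sketch of that pending timeout, with the scheduler injected so the policy is testable. The names and the 8-second default are assumptions; `onTimeout` is where the snap-back and Retry affordance would hook into the command queue:

```typescript
// Bounded optimistic window: if no ack arrives in time, fire onTimeout
// (roll back the move, surface Retry with the SAME operationId).
function startPendingTimeout(
  onTimeout: () => void,
  timeoutMs = 8000,
  schedule: (fn: () => void, ms: number) => () => void = (fn, ms) => {
    const t = setTimeout(fn, ms);
    return () => clearTimeout(t);
  }
): { ack: () => void } {
  let settled = false;
  const cancel = schedule(() => {
    if (!settled) onTimeout(); // no ack within the window: ghost move
  }, timeoutMs);
  return {
    ack: () => {
      settled = true; // success arrived in time
      cancel();
    },
  };
}
```

The queue would call `ack()` when the server confirms the move; everything else is the failure path.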
Competing moves on the same card
Two people (or two tabs) can issue moves that look compatible locally—same destination column, same rough intent—while the server has already committed a different order. Treat the server as the serializing authority: the first commit wins; the loser should receive something like 409 Conflict with enough body to reconcile (current column order, version, or a patch), not a generic “something went wrong.” The client’s job is to merge that truth into the replica and drop or re-base optimistic commands that no longer apply, rather than guessing from stale local order.
Retries and duplicate side effects
Networks retry. Users double-click. If each attempt sends a fresh idempotency key, the server may apply the same move twice. The fix is boring and important: one stable operationId (or equivalent) per user gesture, with server-side dedupe so duplicates collapse to a single effect. That is the same contract we sketched for POST .../operations; reliability here is mostly not forgetting it under load.
Long disconnects and “patch stacking”
A harder class of bug shows up after the laptop sleeps, the WebSocket drops for a long time, or the user returns to a tab that kept applying local optimism while the world moved on. The UI can look fine while version / seq on the server is far ahead. When the channel is healthy again, the client should not only subscribe to new events—it should compare revision counters (or equivalent). If the gap exceeds a product-specific threshold, refetch the board or affected columns and reconcile the pending queue explicitly. Stacking patches forever without ever checking version is how column order silently diverges from production; senior interviews love this failure mode because it is subtle until it is catastrophic.
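The decision itself is small once you compare counters; this `reconnectPlan` helper is a sketch, and the replay-window threshold is an assumed product-specific number:

```typescript
// On reconnect, compare the server's latest sequence with the last one
// the client applied. Small gaps can be replayed; beyond the threshold,
// refetch and reconcile the pending queue instead of stacking patches.
function reconnectPlan(
  lastAppliedSeq: number,
  serverSeq: number,
  replayWindow = 500
): "up-to-date" | "replay-events" | "refetch-and-reconcile" {
  const gap = serverSeq - lastAppliedSeq;
  if (gap <= 0) return "up-to-date";
  if (gap <= replayWindow) return "replay-events";
  return "refetch-and-reconcile"; // then trim or re-base pending commands
}
```

The important part is that the check runs at all: a client that only subscribes to new events after reconnect never notices the gap.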
Testing
In real-world applications, things can easily go wrong by accident: a refactor of column rendering, a tweak to the event reducer, or a “small” API change can reorder cards, drop moves, or fork identity—and you may not notice until a user reports it. Automated tests are the cheapest way to keep existing behaviour stable while you change code: they fail when a regression slips in, instead of shipping it to production first.
They also shorten the loop when something does go wrong. A failing test gives you a reproducible path (inputs → expected state or DOM); fixing the bug and keeping the test locks the fix so the same class of defect is harder to reintroduce. That matters twice over in an agent-assisted workflow: generated diffs are fast to produce but easy to get subtly wrong—tests are a ground truth you can run on every change without re-reading the whole patch by hand.
The table below is a compact checklist for this kind of surface: what to cover at unit, integration, and e2e depth, grouped by concern. It is not a substitute for your team’s full QA strategy; it is a starting point you can map to Jest, Vitest, Playwright, or whatever you already run.
Coverage checklist
By concern and test depth—rename layers to match your codebase.
Each row reads Category · Level — Requirement, followed by its done-when criteria.

Functional · unit — Board (or list) rendering from the client model: columns, ordered cards, empty column, assignee/avatar slots. Done when: tests pin representative output (components or view-model) for a fixture shaped like your snapshot / normalised store.
State · unit — State layer applies moves, optimistic updates, rollbacks, and dedupes by stable intent id (`operationId` or equivalent). Done when: pure reducers, command appliers, or queue reducers are covered for success, failed ack, retry-with-same-id, and duplicate event.
Realtime · unit — Inbound event reducer respects ordering, dedupe, and gaps (SSE/WebSocket fan-out, rAF batching if you batch). Done when: tests feed out-of-order, duplicate, and missing events; the resulting client model matches a single source of truth.
Contract or MSW-style tests prove the client maps responses without shape drift and surfaces failures to the queue/UI.
Realtime · integration — After disconnect or long idle: reconnect uses `version` / `seq` (or refetch) so local order cannot diverge silently. Done when: a simulated reconnect proves no duplicate cards, no missing moves, and queued ops reconcile or drop under a clear policy.
A11y · integration — Parallel to pointer drag: keyboard/menu moves, focus visibility, and `aria-live` announcements that match optimistic moves. Done when: automated checks where practical (roles, labels); manual screen-reader pass for timing and wording of status updates.
Visual · e2e — Critical paths under Playwright (or similar): e.g. drag move, menu move, loading skeleton → board paint, error banner. Done when: visual regression or screenshot baselines in CI stay green for agreed flows; failures triaged as product or flake.
Summary
This walkthrough keeps one kanban board as a through-line: the same product shape carries collecting requirements (including loading, errors, and empty states), data modeling (nested snapshots versus normalised maps and ordered id lists), API design (REST-shaped reads, intentful writes, operationId and 409), state management as a replica plus a command queue and reconciliation, optimisation (especially ordering at scale—integers, gaps, LexoRank-style strings), performance (code-splitting, lazy details, perceived speed, and where heavier read-path tools belong), accessibility (keyboard paths and menus next to drag), and reliability when the network and tabs misbehave. That is steel-thread-like—one concrete use case crossing real boundaries—but it is not the strict steel thread from the literature: a shipped thinnest slice, end to end. Most of what you clicked through here is local-only and pedagogical; the article widens the spine on purpose. When you implement, the slice worth shipping first is still the sequence from the intro: fixture → faithful UI → moves in client state → persistence for one user → then realtime and conflicts—then grow from there. The testing checklist ties these themes to unit, integration, and e2e expectations without dictating a single tool.
What we did not optimise for here: picking a state-management library (Redux, Zustand, Jotai, signals, etc.) or a data-fetching stack (TanStack Query, SWR, Relay, hand-rolled fetch). Those choices matter in real teams, but they change more often than the patterns above; the article stays at the level of shapes, contracts, and failure behaviour so you can map them onto whichever stack you use.
What that implies: the goal is not to ship “the board article’s architecture,” but to recognise recurring decisions in scalable, collaborative frontends—where truth lives, how moves are identified and retried, how order is stored cheaply, and how everyone (pointer, keyboard, screen reader) can complete the same job. When you interview or design a new surface, you can reuse this checklist mindset even when the widgets are not cards and columns.
Local-first & CRDTs — If you outgrow “apply operations in order” and want automatic merging and offline edits, explore Automerge and Yjs as representative CRDT stacks (different trade-offs than OT or server-serialized moves).
Pragmatic drag and drop (this walkthrough’s stack) — The board demos use Atlassian’s Pragmatic drag and drop (@atlaskit/pragmatic-drag-and-drop, GitHub). For accessibility, their docs are explicit that pointer drag alone is not enough: pair it with visible controls (buttons, menus) that achieve the same outcomes, then confirm what happened and keep the user unblocked—see Accessibility guidelines and the broader Design guidelines for how they expect experiences to be structured.
Hey — I’m Juntao
Engineer, Educator, Creator.
Helping developers design and build software — with intention.
I break complexity into structure, then guide the building process so it stays practical and grounded — even when AI is involved.