Legible Code

Legibility is structural, not surface. Formatting and naming are the visible layer; underneath sits a design: a coherence, a layering, a conceptual integrity. Code is “simple” in Rich Hickey’s sense when it is un-braided, with each strand of concern running cleanly alongside the others rather than woven through them. When code feels tangled, hard to test, or hard to change, the cause is almost always a breakdown in one of the structural conditions listed below, not local mess.

A legible system lets a reader:

  • Trace any event through the code as a single spine, from entry to final effect
  • See the shape of data at each stage, rather than reconstructing it from scattered mutations
  • Separate decisions from effects — distinguish what the code decides from what it does
  • Find a vocabulary that mirrors the domain rather than the machine
  • Rely on module seams that hide design decisions, not process steps
  • Find each rule defined once, in one place, rather than enforced through repetition

This skill diagnoses which of those conditions has broken down, then loads the matching remedy. Read each diagnostic below, look at the function, module, or system in question, and note the ones that fire. Then read the matching reference file in references/ for the underlying concept and the concrete remedy. Lower-numbered diagnostics are foundations; higher-numbered ones emerge from them. When several fire, address the foundation first. D10 (Scattered Knowledge) is cross-cutting.

Pick any user interaction or system event. Starting from the moment it enters the system, try to trace what happens at each stage until the final side-effect occurs.

You have this problem if:

  • Understanding one event requires jumping across many distant files
  • The path goes through implicit callbacks, event emitters, or ambient state mutations
  • You can’t point to a single place that shows the “outline” of what happens
  • A new team member would have to open 5+ files to understand one request
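
One way to give an event a visible spine is a single top-level function whose body reads as the outline of what happens. The sketch below is a minimal illustration under that assumption, not a prescribed structure; the checkout event and every name in it (`CheckoutCart`, `priceCart`, `issueReceipt`, `handleCheckout`) are hypothetical.

```typescript
// All names are hypothetical; the checkout event is only an illustration.
type CartItem = { sku: string; cents: number; qty: number };
type CheckoutCart = { customerId: string; items: CartItem[] };
type PricedCart = { customerId: string; totalCents: number };
type Receipt = { customerId: string; totalCents: number; message: string };

// Stage 1: compute the total from the cart contents.
function priceCart(cart: CheckoutCart): PricedCart {
  const totalCents = cart.items.reduce((sum, item) => sum + item.cents * item.qty, 0);
  return { customerId: cart.customerId, totalCents };
}

// Stage 2: turn the priced cart into the final result of the event.
function issueReceipt(priced: PricedCart): Receipt {
  return { ...priced, message: `Charged ${priced.totalCents} cents` };
}

// The spine: the one place that shows the outline of a checkout,
// from entry to final result, with no callbacks or ambient state.
function handleCheckout(cart: CheckoutCart): Receipt {
  const priced = priceCart(cart);
  const receipt = issueReceipt(priced);
  return receipt;
}
```

A reader who opens only `handleCheckout` can name every stage the event passes through, then drill into a stage if needed.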

references/F1-narrative-clarity.md

As you trace the event path from D1, try to see the data at each stage. What shape does it have? What was added? What was transformed?

You have this problem if:

  • State is mutated in place — a function modifies an object and returns void
  • Data is accessed through ambient globals, context objects, or “reach-up” patterns
  • Data is reconstructed from scattered sources rather than passed explicitly
  • You can’t describe the type/shape of the data at a given point without reading surrounding code
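
A possible remedy is to give each stage its own type and have every step return a new value instead of mutating its input. The following is a sketch under that assumption; the order domain and the names `RawOrder`, `ParsedOrder`, `PricedOrder`, `parseOrder`, and `priceOrder` are invented for illustration.

```typescript
// Hypothetical stages of an order; each type documents the data's shape at that point.
type RawOrder = { sku: string; qty: string };
type ParsedOrder = { sku: string; qty: number };
type PricedOrder = ParsedOrder & { totalCents: number };

// Returns a new value; the raw input is left untouched.
function parseOrder(raw: RawOrder): ParsedOrder {
  return { sku: raw.sku.trim().toUpperCase(), qty: Number.parseInt(raw.qty, 10) };
}

// The unit price is passed explicitly rather than read from a global or context object.
function priceOrder(order: ParsedOrder, unitCents: number): PricedOrder {
  return { ...order, totalCents: order.qty * unitCents };
}
```

With this arrangement the signature of each function states what was added or transformed, so the shape at any point can be read from the types alone.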

references/F2-data-flow-pipelines.md

Look at any function that makes a decision or performs a calculation. Does it also read from a database, call an API, or write to the filesystem?

You have this problem if:

  • You can’t test the logic without setting up external services or writing mocks
  • A single function both decides what to do and does it (reads DB, computes, writes DB)
  • Business rules are buried inside I/O-heavy functions
  • A user interaction triggers an untraceable web of immediate side effects — intent and execution are fused together
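
One way to pull decisions apart from effects is to have the decision function return a description of the effects as plain data, and let a separate interpreter carry them out. The sketch below assumes a hypothetical withdrawal rule; `Account`, `Effect`, `decideWithdrawal`, and `runEffects` are illustrative names, and the interpreter writes to an in-memory log as a stand-in for real I/O.

```typescript
// Hypothetical domain; none of these names come from a real library.
type Account = { id: string; balanceCents: number };
type Effect =
  | { kind: "debit"; accountId: string; cents: number }
  | { kind: "notify"; accountId: string; message: string };

// Pure decision: no database, no network, so it can be tested with plain values.
function decideWithdrawal(account: Account, cents: number): Effect[] {
  if (cents <= 0 || cents > account.balanceCents) {
    return [{ kind: "notify", accountId: account.id, message: "withdrawal rejected" }];
  }
  return [
    { kind: "debit", accountId: account.id, cents },
    { kind: "notify", accountId: account.id, message: "withdrawal accepted" },
  ];
}

// Interpreter: the only place that performs effects.
// Here it appends to a log; a real one would call the database or a mailer.
function runEffects(effects: Effect[], log: string[]): void {
  for (const effect of effects) {
    switch (effect.kind) {
      case "debit":
        log.push(`debit ${effect.accountId} ${effect.cents}`);
        break;
      case "notify":
        log.push(`notify ${effect.accountId}: ${effect.message}`);
        break;
    }
  }
}
```

The business rule lives entirely in `decideWithdrawal`, so testing it needs neither external services nor mocks.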

references/F3-functional-core-interpreter.md

D4. Scattered Validation and Boolean Blindness

When data enters your system (user input, API response, file contents), is there a single point where it’s parsed into a well-typed internal representation?

You have this problem if:

  • You find null checks, type guards, or if (!x) scattered deep in interior logic
  • Validation functions return boolean — downstream code has no proof of what was validated
  • The same field is checked for validity in multiple places
  • Functions deep in the core handle cases like “what if this field is missing?” that should be impossible by that point
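
A common shape for the single parsing point is a function that takes `unknown` at the boundary and returns either a well-typed value or an error, so that interior code never re-checks. This is a minimal sketch, assuming a hypothetical signup payload; `Email`, `SignupForm`, `ParseResult`, and `parseSignup` are invented names, and the email check is deliberately simplistic.

```typescript
// Hypothetical types; the result of parsing is the proof that validation happened.
type Email = { readonly kind: "email"; readonly value: string };
type SignupForm = { email: Email; age: number };
type ParseResult<T> = { ok: true; value: T } | { ok: false; error: string };

// The only place where untyped input is inspected.
function parseSignup(input: unknown): ParseResult<SignupForm> {
  if (typeof input !== "object" || input === null) {
    return { ok: false, error: "expected an object" };
  }
  const { email, age } = input as Record<string, unknown>;
  // Simplistic email rule, used only to keep the sketch short.
  if (typeof email !== "string" || !email.includes("@")) {
    return { ok: false, error: "invalid email" };
  }
  if (typeof age !== "number" || !Number.isInteger(age) || age < 0) {
    return { ok: false, error: "invalid age" };
  }
  return { ok: true, value: { email: { kind: "email", value: email }, age } };
}

// Interior code accepts SignupForm and has no "what if this is missing?" branches.
function welcomeLine(form: SignupForm): string {
  return `Welcome, ${form.email.value}`;
}
```

Unlike a validator that returns `boolean`, the returned `SignupForm` carries the evidence of what was checked into every downstream signature.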

references/F4-parse-dont-validate.md

Look at the core types and modules of your system. Can you see distinct layers of abstraction? At the lowest layer, small precise primitives? At higher layers, richer compound concepts?

You have this problem if:

  • Your domain is a flat collection of similarly-sized types and functions with no clear hierarchy
  • You find yourself writing the same 3-4 line pattern in multiple places but there’s no name for it
  • High-level business functions are written in terms of low-level primitives — there’s no intermediate vocabulary
  • Reading a top-level function requires understanding all the implementation details beneath it
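
Layering can be illustrated with a small vocabulary that grows upward: precise primitives at the bottom, named compound concepts in the middle, and a top-level function written only in terms of the middle layer. The sketch below assumes a hypothetical invoicing domain; `Money`, `LineItem`, `lineTotal`, `subtotal`, `withTax`, and `invoiceTotal` are all illustrative.

```typescript
// Layer 1: small, precise primitives (hypothetical Money type in integer cents).
type Money = { cents: number };
const money = (cents: number): Money => ({ cents });
const addMoney = (a: Money, b: Money): Money => money(a.cents + b.cents);
const scaleMoney = (m: Money, factor: number): Money => money(Math.round(m.cents * factor));

// Layer 2: intermediate vocabulary that names recurring patterns.
type LineItem = { unit: Money; qty: number };
const lineTotal = (line: LineItem): Money => scaleMoney(line.unit, line.qty);
const subtotal = (lines: LineItem[]): Money =>
  lines.map((line) => lineTotal(line)).reduce((sum, m) => addMoney(sum, m), money(0));
const withTax = (amount: Money, rate: number): Money =>
  addMoney(amount, scaleMoney(amount, rate));

// Layer 3: the business function reads in domain terms, with no cents arithmetic in sight.
const invoiceTotal = (lines: LineItem[], taxRate: number): Money =>
  withTax(subtotal(lines), taxRate);
```

Reading `invoiceTotal` requires only the meaning of `subtotal` and `withTax`, not the rounding details beneath them.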

references/F5-growing-a-language.md

Look at your module boundaries. What does each module correspond to?

You have this problem if:

  • Modules correspond to process steps (“first we do X, then Y”) rather than design decisions they hide
  • Some modules are so thin they’re just pass-through wrappers — their interface is as complex as their implementation
  • Some modules hide multiple unrelated decisions
  • Changing one module’s internals forces changes in its callers (leaky abstraction)
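
A module that hides a design decision tends to have a narrow interface over a non-trivial implementation. As a hedged illustration, the sketch below hides the choice of rate-limiting algorithm and storage behind a one-method interface; `RateLimiter` and `createRateLimiter` are hypothetical names, the fixed-window algorithm is just one possible choice, and `maxPerWindow` is assumed to be at least 1.

```typescript
// The whole public surface: callers learn nothing about how limits are tracked.
interface RateLimiter {
  allow(clientId: string, nowMs: number): boolean;
}

// Hidden decisions: fixed-window counting and in-memory Map storage.
// Either could change (sliding window, external store) without touching callers.
function createRateLimiter(maxPerWindow: number, windowMs: number): RateLimiter {
  const windows = new Map<string, { startMs: number; count: number }>();
  return {
    allow(clientId: string, nowMs: number): boolean {
      const current = windows.get(clientId);
      // No window yet, or the old one has expired: start a fresh window.
      if (current === undefined || nowMs - current.startMs >= windowMs) {
        windows.set(clientId, { startMs: nowMs, count: 1 });
        return true;
      }
      if (current.count >= maxPerWindow) {
        return false;
      }
      current.count += 1;
      return true;
    },
  };
}
```

The time is passed in as `nowMs` rather than read from the clock, which keeps the sketch deterministic; that is a choice made for this example, not something the diagnostic requires.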

references/F6-deep-modules.md

You have a concept (a user event, a domain operation, a data type) and you need to decide: should it stay as one thing or be broken into parts?

You have this problem if (splitting too eagerly):

  • Downstream consumers keep needing context that was lost when the concept was decomposed
  • Two pieces of data are always passed around together but live in different structures
  • A split created two things that can’t be understood independently — they only make sense as a pair
  • A lower-level module has tangled control flow with many branches and heuristics, and tracing back shows that the ambiguity was introduced by an upstream decomposition that stripped away constraining context: the module is pattern-matching its way back to information that was available before the split

You have this problem if (bundling too much):

  • Consumers only need one part of a compound concept but are forced to depend on the whole thing
  • Different parts of a bundled concept change at different rates or for different reasons
  • Adding a new consumer means they must understand the full bundle even though they only use a slice
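
Both directions can be sketched briefly. In the example below, an amount and its currency stay together because neither is meaningful alone, while a consumer that needs only one field of a larger record depends on just that slice. The names `Price`, `Customer`, `Greetable`, `formatPrice`, and `greeting` are hypothetical, and `Pick` is the standard TypeScript utility type.

```typescript
// Kept whole: an amount without its currency cannot be interpreted.
type Price = { amount: number; currency: "USD" | "EUR" };

function formatPrice(price: Price): string {
  return `${price.amount.toFixed(2)} ${price.currency}`;
}

// A larger hypothetical record whose parts change for different reasons.
type Customer = { id: string; displayName: string; billingAddress: string };

// The consumer declares the slice it needs instead of the whole bundle.
type Greetable = Pick<Customer, "displayName">;

function greeting(who: Greetable): string {
  return `Hello, ${who.displayName}`;
}
```

A full `Customer` can still be passed to `greeting`, but changes to billing fields no longer ripple into it.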

references/F7-cohesion-semantic-integrity.md

Read through a function or module. Count the things you must hold in your head simultaneously: variables in scope, branches in flight, implicit state, conventions to remember.

You have this problem if:

  • The count exceeds 4-5 simultaneous concerns
  • You find yourself scrolling back up to remember what a variable held
  • A function has 3+ levels of nesting (if inside if inside loop)
  • A single loop body handles multiple unrelated transformations
  • Understanding a line of code requires knowing something that happened in a distant part of the file
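
One way to lower the count is to replace a nested loop body with a short chain of steps that each carry a single concern. The sketch below assumes a hypothetical invoice list; `Invoice`, `isUnpaidFor`, and `outstandingCents` are illustrative names.

```typescript
type Invoice = { customerId: string; paid: boolean; cents: number };

// A nested version would hold the loop variable, two conditions, and a running
// total in scope at once:
//   for (...) { if (sameCustomer) { if (!paid) { total += cents; } } }

// Each piece below can be read on its own, with one concern in view.
const isUnpaidFor = (customerId: string) => (invoice: Invoice): boolean =>
  invoice.customerId === customerId && !invoice.paid;

function outstandingCents(invoices: readonly Invoice[], customerId: string): number {
  return invoices
    .filter(isUnpaidFor(customerId))
    .map((invoice) => invoice.cents)
    .reduce((sum, cents) => sum + cents, 0);
}
```

Selection, projection, and summation are now separate lines, so no step requires remembering state set up elsewhere in the function.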

references/F8-cognitive-load.md

You have a large, unwieldy function or module. You try to improve it by extracting pieces into smaller functions. But afterward, nothing feels meaningfully better — you’ve segmented the code, not simplified it.

You have this problem if:

  • Your refactoring consists entirely of extracting chunks into helper functions that are only called from one place
  • The extracted functions take many parameters — they need nearly the full context of the original function to do their work
  • You can’t name the extracted functions well — they end up as processStep1, handlePartA, or prepareData
  • After extraction, understanding the code still requires reading all the pieces together — the helpers can’t be understood independently
  • A file has many small functions, but no types — the module’s vocabulary is entirely verbs (functions), with no nouns (data types) to anchor them
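
The alternative to slicing a function into verb-named helpers is to look for a missing noun. As a hedged illustration, helpers that each take `startMs` and `endMs` as separate parameters can be anchored on a single type; `DateRange`, `dateRange`, `durationMs`, and `overlaps` are hypothetical names chosen for this sketch.

```typescript
// The noun: a range whose constructor establishes start <= end once.
type DateRange = { readonly startMs: number; readonly endMs: number };

function dateRange(startMs: number, endMs: number): DateRange {
  if (endMs < startMs) {
    throw new RangeError("endMs must not be earlier than startMs");
  }
  return { startMs, endMs };
}

// The verbs now take one well-named value instead of loose parameter pairs,
// and each can be understood without reading the others.
const durationMs = (range: DateRange): number => range.endMs - range.startMs;

// Ranges are treated as half-open: touching endpoints do not overlap.
const overlaps = (a: DateRange, b: DateRange): boolean =>
  a.startMs < b.endMs && b.startMs < a.endMs;
```

Functions named after what they do to a `DateRange` are easier to name well than `processStep1`-style fragments, because the type supplies the subject.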

references/F9-type-centric-modularization.md

Pick any convention in your codebase — a naming rule, a directory layout, a policy, a format assumption. Is it defined somewhere, or only followed in many places?

You have this problem if:

  • A rule is encoded through repetition — many places follow it, no place defines it
  • Changing the rule requires shotgun edits with no compiler help finding what you missed
  • A new reader can only learn the convention by inferring from scattered instances
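
A convention becomes defined rather than merely followed when one module owns both directions of it. The sketch below assumes a hypothetical storage layout for avatar files; the `avatars/` directory, the `.png` extension, and the function names `avatarPath` and `userIdFromAvatarPath` are invented for illustration.

```typescript
// The rule lives here and nowhere else (hypothetical layout: avatars/<userId>.png).
const AVATAR_DIR = "avatars/";
const AVATAR_EXT = ".png";

// Build a path from a user id.
function avatarPath(userId: string): string {
  return `${AVATAR_DIR}${userId}${AVATAR_EXT}`;
}

// Recover the user id from a path, or null if the path does not follow the rule.
function userIdFromAvatarPath(path: string): string | null {
  if (!path.startsWith(AVATAR_DIR) || !path.endsWith(AVATAR_EXT)) {
    return null;
  }
  const userId = path.slice(AVATAR_DIR.length, path.length - AVATAR_EXT.length);
  return userId.length > 0 && !userId.includes("/") ? userId : null;
}
```

Changing the layout then means editing the two constants, and every caller that goes through these functions follows along.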

references/F10-single-point-of-knowledge.md

Pick a property the code relies on as data flows through it — that an array is sorted, that a record has been enriched, that two fields are in a valid combination (if status==active then expires_at must exist), that a list is non-empty. Is that property established once and carried in the type, or does each consumer independently re-check, branch on a flag, or silently trust?

You have this problem if:

  • A flag describing the data’s shape ('ascending' | 'descending', 'raw' | 'normalized', 'enriched' | 'unenriched') is threaded through multiple function signatures, and consumers branch on it
  • Two or more downstream consumers ask the same question about the data and could fall out of sync — one handles both cases correctly, another silently assumes a default
  • Two fields must be in valid combinations, but the relationship is upheld only by convention in the code rather than enforced by the type system
  • The same property is re-sorted, re-checked, or re-derived at multiple call sites
  • A bug arises because one consumer trusted an assumption that another consumer was responsible for upholding
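
Two of the properties mentioned above can be lifted into types in a few lines. The sketch below is one possible encoding, not the only one: a discriminated union makes "active implies an expiry date" unrepresentable otherwise, and a wrapper type records that sorting has already happened. `Subscription`, `expiryOf`, `SortedAscending`, `sortAscending`, `smallest`, and `largest` are hypothetical names.

```typescript
// Valid field combinations are expressed in the type: only "active" has expiresAt.
type Subscription =
  | { status: "active"; expiresAt: Date }
  | { status: "cancelled" };

function expiryOf(subscription: Subscription): Date | null {
  return subscription.status === "active" ? subscription.expiresAt : null;
}

// The sortedness property is established once and carried in the type.
// By convention in this sketch, sortAscending is the only function that builds it.
type SortedAscending = {
  readonly kind: "sortedAscending";
  readonly values: readonly number[];
};

function sortAscending(values: readonly number[]): SortedAscending {
  return { kind: "sortedAscending", values: [...values].sort((a, b) => a - b) };
}

// Consumers rely on the property without re-sorting or branching on a flag.
// Both return undefined for an empty input.
function smallest(sorted: SortedAscending): number | undefined {
  return sorted.values[0];
}

function largest(sorted: SortedAscending): number | undefined {
  return sorted.values[sorted.values.length - 1];
}
```

A plain object literal could still forge a `SortedAscending`; closing that gap (for example with a class and a private constructor) is left out to keep the sketch short.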

references/F11-lifted-invariants.md