Legible Code

Legibility is structural, not surface. Formatting and naming are the visible layer; underneath sits a design: a coherence, a layering, a conceptual integrity. Code is “simple” in Rich Hickey’s sense when it is un-braided, with each strand of concern running cleanly alongside the others rather than woven through them. When code feels tangled, hard to test, or hard to change, the cause is almost always a breakdown in one of the structural conditions listed below, not local mess.

A legible system lets a reader:

Trace any event through the code as a single spine, from entry to final effect
See the shape of data at each stage, rather than reconstructing it from scattered mutations
Separate decisions from effects — distinguish what the code decides from what it does
Find a vocabulary that mirrors the domain rather than the machine
Rely on module seams that hide design decisions, not process steps
Find each rule defined once, in one place, rather than enforced through repetition

This skill diagnoses which of those conditions has broken down, then loads the matching remedy. Read each diagnostic below, look at the function, module, or system in question, and note which ones fire. For each one that fires, read the matching reference file in references/ for the underlying concept and the concrete remedy. Lower-numbered diagnostics are foundations; higher-numbered ones emerge from them. When several fire, address the lowest-numbered one first. D10 (Scattered Knowledge) is cross-cutting: it can fire alongside any of the others.

Diagnostics

D1. The Scattered Story

Pick any user interaction or system event. Starting from the moment it enters the system, try to trace what happens at each stage until the final side-effect occurs.

You have this problem if:

Understanding one event requires jumping across many distant files
The path goes through implicit callbacks, event emitters, or ambient state mutations
You can’t point to a single place that shows the “outline” of what happens
A new team member would have to open 5+ files to understand one request
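
As a rough illustration of what a "single place that shows the outline" can look like, here is a minimal sketch. The signup event, the type names, and the `send` callback are all hypothetical; the point is only that one function names every stage in order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignupRequest:
    email: str


@dataclass(frozen=True)
class Account:
    email: str
    plan: str


def parse_request(raw: dict) -> SignupRequest:
    # Entry point: normalize the raw payload once.
    return SignupRequest(email=raw["email"].strip().lower())


def create_account(request: SignupRequest) -> Account:
    # Hypothetical rule: every new account starts on the free plan.
    return Account(email=request.email, plan="free")


def welcome_message(account: Account) -> str:
    return f"Welcome, {account.email} ({account.plan} plan)"


def handle_signup(raw: dict, send) -> Account:
    """The spine: each stage of the event appears here, in order."""
    request = parse_request(raw)
    account = create_account(request)
    send(welcome_message(account))  # final side effect, passed in explicitly
    return account
```

A reader who opens only `handle_signup` can already narrate the event from entry to final effect, without following callbacks or emitters.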

→ references/F1-narrative-clarity.md

D2. Invisible Data

As you trace the event path from D1, try to see the data at each stage. What shape does it have? What was added? What was transformed?

You have this problem if:

State is mutated in place — a function modifies an object and returns void
Data is accessed through ambient globals, context objects, or “reach-up” patterns
Data is reconstructed from scattered sources rather than passed explicitly
You can’t describe the type/shape of the data at a given point without reading surrounding code
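
The first symptom can be made concrete with a small contrast. This is a sketch with hypothetical names (`RawOrder`, `PricedOrder`, `price`), not a prescribed design: the first function mutates and returns nothing, while the second makes the new shape its return type.

```python
from dataclasses import dataclass


def add_total_in_place(order: dict, unit_price_cents: int) -> None:
    """Symptom: the dict silently gains a key; the signature says nothing."""
    order["total_cents"] = order["quantity"] * unit_price_cents


@dataclass(frozen=True)
class RawOrder:
    sku: str
    quantity: int


@dataclass(frozen=True)
class PricedOrder:
    sku: str
    quantity: int
    total_cents: int


def price(order: RawOrder, unit_price_cents: int) -> PricedOrder:
    """Alternative: the stage's output shape is visible in the return type."""
    return PricedOrder(order.sku, order.quantity, order.quantity * unit_price_cents)
```

With the second form, the question "what does the data look like after pricing?" is answered by the type name alone.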

→ references/F2-data-flow-pipelines.md

D3. Tangled Computation and I/O

Look at any function that makes a decision or performs a calculation. Does it also read from a database, call an API, or write to the filesystem?

You have this problem if:

You can’t test the logic without setting up external services or writing mocks
A single function both decides what to do and does it (reads DB, computes, writes DB)
Business rules are buried inside I/O-heavy functions
A user interaction triggers an untraceable web of immediate side effects — intent and execution are fused together
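
One way to pull decisions apart from effects, sketched under assumptions: the decision function returns plain descriptions of effects, and a thin shell carries them out. The overdue-payment rule, the 30-day threshold, and every name here are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class SendEmail:
    to: str
    subject: str


@dataclass(frozen=True)
class SuspendAccount:
    account_id: int


Effect = Union[SendEmail, SuspendAccount]


def decide_overdue_actions(account_id: int, email: str, days_overdue: int) -> list:
    """Pure decision: no I/O, only a description of what should happen."""
    if days_overdue <= 0:
        return []
    effects = [SendEmail(email, "Payment overdue")]
    if days_overdue > 30:  # hypothetical threshold
        effects.append(SuspendAccount(account_id))
    return effects


def run(effects, send_mail, suspended_ids: set) -> None:
    """Shell: the only place where effects are actually performed."""
    for effect in effects:
        if isinstance(effect, SendEmail):
            send_mail(effect.to, effect.subject)
        elif isinstance(effect, SuspendAccount):
            suspended_ids.add(effect.account_id)
```

The business rule can now be tested by comparing lists of values; no database, mail server, or mock is needed until `run` is involved.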

→ references/F3-functional-core-interpreter.md

D4. Scattered Validation and Boolean Blindness

When data enters your system (user input, API response, file contents), is there a single point where it’s parsed into a well-typed internal representation?

You have this problem if:

You find null checks, type guards, or if (!x) scattered deep in interior logic
Validation functions return boolean — downstream code has no proof of what was validated
The same field is checked for validity in multiple places
Functions deep in the core handle cases like “what if this field is missing?” that should be impossible by that point
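
A minimal sketch of the alternative to boolean validation, with hypothetical names and deliberately crude checks: the boundary function either returns a typed value or raises, so interior code receives proof of validity instead of a `True`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signup:
    email: str
    age: int


def parse_signup(raw: dict) -> Signup:
    """Boundary: the only place that deals with missing or malformed fields."""
    email = raw.get("email")
    if not isinstance(email, str) or "@" not in email:
        raise ValueError("email is missing or malformed")
    age = raw.get("age")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(age, int) or isinstance(age, bool) or age < 0:
        raise ValueError("age is missing or not a non-negative integer")
    return Signup(email=email, age=age)


def is_adult(signup: Signup) -> bool:
    """Interior logic: no null checks, because a Signup always has an age."""
    return signup.age >= 18
```

Any function that accepts a `Signup` rather than a `dict` cannot be handed unchecked input, which is what removes the scattered guards.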

→ references/F4-parse-dont-validate.md

D5. Flat Domain / Missing Vocabulary

Look at the core types and modules of your system. Can you see distinct layers of abstraction, with small, precise primitives at the lowest layer and richer compound concepts at the higher layers?

You have this problem if:

Your domain is a flat collection of similarly-sized types and functions with no clear hierarchy
You find yourself writing the same 3-4 line pattern in multiple places but there’s no name for it
High-level business functions are written in terms of low-level primitives — there’s no intermediate vocabulary
Reading a top-level function requires understanding all the implementation details beneath it
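
To show what an intermediate vocabulary might look like, here is a small layered sketch. The invoicing domain and the names `Money`, `LineItem`, and `invoice_total` are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Money:
    """Lowest layer: a small, precise primitive."""
    cents: int

    def __add__(self, other: "Money") -> "Money":
        return Money(self.cents + other.cents)

    def times(self, n: int) -> "Money":
        return Money(self.cents * n)


@dataclass(frozen=True)
class LineItem:
    """Middle layer: a compound concept built from the primitive."""
    unit_price: Money
    quantity: int

    def subtotal(self) -> Money:
        return self.unit_price.times(self.quantity)


def invoice_total(items: list) -> Money:
    """Top layer: reads in domain terms, with no raw integer arithmetic."""
    return sum((item.subtotal() for item in items), Money(0))
```

The top-level function can be read without knowing how `Money` stores its amount, which is the property the last bullet above asks for.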

→ references/F5-growing-a-language.md

D6. Wrong Module Seams

Look at your module boundaries. What does each module correspond to?

You have this problem if:

Modules correspond to process steps (“first we do X, then Y”) rather than design decisions they hide
Some modules are so thin they’re just pass-through wrappers — their interface is as complex as their implementation
Some modules hide multiple unrelated decisions
Changing one module’s internals forces changes in its callers (leaky abstraction)
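
A sketch of a module whose seam hides a design decision rather than a process step. The class and its JSON encoding are hypothetical; what matters is that the interface is two methods and the encoding never leaks to callers.

```python
import json


class SettingsStore:
    """Hides one decision: how settings are encoded at rest."""

    def __init__(self) -> None:
        self._blob = "{}"  # JSON today; callers never see this string

    def get(self, key: str, default=None):
        return json.loads(self._blob).get(key, default)

    def set(self, key: str, value) -> None:
        data = json.loads(self._blob)
        data[key] = value
        self._blob = json.dumps(data)
```

Swapping JSON for another encoding would change only the three method bodies. If callers had been handed the raw blob instead, the same change would ripple outward, which is the leak the last bullet describes.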

→ references/F6-deep-modules.md

D7. Premature Splitting or Over-Bundling

You have a concept (a user event, a domain operation, a data type) and you need to decide: should it stay as one thing or be broken into parts?

You have this problem if (splitting too eagerly):

Downstream consumers keep needing context that was lost when the concept was decomposed
Two pieces of data are always passed around together but live in different structures
A split created two things that can’t be understood independently — they only make sense as a pair
A lower-level module has tangled control flow with many branches and heuristics, and tracing back shows that the ambiguity was introduced by an upstream decomposition that stripped away constraining context: the module is pattern-matching its way back to information that was available before the split

You have this problem if (bundling too much):

Consumers only need one part of a compound concept but are forced to depend on the whole thing
Different parts of a bundled concept change at different rates or for different reasons
Adding a new consumer means they must understand the full bundle even though they only use a slice

→ references/F7-cohesion-semantic-integrity.md

D8. Working Memory Overflow

Read through a function or module. Count the things you must hold in your head simultaneously: variables in scope, branches in flight, implicit state, conventions to remember.

You have this problem if:

You must hold more than four or five concerns in mind at once
You find yourself scrolling back up to remember what a variable held
A function has 3+ levels of nesting (if inside if inside loop)
A single loop body handles multiple unrelated transformations
Understanding a line of code requires knowing something that happened in a distant part of the file
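
The nesting symptom is easy to see side by side. Both functions below compute the same hypothetical shipping cost (the prices and rules are invented); the second uses guard clauses so each concern is settled and dropped before the next one starts. This is only one possible way to lower the count.

```python
def shipping_cost_nested(item_count: int, is_member: bool, weight_kg: int) -> int:
    """Symptom: three conditions are in flight when the innermost line runs."""
    cost = 0
    if item_count > 0:
        if not is_member:
            if weight_kg > 1:
                cost = 500 + 200 * (weight_kg - 1)
            else:
                cost = 500
    return cost


def shipping_cost_flat(item_count: int, is_member: bool, weight_kg: int) -> int:
    """Guard clauses: each early return retires one concern for good."""
    if item_count == 0:
        return 0
    if is_member:
        return 0
    if weight_kg <= 1:
        return 500
    return 500 + 200 * (weight_kg - 1)
```

In the flat version, the reader of the last line needs to remember nothing except that every earlier case has already returned.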

→ references/F8-cognitive-load.md

D9. The Extract-Function Dead End

You have a large, unwieldy function or module. You try to improve it by extracting pieces into smaller functions. But afterward, nothing feels meaningfully better — you’ve segmented the code, not simplified it.

You have this problem if:

Your refactoring consists entirely of extracting chunks into helper functions that are only called from one place
The extracted functions take many parameters — they need nearly the full context of the original function to do their work
You can’t name the extracted functions well — they end up as processStep1, handlePartA, or prepareData
After extraction, understanding the code still requires reading all the pieces together — the helpers can’t be understood independently
A file has many small functions, but no types — the module’s vocabulary is entirely verbs (functions), with no nouns (data types) to anchor them
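
When helpers keep needing the same cluster of parameters, the missing piece is often a noun. As a hedged sketch, suppose several helpers all took `start` and `end` dates; a hypothetical `DateRange` type gives those values a home and the helpers become short methods on it.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DateRange:
    """A noun for the module: an inclusive range of calendar days."""
    start: date
    end: date

    def __post_init__(self) -> None:
        if self.end < self.start:
            raise ValueError("end precedes start")

    def days(self) -> int:
        return (self.end - self.start).days + 1

    def contains(self, day: date) -> bool:
        return self.start <= day <= self.end

    def overlaps(self, other: "DateRange") -> bool:
        return self.start <= other.end and other.start <= self.end
```

Each method is easy to name and can be understood on its own, because the type carries the context that the extracted helpers previously had to receive as parameters.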

→ references/F9-type-centric-modularization.md

D10. Scattered Knowledge

Pick any convention in your codebase — a naming rule, a directory layout, a policy, a format assumption. Is it defined somewhere, or only followed in many places?

You have this problem if:

A rule is encoded through repetition — many places follow it, no place defines it
Changing the rule requires shotgun edits with no compiler help finding what you missed
A new reader can only learn the convention by inferring from scattered instances
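
A small sketch of a convention that is defined rather than merely followed. The avatar path layout, the function names, and the example host are all hypothetical; the point is that the format string exists in exactly one place and everything else derives from it.

```python
AVATAR_DIR = "uploads/avatars"  # hypothetical layout, defined once


def avatar_path(user_id: int) -> str:
    """The single definition of where an avatar lives."""
    return f"{AVATAR_DIR}/{user_id}.png"


def avatar_url(base_url: str, user_id: int) -> str:
    """Derived from the definition instead of repeating the format."""
    return f"{base_url.rstrip('/')}/{avatar_path(user_id)}"


def user_id_from_avatar_path(path: str) -> int:
    """The inverse sits next to the definition so both change together."""
    # str.removeprefix and str.removesuffix require Python 3.9 or later.
    name = path.removeprefix(AVATAR_DIR + "/").removesuffix(".png")
    return int(name)
```

Changing the layout now means editing one module, and a new reader learns the rule by reading `avatar_path` rather than by inferring it from call sites.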

→ references/F10-single-point-of-knowledge.md

D11. Re-Litigated Invariants

Pick a property the code relies on as data flows through it — that an array is sorted, that a record has been enriched, that two fields are in a valid combination (if status==active then expires_at must exist), that a list is non-empty. Is that property established once and carried in the type, or does each consumer independently re-check, branch on a flag, or silently trust?

You have this problem if:

A flag describing the data’s shape ('ascending' | 'descending', 'raw' | 'normalized', 'enriched' | 'unenriched') is threaded through multiple function signatures, and consumers branch on it
Two or more downstream consumers ask the same question about the data and could fall out of sync — one handles both cases correctly, another silently assumes a default
Two fields must be in valid combinations, but the relationship is upheld only by a convention that readers must infer from the code, not by the type system
The same property is re-sorted, re-checked, or re-derived at multiple call sites
A bug arises because one consumer trusted an assumption that another consumer was responsible for upholding
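
The status and `expires_at` example from the prompt above can be sketched as follows. This is one possible encoding, not a definitive one: the `cancelled` status and all type names are assumptions. The invariant is established once, at parse time, and then carried by the type.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Union


@dataclass(frozen=True)
class Active:
    expires_at: datetime  # required: an Active value cannot lack it


@dataclass(frozen=True)
class Cancelled:
    pass


Subscription = Union[Active, Cancelled]


def parse_subscription(status: str, expires_at: Optional[datetime]) -> Subscription:
    """Establish the invariant once; consumers never re-check the pairing."""
    if status == "active":
        if expires_at is None:
            raise ValueError("active subscription requires expires_at")
        return Active(expires_at)
    if status == "cancelled":
        return Cancelled()
    raise ValueError(f"unknown status: {status!r}")


def is_usable(sub: Subscription, now: datetime) -> bool:
    """A consumer: no 'what if expires_at is missing?' branch is needed."""
    return isinstance(sub, Active) and now < sub.expires_at
```

Because the invalid combination cannot be constructed through `parse_subscription`, two consumers cannot disagree about how to handle it.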

→ references/F11-lifted-invariants.md