F2. Data Flow Primacy and Data Mapping Pipelines
Diagnoses: D2. Invisible Data

Related fixes:
- F1 (Narrative Clarity) — visible data flow is what makes a narrative spine readable
- F8 (Cognitive Load) — named intermediate values are rest stops for working memory
- F9 (Type-Centric Modularization) — the intermediate types of a pipeline are exactly the domain types worth their own modules
The concept
The most important thing to see in a codebase is not the control flow (which function calls which) but the data flow (what data enters, how it transforms, what data exits). Control flow is the skeleton; data flow is the blood.
The principle: every interaction should be immediately converted into a piece of data, and that data should travel visibly through the system, being transformed at each stage, until it reaches an edge where I/O occurs.
This is the philosophical core of functional programming, but it doesn’t require a functional language. It requires a discipline: make data explicit. Name intermediate values. Use types to describe the shape of data at each stage. Let the reader see the data flowing, not just the functions calling.
The Unix pipe philosophy embodies this: each stage takes data in, transforms it, and passes data out. The pipe itself makes the data flow visible.
Technique: Data Mapping Pipelines
When data must undergo multi-step transformations, structure them as pipelines — sequential, declarative stages where each step takes a typed input, produces a typed output, and the intermediate values are named and visible.
Rather than one large block that intermixes multiple transformations on different pieces of data, break it into discrete stages: rawInput → parsed → enriched → validated → command. Each arrow is a named function. Each intermediate value has a type. The reader can understand each stage independently, and the pipeline as a whole reads as a sentence: “we take raw input, parse it, enrich it with context, validate business rules, and produce a command.”
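A minimal sketch of such a pipeline in TypeScript. The stage names match the arrows above, but the concrete types and functions (`RawInput`, `parse`, a charge command, the exchange-rate lookup) are invented for illustration:

```typescript
// Each stage is a named, typed transformation; intermediate values are visible.
type RawInput = { body: string };
type Parsed = { amount: number; currency: string };
type Enriched = Parsed & { exchangeRate: number };
type Validated = Enriched & { validatedAt: string };
type Command = { kind: "Charge"; amountInCents: number };

function parse(raw: RawInput): Parsed {
  const [amount, currency] = raw.body.split(" ");
  return { amount: Number(amount), currency };
}

function enrich(p: Parsed): Enriched {
  // Hypothetical rate table; a real system would consult a rates service.
  const rates: Record<string, number> = { USD: 1, EUR: 1.1 };
  return { ...p, exchangeRate: rates[p.currency] ?? 1 };
}

function validate(e: Enriched): Validated {
  if (e.amount <= 0) throw new Error("amount must be positive");
  return { ...e, validatedAt: new Date().toISOString() };
}

function toCommand(v: Validated): Command {
  return { kind: "Charge", amountInCents: Math.round(v.amount * v.exchangeRate * 100) };
}

// The pipeline reads as a sentence: parse, enrich, validate, command.
const raw: RawInput = { body: "25 EUR" };
const parsed = parse(raw);            // named intermediate value
const enriched = enrich(parsed);      // named intermediate value
const validated = validate(enriched); // named intermediate value
const command = toCommand(validated);
```

Because each intermediate value is named and typed, any stage can be inspected or tested in isolation; the alternative — nesting the calls as `toCommand(validate(enrich(parse(raw))))` — computes the same result but hides every intermediate shape from the reader.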
Remedy
- Convert events/interactions into data structures as early as possible
- Structure transformations as pipelines with named intermediate types at each stage
- Name intermediate data at each transformation stage (don’t nest transformations deeply)
- Prefer returning values over mutating state
- Use types/interfaces to make the shape of data at each stage explicit
- When data must be accessed from multiple places, pass it explicitly rather than storing it in ambient state
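Two of these remedies — returning values instead of mutating state, and passing data explicitly instead of reading ambient state — can be contrasted in a short sketch. The `Cart` type and both functions are invented for illustration:

```typescript
type Cart = { items: string[]; total: number };

// Avoid: mutating ambient state. The caller cannot see what data
// flowed in or out, and the function is invisible in the data flow.
let ambientCart: Cart = { items: [], total: 0 };
function addItemAmbient(item: string, price: number): void {
  ambientCart.items.push(item);
  ambientCart.total += price;
}

// Prefer: data in, new data out. The flow is explicit, the function
// is trivially testable, and prior states remain intact.
function addItem(cart: Cart, item: string, price: number): Cart {
  return { items: [...cart.items, item], total: cart.total + price };
}

const empty: Cart = { items: [], total: 0 };
const withBook = addItem(empty, "book", 12);
const withPen = addItem(withBook, "pen", 3);
```

Note that `empty` and `withBook` survive unchanged after each step — another payoff of returning values: earlier stages of the data flow stay available for inspection.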