Data Lifecycle — How: The Method
Lifecycle Mapping
The core skill is lifecycle mapping — taking any system or feature and tracing what happens to the data from birth to death. You don't need code for this. You need a diagram and the right questions.
The Three-Column Method
For any system, draw three columns:
| Storage | Transform | Transport |
|---|---|---|
| Where data rests | How data changes | How data moves |
Then fill them in for the system you're analyzing. Every piece of data in the system should appear in at least one column. Most data will touch all three.
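The method needs nothing more than paper, but the three columns can also be kept as plain data, which makes gaps easy to spot. The sketch below is one possible representation, not a required tool. The `Category` enum, the `observations` list, and the shopping-cart items are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from enum import Enum


class Category(Enum):
    STORAGE = "Storage"
    TRANSFORM = "Transform"
    TRANSPORT = "Transport"


# Hypothetical notes about a shopping cart: (data item, column, what happens).
observations = [
    ("cart_total", Category.TRANSFORM, "summed from line items"),
    ("cart_total", Category.STORAGE, "kept in the session"),
    ("cart_total", Category.TRANSPORT, "sent to the checkout page"),
    ("coupon_code", Category.STORAGE, "saved with the order"),
]


def build_columns(notes):
    """Group the notes into the three columns."""
    columns = {category: [] for category in Category}
    for name, category, detail in notes:
        columns[category].append(f"{name}: {detail}")
    return columns


def untouched_columns(notes):
    """For each data item, list the columns it has not been placed in yet."""
    seen = defaultdict(set)
    for name, category, _ in notes:
        seen[name].add(category)
    gaps = {}
    for name, categories in seen.items():
        missing = set(Category) - categories
        if missing:
            gaps[name] = missing
    return gaps


columns = build_columns(observations)
gaps = untouched_columns(observations)
```

Here `gaps` flags `coupon_code` as absent from the Transform and Transport columns. That is a prompt to ask whether the code is ever validated or sent anywhere, not proof that the map is wrong.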
How to Map a System
Follow these steps in order:
Step 1: Identify the Data
Before mapping anything, list every piece of data the system touches. Don't worry about how it works yet — just name the data.
Ask:
- What does the user provide?
- What does the system store?
- What does the system calculate or derive?
- What does the system output or display?
- What does the system exchange with other systems?
Step 2: Trace Each Piece Through Its Lifecycle
For each piece of data, follow it from birth to death:
- Where is it born? (user types it, another system sends it, it's calculated from other data)
- Where does it live? (memory, database, file, cache, queue)
- What changes it? (validation, calculation, formatting, enrichment)
- Where does it travel? (screen, API, email, another module)
- Where does it die? (deleted, archived, expired, overwritten)
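One way to keep the five questions honest is to record the answers in a fixed structure, so that an unanswered question shows up as an empty field. This is a minimal sketch. The `LifecycleTrace` class and the `email_address` example are assumptions made for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class LifecycleTrace:
    name: str
    born: str = ""        # Where is it born?
    lives: str = ""       # Where does it live?
    changed_by: str = ""  # What changes it?
    travels: str = ""     # Where does it travel?
    dies: str = ""        # Where does it die?

    def unanswered(self):
        """Return the lifecycle questions that still have no answer."""
        return [
            f.name
            for f in fields(self)
            if f.name != "name" and not getattr(self, f.name)
        ]


# Hypothetical trace of a user's email address in a sign-up feature.
trace = LifecycleTrace(
    name="email_address",
    born="user types it into the sign-up form",
    lives="users table in the database",
    changed_by="trimmed and lowercased during validation",
    travels="sent to the mail service for the welcome email",
)

open_questions = trace.unanswered()
```

In this example `open_questions` contains only `dies`, which means nobody has yet decided when the address is deleted, archived, or expired.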
Step 3: Build the Map
Use either a table or a flow diagram.
Table format — best for initial analysis:
| Stage | What Happens | Category |
|---|---|---|
| (name of the step) | (what specifically occurs) | Storage / Transform / Transport |
Flow diagram — best for communicating with others:
- Boxes = storage (data at rest)
- Arrows = transport (data in motion)
- Labels on arrows, diamonds, or boxes explicitly marked as a process = transforms (data being changed)
┌───────────┐             ┌───────────┐             ┌───────────┐
│  Source   │ ──────────► │  Process  │ ──────────► │   Dest    │
│ (storage) │  transport  │(transform)│  transport  │ (storage) │
└───────────┘             └───────────┘             └───────────┘
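If the stages are recorded as data, the table format can be produced mechanically. The sketch below assumes a hypothetical list of `(stage, what_happens, category)` tuples for a contact form and renders it in the same three-column layout. The function name `render_table` is illustrative, not part of any library.

```python
CATEGORIES = ("Storage", "Transform", "Transport")

# Hypothetical stages for a contact form submission.
stages = [
    ("User submits form", "Browser sends the message to the server", "Transport"),
    ("Validate input", "Empty fields are rejected, whitespace is trimmed", "Transform"),
    ("Save message", "Message is written to the database", "Storage"),
]


def render_table(rows):
    """Render lifecycle stages as a pipe-delimited table."""
    lines = ["| Stage | What Happens | Category |", "|---|---|---|"]
    for stage, what_happens, category in rows:
        # Refuse rows that are not labeled with one of the three categories.
        if category not in CATEGORIES:
            raise ValueError(f"unknown category for {stage!r}: {category!r}")
        lines.append(f"| {stage} | {what_happens} | {category} |")
    return "\n".join(lines)


table = render_table(stages)
print(table)
```

Rejecting an unknown category is a deliberate choice: it forces every row to be labeled as storage, transform, or transport before the table can be produced.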
Step 4: Find the Hidden Data
The most common mistake is forgetting data that isn't obvious. Every system has hidden data. Look for these specifically:
Metadata — data about data. When was this created? Who created it? What version? How many times has it been accessed? Metadata is critical for debugging and auditing, and it's almost always overlooked.
State — the current condition of something. Is this order pending, paid, in progress, or complete? Is this account active or suspended? State is data, and managing state transitions is where most bugs live.
Configuration — data that controls how the system behaves. Tax rates, store hours, feature flags, allowed file types, maximum limits. Configuration is storage that affects transforms.
Logs — a record of what happened. Every transport and transform should produce a log entry. When things break — and they will — logs are how you reconstruct what happened.
Derived data — data that is calculated from other data. A running total, a user's "membership level," an average rating. This data doesn't come from outside — it's created internally through transforms.
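To see how the five kinds of hidden data sit alongside the obvious data, consider this minimal sketch of an order record. Everything in it is an assumption for illustration: the `Order` class, the 8% tax rate, and the rule that states only move forward through pending, paid, in progress, and complete.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Configuration: stored values that control how transforms behave.
CONFIG = {"tax_rate": 0.08}  # hypothetical rate

# State: the allowed transitions between order states (an assumed ordering).
TRANSITIONS = {
    "pending": {"paid"},
    "paid": {"in progress"},
    "in progress": {"complete"},
    "complete": set(),
}


@dataclass
class Order:
    subtotal: float                          # the obvious data
    state: str = "pending"                   # state
    created_by: str = "unknown"              # metadata
    created_at: datetime = field(            # metadata
        default_factory=lambda: datetime.now(timezone.utc)
    )
    log: list = field(default_factory=list)  # logs

    @property
    def total(self):
        """Derived data: calculated from the subtotal and the configured tax rate."""
        return round(self.subtotal * (1 + CONFIG["tax_rate"]), 2)

    def move_to(self, new_state):
        """Change state only along an allowed transition, and log the change."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"cannot go from {self.state!r} to {new_state!r}")
        self.log.append(f"state: {self.state} -> {new_state}")
        self.state = new_state


order = Order(subtotal=50.0, created_by="alice")
order.move_to("paid")
```

The only data a user would name is the subtotal. The state, the creator and timestamp, the tax rate, the log entry, and the total are all hidden data that belong on the map.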
The Lifecycle Question Checklist
When analyzing any system or feature, run through these questions:
Storage:
- What data is stored?
- Where is it stored? (Is it stored in more than one place?)
- How long does it persist? (seconds? days? forever?)
- What happens if storage fails or data is lost?
- Who can access it?
- How much data accumulates over time?
Transform:
- What transforms happen to the data?
- In what order?
- What can go wrong at each step?
- Are transforms reversible? (Can you undo them?)
- Which transforms run on a schedule, and which run on demand?
Transport:
- Where does data move from and to?
- How quickly must it move? (real-time? batch? eventually?)
- What happens if transport fails? (retry? lose it? queue it?)
- How much data moves at once? (one record? thousands?)
- Is the transport secure? (does it need to be?)
- Who initiates the transport — the sender or the receiver?
If you can answer all of these for a given system, you understand that system deeply enough to build it, debug it, or redesign it.
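The transport question "What happens if transport fails? (retry? lose it? queue it?)" can be made concrete with a small sketch. The version below retries a few times and then queues the record rather than losing it. The names `deliver`, `flaky_send`, `always_down`, and `max_attempts` are hypothetical, and `ConnectionError` stands in for whatever failure a real transport would raise.

```python
from collections import deque


def deliver(record, send, retry_queue, max_attempts=3):
    """Try to send a record; after repeated failures, queue it instead of losing it.

    Returns "sent" or "queued".
    """
    for _ in range(max_attempts):
        try:
            send(record)
            return "sent"
        except ConnectionError:
            # A real system would usually wait between attempts (backoff).
            continue
    retry_queue.append(record)
    return "queued"


# Hypothetical flaky transport: fails twice, then succeeds.
calls = {"count": 0}


def flaky_send(record):
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("network unavailable")


def always_down(record):
    raise ConnectionError("network unavailable")


pending = deque()
first = deliver({"id": 1}, flaky_send, pending)
second = deliver({"id": 2}, always_down, pending)
```

The first record is sent on the third attempt. The second ends up in `pending`, which is itself new storage that now needs its own answers on the checklist: how long it persists, and what drains it.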
What a Good Lifecycle Map Looks Like
A complete lifecycle map has these properties:
- Every piece of data is accounted for: nothing appears from nowhere, nothing vanishes without explanation
- Every stage is labeled: you know whether each step is storage, transform, or transport
- Hidden data is included: metadata, state, configuration, logs, and derived data are on the map
- Failure points are visible: you can point to each stage and say "if this fails, here is what breaks"
- A stranger could follow it: someone who has never seen the system could read your map and understand the data flow
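The first two properties can be checked mechanically once a map is written down as data. The sketch below is one possible check under stated assumptions: each stage is a dict with hypothetical keys `name`, `category`, `needs`, and `gives`, and the product-rating stages are invented for illustration. The remaining properties still need human judgment.

```python
CATEGORIES = {"Storage", "Transform", "Transport"}


def check_map(stages, sources, sinks):
    """Return a list of problems found in a lifecycle map.

    stages:  ordered list of dicts with "name", "category", "needs", "gives"
    sources: data names that legitimately enter from outside the system
    sinks:   data names whose end (displayed, deleted, archived) is explained
    """
    problems = []
    available = set(sources)
    used = set()
    for stage in stages:
        if stage["category"] not in CATEGORIES:
            problems.append(f"{stage['name']}: stage is not labeled")
        for item in stage["needs"]:
            if item not in available:
                problems.append(f"{stage['name']}: {item} appears from nowhere")
            used.add(item)
        available.update(stage["gives"])
    # Anything produced but never used, and not a declared sink, has vanished.
    for item in sorted(available - used - set(sinks)):
        problems.append(f"{item} vanishes without explanation")
    return problems


# Hypothetical, deliberately flawed map of a product-rating feature.
rating_stages = [
    {"name": "Receive rating", "category": "Transport",
     "needs": ["rating"], "gives": ["raw_rating"]},
    {"name": "Average ratings", "category": "Transform",
     "needs": ["raw_rating", "old_average"], "gives": ["new_average"]},
    {"name": "Save average", "category": "",
     "needs": ["new_average"], "gives": ["stored_average"]},
]

problems = check_map(rating_stages, sources=["rating"], sinks=[])
for problem in problems:
    print(problem)
```

On this flawed map the check reports that `old_average` appears from nowhere, that "Save average" is unlabeled, and that `stored_average` vanishes without explanation. Each finding points to a question the map has not answered yet.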
The following sections present complete worked examples. Study them, then compare them to the test questions. The test will ask you to produce maps at this level of detail.