Data Lifecycle — Example: Social Media Photo Post
The Scenario
A user takes a photo on their phone, types a caption, adds a location tag, and posts it to a social media platform. Their followers see it in their feeds. Some like it. One person comments. The post appears in search results. A week later, the user checks how many views it got.
This example is interesting because the data fans out — one input (a photo) creates dozens of downstream data flows touching many different parts of the system.
Step 1: Identify All the Data
The obvious:
- The photo file
- The caption text
- The location tag
- Likes
- Comments
The hidden:
| Data | Why It Exists |
|---|---|
| Photo metadata (EXIF) | Camera embeds date, time, GPS coordinates, camera model, exposure settings into every photo file |
| Multiple photo sizes | The platform doesn't serve the original 12MB file; it creates thumbnail, medium, and large versions |
| The follow graph | The system needs to know who follows this user to build their feeds |
| Feed entries for every follower | Each follower's personalized feed needs an entry for this post |
| Notification records | Followers with notifications enabled need to be alerted |
| Search index entries | The caption and location need to be searchable |
| View count | Every time someone sees the post, it's counted |
| Engagement metrics | Likes, comments, shares, saves — each tracked separately |
| Content moderation signals | Automated scan for prohibited content, nudity detection, etc. |
| Ad relevance signals | The platform categorizes the post to match with advertisers |
| User activity timestamp | "Last active" and "posting frequency" updated |
| Privacy settings | Who can see this post? Public? Friends only? Custom list? |
A single photo post touches 15+ data categories.
Step 2: Full Lifecycle Map
Phase 1: Upload and Ingest
| # | Stage | What Happens | Category |
|---|---|---|---|
| 1 | User taps "Post" | Photo file + caption + location sent to server | Transport (phone → server) |
| 2 | Upload received | Raw data held in temporary upload storage | Storage (temporary) |
| 3 | Input validated | File type check (is it actually an image?), file size check (under limit?), caption length check | Transform (validation) |
| 4 | Content moderation scan | Automated analysis for prohibited content | Transform (analysis) |
| 5 | EXIF data extracted | GPS, timestamp, camera info pulled from photo file | Transform (extraction) |
| 6 | EXIF data compared to provided location | If user tagged "Paris" but EXIF says "Tokyo," flag for review | Transform (comparison) |
| 7 | Photo resized | Original → thumbnail (150px), medium (600px), large (1200px) | Transform (image processing) |
| 8 | EXIF stripped from public copies | GPS and camera data removed from versions served to viewers (privacy) | Transform (redaction) |
| 9 | Photos stored | All sizes stored in file storage (not the database; a separate file system) | Storage (persistent) |
| 10 | Post record created | Database record: post ID, user ID, caption, location, timestamp, photo URLs, privacy settings | Storage (persistent) |
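The validation stage (stage 3) can be sketched as below. This is a minimal illustration, not the platform's actual checks: the limits `MAX_FILE_BYTES` and `MAX_CAPTION_CHARS`, the function names, and the choice to accept only JPEG and PNG are all assumptions. The point it shows is that "is it actually an image?" is answered from the file's leading bytes, not from its name.

```python
from typing import List, Optional

# Hypothetical limits; a real platform would choose its own.
MAX_FILE_BYTES = 20 * 1024 * 1024
MAX_CAPTION_CHARS = 2200

# Leading "magic" bytes of two common image formats.
IMAGE_SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
}


def sniff_image_type(data: bytes) -> Optional[str]:
    """Return the image type implied by the file's leading bytes, or None."""
    for signature, name in IMAGE_SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None


def validate_upload(data: bytes, caption: str) -> List[str]:
    """Return a list of validation errors; an empty list means the upload passes."""
    errors = []
    if sniff_image_type(data) is None:
        errors.append("not a recognized image")
    if len(data) > MAX_FILE_BYTES:
        errors.append("file too large")
    if len(caption) > MAX_CAPTION_CHARS:
        errors.append("caption too long")
    return errors
```

Returning every error at once, instead of stopping at the first, lets the client show the user everything that needs fixing in a single round trip.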
Phase 2: Distribution (Fan-Out)
| # | Stage | What Happens | Category |
|---|---|---|---|
| 11 | Follower list retrieved | System looks up everyone who follows this user | Transport (database → distribution service) |
| 12 | Privacy filter applied | Remove followers who are blocked or excluded by privacy settings | Transform (filtering) |
| 13 | Feed entries created | For each eligible follower, a feed entry is generated pointing to this post | Storage (persistent — one entry per follower) |
| 14 | Notification candidates identified | Which followers have notifications enabled for this user? | Transform (filtering) |
| 15 | Notifications dispatched | Push notifications sent to eligible followers | Transport (server → notification service → devices) |
| 16 | Notification delivery logged | For each notification: sent/delivered/failed | Storage (persistent) |
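Stages 12 through 15 amount to two filters applied to the follower list. The sketch below uses plain lists and sets in place of real services; `fan_out`, `eligible_followers`, and the argument names are hypothetical, chosen only to mirror the table.

```python
def eligible_followers(followers, blocked, excluded):
    """Stage 12: drop followers who are blocked or excluded by privacy settings."""
    return [f for f in followers if f not in blocked and f not in excluded]


def fan_out(post_id, followers, blocked, excluded, notify_enabled):
    """Stages 12-15: return (feed_entries, notification_targets) for one post."""
    eligible = eligible_followers(followers, blocked, excluded)
    # Stage 13: one feed entry per eligible follower, pointing at the post.
    feed_entries = [(follower, post_id) for follower in eligible]
    # Stage 14: only eligible followers who opted in are notified.
    notify = [f for f in eligible if f in notify_enabled]
    return feed_entries, notify
```

Note that the notification filter runs on the already privacy-filtered list, so a blocked follower is never notified even if they have notifications enabled.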
Phase 3: Indexing
| # | Stage | What Happens | Category |
|---|---|---|---|
| 17 | Caption text indexed | Words from caption added to search index | Transform (tokenization) + Storage (search index) |
| 18 | Location indexed | Location added to geographic search | Storage (geo index) |
| 19 | Hashtags extracted and indexed | #sunset, #paris pulled from caption and indexed | Transform (extraction) + Storage (hashtag index) |
| 20 | Post added to user's profile timeline | Post appears on the user's own profile page | Storage (profile index) |
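A rough sketch of stages 17 and 19, assuming a very simple tokenizer (lowercased runs of word characters) and an in-memory inverted index. Real search systems do far more (stemming, stop words, ranking); the function names here are illustrative only.

```python
import re
from collections import defaultdict

HASHTAG_RE = re.compile(r"#(\w+)")
WORD_RE = re.compile(r"\w+")


def extract_hashtags(caption):
    """Stage 19: pull hashtags out of the caption, lowercased, without the '#'."""
    return [tag.lower() for tag in HASHTAG_RE.findall(caption)]


def tokenize(caption):
    """Stage 17: split the caption into lowercased word tokens."""
    return [word.lower() for word in WORD_RE.findall(caption)]


def index_post(index, post_id, caption):
    """Add the post to an inverted index mapping token -> set of post IDs."""
    for token in tokenize(caption):
        index[token].add(post_id)


# Usage: two posts that share one token.
index = defaultdict(set)
index_post(index, "p1", "Golden hour #sunset in #Paris")
index_post(index, "p2", "Paris at night")
```

The index is a second, specialized copy of the caption: if the caption is edited or the post is deleted, these entries have to be updated as well.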
Phase 4: Engagement (Ongoing)
| # | Stage | What Happens | Category |
|---|---|---|---|
| 21 | Follower views post | Post data retrieved and displayed | Transport (server → follower's phone) |
| 22 | View recorded | View count incremented | Transform (increment) + Storage (counter update) |
| 23 | Follower taps "Like" | Like event sent to server | Transport (phone → server) |
| 24 | Like recorded | Like record created (who liked what, when) | Storage (persistent) |
| 25 | Like count updated | Post's like count incremented | Transform (increment) |
| 26 | Post author notified of like | Notification sent to original poster | Transport (server → phone) |
| 27 | Someone comments | Comment text sent to server | Transport (phone → server) |
| 28 | Comment validated and stored | Checked for length/prohibited content, then saved | Transform + Storage |
| 29 | Comment count updated | Post's comment count incremented | Transform (increment) |
| 30 | Post author notified of comment | Notification sent | Transport |
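Stages 24 and 25 store two things: the atomic like record and the derived counter. The sketch below keeps them in step and adds one assumption the table does not state: a repeated like from the same user (a double tap or a retried request) is ignored, so the counter never drifts from the records. The `Post` class and `record_like` function are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Post:
    post_id: str
    like_count: int = 0
    # user_id -> time of like: the atomic "who liked what, when" records
    likes: dict = field(default_factory=dict)


def record_like(post, user_id, now=None):
    """Store the like record, then bump the counter.

    Returns True if a new like was recorded, False if it was a duplicate.
    """
    if user_id in post.likes:
        return False  # assumed behavior: duplicates are not counted twice
    post.likes[user_id] = now or datetime.now(timezone.utc)
    post.like_count += 1
    return True
```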
Phase 5: Analytics (Later)
| # | Stage | What Happens | Category |
|---|---|---|---|
| 31 | User checks "insights" | Analytics data aggregated from view counts, like records, comment records | Transform (aggregation) |
| 32 | Insights displayed | Aggregated data formatted and sent to user | Transform (formatting) + Transport (server → phone) |
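The aggregation in stage 31 can be illustrated with a small sketch. It assumes the atomic records are available as `(post_id, kind)` pairs, which is a simplification made up for this example; the function name `build_insights` is likewise hypothetical.

```python
from collections import Counter


def build_insights(post_id, events):
    """Aggregate atomic (post_id, kind) events into per-kind totals for one post."""
    counts = Counter(kind for pid, kind in events if pid == post_id)
    # Counter returns 0 for kinds that never occurred.
    return {
        "views": counts["view"],
        "likes": counts["like"],
        "comments": counts["comment"],
    }
```

Nothing new is stored here: the insights are computed on demand from records that were written during Phase 4.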
Step 3: The Fan-Out Problem
This example reveals a pattern the other examples don't: fan-out.
When a user with 10,000 followers posts a photo, the system must:
- Create 10,000 feed entries (one per follower)
- Potentially send 10,000 notifications
- Handle 10,000 potential views, likes, and comments
This is a one-to-many transport and storage problem. The lifecycle of a single post multiplies at the distribution phase.
                                           ┌─ Follower A's feed
                                           ├─ Follower B's feed
    ┌──────┐       ┌────────┐              ├─ Follower C's feed
    │ Post │──────►│Fan-Out │──────────────├─ Follower D's feed
    │      │       │Service │              ├─ ...
    └──────┘       └────────┘              └─ Follower N's feed
                       │
                       │
                       ▼
                  ┌──────────┐             ┌─ Notification → A
                  │Notify    │─────────────├─ Notification → B
                  │Service   │             └─ Notification → (subset)
                  └──────────┘
This creates interesting data lifecycle questions:
- Do you create all 10,000 feed entries immediately? Or lazily when each follower opens their app?
- What if a follower opens their feed while the fan-out is still in progress? Do they see the post or not?
- What if the user deletes the post 5 seconds after posting? Can you recall all 10,000 feed entries?
These are design decisions that emerge directly from mapping the lifecycle.
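The first and third questions can be made concrete with a sketch of the two common options, often called fan-out on write (push) and fan-out on read (pull). The data shapes and function names below are assumptions for illustration. In both versions the feed holds pointers to posts, so a deleted post drops out at read time even if its feed entries have not been cleaned up yet.

```python
def feed_on_write(feeds, follower, posts):
    """Push model: feed entries were precomputed at post time.

    Resolve each pointer when the feed is read, skipping deleted posts.
    """
    return [posts[pid] for pid in feeds.get(follower, []) if pid in posts]


def feed_on_read(following, posts_by_author, follower):
    """Pull model: build the feed only when the follower opens the app."""
    feed = []
    for author in following.get(follower, []):
        feed.extend(posts_by_author.get(author, []))
    # Newest first.
    return sorted(feed, key=lambda post: post["created_at"], reverse=True)
```

The trade-off is where the work lands: the push model pays once per follower when the post is created, while the pull model pays once per followed account every time a feed is opened.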
Step 4: Multiple Storage Locations — Same Data
Notice that the same photo exists in multiple forms and multiple places:
| Version | Storage Location | Purpose | Lifetime |
|---|---|---|---|
| Original upload | Temporary upload storage | Processing input | Deleted after processing (hours) |
| Original (full resolution) | Permanent file storage | Backup/recovery, "download original" feature | Until the user deletes the post |
| Large (1200px) | Permanent file storage + CDN cache | Desktop viewing | Until the user deletes the post |
| Medium (600px) | Permanent file storage + CDN cache | Mobile feed viewing | Until the user deletes the post |
| Thumbnail (150px) | Permanent file storage + CDN cache | Grid view, previews | Until the user deletes the post |
Five copies of what started as one photo. Each has a different purpose and potentially a different lifecycle. If the user deletes the post, every copy that still exists must be deleted, and the CDN caches must be invalidated as well. Missing any one copy leaves orphaned data sitting in storage indefinitely.
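One way to keep deletion complete is to derive every storage key from the post ID in a single place, then delete from that list. The sketch below assumes a made-up key layout (`uploads/tmp/<id>`, `photos/<id>/<size>.jpg`), a dict standing in for file storage, and a set standing in for the CDN cache.

```python
SIZES = ("original", "large", "medium", "thumbnail")


def copies_for(post_id):
    """Enumerate every storage key derived from one post's photo."""
    keys = [f"uploads/tmp/{post_id}"]
    keys += [f"photos/{post_id}/{size}.jpg" for size in SIZES]
    return keys


def delete_post(post_id, storage, cdn_cache):
    """Delete all known copies, purge CDN entries, and report leftovers.

    Returns a sorted list of keys that still mention the post: these are
    orphans that copies_for() does not know about.
    """
    for key in copies_for(post_id):
        storage.pop(key, None)  # a missing key is fine: temp upload may be gone
        cdn_cache.discard(key)
    return sorted(
        key for key in storage
        if f"/{post_id}/" in key or key.endswith(f"/{post_id}")
    )
```

A non-empty return value signals that some other part of the system wrote a copy the deletion path does not know about, which is exactly the orphaned-data case described above.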
Step 5: Comparing All Three Examples
| Aspect | Coffee Shop | ATM | Social Media Post |
|---|---|---|---|
| Data flow shape | Linear (order → payment → fulfillment) | Linear with two-phase commit | Fan-out (one post → many feeds) |
| Number of data copies | 1 (the order) | 1 (the transaction) | Many (photo versions, feed entries, index entries) |
| Time sensitivity | Minutes (order should be ready soon) | Seconds (transaction must be instant) | Mixed (post immediately, analytics later) |
| Deletion complexity | Simple (one record) | N/A (transactions are permanent legal records) | Complex (must remove from all copies, feeds, indexes, caches) |
| Who consumes the data | Customer + barista | Customer + bank | Thousands of followers, search engines, analytics |
| Biggest hidden data | Tax config, menu version | Fraud signals, hardware status | Follow graph, EXIF metadata, content moderation |
Key Takeaways From This Example
- Fan-out multiplies the lifecycle — one action can create thousands of downstream data events
- The same data exists in multiple forms — and each form has its own storage, its own lifecycle, and its own deletion requirements
- Indexing is a separate lifecycle stage — making data searchable requires transforming and storing it in additional specialized formats
- Privacy intersects with data lifecycle — EXIF stripping, privacy-filtered distribution, and blocked-user exclusion are all transforms driven by non-obvious data (privacy settings, block lists)
- Analytics are derived data — not stored at the time of action, but aggregated later from atomic records (views, likes, comments)
When mapping a system where one action triggers many reactions, always ask: "How many copies of this data exist, and what happens to all of them when the original changes?"