Architecture & Dependencies

How services depend on each other across home + farm

The dependency graph for the homelab. When something breaks, walk up the graph to find which upstream is the actual problem; most “X is broken” stories end at DNS, the router, or a specific Proxmox host.
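
The "walk up the graph" step can be sketched in a few lines of Python. This is a minimal illustration, not an existing script: the `UPSTREAMS` edges are a simplified reading of the network path diagram (treating DNS as an upstream of Caddy is an assumption), and `root_causes` and the health dictionary are hypothetical names.

```python
# Simplified upstream edges, loosely based on the network path diagram.
# Assumption: DNS (Pi-hole) is modelled as an upstream of Caddy because
# clients must resolve *.edmd.me before they can reach the proxy.
UPSTREAMS = {
    "Immich": ["Caddy"],
    "Roon": ["Caddy"],
    "Services": ["Caddy"],
    "Caddy": ["OPNsense", "Pi-hole"],
    "Pi-hole": ["Unbound"],
    "Unbound": [],
    "OPNsense": [],
}


def root_causes(node, upstreams, healthy, _memo=None):
    """Return the deepest unhealthy nodes upstream of (or at) `node`.

    `healthy` maps node name -> bool; nodes missing from it count as healthy.
    """
    memo = {} if _memo is None else _memo
    if node in memo:
        return memo[node]
    memo[node] = []  # placeholder so cycles in the graph terminate
    causes = []
    for up in upstreams.get(node, []):
        for cause in root_causes(up, upstreams, healthy, memo):
            if cause not in causes:
                causes.append(cause)
    # Blame this node only when nothing further upstream is broken.
    if not causes and not healthy.get(node, True):
        causes = [node]
    memo[node] = causes
    return causes
```

Called as `root_causes("Immich", UPSTREAMS, {"Immich": False, "Pi-hole": False, "Unbound": False})`, the walk skips past Immich and Pi-hole and reports Unbound, which is the point of looking upstream before touching the service that appears broken.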

Network path (any client → service)
flowchart TD
    Client[Client device<br/>Mac / iPhone / iPad]
    NetBird[NetBird mesh<br/>WireGuard P2P + relays]
    OPNsense[OPNsense router<br/>192.168.8.1]
    Pihole[Pi-hole DNS<br/>CT102 192.168.8.53]
    Unbound[Unbound recursive<br/>CT100 :5335]
    Caddy[Caddy reverse proxy<br/>CT103 192.168.8.54]
    Services[Services on CT100<br/>192.168.8.100]
    Immich[CT101 Immich<br/>192.168.8.103]
    Roon[CT105 Roon<br/>192.168.8.105]

    Client -->|DNS query *.edmd.me| Pihole
    Client -->|HTTPS request| OPNsense
    Pihole --> Unbound
    Unbound -->|forward *.edmd.me| Pihole
    OPNsense -->|192.168.8.54| Caddy
    Caddy -->|reverse_proxy| Services
    Caddy -->|reverse_proxy| Immich
    Caddy -->|reverse_proxy| Roon
    Client -.->|off-LAN| NetBird
    NetBird -.->|via hpve subnet route| OPNsense

When DNS is broken, every .edmd.me URL fails. Check Pi-hole first, then Unbound. When DNS works but a specific service is unreachable, the issue is in Caddy or the service itself.
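
The DNS-first triage above can be approximated with the standard library. This is a rough sketch under assumptions: `triage` and `tcp_probe` are hypothetical helpers, the TCP connect on port 443 only shows that something (normally Caddy) is listening and says nothing about the backend behind it, and any hostname passed in is illustrative.

```python
import socket


def tcp_probe(address, port=443, timeout=3.0):
    """Return True if a TCP connection to address:port succeeds."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False


def triage(hostname, resolve=socket.gethostbyname, probe=tcp_probe):
    """Classify a failure as DNS-level or service-level.

    `resolve` and `probe` are injectable so the logic can be exercised
    without touching the network.
    """
    try:
        address = resolve(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return "dns: check Pi-hole first, then Unbound"
    if not probe(address):
        return "service: check Caddy, then the service itself"
    return "ok"
```

Running `triage` against one `.edmd.me` name from a LAN client separates the two failure classes in a single step; a "dns" result means there is no point in looking at Caddy yet.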

Home ↔ Farm cross-site
flowchart LR
    subgraph home["Home LAN - 192.168.8.0/24"]
        hpve[hpve<br/>Proxmox]
        homeCTs[CT100-CT105]
        studio[Mac Studio]
        hpve --> homeCTs
    end
    subgraph farm["Farm LAN - 192.168.0.0/24"]
        fpve[fpve<br/>Proxmox]
        farmCTs[CT100-CT103]
        ha[Home Assistant<br/>192.168.0.10]
        fpve --> farmCTs
    end
    subgraph mesh["NetBird mesh"]
        nb[NetBird cloud]
    end
    hpve -.->|peer 100.123.31.199<br/>routes 192.168.8.0/24| nb
    fpve -.->|peer 100.123.49.175<br/>routes 192.168.0.0/24| nb
    studio -.->|peer 100.123.217.253| nb
    nb -.->|advertised routes| studio

Failure pattern: when fpve falls off the mesh, the farm LAN becomes unreachable from home, even though fpve itself is fine inside the farm LAN. The dependency is on fpve being NetBird-connected, not on it being alive.
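
To make that failure pattern concrete, here is a small model of the routing dependency. It is a sketch only: the subnets come from the diagram, while `ROUTES`, `routing_peer`, and the idea of passing in a set of currently connected peers are assumptions for illustration and do not query NetBird.

```python
from ipaddress import ip_address, ip_network

# Subnet routes advertised by each Proxmox host, as drawn in the diagram.
ROUTES = {
    "hpve": ip_network("192.168.8.0/24"),
    "fpve": ip_network("192.168.0.0/24"),
}


def routing_peer(destination, connected_peers, routes=ROUTES):
    """Return the connected peer that routes to `destination`, or None.

    A host that is up but not connected to the mesh does not count:
    reachability depends on mesh membership, not on host uptime.
    """
    target = ip_address(destination)
    for peer, network in routes.items():
        if peer in connected_peers and target in network:
            return peer
    return None
```

With both peers connected, `routing_peer("192.168.0.10", {"hpve", "fpve"})` returns `"fpve"`; drop fpve from the set and Home Assistant has no route, regardless of whether fpve is still running on the farm LAN.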

Data flows: dictation + briefing + doc-sync
flowchart TD
    JPR[Just Press Record<br/>Watch / iPhone]
    iCloud[iCloud Drive]
    Studio[Mac Studio]
    Whisper[Whisper large-v3<br/>mlx-whisper]
    Parse[parse-dictation.py<br/>dash-command parser]
    Tasks[TASKS.md]
    Diary[~/Sync/ED/dictation/diary/]
    Tana[Tana #interaction nodes]
    JPR --> iCloud
    iCloud -->|brctl download| Studio
    Studio --> Whisper
    Whisper --> Parse
    Parse --> Tasks
    Parse --> Diary
    Parse --> Tana

    Briefing[daily-briefing<br/>Cowork 04:00]
    ArrHelper[arr-briefing-data.py]
    BackupHelper[homelab-backup-status.py]
    Snapshot[~/.homelab-snapshot.json]
    Email[Gmail msmtp]
    Drafts[Drafts note]
    BriefMd[morning-briefing.md]
    Briefing --> ArrHelper
    Briefing --> BackupHelper
    Briefing --> Snapshot
    Briefing --> Tasks
    Briefing --> Diary
    Briefing --> BriefMd
    Briefing --> Email
    Briefing --> Drafts

    DocSync[doc-sync<br/>launchd 03:00]
    Key[~/.config/anthropic-api-key]
    Transcripts[Yesterday's Claude transcripts]
    Report[~/Sync/ED/.doc-sync-log/.md]
    DocSync --> Key
    DocSync --> Transcripts
    DocSync -->|patches| Tasks
    DocSync -->|patches| Diary
    DocSync --> Report

The briefing draws on 5+ data sources. If any single source is unavailable, the SKILL is hardened to print [source unavailable] and continue, so no single failure kills the whole briefing.
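
The degrade-and-continue behaviour follows a common pattern, shown here in Python purely as an illustration. The real briefing is a Cowork SKILL, so this is not its implementation; `gather_sections` and the source names in the usage note are hypothetical.

```python
def gather_sections(sources):
    """Collect one briefing section per source, tolerating failures.

    `sources` maps a section name to a zero-argument callable. Any
    exception from a callable is replaced with a placeholder so the
    remaining sources are still collected.
    """
    sections = {}
    for name, fetch in sources.items():
        try:
            sections[name] = fetch()
        except Exception:
            # One broken source must not abort the whole briefing.
            sections[name] = "[source unavailable]"
    return sections
```

Passing `{"tasks": read_tasks, "backups": read_backup_status}` (both hypothetical callables) yields a complete dictionary even when one of them raises, which keeps the failure visible in the output instead of silently dropping the section.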

Implicit dependencies

These don’t show on the diagrams but matter when something breaks:

  • ~/.homelab-snapshot.json is written by com.bee.homelab-snapshot.plist every few minutes. If the launchd job is dead, the snapshot ages out and the briefing’s homelab-health section falls back to live SSH (which is slower and brittle if pve is busy).
  • Time synchronization (NTP) matters everywhere: backup timestamps, log alignment between hpve and Mac Studio, doc-sync’s “yesterday” calculation. macOS handles it automatically; pve uses systemd-timesyncd.
  • Cloudflare is implicit upstream for DNS resolution outside the LAN, wildcard TLS cert issuance for Caddy, and the edmd.me domain. A Cloudflare account outage breaks new cert renewals (existing certs survive for ~30 days).
  • Anthropic API is implicit upstream for doc-sync, the briefing’s prompt assembly, dictation parsing, and most of Cowork. An outage cascades to most automation.
  • Syncthing relay infrastructure (run by the Syncthing project) is implicit upstream when home and MacBook can’t establish a direct P2P link. Default-on, free, but if it goes down sync degrades silently.
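
The snapshot dependency in the first bullet is easy to check by file age. The sketch below is an assumption-heavy example: the 15-minute threshold is a guess (the text only says the job runs "every few minutes"), and `snapshot_is_fresh` is a hypothetical helper, not part of the briefing.

```python
import os
import time

# Assumed threshold: the launchd job runs "every few minutes", so a
# snapshot older than 15 minutes is treated as stale in this sketch.
MAX_AGE_SECONDS = 15 * 60


def snapshot_is_fresh(path, max_age=MAX_AGE_SECONDS, now=None):
    """Return True if `path` exists and was modified within `max_age` seconds."""
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(path)
    except OSError:
        # A missing or unreadable snapshot counts as stale.
        return False
    return age <= max_age
```

A caller would check `snapshot_is_fresh(os.path.expanduser("~/.homelab-snapshot.json"))` and, on `False`, both fall back to live SSH and flag that the launchd job may be dead, since the fallback otherwise hides the root cause.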

When X breaks, look upstream
| Symptom | Walk up to |
| --- | --- |
| *.edmd.me cert errors | Caddy → Cloudflare wildcard cert state |
| All .edmd.me unreachable | Caddy → DNS path (Pi-hole → Unbound) |
| One specific .edmd.me URL fails | Caddy’s Caddyfile (port mapping for that subdomain) |
| Whole LAN slow | OPNsense → ISP |
| Off-LAN can’t reach anything | NetBird mesh → device’s NetBird client state |
| ha-mcp dead | fpve NetBird state → farm internet → fpve uptime |
| Briefing missing | doc-sync auth state → API key → Cowork scheduler |
| Dictation not in diary | iCloud sync state → whisper venv → parse-dictation.py |
| Doc-sync producing 300-byte reports | API key state (~/.config/anthropic-api-key); see doc-sync runbook |