Pipeline overview

10 phases · 52 sub-steps · everything that runs, in one view.

intake: runs once every 30 days, from scratch
monthly: first run after intake
weekly: every Monday
biweekly: every Monday + Thursday (twice a week)
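
The cadence tags in the legend above amount to a small scheduling rule. A minimal sketch, assuming a hypothetical `cadences_due` helper and a stored `last_intake` date; treating "first run after intake" as "monthly phases run in the same cycle as intake" is one reading of the legend, not a confirmed behavior.

```python
from datetime import date

INTAKE_INTERVAL_DAYS = 30  # from the legend: intake runs once every 30 days


def cadences_due(today: date, last_intake: date) -> list[str]:
    """Return the cadence tags whose phases would run on `today`."""
    due = []
    if (today - last_intake).days >= INTAKE_INTERVAL_DAYS:
        due.append("intake")
        # Assumption: monthly phases follow intake in the same cycle.
        due.append("monthly")
    if today.weekday() == 0:        # Monday
        due.append("weekly")
    if today.weekday() in (0, 3):   # Monday or Thursday
        due.append("biweekly")
    return due
```

A phase tagged with several cadences (for example "monthly weekly") would run whenever any of its tags is due.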
PHASE 001

Data Acquisition

intake
Fetch raw Google Maps SERPs, clean them down to a candidate-domain list, write silver. 9 sub-steps.
PHASE 002

Domain Verification

intake
Confirm each candidate domain actually resolves and serves content. Kill the dead ones.
002a
HTTP-verify each domain

Hit each candidate domain over HTTP. Confirm DNS resolves and the server returns content.

HTTP
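
A rough sketch of what step 002a describes, using only the Python standard library. The function names and the three result labels are illustrative assumptions, not the pipeline's actual code; the DNS and fetch checks are injectable so the logic can be exercised without network access.

```python
import http.client
import socket
import urllib.error
import urllib.request


def resolves(domain: str) -> bool:
    """True if DNS returns at least one address for the domain."""
    try:
        socket.getaddrinfo(domain, 80)
        return True
    except socket.gaierror:
        return False


def serves_content(domain: str, timeout: float = 10.0) -> bool:
    """True if a GET on the homepage returns a non-empty body."""
    try:
        with urllib.request.urlopen(f"http://{domain}/", timeout=timeout) as resp:
            return len(resp.read(1024)) > 0
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        return False


def verify_domain(domain: str, resolve=resolves, fetch=serves_content) -> str:
    """Classify a candidate as 'alive', 'no_dns', or 'no_content'."""
    if not resolve(domain):
        return "no_dns"
    if not fetch(domain):
        return "no_content"
    return "alive"
```

Only domains that come back as "alive" would move on to classification; the rest are the "dead ones" this phase drops.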
PHASE 003

Domain Classification

intake
Decide which verified domains are actually personal-injury law firms. Produces gold_domains.
003a
Keyword pre-filter

Cheap string-match: does the domain contain known PI-attorney keywords? Skip the AI on obvious matches.
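
The pre-filter can be as small as a substring check. A sketch under assumptions: the keyword list below is hypothetical, and the real list is not given in this overview.

```python
# Hypothetical keyword list; the pipeline's real list is not shown here.
PI_KEYWORDS = ("injury", "accident", "wreck", "crash")


def keyword_prefilter(domain: str) -> bool:
    """True if the domain name contains an obvious PI-attorney keyword."""
    name = domain.lower()
    return any(keyword in name for keyword in PI_KEYWORDS)
```

Domains that pass can skip the model call in 003c; everything else still goes through the browser fetch and Haiku.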

003b
Playwright fetch

Load each homepage in a real browser. Necessary for JS-heavy sites that don’t render via plain HTTP.

Playwright
003c
Haiku classify

Ask Claude Haiku: “is this actually a PI law firm?” Yes/no per domain.

Anthropic Haiku
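
One way to structure the yes/no step, sketched without tying it to any particular client library. The prompt wording, the `classify_pi_firm` name, and the `ask` callable (prompt in, model text out) are all assumptions for illustration; the actual Haiku call is left to the caller.

```python
from typing import Callable

# Hypothetical prompt; the pipeline's real prompt is not shown in this overview.
PROMPT = (
    "Is the site below a personal-injury law firm? Answer only YES or NO.\n\n"
    "Domain: {domain}\n\nHomepage text:\n{text}"
)


def classify_pi_firm(
    domain: str,
    homepage_text: str,
    ask: Callable[[str], str],
    max_chars: int = 4000,
) -> bool:
    """Build the prompt, call the model via `ask`, and parse a strict yes/no."""
    prompt = PROMPT.format(domain=domain, text=homepage_text[:max_chars])
    answer = ask(prompt).strip().upper()
    if answer.startswith("YES"):
        return True
    if answer.startswith("NO"):
        return False
    # Anything else is surfaced rather than silently counted as "no".
    raise ValueError(f"unexpected model answer for {domain}: {answer!r}")
```

Raising on an unparseable answer keeps ambiguous domains out of gold_domains until they are retried or reviewed.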
003d
Promote → gold_domains

Insert the Haiku-confirmed PI firms into the authoritative gold_domains table.

PHASE 004

Specialties

intake
Tag each gold firm with practice areas. Closes out intake.
004a
Detect practice areas

Scrape each firm’s site for the practice areas they advertise (auto accidents, slip-and-fall, med-mal, etc.).

004b
Merge into gold_domains

Write the detected specialty tags back into the gold_domains row for each firm.

PHASE 005

OnPage

monthly
Crawl every page of every firm’s website, then decompose the crawl into 9 downstream tables.
005a
DFS deep crawl

Crawl every page of every firm’s site: HTML, headers, status codes, structure. The heavy, expensive step.

DFS OnPage
005b
OnPage summary

Summarize each crawl: page count, average load time, error counts. Derived from 005a.
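
The summary is a plain aggregation over the per-page crawl rows. A sketch, assuming each row carries hypothetical `status_code` and `load_time_ms` fields and that a status of 400 or above counts as an error:

```python
def summarize_crawl(pages: list[dict]) -> dict:
    """Roll per-page crawl rows up into one summary row."""
    count = len(pages)
    errors = sum(1 for page in pages if page["status_code"] >= 400)
    avg_load = sum(page["load_time_ms"] for page in pages) / count if count else 0.0
    return {"page_count": count, "avg_load_time_ms": avg_load, "error_count": errors}
```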

005c
Per-page detail

Per-URL detail row: title, meta description, H1, word count.

005d
Link graph

Internal & external link graph extracted from the crawl.
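
Splitting links into internal and external comes down to comparing hostnames. A minimal sketch with a hypothetical `classify_links` helper; it compares hosts exactly, so `www.` and bare-domain variants would need normalizing first.

```python
from urllib.parse import urljoin, urlparse


def classify_links(page_url: str, hrefs: list[str]) -> list[tuple[str, str, str]]:
    """Return (source, target, kind) edges for the links found on one page."""
    site_host = urlparse(page_url).hostname
    edges = []
    for href in hrefs:
        target = urljoin(page_url, href)      # resolve relative links
        host = urlparse(target).hostname
        if host is None:                      # mailto:, tel:, and similar
            continue
        kind = "internal" if host == site_host else "external"
        edges.append((page_url, target, kind))
    return edges
```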

005e
Duplicate tags

Find pages that share <title> or <meta description> — SEO duplicate-content bugs.
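
Duplicate detection is a group-by on the tag value. A sketch, assuming per-page rows with hypothetical `url`, `title`, and `description` fields; values are compared after trimming and lowercasing.

```python
from collections import defaultdict


def find_duplicates(pages: list[dict], field: str) -> dict[str, list[str]]:
    """Map each tag value shared by 2+ pages to the URLs that share it."""
    groups = defaultdict(list)
    for page in pages:
        value = (page.get(field) or "").strip().lower()
        if value:                              # ignore missing or empty tags
            groups[value].append(page["url"])
    return {value: urls for value, urls in groups.items() if len(urls) > 1}
```

Running it once with `field="title"` and once with `field="description"` covers both checks.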

005f
Redirect chains

Find multi-hop redirects and redirect loops.
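
Both cases fall out of walking a source-to-target redirect map. A sketch with hypothetical helper names; `redirects` is assumed to hold one entry per redirecting URL from the crawl.

```python
def trace_redirects(start: str, redirects: dict[str, str], max_hops: int = 20):
    """Follow redirects from `start`; return (chain, is_loop)."""
    chain = [start]
    seen = {start}
    current = start
    while current in redirects and len(chain) <= max_hops:
        current = redirects[current]
        chain.append(current)
        if current in seen:        # came back to a URL already visited
            return chain, True
        seen.add(current)
    return chain, False


def is_multi_hop(chain: list[str]) -> bool:
    """True if reaching the final URL took two or more redirects."""
    return len(chain) - 1 >= 2
```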

005g
Non-indexable pages

Find pages blocked from Google (noindex, robots.txt disallow).
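
The two checks named here can be sketched with the standard library: `urllib.robotparser` for robots.txt rules and `html.parser` for a robots meta tag. The `non_indexable_reason` helper and its return labels are assumptions; a noindex sent via an HTTP header is not covered by this sketch.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class _RobotsMeta(HTMLParser):
    """Records whether a <meta name="robots"> tag contains noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            if "noindex" in attr.get("content", "").lower():
                self.noindex = True


def non_indexable_reason(url: str, html: str, robots_txt: str):
    """Return why Google cannot index the page, or None if nothing blocks it."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch("Googlebot", url):
        return "robots.txt disallow"
    meta = _RobotsMeta()
    meta.feed(html)
    return "noindex" if meta.noindex else None
```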

005h
Lighthouse audit

Google Lighthouse scores: performance, accessibility, SEO, best practices.

DFS Lighthouse
005i
PageSpeed Insights

Real-user Core Web Vitals (LCP / CLS / INP) from Google’s field data (the Chrome UX Report).

Google PSI
005j
Haiku attorney count

Send each firm’s “Our Team” / “Attorneys” page to Haiku. Count the attorneys.

Anthropic Haiku
PHASE 006

Domain Intel

monthly
Who owns the domain, when it was registered, what tech it’s built on.
006a
WHOIS lookup

Domain registration date, registrar, expiry. Domain age is treated here as a trust signal, on the assumption that older domains tend to rank better.

DFS WHOIS
006b
Tech-stack detection

Detect WordPress / React / jQuery / GA / Facebook Pixel / etc. per domain.

DFS Tech
PHASE 007

Google Business Profile

weekly
The Maps listing for each firm. Reviews and updates change often — refreshed weekly.
007a
GBP info

Basic fields: address, phone, hours, primary category, rating, review count, photo count.

DFS GBP
007b
GBP reviews

Pull every Google review: text, rating, date, author, owner reply.

DFS GBP
007c
GBP updates

Pull every Google Post / update the firm has published.

DFS GBP
PHASE 008

SERP Analysis

monthly weekly
Who outranks each firm on Maps & Search. Drives heatmaps and competitor sets.
008a
Maps SERP grid

Maps SERP from a grid of GPS points around each firm. Source for local-dominator heatmaps.

DFS Maps SERP
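
The grid of GPS points can be generated from the firm's coordinates with a flat-earth approximation, which is adequate at city scale. A sketch: the grid size, the spacing, and the `gps_grid` name are assumptions, since the overview does not state the real grid parameters.

```python
import math

KM_PER_DEG_LAT = 111.32  # approximate length of one degree of latitude


def gps_grid(lat: float, lon: float, size: int = 5, spacing_km: float = 1.0):
    """Return size x size (lat, lon) points centred on the given location."""
    half = (size - 1) / 2
    dlat = spacing_km / KM_PER_DEG_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = spacing_km / (KM_PER_DEG_LAT * math.cos(math.radians(lat)))
    return [
        (lat + (row - half) * dlat, lon + (col - half) * dlon)
        for row in range(size)
        for col in range(size)
    ]
```

Each point would get its own Maps SERP query; the firm's rank at each point is what a heatmap cell displays.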
008b
Organic SERP

Who outranks each firm on Google organic for their target keywords.

DFS Organic SERP
008c
Local Finder

Google’s expanded local pack — beyond the visible 3-pack.

DFS Local Finder
008d
Autocomplete

Google autocomplete suggestions when typing the firm’s name or category.

DFS Autocomplete
PHASE 009

Backlinks

monthly biweekly
Who links to each firm and how much authority those links carry. Deep monthly + bulk biweekly.
009a
Backlinks summary

Per firm: total backlinks, total referring domains, domain rank.

DFS Backlinks
009b
Backlinks live

The full list of live URLs pointing at each firm.

DFS Backlinks
009c
Bulk Domain Rank

DFS authority score per firm. Fast bulk endpoint.

DFS bulk
009d
Bulk total backlinks

Total backlinks count per firm. Fast bulk endpoint.

DFS bulk
009e
Bulk spam score

DFS spam score per firm. Flags low-quality backlink profiles.

DFS bulk
009f
Bulk referring domains

Count of unique referring domains per firm.

DFS bulk
009g
Bulk new / lost referring

Referring domains gained or lost in the last period — who started or stopped linking.

DFS bulk
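
The bulk endpoint reports these figures directly, but the idea behind "gained or lost" is a set difference between two snapshots. A sketch with a hypothetical `referring_diff` helper, shown only to make the metric concrete:

```python
def referring_diff(previous: list[str], current: list[str]) -> dict[str, list[str]]:
    """Compare two snapshots of referring domains for one firm."""
    prev, curr = set(previous), set(current)
    return {
        "new": sorted(curr - prev),    # started linking this period
        "lost": sorted(prev - curr),   # stopped linking this period
    }
```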
009h
Bulk indexed pages

Total pages of each firm’s site indexed by Google.

DFS bulk
PHASE 010

Keywords

monthly weekly biweekly
What every firm ranks for, what they could rank for, where their traffic comes from.
010a
Ranked keywords

Every keyword each firm ranks for: position, volume, traffic estimate, CPC.

DFS Labs
010b
Bulk traffic

Estimated organic traffic per firm. Fast bulk endpoint.

DFS bulk
010c
Domain rank overview

High-level organic visibility score per firm.

DFS Labs
010d
Bulk keyword difficulty

Difficulty scores for the tracked keywords.

DFS bulk
010e
Related keywords

Semantically related keywords per tracked term.

DFS Labs
010f
Keyword suggestions

Autocomplete-style expansions per tracked term.

DFS Labs
010g
Keyword ideas

Long-tail keyword ideas adjacent to the tracked set.

DFS Labs
010h
Categories for domain

Topical categories each firm covers (auto law / family law / personal injury / etc.).

DFS Labs
010i
Keyword overview

Overall keyword performance summary per firm.

DFS Labs