# Session 2026-06-10 — Phase 12: recruiters + knowledge base + demographics

## What's live

### Question bank (the learning loop)
- New tables: `question_bank`, `question_answers`
- Wired into every Playwright applier via `appliers/playwright_base.fill_known_or_guess`
- Precedence: question_bank → canonical matcher → heuristic → unknown
- Every field encountered on every form is logged and every answer used is
  stored; stored questions are fuzzy-matched (`SequenceMatcher` ratio ≥ 0.85)
  for reuse on later forms
- 5 new MCP tools: `list_unanswered_questions`, `list_answered_questions`,
  `teach_question_answer`, `search_question_bank`, `question_bank_stats`
- Bank stays at 0 rows until the first dry-run, then should grow quickly
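
The lookup order and the fuzzy match described above can be sketched roughly as follows. This is a minimal illustration, not the code in `appliers/playwright_base`: the function names `normalize`, `lookup_bank`, and `resolve`, the dict-shaped bank, and the callable matchers are all assumptions made for the example. Only the 0.85 threshold and the precedence order come from the notes.

```python
from difflib import SequenceMatcher

MATCH_THRESHOLD = 0.85  # threshold taken from the notes above


def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't lower the ratio."""
    return " ".join(label.lower().split())


def lookup_bank(label, bank):
    """Return the stored answer whose question best matches `label`, or None.

    `bank` is a hypothetical {question_text: answer} dict standing in for
    the question_bank / question_answers tables.
    """
    target = normalize(label)
    best_answer, best_ratio = None, 0.0
    for question, answer in bank.items():
        ratio = SequenceMatcher(None, target, normalize(question)).ratio()
        if ratio > best_ratio:
            best_answer, best_ratio = answer, ratio
    return best_answer if best_ratio >= MATCH_THRESHOLD else None


def resolve(label, bank, canonical, heuristic):
    """Walk the chain: question_bank -> canonical matcher -> heuristic -> unknown.

    `canonical` and `heuristic` are hypothetical callables that take the
    field label and return an answer or None.
    """
    answer = lookup_bank(label, bank)
    if answer is not None:
        return answer, "question_bank"
    for source, matcher in (("canonical", canonical), ("heuristic", heuristic)):
        answer = matcher(label)
        if answer is not None:
            return answer, source
    return None, "unknown"
```

Returning the source name alongside the answer makes it easy to log which layer filled each field, which is useful when deciding what to teach the bank next.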

### Demographics infrastructure (multi-role profiles)
- New `demographics` table (one row per real human, shared across profiles)
- Profiles gain `demographic_id` FK
- Migrated Michael's current profile data into `demographics` row
  `michael_holleran`:
  - Name: Michael J. Holleran II
  - Address: 6815 Whitetail Ln, Westerville OH 43082
  - Phone, email, LinkedIn, EEO (Male/White/non-veteran/has-disability)
- 4 new MCP tools: `get_demographics`, `set_demographics` (nested-dict-aware),
  `create_role_profile` (spawn sibling profile sharing demographics),
  `list_profiles`
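
As an illustration of the one-human, many-profiles shape, here is a minimal SQLite sketch. The column names (`full_name`, `role`), the helper names, and the `example_person` row are placeholders invented for the example; only the `demographics` table, the `profiles` table, and the `demographic_id` FK come from the notes, and the real schema may well differ.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a demographics table shared by profiles."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
    conn.executescript(
        """
        CREATE TABLE demographics (
            id        TEXT PRIMARY KEY,   -- placeholder slug such as 'example_person'
            full_name TEXT NOT NULL
        );
        CREATE TABLE profiles (
            id             INTEGER PRIMARY KEY,
            role           TEXT NOT NULL,
            demographic_id TEXT NOT NULL REFERENCES demographics(id)
        );
        """
    )
    return conn


def create_role_profile(conn, demographic_id: str, role: str) -> int:
    """Add a sibling profile that reuses an existing demographics row."""
    cur = conn.execute(
        "INSERT INTO profiles (role, demographic_id) VALUES (?, ?)",
        (role, demographic_id),
    )
    return cur.lastrowid


def list_profiles(conn, demographic_id: str):
    """Return (role, full_name) pairs for every profile of one person."""
    rows = conn.execute(
        """
        SELECT p.role, d.full_name
        FROM profiles AS p
        JOIN demographics AS d ON d.id = p.demographic_id
        WHERE d.id = ?
        ORDER BY p.id
        """,
        (demographic_id,),
    )
    return rows.fetchall()
```

Because name, address, and EEO answers live in a single row, editing them once updates every role profile that points at it.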

### Executive recruiter sources (NEW source type)
- **D. Hilton Associates** scraper — `dhilton.com/jobs`. ✅ 15 CU exec
  positions ingested live: CIO/CTO/CFO/COO/SVP/VP roles across CUs from
  $182M to $4.3B asset sizes. Required a Brotli compression fix in
  `utils/http.py`: httpx doesn't decompress `br` responses unless a Brotli
  package is installed, so `br` was dropped from the Accept-Encoding header
- **DDJ Myers** scraper — `ddjmyers.com/open-positions`. ⚠ Built but
  returns 403 to httpx (suspected Cloudflare bot fingerprinting; curl
  works). Deferred to a Playwright-based version next session.
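
The Brotli fix noted above simply removed `br` from the header. A slightly more general variant, sketched below as an assumption rather than what `utils/http.py` actually does, advertises `br` only when a decoder is importable. The function name `accept_encoding` and the `DEFAULT_HEADERS` dict are hypothetical.

```python
import importlib.util


def accept_encoding() -> str:
    """Build an Accept-Encoding value that only lists encodings we can decode."""
    encodings = ["gzip", "deflate"]
    # httpx can decode Brotli only when the `brotli` or `brotlicffi` package is installed.
    if any(importlib.util.find_spec(name) for name in ("brotli", "brotlicffi")):
        encodings.append("br")
    return ", ".join(encodings)


# Hypothetical default headers for the shared HTTP client.
DEFAULT_HEADERS = {"Accept-Encoding": accept_encoding()}
```

A dict like this can be passed as the `headers=` argument when constructing an `httpx.Client`, so a server never gets the chance to send a body the client cannot read.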

### Recruiter email applier
- New file: `appliers/recruiter_email.py`
- Detects D. Hilton + DDJ Myers URLs
- Auto-fetches contact email from job detail page
- Composes tailored email body with:
  - Subject: "Application: {role} at {company} — Michael J. Holleran II"
  - 4-bullet career summary leaning into KEMBA Financial CU experience
  - Resume PDF attached
  - Prominent link to interactive resume at
    https://genoa-entwuerfe.com/resume/
  - Full signature
- Same dry_run → confirm_token → submit safety rail as ATS appliers
- Sent via the existing `/usr/sbin/sendmail` with envelope sender
  `jobs@genoa-entwuerfe.com` (DKIM/SPF aligned)
- Logs as standard `Application` row, flips job status to `applied` on
  successful send
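
The compose, review, and send flow above might look something like this with the standard library. It is a sketch under stated assumptions, not the contents of `appliers/recruiter_email.py`: the function names, the way the confirm token is derived (a hash of recipient, subject, and body), and the attachment filename are all invented for illustration. The subject template, envelope sender, resume URL, and sendmail path come from the notes.

```python
import hashlib
import subprocess
from email.message import EmailMessage

ENVELOPE_SENDER = "jobs@genoa-entwuerfe.com"
RESUME_URL = "https://genoa-entwuerfe.com/resume/"
CANDIDATE = "Michael J. Holleran II"


def compose(role, company, to_addr, bullets, resume_pdf: bytes) -> EmailMessage:
    """Build the application email: subject, bullet summary, resume link, PDF."""
    msg = EmailMessage()
    msg["From"] = ENVELOPE_SENDER
    msg["To"] = to_addr
    # \u2014 is the dash used in the subject template from the notes.
    msg["Subject"] = f"Application: {role} at {company} \u2014 {CANDIDATE}"
    lines = [f"- {bullet}" for bullet in bullets]
    lines += ["", f"Interactive resume: {RESUME_URL}", "", CANDIDATE]
    msg.set_content("\n".join(lines))
    msg.add_attachment(
        resume_pdf, maintype="application", subtype="pdf", filename="resume.pdf"
    )
    return msg


def confirm_token(msg: EmailMessage) -> str:
    """Derive a short token from what the reviewer saw (hypothetical scheme)."""
    body = msg.get_body(preferencelist=("plain",)).get_content()
    digest = hashlib.sha256()
    for part in (msg["To"], msg["Subject"], body):
        digest.update(str(part).encode("utf-8"))
    return digest.hexdigest()[:16]


def dry_run(msg: EmailMessage):
    """Return the draft and the token that must be echoed back to send it."""
    return msg, confirm_token(msg)


def submit(msg: EmailMessage, token: str, sendmail: str = "/usr/sbin/sendmail") -> None:
    """Send only if the token matches the draft that was reviewed."""
    if token != confirm_token(msg):
        raise ValueError("confirm_token does not match the reviewed draft")
    # -t: take recipients from headers; -i: don't treat a lone '.' as end of
    # input; -f: set the envelope sender.
    subprocess.run(
        [sendmail, "-t", "-i", "-f", ENVELOPE_SENDER],
        input=msg.as_bytes(),
        check=True,
    )
```

Tying the token to the message content means an edit between review and send invalidates the confirmation, which is the point of the safety rail.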

## Saved answers (from Michael's checklist)

Updated `application_answers` JSON on default profile:
- `notable_achievement`: Signa Sports United US BI Lead bullet
- `why_this_role`: "Given my background in BI leadership across financial
  services and global multinationals I'm looking for a role where I can
  own the process end to end."
- `why_leaving_current_role`: "Can discuss at later time."
- Education: structured + flat + degree aliases for dropdown variants:
  - MBA Finance Specialization · Ashland University · 2007-2008
  - BS Business Administration · UCF · 1996-2000
- `salary_expectation_min`: 160,000 (bumped from 140k)
- `salary_expectation_target`: 180,000 (bumped from 175k)
- `travel_willingness`: "Up to 50%" (bumped from 25%)
- `references_statement`: "Available Upon Request"
- `how_heard`: "Company Website"
- `eeo_*`: Male/White/No/Yes
- `hybrid_onsite_ok`: true
- `portfolio_url`: https://genoa-entwuerfe.com/resume/

## Public assets live
- Interactive resume HTML (Michael's tailored Career-format) served at
  https://genoa-entwuerfe.com/resume/ (currently the simplified version —
  Michael can SCP the full interactive HTML over to replace)

## Counts
- 11 scrapers registered (added dhilton + ddjmyers)
- 7 appliers (added recruiter_email)
- **46 MCP tools** total
- 1 profile with 1 demographics row
- 270 jobs in DB (255 + 15 fresh CU exec roles from dhilton)

## Phase 13 addendum — Playwright fallback infrastructure
- **`AbstractScraper.fetch_html(url)`** — tries httpx first, falls back to
  Playwright on 403/429/503 or an empty body. Available to any future scraper.
- **`utils/playwright_pool.fetch_html()`** — Playwright fetch helper with
  realistic UA + stealth init script (already masks `navigator.webdriver`)
- **DDJ Myers** tested with the new fallback. Result: still 403. It turned out
  to be a hard IP blacklist (datacenter VPS IP banned outright), not a JS
  challenge. Playwright can't fix this without a residential proxy.
- DDJ Myers disabled. Code stays.
- **Future bot-protected sites** with soft detection (Cloudflare JS
  challenges, UA fingerprinting) should pass through the fallback
  automatically with zero per-source code.
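
The fallback rule can be sketched independently of httpx and Playwright by injecting the two fetchers as callables. This is an illustrative outline, not `AbstractScraper.fetch_html` itself: the `Fetcher` type, the `FetchBlocked` exception, and `needs_fallback` are names assumed for the example. Only the trigger conditions (403/429/503 or an empty body) come from the notes.

```python
from typing import Callable, Tuple

# A fetcher takes a URL and returns (status_code, body). In the real code
# these would wrap httpx and the Playwright pool respectively.
Fetcher = Callable[[str], Tuple[int, str]]

FALLBACK_STATUSES = frozenset({403, 429, 503})


class FetchBlocked(RuntimeError):
    """Raised when both the plain HTTP fetch and the browser fetch are refused."""


def needs_fallback(status: int, body: str) -> bool:
    """True when the response looks blocked: a refusal status or an empty body."""
    return status in FALLBACK_STATUSES or not body.strip()


def fetch_html(url: str, http_fetch: Fetcher, browser_fetch: Fetcher) -> str:
    """Try the cheap HTTP fetch first; fall back to the browser only if blocked."""
    status, body = http_fetch(url)
    if not needs_fallback(status, body):
        return body
    status, body = browser_fetch(url)
    if needs_fallback(status, body):
        # Hard blocks such as an IP blacklist land here; a browser cannot help.
        raise FetchBlocked(f"{url} still blocked after browser fallback (status {status})")
    return body
```

Raising a distinct exception for the still-blocked case lets the scraper registry disable a source (as was done for DDJ Myers) rather than retrying it on every run.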

## Open / next
- **More CU/exec recruiters** — Smith & Wilkinson, JM Search, O'Rourke,
  CUInsight, Mitchell Stankovic, Humanidei. Each ~30 min once we have
  a working Playwright pattern
- **Replace simplified HTML resume** with Michael's full interactive
  version (SCP from his machine when convenient)
- **Live test the recruiter-email applier** — pick a D. Hilton role, run
  `dry_run_apply`, review the composed email, then `submit_apply` to
  actually send. First real submission of the project
- **Start populating question_bank** — first dry-run against a Workday
  job at Nationwide would teach the bank ~30 questions in one shot

## Honest note
Phase 12 was a long session and we built a lot of infrastructure tonight:
question bank + demographics + 2 recruiter scrapers + recruiter email
applier. Most of it hasn't been live-tested yet beyond the dhilton scrape.
Tomorrow's morning digest will pull jobs from dhilton; that's the next
real-world signal.
