Designing a Local-First Resume Parser: Architecture Decisions for Edge AI
The Problem I Want to Solve
Recruiters pay $0.05-$0.20 per document to cloud parsing services like DaXtra, RChilli, and Affinda. Beyond cost, there’s a privacy issue: candidate data leaves the device and travels to vendor servers.
I want to explore whether browser-based AI can eliminate both problems.
Project Status: Planning Complete, Build Starting
I’ve completed the architecture planning phase:
- PRD defining MVP scope
- 6 Architecture Decision Records (ADRs)
- 89-task implementation plan across 8 weeks
No production code yet. This post documents the planned architecture, not a working system.
Planned Technical Stack
| Layer | Technology | Rationale |
|---|---|---|
| UI Framework | Leptos 0.6 (Rust) | SSR + hydration, single-language codebase |
| Inference | transformers.js + Phi-3-mini | WebGPU acceleration, browser-native |
| PDF Extraction | pdf.js | Mozilla-maintained, client-side |
| Storage | IndexedDB + SubtleCrypto | AES-256-GCM encryption at rest (sketch below) |
| Extension | Manifest V3 | Chrome Web Store requirement |
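For the storage row, here’s a minimal sketch of the planned encryption-at-rest path using SubtleCrypto. It assumes one non-extractable AES-GCM key per profile; key persistence and derivation are still open questions, and `generateKey`/`encryptRecord` are illustrative names, not decided APIs.

```typescript
// Sketch only: one AES-GCM key per profile; key storage/derivation TBD.
async function generateKey(): Promise<CryptoKey> {
  return crypto.subtle.generateKey(
    { name: 'AES-GCM', length: 256 },
    false, // non-extractable: the raw key never leaves the crypto layer
    ['encrypt', 'decrypt'],
  );
}

async function encryptRecord(key: CryptoKey, plaintext: string) {
  const iv = crypto.getRandomValues(new Uint8Array(12)); // fresh 96-bit nonce per record
  const ciphertext = await crypto.subtle.encrypt(
    { name: 'AES-GCM', iv },
    key,
    new TextEncoder().encode(plaintext),
  );
  return { iv, ciphertext }; // stored together as one IndexedDB record
}
```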
Key Architecture Decisions
ADR-001: Leptos Over React
Decision: Use Leptos with SSR + hydration for the extension UI.
Tradeoffs:
- Pro: Single Rust codebase (UI + inference bridge + storage)
- Pro: Direct WASM memory access, no serialization overhead
- Con: ~500KB bundle vs ~100KB React
- Con: Smaller ecosystem, fewer examples
I’m betting that the benefits of a single language outweigh the costs of a less mature ecosystem for this use case.
ADR-002: transformers.js + Phi-3-mini
Decision: Use transformers.js with Phi-3-mini q4 (~1.5GB quantized).
Why Phi-3:
- Optimized for instruction following
- 3.8B parameters, small enough for the quantized weights to sit in the browser cache
- WebGPU support in transformers.js
Risk: 1.5GB initial download. First-time UX will require patience.
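To make that first-run cost concrete, here’s how I expect the model load to look. This is a sketch, not verified code: the `onnx-community/Phi-3-mini-4k-instruct-onnx-web` model id and the exact option names are assumptions about transformers.js v3 that I still need to confirm.

```typescript
// Assumed API: @huggingface/transformers v3 with an ONNX web build of Phi-3-mini.
import { pipeline } from '@huggingface/transformers';

const generator = await pipeline(
  'text-generation',
  'onnx-community/Phi-3-mini-4k-instruct-onnx-web', // assumed model id
  {
    device: 'webgpu', // or 'wasm', per the ADR-006 selection below
    dtype: 'q4',      // 4-bit quantized weights (the ~1.5GB download)
    // Surface download progress so the first-run wait is visible, not a frozen UI
    progress_callback: (p: any) => console.log(p.status, p.progress ?? ''),
  },
);
```

Inference would then be something like `await generator(prompt, { max_new_tokens: 512 })`; prompt design for field extraction is still open.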
ADR-006: WebGPU Primary, WASM Fallback
Detection logic:
```typescript
// Prefer WebGPU when an adapter is actually available; otherwise fall back to WASM.
async function selectBackend(): Promise<'webgpu' | 'wasm'> {
  // navigator.gpu is undefined in browsers without WebGPU support
  if (navigator.gpu) {
    const adapter = await navigator.gpu.requestAdapter();
    if (adapter) return 'webgpu'; // adapter can be null even when the API exists
  }
  return 'wasm';
}
```
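If this works as planned, the return value maps directly onto the `device` option in the ADR-002 load sketch above (transformers.js accepts both 'webgpu' and 'wasm' as device values, assuming v3).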
Expected performance:
- WebGPU: Target < 5s parse latency
- WASM SIMD: Target < 30s parse latency
These are targets, not measurements. Actual performance is unknown until implementation.
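Once there’s something to measure, the check itself is simple; `parseResume` below is a placeholder for the eventual extract → infer → validate path, not a real function yet.

```typescript
// Latency harness sketch; parseResume is hypothetical until the pipeline exists.
declare function parseResume(file: File): Promise<unknown>;

async function timeParse(file: File): Promise<number> {
  const start = performance.now();
  await parseResume(file);
  return performance.now() - start; // end-to-end milliseconds
}
```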
MVP Scope
What I’m planning to build first:
- PDF drag-drop upload
- Client-side text extraction (pdf.js sketch after this list)
- Local LLM inference with confidence scores
- Human-in-the-loop field correction
- Encrypted local storage
- JSON export
Not in MVP: LinkedIn overlays, ATS integration, team features.
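For the extraction bullet above, a minimal pdf.js sketch. The pdfjs-dist import and worker path are assumptions that depend on the bundler and package version; treat the details as provisional.

```typescript
// Client-side text extraction with pdf.js; worker file name varies by version.
import * as pdfjsLib from 'pdfjs-dist';

pdfjsLib.GlobalWorkerOptions.workerSrc = new URL(
  'pdfjs-dist/build/pdf.worker.min.mjs', // assumed path
  import.meta.url,
).toString();

async function extractText(file: File): Promise<string> {
  const data = await file.arrayBuffer();
  const pdf = await pdfjsLib.getDocument({ data }).promise;
  const pages: string[] = [];
  for (let i = 1; i <= pdf.numPages; i++) { // pdf.js pages are 1-indexed
    const page = await pdf.getPage(i);
    const content = await page.getTextContent();
    // Marked-content items carry no text; keep only items with a .str field
    pages.push(content.items.map((item: any) => item.str ?? '').join(' '));
  }
  return pages.join('\n');
}
```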
Open Questions
| Question | Status |
|---|---|
| Phi-3 vs Llama-3.2-1B accuracy | Need to benchmark |
| Safari fallback strategy | CPU-only, need to test UX |
| Actual parse latency | Unknown until built |
Next Steps
Week 1: Extension scaffold with Leptos popup rendering. I’ll document what actually works vs. what was planned.
Documenting the build at Intent Solutions. jeremy@intentsolutions.io