Technical Program Manager (TPM) — Red Team Ops Lead

mpathic
Full-time
Remote
United States
Manager

About mpathic

mpathic is building the future of empathetic, trustworthy AI. Grounded in behavioral science and human-centered design, our technology delivers AI systems that are safe, aligned, and emotionally intelligent. As enterprises race to adopt AI, we believe the companies that win will be those that build trust first.

About the Role

We’re building a high-throughput, high-quality AI Red Team to evaluate and strengthen advanced AI systems. As the TPM — Red Team Ops Lead, you’ll design and run the operational engine behind red teaming delivery: staffing and scheduling, workflows and SLAs, QA loops, tooling requirements, and customer-ready reporting.

This role is ideal for someone who thrives at the intersection of technical systems and human operations. You’ll partner closely with Red Team experts, QA, Engineering, and Customer/Delivery so that we can ship reliably at scale without sacrificing quality.


What You’ll Do

Own Red Team Delivery Operations (Core)

  • Run end-to-end delivery for red team engagements across multiple customers/models:
    • project kickoff → scoping → execution → QA → final reporting
  • Define milestones, timelines, and SLAs; track progress and manage risk.
  • Build repeatable operating rhythms: standups, weekly planning, retros, and incident response.

Build Scalable Workflows & Processes

  • Create scalable workflows for:
    • task intake and prioritization
    • expert assignment and capacity planning
    • review/approval gates and escalation paths
    • version tracking and auditability (model versions, prompt variants, test suites)
  • Standardize definitions of done and ensure deliverables are consistent and customer-ready.

Drive QA System Design (with QA Lead)

  • Partner with QA to implement:
    • sampling strategies and review tiers (peer review → QA → escalation)
    • defect taxonomy and severity calibration processes
    • gold set creation + drift monitoring
  • Track quality metrics and continuously improve signal-to-noise.

Tooling & Pipeline Ownership (Requirements + Adoption)

  • Translate operational needs into actionable product requirements for internal tooling:
    • work queues, review interfaces, audit trails
    • tagging taxonomies and attack libraries
    • reporting automation and reproducibility tooling
    • dashboards for throughput and quality
  • Coordinate across Engineering/Product to launch improvements and drive adoption.

Metrics & Performance Management

  • Own dashboards and reporting for delivery performance:
    • throughput per expert/day by task type
    • on-time delivery %, cycle time
    • QA defect rate and agreement rates
    • “actionability” metrics (e.g., customer acceptance or fix rate)
  • Identify bottlenecks and implement process improvements.

Customer Execution Support (as needed)

  • Partner with customer-facing teams to:
    • scope engagements and set expectations
    • ensure deliverables map to customer priorities
    • present results and operational summaries
  • Translate ambiguous customer needs into clear execution plans.


What Success Looks Like (First 60–90 Days)

  • A reliable red team delivery system exists, one that does not depend on heroics:
    • clear SLAs, staffing model, escalation rules, and reporting templates
  • Quality is measurable and improving:
    • consistent severity calibration
    • reduced disagreement and rework
  • The tooling roadmap is defined and at least 1–2 workflow improvements have shipped.
  • The team can run multiple concurrent engagements without breaking.


Required Qualifications

  • 5+ years of experience in technical program management, operations program management, or delivery operations roles.
  • Proven ability to run complex programs involving both technical systems and human execution (e.g., evals, annotation, trust & safety ops, QA, vendor ops, or platform ops).
  • Strong process design skills: you can build workflows, SOPs, and operating cadences that scale.
  • Comfortable partnering with engineering teams on tooling requirements and delivery.
  • Excellent written communication: clear specs, crisp status updates, and structured reporting.


Preferred Qualifications

  • Experience with AI evaluation, labeling/annotation ops, trust & safety operations, or security testing programs.
  • Experience building QA systems (review flows, sampling, gold sets, calibration, inter-rater reliability (IRR) or proxy metrics).
  • Familiarity with LLM red teaming concepts (jailbreaks, prompt injection, policy probing).
  • Strong analytics skills (SQL/Looker/Sheets) and ability to define practical metrics.
  • Experience working with enterprise customers and delivery constraints.

Core Competencies

  • Operational excellence + calm execution under pressure
  • Systems thinking: scalable processes, not fragile heroics
  • Metric-driven decision making
  • Cross-functional leadership (experts, QA, Eng, delivery)
  • High standards for quality, clarity, and auditability

Team Interfaces

You’ll collaborate with:

  • Red Team Experts (execution, findings quality, capacity needs)
  • QA / Evaluation Lead (review system, severity calibration, drift monitoring)
  • Engineering / Product (tooling requirements, automation, pipelines)
  • Trainer / Enablement (ramp, certification, calibration routines)
  • Customer Delivery / Solutions, as needed (scoping, results, renewals)