Red Team Reviewer

mpathic

2 hours ago

Full-time

Remote

United States

Analyst

About mpathic.ai

mpathic is keeping humans safe in the AI era through automated tools and expert datasets that are rooted in psychology and powered by clinicians.

We are a Series A startup backed by Tier 1 investors including Foundry.vc and Next Frontier Capital.

Position Overview

mpathic is looking for a full-time Red Team Reviewer, ideally a candidate with a strong background in LLM Red Teaming, to join our team. The role centers on a confidential initiative focused on AI safety protocols and mental-health policy implementation for large language models (LLMs). You will help design, perform, and review realistic conversational scenarios, red-team model behavior, identify behavioral edge cases, and ensure appropriate recognition of distress or risk in AI-driven interactions. You may also help develop novel psychometrics, rubrics, behavioral taxonomies, evaluation criteria, and qualitative analyses. A strong commitment to safety, clinical ethics, and confidentiality is essential.

This position is open to candidates without technical degrees or licensure who demonstrate commensurate experience working with LLMs and Red Teaming. This role will report to the Red Team Manager.

This role involves roleplaying and reviewing clinical scenarios with AI agents. As such, we are ideally seeking candidates who bring creative or performance-driven strengths, as these competencies enhance the realism, nuance, and emotional depth needed for AI safety testing. Examples of these can include, but are not limited to:

Theatre degrees or studies
Acting, theatre, improv, or voice-over experience

Strong writing skills, especially dialogue or scenario writing

Experience creating or inhabiting characters (e.g., performers, TTRPG roleplay, narrative designers)

Conversational design, interaction writing, or scripted roleplay experience

Participation in gaming, interactive storytelling, or digital communities where roleplay is common

Successful candidates are proactive, reliable, collaborative, and skilled at balancing independent problem-solving with appropriate escalation. They are comfortable navigating ambiguity and building durable systems for onboarding, training, and shared learning as the team continues to grow. Consistency and communication are key at mpathic.

Key Responsibilities:

Review, design, and roleplay chat experiences with AI agents across diverse clinical and emotional scenarios
Provide feedback on roleplays on the grounds of characterization, realism, and AI model boundary testing
Assess AI model responses for potential risk/safety violations
Help clinicians implement feedback to improve quality of roleplay scenarios

Perform or simulate characters across ages, backgrounds, severity levels, and emotional states (spoken or written)

Collaborate with clinicians to provide a holistic review of AI chat experiences

Conduct qualitative analyses of conversations to derive taxonomies, personas, and behavioral patterns

Translate red team expertise into structured prompt patterns and evaluation rubrics
Maintain proactive, timely communication with the team, including over-communicating when appropriate and demonstrating flexibility in availability and hours based on project needs

Collaborate with engineering and research teams to define evaluation metrics for tone, realism, AI model behavior, and appropriateness

Identify and document failure cases, risk signals, and edge behaviors

Contribute to scenario modeling, red teaming, and rapid experimentation cycles

Ensure all work adheres to strict confidentiality agreements and NDAs

Implement quality-assurance protocols for conversation and behavioral analysis

Participate in review sessions with engineers, researchers, and clinical consultants, in addition to holding office hours for onboarding and/or continued training of red teamers

Basic Qualifications:

Knowledge of LLM Red Teaming and risk/safety assessment

Demonstrated experience in creative writing, theatre, improv, acting, voice acting, or character-driven roleplay (optional, but preferred)

Interest in NLP, AI, ML, safety evaluation, or speech-signal processing

Strong understanding of mental-health ethics, boundaries, and responsible handling of sensitive data

Ability to telecommute and use Slack, LLM tools (trainable), Google Workspace apps, and other remote-first productivity tools

Comfort with ambiguity, iteration, and emerging technology

Ability to give, take, and integrate constructive feedback

What you’ll accomplish in the first 3 months…

Build fluency in mpathic’s red teaming workflows, safety protocols, confidentiality expectations, and evaluation standards.
Review, design, and roleplay chat experiences with AI agents across a range of emotional, clinical, and risk-sensitive scenarios.
Assess AI model responses for potential safety violations, boundary concerns, missed risk signals, and other failure cases.
Provide clear, actionable feedback to improve the realism, quality, and depth of roleplay scenarios.
Collaborate with clinicians and project leads to understand scenario goals, escalation pathways, and expectations for sensitive content.
Document edge behaviors, model failure patterns, and recurring themes in a structured and consistent way.
Participate in review sessions with engineers, researchers, clinical consultants, and other red teamers.

What you’ll accomplish in the first 6 months…

Contribute to durable systems for onboarding, training, shared learning, and quality assurance across the red team.
Help refine rubrics, evaluation criteria, prompt patterns, and behavioral taxonomies based on observed model behavior.
Conduct qualitative analyses of conversations to identify personas, risk patterns, conversational dynamics, and safety-relevant model behaviors.
Collaborate with engineering and research teams to define evaluation metrics for tone, realism, appropriateness, and AI model behavior.
Support rapid experimentation cycles by identifying failure cases, testing edge scenarios, and translating findings into structured recommendations.
Help clinicians and red teamers improve scenario quality through feedback, office hours, training support, and collaborative review.
Strengthen team communication practices by maintaining proactive updates, surfacing ambiguity early, and over-communicating when appropriate.

Above and Beyond:

Deep experience with high-velocity online communities (e.g., Discord, Reddit, gaming spaces) and narrative roleplay environments that mirror real user interaction patterns.

Background in trust & safety, content moderation, or policy development

Experience with AI/ML in clinical or healthcare settings

Experience with data classification, annotation, or qualitative analysis projects

About the Team

You’ll work with the Red Team Manager, clinicians, researchers, engineers, clinical consultants, and other red teamers to evaluate high-impact AI systems and improve how models respond in sensitive human interactions.

We’re a mission-driven, collaborative team that values precision, clinical judgment, confidentiality, and responsible innovation. We move quickly, communicate clearly, and believe that rigorous human evaluation is essential to keeping people safe as AI becomes more deeply embedded in everyday life.

Additional Requirements:

Must be willing to sign a comprehensive NDA, confidentiality agreements, and any other agreements that may be required by the end customer

Comfortable working with sensitive mental health content and in an area of high impact for billions of end-users

Must be available for recurring team meetings and project coordination calls

To Apply

Please submit your resume along with a brief cover letter describing your relevant experience, skills, and interests.

Applications must be submitted through mpathic’s official recruiting portal. Only applications submitted directly by the applicant through our portal will be considered.
