
CBRN - Adversarial Prompt Expert

Reinforce Labs, Inc.
Part-time
Remote
United States
$40 - $60 USD hourly
Specialist

Job Description

We are seeking adversarial prompt experts with specialized expertise in Chemical, Biological, Radiological, or Nuclear (CBRN) sciences to join our Trust & Safety team. In this role, you will serve as a critical line of defense in ensuring AI models cannot be exploited to produce actionable guidance for high-harm activities. Through rigorous adversarial evaluation, you will systematically probe model behavior to identify and document vulnerabilities related to CBRN content, directly contributing to the safety and integrity of our products.

Work Details

  • Dual-Use Probing: Craft and execute sophisticated prompts to test whether the model can be "tricked" into providing actionable instructions for the synthesis or weaponization of harmful agents.

  • Tacit Knowledge Extraction: Determine whether the model reveals "bottleneck" information—the specific laboratory details that are not public but are essential for high-risk procedures.

  • Regulatory Stress-Testing: Evaluate model responses against international non-proliferation standards (e.g., Australia Group, BWC) and chemical/biological safety guidelines.

  • Reporting: Translate scientific "hits" into clear safety reports that help the alignment team refine the model's "hard refusals."

Candidate Profile

  • Subject Matter Expert: You hold deep expertise in your scientific domain and understand the difference between textbook theory and actionable weaponization protocols.

  • Analytically Precise: You can distinguish between a harmless scientific explanation and a dangerous technical instruction.

  • Ethically Grounded: You understand the gravity of CBRN risks and are committed to building AI that resists misuse in the physical world.

Qualifications

  • Education: Master’s degree or PhD in Chemistry, Biology, Physics, Biochemistry, Nuclear Engineering, or a closely related scientific field.

  • Scientific Literacy: Deep understanding of chemical precursors, pathogen cultivation, or the logic of nuclear enrichment processes.

  • Adversarial Thinking: Ability to use "social engineering" and "scientific masking" (framing a dangerous request as a legitimate research query) to probe and bypass model guardrails.

  • Communication: Ability to explain why a specific output is dangerous to non-scientist engineering teams.