Red Team Harmful Manipulation Evaluation AI Trainer, $100–$120/hour

LinkedIn
3 hours ago
Contract
Remote
United States
$100–$120 USD hourly
Trainer

Project Overview:

Join a growing community of professionals advancing the next wave of AI. As an AI Trainer, you’ll play a hands-on role by analyzing and providing feedback on data to improve LLM performance, helping ensure that the next generation of AI technology is accurate and trustworthy.


We are seeking a skilled Behavioral Science, Trust & Safety, or Human-Computer Interaction expert to work as a project consultant in our AI Labor Marketplace. This is not a full-time employment position — you will be engaged as an expert project consultant on a contract basis.


Location: U.S.-based experts only

Engagement: Part-time, project-based expert evaluation work

Work Type: Remote


Project Summary:

Contributors will design adversarial prompts targeting harmful manipulation scenarios, evaluate model responses, and apply structured annotations to assess risk. The work combines behavioral insight, analytical judgment, and structured evaluation, along with peer review responsibilities to support quality and consistency.


Consultant Engagement Terms:

This is a project-based consultant role. Consultants are paid on a per-project basis; the hourly rates listed are estimates based on anticipated completion time. Consultants control their own schedules, provide their own tools, and may simultaneously provide services to other vendors or employers (subject to those parties' policies).


Responsibilities:

  • Design realistic adversarial prompts reflecting manipulation and influence risks
  • Execute prompts against AI systems and capture outputs
  • Apply structured annotation rubrics to evaluate model behavior
  • Provide clear written justifications for evaluations
  • Review peer submissions for quality and consistency
  • Identify edge cases and nuanced failure modes
  • Incorporate feedback and maintain calibration over time


Expected Outcomes:

  • High-quality adversarial prompt sets
  • Consistent, well-reasoned annotations aligned with rubric standards
  • Constructive peer review feedback
  • Reliable contribution to overall dataset quality and evaluation goals



Qualifications:

  • Background in behavioral science, social psychology, trust & safety, HCI, disinformation research, or related field
  • 3–10+ years of relevant professional or research experience
  • Strong analytical writing and sound decision-making under ambiguity
  • Experience with AI evaluation, red teaming, or content policy preferred
  • Ability to apply structured guidelines consistently across tasks