
Founding Abuse Engineer

Arena Intelligence, Inc.
Full-time
On-site
California, United States
Engineer

About Arena Intelligence

Arena Intelligence is the open platform for evaluating how AI models perform in the real world. Created by researchers from UC Berkeley’s SkyLab, our mission is to measure and advance the frontier of AI for real-world use.

Millions of people use Arena Intelligence each month to explore how frontier systems perform — and we use our community’s feedback to build transparent, rigorous, and human-centered model evaluations. Leading enterprises and AI labs rely on our evaluations to understand real-world reliability, alignment, and impact. Our leaderboards are the gold standard for measuring AI performance — trusted by leaders across the AI community and shaping the global conversation on model reliability and progress.

We’re a team of researchers, engineers, academics, and builders from places like UC Berkeley, Google, Stanford, DeepMind, and Discord. We seek truth, move fast, and value craftsmanship, curiosity, and impact over hierarchy. We’re building a company where thoughtful, curious people from all backgrounds can do their best work. Everyone on our team is a deep expert in their field — our office radiates excellence, energy, and focus.

About the Role

Arena Intelligence is seeking a Founding Abuse Engineer to own platform misuse end-to-end. Arena's evaluations are only as trustworthy as the signal behind them — and that signal is under constant, creative attack. You will build the detection, enforcement, and investigation systems that keep Arena's leaderboards trustworthy, stop automated abuse across our services, and defend against the full spectrum of AI-era harms.

This is a founding builder role. You will set the strategy, write the code, and build the platform that future abuse, integrity, and trust & safety hires grow on top of. You'll work shoulder-to-shoulder with product, infrastructure, model partners, policy, and leadership, and you'll be accountable for outcomes the whole company can see: is the leaderboard clean, are harmful uses caught, are our services safe to ship?

You’ll

  • Own the abuse vision for Arena: what gets detected, what gets enforced, how fast, and with what false-positive budget

  • Design and operate detection for bots, sybils, coordinated inauthentic voting, and rating-system manipulation — the integrity of Arena's leaderboards is the product

  • Build enforcement primitives (rate limits, challenges, shadowbans, account actions, model-side refusals) that are reversible, auditable, and humane

  • Detect and mitigate inference abuse and cost exploitation at the platform layer

  • Build jailbreak and multi-provider misuse detection across the models Arena serves, and partner with model-provider trust & safety teams on signal-sharing and escalation

  • Scope and implement abuse monitoring for every new product launch — web search, web fetch, live site deployment, and whatever's next — as part of the launch checklist, not after the fact

  • Prototype, then mature into production, systems for detection, review, and enforcement of the highest-severity harms (CSAM/NCII, violent extremism, self-harm), including the legal reporting pipeline (e.g., NCMEC)

  • Build internal investigator tooling so policy, on-call, and future T&S analysts can triage incidents without an engineering bottleneck

  • Partner with Security on shared surface area — account takeover, credential stuffing, API-key abuse, and the identity/behavioral-signal platform

  • Partner with policy, legal, and leadership on acceptable-use policy, enforcement escalations, and public-integrity narrative

You’ll have

  • 6+ years of production software engineering experience, including building and operating systems under adversarial conditions

  • Shipped experience in at least one of: trust & safety, anti-abuse, anti-fraud, anti-spam, integrity, or risk engineering

  • Strong SQL and data-analysis skills — this role is 30%+ pattern-finding and investigation, not just shipping code

  • Adversarial and investigative mindset — you can articulate a novel attack before designing the defense, and follow evidence when a novel harm surfaces

  • High judgment on false-positive cost, user harm, and the reversibility of enforcement actions

  • Proficiency in a modern backend language or runtime (TypeScript/Node.js, Python, or Go)

  • Excellent communication — you'll build alignment with engineering, product, policy, and leadership routinely

Bonus Experience

  • Experience with LLM-specific adversarial inputs — jailbreaks, direct and indirect prompt injection, tool-use abuse

  • Experience with agent safety, browser-automation abuse, or LLM-API abuse

  • Background in securing voting, rating, reputation, or marketplace platforms against coordinated manipulation

  • ML or ML-systems experience — feature engineering, online/offline evaluation, label acquisition, drift handling

  • Experience building investigator or analyst tooling used by non-engineers

  • Contributions to open-source trust & safety, abuse-detection, or adversarial-ML work

  • Background in gaming integrity, ad-fraud, or financial-crime engineering at scale

What we offer

  • We offer competitive compensation and equity aligned to the markets where our team members are based. The base salary range will depend on the candidate’s permanent work location

  • Comprehensive health and wellness benefits, including medical, dental, vision, and additional support programs

  • The opportunity to work on cutting-edge AI with a small, mission-driven team

  • A culture that values transparency, trust, and community impact

Come help build the space where anyone can explore and help shape the future of AI.

Arena Intelligence provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, sex, national origin, age, disability, genetics, sexual orientation, gender identity, or gender expression. We are committed to a diverse and inclusive workforce and welcome people from all backgrounds, experiences, perspectives, and abilities.