AI Safety Specialist (AI Engineering)

San Francisco, On-site, Full-time

AI Summary

AI Safety Specialist focusing on adversarial testing, guardrails, and RLHF alignment to ensure safe deployment of language models and autonomous tools.

About this role

We are looking for an AI Safety Specialist to strengthen the security and robustness of language models. You will help ensure AI systems are deployed safely by conducting adversarial testing, implementing protective measures, and aligning AI behavior with ethical principles.

Responsibilities:

Conduct adversarial testing on LLMs and multimodal agents.
Implement guardrails and real-time filtering for autonomous tool use.
Develop constitutional AI principles and assist with RLHF alignment pipelines.

Qualifications:

Background in cybersecurity, prompt engineering, or adversarial ML.
Experience with jailbreak taxonomies and automated red-teaming frameworks.
Strong analytical mindset for identifying edge cases.

Skills

Adversarial ML, Automated Red Teaming, Autonomous Tool Usage, Constitutional AI, Cybersecurity, Edge Case Analysis, Guardrails, Jailbreak Taxonomies, LLMs, Multimodal Agents, Prompt Engineering, Real-time Filtering, RLHF Alignment Pipelines
