Posted 2 months ago
AI Safety Specialist (AI Engineering)
San Francisco | On-site | Full-time
AI Summary
AI Safety Specialist focusing on adversarial testing, guardrails, and RLHF alignment to ensure safe deployment of language models and autonomous tools.
About this role
We are looking for an AI Safety Specialist to strengthen the security and robustness of language models. You will help ensure AI systems are deployed safely by conducting adversarial testing, implementing protective measures, and aligning AI behavior with ethical principles.
Responsibilities:
- Conduct adversarial testing on LLMs and multimodal agents.
- Implement guardrails and real-time filtering for autonomous tool use.
- Develop constitutional AI principles and assist with RLHF alignment pipelines.
Qualifications:
- Background in cybersecurity, prompt engineering, or adversarial ML.
- Experience with jailbreak taxonomies and automated red-teaming frameworks.
- Strong analytical mindset for identifying edge cases.
Skills
Adversarial ML, Automated Red Teaming, Autonomous Tool Usage, Constitutional AI, Cybersecurity, Edge Case Analysis, Guardrails, Jailbreak Taxonomies, LLMs, Multimodal Agents, Prompt Engineering, Real-time Filtering, RLHF Alignment Pipelines