
Canvas Medical
Posted 12 months ago
Applied AI Software Engineer
San Francisco · Remote · Full-time
AI Summary
Leading evaluation of AI agents in production, designing and running large-scale tests to measure accuracy, safety, and reliability of LLM-based workflows in healthcare.
About this role
Canvas Medical is the electronic medical records (EMR) and payments development platform for healthcare. We build modern, elegant front- and back-end tooling to enable new ways for developers and clinicians to collaborate to solve healthcare’s toughest challenges. Canvas is institutionally backed by some of the greatest technology investors in the world, who have funded notable health tech companies such as GoodRx, Oscar Health, and Hims & Hers Health.
The Role
We’re hiring an Applied AI Software Engineer to lead evaluations for both agents in development and the post-deployment fleet of agents operating in Canvas to automate work for our customers. You will help develop agents in Canvas using state-of-the-art foundation model inference and fine-tuning APIs along with our server-side SDK. The server-side SDK provides extensive tools and virtually all the context necessary for excellent agent performance. You’ll be responsible for designing and running rigorous evaluation experiments that measure performance, safety, and reliability across a wide variety of clinical, operational, and financial use cases.
This role is ideal for someone with deep experience evaluating LLM-based agents at scale. You’ll create high-fidelity unit evals and end-to-end evaluations, define expert-determined ground truth outcomes, and manage iterations across model variants, prompts, tool use, and context window configurations. Your work will directly inform model selection, fine-tuning, and go/no-go decisions for AI features used in production settings.
You’ll collaborate with product, ML engineering, and clinical informatics teams to ensure that Canvas's AI agents are not only capable, but also trustworthy and robust under real-world healthcare constraints. You will also work with technical product marketers and developer advocates to help our developer community and the broader market understand the uniquely differentiated value of agents in Canvas.
Who You Are
What You’ll Do
What Success Looks Like at 90 Days
Qualifications
Skills
A/B Testing, Claude API, Database Management, Data Engineering, End-to-end Testing, Evaluation Harnesses, Experiment Tracking, Foundation Model APIs, Gemini API, Gold Standard Outcome Definitions, Ground Truth Annotation, LLM Evaluation, Model Evaluation Pipelines, Model Monitoring, Multi-agent Systems, OpenAI API, Post-deployment Governance, Prompt Engineering, Python, Reinforcement Learning From Human Feedback, Reproducibility, Retrieval Augmentation, RLHF, SQL, Tool Integration, Unit Testing