Posted 2 months ago

Senior AI Engineer

Tallinn, Remote, Full-time

AI Summary

Senior AI Engineer builds and ships AI systems powering the product, including recommendation engines, conversational agents, research pipelines, and knowledge-graph enrichment, in a high-autonomy environment.

About this role

About Us

Dragonfly is building the world's first Automated Solutions Architect. We help businesses navigate the complex landscape of modern tools (SaaS, AI, Infrastructure) by using an AI-powered platform that understands their unique context and recommends the optimal tech stack.

Our platform is powered by a proprietary knowledge graph of 230K+ products, 4M+ companies, and the relationships between them — built from 100+ external data sources, LLM-driven research, and human curation. Every user-facing interaction — search, recommendations, diagram building, research — runs through AI systems that we design, build, and operate in-house.

The Role

This is not a research role. This is a building role. You'll be writing the AI systems that power the product — recommendation engines, research pipelines, conversational agents, structured LLM orchestration — and shipping them to production.

We don't separate "AI" from "engineering." The AI systems live in the same monorepo as the product, follow the same engineering practices, and ship through the same CI pipeline. You'll be working alongside product engineers in Python and TypeScript — the same codebase, the same review process, the same standards. If a product change is needed to ship your work — a new API endpoint, a streaming interface, a UI prototype to expose an agent — you make it.

We've built the foundations: a recommendation engine with search, ranking, and requirement analysis; a conversational AI for interactive architecture diagrams; an autonomous research pipeline that enriches our knowledge graph; and structured output schemas for every AI interaction. We need someone to own these systems — improve what exists, build what's next, and push the boundaries of what's possible with applied AI.

We operate a high-autonomy, high-trust environment. You'll be given a problem and the space to solve it — not a task list. We expect you to think beyond the immediate task — consider cost, latency, reliability, and how your work fits into the broader product. Curiosity matters: you should want to understand how everything connects, not just the model you're tuning.

What You'll Do

1. Recommendation Engine

The recommendation pipeline is the core of the product. Given a user's context, it retrieves candidate products, ranks them, analyses them against requirements, and streams results back in real time. You'll own the intelligence behind this.

Evolve how we rank and score products — the models that decide what gets recommended and why
Improve retrieval quality — query generation, embedding strategies, hybrid search
Experiment with new approaches to scoring, filtering, and personalisation
Optimise for cost and latency — these pipelines run on every user interaction
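
For illustration only, the four stages described above (retrieve, rank, analyse, stream) can be sketched as an async generator in Python. This is a minimal sketch, not Dragonfly's actual implementation: the `Candidate` type, the in-memory `CATALOGUE`, the substring-based retrieval, and the stubbed analysis step are all hypothetical stand-ins for real retrieval and LLM calls.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class Candidate:
    name: str
    score: float  # hypothetical relevance score produced by retrieval


# Hypothetical in-memory catalogue standing in for a real retrieval backend.
CATALOGUE = [
    Candidate("crm-a", 0.62),
    Candidate("crm-b", 0.91),
    Candidate("helpdesk-c", 0.40),
]


async def retrieve(context: str) -> list[Candidate]:
    """Stage 1: fetch candidates (here, a naive substring filter)."""
    await asyncio.sleep(0)  # placeholder for search I/O
    return [c for c in CATALOGUE if context in c.name]


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Stage 2: order candidates by score, best first."""
    return sorted(candidates, key=lambda c: c.score, reverse=True)


async def analyse(candidate: Candidate, requirement: str) -> str:
    """Stage 3: check one candidate against a requirement (stubbed)."""
    await asyncio.sleep(0)  # placeholder for an LLM call
    return f"{candidate.name}: evaluated against '{requirement}'"


async def recommend(context: str, requirement: str) -> AsyncIterator[str]:
    """Stage 4: yield each analysed result as soon as it is ready."""
    for candidate in rank(await retrieve(context)):
        yield await analyse(candidate, requirement)


async def collect(context: str, requirement: str) -> list[str]:
    """Drain the stream into a list, as a simple consumer would."""
    return [line async for line in recommend(context, requirement)]


# The higher-scoring crm-b is streamed before crm-a; helpdesk-c is filtered out.
results = asyncio.run(collect("crm", "SSO support"))
```

Yielding per candidate, rather than returning one final list, is what lets a client render the first recommendation while later ones are still being analysed.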

2. Conversational AI & Agents

Architect is our interactive tool where users describe what they need in natural language and the platform researches, plans, and composes architecture diagrams in real time. Safety checks run concurrently. Results stream live.

Extend the conversational AI — new capabilities, better planning, richer context
Improve safety and guardrail systems
Build and ship new agent experiences as the product evolves
Design agent architectures that balance cost, quality, and latency
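
As a rough illustration of running a safety check concurrently with generation, the sketch below uses `asyncio.create_task` from the Python standard library. It is an assumption about the general pattern, not a description of how Architect works: `safety_check`, `generate_chunks`, the blocked-term rule, and the `[refused]` marker are hypothetical.

```python
import asyncio
from typing import AsyncIterator


async def safety_check(message: str) -> bool:
    """Hypothetical guardrail: reject messages containing a blocked term."""
    await asyncio.sleep(0)  # placeholder for a classifier or LLM call
    return "blocked" not in message.lower()


async def generate_chunks(message: str) -> AsyncIterator[str]:
    """Hypothetical generator standing in for a streaming model response."""
    for word in ("planning", "researching", "composing"):
        await asyncio.sleep(0)  # placeholder for model latency
        yield word


async def respond(message: str) -> list[str]:
    """Run the safety check alongside generation; drop output if it fails."""
    check = asyncio.create_task(safety_check(message))
    chunks: list[str] = []
    async for chunk in generate_chunks(message):
        # Stop early if the check has already finished and failed.
        if check.done() and not check.result():
            return ["[refused]"]
        chunks.append(chunk)
    # Generation may finish first, so always wait for the final verdict.
    if not await check:
        return ["[refused]"]
    return chunks


allowed = asyncio.run(respond("Draw a data pipeline"))
refused = asyncio.run(respond("A blocked request"))
```

The trade-off in this pattern is latency against wasted work: generation starts without waiting for the check, at the cost of occasionally discarding output that was already produced.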

3. Research Pipeline & Knowledge Graph

Our research pipeline autonomously discovers, enriches, and classifies entities at scale — products, companies, and the relationships between them. It's how the knowledge graph grows and stays current.

Own the research pipeline end-to-end — improve coverage, accuracy, and throughput
Improve research quality — better prompts, validation loops, automated quality checks
Extend to new domains, new data sources, and new entity types
Build human-in-the-loop systems where AI proposes and humans approve

4. Search, Retrieval & Embeddings

Every surface of the product depends on finding the right entities quickly. Search powers recommendations, diagram building, and the product catalogue.

Own and evolve our search and retrieval infrastructure — embedding models, hybrid search, ranking
Evaluate and integrate new embedding models as the field advances
Improve relevance across different search surfaces (recommendations, catalogue, conversational)
Build web scraping and data extraction to enrich entities with live information

5. Self-Serving Agents & Internal Tools

We build agents that automate our own workflows — from orchestrating development tickets in parallel to maintaining the knowledge graph via chat commands with mandatory human approval. You'll build more of these.

Identify automation opportunities across the business and prototype solutions
Build internal agents that save the team hours of manual work
Ship proof-of-concept UIs when needed to expose agent capabilities to non-technical stakeholders

6. AI-Native Workflow & Team Enablement

You are expected to keep up with the latest advancements in AI-powered development and proactively bring new tools, techniques, and workflows into the team. This isn't a nice-to-have — it's a core responsibility.

Use AI coding tools (Claude Code, Cursor, or similar) as your primary development environment
Trial new AI tools and techniques, and evangelise what works
Help the team adopt AI-native practices — pair with engineers, share workflows, raise the bar
Contribute to our AI-powered SDLC practices and tooling

Boundaries (Soft, Not Hard)

Your primary focus is the AI systems, but you're not siloed. If shipping your work means writing a FastAPI endpoint, building a streaming interface, prototyping a React component to demo an agent, or writing a SQL transformation to feed a new feature — you do it. The codebase is a monorepo for a reason.

You won't be the primary owner of the data platform, the frontend, or the infrastructure — but you'll touch all of them when the AI work requires it.

Tech Stack

LLM Frameworks: BAML (structured outputs), Google ADK (agent orchestration), LangChain
Models: Gemini, Claude, Cohere
Infrastructure: GCP, Vertex AI, BigQuery, Firestore, PostgreSQL + pgvector
Search: Hybrid search, embedding models, vector databases
Observability: OpenTelemetry, PostHog, Cloud Logging
Languages: Python (agents, pipelines), TypeScript (webapp, infrastructure)
Monorepo tooling: moonrepo, proto, pnpm, uv
AI tooling: Claude Code, Cursor, Gemini CLI, OpenClaw

Requirements

AI-native. This is the most important requirement. AI writes most of our code — Claude Code, not you, will be producing the Python, structured schemas, and pipeline logic. Your job is to direct it well, understand what it produces, and know when it's wrong. If you're not already using AI coding tools daily to ship real work, this isn't the right role.
LLM systems in production. You've built and shipped LLM-powered systems that real users depend on — not just prototypes or demos. You understand prompt engineering, structured outputs, cost management, latency budgets, and the difference between "it works in a notebook" and "it works at scale."
Applied AI pragmatism. You reach for the simplest approach that solves the problem. You know when a well-crafted prompt beats a fine-tuned model, when retrieval beats generation, and when a rule beats a model. You don't over-engineer.
Software engineering foundations. You write production-quality Python. You understand async patterns, streaming, API design, and testing. You can review and course-correct what AI generates. AI is your copilot, not your replacement — you need to know when something is wrong.
Search & retrieval intuition. You understand hybrid search, embedding models, ranking, and the trade-offs between precision and recall. Experience with recommender systems, information retrieval, or knowledge graphs is directly applicable.
Reach beyond your lane. Most of what you build will be AI systems, but sometimes shipping means writing a FastAPI endpoint, prototyping a UI, or debugging a BigQuery query. We want someone who reaches for whatever tool solves the problem, not someone who stops at the boundary of their job title.

Nice to Have

Familiarity with Google Cloud, particularly Vertex AI and the Agent Engine
Experience with knowledge graphs, entity resolution, or ontology systems
Background in search systems — hybrid retrieval, ranking metrics (nDCG, MRR), embedding evaluation
Experience with open-source models and self-hosting (vLLM, Ollama, or similar)
Experience building human-in-the-loop AI systems

Why Join Us

This is a ground-floor opportunity to shape not just the product, but the intelligence that powers it. You won’t be implementing academic models; you’ll be designing real systems that help real businesses make smarter, faster decisions. If you love applying AI pragmatically, moving fast, and building with purpose, this is your playground.

Let’s build something people love to use, and something that actually works.

What We Offer

The opportunity to define and shape the AI foundation of a high-potential startup from day one.
Creative freedom and a high-trust environment focused on outcomes over process.
Direct access to founders and an experienced, mission-driven team.
Competitive salary.
Hybrid work options.
An intellectually stimulating environment where speed, curiosity, and product delivery are celebrated.

We are an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.

Skills

AI-powered SDLC, APIs, CI/CD, Conversational AI, Data Pipelines, DGX?, Embeddings, FastAPI, GCP, Graph Databases, Knowledge Graphs, LangChain, LLM Orchestration, Python, React (prototyping), Retrieval-augmented Generation, Search And Ranking, SQL Transformations, Streaming Interfaces, TypeScript, Vertex AI