Jobless Developer
Simular

Posted 1 month ago

Open

Software Engineer, CUA Control

Singapore · Remote · Full-time

AI Summary

An engineer designs and builds the low-level control stack that translates an AI model’s intent into precise, reliable actions across macOS, Windows, and Linux, including input simulation, screen capture, and UI element detection.

About this role

Where multiple locations are listed for this role, the position may be based in any of those locations, with priority determined according to the order of listing.

We're looking for an engineer to work on the control layer: the system that translates an AI model's intent into precise, reliable actions on a real computer. That covers mouse movements, keyboard input, window management, UI element detection, and error recovery across macOS, Windows, and Linux.

What you'll do

  • Work on the low-level computer control stack: mouse/keyboard injection, screen capture, coordinate mapping, and input simulation

  • Implement UI element detection using accessibility APIs (AXUIElement, UI Automation), DOM/a11y trees, and visual grounding

  • Help build the abstraction layer that lets our agent operate across OS platforms and application types

  • Tackle reliability problems: element targeting under UI changes, window occlusion, resolution scaling, and cross-app focus management

  • Contribute to feedback loops: how does the agent know its action worked? How does it recover when something unexpected happens?

  • Work closely with the model and planning team on the interface between intent and execution
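
To make the feedback-loop question above concrete (did the action work, and what happens when it did not), here is a minimal Python sketch of an act-verify-recover loop. It is an illustration only, not Simular's implementation: `act_and_verify` and its `perform`, `verify`, and `recover` callables are hypothetical names introduced for this example.

```python
import time
from typing import Callable, Optional


def act_and_verify(
    perform: Callable[[], None],
    verify: Callable[[], bool],
    recover: Optional[Callable[[], None]] = None,
    attempts: int = 3,
    settle_seconds: float = 0.0,
) -> bool:
    """Run an action, check its observable effect, and retry if it failed.

    All callables are hypothetical stand-ins: `perform` might inject a click,
    `verify` might re-read an accessibility attribute, and `recover` might
    refocus the target window before the next attempt.
    """
    for attempt in range(attempts):
        perform()
        if settle_seconds:
            time.sleep(settle_seconds)  # give the UI time to update
        if verify():
            return True
        # Only attempt recovery when another try will follow.
        if recover is not None and attempt < attempts - 1:
            recover()
    return False


if __name__ == "__main__":
    # Simulated flaky control: the first click is dropped, the second lands.
    state = {"clicks": 0, "checked": False}

    def flaky_click() -> None:
        state["clicks"] += 1
        if state["clicks"] >= 2:
            state["checked"] = True

    ok = act_and_verify(flaky_click, lambda: state["checked"])
    print(ok, state["clicks"])
```

Separating `verify` from `perform` keeps the success check independent of the injection mechanism, so the same loop could wrap a mouse click, a keystroke, or a window-focus change.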

You might be a fit if

  • You've built OS-level input automation (CGEvent, SendInput, xdotool, or similar)

  • You understand accessibility frameworks: AXUIElement on macOS, UI Automation on Windows, and AT-SPI on Linux

  • You've dealt with flaky element selectors, timing issues, and resolution-dependent coordinates

  • You think carefully about reliability and edge cases

  • You've worked with tools like Playwright, Appium, PyAutoGUI, Hammerspoon, or similar
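
As one illustration of the resolution-dependent coordinates mentioned in the list above: a screenshot is often captured in physical pixels, while input-injection APIs may expect logical coordinates, so a point found in the image has to be rescaled before clicking. The Python sketch below assumes a single uniform scale factor per display and a known logical origin. The `Display` type and `pixel_to_logical` function are hypothetical names for this example, not part of any library named in this posting.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Display:
    """Hypothetical description of one display in a global desktop space."""
    origin_x: float  # logical x of the display's top-left corner
    origin_y: float  # logical y of the display's top-left corner
    scale: float     # physical pixels per logical unit (assumed uniform)


def pixel_to_logical(px: float, py: float, display: Display) -> Tuple[float, float]:
    """Map a screenshot pixel coordinate to a global logical coordinate."""
    if display.scale <= 0:
        raise ValueError("scale must be positive")
    return (
        display.origin_x + px / display.scale,
        display.origin_y + py / display.scale,
    )


if __name__ == "__main__":
    # Assumed setup: a 2x-scaled display placed to the right of a
    # 1440-unit-wide primary display.
    secondary = Display(origin_x=1440.0, origin_y=0.0, scale=2.0)
    print(pixel_to_logical(200, 100, secondary))
```

Real systems can be messier (per-monitor scaling, fractional scale factors, windows spanning displays), which is part of why this kind of mapping shows up as a reliability concern.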

Bonus

Experience with screen reader internals, remote desktop protocols (RDP/VNC), game automation, LLM agent tool-use systems, or mobile device automation (iOS UIAutomation / XCTest, Android UIAutomator / Accessibility).

Skills

Appium, AT-SPI, AXUIElement, CGEvent, Coordinate Mapping, Cross-platform Automation, DOM/a11y Trees, Hammerspoon, Input Simulation, Playwright, PyAutoGUI, Reliability Engineering, Screen Capture, Screen Reading, SendInput, UIAutomation, UI Element Detection, Visual Grounding, Window Management, Xdotool
