Browse 7 exciting Evals jobs hiring now. Companies hiring include Tetrix, Artificial Intelligence Underwriting Company, and Harvey, with openings in Cape Coral, Tampa, and New York.
Lead the design and delivery of production AI systems—agents, extraction engines, and evaluation pipelines—at a fast-growing startup transforming private market data into reliable, auditable insights.
Join an early-stage AI safety startup as a founding Forward Deployed Engineer to design rigorous AI evals, lead customer implementations, and shape product strategy for certification of real-world AI agents.
Harvey is hiring engineers to build and optimize agent systems that automate complex legal workflows using LLMs, custom tools, and evaluation-driven iteration.
Instrument is hiring a Senior AI Engineer to design and implement the core multi-agent intelligence, context management, and evals infrastructure for a large-scale, stateful generative-AI simulation project.
Lead the Agent engineering team at Descript to deliver a best-in-class, scalable agentic video editing experience by driving technical execution, product-driven experimentation, and team growth.
At Variance, you will design and implement domain-specific benchmarks and evaluation systems that reveal failure modes and drive improvements in ML and agent behavior for fraud, identity, and risk workflows.
Build production-grade agents, developer tooling, and public-facing demos at StackOne to improve developer experience and demonstrate secure, scalable agentic integrations.