AI Evaluation Jobs

Browse 56 jobs hiring in AI Evaluation now. Check out companies hiring such as Cover Whale, LanguageWire, and EQL Tech in Greensboro, Huntsville, and Ontario.

Cover Whale Hybrid No location specified
Posted 20 hours ago

Lead and build the agentic AI platform that enables pods of engineers and AI agents to safely and reliably deliver production software at scale.

LanguageWire Hybrid No location specified
Posted yesterday

LanguageWire is hiring an AI Engineer to design and productionize LLM-based translation workflows and bridge ML experimentation with production engineering.

EQL Tech Hybrid No location specified
Posted 2 days ago

Work on a mission-driven fintech team to build and ship core AI products (LLM/VLM and evaluation pipelines) that power eligibility and compliance for education savings accounts.

Mercor Hybrid No location specified
Posted 2 days ago

Lead and grow an Applied AI engineering team at Mercor to build scalable evaluation and data systems that measurably improve frontier model performance.

Lead the product vision and engineering for clinician-facing AI tools at knownwell, building and operating RAG-based clinical decision support with full product ownership and direct clinician partnership.

Brillio Hybrid New York, New York, United States
Posted 4 days ago

Experienced technical product leader needed to own prioritization, quality, and stakeholder alignment for LLM-driven products while staying hands-on with architecture, code reviews, and AI cost optimization.

Help build and deploy production AI agent platforms that power personalized financial advisory workflows for institutional clients at Arta.

Salesforce Hybrid California - San Francisco
Posted 6 days ago
Inclusive & Diverse
Rise from Within
Mission Driven
Diversity of Opinions
Work/Life Harmony
Feedback Forward
Take Risks
Collaboration over Competition
Medical Insurance
Dental Insurance
Vision Insurance
Paid Time-Off
Maternity Leave
Paternity Leave
Mental Health Resources
Life insurance
Disability Insurance
Health Savings Account (HSA)
Flexible Spending Account (FSA)
Employee Resource Groups

Lead Slack's search and AI platform as VP Product to set strategy, drive model and infrastructure decisions, and deliver reliable, scalable AI-powered search and knowledge services for enterprise users.

Posted 6 days ago

NiCE is hiring a Forward Deployed Engineer to design, ship, and operate production-scale conversational AI agents that solve high-impact enterprise problems.

Experienced domain experts in Business Operations & Communications or Education and Academic Research are needed for a remote, retainer-based 2‑week role evaluating and crafting prompts for AI writing models with US-contextual standards.

Join an early-stage AI safety startup as a founding Forward Deployed Engineer to design rigorous AI evals, lead customer implementations, and shape product strategy for certification of real-world AI agents.

Posted 7 days ago

Epoch AI is hiring remote Researchers and Senior Researchers to conduct data-driven investigations, build benchmarks, and forecast AI capabilities and trends.

Visa is hiring a Product Analyst to define and scale generative AI platform capabilities, combining product analytics, prototyping, and cross-functional collaboration to deliver responsible, enterprise-grade AI solutions.

Posted 7 days ago

Colibri Group is hiring an AI Engineering Intern to help design and evaluate AI-driven educational tools, focusing on model behavior, alignment, and responsible AI practices under senior mentorship.

Posted 9 days ago

Unstructured is hiring an AI Engineer to architect and ship production-grade RAG and agentic systems that process messy multimodal data for high-impact government and military contracts.

Weekday AI Hybrid No location specified
Posted 11 days ago

Contract opportunity to evaluate and improve LLM conversational responses in Hindi and English by performing fact-checking, annotation, and qualitative assessment.

Posted 12 days ago

Lead the design and production of LLM-driven coaching systems at Valence, applying deep ML and engineering expertise to build enterprise-grade, context-aware AI experiences.

Posted 13 days ago

LinkedIn seeks a Hybrid Machine Learning Engineer to build and deploy scalable relevance and evaluation models for recommender systems and generative/NLP-driven product features.

ServiceNow Hybrid Building A,B,C 2225 Lawson Lane, Santa Clara, CALIFORNIA, United States
Posted 13 days ago
Inclusive & Diverse
Mission Driven
Rise from Within
Diversity of Opinions
Work/Life Harmony
Empathetic
Feedback Forward
Take Risks
Collaboration over Competition
Medical Insurance
Dental Insurance
Vision Insurance
Mental Health Resources
Life insurance
Disability Insurance
Health Savings Account (HSA)
Flexible Spending Account (FSA)
Conferences Stipend
Paid Time-Off
Maternity Leave
Equity

A selective, eight-week (mostly virtual) unpaid bootcamp at ServiceNow for undergraduate students to learn agentic AI, build and evaluate agents, and present a capstone project during an in-person finale.

AIR is hiring a Technical Assistance Consultant to develop and deliver workforce-focused TA, training, and capacity-building to advance economic mobility, workforce development, and future-of-work strategies including AI integration.

ServiceNow Hybrid 15725 Dallas Pkwy, Addison, TX 75001, USA
Posted 15 days ago
Inclusive & Diverse
Mission Driven
Rise from Within
Diversity of Opinions
Work/Life Harmony
Empathetic
Feedback Forward
Take Risks
Collaboration over Competition
Medical Insurance
Dental Insurance
Vision Insurance
Mental Health Resources
Life insurance
Disability Insurance
Health Savings Account (HSA)
Flexible Spending Account (FSA)
Conferences Stipend
Paid Time-Off
Maternity Leave
Equity

Lead the strategic integration of AI across ServiceNow marketing by owning the MarTech and agentic product portfolio to drive adoption, efficiency, and measurable business impact.

Posted 16 days ago

Senior engineering leader to design, evaluate and productionize agentic AI systems, prompt architectures and multi-agent orchestration for critical banking workflows at Deutsche Bank in Cary, NC.

Generative AI Analyst at Welocalize to craft prompts, annotate and evaluate LLM outputs, and lead labeling workflows in a remote full-time role.

Posted 17 days ago

Lead the design and implementation of secure, scalable Generative AI and ML architectures for an EdTech organization focused on building production-ready RAG, retrieval, and MLOps solutions.

Posted 17 days ago

Build the internal tooling and evaluation infrastructure that empowers engineers and researchers to iterate quickly and reliably on Crosby’s LLM-powered legal platform.

Posted 18 days ago

Neighbors Bank is looking for a decisive, process-improvement focused Recruiting Coordinator to manage hiring pipelines, conduct candidate evaluations, and help evolve recruiting practices in a fully remote role.

Posted 18 days ago
Dental Insurance
Disability Insurance
Flexible Spending Account (FSA)
Health Savings Account (HSA)
Vision Insurance
Sabbatical
Paid Holidays

Handshake is hiring an ML Research Scientist to drive open scientific research, create public benchmarks, and collaborate with top AI labs to advance data and evaluation methods for frontier models.

MLabs Hybrid No location specified
Posted 19 days ago

Lead the design and evaluation of agentic LLM systems that power a fintech's financial intelligence platform, ensuring correctness, scalability, and production reliability.

SweetRush is hiring an Instructional Designer/eLearning Developer to create and deliver IT-focused learning solutions (AI, cybersecurity, workplace apps) for a global enterprise in a remote, Eastern Time–preferred contract role.

Posted 20 days ago

Experienced software engineers with strong system-design and ML/LLM experience are needed to build and productionize LLM-powered agents, evaluation pipelines, and scalable AI infrastructure at Permute.

Posted 20 days ago

Fullscript is looking for a Staff Machine Learning Engineer to architect and ship production LLM-driven clinical features that improve clinician workflows and patient outcomes.

Inclusive & Diverse
Diversity of Opinions
Growth & Learning
Mission Driven
Social Impact Driven
Empathetic
Dental Insurance
Flexible Spending Account (FSA)
Health Savings Account (HSA)
Vision Insurance
Performance Bonus
Family Medical Leave
Paid Holidays

Khan Academy is hiring a Senior AI Engineer (24-month fixed-term) to lead integration, evaluation, and quality improvements of generative AI features that support learning at scale.

Dental Insurance
Disability Insurance
Flexible Spending Account (FSA)
Health Savings Account (HSA)
Vision Insurance
Sabbatical
Paid Holidays

Handshake seeks experienced 3D Slicer users to remotely evaluate AI-generated medical imaging content and provide expert feedback on segmentation, DICOM workflows, and clinical research relevance.

Dental Insurance
Disability Insurance
Flexible Spending Account (FSA)
Health Savings Account (HSA)
Vision Insurance
Sabbatical
Paid Holidays

Handshake seeks experienced Shotcut users to evaluate AI-generated video edits and create tool-focused assessment materials on a flexible, remote, hourly contract basis.

ServiceNow Hybrid 15725 Dallas Pkwy, Addison, TX 75001, USA
Posted 22 days ago
Inclusive & Diverse
Mission Driven
Rise from Within
Diversity of Opinions
Work/Life Harmony
Empathetic
Feedback Forward
Take Risks
Collaboration over Competition
Medical Insurance
Dental Insurance
Vision Insurance
Mental Health Resources
Life insurance
Disability Insurance
Health Savings Account (HSA)
Flexible Spending Account (FSA)
Conferences Stipend
Paid Time-Off
Maternity Leave
Equity

Lead the AI product portfolio for marketing to turn enterprise AI strategy into a cohesive MarTech roadmap, measurable productivity gains, and durable automation at scale.

ServiceNow Hybrid 275 Wyman St 2nd floor, Waltham, MA 02451, USA
Posted 22 days ago
Inclusive & Diverse
Mission Driven
Rise from Within
Diversity of Opinions
Work/Life Harmony
Empathetic
Feedback Forward
Take Risks
Collaboration over Competition
Medical Insurance
Dental Insurance
Vision Insurance
Mental Health Resources
Life insurance
Disability Insurance
Health Savings Account (HSA)
Flexible Spending Account (FSA)
Conferences Stipend
Paid Time-Off
Maternity Leave
Equity

Lead the AI MarTech product portfolio at ServiceNow to convert AI strategy into scalable agentic workflows, measurable productivity gains, and sustained marketing leverage.

Work on TRM’s AI Engineering team to design and ship agentic LLM systems and scalable infrastructure that augment investigations and ensure safe, auditable behavior in high-sensitivity environments.

Posted 23 days ago

aiEDU is hiring a Senior Lead, Research & Evaluation to design and run impact measurement, lead research strategy, and build data systems that inform program decisions across the organization.

Varick Agents Hybrid No location specified
Posted 23 days ago

Varick seeks an AI Engineer to architect and ship production-grade agent systems, evaluation pipelines, and retrieval-driven context strategies for enterprise AI deployments.

Lead the design, production deployment, and continual improvement of AI-powered features for Savvas's flagship K-12 platform, applying deep LLM, cloud, and software engineering expertise to improve student learning at scale.

Posted 24 days ago

Rwazi is hiring a Decision Intelligence Analyst to validate and improve AI-driven decision outputs by identifying failure modes, formalizing evaluation rubrics, and refining judgment frameworks.

ServiceNow Hybrid 15725 Dallas Pkwy, Addison, TX 75001, USA
Posted 24 days ago
Inclusive & Diverse
Mission Driven
Rise from Within
Diversity of Opinions
Work/Life Harmony
Empathetic
Feedback Forward
Take Risks
Collaboration over Competition
Medical Insurance
Dental Insurance
Vision Insurance
Mental Health Resources
Life insurance
Disability Insurance
Health Savings Account (HSA)
Flexible Spending Account (FSA)
Conferences Stipend
Paid Time-Off
Maternity Leave
Equity

Lead the AI product portfolio for marketing at ServiceNow, defining and delivering a unified MarTech and agentic roadmap that drives measurable productivity and enterprise-scale adoption.

Lead architecture and delivery of scalable, secure AI and agentic systems at PointClickCare to drive measurable clinical and operational outcomes across the platform.

Contract reviewers are needed to compare AI-generated English text pairs, choose the clearer response, and provide concise explanations to help improve model output quality.

Posted 26 days ago

Virtue AI is seeking a hands-on Testing Engineer to lead product and backend QA, automate system testing, and perform model red-teaming for a cutting-edge AI security platform.

IFS Hybrid Itasca, United States
Posted 27 days ago

Lead architecture and delivery of enterprise-scale LLMs, agent orchestration, and retrieval systems to build safe, scalable AI workflows for IFS Nexus Black.

Posted 28 days ago

TRM Labs is hiring a Senior AI Research Engineer to drive model evaluation, fine-tuning, and production orchestration for large-scale LLM and ML systems that power blockchain intelligence.

Posted 29 days ago

Handshake AI seeks Physics PhDs to perform flexible, hourly contract work evaluating AI-generated physics content for scientific accuracy and physical reasoning.

Posted 29 days ago

Handshake seeks Math PhDs for flexible, remote hourly contracts to design domain-relevant math questions and evaluate AI-generated mathematical reasoning and proofs.

Posted 29 days ago

Handshake seeks doctoral-level biology experts to review and critique AI-generated biological content on a flexible, remote, hourly contract basis.

How much do AI evaluation jobs pay?

Below 50k*: 2 jobs (15%)
50k-100k*: 1 job (8%)
Over 100k*: 10 jobs (77%)
*average yearly salary (USD)

Best cities to find AI evaluation jobs