Browse 31 Model Serving jobs hiring now. Companies hiring include FriendliAI, Zencore, and Intel, with roles in Stockton, Colorado Springs, and Buffalo.
Help architect and operate FriendliAI’s enterprise inference platform as a Senior Backend Engineer focused on APIs, multi-tenant SaaS features, and data/system reliability at scale.
Lead design and delivery of secure, scalable, production-grade AI/ML solutions as Zencore’s Principal Architect, advising clients and shaping cloud-native architectures.
Intel is hiring an AI Software Engineer to develop deployment, data, and evaluation infrastructure for agentic AI frameworks and model-serving systems.
Senior engineering role owning end-to-end ingestion, indexing, and model operationalization for a high-throughput hybrid search platform at Thomson Reuters.
Lead Zapier's AI Platform team to build reusable model-serving, evaluation, and MLOps tooling that helps product teams ship AI features quickly, safely, and cost-effectively.
Workday is hiring a Machine Learning Engineer III to develop and productionize large-scale ML and GenAI solutions that improve payroll processing for enterprise customers.
Lead system- and hardware-focused optimizations for LinkedIn’s AI inference platform, improving GPU utilization, compiler workflows, and low-latency model serving at scale.
Couchbase seeks a Principal Software Engineer to architect and implement the Capella Control Plane using Golang, multi-cloud infrastructure, and AI/ML integrations to scale its DBaaS offering.
Lead the design and delivery of a closed-loop intelligence layer that enables an autonomous trading fleet to learn from real-time outcomes and improve profitability.
Help scale production ML infrastructure and retrieval systems at Foxglove to enable high-performance semantic search and data mining over multimodal robotics data.
Pangram Labs is hiring a Senior Backend Software Engineer to design, build, and scale the production systems that serve its AI-detection platform in Brooklyn.
Straia seeks a Senior Platform Engineer to design and operate the data movement, model-serving, and platform infrastructure that powers low-latency AI analytics for higher education.
Twelve Labs is hiring a senior Machine Learning Engineer to optimize and scale multimodal video foundation models for deployment across cloud and data platforms.
Deepgram is hiring an ML Ops Infrastructure Engineer to design and operate scalable model deployment, CI/CD, and monitoring systems that deliver production-grade voice AI at scale.
Actian is hiring an AI Engineering Intern to help integrate ML models into production applications while gaining hands-on experience with model serving, data pipelines, and full-stack development.
Work with research teams to productionize large-scale generative models, build GPU inference infrastructure, and ensure reliable deployment and observability for production ML workloads.
Lead and build True Anomaly’s AI platform and engineering team to deliver production-grade model hosting, agent infrastructure, and enterprise AI tooling that embed AI across the company.
Fundamental is hiring a Model Serving Engineer to build and optimize production inference infrastructure for NEXUS, focusing on Triton-based pipelines, GPU efficiency, and low-latency, high-throughput serving.
General Robotics seeks an ML Systems Engineer in Redmond to productionize and optimize real-time, GPU-accelerated model serving and ML infrastructure for autonomous robotics.
Drive production-quality integrations of NVIDIA Grove into Dynamo and leading open-source AI frameworks, delivering adapters, runtime components, and developer tooling for scalable training and inference.
Shape and own the QA strategy for FriendliAI’s inference platform, covering backend, frontend, model deployments, and novel validation for LLM inference quality.
Senior Backend Engineer needed to design and operate production-grade APIs and backend systems for a fast-moving AI inference platform serving enterprise deployments.
Toyota Research Institute is hiring a Senior Machine Learning Engineer to build ML infrastructure, integrate and fine-tune LLMs, and operationalize multimodal research workflows for robotics, autonomy, energy, and materials programs.
Decagon is hiring a Senior ML Infrastructure Engineer to design and scale distributed training and multi-provider inference platforms for LLMs and multimodal models.
Wizard AI is hiring a Senior MLOps Engineer to own and scale the production ML lifecycle for a real-time inference platform behind a conversational shopping agent.
Lead the design and implementation of scalable ML backend systems and sensor-data pipelines to enable production-grade robotics and autonomous manufacturing at Nidus in New York City.
NVIDIA seeks a seasoned Developer Relations Manager to partner with hyperscaler AI teams, provide hands-on technical enablement for NVIDIA AI software, and drive developer adoption and feedback into the product roadmap.
Lead architecture and delivery of scalable, secure AI and agentic systems at PointClickCare to drive measurable clinical and operational outcomes across the platform.
Lead the design and delivery of scalable ML infrastructure at Genentech to accelerate AI-driven drug discovery and support cross-functional ML teams.
Lead enterprise sales into regulated industries for Baseten, owning full-cycle deals, strategic account expansion, and technical evaluations to establish industry lighthouse customers.
Lead the design, training, and production deployment of large-scale ML models at Absentia Labs to turn complex scientific data into actionable machine intelligence.
Below 50k*: 0 | 50k-100k*: 1 | Over 100k*: 29