Browse 16 exciting Distributed Inference jobs hiring now. Check out companies hiring, such as LinkedIn, DeepWalk, and MLabs, in Baltimore, Oklahoma City, and Sacramento.
Lead system- and hardware-focused optimizations for LinkedIn’s AI inference platform, improving GPU utilization, compiler workflows, and low-latency model serving at scale.
Lead DeepWalk’s computer vision platform as a Staff Software Engineer, driving the architecture and productionization of ML systems that process millions of images for sidewalk inspection and city infrastructure decisions.
Lead the design and delivery of a closed-loop intelligence layer that enables an autonomous trading fleet to learn from real-time outcomes and improve profitability.
Help scale production ML infrastructure and retrieval systems at Foxglove to enable high-performance semantic search and data mining over multimodal robotics data.
Twelve Labs is hiring a senior Machine Learning Engineer to optimize and scale multimodal video foundation models for deployment across cloud and data platforms.
A Research Engineer role focused on GPU/kernel and distributed-training optimizations to scale and accelerate real-time world-model AI.
Drive the design and implementation of experimentation methodologies, inference pipelines, and production tooling as a Full‑Stack Data Scientist on Netflix’s Experimentation Platform.
Shape and own the QA strategy for FriendliAI’s inference platform, covering backend, frontend, model deployments, and novel validation for LLM inference quality.
Decagon is hiring a Senior ML Infrastructure Engineer to design and scale distributed training and multi-provider inference platforms for LLMs and multimodal models.
Work on training and deploying large-scale ML systems for physical robots while building the infrastructure and pipelines to operate them in production.
Adaptive ML is hiring a Performance Engineer (Rust) to develop high-performance, production-grade systems that power scalable RLOps for enterprise LLM deployments.
Senior technical leader needed to define and drive the architecture and roadmap for LinkedIn’s AI and data infrastructure, ensuring scalable, reliable systems for ML training, inference, and observability.
Lead the design, training, and production deployment of large-scale ML models at Absentia Labs to turn complex scientific data into actionable machine intelligence.
Work as an Inference Engine Engineer at FriendliAI to design high-performance GPU kernels and core runtime components that power latency-critical, production-scale generative AI systems.
Visa seeks a Senior Consultant Machine Learning Engineer for a hybrid Austin role to lead model onboarding and build production ML services that scale and perform.
Field AI seeks an MLOps Engineer to build and operate scalable GPU infrastructure, deployment pipelines, and monitoring for ML models powering real-world robotics.
Below 50k*: 0 | 50k-100k*: 0 | Over 100k*: 2