Browse 19 jobs hiring in Triton now, at companies such as LinkedIn, K1X, and Foxglove, in locations including Houston, Hialeah, and Knoxville.
Lead system- and hardware-focused optimizations for LinkedIn’s AI inference platform, improving GPU utilization, compiler workflows, and low-latency model serving at scale.
K1X is hiring a hands-on Machine Learning Operations Engineer to design and operate scalable ML infrastructure, pipelines, and production inference systems for a fully remote, Midwest-preferred startup.
Help scale production ML infrastructure and retrieval systems at Foxglove to enable high-performance semantic search and data mining over multimodal robotics data.
Deepgram is hiring an ML Ops Infrastructure Engineer to design and operate scalable model deployment, CI/CD, and monitoring systems that deliver production-grade voice AI at scale.
Tavus is hiring a Multimodal AI Model Optimization Research Engineer to convert cutting-edge multimodal models into efficient, low-latency production systems.
Lead and scale NVIDIA's embedded AI software go-to-market and partner co-sales to accelerate ISV, OEM, and system integrator adoption of NVIDIA's platform.
A Research Engineer role focused on GPU/kernel and distributed-training optimizations to scale and accelerate real-time world-model AI.
Fundamental is hiring a Model Serving Engineer to build and optimize production inference infrastructure for NEXUS, focusing on Triton-based pipelines, GPU efficiency, and low-latency, high-throughput serving.
Wyetech is seeking an experienced Software Engineer 2 to productionize ML research into high-performance, containerized systems for federal customers while working hybrid from Laurel, MD.
General Robotics seeks an ML Systems Engineer in Redmond to productionize and optimize real-time, GPU-accelerated model serving and ML infrastructure for autonomous robotics.
Drive production-quality integrations of NVIDIA Grove into Dynamo and leading open-source AI frameworks, delivering adapters, runtime components, and developer tooling for scalable training and inference.
Metamorphic is hiring an ML Research Engineer (Performance Engineering) to implement and optimize GPU kernels, low-precision training, and MoE systems for next-generation foundation models.
Lead NVIDIA’s embedded AI software go-to-market and partner co-sales to drive broad ISV, OEM, and system integrator adoption of NVIDIA AI platforms.
Drive adoption of NVIDIA accelerated computing by advising AI-native startups on architecture, optimization, and scaling of agentic, multimodal, and LLM-powered applications.
Work at the kernel layer to design, profile, and ship custom CUDA/ROCm kernels that maximize performance across NVIDIA and AMD GPUs for inference and training workloads.
Lead firmware development for a novel photonic LLM accelerator at a Series A startup, owning low-level embedded software, toolchains, and hardware-software co-design.
Lead federal-focused developer relations to drive integration and adoption of NVIDIA’s GPU-accelerated AI stack across ISVs, defense contractors, and public-sector platforms.
Lead NVIDIA's developer relations strategy for financial services, driving adoption of GPU-accelerated AI across top capital markets ISVs and platform partners.
Lead technical strategy and global developer engagement for manufacturing at NVIDIA, driving adoption of AI and GPU-accelerated platforms across ISVs and developer communities.
Salary distribution: Below 50k*: 0 | 50k–100k*: 0 | Over 100k*: 19