Browse 11 jobs currently hiring in Quantization. Companies hiring include Zoox, LinkedIn, and Tavus, with openings in Fort Lauderdale, North Las Vegas, and Des Moines.
Drive production-ready model optimization, custom kernel development, and edge deployment to enable real-time inference of large-scale models on vehicle SoCs for Zoox's Perception team.
Lead system- and hardware-focused optimizations for LinkedIn’s AI inference platform, improving GPU utilization, compiler workflows, and low-latency model serving at scale.
Tavus is hiring a Multimodal AI Model Optimization Research Engineer to convert cutting-edge multimodal models into efficient, low-latency production systems.
Lead the development of custom quantization algorithms and low-precision techniques to maximize model performance on Quadric's Chimera GPNPU from our Burlingame engineering office.
Join ADI's Embedded AI Tooling Team to build end-to-end model deployment, optimization, and compilation tooling that unlocks AI on heterogeneous embedded SoCs.
Lead and grow a high-performing edge software engineering team to build and scale AI-enabled IoT solutions deployed across distributed devices for a fast-growing intelligent site technology company.
Toyota Research Institute is hiring a Senior Machine Learning Engineer to build ML infrastructure, integrate and fine-tune LLMs, and operationalize multimodal research workflows for robotics, autonomy, energy, and materials programs.
Senior-level embedded AI engineer role at Renesas to lead development of model translation tooling and high-performance inference for resource-constrained MCUs/MPUs.
Metamorphic is hiring an ML Research Engineer (Performance Engineering) to implement and optimize GPU kernels, low-precision training, and MoE systems for next-generation foundation models.
Work at the kernel layer to design, profile, and ship custom CUDA/ROCm kernels that maximize performance across NVIDIA and AMD GPUs for inference and training workloads.
Lead the architecture and implementation of an AI-driven compiler toolchain that translates modern neural networks into highly optimized executables for Renesas SoCs.
Salary breakdown*: Below 50k: 0 | 50k–100k: 0 | Over 100k: 11