Job details

Senior Software Engineer - AI Inference

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Software Engineer - AI Inference in United States.

This role offers an opportunity to work at the forefront of large language model inference, contributing directly to high-performance open-source serving frameworks used at scale. You will help shape how modern AI applications run efficiently on advanced GPU infrastructure by improving the performance, reliability, and scalability of inference systems. Working in a deeply technical and collaborative environment, you will focus on optimizing runtime behavior, reducing latency, and increasing throughput for production-grade AI workloads. The position combines systems engineering, low-level optimization, and open-source contribution, with direct impact on widely used AI frameworks. You will engage with a global engineering community while solving complex performance challenges across distributed GPU systems. This is an ideal role for a hands-on engineer passionate about AI infrastructure and high-performance computing.


Accountabilities:
  • Contribute features, optimizations, and fixes to open-source inference frameworks such as vLLM and SGLang
  • Design and improve inference runtime components including scheduling, batching, request handling, and KV-cache optimization
  • Profile and optimize performance-critical paths across Python, C++, and CUDA layers
  • Enhance multi-GPU inference performance through improved parallelism, communication strategies, and resource utilization
  • Develop benchmarking systems and regression tests to ensure performance stability and correctness across deployments
  • Investigate and resolve bottlenecks using profiling tools, GPU analysis, and data-driven performance evaluation
  • Collaborate with cross-functional teams to translate production needs into scalable, upstream-ready solutions
  • Participate in code reviews, architectural discussions, and open-source community contributions

Requirements:

  • 5+ years of experience in production software engineering with strong systems-level expertise
  • Hands-on experience with LLM inference or serving frameworks such as vLLM, SGLang, or similar systems
  • Strong programming skills in Python and in C++ and/or CUDA, with the ability to debug and optimize performance-critical code
  • Experience with performance profiling tools, benchmarking, and latency/throughput optimization techniques
  • Solid understanding of distributed systems, concurrency, and multi-GPU or multi-node architectures
  • Strong communication skills and experience working in or contributing to open-source projects
  • Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or equivalent experience
  • Strong advantage: contributions to open-source AI, ML, or systems projects such as PyTorch, Triton, NCCL, or similar ecosystems
  • Strong advantage: experience with GPU memory optimization, kernel fusion, or advanced inference techniques such as quantization or speculative decoding
  • Strong analytical mindset with a focus on measurement-driven engineering

Benefits:

  • Competitive base salary ranging from $152,000 to $287,500 depending on level and experience
  • Equity participation in addition to base compensation
  • Comprehensive health, dental, and vision insurance coverage
  • Flexible work arrangements supporting work-life balance
  • Paid time off, holidays, and parental leave benefits
  • Professional development opportunities in advanced AI and systems engineering
  • Exposure to cutting-edge AI infrastructure and large-scale GPU computing systems
  • Inclusive and innovation-driven engineering culture


How Jobgether works:

We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.

We appreciate your interest and wish you the best!

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.

 

 

Average salary estimate

$219,750 / year (est.)
Min: $152,000
Max: $287,500



Jobgether has the ambition to disrupt the recruitment industry as we know it by simplifying it and making it more accurate 🎯 Jobgether platform connects candidates and companies based on: - Skills -... Values - Ambition - Personality The candidat...

Employment type: Full-time, hybrid
Date posted: April 17, 2026