LiteLLM is the world’s most popular AI Gateway, used by the world’s largest companies (Adobe, Netflix, NASA, etc.) to give their developers access to LLMs and adjacent services (MCPs, Vector Stores, etc.).
Companies use LiteLLM Enterprise once they put LiteLLM into production and need enterprise features such as Prometheus metrics (production monitoring), or need to give LLM access to a large number of people through SSO (single sign-on) or JWT (JSON Web Token) authentication.
We are hiring an exceptional engineer to own release infrastructure and release security at LiteLLM. This is an opportunity to join us in person as an early employee and make a large impact at a high-growth startup. You will own a critical part of the company with a high degree of autonomy: making sure we can ship secure, reliable releases on a consistent cadence.
We work 5 days per week in our SF office, approximately 60 hours per week in total.
We are looking for a software engineer with a strong background in infrastructure, CI/CD, and release engineering. You should be comfortable working across Helm, Terraform, release automation, testing systems, and the developer infrastructure needed to guarantee stable releases. This is a hands-on role.
You should be able to investigate test failures, distinguish real regressions from flaky tests, write Python, fix minor test issues, remove dead tests, and improve the overall reliability of the release pipeline. You should also be able to architect a secure end-to-end release process: how code moves from commit to published artifact, how access is controlled, how secrets are handled, and how we reduce the chance of bad or unauthorized releases.
Own secure, regular releases for LiteLLM, including 2 nightly releases and 1 stable release per week.
Manage and improve the infrastructure behind our release process, including Helm, Terraform, CI/CD, and other developer systems needed to keep releases stable.
Investigate test failures and determine whether they are true regressions, flaky tests, or dead tests that should be fixed or removed.
Write Python to fix minor test issues, improve release reliability, and support developer workflows.
Architect and implement a secure release process across build, test, approval, and publish steps.
Work closely with the engineering team to improve release quality, reduce operational risk, and keep shipping velocity high.
2+ years of experience in infrastructure engineering, DevSecOps, release engineering, or related systems work.
Proficient in Python and comfortable making code changes in test and release systems.
Experience with Terraform, Helm, CI/CD systems, and cloud infrastructure.
Strong judgment around release reliability, testing, and debugging.
Ability to distinguish between real regressions and flaky infrastructure or test behavior.
Ability to design secure release processes, including access controls, secrets handling, and safe publishing workflows.
Ability to collaborate effectively with engineers across product, infra, and security.
LiteLLM (https://github.com/BerriAI/litellm) is a Python SDK and Proxy Server (LLM Gateway) for calling 100+ LLM APIs in the OpenAI format, including Bedrock, Azure, OpenAI, VertexAI, and Cohere. It is used by companies like Rocket Money, Adobe, Twilio, and Siemens.