Supernal helps small-to-medium businesses hire their first AI employee. Our AI teammates are built using intelligent, agentic workflows deployed on a proprietary platform. We deliver working, value-generating AI employees, not tools, that handle real business processes alongside human teams.
We're looking for a Staff/Principal Software Engineer to own and evolve the core platform that powers our AI employees. This is a technical leadership position responsible for the systems that enable our agents to scale reliably: the Django backend, distributed task infrastructure, event-driven architecture, Kubernetes deployments, and observability stack.
You'll work across the full system—from database query optimization to Helm chart tuning to designing new platform abstractions. You'll be a force multiplier for the engineering team, driving architectural decisions, eliminating scaling bottlenecks, and establishing patterns that make the platform more robust and developer-friendly.
This role reports to the Director of Engineering and involves significant autonomy in shaping technical direction.
Drive platform architecture decisions and align the team on scalable patterns and long-term maintainability
Review a high volume of code, design docs, and architectural proposals for scalability, reliability, security, and operability
Be a technical mentor and force multiplier: unblock engineers, raise the bar on production readiness, and establish platform best practices
Own and evolve the core backend platform (Django/DRF/ASGI), including its performance and correctness
Scale async execution across Celery + Dramatiq + Temporal/Cortex; implement resilient workflow patterns (retries, circuit breakers, graceful degradation)
Optimize PostgreSQL/pgvector (query tuning, connection pooling) and caching strategies
Maintain and improve Kubernetes deployment infrastructure (GKE, Helm, Terraform/OpenTofu) and CI/CD + rollout strategies. Own KEDA autoscaling policies and resource allocation across worker pools.
Own reliability of RabbitMQ, Redis, and PostgreSQL infrastructure; lead incident response and post-mortems
Extend OpenTelemetry + Datadog instrumentation, dashboards, alerts, and SLOs; profile and reduce latency/memory bottlenecks
10+ years building and operating production backend systems at scale
Deep expertise in Python (Django preferred) and relational databases (PostgreSQL)
Hands-on experience with Kubernetes, Helm, and cloud infrastructure (GCP preferred)
Strong background in distributed systems: message queues, event sourcing, workflow orchestration
Production experience with async task systems (Celery, Dramatiq, or similar)
Track record of debugging complex production issues across multiple services
Ability to work autonomously and drive technical initiatives without close supervision
Clear technical communication—able to explain tradeoffs and build consensus
Experience with Temporal or similar workflow engines
Background in LLM infrastructure, RAG systems, or AI/ML platforms
Familiarity with OpenTelemetry, Datadog, or similar observability stacks
Experience with KEDA or other Kubernetes autoscaling solutions
Contributions to multi-tenant SaaS platform architecture
History of improving developer experience and platform abstractions
Platform services maintain high availability with predictable performance under load
Scaling bottlenecks are identified and resolved proactively
New features ship faster because platform primitives are well-designed and documented
Incidents are rare, quickly detected, and thoroughly addressed
Engineers across the team adopt platform patterns and best practices
Technical debt is systematically identified and paid down
You're a trusted technical voice in architectural discussions
Compensation: Competitive salary commensurate with experience (Staff/Principal level)
Location: Remote
Type: Full-time
Requirements: Working hours that overlap with Americas time zones for collaboration; reliable high-speed internet