This is a remote position based in the US. Please note that the recruitment and hiring process includes an in-person meeting.
We are seeking a skilled and experienced Staff Cloud Platform Engineer with expertise in Kafka to join the Cloud Platform team. The Staff Cloud Platform Engineer will design, deploy, operate, and optimize our Apache Kafka-based event streaming infrastructure at scale on Google Cloud Platform (GCP). The ideal candidate will have a strong background in DevOps practices, cloud infrastructure automation, and big data technologies. In this role, you will partner closely with platform, data, and application engineering teams to ensure our Kafka clusters are reliable, performant, and secure, running natively on GCP or AWS.
Responsibilities:
Design, provision, and manage Apache Kafka clusters (self-managed on GCP/AWS or via Confluent Platform / MSK).
Configure and tune brokers, ZooKeeper/KRaft, topics, partitions, replication factors, and retention policies for high throughput and low latency.
Perform cluster upgrades, rolling restarts, and broker replacements with zero downtime.
Implement and manage Kafka Connect pipelines for data ingestion and egress across heterogeneous systems.
Administer Kafka Streams and ksqlDB deployments for real-time stream processing workloads.
Maintain Schema Registry and enforce schema governance standards across teams.
Define and track SLIs/SLOs for consumer lag, throughput, end-to-end latency, and broker health.
Design and implement cloud infrastructure using Infrastructure as Code (IaC) with Terraform.
Build automated deployment pipelines for Kafka configuration changes using GitOps workflows (ArgoCD, Flux).
Create self-service tooling and runbooks to reduce toil for development teams.
Automate topic provisioning, ACL management, and schema registration via APIs and CLI tooling.
Integrate tools like GitLab CI/CD or Cloud Build for automated testing and deployment.
Ensure seamless integration of data pipelines with other GCP services such as BigQuery and Cloud Storage.
Monitor and optimize the performance, reliability, and cost of Kafka and streaming pipelines.
Implement security best practices for GCP resources, including IAM policies, encryption, and network security.
Ensure observability is an integral part of the infrastructure platforms and provides adequate visibility into their health, utilization, and cost.
Collaborate extensively with cross-functional teams to understand their requirements, educate them through documentation and training, and improve adoption of the platforms and tools.
Qualifications:
10+ years of overall experience in DevOps, cloud engineering, or data engineering.
5+ years of experience in Kafka at production scale.
Deep expertise in Kafka internals: replication protocol, log compaction, consumer group coordination, partition leadership, and KRaft mode.
Proficiency with container orchestration (Kubernetes / Helm) and deploying Kafka via Strimzi, Confluent Operator, or equivalent.
Strong understanding of networking (VPC, peering, private endpoints, DNS, load balancing) in cloud environments.
Hands-on experience with Kafka Connect, Schema Registry, and at least one stream processing framework (Kafka Streams, Flink, Spark Structured Streaming).
Proficiency in Google Cloud Platform (GCP) services, including Dataflow, Pub/Sub, Kafka, Dataproc, BigQuery, and Cloud Storage.
Expertise in Infrastructure as Code (IaC) tools like Terraform or Cloud Deployment Manager.
Familiarity with data orchestration tools like Apache Airflow or Cloud Composer.
Experience with CI/CD tools like Jenkins, GitLab CI/CD, or Cloud Build.
Knowledge of containerization and orchestration tools like Docker and Kubernetes.
Strong scripting skills for automation (e.g., Bash, Python).
Experience with monitoring tools like Cloud Monitoring, Prometheus, and Grafana.
Familiarity with logging tools like Cloud Logging or ELK Stack.
Strong problem-solving and analytical skills.
Excellent communication and collaboration abilities.
Ability to work in a fast-paced, agile environment.
The base pay range for this position varies based on the geographic location. More information about the pay range specific to candidate location and other factors will be shared during the recruitment process. Individual pay is determined based on location of residence and multiple factors, including job-related knowledge, skills and experience.
San Francisco Bay Area:
156,400 - 265,700 USD Annual
All Other US Locations:
As a part of the total compensation package, this role may be eligible for a bonus. For information on our benefits click here.