About the company
Job Summary
Responsibilities

1. Infrastructure & Platform Engineering
- Design and manage cloud-native infrastructure (AWS/GCP/Azure)
- Build and operate Kubernetes clusters for production workloads
- Deploy and maintain stateful distributed systems:
  - Redpanda / Kafka
  - ClickHouse clusters
  - Redis clusters
  - Solana validator nodes
- Design high-availability and disaster recovery strategies
- Optimize infrastructure for performance, latency, and cost efficiency

2. CI/CD & Automation
- Design and maintain CI/CD pipelines for services and infrastructure
- Implement Infrastructure-as-Code using Terraform / Pulumi
- Automate environment provisioning and deployments
- Enable safe rollout strategies:
  - blue/green deployments
  - canary releases
  - rollback automation

Qualifications
- 5+ years experience in DevOps / Platform / SRE roles
- Strong experience operating Kubernetes in production
- Deep understanding of Linux systems and networking
- Experience with cloud platforms (AWS, GCP, or Azure)
- Experience running distributed systems at scale
- Strong experience with:
  - Docker
  - CI/CD pipelines
  - Infrastructure as Code
- Experience with monitoring and observability tooling
- Strong troubleshooting and incident management skills

Skills
- Cloud: AWS
- Containerization: Docker
- Orchestration: Kubernetes
- Streaming: Kafka
- Databases: ClickHouse, Redis
- Backend Services: Go, Python, Node.js
- Observability: Prometheus, Grafana, OpenTelemetry
- IaC: Terraform
- Storage: S3



