About the company
Fireblocks provides a suite of applications to manage digital asset operations and a complete development platform to build your business on the blockchain.
Job Summary
Responsibilities:
📍Own the production infrastructure on AWS and Azure. Implement sustainable and scalable solutions with the goal of improving availability and performance
📍Help identify the root cause of every incident and prevent it from recurring
📍Alert on symptoms, not on outages. Ensure all infrastructure and application alerts are actionable and/or backed by self-healing automation
📍Work closely with R&D and Support, offering education and guidance on integration, support, and monitoring across the toolset
📍Everything-as-code approach: run our infrastructure with Ansible, Terraform, and Kubernetes
📍Document every action, turn it into a repeatable procedure, and then into automation
📍Focus on the system's observability, availability, reliability, performance/latency, and monitoring
📍Take part in periodic on-call duty and emergency response
Requirements:
📍At least 3 years of experience as a DevOps engineer or SRE in a SaaS environment
📍Experience with coding languages: Python / JavaScript / Bash, or similar
📍At least 3 years of experience with alerting and monitoring systems such as Datadog / Splunk / New Relic / Prometheus, or similar
📍Experience working with Linux systems from kernel to shell and beyond
📍Cloud systems such as AWS / Google Cloud / Azure
📍Configuration management such as Ansible / Chef / Puppet
📍Experience with Docker, Kubernetes, and Helm
📍SCM: Git / Bitbucket / GitLab / Phabricator / Gerrit
📍Strong analytical and troubleshooting skills, with the ability to solve complex problems
📍Strong verbal and written communication skills and a collaborative mindset
📍Ability to dive into detail while understanding the big picture