About the company
We are a team of world-class builders and researchers with expertise across several domains: Ethereum Protocol Engineering, Layer-2, Decentralized Finance (DeFi), Miner Extractable Value (MEV), Smart Contract Development, Security Auditing and Formal Verification. Working to solve some of the most challenging problems in the blockchain space, we frequently collaborate with renowned organizations such as the Ethereum Foundation, StarkWare, Gnosis Chain, Aave, Flashbots, xDai, OpenZeppelin, Forta Protocol, Energy Web, POA Network and many more. We actively contribute to Ethereum core development, EIPs and network upgrades together with the Ethereum Foundation and other client teams.
Job Summary
Responsibilities:
• Monitor and maintain the production system, including Ethereum validators and nodes, APIs, and other apps. This includes setting up monitoring tools, troubleshooting issues, and performing regular maintenance tasks to ensure optimal performance.
• In the event of an incident or outage, quickly identify the root cause and implement a fix to restore service. This may require working outside of normal business hours to respond to incidents in a timely manner.
• Document processes, procedures, post-incident reports, and best practices related to running our services in production. This documentation helps ensure consistency and quality across the team and serves as a reference for future team members.
• Collaborate closely with other members of the team to ensure that all production services are running smoothly and that any issues are addressed quickly, especially those affecting Ethereum validators. This may include participating in on-call rotations, attending team meetings, and working on cross-functional projects with other teams.
• Automate as many tasks as possible to reduce the manual work required to manage infrastructure. This includes scripting, developing tools, and setting up automation using Terraform and CI/CD to streamline processes.
• Continuously improve the processes, procedures, and tools used to manage blockchain nodes and validators. This includes identifying areas for improvement, implementing changes, and measuring the impact of those changes to ensure they are effective.
• Evaluate business needs and produce designs to deliver the assigned projects.
• Provide systems expertise and drive operational best practices, including setting up and maintaining system performance monitoring.
In this role, you should have:
• Infrastructure as Code (IaC) experience on any cloud platform, preferably AWS.
• Proficiency in the Linux operating system and command-line tools.
• Skills in programming languages such as Python, Golang, or Bash.
• Experience with CI/CD pipelines and automation frameworks, preferably ArgoCD.
• Familiarity with containerization technologies such as Docker with Docker Compose and Kubernetes.
• Experience designing and implementing systems with high availability, reliability, security, and cost optimization in mind.
• Experience performing proactive analysis of infrastructure capacity and performance, system backup, and recovery.
• Experience ensuring that security systems and appliances are functional and continually improved for proactive cyber defense.
• The ability to act as a role model for technical competence, helpfulness, facilitation of learning, and teamwork.
• Experience with monitoring and alerting tools such as Prometheus and Grafana.
• Strong troubleshooting and problem-solving skills, and excellent communication and collaboration skills.
• Ability to work independently and remotely, while also being a team player.
• Exposure to maintaining blockchain nodes and validators, preferably Ethereum.
Nice-to-have skills
• Expertise in maintaining blockchain nodes and validators, especially Ethereum's, is preferred.
• Experience with Kubernetes cluster deployment strategies using Argo CD.
• Scripting proficiency in several languages, such as Bash, PowerShell, Python, Golang, or others.