Site Reliability Engineer

Remote: Full Remote

Offer summary

Qualifications:

  • Minimum 5 years of experience in Site Reliability Engineering, systems engineering, or software development with an operational focus.
  • Expert-level knowledge of cloud platforms, particularly GCP, and hands-on experience with Kubernetes and infrastructure-as-code tools like Terraform.
  • Proficiency in multiple scripting and programming languages such as Python, Go, and Bash.
  • Strong analytical, problem-solving, and debugging skills, along with excellent communication and collaboration abilities.

Key responsibilities:

  • Lead technical initiatives to enhance the reliability, performance, and scalability of critical systems.
  • Design and implement resilient and scalable solutions, focusing on distributed systems.
  • Develop and promote automation frameworks and tools, mentoring junior engineers in the process.
  • Take charge of incident management and lead post-mortem analyses to drive improvements based on findings.

Strike
51 - 200 Employees

Job description

Join the Future of Money

At Strike, we're building a world where everyone has access to Bitcoin—a truly open, global, and public digital infrastructure for money. Together, we're leading a financial revolution that promises a more fair, honest, and equitable future for everyone. Strike is the global Bitcoin app—a simple, fast, and secure way to buy bitcoin and send money worldwide. Available in over 100 countries, including the U.S., Europe, Latin America, and Africa, Strike is redefining global finance. Join us in our mission to create a more inclusive financial future for all.

Role: 

We are seeking a highly experienced Site Reliability Engineer located in Europe, with a strong track record of tackling complex reliability and scalability challenges, and a history of providing technical guidance to teams. If you're a seasoned problem-solver with a passion for automation and operational excellence, and enjoy elevating the skills of those around you, we want to hear from you.

This position is available for candidates located in Europe. 

What You'll Do:
  • Lead Technical Initiatives: Drive key technical initiatives focused on improving the reliability, performance, and scalability of our critical systems, often leading technical aspects within projects.
  • Architect and Implement Advanced Solutions: Design and implement sophisticated, resilient, and scalable solutions, leveraging your deep understanding of distributed systems.
  • Master Troubleshooting and Optimization: Lead complex troubleshooting efforts, identify deep-seated root causes, and implement advanced optimizations.
  • Build and Evangelize Automation: Develop and champion the adoption of robust automation frameworks and tools, potentially guiding more junior engineers in their development.
  • Elevate Observability Practices: Design and implement comprehensive and insightful monitoring and logging solutions, ensuring actionable insights are available across teams.
  • Provide Leadership in Incident Management: Take a leadership role in incident response, providing critical technical direction and mentorship during high-pressure situations.
  • Champion Post-Mortem Excellence: Lead and contribute to in-depth blameless post-mortem analyses, driving significant improvements based on learnings.
  • Mentor and Guide Team Members: Share your extensive knowledge and experience to mentor and guide other SREs and engineers, fostering their technical growth.
What We're Looking For:
  • Extensive experience (minimum 5 years) in SRE, systems engineering, or software development with a strong operational focus.
  • Demonstrated experience in providing technical leadership, guidance, or mentorship to engineering teams.
  • Expert-level practical knowledge of cloud platforms, especially GCP.
  • Deep hands-on experience with container orchestration (Kubernetes) and infrastructure-as-code (Terraform, Helm, ArgoCD).
  • Strong command of multiple scripting and programming languages (Python, Go, Bash).
  • Proven expertise in building and leveraging advanced monitoring and observability tools (Prometheus, Grafana, ELK stack).
  • Exceptional analytical, problem-solving, and debugging skills at a senior level.
  • Excellent communication, collaboration, and influencing skills.

Compensation for services is location dependent. 

We do not make hiring decisions based on educational history whatsoever. Our Founder is a college dropout. We work with high school dropouts, PhD candidates, and everyone in between. We do not hire credentials. We simply partner with talented, passionate individuals who are excited to be a part of our team.

By clicking submit application below, you consent to our use and processing of your data as described in our Candidate Privacy Notice.

Required profile

Spoken language(s):
English

Other Skills

  • Mentorship
  • Collaboration
  • Communication
  • Problem Solving
