
Senior Site Reliability Engineer

Remote: Full Remote
Experience: Senior (5-10 years)

Offer summary

Qualifications:

  • 5+ years of infrastructure experience
  • 2+ years of AWS experience with certification
  • Proficiency in CloudFormation
  • Experience with Docker, ECS, and ECR
  • Strong SQL and data analysis skills

Key responsibilities:

  • Maintain infrastructure-as-code CloudFormation templates
  • Analyze infrastructure performance metrics
  • Manage deployment pipelines and scaling solutions
  • Collaborate with teams on shared metrics
  • Ensure compliance with SOC-2 and cybersecurity regulations
ZayZoon Fintech: Finance + Technology Scaleup https://www.zayzoon.com
51 - 200 Employees

Job description

WHO WE ARE

Our goal is to save ten million hard-working employees ten billion dollars. We are a values-driven, well-funded, and fast-growing Financial Technology and HR company. We want to empower small and midsize businesses with financial tools that make them the place where people want to work.

We’ve created a financial empowerment platform that helps small but mighty HR teams make a big impact on employee financial wellness. ZayZoon is quickly becoming the employee financial wellness super-app that employees can’t live without and that employers are clamoring to offer to help attract and retain talent.

We are growing fast and have been recognized for rapid growth in the 2023 Deloitte Technology Fast 500 and Canadian Technology Fast 50 programs!

About the Role

We are looking for a Senior Site Reliability Engineer to take ZayZoon’s cloud infrastructure to the next level with complex AWS builds, infrastructure-as-code, and observability/logging/APM solutions. You'll work in an embedded reliability team, alongside app and data engineers, to monitor, benchmark, and scale ZayZoon’s products. You will work with first-class technologies and staff to leverage all the goodies AWS has to offer and to build a bridge between our bare metal infrastructure and our Ruby on Rails production app. Predictability, reliability, and scalability are your three favourite words.

YOUR RESPONSIBILITIES:
  • Develop and maintain infrastructure-as-code CloudFormation templates, emphasizing serverless resources (ECS, Fargate, Lambda)
  • Instrument and analyze daily metrics for both infrastructure performance and our Ruby on Rails applications, using AWS tooling (Athena, CloudTrail, etc.) and third-party observability platforms (DataDog, OTel)
  • Manage deployment pipelines, including blue/green deployments and intelligent auto-scaling
  • Maintain and stay ahead of resource dependencies, particularly databases (RDS, Redshift), including updates, playbooks, and downtime planning
  • Project costs and implement AWS cost savings programs and reserved instances
  • Work alongside our risk and security teams to ensure ongoing SOC-2 and cybersecurity compliance
  • Collaborate extensively with app developers on shared metrics, database performance, and load testing
  • Collaborate extensively with data engineers to facilitate data warehouse development, ELT, and ETL
  • Participate in our agile development process: sprint planning, story grooming, and stand-ups
  • Adhere to our SDLC and to our secure coding practices and environment

TO BE SUCCESSFUL IN THIS ROLE, YOU NEED TO BE SOMEONE WHO:
  • Has the ability to build quickly when we need to experiment and build cleanly when an MVP becomes core functionality
  • Has strong SQL and data analysis skills and an eagerness to dig into data as part of problem solving

WHAT YOU BRING TO THE TABLE:
  • 5+ years infrastructure experience
  • 2+ years AWS experience including certification and deployment of production applications
  • Proficiency with IaC, specifically CloudFormation
  • Experience with containerization (Docker, ECS, ECR)
  • Experience analyzing and acting on performance issues using observability platforms (DataDog, NewRelic, OTel)

ANYTHING ELSE YOU MIGHT NEED TO KNOW

    Candidates must be located in Canada to be considered.
    We are organized as a remote team, so we are looking for candidates who can work effectively remotely. You must have access to a secure high-speed internet connection and a secure workspace to ensure the security of private information. This role is available on a permanently remote basis.

    Please be aware that as part of our final hiring process, we will conduct reference calls with previous managers and possibly other individuals. Additionally, due to the nature of our business, a criminal record check and a basic security clearance will also be required.

    We wish to thank all qualified applicants for their interest in joining our team! 


    Required profile

    Experience

    Level of experience: Senior (5-10 years)
    Industry: Fintech: Finance + Technology
    Spoken language(s): English

    Other Skills

    • Collaboration
    • Problem Solving
