We’re looking for an engineer experienced with SRE practices to join our Core Platform team. The primary goal of this team is to improve our software delivery performance and availability by driving DevOps culture, tools, and practices across the engineering department. Besides the multiplicative impact of your own contributions, this role is a good opportunity to mentor teams that are learning to build and operate their own services.
You will:
Scale continuous deployment practices across the engineering department. Equip our teams to ship software reliably, frequently, and with minimal trouble. Teams will rely on your reusable patterns and guidance to practice DevOps (end-to-end production ownership within each cross-functional team). This means you’ll work with tools and infrastructure in addition to managing processes and practices.
Extend our reusable service template and its associated CI/CD tooling.
Lead efforts to further mature our cloud infrastructure and platform offering as our business grows.
Help mature our production observability practices. Help teams define SLIs and manage and achieve SLOs.
Extend and maintain our internal developer-friendly CLI
Operate and maintain our platform-level shared services and capabilities, such as continuous integration, continuous deployment, infrastructure automation and monitoring.
Coach product teams on operational ownership. Teach blame-free root cause analysis for incidents that impact the customer or our delivery performance.
Participate in our team’s support rotations
You are:
You are an advocate for DevOps practices and principles. We have the buy-in, now it’s about execution on the vision.
You have the cloud infrastructure and networking knowledge (AWS, GCP, Terraform) to design and operate services in the cloud
You have CI/CD tooling knowledge (GitHub Actions, Jenkins) and experience in maintaining large multi-stage pipelines
You have hands-on experience with configuration automation (e.g. Terraform), Docker, and observability tooling (Honeycomb). We’re fully on AWS but eyeing GCP.
You are a team player who can collaborate with others, but can also run with a mostly self-contained project independently when required
Bonus points if you have application development experience, whether as a developer or in operational roles that involved coding, and proficiency in a programming language. We use Python for most things, but understand that skills from any language are transferable.
Why team members love working at Top Hat:
A noble mission that creates meaningful, fulfilling work
A team that cares deeply for customers and for each other
Flexible, remote-first work environment
Professional learning and development for all role levels
An awesome and welcoming Toronto HQ
Competitive health benefits that start on day one
A management team focused on performance, growth, engagement and connection
Our winning strategy and market potential
Innovative PTO policy with lots of time and space for self-care
Passionate customers that believe in us—and what we do
A chance to work with new tech like generative AI—and see the customer impact