Senior Staff Engineer - Site Reliability at Nagarro

Remote: Full Remote
Experience: Senior (5-10 years)

Offer summary

Qualifications:

  • Expert in Kubernetes and Terraform
  • Proficient in AWS services, especially EKS
  • Experience with incident and problem management
  • Understanding of networking concepts like TCP/IP and DNS
  • Familiarity with CI/CD tools like Jenkins

Key responsibilities:

  • Provide L3 support across full stack
  • Automate SRE tools for proactive support
  • Communicate effectively with stakeholders during troubleshooting
  • Monitor resource utilization and application performance
  • Handle business pressure for critical applications
Nagarro Information Technology & Services XLarge https://www.nagarro.com/
10001 Employees

Job description

Company Description

We are a Digital Product Engineering company that is scaling in a big way! We build products, services, and experiences that inspire, excite, and delight. We work at scale — across all devices and digital mediums, and our people exist everywhere in the world (19000+ experts across 33 countries, to be exact). Our work culture is dynamic and non-hierarchical. We are looking for great new colleagues. That is where you come in!

Job Description
  • Experienced L3 SRE engineer working on a business-critical SaaS application.
  • Capacity to provide L3 support across the full stack, including infrastructure, back end, and front end, before escalation to the engineering business unit.
  • Capacity to automate SRE tools to provide proactive L3 support, in line with our tech monitoring strategy.
  • Capacity to work under business pressure for business-critical applications.
  • Capacity to communicate appropriately with L1, L2, Engineering, Product managers, leadership, and end users during troubleshooting.

Qualifications

Must-have skills: Kubernetes (Expert), GitHub Actions, Terraform (Expert), and AWS.

  • Capacity to communicate appropriately.
  • Experience with incident and problem management.
  • Experience with multitenant applications.
  • Solid understanding of networking concepts (TCP/IP, DNS, routing, etc.), including VPCs, subnets, firewalls, load balancing, and TLS/SSL.
  • Experience with CI/CD pipelines (e.g., Jenkins, GitHub Actions) and version control.
  • Python and React/Next.js.
  • Monitoring and logging to analyze and track resource utilization and application performance and to identify potential issues (Grafana, Prometheus, Loki, or ELK).
  • Experience with AWS, particularly EKS, serverless, queues, and various databases.
  • Solid knowledge of Kubernetes.

Required profile

Experience

Level of experience: Senior (5-10 years)
Industry: Information Technology & Services

Other Skills

  • Verbal Communication Skills
  • Troubleshooting (Problem Solving)
