
Senior Site Reliability Engineer (Azure Data DevOps)

Remote: Full Remote
Contract:
Experience: Senior (5-10 years)
Work from:

Offer summary

Qualifications:

  • Proficient in Microsoft Azure and Data Factory
  • Experience with monitoring tools like Datadog
  • Strong analytical and troubleshooting skills
  • Experience with CI/CD tools like Git and Jenkins
  • Proficient in PowerShell or Python scripting

Key responsibilities:

  • Develop and maintain CI/CD pipelines
  • Optimize application performance and reliability
  • Troubleshoot complex data pipeline issues
  • Monitor system performance and address issues
  • Manage release and deployment processes
EPAM Systems Information Technology & Services XLarge https://www.epam.com/
10001 Employees

Job description

We are actively seeking an experienced Senior Site Reliability Engineer to join our team. In this role, you will be responsible for designing, implementing, and managing Azure cloud infrastructure and services to ensure high availability, scalability, and security. You will collaborate with development teams to optimize application performance and reliability, troubleshoot and resolve complex issues in data pipelines, and proactively monitor system performance and resource utilization.

Responsibilities


  • Develop and maintain CI/CD pipelines for automated deployment, testing, and monitoring of data-driven applications
  • Collaborate with development teams to optimize application performance and reliability, implementing best data management practices
  • Troubleshoot and resolve complex issues in data pipelines, storage, and processing within the Azure cloud environment
  • Monitor system performance and resource utilization, proactively addressing potential issues to prevent downtime and data loss
  • Manage the release and deployment of applications and components


Requirements


  • Proficient knowledge of Microsoft Azure, Data Factory, and Databricks
  • Experience with monitoring tools like Datadog, including setup and configuration
  • Skilled in system availability, reliability, capacity planning, and scaling environments
  • Strong analytical and troubleshooting skills in an L3 production environment for large Data Platform systems
  • Understanding of Incident and Change Management processes
  • Experience in administering and deploying CI/CD tools such as Git, Jira, GitLab, or Jenkins
  • Proficient in infrastructure scripting with PowerShell or Python
  • Ability to visually present and communicate architectures
  • Excellent interpersonal skills and emotional intelligence
  • Strong English communication skills (B2+ level)


Nice to have


  • Experience with Kubernetes and Terraform for infrastructure orchestration
  • Knowledge of observability and troubleshooting in distributed systems
  • Familiarity with Azure Data Factory and Azure Databricks


Technologies


  • Microsoft Azure
  • Azure Data Factory
  • Azure Databricks
  • Azure DevOps
  • Datadog
  • Infrastructure & Operations
  • Git, Jira, GitLab, Jenkins
  • PowerShell or Python scripting


We offer


  • Career plan and real growth opportunities
  • Unlimited access to LinkedIn learning solutions
  • International Mobility Plan within 25 countries
  • Constant training, mentoring, online corporate courses, eLearning and more
  • English classes with a certified teacher
  • Support for employee initiatives (Algorithms club, Toastmasters, Agile club, and more)
  • Enjoyable working environment (gaming room, napping area, amenities, events, sports teams, and more)
  • Flexible work schedule and dress code
  • Collaborate in a multicultural environment and share best practices from around the globe
  • Hired directly by EPAM & 100% under payroll
  • Benefits required by law (IMSS, INFONAVIT, 25% vacation bonus)
  • Insurance: life and major medical expenses with dental and vision coverage (for the employee and direct family members)
  • 13% employee savings fund, capped at the legal limit
  • Grocery coupons
  • 30-day December bonus
  • Employee Stock Purchase Plan
  • 12 vacation days plus 4 floating days
  • Official Mexican holidays, plus 5 extra holidays (Maundy Thursday, Good Friday, November 2nd, December 24th and 31st)
  • Monthly non-taxable amount for the electricity and internet bills


By applying to our role, you agree that your personal data may be used as set out in EPAM's Privacy Notice and Policy.

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

Required profile

Experience

Level of experience: Senior (5-10 years)
Industry: Information Technology & Services
Spoken language(s): English
Check out the description to know which languages are mandatory.

Other Skills

  • Social Skills
  • Diagnostic Skills
  • Verbal Communication Skills
  • Emotional Intelligence
  • Analytical Skills
  • Reliability
