Site Reliability Consultant

LATAM | Remote | Work from Home

Why you?

As a Site Reliability Consultant, you will serve as both a technology leader and trusted advisor to our customers, while mentoring teammates in cutting-edge tools and approaches. Your projects will focus on infrastructure design and modernization, automation of CI/CD pipelines, and building out intelligent monitoring and observability systems—spanning Linux, Cloud, container orchestration, and other open-source technologies. You’ll become our resident expert for Git-based source code management, artifact repository solutions, and Kubernetes in both cloud (e.g., AWS EKS) and on-prem environments.

What you will be doing:

Operate & Maintain

Administer and optimize platforms such as GitLab (CI/CD pipelines, runners) and artifact repository solutions (e.g., JFrog Artifactory).

Maintain and troubleshoot Kubernetes clusters—either in the cloud (AWS EKS) or on-prem distributions—with a focus on availability, performance, and security.

Automation & CI/CD

Champion “infrastructure as code” using tools like Terraform (or CloudFormation), building repeatable processes for provisioning and updating clusters, repos, and associated services.

Implement or improve CI/CD pipelines to reduce manual toil and ensure quick, reliable deployments across multiple environments.

Monitoring & Incident Response

Design and configure observability solutions (e.g., Prometheus, Dynatrace, Grafana) to proactively detect and address issues in container orchestration environments, code repositories, and artifact repositories.

Participate in an on-call rotation, troubleshooting incidents at all tiers (from first-contact resolution to escalation) and driving continuous improvement based on Root Cause Analysis.

Architectural Guidance & Roadmaps

Collaborate with clients to shape infrastructure strategies around container orchestration, secure CI/CD, and DevSecOps best practices.

Provide leadership and technical direction on automating repetitive administrative tasks, enforcing security policies (RBAC, TLS, container scanning), and adopting GitOps workflows.

Documentation & Mentorship

Create and maintain design documents, runbooks, and operational playbooks for container platforms, CI/CD pipelines, and code management services.

Mentor fellow consultants and client stakeholders on Kubernetes, infrastructure automation, and advanced CI/CD usage to enhance knowledge across the organization.

Process Management

Plan and coordinate maintenance activities, ensuring minimal downtime and clear communication with stakeholders.

Provide ITIL-oriented support (Incident, Change, Problem Management), and champion continuous improvement of operational processes and service reliability.

What we need from you:

Kubernetes & Containerization

Must have strong experience with container orchestration (Kubernetes, Docker) in cloud (AWS EKS) or on-prem distributions.

Familiarity with related ecosystem tools (Helm, Operators, GitOps, etc.).

AWS & Cloud Expertise

Hands-on experience using AWS (VPC, EC2, EKS, IAM, S3, etc.), including provisioning with IaC tools like Terraform (or AWS CloudFormation).

AWS certifications (Solutions Architect, DevOps Engineer) are a plus.

CI/CD & Source Code Management

Experience setting up GitLab or similar platforms (GitHub, Bitbucket) for CI/CD pipelines, managing runners, and integrating code scanning.

Familiarity with artifact repository solutions (e.g., JFrog Artifactory), including repository creation, access controls, and automation of artifact flows.

DevOps & Automation

Track record of infrastructure automation using Terraform, Ansible, Puppet, or Chef to reduce manual intervention and ensure repeatable deployments.

Strong scripting skills (Bash, Python, Go, etc.) to automate system tasks and streamline operational workflows.

Monitoring & Observability

Experience with modern monitoring stacks (Prometheus, Dynatrace, Grafana, ELK/EFK) for analyzing logs, metrics, and traces.

Proven ability to design alerts, dashboards, and runbooks that enable rapid first-contact resolution.

Linux & Networking

Solid understanding of Linux-based systems, performance tuning, and troubleshooting.

Network fundamentals (TCP/IP, load balancers, DNS, NTP, etc.) and ability to diagnose connectivity or performance issues in complex distributed environments.

Security & Compliance

Familiarity with container security best practices (RBAC, TLS, vulnerability scanning) and how to apply them at scale.

Understanding of compliance frameworks (HIPAA, PCI, etc.) and data privacy constraints is a plus.

Soft Skills & Collaboration

Adept at communicating technical concepts to both engineering and non-technical stakeholders.

Ability to mentor junior team members, champion DevOps culture, and contribute to an inclusive, knowledge-sharing environment.

Education & Experience

Bachelor’s Degree in Computer Science, Information Systems, or equivalent experience.

Several years of progressive DevOps or SRE experience managing large-scale systems in a production environment.

AI/Automation Tooling

Experience or strong interest in leveraging AI-based services or scripts for operational efficiency and faster issue resolution is highly desirable.

What you get in return:

Love your career: Work on innovative projects such as complex, modern platform initiatives—ranging from on-prem container orchestration to advanced CI/CD and DevSecOps solutions in the cloud.

Love your work/life balance: Work flexibly from home, with no daily travel to an office! All you need is a stable internet connection.

Love your team: Enjoy our collaborative culture in a diverse, global team of SRE experts who share knowledge, tackle incidents together, and learn from each other.

Love your development: Pythian cares about continuous learning and provides opportunities to earn certifications (AWS, Kubernetes, Terraform) and expand your skill set across multiple platforms, frameworks, and industries.

Love your workspace: We give you all the equipment you need to work from home including a laptop with your choice of OS, and an annual budget to personalize your work environment!

Love yourself: Pythian cares about the health and well-being of our team. You will have an annual wellness budget to make yourself a priority (use it on gym memberships, massages, fitness and more). Additionally, you will receive a generous amount of paid vacation and sick days, as well as a day off to volunteer for your favorite charity.

Love your customers: You will be focused on highly impactful work, helping clients modernize their infrastructure, automate critical processes, and improve reliability for mission-critical applications.

Why Pythian

Pythian excels at helping businesses use data, analytics, and cloud to transform how they compete and win by delivering advanced on-prem, hybrid, cloud, and multi-cloud solutions. In the early years, we focused on supporting mission-critical operational databases, and as our experts became known for their ability to solve the toughest data challenges, our services grew to meet the rapidly changing needs of our clients, expanding from on-premises to the cloud and from operational to analytics data systems.

A powerful combination of extensive expertise in data and cloud, as well as our ability to keep ahead of the latest bleeding-edge technologies, makes us the perfect partner. We help mid- and large-sized businesses transform to stay ahead in today’s rapidly changing digital economy.

We pride ourselves on our ability to deliver innovative solutions that meet the specific data goals of each client and have built meaningful partnerships with major cloud vendors AWS, Google Cloud, Microsoft, and Oracle.

Intrigued to see what a job is like at Pythian? Check us out @Pythian and #pythianlife.

Follow @PythianJobs on Twitter and @loveyourdata on Instagram!

Not the right job for you? Check out what other great jobs Pythian has open around the world! Pythian Careers

Disclaimer

The successful applicant will need to fulfill the requirements necessary to obtain a background check.

Accommodations are available upon request for candidates taking part in all aspects of the selection process.
