Offer summary
Qualifications:
- Strong experience with application servers
- Knowledge of automation tools (Terraform preferred)
- Good knowledge of networking and Internet architecture
- 2+ years of experience in large-scale incident resolution
- 3+ years of programming experience (Python, Bash, etc.)
Key responsibilities:
- Improve the lifecycle of services, from design to refinement
- Monitor system health and establish performance baselines
- Design and maintain logging and telemetry systems
- Automate manual operational tasks through coding
- Troubleshoot incidents and conduct post-mortems