Senior Site Reliability Engineer (Remote, US) at Collective[i]

Remote: Full Remote
Salary: 38 - 192K yearly
Experience: Senior (5-10 years)

Offer summary

Qualifications:

  • Proficient in AWS and Terraform
  • Expertise in Linux distributions and scripting
  • Experience with containerization and Kubernetes
  • Familiarity with CI/CD tools like GitHub Actions
  • Knowledge of monitoring tools like DataDog

Key responsibilities:

  • Manage AWS infrastructure using Terraform
  • Develop and implement containerization strategies
  • Collaborate with development teams for system optimization
  • Implement monitoring and logging solutions
  • Proactively manage system stability and reliability
Collective[i] SME https://www.collectivei.com/
51 - 200 Employees

Job description

At Collective[i], we value diversity of experience, knowledge, backgrounds and people who share a commitment to building a company and community on a mission to help people be more prosperous. We recruit extraordinary individuals and provide them the platform to contribute their exceptional talents and the freedom to work from wherever they choose. Our company is a wonderful place to learn and grow alongside an incredible and tenacious team. 

Collective[i] was founded by three entrepreneurs with over $1B of prior exits. Their belief in the power of Artificial Intelligence to transform life as we know it and improve economic outcomes at massive scale drove the decision to invest over $100m in the company. The result is a state-of-the-art platform for prosperity that helps companies generate sales and helps people expand their professional connections. In the last decade, Collective[i] has grown into a powerful community of scientists, engineers, creative talent and more, working together to help people succeed in business.

We are seeking a skilled and motivated professional to join our team as a Senior Site Reliability Engineer. If you have hands-on experience with AWS in roles such as Site Reliability Engineer (SRE), DevOps Engineer, Cloud Administrator, Platform Engineer, Systems Analyst, or Systems Engineer—or any related role where you’ve managed cloud infrastructure—this opportunity could be a great fit for you.

Responsibilities
  • Manage AWS infrastructure across multiple accounts using Terraform, drawing on extensive experience in deployment and automation.
  • Utilize Linux and open-source tooling as the foundation of your work, being proficient across various Linux distributions, scripting languages, clustering technologies, database engines, and configuration management tools, with a preference for Ansible.
  • Develop and implement containerization strategies, ensuring well-crafted container builds. Must be capable of creating original containers and not just relying on third-party containers from public repositories.
  • Apply Kubernetes knowledge selectively, understanding when and why it is appropriate to use. Note that we are not a Kubernetes-focused environment.
  • Collaborate closely with development teams, providing support in building and optimizing distributed systems.
  • Maintain expertise in Git workflows, including proficiency in CI/CD automation tools such as GitHub Actions.
  • Implement and manage monitoring and logging solutions, with hands-on experience in tools like DataDog and OpenTelemetry.
  • Proactively manage system stability and reliability to reduce the need for log diving, incident response, root cause analysis, and late-night pages.


Requirements
  • Proficiency with AWS, Terraform, Packer, Ansible, and container technologies.
  • Expertise in AWS services.
  • Experience with other cloud providers is a plus.
  • Strong knowledge of Ubuntu 24.04, Bash, Python, systemd, Podman, Docker, and auditd.
  • Familiarity with GitHub, GitHub Actions, GitHub Container Registry, and Copilot.
  • Experience with monitoring and logging tools like DataDog, OpenTelemetry, and Graylog.
  • Proficiency in working with databases and platforms such as Snowflake, Okta, Postgres, MongoDB, and Elasticsearch.
  • Familiarity with security tools like Snyk, Tenable.io, and 1Password.
  • Experience with SOC 2 or other compliance standards is highly desirable.

Who you are working for: About Collective[i]


    Collective[i] is on a mission to help people and companies prosper. Backed by over 20 patents and developed by a team of world-renowned entrepreneurs, engineers, scientists, and business leaders, Collective[i] is an Economic Foundation Model (“EFM”) that studies how the world does business. Collective[i]’s advisors include a world-renowned economist, the former Vice Chair of the Federal Reserve, the founders of Comcast, Instagram, and MySQL, and former executives from Tesla, News Corp, USA Networks, and others.

    Harnessing insights from more than a decade of data collection, our EFM has been trained on trillions of dollars of data to unearth successful buying and selling patterns. With Collective[i], any person or company can plug in their own data and receive customized insights that help them maximize economic opportunity and adapt to changing market conditions.

    Founded and managed by the early teams behind LinkShare (purchased for $425m) and Overstock (NASDAQ:OSTK), Collective[i] is a private 100% remote company.

    Our core values help shape our culture: We are curious. We are direct. We deliver. We succeed together. We strive for the extraordinary. If you enjoy a challenge, thrive in an innovative environment and welcome the opportunity to work with amazing humans operating on the bleeding edge of technology, Collective[i] is the place for you.


    Information about the founders:
    Tad Martin
    Stephen Messer
    Heidi Messer
