Senior Site Reliability Engineer (Remote, US) at Collective[i]

Remote:

Full Remote

Contract:

Full time

Salary:

38 - 192K yearly

Experience:

Senior (5-10 years)

Work from:

United States

Offer summary

Qualifications:

Proficient in AWS and Terraform
Expertise in Linux distributions and scripting
Experience with containerization and Kubernetes
Familiarity with CI/CD tools like GitHub Actions
Knowledge of monitoring tools like DataDog

Key responsibilities:

Manage AWS infrastructure using Terraform
Develop and implement containerization strategies
Collaborate with development teams for system optimization
Implement monitoring and logging solutions
Proactively manage system stability and reliability

Collective[i] SME https://www.collectivei.com/

51 - 200 Employees

Job description

At Collective[i], we value diversity of experience, knowledge, backgrounds and people who share a commitment to building a company and community on a mission to help people be more prosperous. We recruit extraordinary individuals and provide them the platform to contribute their exceptional talents and the freedom to work from wherever they choose. Our company is a wonderful place to learn and grow alongside an incredible and tenacious team.

Collective[i] was founded by three entrepreneurs with over $1B of prior exits. Their belief in the power of Artificial Intelligence to transform life as we know it and improve economic outcomes at massive scale drove the decision to invest over $100m in the company, which has created a state-of-the-art platform for prosperity that helps companies generate sales and helps people expand their professional connections. In the last decade, Collective[i] has grown into a powerful community of scientists, engineers, creative talent and more, working together to help people succeed in business.

We are seeking a skilled and motivated professional to join our team as a Senior Site Reliability Engineer. If you have hands-on experience with AWS in roles such as Site Reliability Engineer (SRE), DevOps Engineer, Cloud Administrator, Platform Engineer, Systems Analyst, or Systems Engineer—or any related role where you’ve managed cloud infrastructure—this opportunity could be a great fit for you.

Responsibilities

Manage AWS infrastructure across multiple accounts using Terraform, applying extensive experience in deployment and automation.

Utilize Linux and open-source tooling as the foundation of your work, being proficient across various Linux distributions, scripting languages, clustering technologies, database engines, and configuration management tools, with a preference for Ansible.

Develop and implement containerization strategies, ensuring well-crafted container builds. Must be capable of creating original containers and not just relying on third-party containers from public repositories.

Assess and apply Kubernetes knowledge selectively, understanding when and why it is appropriate to use. Note that we are not a Kubernetes-focused environment.

Collaborate closely with development teams, providing support in building and optimizing distributed systems.

Maintain expertise in Git workflows, including proficiency in CI/CD automation tools such as GitHub Actions.

Implement and manage monitoring and logging solutions, with hands-on experience in tools like DataDog and OpenTelemetry.

Proactively manage system stability and reliability to minimize the need for log diving, incident response, root cause analysis, and late-night pages.

Requirements

Proficiency with AWS, Terraform, Packer, Ansible, and container technologies.

Expertise in AWS services.

Experience with other cloud providers is a plus.

Strong knowledge of Ubuntu 24.04, Bash, Python, systemd, podman, docker, and auditd.

Familiarity with GitHub, GitHub Actions, GitHub Container Registry, and Copilot.

Experience with monitoring and logging tools like DataDog, OpenTelemetry, and Graylog.

Proficiency in working with databases and platforms such as Snowflake, Okta, Postgres, MongoDB, and ElasticSearch.

Familiarity with security tools like Snyk, Tenable.io, and 1Password.

Experience with SOC 2 or other compliance standards is highly desirable.

Who you are working for - About Collective[i]:

Collective[i] is on a mission to help people and companies prosper. Backed by over 20 patents and developed by a team of world-renowned entrepreneurs, engineers, scientists, and business leaders, Collective[i] is an Economic Foundation Model (“EFM”) that studies how the world does business. Collective[i]’s advisors include a world-renowned economist, the former Vice Chair of the Federal Reserve, founders of Comcast, Instagram, MySQL, and former executives from Tesla, NewsCorp, USANetworks, and others.

Harnessing insights from more than a decade of data collection, our EFM has been trained on trillions of dollars of data to unearth successful buying and selling patterns. With Collective[i], any person or company can plug in their own data and receive customized insights that help them maximize economic opportunity and adapt to changing market conditions.

Founded and managed by the early teams behind LinkShare (purchased for $425m) and Overstock (NASDAQ:OSTK), Collective[i] is a private 100% remote company.

Our core values help shape our culture: We are curious. We are direct. We deliver. We succeed together. We strive for the extraordinary. If you enjoy a challenge, thrive in an innovative environment and welcome the opportunity to work with amazing humans operating on the bleeding edge of technology, Collective[i] is the place for you.

Recent press:

Forbes: Stephen Messer: Amazon Missed The AI Boom

CNBC: Harvard professor on A.I. job risks: We need to upskill and update business models

ZDNet: Why open source is essential to allaying AI fears

Information about the founders:

Tad Martin

Stephen Messer

Heidi Messer