About Us
Outspeed is solving one of the biggest challenges with current AI systems: latency. We are building the infra for real-time AI applications in gaming, AR/VR, robotics, and more.
Outspeed is led by a team of researchers and engineers with collective experience from MIT, Google, and Microsoft. Our team is based in San Francisco and values empathy, deep technical knowledge, and autonomy.
Janak earned his bachelor's and master's degrees from MIT and built grid-scale AI algorithms and infra at Autogrid (acquired by Schneider). Sahil is a published researcher who led AI infra efforts at Google and Microsoft.
Role Description
As an early Member of Technical Staff, you’ll dive into every layer of Outspeed’s real-time AI platform—core inference engines, orchestration services, developer APIs, and customer-facing tools. You’ll alternate between rapid prototyping and hardening production systems, shipping code that is immediately exercised by teams around the world.
Typical weeks might include:
Owning end-to-end feature work—from RFC and design review to infrastructure-as-code, monitoring, and post-launch iteration.
Optimizing GPU inference pipelines, extending our TypeScript/React console, and designing new audio-streaming protocols.
Pairing with customers to debug latency spikes, then upstreaming the fix into our orchestration layer.
Influencing engineering culture: instituting best practices, mentoring newer hires, and shaping our technical roadmap alongside the founders.
This role is perfect for an engineer who enjoys breadth, thrives on context-switching, and wants their fingerprints on everything we build.
Benefits
Competitive salary + equity.
Health, dental, and vision insurance.
Bonuses based on performance.
We are a company founded by immigrants, and we are committed to supporting immigrant workers throughout their journey.
Requirements
2+ years of professional software engineering, with deep proficiency in at least one systems language (Go/Rust/C++) and one high-level language (Python/TypeScript).
Proven track record shipping production-grade distributed systems—you’ve designed, implemented, and operated services that run at scale and stay up.
Hands-on experience with end-to-end ML workflows: data pipelines, model training/tuning, packaging, and low-latency serving (PyTorch + CUDA or similar).
Strong product sense and the ability to translate ambiguous problems into well-scoped engineering work.
Excellent written & verbal communication; comfortable leading design docs and code reviews in a fast-moving, asynchronous environment.
Nice-to-haves
Expertise running containerized workloads with Docker, Kubernetes, or Nomad, plus IaC with Terraform or Pulumi.
Knowledge of public-cloud primitives (SDN, block storage, secrets/identity, spot fleet management) and how to squeeze every dollar out of them.
Familiarity with modern inference tooling (e.g., vLLM, Ray, Triton, llama.cpp) and a solid mental model of GPU/TPU performance bottlenecks.
A history of thriving in early-stage startups: rapid iteration, “build → measure → learn” loops, and wearing multiple hats at once.