Principal Data Engineer

Remote: Full Remote

Offer summary

Qualifications:

  • 10+ years in data engineering or backend software development
  • 5+ years building distributed systems in production environments
  • Strong experience in event-driven architectures and messaging systems
  • Proven track record of delivering large-scale, reliable data systems

Key responsibilities:

  • Lead architecture and development for the Assembly Orchestrator’s data systems
  • Mentor a team of engineers and provide technical guidance and code reviews
  • Build distributed data processing systems using Java and Python
  • Implement scalable storage and high-performance access strategies

Assembly Talent Startup https://assembly-industries.com/
11 - 50 Employees

Job description

Job Title: Principal Data Engineer

 

About Assembly Industries:

Talent is distributed, but opportunities are not. Assembly Industries is breaking that pattern by building an AI-enabled talent platform that connects top-tier, highly skilled global professionals with innovative companies across the US. As a fast-growing startup, we are focused on impactful growth, agile strategies, and exceptional results.

 

Role Overview:

We are seeking a Principal Data Engineer to lead the implementation of our Assembly Orchestrator platform, a sophisticated system for managing and executing Standard Operating Processes (SOPs). This role requires deep technical expertise in distributed systems, data engineering, and software architecture, with a focus on building scalable, reliable, and secure data processing pipelines.

 

Key Responsibilities:

Technical Leadership
  • Lead architecture and development for the Assembly Orchestrator’s data systems
  • Define technical standards and best practices for data engineering and system design
  • Mentor a team of engineers and provide technical guidance and code reviews
  • Drive architecture discussions aligned with enterprise data governance and security standards
Core Development
  • Build distributed data processing systems using Java and Python
  • Design event-driven architectures with Apache Kafka
  • Develop workflow orchestration logic with Temporal
  • Create resilient and scalable data pipelines for SOP execution
  • Enable real-time analytics and data feedback loops
System Architecture
  • Implement scalable storage and high-performance access strategies
  • Build observability into systems using Prometheus, Grafana, and ELK
  • Integrate robust security measures, role-based access, and disaster recovery plans
  • Own infrastructure design for deployment on AWS (EKS, S3, RDS) and Kubernetes
Technical Requirements:
  • Languages & Frameworks: Java (17+ with Spring Boot), Python (3.9+ with FastAPI/Django), TypeScript/JavaScript
  • Data & Workflow Tools: Apache Kafka, Apache Spark, Temporal, PostgreSQL, Elasticsearch
  • Infrastructure: AWS, Kubernetes, Docker, Terraform
  • Monitoring & Observability: Prometheus, Grafana, Jaeger, ELK Stack
  • Security & Compliance: SOC 2, ISO 27001, Secret Management, Key Rotation
 
Minimum Qualifications
  • 10+ years in data engineering or backend software development
  • 5+ years building distributed systems in production environments
  • 3+ years leading or mentoring technical teams
  • Proven track record of delivering large-scale, reliable data systems
  • Strong experience in event-driven architectures and messaging systems
     
Preferred Qualifications
  • Experience with workflow engines like Temporal or Airflow
  • Understanding of business process automation and BPM tools
  • Background in real-time data streaming and analytics
  • Contributions to open-source projects
  • Familiarity with machine learning data pipelines
 
Why Join Assembly
  • Work with a global team building infrastructure for next-generation work
  • Remote-first company with a flexible work culture
  • Competitive compensation and opportunity for long-term impact
  • Exposure to cutting-edge data architecture and engineering challenges
 

This is a remote role open to candidates based in Pakistan.

 



 

Required profile

Experience

Spoken language(s):
Tsonga

Other Skills

  • Mentorship
