Offer summary

Qualifications:

Experience developing ETL pipelines with Apache Spark and Python.

Solid knowledge of AWS services such as S3, RDS (PostgreSQL), IAM, Glue, EMR, and Lambda.

Advanced SQL skills and experience with version control and CI/CD workflows.

Fluent English; knowledge of real-time data processing and financial data models is desirable.

Key responsibilities:

Design, build, and maintain ETL pipelines using Apache Spark following best practices.

Perform complex transformations, advanced calculations, and joins on large data volumes.

Ensure system quality, consistency, and performance through validations and automated testing.

Document processes and collaborate with cloud engineers for smooth integration with AWS infrastructure.

Job description

Responsibilities:

Design, build, and maintain ETL pipelines with Apache Spark, applying the best practices established by the Data Engineering Lead.
Perform complex transformations, advanced calculations, and joins on large data volumes.
Ensure system quality, consistency, and performance through validations and automated testing.
Document processes and keep the technical architecture up to date.
Collaborate with cloud engineers to ensure smooth integration with AWS infrastructure.

Technical requirements:

Experience developing ETL pipelines with Apache Spark and Python.
Solid knowledge of AWS services: S3, RDS (PostgreSQL), IAM, Glue, EMR, Lambda.
Advanced SQL.
Experience with version control and CI/CD workflows.