With offices in the United States, Latin America, India, the United Kingdom, and Europe, Herzum is an international consulting group with a successful track record of helping enterprises innovate and align business and IT while reducing IT and development costs. A leading authority in IT strategy, enterprise architecture, agile software development and integration, and training and mentoring, Herzum is particularly known for its cosourcing centers and its COSM approach. Through a network of strategic alliances with service and product companies, Herzum handles projects and organizations from the start-up phase to the Fortune 100 level.
Herzum is seeking a Data Engineer to join our team and support one of our Clients.
Key responsibilities:
Design and implement data pipelines in collaboration with our data users and other engineering teams.
Ensure reliability, data quality and optimal performance of our data assets.
Transpose complex business and analytics requirements into high-quality data assets.
Deliver high-quality code, focusing on simplicity, performance and maintainability.
Where applicable, re-design and implement existing data pipelines to leverage the newest technologies and best practices.
Work with solution engineers, data scientists, and product owners to deliver end-to-end products.
Support our partners with proof-of-concept initiatives and data-related technical questions.
Required skills:
Excellent software/data engineering skills and proficiency in at least one programming language, Python preferred.
Good knowledge of distributed computing frameworks: PySpark is a must-have, and experience with others is an advantage.
Familiarity with system design, data structures, algorithms, storage systems and cloud infrastructure.
Understanding of data modeling and data architecture concepts.
Experience with CI/CD processes as well as data testing and monitoring.
Knowledge of Delta Lake protocol and Lakehouse architectures.
Experience with Databricks and Azure data services, such as Azure Data Factory, Synapse Analytics, or Fabric.
Additional skills:
Ability to work effectively in teams with both technical and non-technical individuals.
Ability to communicate complex technical concepts and results in a clear and detailed manner to non-technical audiences.
Excellent verbal and written communication skills.
Proficient in English.
Experience with YAML is a plus.
Work mode: fully remote
Join Us! Become part of a team driven by innovation, belief in talent, and a commitment to excellence. Your next career step starts here.
This announcement is addressed to both sexes, in accordance with Laws 903/77 and 125/91, and to people of all ages and nationalities, in accordance with Legislative Decrees 215/03 and 216/03.