Big Data + Azure Databricks
Primary Skills:
Proficiency with Hadoop ecosystem technologies, including HDFS and Spark.
Good knowledge of Big Data querying tools such as Hive and Impala.
Working knowledge of data ingestion using Spark, supporting file formats such as JSON, CSV, and Parquet along with their compression techniques.
Ability to analyze vast data stores and uncover insights.
Experience in shell scripting.
Data Engineer with hands-on experience executing data projects on Azure (migration or build).
Azure Databricks, ADF, Power BI, Git, and Azure DevOps are must-have skills.
Secondary Skills:
Strong programming skills in Python
Roles and Responsibilities:
Hadoop production support and implementation.
Translate complex functional and technical requirements into detailed design.
Loading data from disparate data sets and processing it per business needs.
Organizing data into tables, performing transformations, tuning performance, and simplifying complex queries with PySpark.
Monitoring data performance and modifying infrastructure as needed.
Maintain security and data privacy.
Propose best practices/standards.
Work on the transformation roadmap from the Hadoop stack to Azure Databricks.