Data Engineer
Id: AT0019
2023-02-07
• Gathering requirements to understand business requests and needs in order to provide the best possible solution.
• Executing design, development, and deployment of ETL applications; preparing High-Level Design and Low-Level Design documents; and developing interfaces, packages, and load plans.
• Leading the team in handling complex business rules while meeting stringent performance requirements.
• Performed end-to-end delivery of PySpark, Spark SQL, Azure Data Warehouse (ADW), CI/CD, and production support.
• Wrote Hive SQL scripts to create complex tables tuned with performance techniques such as partitioning, clustering (bucketing), and skew handling; see the sketch after this list.
• Worked with Google Data Catalog and other Google Cloud APIs for monitoring, query, and billing-related analysis of BigQuery usage; a hedged example follows this list.
• Designing and developing code, scripts and data pipelines that leverage structured and unstructured data integrated from multiple sources.
• Implementing data warehouse solutions consisting of ETLs and on-premises-to-cloud migrations, and building and deploying batch and streaming data pipelines in cloud environments (a streaming sketch appears below).
• Investigating data quality issues and producing presentable narratives on the biases that incomplete data can introduce.
• Monitoring project progress and outstanding issues, ensuring the quality of deliverables by conducting daily defect-review meetings, and extending post-implementation support to team members by defining standard practices.
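The table-tuning bullet above mentions partitioning, clustering, and skew handling in Hive SQL. The sketch below expresses the same idea through the PySpark API rather than raw HiveQL, assuming Spark 3.x with a Hive metastore; the source path, database, table, and column names are illustrative, not taken from the actual project.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("partitioned-bucketed-table-sketch")
    .enableHiveSupport()                                    # assumes a Hive metastore is configured
    .config("spark.sql.adaptive.enabled", "true")
    .config("spark.sql.adaptive.skewJoin.enabled", "true")  # Spark 3.x runtime skew mitigation for joins
    .getOrCreate()
)

# Hypothetical raw input; in the original work this would be the complex source data.
orders = spark.read.parquet("/data/raw/orders")

(
    orders.write
    .mode("overwrite")
    .partitionBy("order_date")            # partition pruning on the date column
    .bucketBy(32, "customer_id")          # clustering/bucketing to speed up joins and sampling
    .sortBy("customer_id")
    .format("parquet")
    .saveAsTable("analytics.orders_fct")  # registered in the metastore like a Hive table
)
```

In native HiveQL the same layout would typically be declared with PARTITIONED BY, CLUSTERED BY ... INTO n BUCKETS, and SKEWED BY clauses.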
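For the BigQuery monitoring and billing analysis mentioned above, a minimal sketch with the google-cloud-bigquery Python client could look like the following; the project id and the seven-day window are assumptions, and the query relies on BigQuery's INFORMATION_SCHEMA.JOBS_BY_PROJECT view.

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-analytics-project")  # hypothetical project id

# JOBS_BY_PROJECT exposes per-job metadata, including bytes billed,
# which can be aggregated for a quick per-user cost review.
sql = """
    SELECT
        user_email,
        SUM(total_bytes_billed) / POW(1024, 4) AS tib_billed
    FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
    WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
      AND job_type = 'QUERY'
    GROUP BY user_email
    ORDER BY tib_billed DESC
"""

for row in client.query(sql).result():
    print(f"{row.user_email}: {row.tib_billed:.3f} TiB billed over the last 7 days")
```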
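As a hedged illustration of the streaming side of the pipeline work described above, the sketch below reads events from Kafka and lands them as Parquet on cloud storage with Structured Streaming; the broker address, topic, schema, and storage paths are placeholders, and the Kafka source additionally requires the spark-sql-kafka connector on the classpath.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("streaming-ingest-sketch").getOrCreate()

# Illustrative event schema; field names are assumptions, not the real feed.
schema = StructType([
    StructField("order_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

events = (
    spark.readStream
    .format("kafka")                                   # needs the spark-sql-kafka connector
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder endpoint
    .option("subscribe", "orders")
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

query = (
    events.writeStream
    .format("parquet")
    .option("path", "abfss://curated@storageacct.dfs.core.windows.net/orders/")           # placeholder path
    .option("checkpointLocation", "abfss://curated@storageacct.dfs.core.windows.net/_chk/orders/")
    .outputMode("append")
    .trigger(processingTime="1 minute")                # micro-batch every minute
    .start()
)
query.awaitTermination()
```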
Qualification:
This position requires a minimum of a bachelor's degree in Computer Science, Computer Information Systems, or Information Technology, or a combination of education and experience equating to the U.S. equivalent of a bachelor's degree in one of the aforementioned fields.