Job D5677: To apply, please attach your resume to your email.
SOFT's client, located in New York, NY (Hybrid), is looking for a Data Engineer - Databricks / AWS for a long-term contract assignment.

Qualifications:  
• Minimum of 5 years of experience in data engineering roles, with a focus on AWS and Databricks.  
• Highly proficient with Databricks, Spark, Starburst/Trino, Python, PySpark, and SQL.
• Hands-on experience with GitLab CI/CD.
• Hands-on experience with AWS services such as S3, RDS, Lambda, SQS, SNS, and MSK is required.
• Strong SQL skills for performing data analysis and understanding source data.
• Experience with data pipeline orchestration tools.

Responsibilities:
• Design, develop, monitor, and maintain data pipelines in an AWS ecosystem, with Databricks, Delta Lake, Python, SQL, and Starburst as the technology stack.
• Collaborate with cross-functional teams to understand data needs and translate them into effective data pipeline solutions.
• Establish data quality checks and ensure data integrity and accuracy throughout the data lifecycle.  
• Automate testing of the data pipelines and configure it as part of CI/CD.
• Optimize data processing and query performance for large-scale datasets within AWS and Databricks environments.  
• Document data engineering processes, architecture, and configurations.  
• Troubleshoot and debug data-related issues on the AWS Databricks platform.
• Integrate Databricks with other AWS services such as SNS, SQS, and MSK.
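As a rough illustration of the data quality checks mentioned in the responsibilities above, the sketch below validates a batch of rows in plain Python. All column names and rules here are hypothetical examples (not from this posting), and a real implementation would typically run such checks inside the Databricks/PySpark pipeline rather than on Python dictionaries.

```python
# Minimal, illustrative data-quality check: flag missing ids, duplicate ids,
# and invalid amounts. Field names and rules are hypothetical.

def check_rows(rows):
    """Return a list of (row_index, problem) pairs for rows failing checks."""
    problems = []
    seen_ids = set()
    for i, row in enumerate(rows):
        # Completeness and uniqueness checks on the (hypothetical) id field.
        if row.get("id") is None:
            problems.append((i, "missing id"))
        elif row["id"] in seen_ids:
            problems.append((i, "duplicate id"))
        else:
            seen_ids.add(row["id"])
        # Validity check on the (hypothetical) amount field.
        amount = row.get("amount")
        if not isinstance(amount, (int, float)) or amount < 0:
            problems.append((i, "invalid amount"))
    return problems

rows = [
    {"id": 1, "amount": 10.0},
    {"id": 1, "amount": -5.0},    # duplicate id and negative amount
    {"id": None, "amount": 3.0},  # missing id
]
print(check_rows(rows))  # → [(1, 'duplicate id'), (1, 'invalid amount'), (2, 'missing id')]
```

In a pipeline, rows that fail such checks would typically be quarantined to a separate table or raised as alerts rather than silently dropped, preserving data integrity across the lifecycle.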