Job D6063 Description
To apply, please attach your resume to your email.
SOFT's client, located in New York, NY, is looking for an ETL Databricks Engineer for a long-term contract assignment.

Qualifications:  
What we are looking for:  
• Hands-on experience building ETL pipelines on Databricks' SaaS infrastructure.
• Experience developing data pipeline solutions to ingest and exploit new and existing data sources.
• Expertise in SQL, programming languages such as Python, and ETL tools such as Databricks.
• Ability to perform code reviews to ensure requirements are met, execution patterns are optimal, and established standards are followed.
• Degree in Computer Science or equivalent.
• Expertise in AWS Compute (EC2, EMR), AWS Storage (S3, EBS), AWS Databases (RDS, DynamoDB), AWS Data Integration (Glue).  
• Advanced understanding of containerization and orchestration technologies, including Docker and Kubernetes, and a variety of AWS tools and services.
• Good understanding of AWS Identity and Access Management (IAM), AWS networking, and AWS monitoring tools.
• Proficiency in CI/CD and deployment automation using GitLab pipelines.
• Proficiency in cloud infrastructure provisioning tools, e.g., Terraform.
• Proficiency in one or more programming languages, e.g., Python or Scala.
• Experience with Starburst and Trino, and in building SQL queries in a federated architecture.
• Good knowledge of Lakehouse architecture.
• Design, develop, and optimize scalable ETL/ELT pipelines using Databricks and Apache Spark (PySpark and Scala); a brief illustrative sketch follows this list.
• Build data ingestion workflows from various sources (structured, semi-structured, and unstructured).  
• Develop reusable components and frameworks for efficient data processing.  
• Implement best practices for data quality, validation, and governance.  
• Collaborate with data architects, analysts, and business stakeholders to understand data requirements.  
• Tune Spark jobs for performance and scalability in a cloud-based environment.  
• Maintain robust data lake or Lakehouse architecture.  
• Ensure high availability, security, and integrity of data pipelines and platforms.  
• Support troubleshooting, debugging, and performance optimization in production workloads.  
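For illustration, a minimal sketch of the kind of Databricks ETL pipeline described above: it ingests semi-structured JSON from S3 with an explicit schema, applies simple data-quality rules, and appends to a Delta table in a Lakehouse bronze layer. The bucket path, table name, and schema fields are hypothetical, and the Delta format assumes a Databricks runtime (or Spark with the delta-spark package).

# Minimal sketch of a Databricks/PySpark ETL pipeline.
# All paths, table names, and schema fields are hypothetical.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import (StructType, StructField, StringType,
                               DoubleType, TimestampType)

spark = SparkSession.builder.appName("orders-etl-sketch").getOrCreate()

# Explicit schema for the semi-structured JSON input (hypothetical fields).
schema = StructType([
    StructField("order_id", StringType(), nullable=False),
    StructField("customer_id", StringType(), nullable=True),
    StructField("amount", DoubleType(), nullable=True),
    StructField("created_at", TimestampType(), nullable=True),
])

# Ingest: read raw JSON landed in S3 (hypothetical bucket and prefix).
raw = spark.read.schema(schema).json("s3://example-bucket/raw/orders/")

# Validate: basic data-quality rules, deduplicate and drop bad rows.
clean = (
    raw.dropDuplicates(["order_id"])
       .filter(F.col("order_id").isNotNull() & (F.col("amount") > 0))
       .withColumn("ingest_date", F.current_date())
)

# Load: append to a Delta table partitioned by ingest date (bronze layer).
(clean.write.format("delta")
      .mode("append")
      .partitionBy("ingest_date")
      .saveAsTable("bronze.orders"))

Partitioning by ingest date is one common choice for append-only bronze tables; a production pipeline would add the richer validation and governance checks called for above.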

Responsibilities:  
Your role as a Senior Data Engineer  
• Work on migrating applications from on-premises environments to cloud service providers.
• Develop products and services on the latest technologies through contributions to development, enhancement, testing, and implementation.
• Develop, modify, and extend code for building cloud infrastructure, and automate deployments using CI/CD pipelines (see the sketch after this list).
• Partner with business and peers to pursue solutions that achieve business goals through an agile software development methodology.
• Perform problem analysis, data analysis, reporting, and communication.  
• Work with peers across the system to define and implement best practices and standards.  
• Assess applications and help determine the appropriate application infrastructure patterns.  
• Use the best practices and knowledge of internal or external drivers to improve products or services.
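As a hedged sketch of the CI/CD automation mentioned above, the following Python step triggers an existing Databricks job through the Jobs 2.1 REST API, roughly what a GitLab pipeline stage might run after deploying new code. The environment variable names and job ID are assumptions, typically supplied as GitLab CI/CD variables.

# Hedged sketch of one CI/CD automation step: triggering a Databricks job
# after deployment via the Jobs 2.1 REST API. Variable names and the job ID
# are hypothetical and would normally come from GitLab CI/CD variables.
import os
import requests

DATABRICKS_HOST = os.environ["DATABRICKS_HOST"]    # e.g. https://<workspace>.cloud.databricks.com
DATABRICKS_TOKEN = os.environ["DATABRICKS_TOKEN"]  # access token from CI secrets
JOB_ID = int(os.environ["ETL_JOB_ID"])             # hypothetical job identifier

def trigger_job(job_id: int) -> int:
    """Kick off a one-time run of an existing Databricks job; returns run_id."""
    resp = requests.post(
        f"{DATABRICKS_HOST}/api/2.1/jobs/run-now",
        headers={"Authorization": f"Bearer {DATABRICKS_TOKEN}"},
        json={"job_id": job_id},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["run_id"]

if __name__ == "__main__":
    print(f"Started run {trigger_job(JOB_ID)}")

In a GitLab pipeline this would run as a script step in a deploy stage, with the token stored as a masked CI/CD variable.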