As a Data Engineer, you will play a key role in designing, developing, and maintaining end-to-end data solutions in the Enterprise Data Platform. You will be responsible for building scalable, efficient, reliable, and reusable data pipelines and analytic applications, and your expertise will span both backend and frontend development. You will collaborate closely with cross-functional teams to understand business requirements, architect data solutions, and implement best practices for data engineering and analytics.
Duties and Responsibilities
Data Design, Ingestion, Integration, and Processing
- Design, develop, and implement data solutions, including data ingestion pipelines to acquire data from various source systems, along with storage, processing, and visualization as needed, leveraging Azure Data Platforms and Databricks.
- Design and define data models, schemas, and structures to support analytical and reporting needs, ensuring scalability, performance, and data integrity. Configure and manage data storage solutions to store structured, semi-structured, and unstructured data. Optimize data storage and retrieval mechanisms for performance, cost efficiency, and compliance with data governance standards and regulations.
- Design and implement data integration, ingestion, transformation, migration, and extraction processes, including effectively mapping and converting data from source systems to target systems to ensure precision and uniformity of migrated and integrated data.
- Design, upgrade, and implement new data workflows, automation, tools, and API integrations.
Implementation Expectations
- Perform unit and integration testing. Provide support during UAT, user training, and pre- and post-implementation activities.
- Support executive analysis needs, including ad-hoc reporting and analysis, dashboard creation, data model design, etc.
- Collaborate with data scientists, analysts, and other stakeholders to support their data needs and to ensure that data is properly integrated and aligned with business requirements.
- Create technical documents such as solution design, program specifications, and other required documentation in the SDLC process.
DevOps and Automation
- Implement DevOps practices and automation workflows to streamline the development, deployment, and monitoring of data pipelines and applications. Collaborate with the DevOps team to define CI/CD pipelines, automate testing, and ensure the reliability and scalability of data solutions.
Performance Optimization and Tuning
- Monitor and optimize the performance of data pipelines and analytics applications, identifying bottlenecks, optimizing query performance, and tuning Spark jobs for efficiency.
- Implement caching, partitioning, indexing, and other strategies to improve data processing speed and reduce latency in workloads.
Controls and Compliance
- Implement security controls and encryption mechanisms to protect sensitive data stored and processed in the platform.
- Ensure compliance with industry standards, bank-wide standards, and other regulatory requirements.
Others
- Provide support for maintenance and operations of the environment as needed (e.g., updates, patches).
- Manage third-party vendors and ensure quality delivery where applicable.
- Conduct knowledge sharing sessions and training workshops to educate team members and other stakeholders.
- Stay up to date with industry developments and trends, and recommend and implement new technologies and approaches to improve data engineering processes.
Education
- Bachelor's degree in Computer Science, Engineering, or a related field
Work Experience
- Minimum of 3-5 years of experience as a Data Engineer or Software Engineer with a focus on data engineering and analytics
Technical Skills
- Experience in creating data pipelines and developing complex and optimized queries
- At least 2 years of experience with Databricks and the Microsoft Azure Data Platform
- Experience in SQL, Python, Shell, and Scala, as well as experience with data manipulation and processing using Apache Spark and Databricks
- Strong understanding of data architecture principles, data modeling techniques, and data integration patterns, with experience in designing scalable and efficient data pipelines
Skills
- Experience with DevOps practices and CI/CD pipelines such as Azure DevOps is a strong advantage
- Familiarity with data visualization and reporting tools such as Power BI is a strong advantage