Data Engineer

GetLinks partner

Dubai, United Arab Emirates

Negotiable

Job Description


Infrastructure Management:

● Design, develop, and maintain robust and scalable data pipelines to handle large datasets using both on-premise and cloud platforms (e.g., AWS, GCP, Azure).

● Implement and manage data storage solutions, including databases and data lakes, ensuring data integrity and performance.

Data Integration:

● Integrate data from various internal and external sources such as databases, APIs, flat files, and streaming data.

● Ensure data consistency, quality, and reliability through rigorous validation and transformation processes.

ETL Development:

● Develop and implement ETL (Extract, Transform, Load) processes to automate data ingestion, transformation, and loading into data warehouses and lakes.

● Optimize ETL workflows to ensure efficient processing and minimize data latency.

Data Quality & Governance:

● Implement data quality checks and validation processes to ensure data accuracy and completeness.

● Develop data governance frameworks and policies to manage data lifecycle, metadata, and lineage.

Collaboration and Support:

● Work closely with data scientists, AI engineers, and developers to understand their data needs and provide technical support.

● Facilitate effective communication and collaboration between the AI and data teams and other technical teams.

Continuous Improvement:

● Identify areas for improvement in data infrastructure and pipeline processes.

● Stay updated with the latest industry trends and technologies related to data engineering and big data.


Job Requirements


Education:

● Bachelor’s degree in Computer Science, Engineering, Data Science, or a related field. A Master’s degree is a plus.

Experience:

● 3–5 years of experience in data engineering or a similar role.

● Proven experience with on-premise and cloud platforms (AWS, GCP, Azure).

● Strong background in data integration, ETL processes, and data pipeline development.

● Experience leading the design and development of high-performance AI and data platforms, including IDEs, permission management, data pipelines, code management, and model deployment systems.

Skills:

● Proficiency in scripting and programming languages (e.g., Python, SQL, Bash).

● Strong knowledge of data storage solutions and databases (e.g., SQL, NoSQL, data lakes).

● Experience with big data technologies (e.g., Apache Spark, Hadoop).

● Experience with CI/CD tools (e.g., Jenkins, GitLab CI, CircleCI).

● Understanding of data engineering and MLOps methodologies.

● Awareness of security best practices in data environments.

● Excellent problem-solving skills and attention to detail.

Preferred Qualifications:

● Hands-on experience managing an on-premise Spark cluster, covering both deployment and day-to-day big data processing.

Contact us

1 - Minh Anh Le (Tina)
Email: [email protected]
Tel: +84 97 630 61 49
Skype: lengminhanh91