hirist

Data Engineer - Python/Apache Spark

IOWeb3 Technologies
Bangalore
6 - 10 Years

Posted on: 22/08/2025

Job Description :

Responsibilities :

- Lead the design, development, and implementation of high-performance, scalable, and resilient data pipelines for both batch and real-time (stream) data processing.

- Utilize your strong expertise in Apache Spark with Python to build efficient data transformation and processing jobs.

- Ensure data pipeline solutions are fault-tolerant and reliable, guaranteeing data quality and integrity throughout the data lifecycle.

- Work extensively with cloud-native data services, with at least 3 years of hands-on experience with AWS services such as S3, DMS, Redshift, Glue, Lambda, Kinesis, MSK, or equivalent services from Azure/GCP.

- Develop and optimize complex SQL queries and work with various NoSQL technologies to manage and retrieve data efficiently.

- Collaborate with cross-functional teams, including data scientists, analysts, and other engineers, to understand data requirements and deliver impactful data solutions.

- Participate in data modeling, schema design, and data governance initiatives.

- Monitor, troubleshoot, and optimize existing data pipelines and infrastructure for performance and cost efficiency.

- Contribute to best practices for data engineering, ensuring maintainability, scalability, and security of our data platforms.

- Leverage your experience across multiple domains to adapt and apply best practices to diverse data challenges.

What We're Looking For :

- 6 to 10 years of progressive experience building data solutions within Big Data environments.

- A strong ability to build robust, resilient, scalable, fault-tolerant, and reliable data pipelines.

- Mandatory hands-on experience with Apache Spark using Python for both batch and stream data processing.

- Solid knowledge and practical experience in both batch and stream data processing methodologies.

- Demonstrated exposure to working on data projects across multiple domains.

- Strong hands-on capabilities with both SQL and NoSQL technologies.

- At least 3 years of hands-on experience with AWS services like S3, DMS, Redshift, Glue, Lambda, Kinesis, MSK, or similar data-focused services from Azure/GCP.

- Excellent problem-solving, analytical, and debugging skills.

- Ability to work independently and collaboratively in a remote team environment.

- Strong communication and interpersonal skills.



Job Views: 84
Applications: 37
Recruiter Actions: 0

Functional Area

Data Engineering

Job Code

1534205