About the Role
We are seeking a highly skilled Spark & Delta Lake Developer to join a critical data engineering initiative for a renowned global banking firm. This is a long-term engagement with the potential to convert to a full-time role. You will contribute to the design and implementation of high-performance, scalable data solutions in a cloud-based data lakehouse environment.
Key Responsibilities
- Design and develop high-performance data pipelines using Apache Spark for both batch and real-time data processing (an illustrative sketch follows this list)
- Leverage Delta Lake architecture to build robust and scalable data solutions
- Optimize Spark jobs and pipelines for performance, scalability, and reliability
- Implement best practices for data ingestion, transformation, and storage in a cloud-optimized data lake environment
- Collaborate with data architects, DevOps teams, and platform engineers to deliver enterprise-grade data solutions
- Ensure data quality, governance, and security across all pipelines and workflows
- Participate in code reviews, documentation, and continuous integration/deployment processes
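For context on the kind of work described above, here is a minimal, illustrative PySpark sketch of a batch and streaming Delta Lake ingestion pipeline. All bucket names, paths, and columns (such as `ingest_date`) are hypothetical placeholders, and the session configuration assumes the open-source delta-spark package; this is a sketch, not a description of the client's actual environment.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Spark session with open-source Delta Lake support (assumes delta-spark
# is on the classpath; these are the standard Delta config keys).
spark = (
    SparkSession.builder
    .appName("transactions-ingest-sketch")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Hypothetical source and target locations.
raw_path = "s3://example-bucket/raw/transactions/"
delta_path = "s3://example-bucket/delta/transactions/"

# Batch: read raw JSON, stamp an ingestion date, write partitioned Delta output.
(spark.read.json(raw_path)
      .withColumn("ingest_date", F.current_date())
      .write
      .format("delta")
      .mode("append")
      .partitionBy("ingest_date")
      .save(delta_path))

# Real-time: stream the same feed into the Delta table with checkpointing.
schema = spark.read.json(raw_path).schema  # infer the schema once in batch
(spark.readStream.schema(schema).json(raw_path)
      .withColumn("ingest_date", F.current_date())
      .writeStream
      .format("delta")
      .option("checkpointLocation", delta_path + "_checkpoints/")
      .outputMode("append")
      .start(delta_path))
```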
Required Skills and Qualifications
- 5–8 years of experience in big data and distributed computing
- Expertise in Apache Spark (RDD, DataFrame, Spark SQL, Spark Streaming)
- Hands-on experience with Delta Lake and data lakehouse architecture
- Strong programming skills in Scala or Python (PySpark)
- Proficiency in data modeling, partitioning, and performance tuning (see the sketch after this list)
- Familiarity with cloud platforms such as AWS, Azure, or GCP
- Understanding of CI/CD pipelines and DevOps practices in data engineering
- Knowledge of data governance, security, and compliance frameworks
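As a rough indication of the Delta Lake and performance-tuning skills listed above, the following PySpark sketch shows an upsert (MERGE) and a partition-pruned read. The table paths, the `txn_id` join key, and the `ingest_date` partition column are hypothetical and continue the earlier example; it assumes a Delta-enabled SparkSession and the delta-spark Python package.

```python
from pyspark.sql import SparkSession, functions as F
from delta.tables import DeltaTable

# Reuse (or create) a Delta-enabled SparkSession, configured as in the sketch above.
spark = SparkSession.builder.getOrCreate()

delta_path = "s3://example-bucket/delta/transactions/"
updates_df = spark.read.parquet("s3://example-bucket/staging/transactions/")

# Upsert new and changed records with a Delta MERGE keyed on a hypothetical txn_id.
target = DeltaTable.forPath(spark, delta_path)
(target.alias("t")
       .merge(updates_df.alias("u"), "t.txn_id = u.txn_id")
       .whenMatchedUpdateAll()
       .whenNotMatchedInsertAll()
       .execute())

# Performance tuning: filtering on the partition column lets Spark prune
# partitions instead of scanning the whole table.
recent = (spark.read.format("delta").load(delta_path)
               .where(F.col("ingest_date") >= "2025-01-01"))
recent.groupBy("ingest_date").count().show()
```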
Nice to Have
- Experience with Databricks, Apache Airflow, or Kubernetes
- Banking or financial domain experience
- Certifications in Big Data or Cloud Technologies (e.g., Databricks Certified Developer, AWS Big Data Specialty)