Location: Remote (South Africa)
Employment Type: Full-Time
Industry: Data Engineering | Cloud Platforms | Financial Services Technology
WatersEdge Solutions is partnering with a client to recruit a highly skilled Senior Data Engineer (Spark & Python Specialist). This is an excellent opportunity for a technically strong engineer who enjoys building scalable, high-performance data solutions in a modern cloud environment. The role is ideal for someone who thrives on optimisation, code-first engineering, and modernising legacy data logic into clean, portable, Python-centric solutions.
About the Role
As a Senior Data Engineer, you’ll serve as a key technical contributor within the engineering team, focused on building, maintaining, and optimising large-scale data processing engines. You’ll work extensively with Spark, PySpark, Delta Lake, and cloud-based lakehouse environments, helping shape a provider-agnostic platform with strong engineering standards, portability, and performance at its core.
Key Responsibilities
Optimise Spark-based processing through best practices in memory management, shuffle tuning, and partitioning
Build and maintain data pipelines using Python, PySpark, Delta Lake, and Parquet
Refactor legacy SQL-based ETL logic into modular, testable, maintainable Python libraries
Build and optimise medallion-architecture pipelines across the Bronze, Silver, and Gold layers
Support code-first orchestration approaches using tools such as Airflow, Dagster, or Python-based wrappers
Participate in code reviews and mentor junior engineers in PySpark best practices
Contribute to automated testing frameworks using Pytest
Work closely with data scientists, analysts, and business stakeholders to deliver fit-for-purpose solutions
Lead initiatives that strengthen data engineering capability, tooling, and technical best practice
Ensure strong security, compliance, and governance across engineering processes
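To give a flavour of the "refactor legacy SQL into testable Python" responsibility above, here is a minimal, hypothetical sketch. The function and rule names are illustrative assumptions, not from the client's codebase; the pattern shown is porting a SQL CASE expression into a pure Python function that Pytest can unit-test and that a PySpark UDF could later wrap:

```python
# Hypothetical sketch: a legacy T-SQL CASE expression ported to a
# pure-Python function. Because the logic lives in plain Python, it can
# be unit-tested with Pytest and reused from a PySpark UDF.
# All names here are illustrative, not taken from any real codebase.

def classify_balance(balance: float) -> str:
    """Python port of a legacy SQL rule:

    CASE WHEN balance < 0    THEN 'overdrawn'
         WHEN balance < 1000 THEN 'standard'
         ELSE 'premium' END
    """
    if balance < 0:
        return "overdrawn"
    if balance < 1000:
        return "standard"
    return "premium"


# Pytest-style unit tests (discovered and run with `pytest`):
def test_classify_balance():
    assert classify_balance(-50.0) == "overdrawn"
    assert classify_balance(500.0) == "standard"
    assert classify_balance(2500.0) == "premium"
```

Keeping business rules in small, pure functions like this is what makes the "modular, testable, maintainable Python libraries" goal practical: the same function can be exercised by Pytest locally and applied at scale inside a Spark job.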
What You’ll Bring
Bachelor’s degree in Computer Science, Information Systems, Engineering, or a related field
6+ years of Spark / PySpark experience
Strong production-grade Python capability
Solid T-SQL skills with experience interpreting and migrating existing SQL logic
Experience with Azure Synapse Analytics, Dedicated SQL Pools, and Data Factory
Hands-on expertise with Delta Lake and Parquet in high-volume environments
Experience with Docker and open-source, cloud-agnostic engineering standards
Strong collaboration skills and a proven track record modernising large-scale data workloads
Experience mentoring engineers and contributing to technical excellence across a team
Strong understanding of security, compliance, and data governance principles
Nice to Have
Exposure to Microsoft Fabric
Experience with code-first orchestration tooling such as Airflow or Dagster
Experience building reusable internal Python libraries and automated testing patterns
Background in high-scale cloud data platform modernisation
What’s On Offer
Fully remote role based in South Africa
Flexible working arrangements
Wellness support and home office reimbursement
Continuous learning opportunities
Competitive salary, ESOP, and recognition for performance
Supportive and inclusive team culture focused on accountability and work-life balance
Company Culture
This is a team that values transparency, accountability, inclusion, and technical excellence. You’ll join an environment that supports strong engineering standards, continuous learning, and collaborative problem-solving while giving people the flexibility to do their best work in a sustainable way.
If you have not been contacted within 10 working days, please consider your application unsuccessful.