ENVIRONMENT:
An investment company is searching for a talented and driven Data Scientist to join their innovative and growing team based in Durbanville, Cape Town. This is an exciting opportunity to work with a market leader, leveraging data to drive strategic decision-making, enhance business performance, and contribute to the future of wealth creation in South Africa.
DUTIES:
Machine Learning & Predictive Modelling
• Design, train, evaluate, and deploy supervised and unsupervised machine learning models in Python.
• Build predictive and prescriptive models across classification, regression, clustering, and ranking tasks.
• Own the full ML lifecycle: data preparation, feature engineering, model selection, validation, deployment, and monitoring.
• Package and deploy models in Docker for reproducible, versioned, maintainable production use.
NLP & Generative AI
• Develop and fine-tune NLP models for text classification, named entity recognition, sentiment analysis, and summarisation.
• Leverage LLMs and generative AI (Claude, OpenAI API, Hugging Face) to build intelligent applications.
• Design prompt engineering strategies and retrieval-augmented generation (RAG) pipelines using PostgreSQL/pgvector for vector storage.
• Evaluate and mitigate risks in generative AI outputs, including hallucination, bias, and fairness issues.
Computer Vision
• Build and adapt computer vision models for image classification, object detection, and segmentation.
• Work with pre-trained architectures (e.g. CNNs, ViTs) and fine-tune on domain-specific datasets.
• Collaborate with engineering teams to integrate vision models into production systems.
Analytics, Dashboards & Statistical Insights
• Conduct rigorous exploratory data analysis (EDA) and statistical modelling to surface actionable insights.
• Design and analyse A/B tests and experiments to measure the impact of product and business changes.
• Translate complex analytical findings into clear, compelling narratives for non-technical stakeholders.
• Build dashboards and internal tools using React, TypeScript, Tailwind, and Framer Motion to track model and business KPIs.
AI Automation & Workflow Integration
• Use AI developer tools such as Claude Code and LLMs to accelerate experimentation and delivery.
• Automate data and model workflows with n8n, Zapier, and Base44.
• Integrate model outputs and insights into the group’s systems, including Zoho CRM (via Deluge and APIs).
Collaboration & Research
• Partner with data engineers to ensure high-quality data is available for modelling.
• Work with product managers and stakeholders to define problems, success criteria, and evaluation metrics.
• Stay current with research developments in ML, NLP, and AI; evaluate and apply relevant techniques.
• Document methodologies, experiments, and model decisions to support reproducibility and knowledge sharing.
REQUIREMENTS:
• 3–5 years of hands-on experience in a data science or applied machine learning role.
• Strong proficiency in Python for data science (NumPy, pandas, scikit-learn, PyTorch or TensorFlow).
• Solid understanding of ML fundamentals: model selection, cross-validation, regularisation, and evaluation metrics.
• Practical experience with NLP and language models (transformers, BERT, GPT-family, etc.).
• Proficiency in SQL and working with PostgreSQL for data extraction and manipulation.
• Experience deploying models with Docker.
• Strong statistical foundations: hypothesis testing, probability, regression, and experimental design.
• Ability to communicate technical work clearly to non-technical audiences.
• Experience with generative AI tooling (LangChain, LlamaIndex, OpenAI API, Claude, Hugging Face).
• Experience building RAG patterns with vector stores (e.g. PostgreSQL/pgvector).
• Exposure to computer vision frameworks (OpenCV, torchvision, YOLO, Detectron2).
• Familiarity with AI automation tools (n8n, Zapier, Base44) and AI dev tools (Claude Code).
• Front-end skills (React, TypeScript, Tailwind, Framer Motion) for building model-facing tools.
• Experience integrating with Zoho CRM (Deluge, JavaScript).
• Postgraduate degree (Honours, Master's, or PhD) in a quantitative field such as Computer Science, Statistics, Mathematics, or Engineering.