• Experienced in data analysis and technically strong in SQL and Scala (Spark SQL)
• Understand / unpack the end-to-end data flow (process) and the impact of data changes across the data value chain (from Cotton-to-Cash), e.g. High-Level Data Flow Diagrams, Source-to-Target mappings etc.
• Conceptual understanding of Data Visualization / Data Virtualization
• JAD (Joint Application Design) session facilitation
• Document and unpack Business Data Requirements through continuous engagement with Business and Tech stakeholders
Technologies / Skills
• Experience working on Hadoop & AWS
• Understanding of large-scale systems
• Familiar with Agile methodologies
• Curious (Self-motivated to learn new things and question things one doesn’t fully understand – not afraid to fail)
• Teachable (willing to learn and not afraid to ask)
• Excel (Proficient)
• PL/SQL, MS SQL, Oracle, DB2, MySQL, HQL (Proficient)
• Scala or a similar language usable in Spark, e.g. PySpark, R (Knowledge and Exposure would be beneficial)
• Data Visualization Tools - Tableau, QlikView, QlikSense, Excel etc. (Intermediate)
• Data Virtualization Tools – Denodo (Knowledge and Exposure would be beneficial)
• Big Data Knowledge - (Intermediate)
General / Qualifications
• University degree or higher certificate in Computer Science or Information Technology, or an equivalent qualification at NQF Level 5 or higher.
• Five or more years' experience in big data / data analysis.