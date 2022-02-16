Senior Data Scientist (Remote) at Datafin Recruitment

ENVIRONMENT:

A fast-paced global Data Specialist seeks the expertise of a highly analytical Senior Data Scientist for a hands-on role finding innovative solutions to complex problems. You will work on Data Science R&D tasks & Machine Learning problems while familiarising yourself with the Trendscope tech stack, particularly sales to social and the knowledge graph, plus automated relevancy cleaning. The ideal candidate must possess a Master's Degree in Computer Science/Engineering/Math/Physics, etc., have a deep understanding of core Machine Learning concepts & a proven track record applying Machine Learning techniques, and have worked with text data and on a wide range of NLP problems such as Topic Modelling, Relevancy and Disambiguation. You should also have experience with BERT/GloVe, SQL, NoSQL, MongoDB, pySpark, Python, Git, Unittest and Pytest.

DUTIES:

Work with time series and social data to provide creative data science solutions for the core product line.

Work on difficult, often cutting-edge problems such as Disambiguation, Relevancy and Forecasting.

Create production-ready solutions, working with Engineering to deploy solutions and productionise prototypes.

Mentor and support junior team members.

REQUIREMENTS:

Master's Degree in Computer Science, Mathematics, Physics, Engineering or other Physical Science.

Deep understanding of core Machine Learning Concepts.

A proven track record of applying Machine Learning techniques in a production context.

Experience working with text data.

Experience working with Engineering teams on optimisation and productionisation of Data Science work.

Proven experience working on hard DS R&D problems in relevant areas, e.g. NLP/Text Analytics.

Creative and comfortable working as part of a cross-functional team and applying cutting-edge data science solutions from the literature to BSD problems.

Technical Communication.

Proven track record of deploying and developing in a Big Data environment, ideally pySpark.

Proficient in Python programming with solid Software Development standards, including Version Control (Git), Testing (Unittest/Pytest) and virtual environments.

SQL and NoSQL databases including MongoDB.

Experience with a wide range of NLP problems such as Topic Modelling, Relevancy and Disambiguation.

Have worked with word embedding models such as BERT/GloVe.

Experience working in/managing a production code base.

ADVANTAGEOUS:

PhD in Computer Science, Mathematics, Physics, Engineering or other Physical Science.

Scala.

Deep Learning, specifically for Multi-horizon Time Series Forecasting.

Knowledge of network science. Comfortable creating and working with graphs and performing community detection, finding connected components, computing eigenvector centrality, etc.

Databricks experience.

Familiarity with the AWS stack.

Experienced in Agile development.

Product-driven Data Science development.

ATTRIBUTES:

Proven ability to provide thought leadership.

Highly creative.

Organised, and comfortable in an agile work environment.

Team player, willing to support others and engage with engineering teams.

Excellent people skills and a great communicator.

Strong written communication and the ability to clearly document work are essential.

