THE ROLE:
- You will work closely with our data scientists to productionise machine learning pipelines
- You will develop and maintain effective data processing architectures/pipelines
- You will also collaborate with the data warehouse and data platform teams
- You will build and support high traffic APIs used for serving ML model predictions
- You will have the opportunity to work on a wide range of interesting topics, from productionising machine learning models to helping train recommender systems on terabytes of data
- As part of the data science team, you will also be given significant responsibility for shaping the team's data architecture
OUR IDEAL TEAMMATE HAS/IS:
Desired:
- We develop data products powered by machine learning. You need to be strong in architecture design and implementation
- You have experience building and managing CI/CD pipelines and using IaC tools, ideally Terraform
- You have strong knowledge of, and experience with, the core services of a cloud provider, ideally GCP
- You have strong knowledge of programming and query languages, including Python and SQL
- You have experience designing, implementing, and supporting data ingestion pipelines for enterprise-scale datasets using Big Data tools such as Apache Beam, Spark, or Airflow
- You can demonstrate the knowledge and ability to build machine learning and deep learning infrastructure that makes data science models available for broad consumption
- You have experience deploying machine learning models to production as APIs, using Flask or similar, and in containers
- We operate at massive scale. You need to have experience with distributed, scalable systems such as Google BigQuery, Amazon Redshift, or similar
- We are a growing, diverse team and we work together across multiple locations. We love to collaborate and help each other, and we want someone who shares that ethos
- You have excellent communication and collaboration skills
- You are self-motivated, able to work unsupervised, and proactive in developing cloud-based ML engineering capabilities for the team when operational demand for ML model deployment is lower
Nice to have:
- You have particular experience using key Python ML libraries such as scikit-learn and TensorFlow, and you are aware of key ML algorithms and how their complexity scales, e.g. decision trees, neural networks, k-means clustering. You are comfortable advising on ML model scalability considerations such as Big-O complexity, storage, and latency
THE TEAM:
The Data Technology team at News UK uses data and machine learning to power the newsroom, digital products, marketing, advertising and operations parts of our business. We are a multi-disciplinary team that includes data engineers, data scientists and data product development specialists. Our data scientists focus on developing predictive models that are productionised at scale. They also build tools and solutions, with data and machine learning at their core, that help the business improve or automate its existing workflows or create value in new and previously impossible ways.