Data Engineer

ABOUT US

We are a multi-award-winning team of over 100 engineers, designers, and analysts based in Leicester, with development hubs in Ukraine and Spain. We specialise in bespoke software development, extended teams/staff augmentation, and support as a service.

OUR CLIENT

Our partner is Europe’s leading platform for buying and selling books, CDs, movies, games, and fashion.
The company fosters success by bringing together a team of professionals with diverse backgrounds who collaborate to bring bold ideas to life and find innovative solutions.

We are looking for a Data Engineer to join the Evolved Ideas team.

As a Senior Data Engineer, you will own one of our most business-critical data assets: the system that links customer identities across our businesses and powers better decisions in marketing, CRM, reporting, and analytics. You will join our Business Intelligence & Data Engineering team and work closely with Data Engineers and Business Analysts to build reliable, scalable, and trustworthy customer identity data.

Your mission

Own the end-to-end pipeline that creates the unified customer_uuid across Books & Media and Fashion
Maintain and evolve our customer identity master data with a strong focus on accuracy, reliability, and production quality
Improve our probabilistic identity resolution model and make matching decisions measurable, transparent, and explainable
Build scalable and cost-efficient data pipelines across BigQuery, GCS, and Cloud Run Jobs
Introduce diagnostics, monitoring, and structured validation for every relevant model change
Identify and resolve edge cases in customer matching logic before they become production issues
Work closely with business and technical stakeholders to turn complex matching challenges into robust data solutions

Our Tech Stack

BigQuery
SQL
Python
Airflow
Splink
Google Cloud Storage
Cloud Run Jobs
Pub/Sub

Your profile

Must-Have:

5+ years of experience in production data engineering
Strong experience with BigQuery and advanced SQL in large-scale analytical environments
Strong Python skills for production-grade data engineering
Solid Airflow experience and a strong understanding of reliable orchestration patterns
Hands-on experience with incremental pipelines and idempotent data processing
Experience with probabilistic record linkage or entity resolution in production
Strong understanding of data quality, matching logic, and precision/recall trade-offs
A careful, structured, and ownership-driven way of working
Strong communication skills and the ability to explain technical decisions clearly

Nice-To-Have:

Experience with Splink or similar probabilistic record linkage tools
Experience with Cloud Run Jobs, GCS, and event-driven patterns in GCP
Experience with Pub/Sub as a source in data pipelines
Familiarity with the trade-offs between data formats such as Parquet and Avro
Experience with dbt
Exposure to downstream BI use cases
Experience in e-commerce or marketplace environments
German language skills

YOU CAN LOOK FORWARD TO

Contributing to a high-scale, complex product and seeing the real-time impact of your work
Health insurance
Educational budget
Challenging tasks, professional development, and sharing of knowledge and best practices
