A production-style growth engine that turns raw GA4 events into actionable customer segmentation. Runs a full lifecycle: ingestion → SQL feature engineering → local Postgres storage → K-Means clustering → cluster definitions & recommended actions → API endpoints consumed by a streaming web UI. Deployed on a single server with Docker Compose, running the API and a PostgreSQL container side by side in production.
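The clustering step of that lifecycle might look like the minimal sketch below, assuming GA4 events have already been aggregated into per-user features; the column names are hypothetical, not the project's actual schema.

```python
# Sketch of the K-Means segmentation step on engineered per-user features.
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

def segment_users(features: pd.DataFrame, k: int = 4) -> pd.DataFrame:
    """Assign each user to a K-Means cluster on scaled behavioural features."""
    cols = ["sessions_30d", "events_per_session", "days_since_last_visit"]  # hypothetical columns
    X = StandardScaler().fit_transform(features[cols])  # scale so no feature dominates distance
    features["cluster"] = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    return features
```

Each resulting cluster would then get a human-readable definition and a recommended action before being served through the API.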
Trained and evaluated a gradient boosting model to forecast customer reordering behaviour on event-level sales data. Engineered features and analysed their impact; achieved 83% ROC-AUC and a 66% relative improvement over the baseline.
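A minimal sketch of that train-and-evaluate loop, not the exact production code; the file name and label column are assumptions.

```python
# Train a gradient boosting classifier and score it with ROC-AUC.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

df = pd.read_csv("orders.csv")  # hypothetical table of engineered per-order features
X, y = df.drop(columns=["reordered"]), df["reordered"]  # "reordered" is an assumed binary label
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=42)

model = GradientBoostingClassifier().fit(X_tr, y_tr)
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])  # rank-based metric, robust to class imbalance
print(f"ROC-AUC: {auc:.2f}")
```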
Built and automated dataset-preparation tools over large lexical corpora for evaluating LLM translations. Given an Adj+Noun pair in any source and target language, the algorithm assesses translation quality based on natural fluency, perplexity under other language models, and lexical corpus matching, with word sense disambiguation applied via BERT-like models.
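One of those signals, perplexity-based fluency, can be sketched as below; the model choice (GPT-2) and example phrases are assumptions for illustration, not the project's actual setup.

```python
# Score a candidate Adj+Noun translation by its perplexity under a causal LM:
# lower perplexity suggests a more natural, fluent collocation.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def perplexity(phrase: str) -> float:
    ids = tok(phrase, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = lm(ids, labels=ids).loss  # mean token-level cross-entropy
    return torch.exp(loss).item()

# e.g. rank two candidate renderings of the same source collocation
print(perplexity("strong tea"), perplexity("powerful tea"))
```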
Engineered a data pipeline that transformed raw event-level sales records into ingredient-level demand forecasts, using optical character recognition (OCR) and regex parsing to standardise unstructured sales sources. Implemented a custom linear forecasting engine to model 10+ time series, enabling product merit analysis and improving production planning and operational efficiency.
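The core of a linear forecasting engine like this can be as simple as one least-squares trend per series; the ingredient name, values, and horizon below are hypothetical.

```python
# Fit y = a*t + b to each demand series and extrapolate a few steps ahead.
import numpy as np

def forecast_linear(history: np.ndarray, horizon: int = 7) -> np.ndarray:
    """Fit an OLS trend line to observed demand and project it `horizon` steps forward."""
    t = np.arange(len(history))
    a, b = np.polyfit(t, history, deg=1)  # slope and intercept
    future = np.arange(len(history), len(history) + horizon)
    return a * future + b

demand = {"flour_kg": np.array([12.0, 14.5, 13.8, 15.2, 16.1])}  # hypothetical series
plans = {name: forecast_linear(series) for name, series in demand.items()}
```

A deliberately simple model like this keeps the forecasts interpretable and the pipeline easy to rerun, which matches the planning use case better than a heavier method would.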
I define the decision the output should enable and who will use it. I set a clear “good enough” threshold early to keep the work focused on impact. If the answer won’t change anything, I simplify the question instead of overbuilding.
I treat analytical systems as products with lifetime value, not one-off reports. Presentation is part of the sale: if it’s not understood quickly, it won’t be adopted. I design for reuse, iteration, and a workflow that can run repeatedly.
I aim for clear outputs, not just correct numbers — interpretation matters more than a dense pivot. I turn raw data into signals: what’s growing, what’s declining, and what deserves attention. If a result can’t be explained in plain language, it won’t drive decisions.
I check whether the data is truthful, whether transformations are necessary, and whether we’re answering the real question. I sanity-check assumptions and stop when the output is useful, not when it is maximally complex. This typically leads to simpler pipelines and more confident conclusions.
The core decision wasn’t just forecast accuracy: it was reducing wasted operational effort and justifying menu complexity. That framing let me simplify the modelling and optimise for usability and repeatability, not fancy metrics.
The workflow had to cover the user experience from A to Z; otherwise it would be seen as too complex to adopt. I scoped an MVP that is easy to run and produces planning outputs that directly support cost reduction.
The system was intended for non-technical use in daily operations. Outputs were designed to be plug-and-play: interpretable signals and planning quantities, without relying on opaque scoring.
While forecasting, I questioned whether I used all available information and discovered an additional asset: a product merit report. This shifted the work from “predict next week” to “improve what we choose to sell” — enabling strategic simplification decisions.
Open to data science roles, collaboration, and thoughtful conversations.