[2021 - 2024]
An Automated Simulation Engine and Predictive Modeling System
We devised an omni brand-market-channel system with measurement capabilities for a Fortune 500 company, enabling continuous learning by iterating on causal parameters (Average Store Sales, Elastic Lift, etc.). The procedure backcasts prior periods by projecting demand under alternative values of the predicted causal parameters, then measures accuracy (Bias, WMAPE) and inventory efficacy (GMROI) to assess each alternative. I introduced a measurement capability and analyzed the effect of each parameter on overall error by segmenting the projected-demand WMAPE. Noticing high error from one sales parameter, we refined its prediction algorithm to make it sensitive to attribute-focused demand signals: product attributes were vectorized with GloVe word embeddings and, together with quantitative sales fields, served as features for XGBoost to predict sales of a new season. The predictions are used as overrides in the simulation of samples to evaluate their effect on projections. Additionally, we estimated the effect of treatments (promotion, elasticity, etc.) on demand. We also enhanced the projection method by accounting for the projected distribution error (owned but not sellable) based on prior sales and inventory statistics, and used Multiple Logistic Regression for feature classifications.
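As a minimal sketch of the accuracy measures named above, the following computes WMAPE and Bias for a backcast and breaks WMAPE down by segment. The formulas are common conventions and are assumptions here (WMAPE as total absolute error over total actuals, Bias as total signed error over total actuals); the function names and the `(segment, actual, forecast)` row layout are hypothetical.

```python
def wmape(actual, forecast):
    """Weighted MAPE: total absolute error divided by total absolute actuals."""
    denom = sum(abs(a) for a in actual)
    if denom == 0:
        raise ValueError("WMAPE is undefined when all actuals are zero")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / denom


def bias(actual, forecast):
    """Signed error share (assumed convention): positive means over-forecast."""
    denom = sum(actual)
    if denom == 0:
        raise ValueError("Bias is undefined when actuals sum to zero")
    return sum(f - a for a, f in zip(actual, forecast)) / denom


def wmape_by_segment(rows):
    """rows: iterable of (segment, actual, forecast). Returns {segment: WMAPE}."""
    grouped = {}
    for segment, a, f in rows:
        actuals, forecasts = grouped.setdefault(segment, ([], []))
        actuals.append(a)
        forecasts.append(f)
    return {seg: wmape(acts, fcs) for seg, (acts, fcs) in grouped.items()}
```

Comparing the per-segment values shows which segment contributes most to overall error under a given parameter setting.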
Pre-Season Analytics for Inventory Management
For a Fortune 500 company, constructed demand planning metrics for pre-season (new) items by leveraging historical statistics of similar items. Corrected data for lost sales, pricing, and store-level sales deviation, and then added seasonality and elasticity to estimate average sales of new items. After applying eligibility conditions (ratio, coverage, etc.) and removing outliers (3σ rule), computed Store Deviation as the normalized index of relative sales, with optimal weights on last year and trend. For the final metrics, assigned equal-probability ranks (1/3 or 1/5) to items within a sub-category (under a product hierarchy) based on normalized historical sales, and computed their distribution and initial weights. Then combined cleansed historical sales of the higher product category, optimized sub-category rank weights (fitted with SLSQP by minimizing MSE), and normalized seasonal indices.
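The weight-fitting step can be illustrated with a small sketch, assuming the rank weights are non-negative, sum to one, and are chosen so that a weighted blend of rank-level profiles matches a target series in the MSE sense. The function name, the matrix layout, and the constraints are assumptions for illustration, not the project's actual setup; `scipy.optimize.minimize` with `method="SLSQP"` is the real SciPy call.

```python
import numpy as np
from scipy.optimize import minimize


def fit_rank_weights(rank_profiles, target):
    """Fit weights w (w >= 0, sum(w) == 1) minimizing MSE of X @ w against target.

    rank_profiles: array of shape (n_periods, n_ranks), one column per rank.
    target: array of shape (n_periods,).
    """
    X = np.asarray(rank_profiles, dtype=float)
    y = np.asarray(target, dtype=float)
    n_periods, n_ranks = X.shape

    def mse(w):
        return float(np.mean((X @ w - y) ** 2))

    def mse_grad(w):
        # Analytic gradient of the MSE with respect to w.
        return 2.0 * X.T @ (X @ w - y) / n_periods

    result = minimize(
        mse,
        x0=np.full(n_ranks, 1.0 / n_ranks),  # start from equal weights
        jac=mse_grad,
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n_ranks,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
        options={"ftol": 1e-9},
    )
    if not result.success:
        raise RuntimeError(result.message)
    return result.x
```

SLSQP suits this kind of problem because it handles the bounds and the sum-to-one equality constraint directly.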
[2020 - 2021]
AI/ML approach to develop Unconstrained Forecast
To develop an enterprise-wide demand forecast, defined Sectors to capture balanced regional demand by mapping ZIPs to states and then splitting states into sectors. Added derived features: seasonality, moving averages, lags, demand categories (smooth/lumpy/erratic/intermittent), and rate of sales (ROS); and external drivers such as COVID-19 mobility and web analytics. With Google’s AutoML, iterated through these features to find the best model across sectors and seasons, assessed by accuracy metrics at n-week lags for segments based on revenue percentiles, product life, etc. Generated final forecasts after exception handling, sector-to-ZIP disaggregation, and applying constraints, if any.
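One common way to derive the smooth/lumpy/erratic/intermittent category is the Syntetos-Boylan scheme, sketched below. Whether the project used this scheme is an assumption; the cut-offs (ADI 1.32, CV² 0.49) are the values usually quoted for it, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def classify_demand(series, adi_cut=1.32, cv2_cut=0.49):
    """Label a demand series using ADI and CV^2 (Syntetos-Boylan style, assumed).

    ADI: average number of periods per non-zero demand occurrence.
    CV^2: squared coefficient of variation of the non-zero demand sizes.
    """
    nonzero = [x for x in series if x > 0]
    if not nonzero:
        return "no demand"
    adi = len(series) / len(nonzero)
    cv2 = (pstdev(nonzero) / mean(nonzero)) ** 2
    if adi < adi_cut:
        # Demand occurs in most periods; the size variability decides the label.
        return "smooth" if cv2 < cv2_cut else "erratic"
    # Demand occurs sporadically.
    return "intermittent" if cv2 < cv2_cut else "lumpy"
```

The resulting label can be passed to the model as a categorical feature.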
Case Study: An Integrated Forecasting and Clustering model with Choice Ranks
In partnership with SAS, to overcome the challenges of complex product and demand variation and a lack of history for a US retailer, we proposed sequentially ranking similar products by performance and forecasting demand by rank, which significantly improved forecast quality. We identified the critical features with Random Forest and forecasted using hierarchical time series with optimal reconciliation and Random Forest. We also explored the utility of a rank-based choice model that captures consumer purchase behavior. With Hierarchical Clustering (HC), we determined the attributes driving differential store sales and applied them to store clustering by K-Means. Instead of random initialization, we used centroids from HC as initial seeds, reducing volatility and expediting convergence. Cluster profiling confirmed that stores were homogeneous within clusters and heterogeneous across them.
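The seeding idea can be sketched with scikit-learn: run agglomerative clustering, take the mean of each resulting group as a centroid, and pass those centroids to K-Means as its initial centers. Ward linkage and the function name are assumptions for illustration; the original work was done in partnership with SAS, and its settings are not specified here.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans


def hc_seeded_kmeans(features, n_clusters):
    """K-Means initialized from hierarchical-clustering centroids.

    features: array of shape (n_stores, n_attributes).
    Returns (labels, cluster_centers).
    """
    X = np.asarray(features, dtype=float)
    # Step 1: hierarchical clustering (Ward linkage is an assumed choice).
    hc_labels = AgglomerativeClustering(
        n_clusters=n_clusters, linkage="ward"
    ).fit_predict(X)
    # Step 2: centroid of each hierarchical cluster becomes a K-Means seed.
    seeds = np.vstack(
        [X[hc_labels == k].mean(axis=0) for k in range(n_clusters)]
    )
    # Step 3: one deterministic K-Means run from those seeds (no random restarts).
    km = KMeans(n_clusters=n_clusters, init=seeds, n_init=1).fit(X)
    return km.labels_, km.cluster_centers_
```

Because the seeds are fixed, repeated runs on the same data return the same clusters, which is the reduction in volatility described above.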
Large-scale Automated Demand Forecasting using Time Series
For a global firm with 60+ demand centers (DCs) of varied location and demography, gathered insights into product life, variability, product similarity, and lost sales, and analyzed attributes by pervasiveness, density, and differentiating power using coefficient of variation, estimation, linear regression, etc. With SAS New Product Forecasting (NPF), created an initial forecast (INPF) by modeling shape and volume clusters to score new items. The INPF is then used as a causal variable, the Fake History Indicator Variable (FHIV). The system re-evaluates as actual sales become available, combining them with the FHIV to generate the final forecast. We applied logarithmic and Box-Cox transformations for heteroskedastic DCs and forecast with ARIMAX, UCM, ESM, and an Intermittent Demand Model (Croston’s method), measured by 4-week lag WMAPE.
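For readers unfamiliar with Croston's method, a minimal plain-Python sketch follows. It smooths non-zero demand sizes and the intervals between them separately, then forecasts their ratio as demand per period. This is the textbook form, not the SAS implementation used in the project; the function name, the single smoothing constant `alpha`, and the initialization from the first non-zero demand are assumptions.

```python
def croston_forecast(demand, alpha=0.1):
    """Per-period demand rate forecast using basic Croston's method."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    size = None        # smoothed non-zero demand size
    interval = None    # smoothed number of periods between non-zero demands
    periods_since = 1  # periods elapsed since the last non-zero demand
    for d in demand:
        if d > 0:
            if size is None:
                # Initialize from the first observed non-zero demand.
                size, interval = float(d), float(periods_since)
            else:
                size += alpha * (d - size)
                interval += alpha * (periods_since - interval)
            periods_since = 1
        else:
            periods_since += 1
    if size is None:
        return 0.0  # no demand observed at all
    return size / interval
```

The smoothed values are updated only in periods with demand, so long runs of zeros do not drag the size estimate toward zero.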
Predictive Analytics for Demand Planning Platform
For an FMCG company, performed portfolio segmentation by volume and variability (ABC-XYZ classification). Employed ML models accordingly: XGBoost, Random Forest (RF), and a stacked algorithm (Linear Regression, RF, and KNN in the first layer, with their predictions as features in the prediction layer), as well as the time series model Prophet with additional regressors (promotion, weather, etc.). In Prophet, Bayesian inference captured the uncertainty around the estimates, resulting in robust forecasts of erratic demand, and the logistic growth trend handled non-linear demand growth. We also assessed the effect of promotion, assortment architecture, etc. on demand. For ML model interpretability, we used LIME and SHAP to break forecasts down into components of history, promotion, weather, etc., and to identify the major driving features.
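The ABC-XYZ segmentation can be sketched as follows, assuming ABC classes come from cumulative volume share and XYZ classes from the coefficient of variation of demand. The cut-offs (80%/95% of volume, CV of 0.5 and 1.0), the function name, and the input layout are illustrative assumptions, not the values used in the project.

```python
from statistics import mean, pstdev


def abc_xyz(history, abc_cuts=(0.80, 0.95), xyz_cuts=(0.5, 1.0)):
    """history: {item: [demand per period]}. Returns {item: label such as 'AX'}."""
    totals = {item: sum(series) for item, series in history.items()}
    grand_total = sum(totals.values())
    if grand_total <= 0:
        raise ValueError("total volume must be positive")

    labels = {}
    share_before = 0.0  # cumulative volume share of all higher-volume items
    for item in sorted(totals, key=totals.get, reverse=True):
        # ABC: based on the share accumulated before this item, so the
        # highest-volume item is always class A.
        if share_before < abc_cuts[0]:
            abc = "A"
        elif share_before < abc_cuts[1]:
            abc = "B"
        else:
            abc = "C"
        share_before += totals[item] / grand_total

        # XYZ: coefficient of variation of the item's demand history.
        avg = mean(history[item])
        cv = pstdev(history[item]) / avg if avg > 0 else float("inf")
        if cv < xyz_cuts[0]:
            xyz = "X"
        elif cv < xyz_cuts[1]:
            xyz = "Y"
        else:
            xyz = "Z"
        labels[item] = abc + xyz
    return labels
```

Each cell of the resulting grid can then be mapped to a modeling approach, for example simpler time series methods for stable high-volume items and ML models for volatile ones.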
Internship: [May - July 2019]