Getting Stuck on the Kaggle Disaster Tweets Project (and How I’m Shipping V1 Anyway)

I’ve been working on the Kaggle Disaster Tweets classification project, and for a while, progress felt good. I built a baseline model using TF-IDF and Logistic Regression and managed to get an F1 score of 0.82 without using a pipeline. Then I decided to “do things properly” and refactor everything into a scikit-learn pipeline — and … Read more
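
As a quick preview of what that refactor involves, here is a minimal sketch of a TF-IDF + Logistic Regression baseline wrapped in a scikit-learn Pipeline. The tiny example tweets and the hyperparameters below are illustrative stand-ins, not the actual competition data or my notebook's settings (the competition's train.csv uses `text` and `target` columns).

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy stand-ins for the competition's `text` / `target` columns.
texts = [
    "Forest fire near La Ronge Sask. Canada",
    "What a lovely sunny afternoon",
    "13,000 people receive #wildfires evacuation orders in California",
    "My new phone is fire, love it",
]
labels = [1, 0, 1, 0]  # 1 = real disaster, 0 = not

# Vectorizer and classifier live in one object, so the vocabulary
# fitted on the training text is reused automatically at predict time.
pipe = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    ("clf", LogisticRegression(max_iter=1000)),
])

pipe.fit(texts, labels)
print(pipe.predict(["Evacuation ordered after forest fire"]))
```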

From Baseline to Submission: My Gradient Boosting Pipeline on Spaceship Titanic

After completing my first Kaggle competition on House Prices, I decided to tackle the Spaceship Titanic dataset. The goal is to predict whether passengers were transported to another dimension during the voyage. This competition has been a great opportunity to improve my workflow, learn about pipelines, and practice model evaluation. In this post, I’ll walk through … Read more
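
As a preview, here is a minimal sketch of that kind of pipeline: a ColumnTransformer handling imputation and one-hot encoding, feeding a gradient boosting classifier. The column names follow the Spaceship Titanic schema, but the feature subset, the toy rows, and the hyperparameters are placeholders rather than my final configuration.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Illustrative subset of the Spaceship Titanic columns.
num_cols = ["Age", "RoomService", "Spa"]
cat_cols = ["HomePlanet", "Destination"]

preprocess = ColumnTransformer([
    ("num", SimpleImputer(strategy="median"), num_cols),
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]), cat_cols),
])

model = Pipeline([
    ("prep", preprocess),
    ("gbm", GradientBoostingClassifier(random_state=0)),
])

# Toy rows standing in for the competition's train.csv.
train = pd.DataFrame({
    "Age": [39.0, None, 58.0, 16.0],
    "RoomService": [0.0, 109.0, 43.0, 303.0],
    "Spa": [0.0, 549.0, 6715.0, 565.0],
    "HomePlanet": ["Europa", "Earth", "Europa", None],
    "Destination": ["TRAPPIST-1e"] * 4,
    "Transported": [False, True, False, True],
})
model.fit(train[num_cols + cat_cols], train["Transported"])
```

Keeping the imputers and encoder inside the pipeline means the same fitted preprocessing is applied to the test set at submission time, with no manual bookkeeping.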

Improving My Baseline Model: From Simple Linear Regression to a Proper Pipeline

In my previous post, I built a simple baseline model for the House Prices Kaggle competition using only numerical features, scaling, and a linear model. Since then, I’ve iterated on that baseline by introducing a proper preprocessing pipeline, adding categorical features through one-hot encoding, and applying feature engineering, cross-validation, and hyperparameter tuning. The goal wasn’t … Read more
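
For concreteness, the overall shape of that upgrade looks roughly like the sketch below: numeric imputation and scaling plus categorical one-hot encoding in a ColumnTransformer, wrapped in a Pipeline and tuned with cross-validated grid search. I use Ridge here purely as an illustrative linear model, since it gives the search a regularization knob to tune; the column names and the alpha grid are placeholders, not the post's actual choices.

```python
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Illustrative subset of the House Prices columns.
num_cols = ["GrLivArea", "OverallQual", "YearBuilt"]
cat_cols = ["Neighborhood", "MSZoning"]

preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), num_cols),
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]), cat_cols),
])

pipe = Pipeline([("prep", preprocess), ("model", Ridge())])

# Cross-validated search over the regularization strength; each fold
# re-fits the preprocessing, so nothing leaks from the held-out data.
search = GridSearchCV(
    pipe,
    param_grid={"model__alpha": [0.1, 1.0, 10.0]},
    cv=5,
    scoring="neg_root_mean_squared_error",
)
# search.fit(X_train, y_train)  # X_train / y_train come from the competition data
```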