In my previous post, I built a simple baseline model for the House Prices Kaggle competition using only numerical features, scaling, and a linear model.
Since then, I’ve iterated on that baseline by introducing a proper preprocessing pipeline, adding categorical features through one-hot encoding, and log-transforming the skewed target. The goal wasn’t to chase a leaderboard score, but to build a more realistic and disciplined machine learning workflow.
In this post, I walk through the key improvements I made to the baseline model and what I learned from the process, before moving on to more advanced models in future experiments.
Quick Recap of the Baseline
Model: Ridge Regression
Features: Numerical features only
Preprocessing: Manual scaling
Evaluation: Train/validation split
Result: Kaggle score around 0.34
This was intentionally simple, but clearly incomplete.
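For reference, the baseline looked roughly like this. This is a minimal sketch on synthetic stand-in data, not the exact original code; the feature count and the Ridge alpha are placeholders:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the numeric columns of the housing data
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))  # e.g. living area, overall quality, ...
y = X @ rng.normal(size=5) + rng.normal(scale=0.1, size=200)

X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# Manual scaling, done by hand: fit on train, apply to both splits
scaler = StandardScaler().fit(X_train)
model = Ridge(alpha=1.0).fit(scaler.transform(X_train), y_train)
preds = model.predict(scaler.transform(X_val))
```

Even this small version shows the fragility: the scaler and model are separate objects, so nothing stops you from accidentally fitting the scaler on the full dataset.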
Problems With the Baseline
Categorical features were completely ignored
Preprocessing was done manually (easy to leak data)
No unified pipeline
Target variable (price) was highly skewed
Evaluation setup was not robust
Improvements I Made
🔹 Using a Pipeline
Keeps preprocessing and model together
Prevents data leakage
Makes experiments reproducible
This immediately made my workflow cleaner and safer.
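The pipeline version of the same workflow can be sketched like this (synthetic data again; the step names are my own, not anything special):

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5)

X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# Scaler and model travel together: pipe.fit() fits the scaler on
# training data only, so no validation statistics can leak in.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", Ridge(alpha=1.0)),
])
pipe.fit(X_train, y_train)
preds = pipe.predict(X_val)
```

One `fit` call, one `predict` call, and the preprocessing can never be fit on the wrong split.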
🔹 Handling Categorical Features
Stopped dropping them
Encoded them properly
Recognized that house data depends heavily on categories (neighborhood, quality ratings, and so on)
Improved the model’s ability to learn housing-related patterns
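Categorical handling slots straight into the pipeline via a ColumnTransformer. A minimal sketch with a toy frame (the column names here are illustrative, not the dataset's exact schema):

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.linear_model import Ridge

# Tiny stand-in frame; "Neighborhood" plays the role of a
# categorical column like those in the real dataset
df = pd.DataFrame({
    "GrLivArea": [1500, 2000, 1200, 1800],
    "Neighborhood": ["A", "B", "A", "C"],
})
y = [200_000, 300_000, 150_000, 260_000]

# Scale numeric columns, one-hot encode categorical ones;
# handle_unknown="ignore" keeps inference from crashing on
# categories unseen during training
pre = ColumnTransformer([
    ("num", StandardScaler(), ["GrLivArea"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["Neighborhood"]),
])
pipe = Pipeline([("pre", pre), ("model", Ridge())])
pipe.fit(df, y)
preds = pipe.predict(df)
```

Instead of dropping the categorical columns, the encoder turns each category into its own indicator feature that the linear model can weight.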
🔹 Log Transforming the Target
House prices are right-skewed
Log transform stabilized the target
Evaluation became more meaningful
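One clean way to apply the log transform without scattering manual `np.log1p`/`np.expm1` calls through the code is `TransformedTargetRegressor`, which fits on the transformed target and inverts predictions automatically (the data below is synthetic):

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
# Right-skewed positive target, shaped like raw sale prices
y = np.expm1(X @ np.array([0.5, 0.3, 0.2]) + 12
             + rng.normal(scale=0.1, size=100))

# The regressor is fit on log1p(y); predictions are mapped back
# to the price scale with expm1
model = TransformedTargetRegressor(
    regressor=Ridge(),
    func=np.log1p,
    inverse_func=np.expm1,
)
model.fit(X, y)
preds = model.predict(X)
```

The model sees a roughly symmetric target during training, but you still get predictions in dollars.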
🔹 Better Evaluation Discipline
Used consistent RMSE calculation
Compared models properly
Avoided misleading improvements
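Since the competition scores RMSE on log-transformed sale prices, the simplest way to stay consistent is one shared helper used for every model (a small sketch; the function name is my own):

```python
import numpy as np

def rmse_log(y_true, y_pred):
    """RMSE on log1p-transformed prices, so validation numbers are
    on the same scale the competition is scored on."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.sqrt(np.mean((np.log1p(y_true) - np.log1p(y_pred)) ** 2)))

# A perfect fit scores exactly 0.0
print(rmse_log([100_000, 200_000], [100_000, 200_000]))  # 0.0
```

Using the same helper everywhere keeps model comparisons apples-to-apples: an "improvement" measured in raw dollars on one run and log scale on another is meaningless.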
Results
These changes reduced my validation RMSE significantly compared to the baseline and gave me more confidence in the model’s ability to generalize.
What I Learned
Pipelines are not optional, even for baselines
Data preprocessing matters as much as the model
Target transformations can have a big impact
Clean evaluation > chasing scores
What I’ll Try Next
Feature engineering
Cross-validation and tuning
Tree-based models (Random Forest, Gradient Boosting)
Comparing linear vs non-linear models