This project served as the base for my research on using regression to predict compressive strength.
It saves civil engineers experimentation time by estimating compressive strength from relevant input features such as cement content and water content.
The project is built entirely in Python. It uses the Pandas and NumPy modules for data manipulation, Matplotlib and Seaborn for visualization, and scikit-learn for preprocessing and model training.
We had to keep all the features. Our biggest challenge was to find the right train-test split to train the model.
The model achieves a best R² of 0.92 (92%), which is a strong result for a small dataset with no feature reduction.
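As a minimal sketch of the split search and R² evaluation described above, the snippet below scores one model at several test-set sizes. Everything in it is an assumption made for illustration: the data is synthetic (generated with `make_regression`, not the project's dataset), the random forest is a stand-in because the text does not name the model, and `score_split` is a hypothetical helper.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: 8 numeric features, one continuous target.
# The project's real features (cement content, water content, ...) are not used here.
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)


def score_split(test_size, seed=0):
    """Train on one train-test split and return R2 on the held-out part."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed
    )
    # Assumed model: scaling followed by a random forest, all features kept.
    model = make_pipeline(
        StandardScaler(),
        RandomForestRegressor(n_estimators=100, random_state=seed),
    )
    model.fit(X_train, y_train)
    return r2_score(y_test, model.predict(X_test))


# Compare a few candidate test-set fractions.
results = {test_size: score_split(test_size) for test_size in (0.1, 0.2, 0.3)}
for test_size, r2 in results.items():
    print(f"test_size={test_size:.1f}  R2={r2:.3f}")
```

On a small dataset the score can change noticeably from one split to another, so averaging over several seeds or using k-fold cross-validation may give a steadier estimate than a single best split.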
Find a model for your data, not the other way around.
The next step is to improve model scores with more sophisticated ensembles such as LightGBM, or with ensembles that allow for more model diversity.
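One way an ensemble with more model diversity could look is sketched below, using scikit-learn's `VotingRegressor` to average a linear model, a random forest, and a gradient-boosted model. This is an illustration under assumptions, not the project's code: the data is synthetic and the three member models are picked only to show different model families.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    VotingRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data, not the project's dataset.
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)

# Average the predictions of three different model families.
ensemble = VotingRegressor(
    estimators=[
        ("ridge", Ridge(alpha=1.0)),
        ("forest", RandomForestRegressor(n_estimators=100, random_state=0)),
        ("boost", GradientBoostingRegressor(random_state=0)),
    ]
)

# 5-fold cross-validated R2, so the score does not hinge on one split.
scores = cross_val_score(ensemble, X, y, cv=5, scoring="r2")
print(f"mean R2 over 5 folds: {scores.mean():.3f}")
```

If LightGBM is installed, its `LGBMRegressor` follows the scikit-learn estimator interface and could be added as another member of the `estimators` list.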