Document Type

Article

Publication Date

4-2022

Publication Title

Cement and Concrete Composites

Abstract

Machine learning (ML)-based prediction of the non-linear composition-strength relationship in concrete requires a large, complete, and consistent dataset. Such datasets are scarce, however: available datasets are often incomplete because values are missing for different input features, which makes it difficult to develop robust ML-based predictive models. In addition, as the complexity of these ML models increases, their results become harder to interpret, yet interpretation is critical to developing efficient materials design strategies for enhanced materials performance. To address these challenges, this paper implements different data imputation approaches to improve dataset completeness. The imputed dataset is then used to predict the compressive and tensile strength of concrete using various hyperparameter-optimized ML approaches. Among all the approaches, Extreme Gradient Boosted Decision Trees (XGBoost) showed the highest prediction efficacy when the dataset was imputed using k-nearest neighbors (kNN) with a 10-neighbor configuration. To interpret the predicted results, SHapley Additive exPlanations (SHAP) is employed. Overall, by combining data imputation, machine learning, and data interpretation, this paper develops an efficient approach to evaluate the composition-strength relationship in concrete. This work, in turn, can serve as a starting point for the design and development of various performance-enhanced and sustainable concretes.
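
The kNN imputation step named in the abstract can be illustrated with a minimal pure-Python sketch. The abstract does not describe the authors' implementation, so everything here is an assumption made for illustration: the function names `nan_distance` and `knn_impute`, the distance rule (Euclidean over features observed in both rows, rescaled for skipped features), the use of an unweighted mean of the neighbors, and the tiny example table, whose columns and values are made up and are not data from the paper.

```python
import math

def nan_distance(a, b):
    """Euclidean distance over features observed in both rows, rescaled
    to compensate for the features that had to be skipped."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf  # no feature in common, so the rows are not comparable
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))

def knn_impute(rows, k=10):
    """Return a copy of rows in which each None is replaced by the mean of
    that feature over the k nearest rows where the feature is observed.
    A value stays None if no comparable donor row exists."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: every other row that has feature j observed.
            donors = sorted(
                (nan_distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            nearest = [v for d, v in donors[:k] if d < math.inf]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled

# Illustrative values only (hypothetical columns: cement, water, age).
example = [
    [300.0, 150.0, 28.0],
    [310.0, None, 28.0],
    [500.0, 200.0, 7.0],
    [305.0, 160.0, 28.0],
]
filled = knn_impute(example, k=2)
print(filled[1])
```

With `k=2`, the missing entry in the second row is filled from the two rows closest to it on the observed features. In practice the features would normally be scaled before computing distances, and a library routine such as scikit-learn's `KNNImputer(n_neighbors=10)` would likely replace this hand-written loop; whether the paper used that routine is not stated in the abstract.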

Keywords

Machine learning, Concrete strength, Missing data, Data imputation, SHAP

Volume

128

DOI

https://doi.org/10.1016/j.cemconcomp.2022.104414

ISSN

0958-9465

Comments

Originally published in Cement and Concrete Composites, Volume 128, April 2022, 104414.
