Generalized uncertainty in surrogate models for concrete strength prediction

https://doi.org/10.1016/j.engappai.2023.106155

Abstract

Applied soft computing has been widely used to predict material properties, optimal mixtures, and failure modes. This is challenging, especially for the highly nonlinear behavior of brittle materials such as concrete. This paper proposes three surrogate modeling techniques (i.e., polynomial chaos expansion, Kriging, and canonical low-rank approximation) for concrete compressive strength regression analysis. A benchmark database of high-performance concrete with over 1,000 samples is used, and various sources of uncertainty in surrogate modeling are quantified, including meta-modeling assumptions, solvers, and sampling size. Two generalized extreme value distributional models are developed for error metrics using an extensive database of data collected from the literature. Bias and dispersion in the developed surrogate models are empirically compared with those distributions to quantify the overall accuracy and confidence level. Overall, the Kriging-based surrogate models outperform 80%–90% of the existing predictive models, and they yield more stable results. The selection of a proper optimization algorithm is the most important factor in surrogate modeling. For practical purposes, the Kriging regression outperforms the polynomial chaos expansion and the low-rank approximation. The Kriging model is reliable for the pilot database and is less sensitive to modeling uncertainties. Finally, a series of composite error metrics is discussed as a decision-making tool that facilitates the selection of the best surrogate model using multiple criteria.

Introduction

Concrete is probably the most common man-made material used in engineering construction (Shah, 1993). While the three main ingredients of concrete are water, cement, and aggregate, many natural and artificial additives are typically used to impose specific characteristics on concrete. For example, water-reducing admixtures behave as plasticizers, which reduce the water content of a concrete mix by as much as 5%. Other widely used admixtures are accelerating, air-entraining, retarding, shrinkage reducing, corrosion-inhibiting, pozzolanic, damp-proofing, gas forming, alkali-aggregate expansion inhibiting, anti-washout, grouting, bonding, etc. (Kosmatka et al., 2002). Determination of the impact of various ingredients on the properties of concrete is always one of the challenges in material science and engineering. Aside from mix design uncertainty, several sources of randomness highly affect the mechanical characteristics. They include but are not limited to the dynamic behavior of material (Bragov et al., 2013), aging behavior (Abd Elaty, 2014, Gao et al., 2018), size effects (Kim and Yi, 2002), environmental conditions for curing (e.g., temperature and humidity) (Kim et al., 2002, Yi et al., 2005), and different testing procedures.
Predicting the mechanical properties of concrete mixes (e.g., compressive (Ni and Wang, 2000) and tensile (Zain et al., 2002) strength, or modulus of elasticity (Demir, 2005)) might help to improve our understanding of their behavior and lead to robust design codes and standards. Developing such predictive models is not new: the application of simple statistical models, in the form of linear and nonlinear regression analysis, goes back nearly a hundred years (Wilsdon, 1934, Wright, 1954). These models provided an analytical formula to determine the unknown parameters affecting the relationship between concrete strength and its ingredients or environmental variables.
With the development of soft computing (SC) algorithms (of which machine learning is a part; Falcone et al., 2020), many researchers have adapted one or more techniques for concrete strength prediction (Baykasoğlu et al., 2004). Two main objectives are typically followed in this type of research: (1) develop a practical relation that can assist practitioners in the analysis and design of concrete structures, and (2) compare the performance of various SC algorithms and propose some improvements whenever possible. This paper does not dive into the pool of thousands of papers on applying SC to concrete mixtures and concrete structures. Multiple state-of-the-art review articles have discussed this topic in more detail, among them Chaabene et al. (2020), Frank et al. (2020), Chong et al. (2021), and Nunez et al. (2021). DeRousseau et al. (2018) provided a review on the computational design optimization of concrete mixtures. Several researchers have compared and contrasted SC algorithms in the prediction of concrete strength, e.g., Young et al. (2019) compared artificial neural networks (ANN), decision trees (DT), and support vector machines (SVM); DeRousseau et al. (2019) compared regression trees (RT), random forest (RF), and boosted trees (BT); Cook et al. (2019) applied ANN, SVM, RF, and several hybrid models; and Abuodeh et al. (2020) used deep learning techniques.
In addition, a few researchers have pointed out the sensitivity and vulnerability of SC models to hyper-parameter estimation (Zhao et al., 2020). Others extended multiple SC algorithms to more than one database to generalize the observations (Chou et al., 2014, Kamath et al., 2022) and to determine the balance between the accuracy of the predictive models and their complexity (Ouyang et al., 2020). Finally, some researchers combined the SC techniques with various optimization algorithms (Yazdani and Jolai, 2016) (e.g., particle swarm optimization and genetic algorithms) to increase the prediction accuracy and reduce the computational cost (Mohammadi et al., 2021, Hasanipanah et al., 2020, Hasanipanah et al., 2022, Zhu et al., 2021).
Despite the many applications of SC in the prediction of concrete compressive strength, there is not yet any application of polynomial chaos expansion (PCE), Kriging, and low-rank approximation (LRA) in this field. The PCE can capture the stochastic relation for complex and nonlinear systems using homogeneous orthogonal polynomial basis functions (Wiener, 1938), and thus is a good candidate to be used in material science. Berveiller et al. (2012) used the PCE to update the long-term creep strains in concrete structures.
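As an illustration of the idea, the following is a minimal sketch (not the implementation used in this paper) of a one-dimensional PCE fitted by ordinary least squares on the probabilists' Hermite basis, which is orthogonal for a standard normal input. The toy response, the sample size, and the function names `fit_pce` and `predict_pce` are assumptions made for this example only.

```python
import math

import numpy as np
from numpy.polynomial.hermite_e import hermevander


def fit_pce(x, y, degree):
    """Least-squares PCE coefficients on the basis He_0 .. He_degree."""
    design = hermevander(x, degree)  # shape (n_samples, degree + 1)
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs


def predict_pce(coeffs, x):
    """Evaluate the fitted expansion at new inputs."""
    return hermevander(x, len(coeffs) - 1) @ coeffs


# Toy data (hypothetical, not concrete data): standard normal input,
# polynomial response y = x^2 + 2x + 3.
rng = np.random.default_rng(0)
x_train = rng.standard_normal(200)
y_train = x_train**2 + 2.0 * x_train + 3.0

coeffs = fit_pce(x_train, y_train, degree=2)

# For an orthogonal basis, the mean and variance follow from the coefficients:
# E[Y] = c_0 and Var[Y] = sum_{k>=1} c_k^2 * k!  (since E[He_k^2] = k!).
pce_mean = coeffs[0]
pce_variance = sum(c**2 * math.factorial(k) for k, c in enumerate(coeffs) if k > 0)
```

For this toy response the exact expansion is 4 He_0 + 2 He_1 + He_2, so the fitted coefficients should be close to [4, 2, 1], giving a mean of 4 and a variance of 6. Real applications involve several inputs and usually a sparse solver rather than plain least squares.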
Kriging (also known as Gaussian process modeling/regression) (Schöbi, 2019) is an efficient surrogate model for problems with high nonlinearity. Hoang et al. (2016) used Kriging for modeling the compressive strength of high-performance concrete with only seven input parameters and compared the results with ANN and SVM. Verma et al. (2017) compared several kernel-based methods, including Kriging, to predict the compressive strength in a small database of only 50 samples and four input parameters. Afshoon et al. (2021) proposed a combined method of Kriging with U-learning function and K-means clustering to predict the concrete fracture energy using a limited number of inputs. Asteris et al. (2021) compared Kriging with ANN, multivariate adaptive regression splines, and minimax probability machine regression in the prediction of concrete compressive strength. The input variables include only six composed variables, and over a thousand experimental data points were used. Ke and Duan (2021) developed a Gaussian process surrogate model for high-performance concrete, including the sensitivity analysis with Sobol indices.
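For readers unfamiliar with the method, the sketch below shows a simple Kriging predictor (zero trend, unit process variance) with a Gaussian correlation function and fixed hyperparameters. It is only an illustration under these assumptions: the function names, the length scale, the nugget, and the toy data are hypothetical, and the hyperparameter optimization that a practical Kriging model requires is omitted.

```python
import numpy as np


def gaussian_kernel(a, b, length_scale):
    """Gaussian (squared-exponential) correlation between two sets of 1-D points."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def kriging_predict(x_train, y_train, x_new, length_scale=1.0, nugget=1e-10):
    """Simple-Kriging mean and variance with fixed (not optimized) hyperparameters."""
    k_tt = gaussian_kernel(x_train, x_train, length_scale)
    k_tt += nugget * np.eye(len(x_train))  # small nugget for numerical stability
    k_nt = gaussian_kernel(x_new, x_train, length_scale)
    weights = np.linalg.solve(k_tt, k_nt.T)  # shape (n_train, n_new)
    mean = weights.T @ y_train
    # Predictive variance: k(x*, x*) - k*^T K^{-1} k*, with k(x*, x*) = 1.
    variance = 1.0 - np.sum(k_nt.T * weights, axis=0)
    return mean, np.clip(variance, 0.0, None)


# Toy data (hypothetical): five observations of sin(x).
x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_train = np.sin(x_train)

mean_at_train, var_at_train = kriging_predict(x_train, y_train, x_train)
mean_far, var_far = kriging_predict(x_train, y_train, np.array([50.0]))
```

The predictor interpolates the training data with near-zero variance, while far from the data it reverts to the zero trend with variance close to one. This built-in variance estimate is what distinguishes Kriging from a purely deterministic regression.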
Aside from application-oriented papers, several major studies compared the performance of surrogate models, including PCE (Sudret, 2008) and Kriging. Hadigol and Doostan (2018) provided a state-of-the-art review on least-squares PCE with various sampling strategies. Luthen et al. (2021) provided a comprehensive survey on sparse PCE methods, including benchmark problems. Torre et al. (2019) discussed the PCE as a machine learning regression and its application for big data analytics. Grasedyck et al. (2013) and Kishore Kumar and Schneider (2017) discussed various low-rank approximation techniques. None of these papers have discussed the uncertainty in the surrogate models using PCE, Kriging, or LRA.
According to the above-discussed literature review, the application of PCE, Kriging, and LRA is immature in material science and specifically concrete strength prediction. Therefore, this paper aims to address this concern using a pilot database of high-performance concrete with over one thousand experiments. Our novelties and contributions can be summarized as follows: (1) the first application of PCE and LRA in material science with multiple solution algorithms, (2) the most comprehensive application of the Kriging method in concrete strength prediction, which accounts for up to five surrogate modeling uncertainties (e.g., trend type, correlation family and type, and optimization method), (3) evaluating the sensitivity of the surrogate modeling to various assumptions (e.g., train/test size) using multiple error metrics, (4) separating the impact of various sources of uncertainty in surrogate modeling of the pilot database, (5) developing the first-ever probabilistic model (based on the generalized extreme value distribution) for error metrics of the pilot database and validating the Kriging results, and (6) providing a comprehensive discussion on probabilistic multi-criteria decision-making methods for selection of the optimal regression model for the pilot database.
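To illustrate how a generalized extreme value (GEV) model of an error metric can rank a new model against published ones, the sketch below evaluates the GEV cumulative distribution function. The parameter values and the choice of RMSE as the metric are hypothetical placeholders, not values fitted in this paper.

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """CDF of the GEV distribution (location mu, scale sigma > 0, shape xi)."""
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:  # Gumbel limit as xi -> 0
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:  # x lies outside the support
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


# Hypothetical GEV parameters for a population of published RMSE values.
# For a lower-is-better metric, the share of models with a larger error
# than a candidate value is 1 - F(value).
share_outperformed = 1.0 - gev_cdf(4.0, mu=6.0, sigma=2.0, xi=0.1)
```

A statement such as "outperforms 80%–90% of the existing predictive models" corresponds to this share falling between 0.8 and 0.9 for the chosen metric.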
Section 2 provides a short and high-level review of the theoretical underpinning of the applied surrogate models. Section 3 describes the database and the basic relationship among the input parameters. Section 4.1 provides a comprehensive discussion on various aspects of three surrogate models, while Section 4.2 discusses the probabilistic aspects of predictive surrogate models. Section 4.3 incorporates the randomness in surrogate modeling, and finally, several multi-criteria decision-making metrics are discussed in Section 5.


Underpinning theory of surrogate models

Surrogate models (i.e., the meta-models) are models of the model and are used to approximate the actual response of the models (either analytical or numerical). The commonly used meta-models are PCE (Xiu and Karniadakis, 2002, Blatman and Sudret, 2011), Kriging (Krige, 1951, Sacks et al., 1989), canonical low-rank approximations (Chevreuil et al., 2015, Konakli and Sudret, 2016), and high-dimensional model representation (Chowdhury and Rao, 2009, Liu et al., 2016). This section provides a

Pilot database

For the pilot study, the experimental database originally collected by Yeh (1998a, 1998b) and extended by many other research labs (Chang et al., 1996, Giaccio et al., 1992, Langley et al., 1989, Lessard et al., 1993) is used. This database is publicly available on the University of California Irvine machine learning repository website. This database includes 1030 samples from high-performance concrete (HPC), a popular construction material because of its high workability, strength, and

Anatomy of surrogate models

Fig. 3 presents the developed PCE surrogate models using three techniques: LAR, OMP, and BCS. All the simulations are performed using the UQLAB (Marelli and Sudret, 2014) and Matlab (MATLAB, 2021). In each case, the predicted output, Y_PCE, is plotted versus the initial experimental data, Y_EXP. For all the surrogate models, 20% of the data is used for testing, while 80% is used to train the meta-model. However, the plots are based on the entire database (to visualize better and save page space). For
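The 80%/20% train/test protocol described above can be sketched as follows. This is an illustrative outline only: the helper names, the random seed, and the choice of RMSE and R² as example error metrics are assumptions, and integer indices stand in for the 1030 database samples.

```python
import math
import random


def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the samples and hold out `test_fraction` for testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Indices stand in for the 1030 samples of the pilot database.
train_idx, test_idx = train_test_split(range(1030))
```

With 1030 samples this yields 824 training and 206 testing samples. Repeating the split with different seeds is one simple way to expose the sensitivity of a surrogate model to the sampling.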

Discussion: Multi-criteria decision making

Section 4.2 discussed the probabilistic evaluation of the surrogate models using different error metrics. For the current example, i.e., the HPC pilot database, nearly all the error metrics yield a similar conclusion. However, this is not the case for all databases, and therefore a generalized evaluation metric should be discussed. We discuss multiple methods or composite metrics that can be used for such an evaluation. Other researchers have already used these techniques; however, there is no

Conclusions

In this paper, several surrogate models were trained to investigate their capability to predict the mechanical properties of materials (in our case, the compressive strength of high-performance concrete). The database used in this paper includes over a thousand experiments and eight highly nonlinear input variables. For this purpose, three methods, including the polynomial chaos expansion, Kriging (i.e., Gaussian process), and the canonical low-rank approximation were used to find the optimal

CRediT authorship contribution statement

Mohammad Amin Hariri-Ardebili: Supervised the project.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (120)

  • Bui D.-K. et al. A modified firefly algorithm-artificial neural network expert system for predicting compressive and tensile strength of high-performance concrete. Constr. Build. Mater. (2018)
  • Castelli M. et al. Prediction of high performance concrete strength using genetic programming with geometric semantic genetic operators. Expert Syst. Appl. (2013)
  • Cheng M.-Y. et al. High-performance concrete compressive strength prediction using time-weighted evolutionary fuzzy support vector machines inference model. Autom. Constr. (2012)
  • Cheng M.-Y. et al. High-performance concrete compressive strength prediction using Genetic Weighted Pyramid Operation Tree (GWPOT). Eng. Appl. Artif. Intell. (2014)
  • Chou J.-S. et al. Enhanced artificial intelligence for ensemble approach to predicting high performance concrete compressive strength. Constr. Build. Mater. (2013)
  • Chou J.-S. et al. Machine learning in concrete strength simulations: Multi-nation data analytics. Constr. Build. Mater. (2014)
  • Chowdhury R. et al. Assessment of high dimensional model representation techniques for reliability analysis. Probab. Eng. Mech. (2009)
  • Demir F. A new way of prediction elastic modulus of normal and high strength concrete—fuzzy logic. Cem. Concr. Res. (2005)
  • DeRousseau M. et al. Computational design optimization of concrete mixtures: A review. Cem. Concr. Res. (2018)
  • DeRousseau M. et al. A comparison of machine learning methods for predicting the compressive strength of field-placed concrete. Constr. Build. Mater. (2019)
  • Erdal H.I. Two-level and hybrid ensembles of decision trees for high performance concrete compressive strength prediction. Eng. Appl. Artif. Intell. (2013)
  • Erdal H.I. et al. High performance concrete compressive strength forecasting using ensemble models based on discrete wavelet transform. Eng. Appl. Artif. Intell. (2013)
  • Falcone R. et al. Soft computing techniques in structural and earthquake engineering: a literature review. Eng. Struct. (2020)
  • Feng D.-C. et al. Machine learning-based compressive strength prediction for concrete: An adaptive boosting approach. Constr. Build. Mater. (2020)
  • Gkioulekas I. et al. Piecewise regression analysis through information criteria using mathematical programming. Expert Syst. Appl. (2019)
  • Golafshani E.M. et al. Predicting the compressive strength of normal and High-Performance Concretes using ANN and ANFIS hybridized with Grey Wolf Optimizer. Constr. Build. Mater. (2020)
  • Goldberg D.E. et al. A comparative analysis of selection schemes used in genetic algorithms.
  • Gurevich P. et al. Pairing an arbitrary regressor with an artificial neural network estimating aleatoric uncertainty. Neurocomputing (2019)
  • Gurevich P. et al. Gradient conjugate priors and multi-layer neural networks. Artificial Intelligence (2020)
  • Hadigol M. et al. Least squares polynomial chaos expansion: A review of sampling strategies. Comput. Methods Appl. Mech. Engrg. (2018)
  • Han Q. et al. A generalized method to predict the compressive strength of high-performance concrete by improved random forest algorithm. Constr. Build. Mater. (2019)
  • Hariri-Ardebili M.A. et al. Machine learning-aided PSDM for dams with stochastic ground motions. Adv. Eng. Inform. (2022)
  • Hariri-Ardebili M. et al. Polynomial chaos expansion for uncertainty quantification of dam engineering problems. Eng. Struct. (2020)
  • Ke X. et al. A Bayesian machine learning approach for inverse prediction of high-performance concrete ingredients with targeted performance. Constr. Build. Mater. (2021)
  • Kim J.-K. et al. Effect of temperature and aging on the mechanical properties of concrete: Part II. Prediction model. Cem. Concr. Res. (2002)
  • Konakli K. et al. Polynomial meta-models with canonical low-rank approximations: Numerical insights and comparison to sparse polynomial chaos expansions. J. Comput. Phys. (2016)
  • Lim C.-H. et al. Genetic algorithm in mix proportioning of high-performance concrete. Cem. Concr. Res. (2004)
  • Liu Y. et al. Accurate construction of high dimensional model representation with applications to uncertainty quantification. Reliab. Eng. Syst. Saf. (2016)
  • Mohammadi B. et al. Implementation of hybrid particle swarm optimization-differential evolution algorithms coupled with multi-layer perceptron for suspended sediment load estimation. Catena (2021)
  • Mousavi S.M. et al. A new predictive model for compressive strength of HPC using gene expression programming. Adv. Eng. Softw. (2012)
  • Ni H.-G. et al. Prediction of compressive strength of concrete by neural networks. Cem. Concr. Res. (2000)
  • Nunez I. et al. Estimating compressive strength of modern concrete mixtures using computational intelligence: A systematic review. Constr. Build. Mater. (2021)
  • Stone R. Improved statistical procedure for the evaluation of solar radiation estimation models. Sol. Energy (1993)
  • Sudret B. Global sensitivity analysis using polynomial chaos expansions. Reliab. Eng. Syst. Saf. (2008)
  • Torre E. et al. Data-driven polynomial chaos expansion for machine learning regression. J. Comput. Phys. (2019)
  • Vakharia V. et al. Prediction of compressive strength and portland cement composition using cross-validation and feature ranking techniques. Constr. Build. Mater. (2019)
  • Aalimahmoody N. et al. BAT algorithm-based ANN to predict the compressive strength of concrete—A comparative study. Infrastructures (2021)
  • Babos S. et al. Sliced inverse median difference regression. Stat. Methods Appl. (2020)
  • Berveiller M. et al. Updating the long-term creep strains in concrete containment vessels by using Markov chain Monte Carlo simulation and polynomial chaos expansions. Struct. Infrastruct. Eng. (2012)
  • Biau G. et al. Neural random forests. Sankhya A (2019)
1 Contributed equally to all stages of developing this manuscript.