ENERGY ABSORPTION PREDICTION IN SILANE-TREATED SURFACE USING MACHINE LEARNING by GROUP 4

Presented by: Amsyar & Anwar From Group 4

RESEARCH BACKGROUND

Composite materials are increasingly utilized in many engineering applications because they offer several enhanced properties and various advantages.

The Boeing 787 is 50% composite by weight and 80% by volume. The plane is also 20% aluminum, 15% titanium, 10% steel, and 5% other materials. Boeing benefits from this structure through substantial weight savings: with composites making up the majority of the structure, the total weight is cut by an average of 20% (Sumit Singh, Dec 2022).

Figure 1: Aircraft B787 Composition

This study uses glass fiber to represent synthetic fibers and flax fiber to represent natural fibers. Each fiber has its own characteristics, such as strength, durability, and environmental friendliness. In a hybrid composite, combining the fibers with a resin or adhesive to create a strong bond is crucial.

Silane treatment is a common surface modification technique used to improve the properties of hybrid composites. The energy absorption of hybrid composites is an important factor to consider in many applications, including the aerospace and automotive industries.

WHAT IS SILANE

Figure 2: Silane, (3-Aminopropyl)trimethoxysilane, vs. Figure 3: Scene from the film 3 Idiots

On the right is a scene from the film 3 Idiots, and on the left is the silane (3-Aminopropyl)trimethoxysilane. The similarity between the two pictures is that Ranchodas acts like an adhesive that strengthens the bond of friendship, while silane works as a bridging agent between the inorganic and organic substrates and increases the bond strength. After silane treatment, the flexural strength of a composite typically rises by around 40% (Pape, 2011).

PROBLEM STATEMENT

What are the limitations of current methods for predicting energy absorption in silane-treated hybrid composites, and how can these limitations be addressed through the development of a machine-learning model?
What data will be used to train the machine-learning model for predicting energy absorption in silane-treated hybrid composites, and how accurate will the model be in predicting energy absorption at new or unseen silane concentrations?
In what ways will the development of a machine-learning model for predicting energy absorption in silane-treated hybrid composites provide a more efficient and cost-effective method for optimizing the performance of these composites in various applications?
Which algorithm is most suitable for the machine-learning model to achieve the best prediction result?

OBJECTIVE

Replace manual testing of each concentration with machine learning as a first step before conducting physical experiments.
Train the model on existing data attributes and use the results to predict energy absorption for new data.
Build a machine learning model to predict energy absorption in a silane-treated surface, so that prediction results can be obtained without conducting physical experiments first.
Compare the performance of the regression algorithm with other algorithms to determine the best model.

TYPE OF MACHINE LEARNING TO BE USED

The type of machine learning to be used is regression, a form of supervised learning. It was chosen because the dataset is numerical, and regression analysis always involves a numeric dependent variable (Van den Berg, S. M.).

LIST OF ABBREVIATIONS

CV: Cross-Validation
DT: Decision Tree
EA: Energy Absorption
F: Force
LS: Lasso Regression
LR: Linear Regression
MAE: Mean Absolute Error
MSE: Mean Squared Error
OLS: Ordinary Least Squares
R2: R-Squared
RF: Random Forest
RMSE: Root Mean Square Error
RR: Ridge Regression
X: Independent Variable/Feature (Attribute)
Y: Dependent Variable/Target

DATA SET DESCRIPTION

The dataset was obtained from the Aerospace Department at UniKL. There are five samples with different silane concentrations (0%, 2%, 6%, and 8%), and each dataset records the energy absorption at its concentration together with the force and stroke applied over time. Each dataset contains 5 columns (features) and more than 500 rows. Refer to Table 1.

Table 1: Dataset Sample List

The target is identified from Table 2. The target’s column name is “Energy_Absorption”.

Table 2: Sample of the dataset for 0% concentration of silane-treated surface.

The dataset is checked for its shape, its column information, and any missing values in the sample.

Figure 4: Check Data Shape, Info & Missing Value
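The checks described above can be sketched with pandas as follows. This is only an illustration: the data frame is synthetic, the column names are taken from the figure captions in this report, and the numeric values are placeholders rather than the UniKL measurements.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for one concentration's dataset (placeholder values only).
rng = np.random.default_rng(0)
n = 600
time = np.linspace(0.0, 60.0, n)
df = pd.DataFrame({
    "Time": time,
    "Stroke": 0.5 * time,
    "Force": 2.0 * time + rng.normal(0.0, 0.5, n),
})
df["Energy_Absorption"] = 2.4 * df["Force"] + rng.normal(0.0, 0.2, n)
df["Total_Energy_Absorption"] = df["Energy_Absorption"].cumsum()

print(df.shape)                # (number of rows, number of columns)
df.info()                      # column names, dtypes, and non-null counts
missing = df.isnull().sum()    # missing values per column
print(missing)
```

If any column reported a non-zero count in `missing`, the affected rows would need to be dropped or imputed before modelling.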

DATA ANALYSIS

Check Correlation between Features

Checking correlations in the dataset identifies which variables are most strongly related to each other.

Table 3: The value of correlations of each dataset

Figure 4: Heatmap of the value of the correlation of each dataset

Based on Table 3 and the heatmap in Figure 4, the feature Force has the highest correlation with the target, Energy_Absorption.
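A correlation check of this kind might look like the sketch below. It assumes a pandas data frame with the report's column names; the toy data are generated so that Force drives the target, which is an assumption made purely for illustration.

```python
import numpy as np
import pandas as pd

# Toy data: Force is built to drive the target, Stroke is unrelated noise.
rng = np.random.default_rng(1)
n = 500
force = rng.uniform(0.0, 100.0, n)
df = pd.DataFrame({
    "Force": force,
    "Stroke": rng.uniform(0.0, 30.0, n),
    "Energy_Absorption": 2.4 * force + rng.normal(0.0, 1.0, n),
})

corr = df.corr()  # Pearson correlation matrix, the values a heatmap would display
target_corr = corr["Energy_Absorption"].drop("Energy_Absorption")
strongest = target_corr.abs().idxmax()  # feature most correlated with the target
print(target_corr)
print(strongest)
```

The same matrix can be passed to a plotting library to draw the heatmap.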

Visualization of Features vs Target

Visualizing the relationship between features and target variables is an important step in data analysis, as it helps to identify patterns and relationships in the data.

Figure 5: Force vs Energy_Absorption

Figure 6: Force vs Total_Energy_Absorption

Figure 7: Stroke vs Energy_Absorption

Figure 8: Stroke vs Total_Energy_Absorption

Figure 9: Time vs Energy_Absorption

Figure 10: Time vs Total_Energy_Absorption

Based on the graphs above, Figure 5 shows a direct linear relationship between Force and Energy_Absorption.

Feature Selection (Using Ridge Regression)

We perform feature selection with RidgeCV, using 5-fold cross-validation and alpha values of 0.01, 0.1, 1, 10, and 100. We use Ridge regression because it shrinks the coefficients of less informative features toward zero without discarding any feature outright. Unlike Lasso, it does not set coefficients exactly to zero, so features are ranked by the magnitude of their coefficients. Verma, Y. (2022, April 5) also notes that Ridge regression is popular because it uses regularization when making predictions, and regularization is intended to resolve the problem of overfitting.

Table 4: Feature Selection Using RidgeCV

From Table 4, the coefficient of Force is 2.4460, while that of Stroke is −0.0080 and that of Time is 0.0091. We chose Force as our feature because its coefficient is by far the largest in magnitude, and the sign and relative magnitude of the coefficients indicate each feature's impact on the target variable.
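The ranking step can be sketched with scikit-learn's RidgeCV as follows. The alpha grid and the 5-fold setting follow the text. The data are synthetic, and the features are drawn on a common scale, because coefficient magnitudes are only comparable when the features are scaled alike.

```python
import numpy as np
from sklearn.linear_model import RidgeCV

# Synthetic features on a common scale; the columns stand in for Force, Stroke, Time.
rng = np.random.default_rng(2)
n = 500
X = rng.normal(size=(n, 3))
y = 2.4 * X[:, 0] + 0.01 * X[:, 1] + rng.normal(0.0, 0.1, n)

# 5-fold cross-validation over the alpha grid named in the text.
model = RidgeCV(alphas=[0.01, 0.1, 1, 10, 100], cv=5).fit(X, y)

names = ["Force", "Stroke", "Time"]
# Rank features by the absolute value of their fitted coefficient.
ranking = sorted(zip(names, model.coef_), key=lambda p: abs(p[1]), reverse=True)
print("chosen alpha:", model.alpha_)
print(ranking)
```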

MACHINE LEARNING

Table 5: Summary of statistics for each feature.

Linear Regression

Linear regression is a statistical method used to model the relationship between a dependent variable (target) (Y) and one or more independent variables (features) (X).

Figure 11: Force vs Energy_Absorption

This model uses one feature and one target, so linear regression is suitable because it provides a simple and interpretable model for predicting the target from the feature. Here X is Force and Y is Energy_Absorption.
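With a single feature, the model reduces to simple linear regression. As a reminder of the standard textbook form, the fitted line and its ordinary least squares estimates can be written as below. The symbols \beta_0, \beta_1, \varepsilon, and the sample index i are notation introduced here for illustration and do not come from the project files.

```latex
Y = \beta_0 + \beta_1 X + \varepsilon,
\qquad
\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
```

Here x_i is a Force reading, y_i is the corresponding Energy_Absorption value, and the bars denote sample means.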

Train-Test Split for Evaluating Machine Learning Algorithms

In 1997, a new method was discussed in the paper "A scaling law for the validation-set training-set size ratio" (Guyon).

The "train_test_split" function is used to split the data into 80% training data and 20% testing data. According to Detective, T. D. (2020, January 31), the 80/20 split produced a 2% increase in accuracy and improvements in precision, recall, and F1-score when the author tried to verify it based on Guyon's paper.

That is why we chose the 80/20 train-test split.
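A minimal sketch of the 80/20 split with scikit-learn is shown below. The arrays are placeholders for Force and Energy_Absorption, and `random_state=42` is an assumption added only to make the split repeatable; the report does not state which seed, if any, was used.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(500, dtype=float).reshape(-1, 1)  # placeholder for the Force column
y = 2.4 * X.ravel()                             # placeholder for Energy_Absorption

# test_size=0.2 keeps 80% of the rows for training and 20% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)
```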

Outlier Check

Checking for outliers is an important step in the analysis of a linear regression model. Outliers are data points that fall outside of the expected range of values and can significantly affect the results of the regression analysis.

Figure 12: Box plot of Force and Energy Absorption

Based on Figure 12, no outliers are observed, which suggests that the dataset is relatively consistent in its values.
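A box plot conventionally flags points lying more than 1.5 times the interquartile range beyond the quartiles. The sketch below applies that rule numerically. The helper `iqr_outliers` and the sample values are hypothetical and only illustrate the check; they are not the project's code or data.

```python
import numpy as np

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return values[(values < lower) | (values > upper)]

force = np.linspace(0.0, 100.0, 501)        # evenly spread placeholder readings
clean = iqr_outliers(force)                 # nothing is flagged
spiked = iqr_outliers(np.append(force, 400.0))  # one injected extreme reading
print(clean, spiked)
```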

LINEAR REGRESSION MODEL

Figure 13: OLS Result for Linear Regression Model.

The 'linear_model.LinearRegression()' estimator fits a linear regression model to the training data by finding the coefficients of the linear equation that best fits the data. These coefficients are then used to make predictions on the test data.

However, note [2] in Figure 13 indicates that strong multicollinearity is expected.

Figure 14: Performance Metric for Linear Model

Based on Figure 14, an R-squared value of 0.999 indicates a near-perfect fit between the model and the data. Overall, the low MAE, MSE, and RMSE indicate that the model is making accurate predictions.
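The fit and the four metrics can be reproduced in outline as follows. This is a sketch on synthetic data, assuming a roughly linear Force to Energy_Absorption relationship, so the printed numbers will not match Figure 14.

```python
import numpy as np
from sklearn import linear_model
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic placeholder data with a linear trend plus noise.
rng = np.random.default_rng(3)
force = rng.uniform(0.0, 100.0, 500).reshape(-1, 1)
energy = 2.4 * force.ravel() + rng.normal(0.0, 1.0, 500)

X_train, X_test, y_train, y_test = train_test_split(
    force, energy, test_size=0.2, random_state=42
)

lr = linear_model.LinearRegression().fit(X_train, y_train)
y_pred = lr.predict(X_test)

mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)                 # RMSE is the square root of MSE
r2 = r2_score(y_test, y_pred)
print(lr.coef_[0], lr.intercept_, mae, mse, rmse, r2)
```

Computing the metrics on the held-out 20% rather than on the training rows is what makes them a check on generalization.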

Figure 15: Graph of Actual vs Predicted Values LR

Next, the 'predict' method is used to generate predictions for the testing data, which can be compared to the actual target values to evaluate the performance of the model.

Figure 16: Actual VS Fitted Values using LR

As Figure 16 shows, the fitted values follow the actual values very closely, which can be a sign of overfitting. In addition, note [2] of the OLS result indicates that strong multicollinearity is expected.

We cannot yet confirm whether this model is reliable, so we first compare it with other algorithms.

MODEL COMPARISON USING DT AND RF

Decision Tree (DT) Regressor

A decision tree is a type of machine-learning algorithm used for both classification and regression tasks. The algorithm builds a tree-like model by recursively splitting the data into smaller subsets based on the feature values that lead to the best separation of the target variable.

Figure 17: OLS Result for Decision Tree Model

Based on Figure 17, the decision tree model has an R-squared value of 0.9999968398481558, which indicates that the model explains nearly all of the variation in the response variable. However, the R-squared value can be somewhat misleading for decision tree models, since they do not make the same assumptions as linear regression models and are not as easily interpreted in terms of the strength and direction of the relationship between the predictor and response variables.
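As a rough illustration, a decision tree regressor can be fitted on the same kind of single-feature data as below. The data are synthetic and the default tree settings are an assumption, since the report does not list the hyperparameters used. The last lines show one practical consequence of the splitting scheme: a tree predicts the mean of the training targets in a leaf, so it cannot extrapolate beyond the range it was trained on.

```python
import numpy as np
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic placeholder data with a linear trend plus noise.
rng = np.random.default_rng(4)
force = rng.uniform(0.0, 100.0, 500).reshape(-1, 1)
energy = 2.4 * force.ravel() + rng.normal(0.0, 1.0, 500)

X_train, X_test, y_train, y_test = train_test_split(
    force, energy, test_size=0.2, random_state=42
)

dt = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)
r2_dt = r2_score(y_test, dt.predict(X_test))

# A force far outside the training range still maps to an existing leaf value.
far_prediction = dt.predict(np.array([[1000.0]]))[0]
print(r2_dt, far_prediction, y_train.max())
```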

Random Forest (RF) Regressor

Random forest is a type of ensemble learning algorithm that combines multiple decision trees to make more accurate predictions. In a random forest, multiple decision trees are built on different subsets of the data and features. During prediction, each decision tree in the forest independently predicts the outcome, and the final prediction is the average (for regression problems) or majority vote (for classification problems) of the predictions from all the decision trees.

Figure 18: OLS Result for Random Forest Model

In the random forest model, the R2 value is 0.9999684505196771, which indicates that the model explains a very high proportion of the variance in the response variable.
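The averaging behaviour described above can be checked directly in scikit-learn, as in this sketch. The data are synthetic, and `n_estimators=50` is an arbitrary choice for the example rather than the project's setting.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic placeholder data with a linear trend plus noise.
rng = np.random.default_rng(5)
force = rng.uniform(0.0, 100.0, 500).reshape(-1, 1)
energy = 2.4 * force.ravel() + rng.normal(0.0, 1.0, 500)

X_train, X_test, y_train, y_test = train_test_split(
    force, energy, test_size=0.2, random_state=42
)

rf = RandomForestRegressor(n_estimators=50, random_state=0).fit(X_train, y_train)
forest_pred = rf.predict(X_test)

# For regression, the forest output is the mean of the individual tree outputs.
tree_mean = np.mean([tree.predict(X_test) for tree in rf.estimators_], axis=0)
r2_rf = r2_score(y_test, forest_pred)
print(r2_rf, np.allclose(forest_pred, tree_mean))
```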

REGULARIZATION TECHNIQUES

The linear regression model showed signs of multicollinearity, and one way to address multicollinearity is to use regularization techniques.

As Andrea Perlato (n.d.) explains, to reduce multicollinearity we can use regularization, which means keeping all the features but reducing the magnitude of the model's coefficients. This is a good solution when each predictor contributes to predicting the dependent variable.

Two techniques can be applied: Ridge and Lasso regularization.

Ridge's limitation is that it is not good for feature reduction, while Lasso is not well suited to two or more highly collinear variables. In this case, we use Ridge to handle multicollinearity.

However, we also apply Lasso regularization to see whether it can outperform Ridge.

Comparing Coefficient between Ridge and Lasso

Figure 19: Coefficient of Lasso and Ridge

In Figure 19, the Ridge coefficient (0.0033327) is slightly higher than the Lasso coefficient (0.00333251).
Based on this result, we continue with Ridge Regression and tune its hyperparameter.
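The comparison can be sketched as follows on synthetic single-feature data. Both penalties shrink the coefficient below the unregularized estimate. With scikit-learn's default `alpha=1.0` (an assumption here, since the report does not state the alpha used for Figure 19), Lasso shrinks this toy coefficient slightly more than Ridge does.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge

# Synthetic placeholder data: one feature with a positive linear effect.
rng = np.random.default_rng(6)
force = rng.uniform(0.0, 100.0, 500).reshape(-1, 1)
energy = 2.4 * force.ravel() + rng.normal(0.0, 1.0, 500)

ols = LinearRegression().fit(force, energy)
ridge = Ridge(alpha=1.0).fit(force, energy)   # L2 penalty on the coefficient
lasso = Lasso(alpha=1.0).fit(force, energy)   # L1 penalty on the coefficient

print("OLS  :", ols.coef_[0])
print("Ridge:", ridge.coef_[0])
print("Lasso:", lasso.coef_[0])
```

The gap between the three values is small because a single, strongly predictive feature leaves the penalty little to act on.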

Ridge Regression (RR)

Figure 20: OLS Result for Ridge Regression Model

Based on Figure 20, after applying Ridge regularization there is no clearly visible difference in the R-squared value.

Figure 21: Actual VS Fitted Value using RR

However, the plot in Figure 21 shows that the fitted values follow the actual values without overfitting. Some points deviate from the original curve, but the model is still quite reliable.

COMPARING LR AND RR

Figure 22 shows the model before (LR) and after (RR) applying regularization.

Figure 22: (Before) LR VS RR (After)

TUNED HYPERPARAMETER

Table 6: RR Parameter, CV, Result R2

We tune the Ridge model using 5-fold cross-validation, comparing alpha values of 0.01, 0.1, 1, 10, and 100. Based on Figure 23, the best alpha is 0.01, and the tuned R-squared score is 0.9998777.

Figure 23: Untuned RR VS Tuned RR

So do we really need to tune?

Hyperparameter tuning is an essential part of controlling the behavior of a machine-learning model. If we don’t correctly tune our hyperparameters, our estimated model parameters produce suboptimal results, as they don’t minimize the loss function. This means our model makes more errors.
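One way to run such a search is scikit-learn's GridSearchCV, sketched below with the alpha grid and 5-fold setting from the text. Using GridSearchCV with R-squared scoring is an assumption about the tooling, and the data are synthetic, so the selected alpha and score will differ from Table 6.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic placeholder data with a linear trend plus noise.
rng = np.random.default_rng(7)
force = rng.uniform(0.0, 100.0, 500).reshape(-1, 1)
energy = 2.4 * force.ravel() + rng.normal(0.0, 1.0, 500)

alphas = [0.01, 0.1, 1, 10, 100]
search = GridSearchCV(
    Ridge(),
    param_grid={"alpha": alphas},
    cv=5,            # 5-fold cross-validation
    scoring="r2",    # rank candidates by mean R-squared across the folds
)
search.fit(force, energy)

best_alpha = search.best_params_["alpha"]
best_r2 = search.best_score_
print(best_alpha, best_r2)
```

After the search, `search.best_estimator_` holds a Ridge model refitted on all the supplied data with the selected alpha.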

Lasso Regression (LS)

Figure 24: OLS Result for Lasso Regression Model

For Lasso Regression, Figure 24 shows an R-squared of 0.99987082, which is also close to 1.

So are there any differences?

COMPARING SKEWNESS AND KURTOSIS

The skewness for a normal distribution is "0" (zero), and any symmetric data should have a skewness near zero. Negative values for the skewness indicate data that are skewed left and positive values for the skewness indicate data that are skewed right.

Kurtosis is a measure of whether the data are heavy-tailed or light-tailed relative to a normal distribution.

What is the suitable range?

Both skewness and kurtosis can be analyzed through descriptive statistics. Acceptable values of skewness fall between −3 and +3, and kurtosis is appropriate in a range of −10 to +10 when utilizing SEM (Brown, 2006).

If the skewness and excess kurtosis are close to 0, a normal distribution is often assumed. Based on the figure.. Ridge regression and Lasso regression show a balance between skewness and kurtosis.

Table 7: Skewness and Kurtosis from OLS Result.
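Both statistics can be computed with SciPy, as sketched below on simulated residuals (normal noise standing in for a model's residuals, which is an assumption made for illustration). Note that two kurtosis conventions exist: the Fisher or "excess" definition gives 0 for a normal distribution, while the Pearson definition gives 3, so it is worth checking which one a given report uses before comparing values.

```python
import numpy as np
from scipy.stats import kurtosis, skew

# Simulated residuals drawn from a normal distribution.
rng = np.random.default_rng(8)
residuals = rng.normal(0.0, 1.0, 5000)

s = skew(residuals)                              # about 0 for symmetric data
k_excess = kurtosis(residuals)                   # Fisher definition: normal gives about 0
k_pearson = kurtosis(residuals, fisher=False)    # Pearson definition: normal gives about 3
print(s, k_excess, k_pearson)
```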

COMPARING ALL PERFORMANCE METRICS WITH 5 SAMPLES

For all samples, Linear and Ridge regression give the same results on every performance metric. However, linear regression cannot handle multicollinearity, so although the metric values are the same, the behavior of the two models differs. Refer to Table 8.

Table 8: Performance Metric of All Samples with All Models

CONCLUSION

We conclude that Ridge Regression is the best algorithm for building the machine-learning model, compared with Linear Regression, the Decision Tree Regressor, and the Random Forest Regressor.

First, when we plotted the actual vs. fitted values, linear regression tended to overfit, while Ridge Regression showed no overfitting.

Second, across the samples, only Ridge and Lasso Regression tend to be balanced in skewness and kurtosis.

Third, Ridge and Linear Regression score almost the same on the performance metrics, but linear regression cannot handle multicollinearity.

Lastly, Ridge Regression ranks as the most balanced and suitable result overall, which is why we conclude that it is the best algorithm for this project.

FUTURE RECOMMENDATIONS

Repeat the analysis on a new target, Total_Energy_Absorption, which has more outliers to handle.
Perform multiple linear regression of the target on Force and Stroke to see the result.
Use a time-series machine learning model, since the data include a time variable.

REFERENCES

Why the Boeing 787 & Airbus A350 Are Built With Composite Materials. (2021, September 2). https://simpleflying.com/787-a350-composite/

Pape, P. G. (2011). Adhesion promoters. In Applied Plastics Engineering Handbook. Elsevier. https://doi.org/10.1016/b978-1-4377-3514-7.10029-7

van den Berg, S. M. (n.d.). Chapter 6 Categorical predictor variables | Analysing Data using Linear Models. https://bookdown.org/pingapang9/linear_models_bookdown/chap-categorical.html

Feature Selection for Ridge Regression with Provable Guarantees. (n.d.). IEEE Xplore. https://ieeexplore.ieee.org/abstract/document/7439920

Verma, Y. (2022, April 5). A hands-on guide to ridge regression for feature selection. Analytics India Magazine. https://analyticsindiamag.com/a-hands-on-guide-to-ridge-regression-for-feature-selection/

Detective, T. D. (2020, January 31). Finally: Why We Use an 80/20 Split for Training and Test Data Plus an Alternative Method (Oh Yes. . .). Medium. https://towardsdatascience.com/finally-why-we-use-an-80-20-split-for-training-and-test-data-plus-an-alternative-method-oh-yes-edc77e96295d

Varghese, D. (2019, May 10). Comparative study on Classic Machine learning Algorithms. Medium. https://towardsdatascience.com/comparative-study-on-classic-machine-learning-algorithms-24f9ff6ab222

Perlato, A. (n.d.). Deal Multicollinearity with Ridge Regression. https://www.andreaperlato.com/mlpost/deal-multicollinearity-with-ridge-regression/

What is hyperparameter tuning? (n.d.). Anyscale. https://www.anyscale.com/blog/what-is-hyperparameter-tuning

Chugh, A. (2022, March 16). MAE, MSE, RMSE, Coefficient of Determination, Adjusted R Squared — Which Metric is Better? Medium. https://medium.com/analytics-vidhya/mae-mse-rmse-coefficient-of-determination-adjusted-r-squared-which-metric-is-better-cd0326a5697e

Kurtosis - an overview. (n.d.). ScienceDirect Topics. https://doi.org/10.1016/B978-0-12-396973-6.00010-1

Gawali, S. (2021, May 2). Skewness and Kurtosis: Quick Guide (Updated 2023). Analytics Vidhya. https://www.analyticsvidhya.com/blog/2021/05/shape-of-data-skewness-and-kurtosis/
