Skip to main content

The Materiality of Corporate Governance Report Disclosures: Investigating the Perceptions of External Auditors working in Egypt

Research Authors
Abdelmoneim Bahyeldin Mohamed Metwally
Research Date
Research Department
Research Publisher
Scientific Journal for Financial and Commercial Studies and Researches (SJFCSR)
Research Vol
Vol.3, No.1, Part 1
Research Year
2022

Machine Learning Approaches for Auto Insurance Big Data.

Research Abstract

The growing trend in the number and severity of auto insurance claims creates a need for new methods to efficiently handle these claims. Machine learning (ML) is one of the methods that solves this problem. As car insurers aim to improve their customer service, these companies have started adopting and applying ML to enhance the interpretation and comprehension of their data for efficiency, thus improving their customer service through a better understanding of their needs. This study considers how automotive insurance providers incorporate machinery learning in their company, and explores how ML models can apply to insurance big data. We utilize various ML methods, such as logistic regression, XGBoost, random forest, decision trees, naïve Bayes, and K-NN, to predict claim occurrence. Furthermore, we evaluate and compare these models’ performances. The results showed that RF is better than other methods with the accuracy, kappa, and AUC values of 0.8677, 0.7117, and 0.840, respectively.

Research Authors
Mohamed Hanafy
Research Date
Research Member
Research Website
https://www.mdpi.com/2227-9091/9/2/42

Improving Imbalanced Data Classification in Auto Insurance by the Data Level Approaches.

Research Abstract

Predicting the frequency of insurance claims has become a significant challenge due to the imbalanced datasets since the number of occurring claims is usually significantly lower than the number of non-occurring claims. As a result, classification models tend to have a limited ability to predict the occurrence of claims. So, in this paper, we'll use various data level approaches to try to solve the imbalanced data problem in the insurance industry. We developed 32 machine learning models for predicting insurance claims occurrence {(undersampling, over-sampling, the combination of over-and undersampling (hybrid), and SMOTE)× (three Decision tree models, three boosting models, and two bagging models) = 32}, and we compared the models' accuracies, sensitivities, and specificities to comprehend the prediction performance of the built models. The dataset contains 81628 claims, each of which is a car insurance claim. There were 5714 claims that occurred and 75914 claims that didn't occur. According to the findings, the AdaBoost classifier with oversampling and the hybrid method had the most accurate predictions, with a sensitivity of 92.94%, a specificity of 99.82%, and an accuracy of 99.4%. And with a sensitivity of 92.48%, a specificity of 99.63%, and an accuracy of 99.1%, respectively. This paper confirmed that when analyzing imbalanced data, the AdaBoost classifier, whether using oversampling or the hybrid process, could generate more accurate models than other boosting models, Decision tree models, and bagging models.

Research Authors
Mohamed Hanafy
Research Date
Research Member
Research Website
https://thesai.org/Publications/ViewPaper?Volume=12&Issue=6&Code=IJACSA&SerialNo=56

Comparing SMOTE Family Techniques in Predicting Insurance Premium Defaulting using Machine Learning Models

Research Abstract

Default in premium payments impacts significantly on the profitability of the insurance company. Therefore, predicting defaults in advance is very important for insurance companies. Predicting in the insurance sector is one of the most beneficial and important study areas in today's world, thanks to technological advancements. But because of the imbalanced datasets in this industry, predicting insurance premium defaulting becomes a difficult task. Moreover, there is no study that applies and compares different SMOTE family approaches to address the issue of imbalanced data. So, this study aims to compare different SMOTE family approaches. Such as Synthetic Minority Oversampling Technique (MOTE), Safe-level SMOTE (SLS), Relocating Safe-level SMOTE (RSLS), Density-based SMOTE (DBSMOTE), Borderline-SMOTE(BLSMOTE), Adaptive Synthetic Sampling (ADSYN), and Adaptive Neighbor Synthetic (ASN), SMOTE-Tomek, and SMOTE-ENN, to solve the problem of unbalanced data. This study applied a variety of machine learning (ML)classifiers to assess the performance of the SMOTE family in addressing the imbalanced problem. These classifiers including Logistic Regression (LR), CART, C4.5, C5.0, Support Vector Machine (SVM), Random Forest (RF), Bagged CART(BC), AdaBoost (ADA), Stochastic Gradient Boosting, (SGB), XGBOOST(XGB), NAÏVE BAYES, (NB), k-Nearest Neighbors (K-NN), and Neural Networks (NN). Additionally, model validation strategies include Random hold-out. The findings obtained using various assessment measures show that ML algorithms do not perform well with imbalanced data, indicating that the problem of imbalanced data must be addressed. On the other hand, using balanced datasets created by SMOTE family techniques improves the performance of classifiers. Moreover, the Friedman test, a statistical significance test, further confirms that the hybrid SMOTE family methods are better than others, especially the SMOTE -TOMEK, which performs better than other resampling approaches. Moreover, among ML algorithms, the SVM model has produced the best results with the SMOTE- TOMEK.

Research Authors
Mohamed Hanafy
Research Date
Research Member
Research Website
https://thesai.org/Publications/ViewPaper?Volume=12&Issue=9&Code=IJACSA&SerialNo=70

Classification of the Insureds Using Integrated Machine Learning Algorithms: A Comparative Study

Research Abstract

With the growing number of insurance purchasers, the sophisticated claim analysis system has become an imperative must for any insurance firm. Claims Analysis can be utilized to better understand the customer strata and incorporate the findings throughout the insurance policy enrollment, including the underwriting and approval or rejection stages. In recent years machine learning (ML) technologies are increasingly being used to claims Analysis. However, choosing the optimal techniques, whether the features selection techniques, feature discretization techniques, resampling mechanisms, and ML classifiers for insurance decision assistance, is difficult and can harm the quality of claim suggestions. This study aims to develop appropriate decision models by combining binary classification, feature selection, feature discretization, and data resampling techniques. We did Extensive tests on three different datasets …

Research Authors
Mohamed Hanafy, Ruixing Ming
Research Date
Research Journal
Applied Artificial Intelligence
Research Member
Research Publisher
Taylor & Francis
Research Year
2022
Subscribe to