
Hybrid Predictive Modeling for Insurance Premium Retention: Integrating Statistical and AI Techniques.

Research Abstract

This research highlights the critical role of forecasting in the insurance industry and emphasises the premium retention ratio (PRR) as a key internal performance indicator for evaluating insurance company operations. Traditional time series models like ARIMA and Exponential Smoothing face limitations in capturing complex data patterns. To address this, the study proposes a hybrid predictive model that combines statistical time series models (ARIMA, EXP) with advanced AI techniques (ANN, SVR) to enhance PRR prediction accuracy in Egypt's Fire, Marine, and Aviation insurance sectors. Using 80% of data for training (1989-2015) and 20% for testing (2016-2021), the study demonstrates that hybrid models, particularly ARIMA-ANN and EXP-ANN, outperform conventional models. The findings suggest that incorporating ANN into these models significantly improves prediction accuracy. This research offers a novel approach to forecasting in the Egyptian insurance market and provides publicly accessible datasets for further comparative studies across different countries.
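
The train/test division described above is chronological rather than random, which matters for time series. A minimal sketch of such a split follows; the function name and the placeholder values are assumptions for illustration and are not taken from the study's dataset.

```python
def chronological_split(years, values, last_train_year):
    """Split a yearly series into train/test parts without shuffling."""
    pairs = list(zip(years, values))
    train = [(y, v) for y, v in pairs if y <= last_train_year]
    test = [(y, v) for y, v in pairs if y > last_train_year]
    return train, test


# Years covered by the abstract: 1989 to 2021 inclusive (33 annual observations).
years = list(range(1989, 2022))
# Placeholder values only; these are NOT real premium retention ratios.
values = [0.5 for _ in years]

train, test = chronological_split(years, values, last_train_year=2015)
train_share = len(train) / len(years)

print(len(train), len(test), round(train_share, 2))
```

With annual data the split cannot be exactly 80/20: 27 training years out of 33 is about 82%, which is consistent with the rounded figures in the abstract.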

Research Authors
Ahmed Abdelreheem Khalil & Zaiming Liu
Research Journal
International Journal of Computational Science and Engineering
Research Publisher
Inderscience publisher
Research Rank
Web of Science (ESCI Q3), Scopus (Q3)
Research Website
http://dx.doi.org/10.1504/IJCSE.2024.10067258
Research Year
2024

Machine Learning Based Method for Insurance Fraud Detection on Class Imbalance Datasets With Missing Values

Research Abstract

Insurance fraud is a prevalent issue that insurance companies must face, particularly in the realm of automobile insurance. This type of fraud has significant cost implications for insurance firms and can have a long-term impact on pricing strategies and insurance rates. As a result, accurately predicting and detecting insurance fraud has become a crucial challenge for insurers. Fraud datasets are usually imbalanced, as the number of fraudulent instances is much smaller than the number of legitimate instances, and they often contain missing values. Prior research has employed machine learning methods to address class imbalance, but there has been limited effort on handling the class imbalance present in insurance fraud datasets. Moreover, we could not find an overfitting analysis for the relevant predictive models. This paper addresses these two limitations by employing two car insurance company datasets, namely, an Egyptian real-life dataset and a standard dataset. We proposed addressing the missing data and the class imbalance problems with different methods. Then, the predictive models were trained on the processed datasets to predict insurance fraud as a classification problem. The classifiers are evaluated on several evaluation metrics. Moreover, we proposed, to our knowledge, the first overfitting analysis for insurance fraud classifiers. The results show that addressing the class imbalance in the insurance fraud detection dataset has a significant positive effect on the performance of the predictive model, while addressing the problem of missing values has only a slight effect. Moreover, the proposed methods outperform all of the existing methods on the accuracy metric.
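
The abstract does not name the specific balancing methods used. As an illustration of the general idea only, the sketch below shows random oversampling, one common way to balance classes, in plain Python. The function name and the toy data are assumptions, not the authors' implementation.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Candidate rows of this class to draw duplicates from.
        pool = [r for r, l in zip(rows, labels) if l == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data (made up): 8 legitimate claims (label 0) and 2 fraudulent ones (label 1).
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Oversampling is normally applied to the training split only, so that duplicated rows do not leak into the data used for evaluation.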

Research Authors
Ahmed Abdelreheem Khalil, Zaiming Liu, Ahmed Fathalla, Ahmed Ali, and Ahmad Salah
Research Journal
IEEE Access
Research Publisher
IEEE publisher
Research Rank
Web of Science (SCI Q2), Scopus (Q1)
Research Vol
12
Research Website
https://doi.org/10.1109/ACCESS.2024.3468993
Research Year
2024

Enhancing operational efficiency of insurance companies: a fuzzy time series approach to loss ratio forecasting in the Egyptian market

Research Abstract

This article analyses the crucial significance of loss ratio (LR) evaluation in assessing the operational efficiency of insurance companies. Fuzzy time series (FTS) models are effective in modelling non-linear data with uncertainties and provide adequate performance when data availability is limited, which overcomes the shortcomings of the conventional models used in the Egyptian insurance market literature. This paper aims to introduce a comprehensive analysis of the LR of the property and casualty sectors. It also reviews and compares various FTS forecasting techniques to discuss the effectiveness of FTS methods in forecasting LR. The suggested method is based on results from tests done with four models that had Huarng 465 partitions and interval length parts of 5, 10, 50, and 100. The results show that LR prediction improved significantly. Yu and Cheng’s models using a Huarng 465 partition for training data and the Yu model with 100 partitions for testing data had high accuracy and low error.
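
For readers unfamiliar with fuzzy time series, the sketch below illustrates the basic mechanics under simplifying assumptions: the range of the data is cut into equal-width intervals, each observation is mapped to its interval, and a first-order rule base in the style of Chen's method forecasts the next value as the mean midpoint of the intervals that historically followed the current one. It is not the Huarng-based partitioning or the Yu and Cheng models evaluated in the paper, and the numbers are made up.

```python
def make_intervals(lo, hi, k):
    """Cut the range [lo, hi] into k equal-width intervals."""
    width = (hi - lo) / k
    return [(lo + i * width, lo + (i + 1) * width) for i in range(k)]


def fuzzify(x, intervals):
    """Return the index of the interval that contains x (clamped at both ends)."""
    if x <= intervals[0][0]:
        return 0
    for i, (a, b) in enumerate(intervals):
        if a <= x < b:
            return i
    return len(intervals) - 1


def chen_style_forecast(series, intervals):
    """One-step forecast from first-order fuzzy logical relationship groups."""
    states = [fuzzify(x, intervals) for x in series]
    groups = {}
    for cur, nxt in zip(states, states[1:]):
        groups.setdefault(cur, set()).add(nxt)
    mids = [(a + b) / 2 for a, b in intervals]
    last = states[-1]
    # If the last state was never followed by anything, fall back to itself.
    consequents = groups.get(last, {last})
    return sum(mids[j] for j in consequents) / len(consequents)


# Hypothetical loss ratios in percent; not data from the paper.
series = [12, 27, 33, 18, 29]
intervals = make_intervals(10, 40, 3)
forecast = chen_style_forecast(series, intervals)
print(forecast)
```

The number of intervals controls the trade-off between resolution and how many observations support each rule, which is why partitioning choices feature so prominently in FTS comparisons.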

Research Authors
Ahmed Abdelreheem Khalil, Zaiming Liu & Ahmed Abdelwahab Ali
Research Journal
Journal of Business Analytics
Research Publisher
Taylor & Francis publisher
Research Rank
Web of Science (SCI Q3), Scopus (Q2)
Research Vol
7;4
Research Website
https://doi.org/10.1080/2573234X.2024.2393609
Research Year
2024

Precision in Insurance Forecasting: Enhancing Potential with Ensemble and Combination Models based on the Adaptive Neuro-Fuzzy Inference System in the Egyptian Insurance Industry

Research Abstract

Enhancing the precision of retention ratio predictions holds profound significance for insurance industry decision-makers and those vested in advancing insurance services. Precision helps insurance companies navigate inflationary pressures and evaluate underwriting profitability, enabling reliable prognoses of future underwriting gains. As far as we know, although there have been multiple attempts to construct a predictive model for the retention ratio, none of these attempts have used combining models or studied the Egyptian market. Therefore, this study contributes significantly to this developing field by providing combining models, which combine statistical time series models such as Exponential Smoothing (ES) and Autoregressive Integrated Moving Average (ARIMA) with the Adaptive Neuro-Fuzzy Inference System (ANFIS). Two different types of combinations are employed with these models. Furthermore, the study introduces three ensemble models designed for predicting the retention ratio within the Egyptian insurance market. The dataset was carefully gathered from the EFSA’s annual reports, focuses on the property-liability insurance sector within the Egyptian insurance market, and covers the period from 1989 to 2021. Next, the proposed models are assessed using well-established statistical assessment metrics, namely, Mean Absolute Error (MAE), Mean Absolute Percent Error (MAPE), R-squared (R2), and Root Mean Square Error (RMSE). The results show that combining and ensemble methods improve prediction accuracy. A multi-linear regression-based ensemble model that combines ARIMA, ES, and ANFIS models outperforms both single and combined models in robustness. The article concludes that the insurance industry can greatly benefit from modern predictive methods to make sound decisions.
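
The four assessment metrics named above have standard definitions, sketched here in plain Python. The function names and the sample numbers are illustrative assumptions and do not come from the study.

```python
import math


def mae(actual, pred):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, pred)) / len(actual)


def mape(actual, pred):
    """Mean Absolute Percent Error, in percent (assumes no actual value is zero)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, pred)) / len(actual)


def rmse(actual, pred):
    """Root Mean Square Error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual))


def r_squared(actual, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, pred))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Made-up values for illustration only.
actual = [2.0, 4.0, 6.0]
pred = [2.0, 5.0, 5.0]

print(mae(actual, pred), mape(actual, pred), rmse(actual, pred), r_squared(actual, pred))
```

MAE and RMSE are in the units of the series, MAPE is scale-free, and R2 compares the model against a constant-mean baseline, so reporting them together gives complementary views of forecast quality.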

Research Authors
Ahmed Abdelreheem Khalil, Zaiming Liu, and Ahmed Ali
Research Journal
Applied Artificial Intelligence - An International Journal
Research Publisher
Taylor & Francis publisher
Research Rank
Web of Science (SCI Q2), Scopus (Q2)
Research Vol
38;1
Research Website
https://doi.org/10.1080/08839514.2024.2348413
Research Year
2024

The benefits of social insurance system prediction using a hybrid fuzzy time series method

Research Abstract

Decision-making in many industries relies heavily on accurate forecasts, including the insurance sector. The Social Insurance System (SIS) in Egypt, operating under a fully funded paradigm, depends on reliable predictions to ensure effective financial planning. This research introduces a hybrid predictive model that combines fuzzy time series (FTS) Markov chains with the tree partition method (TPM) and difference transformation to forecast total pension benefits within Egypt's SIS. A key feature of the proposed model is its ability to optimize the partitioning process, resulting in the creation of nine intervals that reduce computational complexity while maintaining forecasting accuracy. These intervals were consistently applied across all fuzzy time series models for comparison. The model's performance is evaluated using established metrics such as MAPE, Theil's U statistic, and RMSE. Additionally, prediction interval coverage probability (PICP) and mean prediction interval length (MPIL) are used to assess the quality of prediction intervals, with a 95% prediction interval serving as the baseline. The proposed model achieved a PICP of approximately 95%, indicating well-calibrated prediction intervals, although the MPIL of 424.5 reflects a wider uncertainty range. Despite this, the model balances coverage accuracy and interval precision effectively. The results demonstrate that the proposed model significantly outperforms traditional models like linear regression, ARIMA, and exponential smoothing and conventional FTS models like Song, Chen, Yu, and Cheng by achieving the lowest MAPE, with values of 11.8% for training and 10.65% for testing. This superior performance highlights the model's reliability and potential applicability to further forecasting tasks in the field of insurance and beyond.
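
The two interval-quality measures mentioned above can be computed directly from the interval bounds. The sketch below uses their usual definitions: PICP is the share of observations that fall inside their prediction interval, and MPIL is the average interval width. The function names and the numbers are illustrative assumptions, not values from the paper.

```python
def picp(actual, lower, upper):
    """Prediction interval coverage probability: fraction of actuals inside [lower, upper]."""
    hits = sum(1 for y, lo, hi in zip(actual, lower, upper) if lo <= y <= hi)
    return hits / len(actual)


def mpil(lower, upper):
    """Mean prediction interval length: average width of the intervals."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


# Made-up values for illustration only.
actual = [10, 20, 30, 40]
lower = [8, 18, 31, 35]
upper = [12, 22, 35, 45]

coverage = picp(actual, lower, upper)
mean_width = mpil(lower, upper)
print(coverage, mean_width)
```

For a nominal 95% interval, a PICP near 0.95 indicates good calibration, while MPIL shows how wide the intervals had to be to reach that coverage. The two are read together because very wide intervals achieve high coverage trivially.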

Research Authors
Ahmed Abdelreheem Khalil, Mohamed Abdelaziz Mandour, and Ahmed Ali
Research Journal
PeerJ Computer Science
Research Publisher
PeerJ
Research Rank
Web of Science (SCI Q1), Scopus (Q1)
Research Vol
10
Research Website
https://doi.org/10.7717/peerj-cs.2500
Research Year
2024

Semiparametric change points detection using single index spatial random effects model in environmental epidemiology study

Research Abstract

Environmental health studies often seek to evaluate the mortality-temperature relationship while adjusting for spatially correlated random effects and identifying significant change points in temperature. However, this relationship is often not expressed using parametric models, which makes identifying change points an even more challenging problem. This paper proposes a unified semiparametric approach to simultaneously identify the nonlinear mortality-temperature relationship and detect spatially dependent change points. A unified method is proposed for model estimation, detection of spatially dependent change points, and simultaneous testing of their significance with a permutation-based test. We operate under the assumption that change points remain constant, yet acknowledge the uncertainty regarding their precise number. These change points are influenced by the smoothing of an unknown function, which in turn relies on a smoothing variable and spatial random effects. Consequently, the detection of change points may be influenced by spatial effects. In this paper, several simulation studies are conducted to evaluate the performance of our proposed approach. The advantages of this unified approach are demonstrated using epidemiological data on mortality and temperature.

Research Authors
Hamdy F.F. Mahmoud and Inyoung Kim
Research Journal
PLOS ONE
Research Pages
1-21
Research Publisher
PLOS
Research Rank
Q1
Research Vol
19
Research Website
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0315413
Research Year
2024