The actuarial profession is evolving as we speak, not only in terms of business volumes but also in terms of product innovation. In actuarial terms, the exponential rise in the diversity and complexity of actuarial services is an “undemanding assumption”. Hence, there is a need to implement the best possible coding solutions to your actuarial problems.
The key driver is the simultaneous need for convenience and complexity. In my limited experience, users of actuarial services want sufficient sophistication and accuracy, but that does not change their desire to get outputs at the click of a button. In such situations, efficient use of technology becomes essential to drive innovation. Upskilling is easier for a student if they know exactly what to learn, how to use it, and when to use it: targeted learning is key.
VBA, Python, and R have proved handy for complex modelling. We have tried to list some areas in the actuarial product lifecycle that might involve the use of technology, along with references and sample problems, which might prove to be a starting point for actuarial students looking to upskill during an uncertain lockdown. We have already covered which language you should pick up here; in this article we take a deep dive into how exactly you can do it.
Areas for Coding in Actuarial
The key areas where coding can be used in actuarial work are:
- Collecting and cleaning data
- Calibrating the model parameters
- Stochastic simulations
- Front-ending the model usage (User interface)
Collecting and cleaning data
Data collection is usually one of the preliminary steps in building an actuarial solution. Data usually comes in Excel or .csv format, unless one is dealing with a database framework (SQL, etc.). For simplicity, we begin with data in .csv form. Remember that coding is not easy at first for actuaries with a background other than computer science, but once you have a command of a language, you can perform a number of tasks within seconds or milliseconds, depending on your machine's processor.
Excel might face challenges while dealing with datasets exceeding a certain volume (in my experience, anything above 200k records). These data points might be policyholder records, claim records, or simply stock prices. Shifting them to Python or R proves helpful. The “pandas” library in Python lets you read csv data directly into Python and analyse it with ease. R offers the same functionality in the function “read.csv()”. A good starting exercise would be to import a csv dataset into Python using the “pandas.read_csv()” function.
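As a rough sketch of that starting exercise, the snippet below reads a small csv into a pandas DataFrame and prints a quick summary. The column names and records are made up for illustration, and the data is held in an in-memory string so the example runs on its own; with a real file you would pass its path to “pandas.read_csv()” instead.

```python
import io

import pandas as pd

# Hypothetical policy records, invented for this example.
# With a real dataset, replace io.StringIO(csv_text) with a file path.
csv_text = """policy_id,age,sum_assured,annual_premium
P001,34,500000,12000
P002,45,750000,21000
P003,29,300000,
P004,52,1000000,35000
"""

policies = pd.read_csv(io.StringIO(csv_text))

print(policies.shape)       # number of rows and columns
print(policies.dtypes)      # data type inferred for each column
print(policies.describe())  # basic summary statistics for numeric columns
```

The blank premium for P003 is read in as a missing value (NaN), which is the kind of gap the cleaning step has to deal with.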
Sense-checking such data becomes easier once it is loaded: look for nonsensical values, e.g. negative stock prices (oil prices excepted now). The “SimpleImputer” class in Python's “sklearn.impute” module can help fill missing data points based on pre-defined logic, e.g. fill any blank data point with the average of all other data points. “Kaggle missing values1” could provide better insight for those who know the basic Python environment.
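A minimal sketch of such a sense-check followed by mean imputation is shown below. The price series is invented, and treating a negative price as a missing value is an assumption made for this example rather than a general rule. The imputation uses scikit-learn's “SimpleImputer” with the mean strategy.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical daily closing prices with one negative and one blank entry.
prices = pd.DataFrame({
    "day": [1, 2, 3, 4, 5],
    "close": [101.5, -3.0, 102.0, np.nan, 103.5],
})

# Sense-check: flag rows with a negative price.
negative_rows = prices[prices["close"] < 0]
print(negative_rows)

# Assumption for this example: a negative price is a data error,
# so it is blanked out and imputed like any other missing value.
cleaned = prices.copy()
cleaned.loc[cleaned["close"] < 0, "close"] = np.nan

# Fill blanks with the average of the remaining data points.
imputer = SimpleImputer(strategy="mean")
cleaned["close"] = np.asarray(imputer.fit_transform(cleaned[["close"]])).ravel()
print(cleaned)
```

Whether the mean is a sensible fill value depends on the data; for a trending price series, other strategies may be more appropriate.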
Calibrating the model parameters
Analysing trends is the crux of actuarial analysis and predictive modelling. Once cleaned, data can be easily visualised in Python using the “Seaborn” library, which provides functions to plot the data in various forms and spot trends. The “ggplot2” package provides the same functionality in R. Kaggle data visualization2 might be a good starting point for plotting data in Python while you code actuarial solutions.
Once trends are spotted, parameters need to be calculated using statistical techniques that all of us have covered over the actuarial course of study. Multivariate linear regression is one of the widely accepted techniques to predict values of a variable dependent on other metrics.
The “sklearn.linear_model” module in Python provides the functionality to fit linear models at the click of a button.
Linear regression Python Implementation3 is a good starting point for anyone looking to explore linear model fitting. A good exercise could be to try to fit a linear model to any dataset available on the internet. R provides the same functionality in the “lm()” function.
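A small sketch of a multivariate linear fit with scikit-learn follows. The data is synthetic: the variable names and the "true" coefficients (3.0 and 12.0) are assumptions chosen so that the fitted values can be compared against something known.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(seed=42)
n = 200

# Synthetic rating factors, invented for this example.
age = rng.uniform(20, 60, size=n)
vehicle_value = rng.uniform(5, 50, size=n)  # in thousands

# Assumed relationship used to generate the data, plus random noise.
claim_cost = 100 + 3.0 * age + 12.0 * vehicle_value + rng.normal(0, 5, size=n)

# Each row of X is one observation; each column is one explanatory variable.
X = np.column_stack([age, vehicle_value])
model = LinearRegression().fit(X, claim_cost)

print("intercept:", model.intercept_)
print("coefficients:", model.coef_)
print("R squared:", model.score(X, claim_cost))
```

The fitted coefficients should land close to the values used to generate the data, which is a handy check that the fitting code is wired up correctly before pointing it at real data.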
Stochastic simulations
Monte Carlo simulation is a widely used concept for stochastic actuarial modelling. Coding in Python or R provides the functionality to run over a million random simulations based on pre-defined distributions at the click of a button. The “numpy.random” package in Python helps run any number of simulations for defined distributions and parameters, e.g. numpy.random.poisson(). R helps you do the same with the “rpois()” function.
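As an illustration, the sketch below draws one million Poisson claim counts. It uses NumPy's newer Generator interface (“numpy.random.default_rng()”) rather than calling “numpy.random.poisson()” directly; both draw from the same distribution. The claim rate of 2.5 is an assumed parameter for this example.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the run is repeatable

n_sims = 1_000_000
claim_rate = 2.5  # assumed expected number of claims per policy per year

# One simulated claim count per scenario.
claim_counts = rng.poisson(lam=claim_rate, size=n_sims)

print("simulated mean:", claim_counts.mean())
print("simulated variance:", claim_counts.var())
print("share of scenarios with 6 or more claims:", (claim_counts >= 6).mean())
```

For a Poisson distribution the mean and variance are both equal to the rate, so both simulated figures should sit close to 2.5.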
The Stack Overflow link4 given below might be useful for a start. A good exercise could be to draw random outcomes of a die (numbers 1-6) and calculate the running mean after each simulation. An interesting observation would be to see the mean converge towards the expected value (3.5) as the number of simulations grows.
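One possible way to set up the dice exercise is sketched below. The number of throws and the seed are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

n_throws = 10_000
# integers() excludes the upper bound, so high=7 gives outcomes 1 to 6.
throws = rng.integers(low=1, high=7, size=n_throws)

# Mean of the first k throws, for every k from 1 to n_throws.
running_mean = np.cumsum(throws) / np.arange(1, n_throws + 1)

for k in (10, 100, 1_000, 10_000):
    print(k, running_mean[k - 1])
```

The early running means can sit noticeably away from 3.5, while the later ones settle near it.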
Front-ending the model usage
One cannot let go of the convenience that MS Excel brings to the table. Users find it handy to see outputs in Excel or a PPT/PDF. Exporting results to Excel becomes essential in such cases. The pandas “DataFrame.to_csv()” method allows you to export any calculated numbers to csv format, which can then be transferred to Excel using VBA or manually. R helps you do the same using “write.csv()”. This is the easiest way to export the outputs of your actuarial code to Excel.
Try calculating the average of random throws of dice, and exporting that value to Excel. Similarly, plots could be exported from Python in a picture format. Refer to chartio link5 to export a plot in png format, which can then be pasted into Excel using VBA or manually.
An interesting addition could be to run a Python script using a button in Excel itself. It lets the client run a Python model and derive outputs by simply pressing a button in Excel. This YouTube6 video can be a good starting point to understand how this works.
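A minimal sketch of such an export is given below. The summary table, the file name “simulation_summary.csv”, and the choice of the system temporary folder are all assumptions made for the example; in practice you would write to wherever your Excel workbook expects the file.

```python
import os
import tempfile

import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=2)
claim_counts = rng.poisson(lam=2.5, size=100_000)  # assumed claim rate

# Hypothetical summary table of the simulation results.
summary = pd.DataFrame({
    "metric": ["simulations", "mean_claims", "max_claims"],
    "value": [claim_counts.size, claim_counts.mean(), claim_counts.max()],
})

# Write to a csv file that Excel can open; index=False drops the row numbers.
out_path = os.path.join(tempfile.gettempdir(), "simulation_summary.csv")
summary.to_csv(out_path, index=False)

# Read the file back as a quick check that the export worked.
check = pd.read_csv(out_path)
print(check)
```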
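For the plot export, something along the following lines could work, using matplotlib's “savefig()”. The file name “dice_running_mean.png” and the temporary folder are placeholders for this example, and the “Agg” backend is chosen only so the script runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=3)
throws = rng.integers(low=1, high=7, size=5_000)  # dice outcomes 1 to 6
running_mean = np.cumsum(throws) / np.arange(1, throws.size + 1)

fig, ax = plt.subplots()
ax.plot(running_mean, label="running mean")
ax.axhline(3.5, linestyle="--", label="expected value")
ax.set_xlabel("number of throws")
ax.set_ylabel("mean outcome")
ax.legend()

# Save the figure as a png that can be pasted into Excel or a presentation.
png_path = os.path.join(tempfile.gettempdir(), "dice_running_mean.png")
fig.savefig(png_path, dpi=150)
plt.close(fig)
```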
About the Author: Vaibhav Agarwal, an FRM and a student actuary with 12 papers cleared (CT Series- all, CP2, CP3, and SP0), has worked with Marsh & McLennan Companies and is currently working with KPMG as an Associate Consultant-FRM. He has worked on ERM strategies, D&O liability, Cyber Risk, and Environmental Liability, among others.
We interviewed him for Bridging the gap: Actuarial Analytics & Risk Consulting
Feel free to connect with Vaibhav on Linkedin!
References for coding in Actuarial
Kaggle missing values1 – https://www.kaggle.com/alexisbcook/missing-values
Kaggle data visualization2 – https://www.kaggle.com/learn/data-visualization
Linear regression Python Implementation3 – https://www.geeksforgeeks.org/linear-regression-python-implementation/
link4 – https://stackoverflow.com/questions/50311406/how-to-draw-a-random-sample-from-a-poisson-distribution
chartio link5 – https://chartio.com/resources/tutorials/how-to-save-a-plot-to-a-file-using-matplotlib/