Blog 43: Simpson’s Paradox by Xavier Lo, FIA, FRM, MBA

A bit more technical this week, I’m afraid. We are looking at Simpson’s Paradox [辛普森悖論]. Some of you may have encountered examples of this in real life. Simpson’s Paradox occurs when you observe a correlational trend [相關性] in each of several separate, distinct [分明] sets of data, but the trend reverses when you combine the data together.

A numerical example will clear this up. There are two classrooms, each made up of 5 students. The first classroom has 4 girls and 1 boy, and the second classroom has 1 girl and 4 boys. In the first classroom, the girls are all 5 years old and the boy is 6 years old. In the second classroom, the girl is 1 year old and the boys are all 2 years old. Calculate the average age [平均年齡] of the girls and of the boys in each classroom, and you can see that the boys are older on average than the girls in both classrooms (6 versus 5 in the first, 2 versus 1 in the second). Now calculate the average age of the boys and of the girls when you combine both classrooms together, and what happens? The average age of the girls (4.2) is now higher than that of the boys (2.8)!
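For readers who would like to check the arithmetic, here is a minimal Python sketch of the classroom example. The ages come straight from the example above; the names `classrooms`, `average_ages`, `per_classroom` and `combined` are made up purely for illustration.

```python
from statistics import mean

# Ages taken from the classroom example above.
classrooms = {
    "classroom_1": {"girls": [5, 5, 5, 5], "boys": [6]},
    "classroom_2": {"girls": [1], "boys": [2, 2, 2, 2]},
}


def average_ages(room):
    """Return the average age of the girls and of the boys in one classroom."""
    return {sex: mean(ages) for sex, ages in room.items()}


# Average ages within each classroom.
per_classroom = {name: average_ages(room) for name, room in classrooms.items()}

# Average ages after pooling both classrooms together.
combined = {
    sex: mean([age for room in classrooms.values() for age in room[sex]])
    for sex in ("girls", "boys")
}

for name, averages in per_classroom.items():
    print(name, averages)
print("combined", combined)
```

Within each classroom the boys' average is the higher one, but in the combined figures the girls' average comes out at 21/5 = 4.2 against 14/5 = 2.8 for the boys.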

Why does this happen? When you are told that the boys are older on average than the girls in both classrooms, you’d naturally expect this to still be the case when you combine the classrooms together. However, as you can see from the example, when the mix of boys and girls is very different from one group to the next, each combined average is dominated by a different group, so the overall trend can be very distorted [扭曲] or even reversed. A loosely related real-life example occurred in 2016 when Donald Trump [特朗普] was elected US president [美國總統]. He won the election because he carried enough states to win a majority of the Electoral College votes, but he didn’t actually get the most votes overall across the whole of the US.
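To see where the reversal in the classroom example comes from, it may help to write each combined average as a weighted average of the classroom averages. This is only a sketch of the arithmetic; the notation is introduced here for illustration, and the subscripts 1 and 2 refer to the first and second classroom.

```latex
\begin{align*}
\bar{a}_{\text{girls}} &= \frac{4 \times 5 + 1 \times 1}{4 + 1}
  = \tfrac{4}{5}(5) + \tfrac{1}{5}(1) = 4.2 \\
\bar{a}_{\text{boys}}  &= \frac{1 \times 6 + 4 \times 2}{1 + 4}
  = \tfrac{1}{5}(6) + \tfrac{4}{5}(2) = 2.8
\end{align*}
```

The girls' combined average puts 4/5 of its weight on the older classroom, while the boys' combined average puts 4/5 of its weight on the younger classroom. That difference in weights is enough to flip the comparison, even though the boys are one year older in each classroom.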

This can sometimes happen when you are analysing data and looking at correlation and causation between factors. Just remember that even when you see a trend within every group of data, it is worth doing a quick check on the combined total to see whether the trend still holds. Don’t let Simpson’s Paradox trip you up!
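As a rough illustration of that quick check, here is a small Python sketch that compares the direction of a trend within each group against the direction in the combined data. The helper names (`mean_by`, `sign`, `simpson_check`) and the record layout are hypothetical, chosen only for this example, and the sketch assumes that every group contains both categories being compared.

```python
from collections import defaultdict
from statistics import mean


def mean_by(records, key):
    """Average the 'value' field of the records, bucketed by key(record)."""
    buckets = defaultdict(list)
    for rec in records:
        buckets[key(rec)].append(rec["value"])
    return {k: mean(v) for k, v in buckets.items()}


def sign(x):
    """Return 1, 0 or -1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def simpson_check(records, a, b):
    """Compare category a with category b within each group and overall.

    Assumes every group contains both categories. Returns the sign of
    mean(a) - mean(b) per group, the same sign for the combined data, and
    a flag that is True when every group disagrees with the combined data.
    """
    overall = mean_by(records, lambda r: r["category"])
    overall_sign = sign(overall[a] - overall[b])

    within = mean_by(records, lambda r: (r["group"], r["category"]))
    groups = sorted({r["group"] for r in records})
    within_signs = {g: sign(within[(g, a)] - within[(g, b)]) for g in groups}

    reverses = overall_sign != 0 and all(
        s == -overall_sign for s in within_signs.values()
    )
    return within_signs, overall_sign, reverses


# The classroom example: (group, category, age, number of students).
layout = [
    ("classroom_1", "girl", 5, 4),
    ("classroom_1", "boy", 6, 1),
    ("classroom_2", "girl", 1, 1),
    ("classroom_2", "boy", 2, 4),
]
records = [
    {"group": g, "category": c, "value": age}
    for g, c, age, count in layout
    for _ in range(count)
]

within_signs, overall_sign, reverses = simpson_check(records, "girl", "boy")
print(within_signs, overall_sign, reverses)
```

Here a sign of -1 means the girls are younger on average than the boys, and 1 means they are older. For the classroom data, both classrooms give -1 while the combined data gives 1, so the flag comes back as True.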

About the Author

Xavier Lo, FIA, FRM, MBA

Qualified fellow actuary (in UK and Hong Kong), Financial Risk Manager, and MBA graduate (listed on the Dean's List) with a passion for insurance, data science, and analytics. Experienced in a broad range of insurance roles (pricing, capital modelling, reserving, ERM), along with a touch of knowledge in banking. Member of the General Insurance Committee (2021), Actuarial Innovation Committee (2019 - 2021) in ASHK.
