Can Machine Learning fail?

Machine Learning is a big trend right now. It is hyped, and maybe overrated? In fact, machine learning is a very good technique that can be applied to a very large set of problems. I believe that the success of social media and e-commerce companies in “predicting” patterns and giving very accurate recommendations has created some high expectations for it.

From a more practical point of view, machine learning is a great technique for some tasks. It is not magic; like everything else, it is only a very good technique if it is applied and used correctly.

Machine learning is an instrument for solving problems, so the most important thing is to understand exactly what the problem is. We can split the tasks into four big groups:

  • Classification, applied when the output is a category: will the market go up or down, will the client default or not.

  • Regression, applied when there is a continuous output, like what inflation will be next year or how much salaries will grow next year.

  • Clustering, applied to find groups and understand their characteristics: which investors behave conservatively and which aggressively, or what characterises a group that is likely to save more than $10k yearly; and

  • Association, which is the job that platforms like YouTube, Instagram, Netflix etc. do when they suggest what you might like.

So, machine learning is a tool for solving some problems, and to use it we must understand exactly the problem that we have in front of us.

I would like to share a project done in the Machine Learning class at NOVA-IMS, where we applied regression techniques such as Linear Regression and Decision Tree. The goal of the project was to develop a predictive data analytics solution for a French insurance company, following the CRISP-DM methodology (where we could say that the most important part is Business Understanding, that is, understanding exactly what the problem is). The model seeks to forecast the number of claims each policyholder will have in the following year. With this information, the insurance company could adjust its pricing model for the next year’s premiums according to the predicted number of claims.

After applying all the techniques and steps of the CRISP-DM methodology, going through Data Understanding, Data Preparation, Modelling and Evaluation for Linear Regression and Decision Tree, we obtained the following results:

Linear Regression

[Figure: Residuals for Linear Regression Model]

[Figure: Prediction Error for Linear Regression]

Although the results obtained by the model on the training set are similar to those obtained on the test set, all the measures show that the model's predictive power can be considered weak. One possible reason is that the initial dataset is extremely imbalanced (many more observations reporting zero claims than reporting one or more).
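To make the imbalance point concrete, here is a minimal sketch in plain Python. It is not the project's code, and the claim counts are synthetic numbers made up for illustration (roughly 95% zeros). The helper `regression_metrics` is a hypothetical name. The sketch shows that a "model" that always predicts zero, or always predicts the average, already gets a small absolute error on this kind of data, which is why a real model has to be compared with such naive baselines before its scores mean anything.

```python
import random


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and R^2 for two equal-length sequences."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    abs_err = sum(abs(t - p) for t, p in zip(y_true, y_pred))
    sq_err = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    total_sq = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - sq_err / total_sq if total_sq > 0 else 0.0
    return {"mae": abs_err / n, "rmse": (sq_err / n) ** 0.5, "r2": r2}


# Synthetic, made-up claim counts (an assumption, not the insurer's data):
# about 95% of policyholders report zero claims.
rng = random.Random(42)
claims = [0 if rng.random() < 0.95 else rng.choice([1, 1, 1, 2])
          for _ in range(10_000)]

zero_share = claims.count(0) / len(claims)

# Two naive baselines that ignore every feature.
mean_claims = sum(claims) / len(claims)
baseline_zero = regression_metrics(claims, [0] * len(claims))
baseline_mean = regression_metrics(claims, [mean_claims] * len(claims))

print(f"share of zero-claim observations: {zero_share:.3f}")
print("always predict 0:   ", baseline_zero)
print("always predict mean:", baseline_mean)
```

The absolute error of both baselines is small simply because most targets are zero, while their R² is zero or below. A model trained on such data needs to beat these numbers clearly before we can say it has learned something.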

Decision Tree

[Figure: Residuals for Decision Tree Regressor Model]

[Figure: Prediction Error for Decision Tree Regressor]

As with the Linear Regression model, the results obtained by the Decision Tree model on the training set are similar to those obtained on the test set. Even so, once again, all the measures show that the model's predictive power can be considered weak.

Initially, one could expect the Decision Tree model to return more robust results. However, this was not the case. One possible reason is, again, that the initial dataset is extremely imbalanced (many more observations reporting zero claims than reporting one or more). In addition, a larger number of observations, that is, a larger dataset, could lead to better results.
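One way to see why a tree does not escape the imbalance is to remember that a regression tree predicts the average target of the training observations in each leaf. The sketch below is a one-split tree (a "stump") written in plain Python, not the Decision Tree Regressor used in the project. The feature (`ages`), the claim counts and the function names `fit_stump` and `predict_stump` are all hypothetical and chosen only for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def fit_stump(x, y):
    """Find the single threshold on x that minimises the squared error
    when each side of the split predicts the mean of its targets."""
    best = None
    # The largest value is skipped so the right-hand leaf is never empty.
    for threshold in sorted(set(x))[:-1]:
        left = [yi for xi, yi in zip(x, y) if xi <= threshold]
        right = [yi for xi, yi in zip(x, y) if xi > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((yi - left_mean) ** 2 for yi in left)
               + sum((yi - right_mean) ** 2 for yi in right))
        if best is None or sse < best["sse"]:
            best = {"threshold": threshold, "left": left_mean,
                    "right": right_mean, "sse": sse}
    return best


def predict_stump(stump, xi):
    return stump["left"] if xi <= stump["threshold"] else stump["right"]


# Made-up data: 10 policyholders per age, mostly zero claims.
ages = [20] * 10 + [25] * 10 + [40] * 10 + [60] * 10
claims = ([1, 1, 1] + [0] * 7      # age 20: 3 claims in 10
          + [1, 1] + [0] * 8       # age 25: 2 claims in 10
          + [1] + [0] * 9          # age 40: 1 claim in 10
          + [0] * 10)              # age 60: no claims

stump = fit_stump(ages, claims)
predictions = [predict_stump(stump, a) for a in ages]

# R^2 on the training data: share of variance explained by the split.
total_sq = sum((c - mean(claims)) ** 2 for c in claims)
r2 = 1 - stump["sse"] / total_sq

print(stump)
print(f"largest prediction: {max(predictions):.2f}, R^2: {r2:.3f}")
```

Even though younger policyholders claim more often in this toy data, every leaf is dominated by zeros, so every prediction stays close to zero and the split explains only a small part of the variance. A deeper tree can refine the averages, but it is still averaging mostly zeros.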

In this exercise we can see some of the problems involved in applying machine learning. It does not work like magic, and it cannot fit and solve every problem. Sometimes good Data Understanding and Data Preparation can tell us more than the model itself. The fact is that machine learning is a great technique, but it takes hard work and knowledge to use it correctly, and every task needs its specific tool. In real life, we must remember that this tool will be used by many people, and it is generally better to make it simple. Keep it simple!

Louise Cardoso

https://www.blogcapitalflow.org