Data Science

Investigating the relationship between life satisfaction and income

Aims: After I graduate I am set to enter the world of investment banking, an industry known for paying well but at significant personal cost. This led me to question the extent to which income is correlated with happiness, starting at a macro level, before focusing on a specific case study.

I plotted happiness score against log GDP per capita on a choropleth map using data from the World Happiness Report, where happiness score is defined as the average answer to the Gallup World Poll question in which participants rate their lives on a scale from 0 to 10.

I used a bivariate scale to capture both GDP and happiness simultaneously, using a colour palette informed by Joshua Stevens' guide. I found terciles for both variables and then created nine classes, one for each possible combination of terciles. The interpretation of each class can be seen by clicking on the scale.
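The tercile classification can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code behind the map. The function name `bivariate_classes` is hypothetical, and the numbering of the classes (1 for low GDP and low happiness, 9 for high on both, so that 1, 5 and 9 mean the same tercile for both variables) is an assumption.

```python
from bisect import bisect_left
from statistics import quantiles


def bivariate_classes(log_gdp, happiness):
    """Assign each country a class from 1 to 9 based on its two terciles.

    Assumed numbering: class = 3 * gdp_tercile + happiness_tercile + 1,
    where each tercile is 0 (lowest), 1 (middle) or 2 (highest).
    """
    # quantiles(..., n=3) returns the two cut points separating the terciles.
    gdp_cuts = quantiles(log_gdp, n=3)
    happiness_cuts = quantiles(happiness, n=3)

    classes = []
    for gdp_value, happiness_value in zip(log_gdp, happiness):
        # bisect_left counts how many cut points lie strictly below the value.
        gdp_tercile = bisect_left(gdp_cuts, gdp_value)
        happiness_tercile = bisect_left(happiness_cuts, happiness_value)
        classes.append(3 * gdp_tercile + happiness_tercile + 1)
    return classes
```

Each class number can then be mapped to one of the nine colours in the bivariate palette.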

In converting the TopoJSON file to GeoJSON, an error caused countries including Russia and Antarctica to display incorrectly, as shown here. I looped through the GeoJSON output and moved any coordinates with a longitude beyond 180 degrees west to the eastern side of the map in this code.
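A fix of this kind might look like the sketch below. It assumes that "moving to the eastern side" means adding 360 degrees to any longitude below -180, and that every feature uses a geometry type with a `coordinates` member (Polygon, MultiPolygon and so on). The function names are hypothetical and this is not the linked code.

```python
def wrap_longitudes(coords):
    """Recursively walk a GeoJSON coordinates array and wrap longitudes.

    A position is a list such as [lon, lat]; anything else is treated as a
    nested list of positions (rings, polygons, multipolygons).
    """
    if coords and isinstance(coords[0], (int, float)):
        lon, *rest = coords
        # Assumption: longitudes west of -180 belong on the eastern edge.
        return [lon + 360 if lon < -180 else lon, *rest]
    return [wrap_longitudes(part) for part in coords]


def fix_antimeridian(geojson):
    """Apply wrap_longitudes to every feature of a FeatureCollection in place."""
    for feature in geojson["features"]:
        geometry = feature.get("geometry")
        if geometry and "coordinates" in geometry:
            geometry["coordinates"] = wrap_longitudes(geometry["coordinates"])
    return geojson
```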

This chart shows that most developed countries have both high happiness scores and high GDP per capita, while developing countries, including many in Africa, display the opposite characteristics. Overall, tercile of wealth is a good predictor of tercile of happiness, with most countries falling in categories 1, 5 and 9 (same tercile for both variables) as shown in the chart below. However, there are a few interesting anomalies. For example, Turkey is unhappy despite high GDP, while Uzbekistan is happy despite low GDP.

The following regression supports this relationship, as indicated by the high R².
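For reference, R² for a simple linear regression can be computed as in the sketch below. It uses ordinary least squares with a single predictor and plain Python. The function name `fit_line` is hypothetical, and the original chart may well have been produced with a plotting or statistics library instead.

```python
def fit_line(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R squared = 1 - residual sum of squares / total sum of squares.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared
```

Here `x` would hold log GDP per capita and `y` the happiness score. An R² close to 1 means the fitted line explains most of the variation in happiness, while a value close to 0 means it explains very little.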

I investigated poverty rate as a potential driver behind this relationship. Poverty was defined as living on less than $1.90 per day. As expected, poverty is negatively correlated with happiness, and the relationship becomes stronger when countries with poverty rates below 1% are excluded, as poverty has little impact on most people in these countries. I used the World_Bank_Data Python library to download data from the World Bank API, so that new data can be downloaded annually upon release.
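The effect of excluding low-poverty countries can be checked with a sketch like the one below. It assumes the data is available as (poverty rate in percent, happiness score) pairs and uses the Pearson correlation coefficient. The function names and the `min_poverty_rate` parameter are hypothetical.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def correlation_above_threshold(rows, min_poverty_rate=1.0):
    """Correlation between poverty and happiness, keeping only countries
    whose poverty rate is at least min_poverty_rate percent."""
    kept = [(p, h) for p, h in rows if p >= min_poverty_rate]
    poverty, happiness = zip(*kept)
    return pearson_r(poverty, happiness)
```

Calling `correlation_above_threshold(rows, min_poverty_rate=0.0)` gives the correlation over all countries, which can be compared with the default 1% cut-off.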

To tackle the root of my question, I analysed the relationship between job satisfaction and pay in FTSE 100 companies. I used Glassdoor, which has thousands of ratings and salaries for large companies, so company-level averages should be reasonably reliable.

However, the site blocks scrapers that rely on BeautifulSoup alone, so I used Selenium and ChromeDriver to solve this issue. In the absence of an API, I found the FTSE 100 constituents by scraping names from the LSE website and saving them into an array. I automated login to Glassdoor, then looped through the array of companies, searching for each. For each company, I navigated to the rating and salary sections and scraped data from each, which I saved to a DataFrame. Different roles were reported at different frequencies, so for each salary I recorded the number of reports in order to calculate a weighted average of salaries. Full code is available here and a video of the scraper running is available here. Running the code automatically updates the data set and corresponding chart, but takes over 30 minutes.
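The weighted average step can be sketched as below. It assumes each scraped row is a (company, role, salary, number of reports) tuple. The row layout and function name are hypothetical, and the original code works on a DataFrame rather than plain tuples.

```python
from collections import defaultdict


def weighted_average_salaries(rows):
    """Average salary per company, weighting each role by its report count.

    rows: iterable of (company, role, salary, reports) tuples.
    """
    weighted_totals = defaultdict(float)
    report_counts = defaultdict(int)

    for company, _role, salary, reports in rows:
        weighted_totals[company] += salary * reports
        report_counts[company] += reports

    # Skip companies with no reports to avoid dividing by zero.
    return {
        company: weighted_totals[company] / report_counts[company]
        for company in weighted_totals
        if report_counts[company] > 0
    }
```

Weighting by report count stops a rarely reported, highly paid role from pulling a company's average as much as a role with many reports.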

To update the chart daily, I would run these terminal commands. I decided against this as the scraper uses significant computing power.

I regressed salary against job satisfaction and found very little relationship, as implied by the low R². While pay at the private equity house 3i was significantly higher than at other companies, its satisfaction rating was very low, which may say something about the culture in high finance.

Finally, I looked at rating and compensation in bulge bracket banks. Having scraped ratings from Glassdoor using the method described previously, I found salary data from Arkesden, who are well regarded in the industry for providing accurate compensation data. I scraped the data using Tabula from a PDF in this code. The PDF is released in the same format annually, so it could be scraped each year upon release.

At the junior level, banks tend to pay almost identical salaries, while at senior levels there is more variation. At VP level, higher paying banks have higher satisfaction ratings, suggesting that salary plays a greater role in job satisfaction at the senior level.

Conclusions

To conclude, income and happiness are strongly correlated at the macro level. However, the relationship between income and job satisfaction is more nuanced. Across FTSE 100 companies there is no clear relationship, highlighting the complex nature of job satisfaction, which incorporates many other factors including company culture. There does, however, appear to be some level of correlation in banking, perhaps because the industry attracts money-motivated individuals.