Machine learning has great potential for improving products, processes and research. But computers usually do not explain their predictions, which is a barrier to the adoption of machine learning. This book is about making machine learning models and their decisions interpretable.
After exploring the concepts of interpretability, you will learn about simple, interpretable models such as decision trees, decision rules and linear regression. Later chapters focus on general model-agnostic methods for interpreting black box models, such as feature importance and accumulated local effects, and on explaining individual predictions with Shapley values and LIME.
All interpretation methods are explained in depth and discussed critically. How do they work under the hood? What are their strengths and weaknesses? How can their outputs be interpreted? This book will enable you to select and correctly apply the interpretation method that is most suitable for your machine learning project.
The data are sourced from the World Mortality Dataset. Excess mortality is computed relative to a baseline obtained by linear extrapolation of the 2015–19 trend. In the figure below, the gray lines are 2015–19, the black line is the baseline for 2020, the red line is 2020, and the purple line is 2021. Countries are sorted by the % increase over the baseline.
Red number: excess mortality starting from the first officially reported COVID-19 death. Gray number: excess mortality as a % of the annual baseline deaths. Black number: excess mortality per 100,000 population. Blue number: ratio of excess mortality to the daily reported COVID-19 deaths over the same period (sourced from JHU).
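As a rough illustration of how these numbers relate to each other, the sketch below fits a straight line to annual death counts for 2015–19, extrapolates it to 2020, and derives the four quantities from the result. It is a simplified, annual-level sketch and not the exact procedure used for the figure: it ignores seasonality and does not start counting from the first reported COVID-19 death. The function names `linear_baseline` and `excess_summary` and all input numbers are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import numpy as np


def linear_baseline(years, deaths, target_year):
    """Fit a straight line to annual deaths and extrapolate to target_year.

    Years are centered before fitting to keep the least-squares problem
    well conditioned.
    """
    years = np.asarray(years, dtype=float)
    deaths = np.asarray(deaths, dtype=float)
    center = years.mean()
    slope, intercept = np.polyfit(years - center, deaths, deg=1)
    return slope * (target_year - center) + intercept


def excess_summary(observed, baseline, population, reported_covid_deaths):
    """Derive the four summary numbers from observed and baseline deaths."""
    excess = observed - baseline
    return {
        "excess": excess,                                   # absolute excess deaths
        "pct_of_baseline": 100.0 * excess / baseline,       # % of annual baseline
        "per_100k": 1e5 * excess / population,              # per 100,000 population
        "ratio_to_reported": excess / reported_covid_deaths,  # vs. reported deaths
    }


# Hypothetical inputs (not taken from the World Mortality Dataset).
years = [2015, 2016, 2017, 2018, 2019]
deaths = [100_000, 101_000, 102_000, 103_000, 104_000]

baseline_2020 = linear_baseline(years, deaths, 2020)
summary = excess_summary(
    observed=120_000,
    baseline=baseline_2020,
    population=10_000_000,
    reported_covid_deaths=10_000,
)
print(round(baseline_2020), summary)
```

With weekly or monthly data, the same idea would apply per period, with the baseline fitted separately for each week or month of the year so that the seasonal pattern is preserved.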
Comparing the impact of the COVID-19 pandemic between countries or across time is difficult because the reported numbers of cases and deaths can be strongly affected by testing capacity and reporting policy. Excess mortality, defined as the increase in all-cause mortality relative to the recent average, is widely considered a more objective indicator of the COVID-19 death toll. However, there has been no central, frequently updated repository of all-cause mortality data across countries. To fill this gap, we have collected weekly, monthly, or quarterly all-cause mortality data from 77 countries, openly available as the regularly updated World Mortality Dataset. We used this dataset to compute the excess mortality in each country during the COVID-19 pandemic. We found that in the worst-affected countries the annual mortality increased by over 50%, while in several other countries it decreased by over 5%, presumably because lockdown measures reduced non-COVID mortality. Moreover, we found that while some countries have been reporting their COVID-19 deaths very accurately, many countries have been underreporting them by an order of magnitude or more. Averaging across the entire dataset suggests that the world’s COVID-19 death toll may be at least 1.6 times higher than the reported number of confirmed deaths.