This week the Economist had a very interesting article on the Bundesliga, in which the newspaper touches upon the underrepresentation of teams from the ex-East Germany in the soccer league. While the article was certainly eye-opening (I had never thought of even examining this question!), one thing it was missing, in my opinion, was a baseline against which to compare the number of teams from the ex-East region in the league. Providing such a baseline is what I set out to do in this post.
I collected data from Wikipedia on the teams participating in the Bundesliga from the fall of the Berlin Wall until the current season. In the following figure I present, for each season, the fraction of teams from the ex-East Germany and the corresponding 95% confidence interval.
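For anyone who wants to reproduce this kind of interval, here is a minimal Python sketch. It assumes a Wilson score interval, which is one common choice for proportions from small samples and not necessarily the method behind the figure. The function name `wilson_interval` and the counts in the example are placeholders for illustration, not the collected data.

```python
import math


def wilson_interval(successes, n, z=1.959963984540054):
    """Wilson score interval for a binomial proportion (default z gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    # The interval is centered between the observed fraction and 0.5.
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Placeholder counts, NOT the data collected for this post.
east_teams, league_size = 2, 18
low, high = wilson_interval(east_teams, league_size)
print(f"fraction = {east_teams / league_size:.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
```

Unlike the simple normal-approximation interval, this one does not collapse to zero width in a season with no East teams at all, which matters for a league of only 18 teams.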
I have further considered 3 baselines for comparison: (a) the fraction of the country's area covered by the ex-East Germany region, (b) the fraction of the population living in the ex-East Germany region, and (c) the fraction of the cities located in the ex-East Germany region. All these baselines are intuitive in the sense that you would expect the number of teams a region has to be roughly proportional to, for example, the number of cities it includes (relative to the total region that the league serves).
As we can see, under the area baseline the East region has been significantly under-represented (at the 5% significance level) in all 25 years since the reunification of the country. In other words, if we test the null hypothesis that the actual fraction of East teams is equal to the baseline, we can reject it every season. For the other two baselines, there are 6 seasons in which the baseline falls within the 95% confidence interval of the East region's actual representation, and hence we cannot reject the null hypothesis that the actual fraction is equal to the baseline. Even these cases, though, may be false negatives due to low statistical power caused by the small sample size (i.e., 20 teams in the 1991-92 season and 18 in every other season). As the Economist article also mentions, there has not been a single team from the East region since 2009!
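To make the test and the power concern concrete, here is a rough Python sketch. It uses an exact two-sided binomial test, which is a swapped-in technique rather than the confidence-interval comparison described above, so the two can disagree in borderline seasons. The function names and the baseline value of 0.2 are hypothetical and only serve as an illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_p_value(k, n, p0):
    """Two-sided exact binomial p-value for H0: true fraction == p0.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k under the null hypothesis.
    """
    observed = binom_pmf(k, n, p0)
    tail = sum(
        binom_pmf(i, n, p0)
        for i in range(n + 1)
        if binom_pmf(i, n, p0) <= observed * (1 + 1e-7)  # tolerance for float ties
    )
    return min(1.0, tail)


def power(n, p0, p_true, alpha=0.05):
    """Chance of rejecting H0: p == p0 when the true fraction is p_true."""
    return sum(
        binom_pmf(k, n, p_true)
        for k in range(n + 1)
        if exact_p_value(k, n, p0) < alpha
    )


# Hypothetical baseline and counts, NOT the values used in this post.
baseline = 0.2
print("p-value for 0 East teams out of 18:", exact_p_value(0, 18, baseline))
print("power if the true fraction were 0.1:", power(18, baseline, 0.1))
```

With only 18 teams per season, the test can only reject at the extremes of the count, so a true fraction that sits moderately below the baseline will often go undetected in any single season.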
I did the same exercise for the basketball Bundesliga using data from the 2008-2009 season onwards (during the 1990s and the early 2000s the league had even fewer teams than the current 18, and hence any statistics obtained are tricky to interpret).
The situation in the basketball Bundesliga appears to be better (though, again, the area baseline is never reached). However, I will leave the conclusions with regard to geo-sports segregation in Germany today to you; I just wanted to provide more data and the baselines that were missing from the Economist's article! Any comments are always welcome.