Monday, 17 June 2013

Press, stats and football charts.

FIRST HALF

The Madrid press is clear: the time has come for FC Barcelona to hand over its hegemony.

In recent years Real Madrid has been getting closer to FC Barcelona's level and, according to the data studied, it overtook Barcelona in the last year.

To strengthen their argument, Madrid journalists use the following data: each club's revenue over the last two years, and the individual and collective titles won by the two teams.

Regarding revenue, Real Madrid's grew by 80% in 2013 compared with the previous year, while FC Barcelona's grew by only 40%.

As for individual and collective titles, Real Madrid won 40% fewer titles than last season, while FC Barcelona won 10% fewer than the previous year.

If we transfer these data to a table, assigning the value 100 to each variable in 2012, we get the following figures for 2013:

Real Madrid: revenue 180, titles 60, average 120.

FC Barcelona: revenue 140, titles 90, average 115.

Now let's put these data in a chart, so that we can see them more clearly:

If we look at the average of the two variables, we see a more strongly rising trend for Real Madrid than for FC Barcelona.

The improvement in the average is more pronounced for Real Madrid: the average index rises from 100 to 115 for FC Barcelona, and from 100 to 120 for Real Madrid.
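The arithmetic behind these indices can be sketched in a few lines of Python. This is only an illustration: the percentage changes are the ones quoted above, while the names `changes_pct`, `index_2013` and `average_index` are made up for the example, and the plain 50/50 average is the weighting the example in this story uses.

```python
# Year-on-year changes quoted in the story, in percent of the 2012 value.
changes_pct = {
    "Real Madrid": {"revenue": 80, "titles": -40},
    "FC Barcelona": {"revenue": 40, "titles": -10},
}

def index_2013(change_pct, base=100):
    """Index value in 2013 when the 2012 value is set to `base`."""
    return base * (100 + change_pct) / 100

def average_index(club_changes):
    """Plain (equally weighted) average of the 2013 indices of one club."""
    values = [index_2013(pct) for pct in club_changes.values()]
    return sum(values) / len(values)

averages = {club: average_index(c) for club, c in changes_pct.items()}

for club, value in averages.items():
    print(f"{club}: average index goes from 100 to {value:.0f}")
```

With 2012 as the base, Real Madrid's average index comes out at 120 and FC Barcelona's at 115, which is the comparison the Madrid papers rely on.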

The conclusion, in view of these charts, is what the Madrid news media are saying: FC Barcelona is at the end of a cycle, and Real Madrid is rising sharply.

What do you think the sports media in Barcelona will say?





SECOND HALF

Some of you will think: Barcelona's newspapers will say that these data are not true, and will maintain that FC Barcelona is still better than Real Madrid.

Maybe, but the truth is that the Barcelona press should not use this argument, since it is easy to verify that the data used are correct.

Yes, but all the data are relative percentages. Surely FC Barcelona's absolute figures are much higher than Real Madrid's.

That's right. It may be that the absolute numbers from which FC Barcelona starts are higher than Real Madrid's and that, despite Real Madrid's improvement, FC Barcelona is still ahead in absolute figures.

In fact, one of the shortcomings of many of the statistics we come across every day is that they give no information about the original data on which they are based. Two countries can have the same CPI (consumer price index), but to compare them we must know the standard of living in each and the base level from which each one starts. Two teams may both have won 50% more titles than last year, but we can't equate a team that won 10 titles last year and wins 15 this season with one that won 2 titles last season and wins 3 this year.
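The title example can be checked with a tiny Python sketch. The team labels and the helper `growth_pct` are invented for the illustration; only the title counts (10 to 15, and 2 to 3) come from the text.

```python
def growth_pct(previous, current):
    """Relative change from `previous` to `current`, in percent."""
    return 100 * (current - previous) / previous

# (titles last season, titles this season); labels are hypothetical.
teams = {"Team A": (10, 15), "Team B": (2, 3)}

for name, (before, after) in teams.items():
    print(f"{name}: {growth_pct(before, after):.0f}% more titles, "
          f"{after - before} extra titles in absolute terms")
```

Both teams show the same relative growth, yet one has added five titles and the other just one. The percentage alone hides that difference.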

However, this doesn't mean that the Madrid newspapers are offering false information. They have not said anything about absolute figures; all they have done is focus on the trends of both teams in relative terms. And, apparently, they are right in their argument.

Sure. What happens is that, among the hundreds of ratios they could have used, they have chosen the two (economy and titles) that best support their argument. Surely the other ratios favored FC Barcelona...

Well, we know that everyone uses the data that best fit their argument (including the author of this story). However, the data used do seem fairly representative of a team's success.

Strictly speaking, there should have been a prior study of which variables best represent a club's success. And we should start by defining what we mean by a club's success.

Nowadays, perhaps we could agree that the most representative indicators of a football club's success are its sporting results and its economic performance. We should not forget, though, its social responsibility (support for grassroots sport, collaboration with NGOs, etc.) and its impact on social networks, all of which are increasingly relevant.

Yes, but even if these two factors (titles and economics) are the most representative, they need not have the same relevance. Surely if we weight the two indices differently (in our example both count equally, 50% of the total each), we'll come to different results...

Indeed, depending on the importance we give to each index, we can draw conclusions different from those reached by the Madrid newspapers.
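To see how much the weighting matters, here is a rough Python sketch. The 2013 index values (2012 = 100) follow from the percentages quoted earlier; the 20/80 split in favor of titles is purely a hypothetical choice for the illustration, as are the names `index_2013` and `weighted_index`.

```python
# 2013 index values with 2012 = 100, derived from the quoted percentages:
# revenue +80% / titles -40% (Real Madrid), revenue +40% / titles -10% (FC Barcelona).
index_2013 = {
    "Real Madrid": {"revenue": 180, "titles": 60},
    "FC Barcelona": {"revenue": 140, "titles": 90},
}

def weighted_index(values, revenue_weight_pct):
    """Combined index when revenue counts `revenue_weight_pct` percent
    and titles count the remaining percentage."""
    titles_weight_pct = 100 - revenue_weight_pct
    return (revenue_weight_pct * values["revenue"]
            + titles_weight_pct * values["titles"]) / 100

# 50 is the equal weighting used in the story; 20 is an assumed alternative.
for revenue_weight_pct in (50, 20):
    print(f"Revenue weight {revenue_weight_pct}%:")
    for club, values in index_2013.items():
        print(f"  {club}: {weighted_index(values, revenue_weight_pct):.0f}")
```

With equal weights Real Madrid leads, 120 to 115. If titles counted for 80% of the index, Real Madrid would fall to 84 while FC Barcelona would stay at 100, so the ranking flips without touching a single figure.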

However, even with the indices chosen and the weighting established for them, we can still question the conclusions of the Madrid papers.

Also, here we are mixing two variables, money and sporting results, which have nothing to do with each other...

It's true. This is a typical 'trick' that many analysts use. Sometimes we see completely heterogeneous figures being added together: in our case, economic and sporting data. It also happens within the sporting performance index, in which a collective title, such as winning the league, is worth the same as an individual title, for instance a trophy awarded to a defender for scoring the most headed goals in a cup competition.

However, even with many caveats, we'll accept the Madrid newspapers' analysis as valid.

So we must admit that Real Madrid's trend has been better than FC Barcelona's over the last season, right?

Well, there remains one non-negligible detail to check, which makes the Barcelona newspapers believe that FC Barcelona is still better than Real Madrid.

And what is this detail?

Well, it's as simple as choosing the base year, or reference year, appropriately.

In their articles, the Madrid newspapers base their conclusions on each team's results in 2012.

That is, they assign the base index 100 to each team's 2012 results, and calculate the 2013 variations relative to it.

Let's see how, with the same data, the interpretation changes radically just by moving the reference year.

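We now set the value of each index in 2013 as base 100, and recalculate the table, remembering that Real Madrid achieved an 80% increase in revenue and 40% fewer titles, and that FC Barcelona increased its revenue by 40% while its titles decreased by 10%. Taking 2013 as 100, the 2012 values work out to roughly 55.6 (revenue) and 166.7 (titles) for Real Madrid, an average of about 111.1, and roughly 71.4 (revenue) and 111.1 (titles) for FC Barcelona, an average of about 91.3.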

Let's see these data in a chart:

We can see how, with the same data and changing only the year at which we set the index to 100, the conclusions are radically different: FC Barcelona keeps a positive trend, but the trend of the average index clearly becomes negative for Real Madrid from this new perspective.
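The change of base year can be reproduced with a short Python sketch. As before, the percentages are the ones quoted in the story; the function and variable names are invented for the example, and the average is the same equally weighted one used throughout.

```python
# Year-on-year changes quoted in the story, in percent of the 2012 value.
changes_pct = {
    "Real Madrid": {"revenue": 80, "titles": -40},
    "FC Barcelona": {"revenue": 40, "titles": -10},
}

def index_2012_on_2013_base(change_pct, base=100):
    """Index value in 2012 when the 2013 value is set to `base`."""
    return base * 100 / (100 + change_pct)

def average_2012_index(club_changes):
    """Equally weighted average of the 2012 indices (2013 = 100)."""
    values = [index_2012_on_2013_base(pct) for pct in club_changes.values()]
    return sum(values) / len(values)

average_2012 = {club: average_2012_index(c) for club, c in changes_pct.items()}

for club, value in average_2012.items():
    trend = "falls" if value > 100 else "rises"
    print(f"{club}: average index {trend} from {value:.1f} to 100.0")
```

With 2013 as the base, Real Madrid's average starts at about 111.1 in 2012 and drops to 100, while FC Barcelona's starts at about 91.3 and climbs to 100. The underlying percentages are exactly the same as in the Madrid version.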

So the Barcelona newspapers can use the same statistics as the Madrid media to write their own front-page headlines.

With our example, we can say that the simple choice of the value or point in time we take as the index base, or origin of coordinates, can lead to very different conclusions from the same data. Or we could also say that most statistical data can be handled in such a way that they fit whatever we want to prove.

This story is a tribute to Darrell Huff and his book 'How to Lie with Statistics'. Some months ago, a follower of Matifutbol, whom I want to thank, recommended that I read this extraordinary book.

Books like this shed light on aspects of statistics that everybody should know, so that people can correctly handle all the data, indexes and charts that surround us nowadays.