When it comes to learning how to do journalism, one of the best ways is surely experience in the field. At least this has worked for me so far, initially with my Italian “newspaper” prose and the interpretation of local regulations, and now with data analysis.
It was a Friday when I learned that the Birmingham Mail might be interested in an article based on a dataset I had found on the Defra (Department for Environment, Food and Rural Affairs) website, and that they wanted something before a budget discussion to be held the following Tuesday. A three-day deadline, a dataset and quotes to collect. Not so bad, but considering my inexperience in dealing with data I had plenty of reasons to be anxious.
The data provide the annual figures on waste management in all the British regions. As can be seen, there are 4 tables, of which the first is probably the most important, as it gives all (or, as we’ll see later, most) of the management details. The tables are not many, but they can be tricky for an inexpert eye like mine and, as I realized at the end of the day, it took me too long to analyse these data, to visualize them and to write a simple story.
In each table there’s a big red note at the bottom recommending not to sum the data, as this may result in double counting, which at first I obviously ignored. Basically, we can divide the information the dataset provides into three: the total waste collected by local authorities, and the household and commercial parts of that total. While the household part represents what the councils collect from citizens’ dwellings, the commercial part is the waste produced by industrial and commercial activities. For every category there are many entries, among them the recycled and non-recycled parts, which are what we are interested in.
Knowing vaguely about some European targets that Italy had to meet within a short time, what I wanted to see was where Birmingham City stood in a recycling rate ranking. Having the total amount of waste collected and the amount recycled, both in tonnes, it is possible to calculate the percentage of the latter (total waste recycled / total waste) and sort the data from smallest to largest. To be honest, this is unnecessary as the rate is already calculated in table 2, the very one I was considering, but it’s good practice anyway. Doing this, we can see that Birmingham is in 6th position, with the smallest figure after Westminster and others that are not on the main island. It is a story! And it was, until I spoke with a member of the City Council.
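The calculation above can be sketched in a few lines of Python. This is only an illustration: the authority names, the field names and the tonnages are invented, and are not taken from the Defra tables.

```python
# Hypothetical figures in tonnes; NOT taken from the Defra tables.
authorities = [
    {"name": "Authority A", "total_waste": 500_000, "recycled": 150_000},
    {"name": "Authority B", "total_waste": 80_000, "recycled": 16_000},
    {"name": "Authority C", "total_waste": 120_000, "recycled": 60_000},
]


def recycling_rate(row):
    """Share of the collected waste that was recycled, as a percentage."""
    return 100 * row["recycled"] / row["total_waste"]


# Sort from the smallest rate to the largest, so the worst performers come first.
ranking = sorted(authorities, key=recycling_rate)

for position, row in enumerate(ranking, start=1):
    print(f"{position}. {row['name']}: {recycling_rate(row):.1f}%")
```

With these made-up numbers, Authority B (20%) comes first in the list, followed by Authority A (30%) and Authority C (50%).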
The cold sweat: always read the dataset description!
At this point it is worth explaining why the data are so tricky. In table 1 we have about 380 local authorities and in table 2 just 120. And in the first one Birmingham was in 28th position! Why?
While I was thinking that my story was irreparably flawed, I read the “notes for tables” part of the dataset, in which the reason is clearly explained. Some authorities are described as collection authorities, while the others are unitary and disposal authorities. Some of the collection ones are contained within other authorities like a Russian doll (matryoshka), which explains the big red warning. The solution? The same notes explain that it is enough to exclude the collection authorities, and in fact our Birmingham City Council came back to 6th position.
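To make the Russian doll problem concrete, here is a minimal Python sketch under stated assumptions: the rows, the `type` field and all the tonnages are hypothetical, and the two districts are assumed to sit entirely inside "County X". It shows how leaving the collection authorities in both inflates the total and shifts an authority's position in the ranking.

```python
# Hypothetical rows: names, the "type" field and tonnages are invented.
# District X1 and District X2 are assumed to be nested inside County X,
# so the County X figures already include theirs.
rows = [
    {"name": "County X", "type": "disposal", "total_waste": 300_000, "recycled": 120_000},
    {"name": "District X1", "type": "collection", "total_waste": 100_000, "recycled": 20_000},
    {"name": "District X2", "type": "collection", "total_waste": 200_000, "recycled": 100_000},
    {"name": "City Y", "type": "unitary", "total_waste": 400_000, "recycled": 100_000},
]


def position(table, name):
    """1-based position of `name` when sorting by recycling rate, lowest first."""
    ranking = sorted(table, key=lambda r: r["recycled"] / r["total_waste"])
    return [r["name"] for r in ranking].index(name) + 1


# Keep only unitary and disposal authorities, as the notes suggest.
without_collection = [r for r in rows if r["type"] != "collection"]

naive_total = sum(r["total_waste"] for r in rows)                  # districts counted twice
clean_total = sum(r["total_waste"] for r in without_collection)    # each tonne counted once

print(naive_total, clean_total)
print(position(rows, "City Y"), position(without_collection, "City Y"))
```

In this toy table the naive total is 1,000,000 tonnes against 700,000 once the nested districts are removed, and "City Y" moves from 2nd to 1st lowest, a small-scale version of the jump between 28th and 6th position.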
Always seek a response
At this point, having already missed the Tuesday deadline for several reasons, I asked the Council for a quote and was warned by one of its spokespersons that my use of the total waste rather than just the household waste was wrong.
In fact almost all legislation on this topic considers only household waste, as it is collected directly by the Council, while commercial waste is largely dealt with by the companies themselves through private contractors. What is described in the spreadsheet is just the small part that the Council collects from these companies, amounting to just under 20% of the total, while the rest remains in the shade, as the City Council does not hold such information.
The story is broken
At this point the disappointment was softened by the possibility of reshaping the story from “the second worst on the mainland outside London” to the smaller-scale “the worst in the West Midlands”, which was published anyway.
However, this experience has been invaluable, given what I learned about reading and interpreting datasets and dealing with local authorities.