From scraping to mapping local currencies: the OutWit Hub/Fusion Tables combination (part 1)

What follows is a description of my personal experience of mapping the unmappable, learned through classic “trial and error”.

Let’s start from the problem. Say we want to produce something about alternative currencies. It is an attractive topic in these hard times: while we watch the social catastrophe unfolding in countries like Greece, Spain and Italy, it is worth asking whether there are alternatives to the money we use.

Ideally we want a map giving an overview of the European landscape, plus some kind of trend chart showing how these currencies have evolved on average over time. As explained in a previous post, in which I played a bit with Tableau, we need a spreadsheet to work on.

From my perspective there are two solutions: the evergreen “copy and paste”, or a more professional scraper. With the first I am not sure of getting results tidy enough to put in an Excel table, while the second is more likely to lead to a well-organized spreadsheet.

Scraping the link 

According to the book Scraping for Journalists by Paul Bradshaw, one of the best tools to use in this case, and one that can (sometimes) work without any programming skills, is OutWit Hub. It needs to be downloaded from this site.

First of all, I based my work on the database of the Complementary Currency Resource Center. This site is pretty tricky because the URL doesn’t change when you click on one of the links used to navigate it. A scraper usually needs a specific URL and, as we only want a European overview of local currencies, it is necessary to look at the HTML of the site (right click and “inspect element”), move the cursor over the code that corresponds to the highlighted part, and select the “europe” link, which is the 4th <li> tag under the <ul class="sideMenu2"> one. There it is possible to find the real URL.
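
For readers who would rather not click through the inspector, the same lookup can be sketched in a few lines of Python. This is only an illustration under assumptions: the SAMPLE markup and its href values are invented stand-ins, not the site’s real code, and each <li> is assumed to hold exactly one link.

```python
from html.parser import HTMLParser


class SideMenuLinks(HTMLParser):
    """Collect the href of each link found inside <ul class="sideMenu2">."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # greater than 0 while we are inside the target <ul>
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "ul":
            classes = (attrs.get("class") or "").split()
            if self.depth or "sideMenu2" in classes:
                self.depth += 1
        elif tag == "a" and self.depth and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "ul" and self.depth:
            self.depth -= 1


# Hypothetical stand-in for the site's menu markup (not the real HTML).
SAMPLE = """
<ul class="sideMenu2">
  <li><a href="/list?region=africa">Africa</a></li>
  <li><a href="/list?region=asia">Asia</a></li>
  <li><a href="/list?region=americas">Americas</a></li>
  <li><a href="/list?region=europe">Europe</a></li>
</ul>
"""

parser = SideMenuLinks()
parser.feed(SAMPLE)
europe_link = parser.links[3]   # the 4th <li>; Python counts from zero
print(europe_link)
```

On the real site the page source would be fed to the parser in place of SAMPLE.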

Now that we have the proper link we can work on it with OutWit Hub, simply by pasting it into the URL bar at the top of the software window. Then it is enough to click on the “table” option on the left and that’s it, isn’t it?


Currency picture

Trial and error

Not quite, because this way we get only part of the list that we need. At the bottom of the page there are links to the rest (51-100, 101-108). So, what’s the solution? Look at the HTML as before: inside the <table width="99%" align="center"> tag there are a couple of different URLs. Looking at them, we notice that they contain 1&s and 2&s in the middle, so the first page must be the one with 0&s.
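
The page pattern can also be written down as a tiny Python sketch. The URL_TEMPLATE below is hypothetical: it only imitates the “number followed by &s” pattern described above and is not the site’s real address.

```python
# Hypothetical template: only the "page number followed by &s" pattern
# comes from the site; the host and parameter names are invented.
URL_TEMPLATE = "http://example.org/list.php?page={page}&s=europe"


def page_urls(template, n_pages):
    """Return one URL per results page, with page indexes starting at 0."""
    return [template.format(page=i) for i in range(n_pages)]


urls = page_urls(URL_TEMPLATE, 3)   # pages for 1-50, 51-100 and 101-108
for url in urls:
    print(url)
```

Each of the three addresses can then be pasted into the scraper in turn.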

outwit picture 2

So we have to copy and paste these links into OutWit Hub. I personally did this three times, exported the results to Excel and then copied and pasted them together to get a single list. But there must be a better and more professional solution.
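
One possibly more repeatable route, sketched here in Python as an assumption rather than as what I actually did, is to save each export as CSV and join the files with a script. The three in-memory “files” below contain made-up rows that only imitate the shape of the exports.

```python
import csv
import io


def merge_exports(sources):
    """Concatenate CSV exports that share a header row, keeping the header once."""
    merged = []
    header = None
    for source in sources:
        rows = list(csv.reader(source))
        if not rows:
            continue            # skip an empty export
        if header is None:
            header = rows[0]
            merged.append(header)
        merged.extend(rows[1:])  # data rows only
    return merged


# Made-up sample data standing in for the three exported pages.
page_1 = io.StringIO("community,local exchange system\nTown A,Currency A; LETS\n")
page_2 = io.StringIO("community,local exchange system\nTown B,Currency B; Time bank\n")
page_3 = io.StringIO("community,local exchange system\nTown C,Currency C; LETS\n")

table = merge_exports([page_1, page_2, page_3])
print(len(table) - 1, "rows after merging")
```

With real exports, each StringIO would be replaced by a file opened with open(path, newline="").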

At this point it is worth eliminating the “stopped” category since, as the name suggests, those systems no longer work as currencies, along with the other useless columns, so as to keep just the “community”, “local exchange system” and “url” ones.
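
The same clean-up could be scripted. The sketch below assumes, purely for illustration, that the table has a column called “stopped” whose cell is non-empty for systems that have closed; the “members” column and all the sample values are invented too.

```python
KEEP = ["community", "local exchange system", "url"]


def clean(rows, keep=KEEP, flag="stopped"):
    """Drop rows flagged as stopped, then keep only the wanted columns."""
    cleaned = []
    for row in rows:
        if (row.get(flag) or "").strip():
            continue  # assumption: any value here means the system is closed
        cleaned.append({name: row.get(name, "") for name in keep})
    return cleaned


# Invented rows that only imitate the shape of the scraped table.
rows = [
    {"community": "Town A", "local exchange system": "Currency A; LETS",
     "url": "http://example.org/a", "stopped": "", "members": "120"},
    {"community": "Town B", "local exchange system": "Currency B; Time bank",
     "url": "http://example.org/b", "stopped": "2009", "members": "40"},
]

active = clean(rows)
print(active)
```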

Here we notice a nice bit of trouble: in the “local exchange system” column, the name of the currency and its type sit in the same cell, separated only by a semicolon.

The solution? Easy, at least with my old-school version of Excel. Here’s the exact procedure explained by the Microsoft staff.
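
Outside Excel, the same split takes a couple of lines of Python. This is a minimal sketch: the function name split_system and the sample cell are mine, and it assumes the first semicolon is the one that separates name and type.

```python
def split_system(cell):
    """Split 'name; type' into (name, type); type is empty if there is no semicolon."""
    name, _, kind = cell.partition(";")   # splits on the first semicolon only
    return name.strip(), kind.strip()


print(split_system("Currency A; LETS"))
```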

Then I inserted a country column beside the community one, because I was expecting to use Tableau Public as I did last time. In the end, though, I wanted to change approach and opted for a new tool, Google Fusion Tables…



Embedding with WordPress: a failure

While I was writing this post about my experience with Tableau Public I came across a problem: how can I embed something in a WordPress blog post? The most obvious answer that came to mind was to copy and paste the embed code into the HTML view of the post. Nothing appeared, apart from a “produced by Tableau Public” line at the bottom of the post.

Googling the problem, I found plenty of people with roughly the same issue. In this forum, for instance, someone suggests removing part of the embed code, while here the owner of the site decided to leave WordPress and move to Blogger. This is why, in that post, I decided to insert an image of the Tableau dashboard. The problem seems to be related to an “iframe” plugin, available only for the pro version of WordPress. I found nothing for the free online one.

By definition, an “iframe” tag is “an inline frame used to embed another document within the current HTML document“.

Then I had a go with Blogger: I went into the HTML section, pasted the code and, guess what, it didn’t work either. Searching on Google, I quickly found the solution in a blog post by Jerome Cukier, which led me to a satisfying victory. A pity that it is on a platform that I don’t use.

Should I change web host as well?

Visualize with Tableau Public

I have previously written that I consider working with data part of good investigative journalism. The inevitable next step is the ability to represent the data in a way that the reader can easily understand.

Fortunately, and unlike a decade ago, the web is full of useful tools that can achieve this for users with different levels of expertise.

But let’s start from the problem. Say we want to visualize a specific kind of data, like the ecological footprint of every country in the world. To make sense of the data, we need to picture it in a way that lets the different countries’ levels be compared with each other.

This is what my final work looks like, after experimenting with Tableau Public.

Tableau experiment

Wondering why I have embedded an image rather than the actual infographic? The reason is explained in this post. What I have tried to do here is a sort of infographic that makes sense of the collected data on its own. In other words, I have embedded some text to describe what is visualized, so that the user can immediately understand the figures.

Would it make sense to embed this entire work in a blog post? In my opinion absolutely not because, having been made for a course about infographics, it can stand on its own. But the versatility of Tableau allows the infographic to be resized from the iPad “landscape format” to a “medium blog post” one, and the written parts to be removed.

What we need first is the raw data in a spreadsheet. I found it on the United Nations Development Programme website, which has a lot of interesting indicators worth working on. You can also combine different indicators in one spreadsheet to look for correlations, for instance between GDP per capita and gender inequality.

Once the spreadsheet is downloaded and cleaned, it is all about following the staff’s instructions in this video.

The good 

creative versatility: with Tableau you can create a dashboard, which is what the viewer actually sees, assembling different visualization tools, like charts, maps and scatterplots, in the same context, rather than embedding them separately in a website.

multiple datasets: one workbook supports as many datasets as you need, although it becomes easier to get confused among multiple pages and field names.

The bad

a bit fussy: during the project I often wasn’t able to visualize what I needed correctly because the values weren’t set in the right format. With a bit of trial and error I made it, but sometimes without really knowing the reason.

The ugly

my data management knowledge: what I really wanted was a filter based on continents, but I wasn’t able to insert a continent column next to the country names. With my current knowledge this would have meant manually writing the continent next to every country’s name. Unthinkable for more than 170 countries.

Any ideas that could help solve this problem?
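
One possible answer, sketched in Python rather than in Tableau, is a lookup table: list each country with its continent once, then let a script fill in the new column. Everything here is illustrative. The four-entry table, the “country” field name and the sample rows are assumptions, and a real table would need to cover every country in the spreadsheet.

```python
# Hypothetical lookup table: in practice it would cover every country and
# could itself be loaded from a two-column spreadsheet.
CONTINENT_BY_COUNTRY = {
    "Italy": "Europe",
    "Greece": "Europe",
    "Japan": "Asia",
    "Brazil": "South America",
}


def add_continent(rows, lookup, default="Unknown"):
    """Return copies of the rows with a 'continent' field filled from the lookup."""
    enriched = []
    for row in rows:
        new_row = dict(row)   # copy, so the original rows are left untouched
        new_row["continent"] = lookup.get(row["country"], default)
        enriched.append(new_row)
    return enriched


# Placeholder rows; only the country names matter for this sketch.
rows = [{"country": "Italy"}, {"country": "Japan"}, {"country": "Atlantis"}]
for row in add_continent(rows, CONTINENT_BY_COUNTRY):
    print(row["country"], "->", row["continent"])
```

Countries missing from the table come out as “Unknown”, which makes the gaps easy to spot and fix by hand.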

The big step: from pen and notepad to data journalism and visualization

The main purpose that has led me up to Birmingham from sunnier Ancona is to innovate and widen my journalism skills. Indeed, this is more important than any mark that I will get for my projects in this MA.

What has attracted me throughout my experience of studying and working in journalism is the completeness of storytelling that a good investigative journalism style requires. With this in mind, and with a hunger to explore, I started months ago, before enrolling in the online journalism course, to research how to do proper data journalism, from basic (really basic) data scraping up to some visualization experiments. I then slightly improved these skills during the Massive Open Online Course (MOOC) held by the Knight Center for Journalism in the Americas.

Now, during the Multimedia Production module, I’ll get closer to the data journalism world, with the intention of also exploring other themes and media, like the mobile revolution and the miscellaneous products that can be made on the web.

These two are among my favourite examples:



And below is an infographic, made during the MOOC, about the ecological footprint of states.
Please be nice, it was just an experiment.

Ecological Footprint 1