17 Jul 2016

Treemap and Sunburst Excel2016 charts for Tour De France


#57 Treemap and Sunburst Excel2016 charts for Tour De France


In 2015 Tour de France contracted the company Dimension Data to make the event as high tech as possible. As DD said themselves: 


It has completed its big data analytics and digital delivery platform from which the company will deliver real-time information on individual riders for the first time in the history of professional cycling. This major innovation was made possible by Amaury Sport Organisation (ASO) in partnership with the 22 teams participating in the 2015 Tour de France. Highly accurate data will be collected through the use of live trackers under the saddle of each rider. Dimension Data will then process and analyse the data, and make it available to cycling fans, commentators, broadcasters and the media.

When the Tour de France begins on Saturday, the viewing public around the world will be able to follow all 198 riders in 22 teams real-time, and be able to track the speed at which each cyclist is riding, exactly where he’s positioned in the race in relation to other cyclists, and the distance between each rider - all via a beta live tracking website.

And I think they did an excellent job, and also in this year's edition.
On one of their Twitter-posts:
https://twitter.com/letourdata/status/624565612048744448 
 I once saw a graphic (data: total % of time a cyclist was leading in the breakaway group) which I didn't know, and which was something which could not be made with Excel (2013), but a bit later, which the launch of Excel-2016, I learnt that this graph was a Treemap, so now supported by Excel, and I tried to reproduce, see e.g. fig.1


fig.1 Treemap chart wit data from teams and cyclists


Besides the Treemap chart, Excel 2016 had another new graph, the Sunburst chart, see fig.2 for an example




fig.2 Sunburst chart wit data from countries and cyclists


Both the Treemap and the Sunburst-chart are for displaying hierarchical data, so e.g. in Tour de France, cyclists are part of a team, so you can compare the teams performances (total times) and breaking it down to the indivual cyclists performances, as done in fig.1. Another possible grouping is cyclists per country as in fig.2, although this is more artificial, so not something which is considered by the Tour de France, while grouping by teams is of course (there is a prize for the best team). 

And if fig.3 you can see the equivalent of fig.2, but now in Treemap format. Maybe this format is handier if you have less space, like e.g. Dimension Data had, where this chart was part of a dashboard with other charts.


 fig.3 Treemap chart wit data from countries and cyclists

And in fig.4 you can see the original Dimension Data Treemap, which in fact did not have any hierarchical data, just data of cyclists. In this chart it is of course the easiest to compare data of 2 riders, e.g. the 2 best riders in the breakaway-group were Roland and Rodriguez (both in left of chart, with same square-size).

fig.4: Treemap charta with data from cycists.


If you want to read more about Big Data brought by Dimension Data to Tour de France, see e.g.:

And for a video explaining more about Treemap charts in Excel, see: 

For me, the new high-tech dimension (also by using onboard cameras) makes the Tour de France much more interesting to follow, and of course the fact that the Dutch are doing very well in TDF2016 (Tom Dumoulin 2 stage victories and Bouke
Mollema #2 in general ranking).

Downloads

Excel

PDF
https://goo.gl/9rDyKAhttps://goo.gl/9rDyKA





5 Jun 2016

Month-calendar for June 2016 in Excel

#56 Month-calendar for June 2016 in Excel

In this post the new month-calendar for June, see fig.1. The Excel also has the year-calendar for 2016 and a 2016-year-planner. Event of this month is UEFA Euro 2016 France, hold on 10/6 - 10/7. Unfortunately the Netherlands did not qualify, so this time I was not so motivated to make a new version of Match for this Eurocup, or a new Excel for an office-pool, like I did for last FIFA Worldcup. But of course you can do it by yourself, by adapting these 2 files (Access and Excel):

http://worktimesheet2014.blogspot.com.es/2014/05/match-access-app-to-track-world-cup.html

http://worktimesheet2014.blogspot.com.es/2014/05/pool-for-betting-2014-world-cup-soccer.html

For a very nice Excel with a EURO 2016 tracker, which automatically calculates the points, who go to next round etc (by using formulas), see:

http://www.excely.com/football/euro-2016-schedule.shtml




fig.1: Month calendar 201606


Downloads:

https://goo.gl/BA0cJU







2 May 2016

Month-calendar for May 2016 in Excel

#55 Month-calendar for May 2016 in Excel

In this post the new month-calendar for May, see fig.1. The Excel also has the year-calendar for 2016 and a 2016-year-planner. Event of this month is Roland Garros, the French Open ATP tennis tournament, 22/5-5/6.




 fig.1: month-calendar May 2016




Downloads:

https://goo.gl/q3lwNW





2 Apr 2016

Month-calendar for April 2016 in Excel

#54 Month-calendar for April 2016 in Excel

In this post the new month-calendar for April, see fig.1. The Excel also has the year-calendar for 2016 and a 2016-year-planner. Event of this month is 'El Clasico', Barcelona vs Real Madrid, Sat. April 2, in the stadium of Barca, Nou Camp, in which there will be a 'remembrance-mosaic' with the Barca-shirt with number 14´, the number of Johan Cruyff, the Dutch legendary player and coach of this club, who died last March 24. Cruyff was besides for his football also famous for his oracle-like aphorisms, like "every disavantage has its advantage" and his logic like : I do not believe. In Spain  all 22 players make the sign of the cross before they enter the field, and if it would work, it would always be a draw." , see:


--

--
Fig.1: calendar April 2016

12 Mar 2016

Month-calendar for March2016 in Excel

#53 Month-calendar for March2016 in Excel

In this post the new month-calendar for March2016, see fig.1. The Excel also has the year-calendar for 2016 and a 2016-year-planner. Event of this month is la Volta Ciclista a Catalunya 2016 (in Spain). For more info about this event , see:





fig.1: calendar March 2016


Downloads:

29 Feb 2016

20Minutos Blog Awards 2015 Statistics with Excel and R

#52: 20Minutos Blog Awards 2015 Statistics with Excel and R

NB: Si prefieres leer este blog en español, recuerda que con Google-Blogger (el herramiento usado para este blog) es posible traducir este blog de ingles a español. El 'Google Traductor' esta en el menu a la derecha.

The Spanish newspaper 20Minutos, organizes the 10th edition of its Blog Awards ("Premios 20Blogs"), with 7652 participating blogs from (mostly) Spanish speaking bloggers. Until 2nd of March everybody with a 20Minutos account can vote 20 times, one time for every blog-category, and the blog with the most votes wins the public prize, although the jury decides who will be the winner of the 1st prize. And like last year, I'm participating with this blog. 20Minutos allows you to promote your own blog to attract votes in the Blog Awards, so: if you like my blog, you can vote for it here:

http://lablogoteca.20minutos.es/excel-ejemplos-para-tu-trabajo-deportes-y-mas-50022/0/

To do so, you have to register here:

http://www.20minutos.es/usuarios/registro/

But enough self-promotion, let´s start with what this blog is about: MS Excel. For the 2014-blog-competition I made an Excel with some statistics, see:
http://worktimesheet2014.blogspot.com/2015/02/excel-statistics-for-20minutos-blog.html

and for this year's edition I did the same, but this time I made other statistics, and not only with Excel but also with R. My idea was to get into 1 Excel all the blog-records (with data like URL, Author, Title, Description and Category in which blog is participating), and to get all this data, I used the (free) tool https://www.import.io/ for web-scraping the 20Min. website for the blog-contest: http://lablogoteca.20minutos.es/premios-20blogs/

For more info about this great data-extract tool, see e.g.:

http://www.interhacktives.com/2014/03/06/scrape-data-without-coding-step-step-tutorial-import-io/

https://www.youtube.com/watch?v=V8lHaaAWGXc&feature=youtu.be

The result of the import had some 300 duplicate records, but with Excel it's easy to eliminate duplicates, see e.g.:

https://support.office.com/en-us/article/Filter-for-unique-values-or-remove-duplicate-values-ccf664b0-81d6-449b-bbe1-8daaec1e83c2

To this dataset I added some extra calculated fields like country of blog (only posible if domain-name of website has a country-code like "es" (Spain) and blog-platform (Blogger, Wordpress etc), see fig.2a.
NB: My (V)lookup-table with the mapping country-code - country I got from here:
http://www.web-wise-wizard.com/internet-dns-web/internet-iddd-country-codes.html

And with this dataset, I made the statistics which you can see in the figures 2b, 3 etc below, using pivot-tables and pivot-charts. But before getting into more detail about these stats I present in fig.1 a Word Cloud for which I used R (the script is included in the Download-section below). I used R to make the world cloud (instead of using a site like http://www.wordle.net/) to learn a bit about R, and because of it's extensive Text Mining capabilities. Some sites which helped me with this are:

https://eight2late.wordpress.com/2015/05/27/a-gentle-introduction-to-text-mining-using-r/

http://www.webmining.cl/2012/07/text-mining-de-twitter-usando-r/

To make this word cloud, I used the column Description (of a blog), a 'memo-field'.
Not suprisingly, the most used word is... 'Blog', and then comes 'World', 'Life'', 'Ínformation'.

.


fig.1 Word Cloud of Blog-descriptions of all blogs participating in 20Min. Blog-contest.





fig.2a: Data imported in Excel from 20Min. (formula-columns which are not interesting for user are hidden).



fig.2b: total Blogs per category'.

NB: Note that the totals in this chart correspond almost 100% to the ones on the 20Min.site (http://lablogoteca.20minutos.es/premios-20blogs/ ), as it should, but with a difference of 1 (in my stats, category Viajes has 404 records, not 405 as on the 20Min. site, I wander if that is because of an error in my data-extraction from this site or maybe an error by 20Min.?)




fig.3: Total Blogs per Platform

NB: Google's Blogger (domain Blogspot) appear more popular then Wordpress.




fig.4: Total Blogs per Country

NB: for those of you who have never heard of Tokelau: this (is)land offers a 1 year free (tk) domain (for everyone, so not only for the people of TK), after that you have to pay, or they close your site, which happened to my site, http://www.eigersoftware.tk ..





fig.5: Total Blogs per Author


NB:
- this pivot-table report filters authors with >= 7 blogs, for how you can do this filtering, see:
http://stackoverflow.com/questions/12209841/filtering-pivot-table-columns-only-count-if

- 'Grupo LM' has a total of 28 blogs in the competition, and I found out that this is a company which pays bloggers if they send them a blog-post. To see all 28 blogs, double-click on measure-value '28' in the pivot-table, and then Excels opens a new worksheet with the 28 records.
I wonder why the blog-authors who work for LM did not sign up for the blog-contest with their own name.

And to conclude this post, I´d like to say good luck ('mucho suerte') to all my fellow-bloggers in the "Premios 20Blogs" Blog Awards.
BTW: To decide on which blog to vote, I also searched for Éxcel' in my Excel and found several blogs, e.g. this excelent blog, from Argentina (on which I voted):

http://excelmascerca.blogspot.com.es/


Downloads:


#Mirror 1: Google Drive (zip file with Excel and R-script files):

https://goo.gl/ZAf54w


#Mirror 2: Microsoft Onedrive (Excel file):

http://1drv.ms/1REIjjC


#Mirror 3: Scribd.com (PDF file):

https://www.scribd.com/doc/301523553/20minutos-blogcompetition-2015-statistics-with-excel-and-r



2 Feb 2016

Month-calendar for February 2016 in Excel

#51 Month-calendar for February 2016 in Excel

In this post the new month-calendar for February 2016, see fig.1. The Excel also has the year-calendar for 2016 and a 2016-year-planner. Event of this month is Lillehammer 2016 Youth Olympic Games (YOG), the Wintergames for children from 15-18 years. For more info about this event , see:

http://www.lillehammer2016.com/



fig,1 Calendar for month February 2016


Downloads:

https://goo.gl/Ap3yxl