19 Dec 2016

Carrera de la Empresas 2016 Madrid Statistics made with R

#59: 10K Carrera de las Empresas 2016 Madrid: Statistics made with R + included in Power BI

Note 20/12/2016: I updated this post with a Power BI solution (with R-Stats and Q&A), see part 2 of this post.


Part 1: Statistics with R 

Today the 'business run' ("Carrera de las Empresas") was held here in Madrid. I participated with Raet, a Dutch ISV of a SAAS-solution for Human Resources, which since almost 2 years now has a software development centre in Madrid.

For an earlier post of this run (with more details), see:
http://worktimesheet2014.blogspot.com.es/2015/12/statistics-result-10-km-run-carrera-de.html

This time, there were over 14.000 runners, from 800 companies, for more details, see:

http://www.expansion.com/directivos/deporte-negocio/2016/12/19/58578b2ae2704ef1758b4588.html
The results of the 10K run (in which I participated) I got from here:

http://www.carreradelasempresas.com/sec_clasificaciones.asp
'
and for the statistics, see this page (a social/runners network): http://www.runedia.com/cursa/201628906/carrera-de-las-empresas-10k-actualidad-economica/2016/

The results (10K-Men, 5563 runners)  are in PDF format, which I converted to Excel, with help of this nice (free) online tool:
http://www.pdftoexcel.com/

After some 'cleansing' (see added yellow-columns in Excel file) I had the data ready for analysis, for which I used R, the language for statistical computing.
NB: For an earlier post in which I used R, see:

http://worktimesheet2014.blogspot.com.es/2016/02/20minutos-blog-awards-2015-statistics.html


The R-script I made looked basically like this:


sv10km2 <- read.csv("10Km2.csv")
edit(csv10km2)
Tfinish=as.numeric(as.difftime(as.character(csv10km2$Tfinish),format="%H:%M:%S"))
summary(Tfinish)
plot(density(Tfinish,na.rm=TRUE),lwd=3,xlab="Finish Time (mins)", main="Madrid Business Run 2016 10k Results")
View(csv10km2)
boxplot(Tfinish, data=csv10km2)
install.packages("sqldf")
newdata <-  sqldf('select * from csv10km2 where Team like %Raet%')


Input for the script has to be in the MyDocument-folder, in my case: D:\Persoonlijk\maartenree\Documents\10Km2.csv

See figures below for the result of my analysis.

And for those who don't have R, but want to do the same as I did for my company (in fig.3), so filter the results, I also added the Excel where the finish-results are in table-format (so easy to filter).



Fig.1: Plot with finish-times and Five-number summary


Fig.2: Boxplot  with finish-times



Fig.3: Filter dataset with finish-times for (teams of) company Raet


Part 2: Power BI + R Stats/Charts

Power BI has an option to include R Stats/Charts, and so I wanted to add the above R-chart 'Plot finish-times' (fig.1) to Power BI, and the filtering-option in Power BI would make it easy to 'slice-and-dice' this chart, in this case: filtering for a team, and showing the runners in that team. To be able to do this, I had to convert the Excel-column with the finish-times in 'duration-format' ([h]:mm:ss) to seconds, see column Tfinish2 in attached file 10Km3.xlsx and fig.4.

Fig.4: Excel with new column Tfinish2 (time in seconds)


And for the final result in Power BI (Desktop), see fig.5.
Note that slicer for team 'dimension' has a pre-filter (in this case with value 'Raet') to limit the number of items in the slicer (very helpful when there are over 1000 teams). And to filter for >1 team (like I did, with 3 Raet-teams 'checked'): press Control + click in checks for filter values. And note that the chart is reacting on the filter (so Power BI filter and R-chart are connected).


Fig.5 Power BI Desktop: report finish-times (in second) filtered for teams of Raet



I also uploaded the Power BI Desktop report to Power BI online, see:

https://app.powerbi.com/view?r=eyJrIjoiZDQ1YzFkZTAtODExNi00OWRmLWFlOTAtMzI0M2JjZWE4MzQ0IiwidCI6ImI3OWIyMzE3LTM0ZGQtNDNlNS05MWEyLWNkNjZkM2FlMWYwYiIsImMiOjh9

To be able to use R-Stats/Charts in Power BI online, you need to have a Power BI Pro account. About how to get a free trial-account, see:

https://powerbi.microsoft.com/en-us/documentation/powerbi-admin-purchasing-power-bi-pro/#power-bi-pro-60-day-trial

With Power BI Online it is also possible to do Q&A in natural language (English). See fig.6-7. For this example to work, I had to rename the Excel-columns Runner -> RunnerName and Team -> TeamName, see:

https://powerbi.microsoft.com/en-us/documentation/powerbi-service-q-and-a-tips/

Fig.6 Power BI and Q&A 1/2

Fig.7 Power BI and Q&A 2/2



References

http://cnr.lwlss.net/RealData/

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4317364/


Downloads

R:
v1: https://goo.gl/gdjPRk

PowerBI:

v2:
https://goo.gl/S43i2O

v2b: (better chart than v2)
https://goo.gl/gn4o8k

9 Dec 2016

PowerBI Dashboard for Messi´s worldrecord of 91 goals in one year


#58 PowerBI Dashboard for Messi´s worldrecord of 91 goals in one year

NB: 
- This post is not about an Excel I made (but about a Microsoft Power BI dashboard I made, which is (almost) a copy of an Excel workbook I blogged about earlier), so I tagged it with the 'More' category.

Once I made an Excel with a PowerView dashboard for Messi's worldrecord of #goals in one year (91), see:

http://worktimesheet2014.blogspot.com.es/2015/08/dashboard-for-messis-worldrecord-of-91.html

In this post, I made a new version of this dashboard, but now in Microsoft PowerBI, and also with some improvements, e.g. the treemap-view for all matches in which Messi scored, aggregated by the team and competition where he made his goals. NB: For more info about the Treemap-chart, see my previous post:

http://worktimesheet2014.blogspot.com/2016/07/treemap-and-sunburst-excel2016-charts.html

See fig.1 for the end-result, in PowerBI Desktop.



Fig.1 Messi-Dashboard in PowerBI Desktop

The nice thing of PowerBI-dashboards is that they are suited for any device, so apart from PC also smartphone or tablet, see fig.2, in PowerBI-online (cloud-service), so this is a site for sharing your dashboards, for mine, see:

https://app.powerbi.com/view?r=eyJrIjoiOWYxMTVkNTUtZjJjMy00NTcyLWJlMzctNmNlYzE5ZmQyYjM1IiwidCI6ImI3OWIyMzE3LTM0ZGQtNDNlNS05MWEyLWNkNjZkM2FlMWYwYiIsImMiOjh9

This URL Power BI gave me after choosing menu-option: "File > Publish to web", which should make the dashboard 'public', see:
https://michielquakernaat.com/2016/02/03/want-to-share-your-power-bi-report-to-the-world-yes-we-can/

Or you can use this URL: https://app.powerbi.com/groups/me/dashboards/729276ac-365c-4473-9d42-1969ddd16dac

There is also a Power BI Community page which has a 'Data Stories Gallery', and I posted my 'Messi-dashboard' here:
http://community.powerbi.com/t5/Data-Stories-Gallery/Messi-s-worldrecord-of-total-goals-in-1-year-91-goals-in-2012/m-p/100642#M485

And here I found another nice Messi-dashboard:
http://community.powerbi.com/t5/Data-Stories-Gallery/Incredible-Lionel-Messi/m-p/82225/highlight/false#M386

So this is similar to this page of the Tableau-community:
https://public.tableau.com/en-us/s/gallery



Fig.2: Messi-Dashboard in PowerBI (online) on PC and smartphone


2 Nice build-in features of Power BI are:
- Quick Insight in the dataset (source of dashboard), which generates several 'data-profile' charts, see fig.2b
- Natural language Q&A, user types his question (in English), using some 'keywords' like 'by' to specify the 'slice-and-dice' dimensions, and Power BI answers (with charts) , see fig.2c.

For more details about these features (and Power BI in general), see:
http://www.computerworld.com/article/3088958/data-analytics/free-data-visualization-with-microsoft-power-bi-your-step-by-step-guide-with-video.html



Fig.2b: Quick Insight



Fig.2c: Natural language Q&A

As said in my previous blog-post, my Messi-dashboard was inspired by one made with BI/Dashboard-tool Tableau  (for which PowerBI was Microsofts answer/alternative), see fig.3 for the this version.


Fig.3: original Messi-dashboard in (free trial version of ) Tableau

To see another really nice example of a PowerBI-report with the results of the Spanish soccer League made by a friend, see fig.4 and his site: http://Ligasantanderbi.azurewebsites.net



Fig.4: PowerBI (with Azure) example with results Spanish soccer League

So with Power BI you can make really nice dashboards, and with the frequent updates/releases of Power BI, much more frequent than that of MS Office (Excel), one could wonder if for 'self-service BI' Power BI is the way to go (and not Excel with its 'Power-plugins' like Power View, Power Map etc.) Some say it is, see e.g.:

http://www.cio.com/article/2979725/enterprise-software/why-power-bi-is-the-future-of-excel.html

while others say you can/should use the 'pack' 'Power BI + Exel', see:
https://www.youtube.com/watch?v=jmTedSuKers

And what do you think? Comments are welcome (but please not anonymously, because I get a lot of 'fake' comments (publicity) from anonymous users, so I don't read anonymous comments.

Note 12/12/2016:
Although I haven't got any comments on this post (yet), apparently the 'editor' of The #PowerBI Daily (on www.paper.li), liked this post, because he published it here today, see fig.5.
NB: Paper.li is a site to create your own (virtual) newspaper of tweets of persons you follow, for more info, see:
http://www.takeflyte.com/flyte/2010/12/what-is-paper-li-a-newspaper-of-tweets.html

Fig.5: Blog-post referenced on  The #PowerBI Daily (on www.paper.li) 



References

-1: Good explanation of what is "Business Intelligence (BI): Multidimensional Analysis"
https://youtu.be/IhFkNmVmwn4

-2: A 'BI-solution' actually consists of 2 components: a Data Warehouse (DW), the (invisible) back-room ('the database') and BI, the (visual) front-room (the dashboard), where DW is normally about 80% of the solution, and BI 20 %, so it is the top of the ice-berg, as described here:
https://www.datamart.de/competence/business-intelligence-data-warehouse/Seiten/default.aspx


Downloads
Pbix-file, so PowerBI-report:

https://goo.gl/Ex4XbB






17 Jul 2016

Treemap and Sunburst Excel2016 charts for Tour De France


#57 Treemap and Sunburst Excel2016 charts for Tour De France


In 2015 Tour de France contracted the company Dimension Data to make the event as high tech as possible. As DD said themselves: 


It has completed its big data analytics and digital delivery platform from which the company will deliver real-time information on individual riders for the first time in the history of professional cycling. This major innovation was made possible by Amaury Sport Organisation (ASO) in partnership with the 22 teams participating in the 2015 Tour de France. Highly accurate data will be collected through the use of live trackers under the saddle of each rider. Dimension Data will then process and analyse the data, and make it available to cycling fans, commentators, broadcasters and the media.

When the Tour de France begins on Saturday, the viewing public around the world will be able to follow all 198 riders in 22 teams real-time, and be able to track the speed at which each cyclist is riding, exactly where he’s positioned in the race in relation to other cyclists, and the distance between each rider - all via a beta live tracking website.

And I think they did an excellent job, and also in this year's edition.
On one of their Twitter-posts:
https://twitter.com/letourdata/status/624565612048744448 
 I once saw a graphic (data: total % of time a cyclist was leading in the breakaway group) which I didn't know, and which was something which could not be made with Excel (2013), but a bit later, which the launch of Excel-2016, I learnt that this graph was a Treemap, so now supported by Excel, and I tried to reproduce, see e.g. fig.1


fig.1 Treemap chart wit data from teams and cyclists


Besides the Treemap chart, Excel 2016 had another new graph, the Sunburst chart, see fig.2 for an example




fig.2 Sunburst chart wit data from countries and cyclists


Both the Treemap and the Sunburst-chart are for displaying hierarchical data, so e.g. in Tour de France, cyclists are part of a team, so you can compare the teams performances (total times) and breaking it down to the indivual cyclists performances, as done in fig.1. Another possible grouping is cyclists per country as in fig.2, although this is more artificial, so not something which is considered by the Tour de France, while grouping by teams is of course (there is a prize for the best team). 

And if fig.3 you can see the equivalent of fig.2, but now in Treemap format. Maybe this format is handier if you have less space, like e.g. Dimension Data had, where this chart was part of a dashboard with other charts.


 fig.3 Treemap chart wit data from countries and cyclists

And in fig.4 you can see the original Dimension Data Treemap, which in fact did not have any hierarchical data, just data of cyclists. In this chart it is of course the easiest to compare data of 2 riders, e.g. the 2 best riders in the breakaway-group were Roland and Rodriguez (both in left of chart, with same square-size).

fig.4: Treemap charta with data from cycists.


If you want to read more about Big Data brought by Dimension Data to Tour de France, see e.g.:

And for a video explaining more about Treemap charts in Excel, see: 

For me, the new high-tech dimension (also by using onboard cameras) makes the Tour de France much more interesting to follow, and of course the fact that the Dutch are doing very well in TDF2016 (Tom Dumoulin 2 stage victories and Bouke
Mollema #2 in general ranking).

Downloads

Excel

PDF
https://goo.gl/9rDyKAhttps://goo.gl/9rDyKA





5 Jun 2016

Month-calendar for June 2016 in Excel

#56 Month-calendar for June 2016 in Excel

In this post the new month-calendar for June, see fig.1. The Excel also has the year-calendar for 2016 and a 2016-year-planner. Event of this month is UEFA Euro 2016 France, hold on 10/6 - 10/7. Unfortunately the Netherlands did not qualify, so this time I was not so motivated to make a new version of Match for this Eurocup, or a new Excel for an office-pool, like I did for last FIFA Worldcup. But of course you can do it by yourself, by adapting these 2 files (Access and Excel):

http://worktimesheet2014.blogspot.com.es/2014/05/match-access-app-to-track-world-cup.html

http://worktimesheet2014.blogspot.com.es/2014/05/pool-for-betting-2014-world-cup-soccer.html

For a very nice Excel with a EURO 2016 tracker, which automatically calculates the points, who go to next round etc (by using formulas), see:

http://www.excely.com/football/euro-2016-schedule.shtml




fig.1: Month calendar 201606


Downloads:

https://goo.gl/BA0cJU







2 May 2016

Month-calendar for May 2016 in Excel

#55 Month-calendar for May 2016 in Excel

In this post the new month-calendar for May, see fig.1. The Excel also has the year-calendar for 2016 and a 2016-year-planner. Event of this month is Roland Garros, the French Open ATP tennis tournament, 22/5-5/6.




 fig.1: month-calendar May 2016




Downloads:

https://goo.gl/q3lwNW





2 Apr 2016

Month-calendar for April 2016 in Excel

#54 Month-calendar for April 2016 in Excel

In this post the new month-calendar for April, see fig.1. The Excel also has the year-calendar for 2016 and a 2016-year-planner. Event of this month is 'El Clasico', Barcelona vs Real Madrid, Sat. April 2, in the stadium of Barca, Nou Camp, in which there will be a 'remembrance-mosaic' with the Barca-shirt with number 14´, the number of Johan Cruyff, the Dutch legendary player and coach of this club, who died last March 24. Cruyff was besides for his football also famous for his oracle-like aphorisms, like "every disavantage has its advantage" and his logic like : I do not believe. In Spain  all 22 players make the sign of the cross before they enter the field, and if it would work, it would always be a draw." , see:


--

--
Fig.1: calendar April 2016

12 Mar 2016

Month-calendar for March2016 in Excel

#53 Month-calendar for March2016 in Excel

In this post the new month-calendar for March2016, see fig.1. The Excel also has the year-calendar for 2016 and a 2016-year-planner. Event of this month is la Volta Ciclista a Catalunya 2016 (in Spain). For more info about this event , see:





fig.1: calendar March 2016


Downloads: