19 Dec 2016

Carrera de la Empresas 2016 Madrid Statistics made with R

#59: 10K Carrera de las Empresas 2016 Madrid: Statistics made with R + included in Power BI

Note 20/12/2016: I updated this post with a Power BI solution (with R-Stats and Q&A), see part 2 of this post.


Part 1: Statistics with R 

Today the 'business run' ("Carrera de las Empresas") was held here in Madrid. I participated with Raet, a Dutch ISV of a SAAS-solution for Human Resources, which since almost 2 years now has a software development centre in Madrid.

For an earlier post of this run (with more details), see:
http://worktimesheet2014.blogspot.com.es/2015/12/statistics-result-10-km-run-carrera-de.html

This time, there were over 14.000 runners, from 800 companies, for more details, see:

http://www.expansion.com/directivos/deporte-negocio/2016/12/19/58578b2ae2704ef1758b4588.html
The results of the 10K run (in which I participated) I got from here:

http://www.carreradelasempresas.com/sec_clasificaciones.asp
'
and for the statistics, see this page (a social/runners network): http://www.runedia.com/cursa/201628906/carrera-de-las-empresas-10k-actualidad-economica/2016/

The results (10K-Men, 5563 runners)  are in PDF format, which I converted to Excel, with help of this nice (free) online tool:
http://www.pdftoexcel.com/

After some 'cleansing' (see added yellow-columns in Excel file) I had the data ready for analysis, for which I used R, the language for statistical computing.
NB: For an earlier post in which I used R, see:

http://worktimesheet2014.blogspot.com.es/2016/02/20minutos-blog-awards-2015-statistics.html


The R-script I made looked basically like this:


sv10km2 <- read.csv("10Km2.csv")
edit(csv10km2)
Tfinish=as.numeric(as.difftime(as.character(csv10km2$Tfinish),format="%H:%M:%S"))
summary(Tfinish)
plot(density(Tfinish,na.rm=TRUE),lwd=3,xlab="Finish Time (mins)", main="Madrid Business Run 2016 10k Results")
View(csv10km2)
boxplot(Tfinish, data=csv10km2)
install.packages("sqldf")
newdata <-  sqldf('select * from csv10km2 where Team like %Raet%')


Input for the script has to be in the MyDocument-folder, in my case: D:\Persoonlijk\maartenree\Documents\10Km2.csv

See figures below for the result of my analysis.

And for those who don't have R, but want to do the same as I did for my company (in fig.3), so filter the results, I also added the Excel where the finish-results are in table-format (so easy to filter).



Fig.1: Plot with finish-times and Five-number summary


Fig.2: Boxplot  with finish-times



Fig.3: Filter dataset with finish-times for (teams of) company Raet


Part 2: Power BI + R Stats/Charts

Power BI has an option to include R Stats/Charts, and so I wanted to add the above R-chart 'Plot finish-times' (fig.1) to Power BI, and the filtering-option in Power BI would make it easy to 'slice-and-dice' this chart, in this case: filtering for a team, and showing the runners in that team. To be able to do this, I had to convert the Excel-column with the finish-times in 'duration-format' ([h]:mm:ss) to seconds, see column Tfinish2 in attached file 10Km3.xlsx and fig.4.

Fig.4: Excel with new column Tfinish2 (time in seconds)


And for the final result in Power BI (Desktop), see fig.5.
Note that slicer for team 'dimension' has a pre-filter (in this case with value 'Raet') to limit the number of items in the slicer (very helpful when there are over 1000 teams). And to filter for >1 team (like I did, with 3 Raet-teams 'checked'): press Control + click in checks for filter values. And note that the chart is reacting on the filter (so Power BI filter and R-chart are connected).


Fig.5 Power BI Desktop: report finish-times (in second) filtered for teams of Raet



I also uploaded the Power BI Desktop report to Power BI online, see:

https://app.powerbi.com/view?r=eyJrIjoiZDQ1YzFkZTAtODExNi00OWRmLWFlOTAtMzI0M2JjZWE4MzQ0IiwidCI6ImI3OWIyMzE3LTM0ZGQtNDNlNS05MWEyLWNkNjZkM2FlMWYwYiIsImMiOjh9

To be able to use R-Stats/Charts in Power BI online, you need to have a Power BI Pro account. About how to get a free trial-account, see:

https://powerbi.microsoft.com/en-us/documentation/powerbi-admin-purchasing-power-bi-pro/#power-bi-pro-60-day-trial

With Power BI Online it is also possible to do Q&A in natural language (English). See fig.6-7. For this example to work, I had to rename the Excel-columns Runner -> RunnerName and Team -> TeamName, see:

https://powerbi.microsoft.com/en-us/documentation/powerbi-service-q-and-a-tips/

Fig.6 Power BI and Q&A 1/2

Fig.7 Power BI and Q&A 2/2



References

http://cnr.lwlss.net/RealData/

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4317364/


Downloads

R:
v1: https://goo.gl/gdjPRk

PowerBI:

v2:
https://goo.gl/S43i2O

v2b: (better chart than v2)
https://goo.gl/gn4o8k

9 Dec 2016

PowerBI Dashboard for Messi´s worldrecord of 91 goals in one year


#58 PowerBI Dashboard for Messi´s worldrecord of 91 goals in one year

NB: 
- This post is not about an Excel I made (but about a Microsoft Power BI dashboard I made, which is (almost) a copy of an Excel workbook I blogged about earlier), so I tagged it with the 'More' category.

Once I made an Excel with a PowerView dashboard for Messi's worldrecord of #goals in one year (91), see:

http://worktimesheet2014.blogspot.com.es/2015/08/dashboard-for-messis-worldrecord-of-91.html

In this post, I made a new version of this dashboard, but now in Microsoft PowerBI, and also with some improvements, e.g. the treemap-view for all matches in which Messi scored, aggregated by the team and competition where he made his goals. NB: For more info about the Treemap-chart, see my previous post:

http://worktimesheet2014.blogspot.com/2016/07/treemap-and-sunburst-excel2016-charts.html

See fig.1 for the end-result, in PowerBI Desktop.



Fig.1 Messi-Dashboard in PowerBI Desktop

The nice thing of PowerBI-dashboards is that they are suited for any device, so apart from PC also smartphone or tablet, see fig.2, in PowerBI-online (cloud-service), so this is a site for sharing your dashboards, for mine, see:

https://app.powerbi.com/view?r=eyJrIjoiOWYxMTVkNTUtZjJjMy00NTcyLWJlMzctNmNlYzE5ZmQyYjM1IiwidCI6ImI3OWIyMzE3LTM0ZGQtNDNlNS05MWEyLWNkNjZkM2FlMWYwYiIsImMiOjh9

This URL Power BI gave me after choosing menu-option: "File > Publish to web", which should make the dashboard 'public', see:
https://michielquakernaat.com/2016/02/03/want-to-share-your-power-bi-report-to-the-world-yes-we-can/

Or you can use this URL: https://app.powerbi.com/groups/me/dashboards/729276ac-365c-4473-9d42-1969ddd16dac

There is also a Power BI Community page which has a 'Data Stories Gallery', and I posted my 'Messi-dashboard' here:
http://community.powerbi.com/t5/Data-Stories-Gallery/Messi-s-worldrecord-of-total-goals-in-1-year-91-goals-in-2012/m-p/100642#M485

And here I found another nice Messi-dashboard:
http://community.powerbi.com/t5/Data-Stories-Gallery/Incredible-Lionel-Messi/m-p/82225/highlight/false#M386

So this is similar to this page of the Tableau-community:
https://public.tableau.com/en-us/s/gallery



Fig.2: Messi-Dashboard in PowerBI (online) on PC and smartphone


2 Nice build-in features of Power BI are:
- Quick Insight in the dataset (source of dashboard), which generates several 'data-profile' charts, see fig.2b
- Natural language Q&A, user types his question (in English), using some 'keywords' like 'by' to specify the 'slice-and-dice' dimensions, and Power BI answers (with charts) , see fig.2c.

For more details about these features (and Power BI in general), see:
http://www.computerworld.com/article/3088958/data-analytics/free-data-visualization-with-microsoft-power-bi-your-step-by-step-guide-with-video.html



Fig.2b: Quick Insight



Fig.2c: Natural language Q&A

As said in my previous blog-post, my Messi-dashboard was inspired by one made with BI/Dashboard-tool Tableau  (for which PowerBI was Microsofts answer/alternative), see fig.3 for the this version.


Fig.3: original Messi-dashboard in (free trial version of ) Tableau

To see another really nice example of a PowerBI-report with the results of the Spanish soccer League made by a friend, see fig.4 and his site: http://Ligasantanderbi.azurewebsites.net



Fig.4: PowerBI (with Azure) example with results Spanish soccer League

So with Power BI you can make really nice dashboards, and with the frequent updates/releases of Power BI, much more frequent than that of MS Office (Excel), one could wonder if for 'self-service BI' Power BI is the way to go (and not Excel with its 'Power-plugins' like Power View, Power Map etc.) Some say it is, see e.g.:

http://www.cio.com/article/2979725/enterprise-software/why-power-bi-is-the-future-of-excel.html

while others say you can/should use the 'pack' 'Power BI + Exel', see:
https://www.youtube.com/watch?v=jmTedSuKers

And what do you think? Comments are welcome (but please not anonymously, because I get a lot of 'fake' comments (publicity) from anonymous users, so I don't read anonymous comments.

Note 12/12/2016:
Although I haven't got any comments on this post (yet), apparently the 'editor' of The #PowerBI Daily (on www.paper.li), liked this post, because he published it here today, see fig.5.
NB: Paper.li is a site to create your own (virtual) newspaper of tweets of persons you follow, for more info, see:
http://www.takeflyte.com/flyte/2010/12/what-is-paper-li-a-newspaper-of-tweets.html

Fig.5: Blog-post referenced on  The #PowerBI Daily (on www.paper.li) 



References

-1: Good explanation of what is "Business Intelligence (BI): Multidimensional Analysis"
https://youtu.be/IhFkNmVmwn4

-2: A 'BI-solution' actually consists of 2 components: a Data Warehouse (DW), the (invisible) back-room ('the database') and BI, the (visual) front-room (the dashboard), where DW is normally about 80% of the solution, and BI 20 %, so it is the top of the ice-berg, as described here:
https://www.datamart.de/competence/business-intelligence-data-warehouse/Seiten/default.aspx


Downloads
Pbix-file, so PowerBI-report:

https://goo.gl/Ex4XbB