Export Data from Power BI into a file using R

We usually import data from files into Power BI, but exporting data from Power BI can be very handy when you want to create a custom visual using R. In fact, coding your visual directly in the Power BI script editor can be very cumbersome. Here are a few reasons why you should export your Power BI dataset first and re-import it into R to create your visual. IntelliSense is not available in Power BI R script…
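A minimal sketch of the export step, assuming the code runs in a "Run R script" step in Power Query, where Power BI exposes the current table as a data frame named `dataset`. The stand-in data frame and the output path below are hypothetical and only there so the snippet can be tried outside Power BI.

```r
# Stand-in for the data frame Power BI passes to a "Run R script" step.
# Inside Power BI, `dataset` already exists, so this block is skipped there.
if (!exists("dataset")) {
  dataset <- data.frame(Country = c("FR", "BE"), Sales = c(120, 80))
}

# Hypothetical output location; replace it with a folder you can write to.
outputFile <- file.path(tempdir(), "powerbi_export.csv")

# Write the table to a CSV file without the row-number column.
write.csv(dataset, file = outputFile, row.names = FALSE)

# Re-import the file in a regular R session to build the visual there.
exported <- read.csv(outputFile)
```

Once the file is on disk, the visual can be developed in a full R editor and the finished code pasted back into Power BI.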

Import multiple CSV files in R and load them all together in a single data frame

List of all the filenames: One approach I found really straightforward is to create a list of all your filenames. You can also use a pattern to search your directory and return all the matching files. In my example I need to read all the files starting with “FR”. The function lapply (the equivalent of a loop) reads every file present in my list fileNames and stores them in my variable zonnesFiles. The variable zonnesFiles is a list of…
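The approach described above can be sketched as follows. The names `fileNames` and `zonnesFiles` come from the excerpt; the folder `dataDir`, the sample files, and the final `allZones` data frame are assumptions added so the snippet runs anywhere.

```r
# Create a throwaway folder with a few sample files (illustration only);
# in practice, point `dataDir` at your own folder.
dataDir <- file.path(tempdir(), "zones")
dir.create(dataDir, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, value = c(10, 20)),
          file.path(dataDir, "FR_north.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, value = c(30, 40)),
          file.path(dataDir, "FR_south.csv"), row.names = FALSE)
write.csv(data.frame(id = 5, value = 50),
          file.path(dataDir, "BE_west.csv"), row.names = FALSE)

# List every CSV file whose name starts with "FR".
fileNames <- list.files(dataDir, pattern = "^FR.*\\.csv$", full.names = TRUE)

# Read each file; the result is a list of data frames.
zonnesFiles <- lapply(fileNames, read.csv)

# Stack the data frames into a single one (they must share the same columns).
allZones <- do.call(rbind, zonnesFiles)
```

`do.call(rbind, ...)` is only one way to combine the list; it works when every file has identical column names.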

PowerBI – Dynamic Chart Title

Unlike QlikView, chart titles in Power BI can only be static, as you can only pass static text in the title parameter. However, there’s a way around it! The workaround I found is pretty simple: you fake a title by creating a measure that contains your title expression and dropping this measure into a Card visual. Then, by applying the same transparency and colours as your chart, you just need to turn off the chart title…

For Loop vs Vectorization in R

A brief comparison between the for loop and vectorization in R: a short post to illustrate how vectorization in R is much faster than the common for loop. In this example I created two vectors, a and b, which hold some random numbers. I’ll compute the sum of a and b using the for loop and the vectorized approach, and then compare the execution time of the two methods. I’ll repeat this test 10 times with…
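A minimal sketch of that comparison, run once rather than 10 times. The vector length `n`, the seed, and the use of `runif` and `system.time` are assumptions; the post may generate and time things differently.

```r
set.seed(42)
n <- 1e6          # assumed vector length
a <- runif(n)
b <- runif(n)

# Element-wise sum with an explicit for loop.
loopSum <- numeric(n)
loopTime <- system.time(
  for (i in seq_len(n)) {
    loopSum[i] <- a[i] + b[i]
  }
)

# The same sum, vectorized.
vecTime <- system.time(vecSum <- a + b)

# Elapsed seconds for each approach; exact values depend on the machine.
print(loopTime["elapsed"])
print(vecTime["elapsed"])
```

Both approaches produce the same result; only the time taken differs, because the vectorized `a + b` runs its loop in compiled code.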

Central Limit Theorem - example using R

The Central Limit Theorem is probably the most important theorem in statistics. In this post I’ll try to demystify the CLT with clear examples using R. The central limit theorem (CLT) states that, given a sufficiently large sample size from a population with finite variance, the mean of the sample means will be approximately equal to the mean of the original population. Furthermore, the CLT states that as you increase the number of samples…
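A small simulation can make the statement concrete. This is a sketch under assumed settings (an exponential population, samples of size 50, 2000 samples), not the example used in the post.

```r
set.seed(123)

# A clearly non-normal population: exponential with mean 2.
population <- rexp(100000, rate = 0.5)

sampleSize <- 50     # assumed size of each sample
numSamples <- 2000   # assumed number of samples drawn

# Draw many samples and keep the mean of each one.
sampleMeans <- replicate(numSamples, mean(sample(population, sampleSize)))

# The mean of the sample means is close to the population mean.
popMean <- mean(population)
meanOfMeans <- mean(sampleMeans)

# The spread of the sample means is close to sd(population) / sqrt(sampleSize).
observedSE <- sd(sampleMeans)
expectedSE <- sd(population) / sqrt(sampleSize)
```

Plotting `hist(sampleMeans)` shows a roughly bell-shaped distribution even though the population itself is strongly skewed.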

Coursera Data Science Specialization Review

“Ask the right questions, manipulate data sets, and create visualizations to communicate results.” “This Specialization covers the concepts and tools you’ll need throughout the entire data science pipeline, from asking the right kinds of questions to making inferences and publishing results. In the final Capstone Project, you’ll apply the skills learned by building a data product using real-world data. At completion, students will have a portfolio demonstrating their mastery of the material.” The JHU Data Science Specialization is one of…

Human Resources Data Analytics

Using predictive analytics to predict the leavers. The dataset contains the following variables: employee satisfaction level; last evaluation; number of projects; average monthly hours; time spent at the company; whether they have had a work accident; whether they have had a promotion in the last 5 years; department; salary; whether the employee has left. *This dataset is simulated. Download dataset. By using the summary function we can obtain the descriptive statistics of our dataset. Data preparation: Followed by the str function…
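To show what `summary` and `str` return, here is a sketch on a tiny made-up data frame. The column names and values are hypothetical stand-ins for a few of the variables listed above, not the post's dataset.

```r
set.seed(1)

# Toy data with hypothetical column names (illustration only).
hr <- data.frame(
  satisfaction_level = round(runif(6), 2),
  number_project     = c(2, 5, 7, 5, 2, 3),
  salary             = factor(c("low", "medium", "medium", "low", "high", "low")),
  left               = c(1, 0, 1, 0, 0, 1)
)

# Descriptive statistics for every column.
print(summary(hr))

# Column types and a preview of the values.
str(hr)

# One possible preparation step: turn the 0/1 outcome into a labelled factor.
hr$left <- factor(hr$left, levels = c(0, 1), labels = c("stayed", "left"))
print(table(hr$left))
```

`str` is useful before modelling because it reveals which columns are numeric and which are factors.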

Populating Time Dimension

A ready-made script that I have modified to create and populate a Kimball Time dimension. This script will create a time dimension and populate it with different levels of granularity: second, minute, hour.
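The script itself is not shown in this excerpt. As a rough illustration of the shape of such a table, here is a sketch in R that builds one row per second of the day; the column names (`TimeKey`, `Hour`, `Minute`, `Second`, `TimeLabel`) are hypothetical.

```r
# One row per second of the day: 24 * 60 * 60 = 86400 rows.
secondsOfDay <- 0:86399

timeDim <- data.frame(
  TimeKey = secondsOfDay,                    # surrogate key: seconds since midnight
  Hour    = secondsOfDay %/% 3600L,          # 0-23
  Minute  = (secondsOfDay %% 3600L) %/% 60L, # 0-59
  Second  = secondsOfDay %% 60L              # 0-59
)

# Human-readable label in HH:MM:SS form.
timeDim$TimeLabel <- sprintf("%02d:%02d:%02d",
                             timeDim$Hour, timeDim$Minute, timeDim$Second)
```

Minute-level or hour-level attributes can then be derived by grouping on the `Hour` and `Minute` columns.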

Implement Linear Regression in R (single variable)

Linear regression is probably one of the best-known and most widely used algorithms in machine learning. In this post, I will discuss how to implement linear regression step by step in R. Let’s first create our dataset in R, which contains only one variable, “x1”, and the variable that we want to predict, “y”. #Linear regression single […]
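A minimal sketch of single-variable linear regression, assuming simulated data. The names `x1` and `y` come from the excerpt; the data-generating values, the name `lrData`, and the closed-form least-squares step are assumptions and may differ from the post's own steps.

```r
set.seed(7)

# Simulated data: y depends linearly on x1, plus some noise.
x1 <- 1:20
y  <- 3 + 2 * x1 + rnorm(20, sd = 1)
lrData <- data.frame(x1 = x1, y = y)

# Closed-form least-squares estimates for one predictor.
slope     <- cov(lrData$x1, lrData$y) / var(lrData$x1)
intercept <- mean(lrData$y) - slope * mean(lrData$x1)

# The same fit using R's built-in lm().
fit <- lm(y ~ x1, data = lrData)
print(coef(fit))

# Predict y for a new value of x1.
predicted <- predict(fit, newdata = data.frame(x1 = 25))
```

The hand-computed `intercept` and `slope` match `coef(fit)`, which is a handy check when implementing the algorithm manually.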

Stanford Machine Learning: Intro

I have decided to take the machine learning course provided by Stanford University. There are now loads of MOOCs, but this course was one of the first programming MOOCs Coursera put online and it is still ranked first by Class Central. I have now almost completed the 11-week course and I can tell that Stanford Professor Andrew Ng is a brilliant teacher; he is able to explain quite complicated algorithms in a very simple way. This course provides…
