Browsed by
Category: Data Science

Clustering in Power BI using R


Since 2016, Power BI has had a built-in feature that lets us automatically find clusters within our data. This is a great feature; however, its main drawback is that whenever we add new data into Power BI, the clusters need to be manually recalculated for the new data. In this post, I will show how we can implement clustering in Power BI using R and automatically recalculate the clusters whenever we hit the refresh button. What is Clustering Clustering is the…

Read More

How to change the size of Plot Figure Matplotlib


When plotting figures with matplotlib, you might want to change the size of the figure displayed. Here is a quick trick to adjust the size:

    import matplotlib.pyplot as plt

    # Inside your plot code, just add the following line.
    # Set the plot width to 12 inches and the height to 6 inches.
    plt.rcParams["figure.figsize"] = [12, 6]

For more details, see the figure documentation.
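Keep in mind that setting rcParams changes the default size for every figure created afterwards. Here is a minimal sketch that puts that approach next to two per-figure alternatives. It assumes the non-interactive Agg backend so it can run without a display, and the variable names are only illustrative:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
import matplotlib.pyplot as plt

# Option 1: change the default size for all figures created from now on
plt.rcParams["figure.figsize"] = [12, 6]
fig_default, ax_default = plt.subplots()

# Option 2: set the size for one figure only, at creation time
fig_single, ax_single = plt.subplots(figsize=(8, 4))

# Option 3: resize a figure that already exists
fig_single.set_size_inches(10, 5)

# Read the sizes back as (width, height) in inches
default_size = tuple(float(v) for v in fig_default.get_size_inches())
single_size = tuple(float(v) for v in fig_single.get_size_inches())
print(default_size, single_size)
```

If only one plot needs a different size, passing figsize when creating the figure is usually the less surprising choice, since it leaves the defaults for other plots untouched.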

T-Test: Dr. Semmelweis and the discovery of handwashing


This article only illustrates the use of the t-test on a real-life problem; it does not provide any technical information on what a t-test is or how it works. I will go through the t-test in detail in another post and will link it to this one. Intro I was looking for a cool dataset to illustrate the use of the t-test and I found this DataCamp project, “Dr. Semmelweis and the discovery of handwashing”. This is a straightforward project but I really…

Read More

Coursera Data Science Specialization Review


“Ask the right questions, manipulate data sets, and create visualizations to communicate results.” “This Specialization covers the concepts and tools you’ll need throughout the entire data science pipeline, from asking the right kinds of questions to making inferences and publishing results. In the final Capstone Project, you’ll apply the skills learned by building a data product using real-world data. At completion, students will have a portfolio demonstrating their mastery of the material.” The JHU Data Science Specialization is one of…

Read More

Human Resources Data Analytics


Using predictive analytics to predict the leavers. The dataset contains the following variables:

- Employee satisfaction level
- Last evaluation
- Number of projects
- Average monthly hours
- Time spent at the company
- Whether they have had a work accident
- Whether they have had a promotion in the last 5 years
- Department
- Salary
- Whether the employee has left

*This dataset is simulated

Download dataset

By using the summary function we can obtain the descriptive statistics of our dataset: Data preparation: Followed by the str function…

Read More

Implement Linear Regression in R (single variable)


Linear regression is probably one of the best-known and most widely used algorithms in machine learning. In this post, I will discuss how to implement linear regression step by step in R. Let’s first create our dataset in R, which contains only one variable, “x1”, and the variable that we want to predict, “y”. #Linear regression single […]

Stanford Machine Learning: Intro


I have decided to take part in the machine learning course provided by Stanford University. There are now loads of MOOCs, but this course was one of the first MOOCs put online by Coursera and it is still ranked first by Class Central. I have now almost completed the 11-week course and I can tell that Stanford Professor Andrew Ng is a brilliant teacher; he is able to explain quite complicated algorithms in a very simple way. This course provides…

Read More