Browsed by Category: Data Science

Multiple Linear Regression in Power BI

In this post, I will describe how to implement Multiple Linear Regression in Power BI using DAX only. In the February 2023 release, Power BI introduced a new function called “Linest”, so we will see how to use it to make predictions and interpret its results. Linear Regression: Linear regression is a type of statistical analysis used to find the relationship between two variables. It is used to determine how one variable (dependent variable) is related to another variable (independent…

Show text results from an R visual in Power BI

In this short post, I will describe how to show text results from an R script visual in Power BI. Microsoft Idea: An idea has already been submitted to Microsoft to enable this. However, it does not have many votes, so it is unlikely to be added anytime soon, and that is where the workaround comes to the rescue! Why do we even need to show text results from an R visual? According to the multiple…

Clustering in Power BI using R

Since 2016, there has been a built-in feature in Power BI that allows us to automatically find clusters within our data. This is a great feature; however, its main drawback is that whenever we add new data into Power BI, the clusters need to be manually recalculated for the new data. In this post, I will show how we can implement clustering in Power BI using R and automatically recalculate the clusters whenever we hit the refresh button. What is Clustering? Clustering is the…

Bootstrap analysis with Power BI

In this post, I share how to perform a bootstrap analysis with Power BI and briefly introduce what bootstrapping is and when to use it.
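As a rough illustration of the idea behind bootstrapping (outside Power BI, and not the method used in the post itself), here is a minimal Python sketch of a percentile bootstrap confidence interval for a mean. The function name `bootstrap_mean_ci`, its parameters, and the sample values are all hypothetical and chosen only for this example.

```python
import random
import statistics


def bootstrap_mean_ci(sample, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `sample`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    # Use the empirical quantiles of the resampled means as the interval.
    alpha = (1.0 - confidence) / 2.0
    lower = means[int(alpha * n_resamples)]
    upper = means[int((1.0 - alpha) * n_resamples) - 1]
    return lower, upper


# Made-up illustrative values, not data from the post.
data = [12.1, 9.8, 11.4, 10.3, 13.0, 9.5, 10.9, 11.7, 12.4, 10.0]
low, high = bootstrap_mean_ci(data)
print(f"sample mean = {statistics.fmean(data):.2f}")
print(f"95% bootstrap CI for the mean: [{low:.2f}, {high:.2f}]")
```

The percentile method is only one of several ways to build a bootstrap interval; it is used here because it is the simplest to read.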

How to change the size of Plot Figure Matplotlib

When plotting figures with matplotlib, you might want to change the size of the figure displayed. Here is a quick trick to adjust the size:

    import matplotlib.pyplot as plt

    # Inside your plot code, just add the following line.
    # Set the plot width to 12 inches and height to 6 inches.
    plt.rcParams["figure.figsize"] = [12, 6]

For more details, see the figure documentation.

T-Test: Dr. Semmelweis and the discovery of handwashing

This article only illustrates the use of the t-test on a real-life problem and does not provide any technical information on what the t-test is or how it works. I will go through the t-test in detail in another post and will link it to this post. Intro: I was looking for a cool dataset to illustrate the use of T.test and I found this DataCamp project, “Dr. Semmelweis and the discovery of handwashing”. This is a straightforward project but I really…

Coursera Data Science Specialization Review

“Ask the right questions, manipulate data sets, and create visualizations to communicate results.” “This Specialization covers the concepts and tools you’ll need throughout the entire data science pipeline, from asking the right kinds of questions to making inferences and publishing results. In the final Capstone Project, you’ll apply the skills learned by building a data product using real-world data. At completion, students will have a portfolio demonstrating their mastery of the material.” The JHU Data Science Specialization is one of…

Human Resources Data Analytics

Using predictive analytics to predict the leavers. The dataset contains the following variables:

- Employee satisfaction level
- Last evaluation
- Number of projects
- Average monthly hours
- Time spent at the company
- Whether they have had a work accident
- Whether they have had a promotion in the last 5 years
- Department
- Salary
- Whether the employee has left

*This dataset is simulated.

Download dataset

By using the summary function, we can obtain the descriptive statistics of our dataset. Data preparation: Followed by the str function…

Implement Linear Regression in R (single variable)

Linear regression is probably one of the best-known and most widely used algorithms in machine learning. In this post, I will discuss how to implement linear regression step by step in R. Let’s first create our dataset in R that contains only one variable “x1” and the variable that we want to predict “y”. #Linear regression single […]
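For readers who want a feel for what a single-variable fit computes, here is a minimal sketch using the closed-form least squares formulas. Note that the post itself works in R; this sketch uses Python instead, the helper name `fit_simple_linear_regression` is hypothetical, and the numbers are made up so that the expected answer is easy to check by hand.

```python
def fit_simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data that lies exactly on the line y = 1 + 2 * x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]

intercept, slope = fit_simple_linear_regression(x1, y)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")

# Predict y for a new value of x1 using the fitted line.
new_x = 6.0
print(f"prediction at x1 = {new_x}: {intercept + slope * new_x:.2f}")
```

Because the sample points lie exactly on a line, the fit should recover an intercept of 1 and a slope of 2; real data would leave residuals around the fitted line.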

Stanford Machine Learning: Intro

I have decided to take part in the machine learning course provided by Stanford University. There are now loads of MOOCs, but this course was one of the first programming MOOCs put online by Coursera, and it is still ranked first by Class Central. I have now almost completed the 11-week course and I can tell that Stanford Professor Andrew Ng is a brilliant teacher; he is able to explain quite complicated algorithms in a very simple way. This course provides…