
8 Popular Types of Data Visualizations in Python


Introduction

Data Visualization turns data into images that nearly anyone can understand, making them invaluable for explaining the significance of digits to people who are more visually oriented.

~Jonsen Carmack

 

Raw numbers do not always convey meaning to the people working with data. This is where Data Visualization comes in. It is a technique for encoding those numbers as images, which makes it much easier to gain meaningful insights. It is one of the essential steps in every Data Science process.

Don't worry if Data Visualization is a new term for you. We’ll talk about Data Visualization in Python throughout this blog.

If you are a beginner in Python, I recommend reading this blog before proceeding further, in case you haven’t already :).

Why Python for Data Visualization?

Image: http://www.programmingbuddy.club/2017/02/udemy-python-for-data-analysis-and.html

Though there are lots of tools available for Data Visualization, Python has some of the best libraries, and they make visualization practical for any dataset, large or small. Several online courses focus entirely on Data Visualization with Python, especially with Matplotlib, which is very useful for creating and presenting plots.

Popular Libraries For Data Visualization in Python:

Some of the most popular Libraries for Python Data Visualizations are:

  1. Matplotlib
  2. Seaborn
  3. Pandas
  4. Plotly
  • and many more

Next, we’ll create different types of Python Visualizations using these libraries.

Types of Python Visualization:

Let us explore different Python Visualization techniques. We’ll use a Jupyter notebook with Python for all the code.

 

First, we’ll import the Python Visualization libraries using the following code.

Figure: Import all necessary libraries
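The original import cell was an image, so the exact lines are not recoverable. A minimal sketch, assuming the conventional aliases for the libraries listed earlier:

```python
# Conventional aliases (an assumption; the original import cell is not available).
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# In a Jupyter notebook, also run the magic below so figures render inline.
# It is notebook syntax rather than Python, so it stays commented out here:
# %matplotlib inline
```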

Remember, %matplotlib inline works only in Jupyter notebooks. If you are using another editor, call plt.show() at the end of your plotting commands to have the figure pop up in another window.

Now, we’ll load the built-in iris dataset from the Seaborn library, which will be used to create the various Python Visualizations.

Figure: Iris dataset from seaborn library
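Loading the data might look like the sketch below. The helper name load_iris and the column check are illustrative additions, not part of the original post. Note that sns.load_dataset fetches the file over the network on first use.

```python
import seaborn as sns

# Column names of seaborn's iris dataset.
IRIS_COLUMNS = ["sepal_length", "sepal_width",
                "petal_length", "petal_width", "species"]

def load_iris():
    """Hypothetical helper: load iris and confirm the expected columns exist."""
    iris = sns.load_dataset("iris")  # needs network access on first call
    missing = set(IRIS_COLUMNS) - set(iris.columns)
    if missing:
        raise ValueError(f"unexpected iris schema, missing: {sorted(missing)}")
    return iris

# In a notebook: iris = load_iris(); iris.head()
```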

Now, we’ll use this dataset to create various Python Visualizations.

1.) Scatterplot:

A scatterplot is used to find a relationship in bivariate data. It is most commonly used to find correlations between two continuous variables. Here, we’ll draw a scatter plot of Petal Length against Petal Width using matplotlib.

 

Figure: Scatterplot using matplotlib
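A sketch of how such a scatterplot can be drawn. The function name scatter_petals and the title text are assumptions. It expects a DataFrame with the iris columns petal_length and petal_width.

```python
import matplotlib.pyplot as plt

def scatter_petals(df):
    """Illustrative helper: scatter petal_width against petal_length."""
    plt.figure()
    plt.scatter(df["petal_length"], df["petal_width"])
    plt.title("Petal Length vs Petal Width")  # title of the plot
    plt.xlabel("Petal Length")                # x-axis label
    plt.ylabel("Petal Width")                 # y-axis label
    return plt.gca()

# Usage with the iris frame: scatter_petals(iris); plt.show()
```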

We can see that the relationship between the two variables is roughly linear and positive.

We used plt.title to add a title to our plot, plt.xlabel to add a label for the x-axis, and similarly plt.ylabel to add a label for the y-axis. There are plenty of such options for adding to or modifying plots; you can refer to the matplotlib documentation for a complete guide.

2.) Histogram:

A histogram shows the distribution of a continuous variable. It reveals the frequency distribution of a single variable in a univariate analysis.

Here we’ll plot a histogram for sepal width to check its frequency distribution.

Figure: Histogram using matplotlib
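A minimal sketch of the histogram call. The helper name and the default of 10 bins are assumptions, since the original bin count is not shown.

```python
import matplotlib.pyplot as plt

def hist_sepal_width(df, bins=10):
    """Illustrative helper: histogram of sepal_width.

    Returns the per-bin counts and the bin edges computed by matplotlib.
    """
    plt.figure()
    counts, edges, _ = plt.hist(df["sepal_width"], bins=bins)
    plt.title("Distribution of Sepal Width")
    plt.xlabel("Sepal Width")
    plt.ylabel("Frequency")
    return counts, edges

# Usage with the iris frame: hist_sepal_width(iris, bins=20); plt.show()
```

A larger bins value gives narrower intervals and a more detailed but noisier picture.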

We observe that the distribution is approximately normal. The bins argument divides the entire range of values into a series of intervals.

3.) Bar Chart:

A Bar Chart or Bar Plot represents categorical data with vertical or horizontal bars. In Seaborn it is a general plot that aggregates the values for each category using some function, by default the mean.

Here we’ll plot a Bar Chart of Sepal Length for the three Species using Seaborn.

Figure: Bar chart using seaborn
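One way to draw this chart is sketched below; the helper name is hypothetical. Depending on the seaborn version, the bars may come out in a single color unless a hue variable is also given.

```python
import matplotlib.pyplot as plt
import seaborn as sns

def bar_mean_sepal_length(df):
    """Illustrative helper: mean sepal_length per species as bars."""
    plt.figure()
    # barplot aggregates y within each x category (mean by default)
    # and draws an error bar around each estimate.
    ax = sns.barplot(x="species", y="sepal_length", data=df)
    return ax

# Usage with the iris frame: bar_mean_sepal_length(iris); plt.show()
```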

We can see that the y-axis shows the mean Sepal Length for each of the three classes of Species, namely Setosa, Versicolor, and Virginica. Also, the three bars have different colors, so each species is represented uniquely.

4.) Pie Chart:

A Pie Chart represents the proportion of each category in categorical data. The whole pie is divided into as many slices as there are categories, and the size of each slice is proportional to its category's share.

Figure: Pie chart using matplotlib
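A sketch of the pie chart. The helper name, the explode offset of 0.1, and the percentage labels are assumptions.

```python
import matplotlib.pyplot as plt

def pie_species(df):
    """Illustrative helper: pie chart of how many rows each species has."""
    counts = df["species"].value_counts()
    plt.figure()
    plt.pie(
        counts,
        labels=counts.index,
        explode=[0.1] * len(counts),  # push every slice slightly outward
        autopct="%1.1f%%",            # print each slice's percentage
    )
    plt.title("Share of each species")
    return plt.gca()

# Usage with the iris frame: pie_species(iris); plt.show()
```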

The three slices in the above chart represent the three categories of species. We have used explode to separate the three slices. Similar to the bar chart, the three slices have different colors, so each category is represented uniquely.

5.) Countplot:

A Countplot is similar to a bar plot, except that we pass only the x-axis variable; the y-axis shows the number of occurrences. Each bar represents the count for one category of species.

Here, we’ll plot a Countplot for the three categories of species using Seaborn.

Figure: Countplot using sns
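A minimal sketch of the countplot call; the helper name is hypothetical.

```python
import matplotlib.pyplot as plt
import seaborn as sns

def count_species(df):
    """Illustrative helper: one bar per species, height = number of rows."""
    plt.figure()
    ax = sns.countplot(x="species", data=df)  # only x is given; y is the count
    return ax

# Usage with the iris frame: count_species(iris); plt.show()
```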

We can observe that the three bars represent the count for the three categories of species.

6.) Boxplot:

Boxplot is used to show the distribution of a variable. The box plot is a standardized way of displaying the distribution of data based on the five-number summary: minimum, first quartile, median, third quartile, and maximum.

Here, we’ll plot a Boxplot for checking the distribution of Sepal Length.

Figure: Boxplot for univariate data using seaborn

A box plot also shows the distribution of quantitative data in a way that facilitates comparisons between variables or across levels of a categorical variable.

Here, we’ll plot a Boxplot to compare the distribution of Sepal Length for each level of Species.

Figure: Boxplot for bivariate data using seaborn

We can also plot a Boxplot for the entire dataset with Horizontal orientation.

Figure: Boxplot for entire dataset using seaborn

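So, we can observe that all the plots summarize the distribution of the data through its quartiles: the box spans the first to the third quartile with a line at the median, and the whiskers extend to the most extreme values that are not flagged as outliers. The dots outside the whiskers represent outliers.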

7.) Heatmap:

A Heatmap is a type of matrix plot that displays data as a color-encoded matrix. It is often used to spot multicollinearity in a dataset.

To plot a heatmap, your data should already be in matrix form; the heatmap basically just colors it in for you.

Here, we’ll plot a heatmap to find the correlation between variables of the iris dataset. First, we’ll create a correlation matrix for iris dataset.

Figure: Correlation matrix for iris dataset
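A sketch of building the correlation matrix with pandas. The helper name is hypothetical. Non-numeric columns such as species are dropped first, since correlation is only defined for numeric data.

```python
import pandas as pd

def correlation_matrix(df):
    """Illustrative helper: pairwise Pearson correlations of numeric columns."""
    return df.select_dtypes("number").corr()

# Usage with the iris frame: corr = correlation_matrix(iris)
```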

Now, we’ll plot the heatmap for the above correlation matrix.

Figure: Heatmap showing correlation using seaborn
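A minimal sketch of the heatmap call. The helper name and the "coolwarm" color map are assumptions; any matplotlib colormap name can be used.

```python
import matplotlib.pyplot as plt
import seaborn as sns

def plot_corr_heatmap(corr):
    """Illustrative helper: draw a square correlation matrix as a heatmap."""
    plt.figure()
    ax = sns.heatmap(
        corr,
        cmap="coolwarm",  # color map used to encode the values
        annot=True,       # write each correlation value inside its cell
    )
    return ax

# Usage: plot_corr_heatmap(iris.select_dtypes("number").corr()); plt.show()
```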

Here, we can observe that the correlations are shown as a color-coded matrix. Correlation values can range from -1 to 1. cmap is used to change the color map, and annot is used to display the correlation values in the plot.

8.) Distplot:

The Distplot shows the distribution of univariate data.

Here, we’ll use Distplot to check distribution for Sepal Width.

Figure: Distplot for sepal width using seaborn

So, we can observe that the distribution is approximately normal. Also, to remove the density (KDE) curve, we can pass kde=False.

9.) Jointplot:

A Jointplot shows the relationship between two variables together with the distribution of each one. To be more specific, Jointplot basically lets you match up two Distplots for bivariate data.

Here, we’ll plot a Jointplot for petal length and sepal length.

Figure: Jointplot for bivariate data using seaborn

Grids, Style, and Color

Grids are general types of plots that allow you to map plot types to the rows and columns of a grid, which helps you create similar plots separated by features.

First, we’ll create a subplot grid for plotting pairwise relationships in a dataset using PairGrid. Then we’ll map the pairwise relationships onto that grid.

Figure: PairGrid

Figure: PairGrid after mapping
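The two steps can be sketched as below. The helper names and the choice of plt.scatter, plt.hist, and sns.regplot are assumptions. The second helper shows how the upper, lower, and diagonal cells can each get a different plot type.

```python
import matplotlib.pyplot as plt
import seaborn as sns

def pairgrid_uniform(df):
    """Illustrative helper: same plot type in every cell."""
    g = sns.PairGrid(df)   # empty grid, one row/column per numeric variable
    g.map(plt.scatter)     # fill every cell with a scatterplot
    return g

def pairgrid_split(df):
    """Illustrative helper: different plot types by cell position."""
    g = sns.PairGrid(df)
    g.map_diag(plt.hist)      # diagonal: distribution of each variable
    g.map_upper(plt.scatter)  # above the diagonal: raw scatter
    g.map_lower(sns.regplot)  # below the diagonal: scatter plus regression fit
    return g

# Usage with the iris frame: pairgrid_split(iris)
```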

Here, sns.PairGrid() creates a pairwise grid of the variables in a dataset, and the map function maps the relationships among the variables onto that grid.

Also, we can use map_upper, map_lower, and map_diag to map different types of plots onto the upper, lower, and diagonal cells.

Now, we’ll briefly see how to control figure aesthetics in seaborn, starting with how to change the grid style or color.

There are five preset seaborn themes: darkgrid, whitegrid, dark, white, and ticks. darkgrid is the default for Seaborn. For all the plots above, we have used the whitegrid style. To apply the seaborn defaults, use sns.set().

 

Figure: Default grid style (darkgrid)

 

Also, we can change the grid style in seaborn using sns.set_style().

 

Figure: White grid style
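A short sketch of switching styles. The variable name current is only for illustration. Style settings apply to figures created after the call.

```python
import seaborn as sns

sns.set()                   # apply the seaborn defaults (darkgrid style)
sns.set_style("whitegrid")  # switch to a white background with grid lines

# axes_style() with no arguments returns the style parameters now in effect.
current = sns.axes_style()
```

The other preset names ("darkgrid", "dark", "white", "ticks") can be passed to sns.set_style() in the same way.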

You should try the different grid options available in Seaborn and notice the changes in the grid style.

Similar to grids, we have control over spines, i.e. the borders of a plot, using Seaborn. We can remove spines from a graph as required.

Figure: despine for removing borders

So, sns.despine() will remove the borders from the top and right sides of the figure. We can also remove the borders from the left and bottom using the arguments left=True and bottom=True.

Figure: Borderless plot
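Both despine variants can be sketched with one helper. The helper name, the all_sides flag, and the use of a countplot as the example plot are assumptions.

```python
import matplotlib.pyplot as plt
import seaborn as sns

def countplot_without_spines(df, all_sides=False):
    """Illustrative helper: countplot with its borders (spines) removed.

    By default only the top and right spines go; all_sides=True also
    removes the left and bottom ones.
    """
    plt.figure()
    ax = sns.countplot(x="species", data=df)
    sns.despine(ax=ax, left=all_sides, bottom=all_sides)
    return ax

# Usage with the iris frame: countplot_without_spines(iris, all_sides=True)
```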

We can use matplotlib’s plt.figure(figsize=(width, height)) to change the size of most seaborn plots. Also, we can control the size and aspect ratio of some plots by passing in the parameters size and aspect (in seaborn 0.9 and later, size is named height).

Now, let’s have a look at an example.

Figure: Changing the size of a plot
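A minimal sketch of resizing through matplotlib. The helper name, the 12 by 3 inch default, and the countplot are assumptions.

```python
import matplotlib.pyplot as plt
import seaborn as sns

def wide_countplot(df, width=12, height=3):
    """Illustrative helper: draw a countplot on a figure of the given size."""
    fig = plt.figure(figsize=(width, height))  # size in inches
    sns.countplot(x="species", data=df)        # draws on the current figure
    return fig

# Usage with the iris frame: wide_countplot(iris, width=8, height=4)
```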

So, we can see that the width and height of the plot have changed according to the parameters passed. For some plots, we can also pass these parameters directly to the sns plotting function.

For example:

Figure: Size and aspect passed to an sns function
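Figure-level functions such as sns.lmplot accept the sizing parameters directly. The sketch below assumes lmplot as the example and uses height, which replaced size in seaborn 0.9. The helper name and the chosen columns are hypothetical.

```python
import seaborn as sns

def sized_lmplot(df, height=3, aspect=2):
    """Illustrative helper: regression plot sized via height and aspect.

    Each facet is `height` inches tall and `height * aspect` inches wide.
    """
    g = sns.lmplot(x="petal_length", y="petal_width", data=df,
                   height=height, aspect=aspect)
    return g  # a FacetGrid

# Usage with the iris frame: sized_lmplot(iris, height=4, aspect=1.5)
```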

The set_context() function allows you to override default parameters in order to scale the elements of a plot.
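A short sketch of scaling with set_context. The "talk" context and the font_scale value are assumptions, since the original example is not available. The built-in contexts are paper, notebook, talk, and poster.

```python
import seaborn as sns

# Scale labels, lines, and ticks for a presentation-sized figure,
# then enlarge the fonts a little further.
sns.set_context("talk", font_scale=1.2)

# plotting_context() with no arguments returns the scaling now in effect.
scaled = sns.plotting_context()
```

Calling sns.set_context("notebook") afterwards returns to the default scale.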


Conclusions:

We have now covered most of the basics of Python Visualization using seaborn and matplotlib. I hope this article gives you a head start for diving into Python Visualization. You can also refer to the official documentation for Matplotlib and Seaborn for further reference and a deeper understanding.

Murtuza Dahodwala
An aspiring Data Scientist with a comprehensive understanding of Data Analytics & Data Science, specializing in diverse computing areas such as R-programming, Python, Statistics, & Machine Learning. Currently working as a Data Scientist for Petpooja (Prayosha Food services Pvt. Ltd.), where the focus is on building an automated food recommendation system for restaurants. He has also worked on various projects, such as an automated fraud detection system for hall tickets in an examination hall.
