Sunday 20 November 2022

Data Visualization-I in PYTHON

Making plots and static or interactive visualizations is one of the most important tasks in data analysis. It may be a part of the exploratory process; for example, helping identify outliers, needed data transformations, or coming up with ideas for models.

matplotlib.pyplot is the plotting module of the Matplotlib library, used for 2D graphics in the Python programming language. It can be used in Python scripts, the Python shell, web application servers and graphical user interface toolkits.

Matplotlib is a Python library used for plotting. It also provides an object-oriented API for integrating plots into applications.
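As a rough illustration of the object-oriented API mentioned above, the sketch below creates a Figure and an Axes explicitly and calls methods on the Axes instead of on pyplot. The data values and the label texts are made up for the example.

```python
import matplotlib.pyplot as plt

# Create a Figure and a single Axes object explicitly
fig, ax = plt.subplots()

# Call plotting and labelling methods on the Axes, not on pyplot
ax.plot([1, 2, 3, 4], [2, 4, 6, 8])  # made-up sample data
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Object-oriented style")

plt.show()
```

This style is handy when a plot has to be embedded in a larger application, because the Figure and Axes objects can be passed around like any other Python object.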

Before we start plotting, let us understand some basics:

  1.  With Pyplot, you can use the xlabel() and ylabel() functions to set a label for the x- and y-axis.
  2.  With Pyplot, you can use the grid() function to add grid lines to the plot.
  3.  You can use the keyword argument linestyle, or the shorter ls, to change the style of the plotted line.
  4.  The plot() function draws points (markers) in a diagram. By default, it draws a line from point to point.
  5.  You can use the keyword argument marker to emphasize each point with a specified marker.
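The five points above can be combined in one small sketch. The x and y values and the label texts here are made up for illustration only.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.array([1, 2, 3, 4])   # sample x values (made up)
y = np.array([3, 8, 1, 10])  # sample y values (made up)

# marker='o' draws a circle at each point;
# linestyle='--' (or the shorter ls='--') draws a dashed line between points
line, = plt.plot(x, y, marker='o', linestyle='--')

plt.xlabel("x values")  # label for the x-axis
plt.ylabel("y values")  # label for the y-axis
plt.grid()              # add grid lines to the plot

plt.show()
```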

 Importing matplotlib :

# Either of the following two equivalent forms can be used
from matplotlib import pyplot as plt
import matplotlib.pyplot as plt

Basic plots in Matplotlib :

Matplotlib comes with a wide variety of plots. Plots help us understand trends and patterns and spot correlations. They are typically instruments for reasoning about quantitative information. Some sample plots are covered here.

 a) Line Chart 

 Line charts are used to represent the relation between two variables, X and Y, each on its own axis.

# importing the required libraries
import matplotlib.pyplot as plt
import numpy as np

# define data values
x = np.array([1, 2, 3, 4]) # X-axis points
y = x*2 # Y-axis points

plt.plot(x, y) # Plot the chart
plt.show() # display

The following is the output

b) Bar Chart

  1. A bar plot or bar chart is a graph that represents categories of data with rectangular bars whose lengths or heights are proportional to the values they represent.
  2. Bar plots can be drawn horizontally or vertically.
  3. A bar chart shows comparisons between discrete categories. One axis of the plot represents the categories being compared, while the other axis represents the measured values corresponding to those categories.
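To illustrate the second point, here is a small sketch that draws the same data once with vertical bars (bar) and once with horizontal bars (barh). The categories and quantities are hypothetical.

```python
import matplotlib.pyplot as plt

# Hypothetical categories and measured values
fruits = ['Apple', 'Banana', 'Cherry']
quantity = [12, 7, 9]

# Two Axes side by side in one Figure
fig, (ax_v, ax_h) = plt.subplots(1, 2, figsize=(8, 3))

ax_v.bar(fruits, quantity)    # vertical bars: height proportional to value
ax_v.set_title("Vertical")

ax_h.barh(fruits, quantity)   # horizontal bars: length proportional to value
ax_h.set_title("Horizontal")

fig.tight_layout()
plt.show()
```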

The following program shows the comparison between year and production

import matplotlib.pyplot as plt
# Creating data
year = ['2010', '2002', '2004', '2006', '2008']
production = [25, 15, 35, 30, 10]
# Plotting barchart
plt.bar(year, production)
# Saving the figure (the file name used here is only a placeholder)
plt.savefig("bar_chart.png")

 The following is the output 

c) Scatter Plot

Scatter plots show many points plotted in the Cartesian plane. Each point represents the values of two variables. One variable is placed on the horizontal axis and the other on the vertical axis.

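import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# 50 rows of random values in four columns
df = pd.DataFrame(np.random.rand(50, 4), columns=['a', 'b', 'c', 'd'])
# pandas draws the scatter plot through Matplotlib
df.plot.scatter(x='a', y='b')
plt.show()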

The following is the output
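The same kind of plot can also be drawn with pyplot directly, without pandas. This is only a sketch: the random data, the fixed seed and the axis labels are choices made for the example.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the picture is reproducible
a = rng.random(50)  # values for the horizontal axis
b = rng.random(50)  # values for the vertical axis

# Each (a[i], b[i]) pair becomes one point in the plane
points = plt.scatter(a, b)
plt.xlabel('a')
plt.ylabel('b')

plt.show()
```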


d) Pie Chart

  1. A Pie Chart is a circular statistical plot that can display only one series of data.
  2. The whole area of the chart represents 100% of the given data.
  3. The area of each slice of the pie represents that part's percentage of the data.
  4. The slices of a pie are called wedges. The area of a wedge is determined by the length of its arc and represents the relative percentage of that part with respect to the whole data.
  5. Pie charts are commonly used in business presentations (sales, operations, survey results, resources, etc.) as they provide a quick summary.
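To make the percentage idea concrete, the sketch below uses the autopct argument of pie() to print each wedge's share of the total on the chart. The survey labels and vote counts are hypothetical.

```python
import matplotlib.pyplot as plt

# Hypothetical survey results
labels = ['Yes', 'No', 'Undecided']
votes = [50, 30, 20]

fig, ax = plt.subplots()
# autopct formats each wedge's share of the total as a percentage label
wedges, texts, autotexts = ax.pie(votes, labels=labels, autopct='%1.1f%%')

plt.show()
```

With these numbers the 'Yes' wedge covers half of the circle, since 50 is half of the total of 100.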

# Import libraries
from matplotlib import pyplot as plt
import numpy as np

# Creating dataset
cars = ['AUDI', 'BMW', 'FORD','TESLA', 'JAGUAR', 'MERCEDES']

data = [23, 17, 35, 29, 12, 41]

# Creating plot
fig = plt.figure(figsize=(10, 7))
plt.pie(data, labels=cars)

# show plot
plt.show()

The following is the output

e) Box Plot

  1. A box plot shows how the values in a data set are distributed.
  2. It uses the three quartiles, which divide the data set into four parts. The graph represents the minimum, maximum, median, first quartile and third quartile of the data set.
  3. It is also useful for comparing the distribution of data across data sets by drawing a box plot for each of them.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# 10 rows of random values in five columns
df = pd.DataFrame(np.random.rand(10, 5), columns=['A', 'B', 'C', 'D', 'E'])
# One box per column; the grid argument is reconstructed from a damaged line
df.plot.box(grid=True)
plt.show()

The following is the output
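The five values that a box plot summarizes can also be computed directly with NumPy. This is a minimal sketch on a made-up sample; np.percentile uses linear interpolation by default, so other tools may report slightly different quartiles for the same data.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])  # made-up sample

# First quartile, median (second quartile) and third quartile
q1, median, q3 = np.percentile(data, [25, 50, 75])

print("minimum:", data.min())
print("first quartile:", q1)
print("median:", median)
print("third quartile:", q3)
print("maximum:", data.max())
```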




