Monday, 12 December 2022

Exploratory data analysis on the banking data set with Pandas Part-II

The banking data set contains all the details needed for this exercise. Study the data set carefully, then write code to answer each of the following questions.

First, we load the data set into a variable so that it becomes easy to perform operations on it.
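A minimal sketch of the loading step. The real file is not named in this post, so the snippet reads a tiny in-memory CSV instead; the file name "bank.csv" in the comment and the three sample rows are placeholders of mine, not the actual data. The empty first header field shows one common way a column called "Unnamed: 0" ends up in a data frame.

```python
import io

import pandas as pd

# Stand-in for the real file so the sketch runs as-is.
# With the actual data you would pass its path, e.g. pd.read_csv("bank.csv")
# (the name "bank.csv" is hypothetical).
csv_text = """,age,balance,deposit
0,59,2343,yes
1,56,45,no
2,41,1270,yes
"""

# pandas names a header-less column "Unnamed: <position>".
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)
print(df.head())
```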

1. How many missing values are there in the data set? (Answer: 3)
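One way to count missing values is to sum the result of isnull() twice, first per column and then over all columns. The frame below is toy data I made up so the snippet runs on its own; on the real data set the same two calls would be applied to the loaded frame.

```python
import numpy as np
import pandas as pd

# Toy frame with three deliberately missing cells (not the real data).
df = pd.DataFrame({
    "age": [59, np.nan, 41, 55],
    "balance": [2343, 45, np.nan, 2476],
    "deposit": ["yes", "no", None, "yes"],
})

# isnull() marks each missing cell; the first sum() counts them per column.
missing_per_column = df.isnull().sum()

# Summing the per-column counts gives the total for the whole frame.
total_missing = int(missing_per_column.sum())

print(missing_per_column)
print(total_missing)
```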

2. How many duplicate values are present in the data set in total? (Answer: 2)

3. What is the shape of the data after dropping the feature “Unnamed: 0”, the missing values, and the duplicated values? (Answer: (5578, 17))
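The duplicate count and the shape after cleaning can be sketched together. I am assuming here that "duplicate values" means fully duplicated rows, and that the "Unnamed: 0" column is dropped before counting them; the order of these steps is my assumption, and the frame is toy data.

```python
import numpy as np
import pandas as pd

# Toy frame: one repeated row and one row with a missing age (not the real data).
df = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2, 3, 4],
    "age": [59, 56, 56, np.nan, 41],
    "deposit": ["yes", "no", "no", "yes", "no"],
})

# Drop the index-like column first; otherwise no two rows could ever match.
features = df.drop(columns=["Unnamed: 0"])

# duplicated() flags every row that repeats an earlier row.
n_duplicates = int(features.duplicated().sum())

# Remove rows with missing values, then remove the repeated rows.
cleaned = features.dropna().drop_duplicates()

print(n_duplicates)
print(cleaned.shape)
```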

4. What is the average age of the clients who have not subscribed to a deposit? (Answer: 41)

First, I will extract the required columns from the data set.

Then, I will filter the required data.

Finally, I will call describe on the filtered data; the mean it reports is the average age.
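The three steps for this question (extract, filter, describe) can be sketched as follows. The frame is toy data, and the label "no" for clients who have not subscribed is my assumption about how the "deposit" column is coded.

```python
import pandas as pd

# Toy frame (not the real data).
df = pd.DataFrame({
    "age": [30, 40, 50, 60],
    "balance": [100, 200, 300, 400],
    "deposit": ["no", "yes", "no", "no"],
})

# Step 1: keep only the columns the question needs.
subset = df[["age", "deposit"]]

# Step 2: keep only clients who have not subscribed (assumed label: "no").
not_subscribed = subset[subset["deposit"] == "no"]

# Step 3: describe() reports count, mean, std, min, quartiles and max.
summary = not_subscribed["age"].describe()
average_age = summary["mean"]

print(summary)
print(average_age)
```

If only the average is needed, not_subscribed["age"].mean() gives the same number without the rest of the summary.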

5. What is the maximum number of contacts performed during the campaign for the clients who have subscribed to a deposit? (Answer: 32)

First, I will extract the required columns from the data set.

Then, I will filter the required data.

Finally, I will call describe on the filtered data; the max it reports is the maximum number of contacts.
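The same extract, filter, describe pattern works for this question. The column name "campaign" for the number of contacts and the label "yes" for subscribed clients are assumptions on my part, and the frame is toy data.

```python
import pandas as pd

# Toy frame (not the real data); "campaign" is an assumed column name.
df = pd.DataFrame({
    "campaign": [1, 4, 2, 7],
    "deposit": ["yes", "no", "yes", "no"],
})

# Step 1: keep only the columns the question needs.
subset = df[["campaign", "deposit"]]

# Step 2: keep only clients who have subscribed (assumed label: "yes").
subscribed = subset[subset["deposit"] == "yes"]

# Step 3: read the "max" entry from the describe() summary.
summary = subscribed["campaign"].describe()
max_contacts = int(summary["max"])

print(summary)
print(max_contacts)
```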

6. What is the count of unique education levels in the data, and how many clients have completed secondary education?

Count of unique education levels in the data: (Answer: 4)

Number of clients who have completed secondary education: (Answer: 2721)
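Both parts of this question can be sketched with nunique() and value_counts(). The column name "education" and the level labels below are assumptions, and the frame is toy data.

```python
import pandas as pd

# Toy frame (not the real data); the level labels are assumed.
df = pd.DataFrame({
    "education": ["secondary", "tertiary", "primary",
                  "secondary", "unknown", "secondary"],
})

# Number of distinct education levels.
n_levels = df["education"].nunique()

# Rows per level, then the entry for "secondary".
level_counts = df["education"].value_counts()
n_secondary = int(level_counts["secondary"])

print(n_levels)
print(level_counts)
print(n_secondary)
```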

7. What is the percentage split of the categories in the column “deposit”?

(Answer: Yes - 47%, No - 53%)
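A percentage split like this can be obtained from value_counts(normalize=True), which returns fractions instead of counts. The frame is toy data and the "yes"/"no" labels are my assumption.

```python
import pandas as pd

# Toy frame (not the real data).
df = pd.DataFrame({"deposit": ["yes", "no", "no", "yes", "no"]})

# normalize=True returns each category's share of the rows; scale to percent.
split = df["deposit"].value_counts(normalize=True) * 100

print(split.round(0))
```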

8. Generate a scatter plot of “age” vs “balance”.
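A minimal sketch of the scatter plot using the plotting helper built into pandas, which relies on matplotlib being installed. The four points are toy data, and the alpha value and title are just illustrative choices.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Toy frame (not the real data).
df = pd.DataFrame({
    "age": [25, 38, 47, 59],
    "balance": [300, 1500, -200, 2343],
})

# One point per client: age on the x axis, balance on the y axis.
# alpha makes overlapping points easier to see on a large data set.
ax = df.plot.scatter(x="age", y="balance", alpha=0.5)
ax.set_title("age vs balance")
```

In a notebook the figure renders on its own; in a script, a call to plt.show() or plt.savefig(...) is needed to see or keep it.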

9. How many clients with a personal loan have a housing loan as well? (Answer: 397)
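This question combines two conditions, which can be sketched with a boolean mask joined by &. The column names "loan" (personal loan) and "housing" (housing loan) and the "yes"/"no" labels are assumptions, and the frame is toy data.

```python
import pandas as pd

# Toy frame (not the real data); column names are assumed.
df = pd.DataFrame({
    "loan": ["yes", "yes", "no", "yes"],
    "housing": ["yes", "no", "yes", "yes"],
})

# Each comparison needs its own parentheses because & binds tighter than ==.
both_loans = df[(df["loan"] == "yes") & (df["housing"] == "yes")]
n_both = len(both_loans)

print(n_both)
```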

10. How many unemployed clients have not subscribed to deposit? (Answer: 78)
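A sketch along the same lines for the last question. Summing a boolean mask counts the rows where it is True. The column name "job", the label "unemployed", and the "no" label in "deposit" are assumptions, and the frame is toy data.

```python
import pandas as pd

# Toy frame (not the real data); column name and labels are assumed.
df = pd.DataFrame({
    "job": ["unemployed", "admin.", "unemployed", "unemployed"],
    "deposit": ["no", "no", "yes", "no"],
})

# True only for unemployed clients who have not subscribed.
mask = (df["job"] == "unemployed") & (df["deposit"] == "no")

# True counts as 1, so the sum is the number of matching rows.
n_unemployed_not_subscribed = int(mask.sum())

print(n_unemployed_not_subscribed)
```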
