Easiest way to do EDA in Python for Machine Learning problems

Hi dear fellow programmers,

This short blog is about performing EDA (Exploratory Data Analysis) on any .csv-based ML dataset.

Most of us already use the pandas library to load and handle .csv-based datasets. There is another handy, lesser-known library that works on top of pandas and generates a profiling report for the dataset: per-column statistics, missing values, distributions, and correlations, all in a single HTML page.

The library is called pandas_profiling, and it is very easy to use.

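Installation: pip install pandas_profiling

(Note: the project has since been renamed to ydata-profiling. With newer releases, install it with pip install ydata-profiling and import ProfileReport from ydata_profiling instead.)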

Usage:

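import pandas as pd
from pandas_profiling import ProfileReport

# Load the dataset into a DataFrame
df = pd.read_csv('heart.csv')

# Build the profiling report and save it as a standalone HTML file
profile = ProfileReport(df, title="Pandas Profiling Report")
profile.to_file("heart_profiling_report.html")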
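If you are curious what kind of information ends up in the report, here is a rough sketch of a few of those summaries done by hand with plain pandas. The tiny DataFrame and its column names are made up for illustration (it is not the real heart.csv), and this only approximates a small part of what the library generates for you.

```python
import pandas as pd

# Small made-up dataset standing in for a real CSV (hypothetical values).
df = pd.DataFrame({
    "age": [63, 37, 41, None, 57],
    "sex": ["M", "M", "F", "F", "M"],
    "chol": [233, 250, 204, 236, 354],
})

# Per-column overview: data type, number of missing values, distinct values.
overview = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "missing": df.isna().sum(),
    "distinct": df.nunique(),
})
print(overview)

# Descriptive statistics (count, mean, std, quartiles) for numeric columns.
stats = df.describe()
print(stats)

# Pairwise correlations between the numeric columns only.
correlations = df.select_dtypes("number").corr()
print(correlations)
```

Doing this for every column, plus histograms and warnings, gets tedious quickly, which is why a one-line report is so convenient.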

Thanks for reading!
