Classification datasets

File	Description	Source link (with details)	Preprocessing applied	Label column
`generated.csv`	Automatically generated dataset whose samples fall into very well-delineated categories. It can be treated as a “best-case scenario” test case.			`label`
`defaults.csv`	Defaults on credit card payments	UCI	Minor (column name reformatting)	`defaulted`
`winequality.csv`	Quality ratings of Portuguese white wines	UCI	Added binarized label column `recommend` indicating `quality >= 7`	`recommend`
`vehicles.csv`	Recognizing vehicle type from its silhouette	OpenML	None	`Class`
`eeg.csv`	EEG eye state measurements	OpenML	Dropped a few outlier rows	`Class`
`kick_starter.csv`	Kickstarter project state	Kaggle	Dropped unnamed columns; minor column name reformatting; calculated the project duration and dropped the start and end dates; dropped some rows with the wrong input type; dropped the main category column and kept the category column; randomly sampled 30% of the data; filled NA with 0 for numeric values	`state`
`mushrooms.csv`	Classifying mushroom edibility based on physical features	UCI	Renamed the column `class` to `edibility` for descriptiveness	`edibility`
`Surgical-deepnet.csv`	Surgical cases labeled by whether a complication occurred	Kaggle	None	`complication`
`gender_classification.csv`	Guessing gender from hobbies	Kaggle	None	`Gender`

These can all be loaded using Pandas:

import pandas as pd
# Replace "file.csv" with any file name from the table above, e.g. "winequality.csv".
dataset = pd.read_csv("file.csv")
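
Since every file has its own label column, it can help to separate features from labels right after loading. The following is a minimal sketch, not part of the original datasets: `LABEL_COLUMNS` and `load_xy` are hypothetical names introduced here for illustration, the mapping is simply copied from the table above, and the files are assumed to sit at whatever path you pass in.

```python
import os

import pandas as pd

# Label column for each file, copied from the table above.
LABEL_COLUMNS = {
    "generated.csv": "label",
    "defaults.csv": "defaulted",
    "winequality.csv": "recommend",
    "vehicles.csv": "Class",
    "eeg.csv": "Class",
    "kick_starter.csv": "state",
    "mushrooms.csv": "edibility",
    "Surgical-deepnet.csv": "complication",
    "gender_classification.csv": "Gender",
}


def load_xy(path, label_column=None):
    """Hypothetical helper: read a CSV and split it into features X and labels y.

    If label_column is not given, it is looked up by file name in LABEL_COLUMNS.
    """
    if label_column is None:
        label_column = LABEL_COLUMNS[os.path.basename(path)]
    dataset = pd.read_csv(path)
    X = dataset.drop(columns=[label_column])  # every column except the label
    y = dataset[label_column]                 # the label column only
    return X, y
```

For example, `X, y = load_xy("mushrooms.csv")` would return the feature columns in `X` and the `edibility` column in `y`. Note that several of these files contain categorical columns, so some encoding step is likely needed before fitting most classifiers.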