Some Datasets for Homeworks

Below, you can find a few datasets you can use for your homeworks. These are just suggestions, though. If you have a particular dataset in mind, please get in touch and we can sort it out.

Achareh

Achareh is an online platform for hiring handymen in over 30 cities in Iran, covering more than 300 services related to houses, cars, pets, health and beauty, and more.

The dataset contains over 60,000 orders, scores and transactions placed on Achareh for multiple services.

Ubaar

Ubaar is an online platform that connects truck drivers with clients who have a shipment to send. It is the first platform in Iran to offer instant pricing based on machine learning. It currently operates domestically in all major cities in Iran, as well as internationally for select destinations.

The dataset contains over 60,000 orders, scores and transactions placed on Ubaar for multiple load types, sources, destinations and truck types.

Divar

I think Divar hardly needs an introduction, as it is the largest classified ads platform in Iran (and probably Afghanistan).

The dataset “contains 947635 posts that were published in Divar classified ads platform. These posts were published and archived before 2017.”

Digikala

Digikala hardly needs an introduction either. It is the largest e-commerce platform not only in Iran, but in the whole Middle East.

The dataset contains over 2 million transactions and 100,000 goods offered on Digikala.

Note that in order to access this dataset, you’ll need to sign up through Digikala’s open data platform using your academic email.

Heart Disease Dataset

The dataset contains information about a number of patients. The goal is to predict the chance of a patient having a heart attack or another heart-related disorder.


Fashion MNIST

This is a dataset of 70,000 images of Zalando articles. Each one is a 28x28 grayscale image, associated with a label from 10 classes.
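To make the 28x28 layout and the 10 labels a bit more concrete, here is a minimal sketch of how the raw files could be read using only the Python standard library. It assumes you have the dataset in the IDX format that the original MNIST files use (a big-endian header followed by raw bytes); the function names `parse_idx_images` and `parse_idx_labels` are made up for this example, and the class names are the ones listed in the Fashion MNIST README.

```python
import struct

# Class names for labels 0-9, as listed in the Fashion MNIST README.
LABEL_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def parse_idx_images(data: bytes) -> list:
    """Turn an uncompressed IDX image buffer into a list of images.

    Each image is a list of rows, each row a list of pixel values (0-255).
    """
    # Header: magic number, image count, rows, cols (big-endian uint32 each).
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != 2051:
        raise ValueError(f"not an IDX image file (magic={magic})")
    pixels = data[16:]
    size = rows * cols
    if len(pixels) != count * size:
        raise ValueError("pixel data does not match the header")
    images = []
    for i in range(count):
        flat = pixels[i * size:(i + 1) * size]
        images.append([list(flat[r * cols:(r + 1) * cols]) for r in range(rows)])
    return images


def parse_idx_labels(data: bytes) -> list:
    """Turn an uncompressed IDX label buffer into a list of ints (0-9)."""
    # Header: magic number, label count (big-endian uint32 each).
    magic, count = struct.unpack(">II", data[:8])
    if magic != 2049:
        raise ValueError(f"not an IDX label file (magic={magic})")
    labels = list(data[8:8 + count])
    if len(labels) != count:
        raise ValueError("label data does not match the header")
    return labels
```

If your copy of the files is gzip-compressed, you would first read the bytes with something like `gzip.open(path, "rb").read()` and then pass them to these functions. `LABEL_NAMES[label]` gives the human-readable class for each image.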


World Development Indicators

This dataset, from the World Bank, contains “over a thousand annual indicators of economic development from hundreds of countries around the world”. It is an amazing and fun dataset to work with!


Other Datasets?

If you are looking for inspiration and ideas, you can also check out the “Awesome Public Datasets” page on GitHub. It has links to thousands of public datasets, many of which are truly awesome ;)

If you have a dataset that you would like to work on for your homeworks, we can discuss it offline, and if it is suitable, you can work on the dataset of your choice. Just send us an email and we will sort it out!