C.S. Chan | Univ. Malaya

Released Datasets

ICText Dataset

ICDAR 2021 Competition

ICText is an Integrated Circuit Text Spotting and Aesthetic Assessment dataset comprising 20,000 images collected in real-world environments.

Total Text Dataset

The Total-Text dataset is a one-of-a-kind collection of 1,555 images covering a variety of text orientations, including horizontal, multi-oriented, and curved.

WikiArt Dataset

To allow others to replicate our paper or compare fairly against it, we created a "new" WikiArt dataset. All the images were obtained from WikiArt.org. We are responsible for neither the content nor the meaning of these images.

Exclusively Dark Dataset

The Exclusively Dark (ExDARK) dataset is a collection of 7,363 natural low-light images covering 12 object classes (similar to PASCAL VOC), annotated both at the image-class level and with local object bounding boxes.

MalayaKew Plant Dataset

The MalayaKew (MK) Leaf dataset was collected at the Royal Botanic Gardens, Kew, England. It consists of scan-like images of leaves from 44 species classes. The dataset is very challenging because leaves from different species classes look very similar.

CUTE80 Dataset

We introduce CUTE80, the first curved text dataset to be made public. It consists of 80 images of curved text lines (circular, S-shaped, and Z-shaped) with complex backgrounds, perspective distortion, and poor resolution.

Pratheepan Human Skin Dataset

The images in this dataset were downloaded at random from Google for human skin detection research. They were captured with a range of different cameras, using different colour enhancements and under different illumination conditions.
