ImageNet (ILSVRC)

Overview

ImageNet is a large-scale image dataset organized according to the WordNet hierarchy, created to support research in object recognition and visual understanding. The subset described here covers 1,000 object classes and was the basis of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), which ran annually from 2010 to 2017. It is widely credited with catalyzing the adoption of deep learning in computer vision. ImageNet is primarily used for large-scale image classification and as a pretraining corpus for transferable visual representations.

Modality: Images
Annotations / Labels: One object class label per image, drawn from 1,000 WordNet synsets
Size:
- ~1.28 million training images
- 50,000 validation images
- 100,000 test images (labels withheld)
Format:
- Images: JPEG
- Labels: Text files / metadata mappings
Structure:
- Predefined train, validation, and test splits
- Directory structure organized by class
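
A class-organized directory can be indexed with a few lines of standard-library Python. The sketch below assumes a layout of root/<synset ID>/<image>.JPEG, which is how the training split is commonly unpacked; the function name index_class_directory and the choice of sorted folder names as label indices are illustrative assumptions, not part of any official tooling. The validation images are often distributed in a single flat folder with labels in a separate file, so they may need to be sorted into class folders first.

```python
from pathlib import Path

# Assumption: images use a JPEG suffix in either case (e.g. ".JPEG").
IMAGE_SUFFIXES = {".jpeg", ".jpg"}


def index_class_directory(root):
    """Return (class_to_idx, samples) for a root/<synset ID>/<image> layout.

    Hypothetical helper for illustration only.
    """
    root = Path(root)
    # Sort the class folder names so label indices are reproducible.
    wnids = sorted(p.name for p in root.iterdir() if p.is_dir())
    class_to_idx = {wnid: i for i, wnid in enumerate(wnids)}

    samples = []
    for wnid in wnids:
        for path in sorted((root / wnid).iterdir()):
            # Skip anything that is not a JPEG file (notes, archives, ...).
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, class_to_idx[wnid]))
    return class_to_idx, samples
```

Calling index_class_directory("path/to/train") returns a mapping from synset ID to integer label and a list of (image path, label) pairs, which can then be handed to whatever image loader is in use.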

Typical Uses

Image classification benchmarking
Large-scale visual pretraining
Transfer learning and feature extraction
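
Classification results on this benchmark are conventionally reported as top-1 and top-5 accuracy. The following is a minimal pure-Python sketch of that metric, assuming per-class scores are available as a list of lists; the function name topk_accuracy and the input format are assumptions made for illustration.

```python
def topk_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores.

    scores: one list of per-class scores for each sample.
    labels: the true class index for each sample.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score, truncated to k.
        topk = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in topk
    return hits / len(labels)
```

With k=1 this reduces to ordinary accuracy. With k=5 a prediction counts as correct when the labeled class appears anywhere in the five highest-scoring classes, which partly compensates for images that contain several plausible objects.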

Notable Features

Large-scale object-centric dataset with broad visual diversity
Longstanding benchmark with extensive historical baselines
ImageNet accuracy correlates empirically with transfer performance on many downstream vision tasks

Limitations

Single-label annotations despite multi-object images
Known geographic and cultural bias in image sources
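No segmentation masks, and the classification labels alone do not support detection (bounding boxes exist only for part of the data, through the separate ILSVRC localization task)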

Access

License / Source Information

Dataset Owner: ImageNet Project, Stanford University
ImageNet is free to use for educational or noncommercial purposes only.