Common Computer Vision Datasets

ImageNet

overview

(updated on April 30, 2010)

  • Total number of non-empty synsets: 21841
  • Total number of images: 14,197,122
  • Number of images with bounding box annotations: 1,034,908
  • Number of synsets with SIFT features: 1000
  • Number of images with SIFT features: 1.2 million
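
A couple of derived ratios make these counts easier to compare. The following is a minimal Python sketch that uses only the figures listed above; the variable names are illustrative and not part of any ImageNet tooling.

```python
# Figures copied from the overview list above.
total_synsets = 21_841
total_images = 14_197_122
images_with_boxes = 1_034_908

# Average number of images per non-empty synset.
images_per_synset = total_images / total_synsets

# Share of all images that carry bounding box annotations.
box_fraction = images_with_boxes / total_images

print(f"Images per synset: {images_per_synset:.0f}")
print(f"Images with bounding boxes: {box_fraction:.1%}")
```

This works out to roughly 650 images per synset, with about 7.3% of images having bounding box annotations.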

tasks

  • Scene classification
  • Object localization
  • Object detection
  • Object detection from video

MSCOCO

overview

  • Object segmentation
  • Recognition in Context
  • Multiple objects per image
  • More than 300,000 images
  • More than 2 Million instances
  • 80 object categories
  • 5 captions per image
  • Keypoints on 100,000 people
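
To make "multiple objects per image" concrete, here is a small Python sketch that counts instances per image and per category. It assumes annotations laid out the way COCO's detection JSON files are (top-level `images`, `annotations`, and `categories` lists, with each annotation carrying an `image_id` and a `category_id`). The tiny in-memory sample and the function name `count_instances` are made up for illustration.

```python
from collections import Counter


def count_instances(coco):
    """Return (instances per image id, instances per category name)."""
    names = {c["id"]: c["name"] for c in coco["categories"]}
    per_image = Counter(a["image_id"] for a in coco["annotations"])
    per_category = Counter(names[a["category_id"]] for a in coco["annotations"])
    return per_image, per_category


# Hypothetical toy data in a COCO-style layout; bbox is [x, y, width, height].
sample = {
    "images": [{"id": 1, "file_name": "a.jpg"}, {"id": 2, "file_name": "b.jpg"}],
    "categories": [{"id": 1, "name": "person"}, {"id": 18, "name": "dog"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [10, 20, 50, 80]},
        {"id": 11, "image_id": 1, "category_id": 18, "bbox": [70, 40, 30, 30]},
        {"id": 12, "image_id": 2, "category_id": 1, "bbox": [5, 5, 40, 90]},
    ],
}

per_image, per_category = count_instances(sample)
print(dict(per_image))      # instance count keyed by image id
print(dict(per_category))   # instance count keyed by category name
```

The same function should work on a real annotation file loaded with `json.load`, provided the file follows the layout assumed here.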

tasks

  • Detection
  • Keypoints
  • Captioning

Open Images

  • Image URLs: ~9,000,000
  • 2,000,000 bounding boxes spanning 600 object classes (1.24M in train, 830K in validation+test)
  • 4,300,000 human-verified positive image-level labels on the training set
  • Coming soon: trained models (both image-level classifiers and object detectors)

YouTube-8M

(2017 update)

  • Video URLs: 7,000,000
  • Total video duration: 450,000 hours
  • Audio/visual features: 3,200,000,000
  • Classes: 4,716
  • Average labels per video: 3.4
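
The headline numbers above imply a few per-video averages. This is a rough Python sketch of that arithmetic, using only the figures in the list; the variable names are illustrative, and the results are averages over the whole dataset, not properties of any individual video.

```python
# Figures copied from the 2017 update list above.
video_count = 7_000_000
total_hours = 450_000
feature_count = 3_200_000_000
avg_labels_per_video = 3.4

total_seconds = total_hours * 3600

# Average video length in seconds.
avg_video_seconds = total_seconds / video_count

# Average number of audio/visual features per second of video.
features_per_second = feature_count / total_seconds

# Approximate total number of video-level labels.
approx_total_labels = video_count * avg_labels_per_video

print(f"Average video length: {avg_video_seconds:.0f} s")
print(f"Features per second of video: {features_per_second:.2f}")
print(f"Approximate total labels: {approx_total_labels:,.0f}")
```

Under these figures the average video runs a little under four minutes, with about two features per second of video and roughly 23.8 million labels in total.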

Sadly, YouTube is not accessible from mainland China.


SUN

overview

  • Images: 131,067
  • Scene categories: 908
  • Segmented objects: 313,884
  • Object categories: 4,479

tasks

  • Scene Recognition Benchmark: 397 scene categories
  • Object Detection Benchmark: 16,873 images

NUS-WIDE

  • images: 269,648
  • unique tags: 5,018
  • low-level feature types: 6

PASCAL VOC 2010

  • Training and validation images: 10,103
  • Testing images: 9,637
  • categories: 33/59