CASIA WebFace Database

Link: http://www.cbsr.ia.ac.cn/english/CASIA-WebFace-Database.html

Driven by big data and deep convolutional neural networks (CNNs), face recognition performance is becoming comparable to that of humans. Using large-scale private training datasets, several groups have achieved very high accuracy on LFW, i.e., 97% to 99%. While there are many open-source CNN implementations, no large-scale face dataset is publicly available. In the current state of the field, data matters more than algorithms. To address this problem, we propose a semi-automatic way to collect face images from the Internet and build a large-scale dataset containing 10,575 subjects and 494,414 images, called CASIA-WebFace. To the best of our knowledge, this dataset ranks second in size in the literature, smaller only than Facebook's private dataset (SFC). We encourage researchers working on data-hungry methods to train on this dataset and report performance on LFW.

The statistics of the proposed CASIA-WebFace dataset are shown in Table 1. Except for Facebook's SFC dataset, CASIA-WebFace has the largest scale. Because of user privacy concerns, SFC may never be opened to the research community. The features of Microsoft's WDRef dataset have been publicly available since 2012, but the feature-only release is inflexible for advanced research. Among the datasets listed in the table, CASIA-WebFace+LFW is the most suitable combination for large-scale face recognition in the wild. If you feel that accuracy on LFW has been saturated by current state-of-the-art methods, BLUFR is a more challenging protocol for reporting your results.

Table 1. Statistics of CASIA-WebFace and comparison with other large-scale face datasets.

Dataset          #Subjects   #Images     Availability
LFW [1]          5,749       13,233      Public
WDRef [2]        2,995       99,773      Public (feature only)
CelebFaces [3]   10,177      202,599     Private
SFC [4]          4,030       4,400,000   Private
CACD [5]         2,000       163,446     Public (partially annotated)
CASIA-WebFace    10,575      494,414     Public
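
The figures in Table 1 can also be compared by how many images each subject has on average, which is a rough indicator of how much intra-class variation a dataset offers. The short Python sketch below only does arithmetic on the numbers copied from Table 1. The names DATASETS, images_per_subject and ranked_by_images are illustrative choices made here and are not part of any CASIA-WebFace tooling.

```python
# Figures copied from Table 1 as (number of subjects, number of images).
DATASETS = {
    "LFW": (5_749, 13_233),
    "WDRef": (2_995, 99_773),
    "CelebFaces": (10_177, 202_599),
    "SFC": (4_030, 4_400_000),
    "CACD": (2_000, 163_446),
    "CASIA-WebFace": (10_575, 494_414),
}


def images_per_subject(subjects: int, images: int) -> float:
    """Average number of images per subject."""
    return images / subjects


def ranked_by_images(datasets: dict) -> list:
    """Dataset names ordered from most images to fewest."""
    return sorted(datasets, key=lambda name: datasets[name][1], reverse=True)


if __name__ == "__main__":
    for name in ranked_by_images(DATASETS):
        subjects, images = DATASETS[name]
        ratio = images_per_subject(subjects, images)
        print(f"{name:14s} {images:>10,d} images  {ratio:8.1f} per subject")
```

Ranked by image count, SFC comes first and CASIA-WebFace second, which matches the claim in the text that only the private SFC dataset is larger.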

Publication and Results: 
To illustrate the quality of CASIA-WebFace, we trained a deep CNN on it and compared its accuracy with state-of-the-art methods such as DeepFace and DeepID2. Please refer to the following technical report for details.
♦ Dong Yi, Zhen Lei, Shengcai Liao and Stan Z. Li, “Learning Face Representation from Scratch”. arXiv preprint arXiv:1411.7923. 2014. (pdf)

The above reference should be cited in all documents and papers that report experimental results based on the CASIA WebFace database.

Download Instructions: 
To apply for the database, please follow the steps below:

  1. Download and print the document "Agreement for using CASIA WebFace database".
  2. Sign the agreement. (It must be signed by the director or a delegate of the university department. Applications from individuals are not accepted.)
  3. Send the signed agreement to cbsr-request@authenmetric.com.
  4. Check your email after one day; if your application has been approved, you will find a login account and password for our website.
  5. Download the CASIA WebFace database from our website with the authorized account within 48 hours.

Copyright Note and Contacts:
The database is released for research and educational purposes. We hold no liability for any undesirable consequences of using the database. All rights of the CASIA WebFace database are reserved.

References: 
[1] LFW, http://vis-www.cs.umass.edu/lfw/ 
[2] D. Chen, X. Cao, L. Wang, F. Wen, and J. Sun. “Bayesian face revisited: A joint formulation”. In ECCV 2012, pages 566–579. Springer, 2012. 
[3] Y. Sun, X. Wang, and X. Tang. “Deep learning face representation by joint identification-verification”. arXiv preprint arXiv:1406.4773, 2014. 
[4] Y. Taigman, M. Yang, M. Ranzato, and L. Wolf. “DeepFace: Closing the gap to human-level performance in face verification”. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2014, pages 1701–1708. IEEE, 2014.
[5] CARC, http://bcsiriuschen.github.io/CARC/

