Berkeley Visiting Professor: AlphaGo Zero and Deep Learning | GAIR大讲堂



Session Overview

In this GAIR大讲堂 session, the speaker will explain how AlphaGo Zero combines tabula rasa (blank-slate) learning, ResNet, and MCTS, using self-play within a unified Policy Network and Value Network framework to learn entirely from scratch, with zero prior experience. He will also discuss how the latest deep learning methods are pushing the field from machine perception toward machine cognition, and will share the newest research directions of Dr. Wang Qiang's team in applying deep learning. Having served for many years on the editorial boards of SCI journals, the speaker will additionally share his experience in writing academic papers.
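As a concrete companion to this description, here is a minimal sketch, assuming PyTorch, of the combined architecture the talk refers to: a ResNet trunk shared by a policy head and a value head. The 17 input feature planes follow the paper's board encoding, but the channel width and block count are scaled far down for illustration; none of this is DeepMind's actual code.

```python
# Minimal, illustrative policy-value ResNet in the AlphaGo Zero style.
# Sizes here are toy choices; the paper's networks use up to 40 blocks.
import torch
import torch.nn as nn
import torch.nn.functional as F

BOARD = 19  # Go board width/height

class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.c1 = nn.Conv2d(ch, ch, 3, padding=1)
        self.b1 = nn.BatchNorm2d(ch)
        self.c2 = nn.Conv2d(ch, ch, 3, padding=1)
        self.b2 = nn.BatchNorm2d(ch)

    def forward(self, x):
        y = F.relu(self.b1(self.c1(x)))
        y = self.b2(self.c2(y))
        return F.relu(x + y)  # residual (skip) connection

class PolicyValueNet(nn.Module):
    def __init__(self, in_ch=17, ch=64, blocks=4):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(in_ch, ch, 3, padding=1),
                                  nn.BatchNorm2d(ch), nn.ReLU())
        self.trunk = nn.Sequential(*[ResBlock(ch) for _ in range(blocks)])
        # Policy head: logits over BOARD*BOARD moves plus pass.
        self.policy = nn.Sequential(nn.Conv2d(ch, 2, 1), nn.Flatten(),
                                    nn.Linear(2 * BOARD * BOARD,
                                              BOARD * BOARD + 1))
        # Value head: a scalar in [-1, 1] predicting the game winner.
        self.value = nn.Sequential(nn.Conv2d(ch, 1, 1), nn.Flatten(),
                                   nn.Linear(BOARD * BOARD, ch), nn.ReLU(),
                                   nn.Linear(ch, 1), nn.Tanh())

    def forward(self, x):
        h = self.trunk(self.stem(x))
        return self.policy(h), self.value(h)

net = PolicyValueNet()
planes = torch.zeros(1, 17, BOARD, BOARD)  # 17 feature planes, as in the paper
logits, value = net(planes)
print(logits.shape, value.shape)  # torch.Size([1, 362]) torch.Size([1, 1])
```

In the full system, MCTS uses the policy output as move priors and the value output to evaluate leaf positions, which is what ties the two heads to the tree search the talk describes.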


Suggested Pre-Reading


《Mastering the game of Go without human knowledge》


Paper: http://t.cn/RWkV1B6


A long-standing goal of artificial intelligence is an algorithm that learns, tabula rasa, superhuman proficiency in challenging domains. Recently, AlphaGo became the first program to defeat a world champion in the game of Go. The tree search in AlphaGo evaluated positions and selected moves using deep neural networks. These neural networks were trained by supervised learning from human expert moves, and by reinforcement learning from self-play. Here we introduce an algorithm based solely on reinforcement learning, without human data, guidance or domain knowledge beyond game rules. AlphaGo becomes its own teacher: a neural network is trained to predict AlphaGo’s own move selections and also the winner of AlphaGo’s games. This neural network improves the strength of the tree search, resulting in higher quality move selection and stronger self-play in the next iteration. Starting tabula rasa, our new program AlphaGo Zero achieved superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo.
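To make the abstract's training signal concrete, here is a hedged sketch, assuming PyTorch, of the joint objective it describes: the network is fit to the MCTS visit-count distribution pi ("AlphaGo's own move selections") and the self-play outcome z ("the winner"), following the paper's loss l = (z - v)^2 - pi^T log p. The function name and tensor shapes are my own framing, not DeepMind's code.

```python
# Sketch of the AlphaGo Zero training objective; L2 regularization
# (the paper's c * ||theta||^2 term) is left to the optimizer's weight_decay.
import torch
import torch.nn.functional as F

def alphazero_loss(policy_logits, value, pi, z):
    """policy_logits: (B, moves); value: (B, 1);
    pi: (B, moves) MCTS visit-count targets;
    z: (B, 1) game outcomes in {-1, +1} from the current player's view."""
    value_loss = F.mse_loss(value, z)
    policy_loss = -(pi * F.log_softmax(policy_logits, dim=1)).sum(dim=1).mean()
    return value_loss + policy_loss

# Usage with the (illustrative) PolicyValueNet sketched earlier:
# logits, v = net(planes)
# loss = alphazero_loss(logits, v, pi_targets, z_targets)
```

In the paper, this objective is minimized over positions sampled from recent self-play games, so each iteration's stronger search produces better training targets for the next.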


《How to Escape Saddle Points Efficiently》


Paper: https://arxiv.org/abs/1703.00887


This paper shows that a perturbed form of gradient descent converges to a second-order stationary point in a number of iterations which depends only poly-logarithmically on dimension (i.e., it is almost "dimension-free"). The convergence rate of this procedure matches the well-known convergence rate of gradient descent to first-order stationary points, up to log factors. When all saddle points are non-degenerate, all second-order stationary points are local minima, and our result thus shows that perturbed gradient descent can escape saddle points almost for free. Our results can be directly applied to many machine learning applications, including deep learning. As a particular concrete example of such an application, we show that our results can be used directly to establish sharp global convergence rates for matrix factorization. Our results rely on a novel characterization of the geometry around saddle points, which may be of independent interest to the non-convex optimization community.
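The mechanism in this abstract can be sketched in a few lines. Below is an illustrative NumPy version of perturbed gradient descent: run ordinary gradient descent, and whenever the gradient is nearly zero (a candidate stationary point, possibly a saddle), add a small perturbation drawn uniformly from a ball. The step size, thresholds, and test function are demonstration values, not the paper's tuned constants.

```python
import numpy as np

def perturbed_gd(grad, x0, eta=0.05, g_thresh=1e-3, radius=0.1,
                 t_noise=10, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    last_noise = -(t_noise + 1)  # allow a perturbation at t = 0
    for t in range(iters):
        g = grad(x)
        if np.linalg.norm(g) <= g_thresh and t - last_noise > t_noise:
            # Small gradient: perturb uniformly from a ball of given radius.
            u = rng.normal(size=x.shape)
            x = x + radius * rng.uniform() ** (1 / x.size) * u / np.linalg.norm(u)
            last_noise = t
        x = x - eta * g  # ordinary gradient step
    return x

# f(x, y) = x^2 - y^2 has a strict saddle at the origin. Starting exactly
# there, plain gradient descent never moves; the perturbed version escapes
# along the y direction.
grad = lambda v: np.array([2.0 * v[0], -2.0 * v[1]])
print(perturbed_gd(grad, [0.0, 0.0]))
```

The paper's analysis shows that, with appropriately chosen parameters, this perturbation trick costs only poly-logarithmic dimension dependence on top of the usual gradient descent rate.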


《Benchmarking State-of-the-Art Deep Learning Software Tools》


Paper: https://arxiv.org/abs/1608.07249


Deep learning has been shown as a successful machine learning method for a variety of tasks, and its popularity results in numerous open-source deep learning software tools. Training a deep network is usually a very time-consuming process. To address the computational challenge in deep learning, many tools exploit hardware features such as multi-core CPUs and many-core GPUs to shorten the training time. However, different tools exhibit different features and running performance when training different types of deep networks on different hardware platforms, which makes it difficult for end users to select an appropriate pair of software and hardware. In this paper, we aim to make a comparative study of the state-of-the-art GPU-accelerated deep learning software tools, including Caffe, CNTK, MXNet, TensorFlow, and Torch. We first benchmark the running performance of these tools with three popular types of neural networks on two CPU platforms and three GPU platforms. We then benchmark some distributed versions on multiple GPUs. Our contribution is two-fold. First, for end users of deep learning tools, our benchmarking results can serve as a guide to selecting appropriate hardware platforms and software tools. Second, for software developers of deep learning tools, our in-depth analysis points out possible future directions to further optimize the running performance.
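For a flavor of the measurement the paper performs, here is a minimal timing harness, written in PyTorch purely for illustration (the paper itself benchmarks Caffe, CNTK, MXNet, TensorFlow, and Torch): warm up first, then average wall-clock time per mini-batch over many forward-backward steps. The model, batch size, and iteration counts are arbitrary placeholders.

```python
# Illustrative per-mini-batch timing harness for training throughput.
import time
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Sequential(nn.Linear(1024, 2048), nn.ReLU(),
                      nn.Linear(2048, 1024)).to(device)
opt = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()
x = torch.randn(64, 1024, device=device)
y = torch.randn(64, 1024, device=device)

def step():
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()

for _ in range(10):           # warm-up: exclude one-time setup costs
    step()
if device == "cuda":
    torch.cuda.synchronize()  # GPU work is async; sync before timing
t0 = time.perf_counter()
for _ in range(100):
    step()
if device == "cuda":
    torch.cuda.synchronize()
print(f"{(time.perf_counter() - t0) / 100 * 1000:.2f} ms per mini-batch")
```

The paper applies this kind of measurement to three popular types of neural networks across CPU and GPU platforms; the relative comparison between tools, not the absolute numbers, is the point.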



Talk Outline

1. Key topics in AI and deep learning

2. Applications of deep learning in AI

3. AlphaGo and AlphaGo Zero: introduction and comparison

4. From machine perception to machine cognition

5. The team's latest research directions



Talk Title

AlphaGo Zero and Deep Learning: From Machine Perception to Machine Cognition



About the Speaker

Dr. Wang Qiang received his bachelor's degree in Computer Science and Technology from Xi'an Jiaotong University, followed by a master's degree in Software Engineering and a Ph.D. in Robotics from Carnegie Mellon University. He is a member of the audit expert pool of the US Office of the Comptroller of the Currency (OCC), a fellow of the IBM Institute for Business Value, and a principal researcher at the Thomas J. Watson Research Center in New York. A Senior Member of the IEEE, he served as a paper reviewer for CVPR 2008, 2009, and 2013 and will do so again for CVPR 2018, and he sits on the editorial boards of PAMI and TIP, two of the world's top journals in the field. Dr. Wang has published more than 90 papers in top international journals and has presented his work many times at conferences such as ICCV and CVPR. His main research areas include image understanding, machine learning, intelligent trading, financial anti-fraud, and risk prediction.



Time

8:00 PM, Monday, November 6 (Beijing time)


How to Join

Scan the QR code on the poster to add the community manager on WeChat, with the note 「王强」.


Highlights from Past Open Courses

Shen Zhiqiang, Ph.D. at Fudan University: The DSOD Model for Object Detection (ICCV 2017)

Liu Bin, 极限元: Typical Applications of Deep Learning in Speech Generation

Wen Shixue, Sogou: Deep-Learning-Based Speech Separation

Sun Zhaomin, Video++: An Industry Analysis of Video Content Recognition

Wang Chaoyue, University of Technology Sydney: Image Editing Methods Based on Generative Adversarial Networks

Zhang Jian, 达观数据: Text Classification Methods and Application Cases

Wang Shuhao, Ph.D. at Tsinghua University: A Deep-Learning-Based Fraud Detection System for E-Commerce Transactions

Wang Dong, Engineer at Twitter: A Detailed Look at the YOLO2 and YOLO9000 Object Detection Systems

A Kaggle Gold-Medal Team: What Are the Common Playbooks for Image Competitions?

Liu Kai, 宜远智能: Active Incremental Learning that Significantly Reduces Model Training Costs


If this event sounds good to you, you're welcome to click through and register~

    
