READING NOTE: Semantic Object Parsing with Graph LSTM
TITLE: Semantic Object Parsing with Graph LSTM
AUTHORS: Xiaodan Liang, Xiaohui Shen, Jiashi Feng, Liang Lin, Shuicheng Yan
ASSOCIATION: National University of Singapore, Sun Yat-sen University, Adobe Research
FROM: arXiv:1603.07063
CONTRIBUTIONS
- A novel Graph LSTM structure is proposed to handle general graph-structured data; it exploits global context through superpixels extracted by over-segmentation.
- A confidence-driven scheme is proposed to select the starting node and the order of updating sequences.
- In each Graph LSTM unit, different forget gates for the neighboring nodes are learned to dynamically incorporate the local contextual interactions in accordance with their semantic relations.
METHOD
The main steps of the method are shown in the following figure.
- The input image first passes through a stack of convolutional layers to generate the convolutional feature maps.
- The convolutional feature maps are further used to generate an initial semantic confidence map for each pixel.
- The input image is over-segmented to multiple superpixels. For each superpixel, a feature vector is extracted from the upsampled convolutional feature maps.
- The first Graph LSTM takes the feature vector of every superpixel as input to compute a better state.
- The second Graph LSTM takes the feature vector of every superpixel and the output of the first Graph LSTM as input.
- The superpixels are updated in an order determined by their initial confidence.
- Several 1×1 convolution filters are employed to produce the final parsing results.
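The confidence-driven ordering in the steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes each superpixel's confidence is the mean of the per-pixel confidences it contains and that superpixels are visited in descending order of that mean. The function name `superpixel_update_order` and the list-of-lists inputs are hypothetical.

```python
def superpixel_update_order(confidence, labels):
    """Rank superpixels by mean pixel confidence (assumed scoring rule).

    confidence: H x W nested list of per-pixel confidence values.
    labels:     H x W nested list of superpixel ids from over-segmentation.
    Returns (order, mean_conf), where order lists superpixel ids from the
    most to the least confident.
    """
    sums, counts = {}, {}
    for conf_row, label_row in zip(confidence, labels):
        for value, sp in zip(conf_row, label_row):
            sums[sp] = sums.get(sp, 0.0) + value
            counts[sp] = counts.get(sp, 0) + 1

    mean_conf = {sp: sums[sp] / counts[sp] for sp in sums}
    # Highest confidence first; ties are broken by superpixel id.
    order = sorted(mean_conf, key=lambda sp: (-mean_conf[sp], sp))
    return order, mean_conf


# Toy 2 x 3 image with three superpixels.
toy_labels = [[0, 0, 1],
              [2, 2, 1]]
toy_confidence = [[0.9, 0.7, 0.2],
                  [0.5, 0.6, 0.4]]
order, mean_conf = superpixel_update_order(toy_confidence, toy_labels)
```

In this toy case superpixel 0 has the highest mean confidence, so it would serve as the starting node, followed by superpixel 2 and then superpixel 1.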
SOME DETAILS
A graph structure is built on the superpixels: each superpixel is a node, and two nodes are linked when their superpixels are adjacent. The history information used by the Graph LSTM for a superpixel comes from its adjacent superpixels.
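As a rough sketch of this graph construction, the code below builds an adjacency map from a superpixel label map and averages the hidden states of a node's neighbors. The 4-connectivity test for "adjacent", the plain averaging, and the names `build_superpixel_graph` and `average_neighbor_state` are assumptions made for illustration; the note does not specify these details.

```python
def build_superpixel_graph(labels):
    """Link superpixels that touch horizontally or vertically (assumed rule).

    labels: H x W nested list of superpixel ids.
    Returns a dict mapping each superpixel id to the set of adjacent ids.
    """
    height, width = len(labels), len(labels[0])
    adjacency = {}
    for y in range(height):
        for x in range(width):
            a = labels[y][x]
            adjacency.setdefault(a, set())
            # Checking the pixel below and the pixel to the right covers
            # every 4-connected pair exactly once.
            for ny, nx in ((y + 1, x), (y, x + 1)):
                if ny < height and nx < width:
                    b = labels[ny][nx]
                    if a != b:
                        adjacency[a].add(b)
                        adjacency.setdefault(b, set()).add(a)
    return adjacency


def average_neighbor_state(node, adjacency, hidden):
    """Average the hidden-state vectors of the nodes adjacent to `node`."""
    neighbors = sorted(adjacency[node])
    dim = len(hidden[node])
    if not neighbors:
        return [0.0] * dim
    return [sum(hidden[n][k] for n in neighbors) / len(neighbors)
            for k in range(dim)]


# Three superpixels in a row: 0 touches 1, and 1 touches 2.
graph = build_superpixel_graph([[0, 1, 2]])
hidden = {0: [1.0, 2.0], 1: [0.0, 0.0], 2: [3.0, 4.0]}
context_for_1 = average_neighbor_state(1, graph, hidden)
```

Here superpixel 1 receives context from both of its neighbors, while superpixels 0 and 2 are not linked to each other.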
ADVANTAGES
- Constructed on superpixels generated by oversegmentation, the Graph LSTM is more naturally aligned with the visual patterns in the image.
- Learning a separate forget gate for each neighboring node when updating a node's hidden state helps model the different kinds of neighbor connections.
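The idea of per-neighbor forget gates can be illustrated with a deliberately simplified scalar sketch. It assumes each gate is a sigmoid of the current node's feature and one neighbor's hidden state, and that the gated neighbor memories are averaged; the weights `w_f`, `u_fn`, `b_f` and both function names are hypothetical, and the paper's actual unit works on vectors with learned matrices and additional gates.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def neighbor_forget_gates(feature_i, neighbor_hidden, w_f, u_fn, b_f):
    """One forget gate per neighbor, each driven by that neighbor's state."""
    return [sigmoid(w_f * feature_i + u_fn * h_j + b_f)
            for h_j in neighbor_hidden]


def gated_neighbor_memory(gates, neighbor_memory):
    """Average the neighbors' memory cells after applying their own gates."""
    if not gates:
        return 0.0
    return sum(g * m for g, m in zip(gates, neighbor_memory)) / len(gates)


# Two neighbors with very different hidden states get different gates.
gates = neighbor_forget_gates(feature_i=0.0, neighbor_hidden=[0.0, 100.0],
                              w_f=1.0, u_fn=1.0, b_f=0.0)
memory_in = gated_neighbor_memory(gates, neighbor_memory=[2.0, 4.0])
```

Because each gate depends on its own neighbor's state, one neighbor's memory can pass through almost unchanged while another's is attenuated, which is the behavior a single shared forget gate cannot express.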