Weakly supervised learning methods have brought improvements to the semantic segmentation problem. By simplifying the labeling work, more attention is given to the network architecture. In the paper entitled “Weaklier Supervised Semantic Segmentation With Only One Image Level Annotation per Category”, the authros propose a three-stage semantic segmentation framework that deals with image and pixel level understanding at a coarse level and goes deeper towards objects feature learning at a fine grained level.
The novelty consists of using only one sample with image level annotation per category, whose labeling form is more closer to prior conditions required by human to learn new objects. For image classification, response activation clustering (RAC) is proposed to achieve image level labeling, while multi heat map slices fusion (MSF) and saliency-edge-color-texture (SECT) based modification are utilized to generate pixel level annotations, which combine high-level semantic features and imaging prior based low-level attributes. For object common feature learning, dual-branch iterative structure is introduced. Based on conservative and radical strategies, information integration are realized iteratively, the completeness and accuracy of object region are gradually improved.
In the first stage, image level semantic information is extracted in form of response vector, and the relationship of each pair of feature dimensions is analyzed to achieve accurate image level object category annotations. Then, heat maps based on high-level semantics and low-level imaging attributes are utilized in combination to generate pixel level pseudo supervised annotations. In the first two phases, multi attention mechanism is introduced to achieve a better understanding of objects which are not salient or with small scale, as well as to mine detailed expression in images. Using a number of obtained annotations, dual branch network model is designed to learn common features of objects from different instances, more complete and accurate object regions can be obtained iteratively. Based on the methods, semantic segmentation task is implemented through a learning process which takes advantage of prior knowledge as much as possible.
Li, Xi, Huimin Ma, and Xiong Luo. “Weaklier Supervised Semantic Segmentation With Only One Image Level Annotation per Category.” IEEE Transactions on Image Processing 29 (2019): 128-141.