Image matting is a long‐standing problem in computer graphics and vision, usually formulated as the accurate estimation of the foreground in an input image. We argue that foreground objects can be represented by information at different levels, including the central body, coarse‐grained boundaries, and refined details. Based on this observation, we propose a multi‐scale information assembly framework (MSIA‐matte) that extracts high‐quality alpha mattes from single RGB images. Given an input image, we extract advanced semantics as the subject content and retain initial CNN features to encode the foreground at different levels, then combine them with a dedicated information assembly strategy. Extensive experiments demonstrate the effectiveness of the proposed MSIA‐matte, which achieves state‐of‐the‐art performance compared with most existing matting networks.