Automated image stylization, which creates artistically pleasing images from ordinary photographs, is an interesting and useful task in computer vision. Consequently, several automated stylization methods have been developed using powerful Deep Neural Network (DNN) features. They typically employ a carefully constructed joint loss function that separately considers the similarities between a proposed output and the input content and style images. However, most previous methods either apply a single homogeneous stylization optimization to the entire image or require manually created segmentation masks to perform well. Such systems may perform poorly when the style and content images differ greatly, or when certain regions of the content (e.g., facial details) demand different rendering. In this paper, we propose a heterogeneous stylization system that makes full use of a Segmentation Deep Convolutional Neural Network, which produces hierarchical semantic information and performs feature extraction simultaneously, enabling semantically guided heterogeneous stylization. We also examine which factors influence the quality of stylization, and visualize the network's capacity to render desired details effectively.
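The joint content-and-style loss mentioned above is commonly structured as a weighted sum of a feature-matching term and a Gram-matrix term. The following is a minimal NumPy sketch of that generic formulation, not the specific loss used in this paper; the function names, the weights `alpha` and `beta`, and the single-layer feature shapes are illustrative assumptions.

```python
import numpy as np

def gram_matrix(features):
    """Feature correlations for one DNN layer.
    features: array of shape (channels, height * width)."""
    c, n = features.shape
    return features @ features.T / n  # (channels, channels), normalized by positions

def joint_loss(output_feats, content_feats, style_feats, alpha=1.0, beta=1e3):
    """Illustrative joint stylization loss (assumed form, single layer).
    Content term: mean squared error between raw feature maps.
    Style term: mean squared error between Gram matrices."""
    content_loss = np.mean((output_feats - content_feats) ** 2)
    style_loss = np.mean((gram_matrix(output_feats) - gram_matrix(style_feats)) ** 2)
    return alpha * content_loss + beta * style_loss
```

In practice the style term is usually summed over several layers, and the ratio `beta / alpha` controls how strongly stylization overrides content fidelity.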