The retrieval of 3-D surface models of the Earth is a major issue in remote sensing. Good results have already been obtained at medium resolution with optical and radar imaging sensors. For instance, missions such as the Shuttle Radar Topography Mission (SRTM) and SPOT HRS have provided accurate digital terrain models. Computing a digital surface model (DSM) over urban areas is the new challenge. With the recent improvements in radar image resolution, synthetic aperture radar (SAR) interferometry, which had already proved its efficiency at low resolution, provides an accurate tool for urban 3-D monitoring. However, the complexity of urban areas and of high-resolution SAR images prevents the straightforward computation of an accurate DSM. In this paper, an original high-level processing chain is proposed to solve this problem, and results on real data are discussed. The processing chain includes three main steps, namely: (1) information extraction; (2) fusion; and (3) correction. Our main contribution addresses the fusion step, where we aim at retrieving both a classification and a DSM while imposing minimal constraints on the building shapes. The joint derivation of height and class enables the introduction of more contextual information and, as a consequence, allows more flexibility toward scene architecture. First, the initial images (interferogram, amplitude, and coherence images) are converted into higher-level information maps using different approaches (filtering, object recognition, or global classification). Second, these new images are merged in a Markovian framework to jointly retrieve an improved classification and a height map. Third, the DSM and the classification are refined: layover and shadow are computed from the estimated DSM, and comparing them with the classification allows some corrections.
This paper mainly addresses the second step, while the other two are briefly explained with reference to previously published papers. The results obtained on real images are compared to ground truth and indicate good accuracy in spite of the limited image resolution. The main limitation of DSM computation remains the initial spatial and altimetric resolutions, which need to be improved.
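
To make the idea of jointly estimating class and height in a Markovian framework more concrete, the following is a minimal sketch and not the model used in this paper. It assumes a regular pixel grid, a small list of candidate (class, height) states, a precomputed per-pixel data cost, and a simple pairwise prior that penalizes class changes and height jumps between 4-connected neighbors. The optimizer is iterated conditional modes (ICM), chosen here only for brevity. The names `icm_joint`, `unary`, `states`, `beta_class`, and `beta_height` are all hypothetical.

```python
import numpy as np

def icm_joint(unary, states, beta_class=1.0, beta_height=0.1, n_iter=5):
    """Jointly label each pixel with a (class, height) state using ICM.

    unary  : (H, W, S) array, data cost of each candidate state per pixel
             (assumed to come from the extracted feature images).
    states : list of S (class_id, height) pairs (hypothetical candidates).
    """
    classes = np.array([c for c, _ in states])
    heights = np.array([h for _, h in states], dtype=float)
    n_rows, n_cols, _ = unary.shape
    # Initialize with the state of lowest data cost at each pixel.
    labels = unary.argmin(axis=2)
    for _ in range(n_iter):
        changed = False
        for i in range(n_rows):
            for j in range(n_cols):
                cost = unary[i, j].astype(float).copy()
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < n_rows and 0 <= nj < n_cols:
                        n = labels[ni, nj]
                        # Prior: neighbors prefer the same class ...
                        cost += beta_class * (classes != classes[n])
                        # ... and similar heights.
                        cost += beta_height * np.abs(heights - heights[n])
                best = int(cost.argmin())
                if best != labels[i, j]:
                    labels[i, j] = best
                    changed = True
        if not changed:
            break
    return classes[labels], heights[labels]

# Toy example: two states, ground (class 0, 0 m) and building (class 1, 10 m).
states = [(0, 0.0), (1, 10.0)]
unary = np.zeros((3, 3, 2))
unary[..., 1] = 1.0          # the data favor "ground" everywhere ...
unary[1, 1] = [1.0, 0.0]     # ... except one isolated pixel favoring "building"
cls, hgt = icm_joint(unary, states)
cls_raw, hgt_raw = icm_joint(unary, states, beta_class=0.0, beta_height=0.0)
```

In this toy case the contextual prior overrides the isolated "building" pixel, whereas setting both weights to zero reduces the result to the per-pixel decision. A real system would derive the data cost from the interferogram, amplitude, and coherence images and would likely use a more capable optimizer.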