This paper proposes a hierarchical organization of linguistic knowledge that views grammar as an abstraction over item-dependent information (in particular, an abstraction of subcategorization frames into a hierarchy of classes). The formalism has been successfully applied to a classification of 105 Italian verbal frames, developed by analysing a corpus of about 500,000 words.
The proposed framework, expressed in a dependency approach, is of both linguistic and computational interest. From a linguistic point of view, it provides a clear, significant and non-redundant representation. From a computational point of view, structuring the grammar into a hierarchy makes it possible to define a predictive component for parsing that exploits information at many levels of the hierarchy; this reduces ambiguity, a major problem in large-scale NLP systems.
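As a purely illustrative sketch of the idea (not the formalism developed in the paper), a hierarchy of subcategorization classes can be modelled with inheritance: each class adds dependents to those of its parent, and a predictive step keeps only the classes compatible with the dependents observed so far. The class names (`VERB`, `TRANS`, `DITRANS`, `INTRANS-LOC`), the slot labels, and the functions `compatible` and `predicted` are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FrameClass:
    """A node in a hypothetical hierarchy of subcategorization classes."""
    name: str
    parent: Optional["FrameClass"] = None
    own_slots: frozenset = frozenset()  # dependents introduced at this level

    def slots(self) -> frozenset:
        # A class licenses its own dependents plus everything it inherits.
        inherited = self.parent.slots() if self.parent else frozenset()
        return inherited | self.own_slots


# Hypothetical toy hierarchy; the labels are assumptions, not the paper's classes.
VERB = FrameClass("VERB", None, frozenset({"SUBJ"}))
TRANS = FrameClass("TRANS", VERB, frozenset({"OBJ"}))
DITRANS = FrameClass("DITRANS", TRANS, frozenset({"IOBJ"}))
INTRANS_LOC = FrameClass("INTRANS-LOC", VERB, frozenset({"LOC"}))
CLASSES = [VERB, TRANS, DITRANS, INTRANS_LOC]


def compatible(observed):
    """Names of the classes that license every dependent observed so far."""
    seen = set(observed)
    return [c.name for c in CLASSES if seen <= c.slots()]


def predicted(observed):
    """Dependents still expected by at least one compatible class."""
    seen = set(observed)
    expected = set()
    for c in CLASSES:
        if seen <= c.slots():
            expected |= c.slots() - seen
    return expected


if __name__ == "__main__":
    print(compatible({"SUBJ", "OBJ"}))
    print(predicted({"SUBJ", "OBJ"}))
```

In this toy setting, observing a subject and an object rules out the locative intransitive class and leaves only an indirect object as a possible further dependent, which is the kind of ambiguity reduction a predictive component could exploit.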