How to Combine Text-Mining Methods to Validate Induced Verb-Object Relations


Nicolas Bechet, Jacques Chauche, Violaine Prince, Mathieu Roche




This paper describes Natural Language Processing methods to extract and validate induced syntactic relations, here restricted to the Verb-Object relation. Extraction relies on a syntactic parser and a semantic closeness measure. Validation then relies on two techniques, a Web Validation system and a Semantic-Vector-based approach, as well as on different combinations of the two, in order to rank the induced Verb-Object relations. The Semantic Vector approach is a Roget-based method that represents a syntactic relation as a vector. Web Validation uses a search engine to determine the relevance of a syntactic relation according to its popularity. An experimental protocol is set up to judge automatically the relevance of the sorted induced relations. Finally, we apply our approach to a French corpus of news and evaluate the results using ROC curves.
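As an illustration of the ranking and evaluation steps only, the sketch below combines two per-relation scores and measures the quality of the resulting ranking by the area under the ROC curve. It is a minimal sketch under stated assumptions: the linear combination, the `alpha` weight, the function names, and the toy scores are all hypothetical and are not taken from the paper, which only states that different combinations are studied.

```python
from typing import List, Tuple


def combine(web_score: float, vector_score: float, alpha: float = 0.5) -> float:
    """Hypothetical linear combination of two scores assumed to lie in [0, 1]."""
    return alpha * web_score + (1.0 - alpha) * vector_score


def roc_auc(scored: List[Tuple[float, bool]]) -> float:
    """Area under the ROC curve of a scored list.

    Computed as the fraction of (relevant, irrelevant) pairs in which the
    relevant item has the higher score; ties count for one half.
    """
    positives = [score for score, relevant in scored if relevant]
    negatives = [score for score, relevant in scored if not relevant]
    if not positives or not negatives:
        raise ValueError("need at least one relevant and one irrelevant item")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy candidates: (verb, object, web score, vector score, judged relevant).
# All values are invented for illustration.
candidates = [
    ("eat", "apple", 0.9, 0.8, True),
    ("drink", "water", 0.7, 0.9, True),
    ("eat", "idea", 0.2, 0.4, False),
    ("drink", "table", 0.4, 0.1, False),
]

scored = [(combine(web, vec), relevant) for _, _, web, vec, relevant in candidates]
print(f"AUC of the combined ranking: {roc_auc(scored):.2f}")
```

An AUC of 1 means every relevant relation is ranked above every irrelevant one, while 0.5 corresponds to a random ordering; this is what makes the measure convenient for comparing alternative ranking functions on the same candidate list.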