Mining and visualizing scientific publication data from Vojvodina

Miloš Radovanović, Jure Ferlež, Dunja Mladenić, Marko Grobelnik, Mirjana Ivanović

This paper describes the extraction and analysis of information from a document collection containing the publication references of the majority of researchers in the Serbian province of Vojvodina. After an introduction to text mining and visualization techniques, the paper reviews the IST World system, which provides online services for search, navigation and visualization of bibliographic data collected from multiple sources. The information extraction process consists of reference recognition and coauthorship detection. Evaluation on a representative subset of the data shows that the extraction performs well in terms of precision and recall. An example analysis by a domain expert shows how the visualization functionalities of IST World can provide insight into the collaboration and competence of authors.