Dynamic Network Modelling with Similarity based Aggregation Algorithm


Günce Keziban Orman




Proper modelling of complex systems allows the discovery of hidden knowledge that cannot be uncovered using traditional methods. One technique for such modelling is dynamic networks. In this work, we aim to develop a methodology for extracting proper dynamic networks. We concentrate on two fundamentally interconnected problems: first, determining the appropriate window size for dynamic network snapshots; and second, obtaining a proper dynamic network model. For the former problem, we propose compression ratios based on Jaccard similarity and on its statistical significance, and for the latter, we propose an aggregation approach that extracts dynamic networks with snapshots of varying duration. The aggregation algorithm compresses the system information when there is repetition and takes a new snapshot when there is a significant structural change. The experiments are carried out on four real-world data sets, one simple and three complex, by comparing our proposal with baseline approaches. We use the well-known Enron emails as the simple set and the Haggle Infocomm, MIT Reality Mining, and Sabanci Wi-Fi logs as the complex sets. The complex sets consist of Wi-Fi or Bluetooth connections that show the proximity of system objects; such data are known to be noisy, which makes analysis difficult. The experimental results show that the proposed methodology can be used to find not only significant time points in the simple Enron emails, but also circadian rhythms, together with their time intervals, that reveal the life-cycle of connected areas in the complex Wi-Fi logs or Bluetooth connections. According to the tests on the four data sets, both the compression ratios and the aggregation process enable the extraction of dynamic networks that have reduced noise, are easy to comprehend, and appropriately reflect the characteristics of the system.
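To make the aggregation idea concrete, the following is a minimal sketch under stated assumptions, not the algorithm proposed in this work. It assumes each fixed-size time window is given as a set of edges, and it uses a fixed Jaccard threshold (the hypothetical parameter `threshold`) as a stand-in for the statistical significance criterion mentioned above. The names `jaccard` and `aggregate` are illustrative only.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Edge = Tuple[str, str]  # assumed: endpoints already in a canonical order


def jaccard(a: Iterable[Edge], b: Iterable[Edge]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two edge sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention: two empty windows are identical
    return len(a & b) / len(a | b)


def aggregate(
    windows: List[Set[Edge]], threshold: float = 0.5
) -> List[Tuple[int, int, FrozenSet[Edge]]]:
    """Merge consecutive windows into snapshots of varying duration.

    A window is merged into the current snapshot while it is similar
    enough (repetition); otherwise the snapshot is closed and a new one
    starts (structural change). The fixed threshold is an assumption of
    this sketch, replacing a significance test.
    Returns (first_window_index, last_window_index, edges) per snapshot.
    """
    snapshots: List[Tuple[int, int, FrozenSet[Edge]]] = []
    if not windows:
        return snapshots
    start, current = 0, set(windows[0])
    for i in range(1, len(windows)):
        if jaccard(current, windows[i]) >= threshold:
            current |= windows[i]  # repetition: compress into the snapshot
        else:
            snapshots.append((start, i - 1, frozenset(current)))
            start, current = i, set(windows[i])  # change: new snapshot
    snapshots.append((start, len(windows) - 1, frozenset(current)))
    return snapshots


if __name__ == "__main__":
    toy = [
        {("a", "b"), ("b", "c")},
        {("a", "b"), ("b", "c")},
        {("a", "b"), ("b", "c"), ("c", "d")},
        {("x", "y")},
        {("x", "y"), ("y", "z")},
    ]
    for first, last, edges in aggregate(toy):
        print(first, last, sorted(edges))
```

On the toy input, the first three windows collapse into one snapshot and the last two into another, so five equal-length windows become two snapshots of different duration. Comparing each window against the accumulated snapshot rather than against the previous window is a design choice of this sketch.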