A Weighted Mutual Information Biclustering Algorithm for Gene Expression Data

Yidong Li, Wenhua Liu, Yankun Jia, Hairong Dong

Microarrays are one of the latest breakthroughs in experimental molecular biology, which have already provided huge amount of high dimensional genetic data. Traditional clustering methods are difficult to deal with this high dimensional data, whose a subset of genes are co-regulated under a subset of conditions. Biclustering algorithms are introduced to discover local characteristics of gene expression data. In this paper, we present a novel biclustering algorithm, which called Weighted Mutual Information Biclustering algorithm(WMIB) to discover this local characteristics of gene expression data. In our algorithm, we use the weighted mutual information as new similarity measure which can be simultaneously detect complex linear and nonlinear relationships between genes, and our algorithm proposes a new objective function to update weights of each bicluster, which can simultaneously select the conditions set of each bicluster using some rules.We have evaluated our algorithm on yeast gene expression data, the experimental results show that our algorithm can generate larger biclusters with lower mean square residues simultaneously.