Biclustering
Overview
Biclustering, the simultaneous clustering of rows and columns, is an important technique in two-way data analysis. Although the idea has been around for 30 years, algorithm development has accelerated considerably since 2000. The proposed algorithms address different kinds of bicluster problems and differ especially in the expected outcome or structure of the biclusters.
The challenge is therefore not only to find the best algorithm, but also to know which algorithm should be used under which conditions. Our work pursues two goals: to develop a general framework for model-based biclustering and to benchmark popular algorithms.
In addition to theoretical investigations, we are developing an open-source reference implementation in R, an environment for statistical computing and graphics. State-of-the-art algorithms will be made available through a uniform and convenient user interface. Common normalization, discretization and visualization methods, as well as some newly developed validation methods, will also be implemented. The ultimate goal is a comprehensive toolbox for bicluster calculation, visualization and validation.
LMU Project Members
International Cooperations
R Packages
- biclust: Biclustering in R
A general framework for biclustering in R. The main function, biclust, provides several algorithms for finding biclusters in two-dimensional data: Cheng and Church, Spectral, Plaid Model, Xmotifs and Bimax. In addition, the package provides methods for data preprocessing (normalization and discretization), visualization, and validation of bicluster solutions.
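To give a feel for what one of these algorithms optimizes, the sketch below computes the mean squared residue that Cheng and Church use to score a candidate bicluster: each entry is compared with its row mean, its column mean and the overall mean of the submatrix, and a score near zero indicates a coherent (additive) bicluster. This is an illustration in plain Python, not code from the biclust package; the function name mean_squared_residue and the toy matrix are assumptions made for this example.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix given by `rows` and `cols`.

    `matrix` is a list of equal-length lists of numbers; `rows` and `cols`
    are lists of indices selecting the candidate bicluster.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])

    # Row means, column means and overall mean of the selected submatrix.
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows

    # Average the squared residue of every entry.
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Toy data (hypothetical): the top-left 3 x 3 block is a row effect plus
# a column effect, the remaining entries are arbitrary.
data = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 1.0],
    [4.0, 5.0, 6.0, 7.0],
    [8.0, 0.0, 5.0, 2.0],
]

coherent = mean_squared_residue(data, rows=[0, 1, 2], cols=[0, 1, 2])
whole = mean_squared_residue(data, rows=[0, 1, 2, 3], cols=[0, 1, 2, 3])
print("3 x 3 block:", coherent)
print("full matrix:", whole)
```

The additive block scores essentially zero, while the full matrix scores clearly higher. The sketch only evaluates the score for subsets chosen by hand; a Cheng and Church style search looks for row and column subsets whose score stays below a user-chosen threshold.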
Publications
- Sara Dolnicar, Sebastian Kaiser, Katie Lazarevski and Friedrich Leisch (2010), Biclustering: Overcoming data dimensionality problems in market segmentation, accepted for publication (January 2010) in Journal of Travel Research.
- Sebastian Kaiser and Friedrich Leisch (2008), A Toolbox for Bicluster Analysis in R, Compstat 2008: Proceedings in Computational Statistics, Paula Brito (ed.), Physica Verlag, Heidelberg, Germany.
Technical report available at: http://epub.ub.uni-muenchen.de/3293/
Talks
- Sebastian Kaiser and Friedrich Leisch. biclust - An R Package for Biclustering. Presented at "Bicluster Workshop", Hasselt, Belgium, November 5-6, 2010.
- Sebastian Kaiser and Friedrich Leisch. Quest: A Generalized Motif Bicluster Algorithm. Presented at "useR! 2009", Rennes, France, July 7-10, 2009.
- Sebastian Kaiser and Friedrich Leisch. Biclustering in data driven market segmentation. Presented at "IFCS/GFKL 2009", Technische Universität, Dresden, Germany, March 13-18, 2009.
- Sebastian Kaiser and Friedrich Leisch. A toolbox for bicluster analysis in R. Presented at "COMPSTAT 2008", Porto, Portugal, August 24-29, 2008.
- Sebastian Kaiser and Friedrich Leisch. biclust - a toolbox for bicluster analysis in R. Presented at "useR! 2008", Technische Universität Dortmund, Germany, August 12-14, 2008.
- Sebastian Kaiser and Friedrich Leisch. Benchmarking bicluster algorithms. Presented at "GFKL 2008", Helmut-Schmidt-Universität, Hamburg, Germany, July 16-18, 2008.
- Sebastian Kaiser and Friedrich Leisch. biclust: A toolbox for bicluster analysis in R. Presented at "Statistical Computing 2008", Schloß Reisensburg, Günzburg, Germany, June 1-4, 2008.