multiClust - multiClust: An R-package for Identifying Biologically Relevant
Clusters in Cancer Transcriptome Profiles
Clustering is carried out to identify patterns in
transcriptomics profiles to determine clinically relevant
subgroups of patients. Feature (gene) selection is a critical
and integral part of the process. Currently, there are many
feature selection and clustering methods to identify the
relevant genes and perform clustering of samples. However,
choosing an appropriate methodology is difficult. In addition,
the available packages do not support an extensive range of
feature selection methods. Hence, we developed an integrative
R-package called multiClust that allows researchers to
experiment easily with combinations of gene selection and
clustering methods. Using multiClust, we
identified the best performing clustering methodology in the
context of clinical outcome. Our observations demonstrate that
simple methods such as variance-based ranking perform well on
the majority of data sets, provided that the appropriate number
of genes is selected. However, other gene ranking and
selection methods remain relevant, as no single methodology
works for all studies.
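
As an illustration of the variance-based ranking mentioned above, the following is a minimal sketch using base R only. It does not call multiClust's own functions; the simulated matrix, the helper name select_top_variance, and the choice of average-linkage hierarchical clustering with two clusters are all assumptions made for the example.

```r
# Simulated expression matrix (not real data): rows are genes, columns are samples.
set.seed(1)
n_genes <- 200
n_samples <- 20
expr <- matrix(
  rnorm(n_genes * n_samples),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("sample", seq_len(n_samples)))
)

# Shift the first 10 genes in the last 10 samples so that these genes
# vary strongly across samples and separate two sample groups.
expr[1:10, 11:20] <- expr[1:10, 11:20] + 5

# Hypothetical helper: rank genes by their variance across samples
# and keep the k most variable ones.
select_top_variance <- function(x, k) {
  gene_var <- apply(x, 1, var)
  keep <- order(gene_var, decreasing = TRUE)[seq_len(k)]
  x[keep, , drop = FALSE]
}

top_expr <- select_top_variance(expr, k = 10)

# Cluster the samples (columns) on the selected genes only.
hc <- hclust(dist(t(top_expr)), method = "average")
clusters <- cutree(hc, k = 2)

# Cross-tabulate cluster sizes.
table(clusters)
```

The number of genes retained (k) is the parameter the abstract highlights as decisive, so in practice it would be varied and the resulting clusters compared against clinical outcome rather than fixed as it is here.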