A current problem in biological and medical research is how to use
existing biological knowledge and heterogeneous experimental data in
making inferences on new data. We study new computational methods and
theory for the fusion of multiple biological information sources with
partially relevant background data from existing and new databanks. We
argue that, by using the available public or private background
information from hundreds of different situations or conditions, it is
possible both to complement the scarce existing data and to focus the
analysis on relevant variables.
The project complements the task-dependent bioinformatics methods, which
remain necessary in all biological and medical research problems, with
methods that address a key statistical limitation of current studies
that use high-throughput measurement techniques: large p, small n. It is
very hard to build trustworthy computational models or to make
statistically significant diagnoses from only a few samples (small n)
when the number of studied genes or metabolites (p) is large.
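
The large p, small n limitation can be illustrated with a minimal
sketch. It is not one of the project's methods. The sample sizes, the
random "expression" data, and the use of ordinary least squares are all
assumptions chosen for illustration. With far more variables than
samples, a linear model fits pure noise perfectly on the training
samples, yet carries no predictive value on new samples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 20 samples (small n), 1000 measured variables (large p).
n, p = 20, 1000

# Simulated measurements and an outcome that is pure noise,
# i.e. there is no true relation between X and y.
X_train = rng.standard_normal((n, p))
y_train = rng.standard_normal(n)

# Minimum-norm least-squares fit; with p > n the system is underdetermined.
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Training error: the model reproduces the noise essentially exactly.
train_mse = np.mean((X_train @ coef - y_train) ** 2)

# New samples from the same (signal-free) process.
X_test = rng.standard_normal((1000, p))
y_test = rng.standard_normal(1000)
test_mse = np.mean((X_test @ coef - y_test) ** 2)

print(f"training MSE: {train_mse:.2e}")
print(f"test MSE:     {test_mse:.2f}")
```

The training error is numerically zero while the test error stays on
the order of the outcome variance, which is why additional information,
such as relevant background data, is needed to constrain the model.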
This project is partially based on a previous project in the
programme of the
Academy of Finland.