||February 8, 2007
||Dr. Jean-Philippe Vert, Ecole des Mines de Paris
||Kernel methods in computational biology
|| Many problems in computational biology and chemistry can be formalized
as classical statistical problems, e.g., pattern recognition, regression
or dimension reduction, with the caveat that the data are often not
vectors. Indeed, objects such as gene sequences, small molecules, protein
3D structures or phylogenetic trees, to name just a few, have particular
structures which contain relevant information for the statistical
problem but can hardly be encoded into finite-dimensional vector
representations.
Kernel methods are a class of algorithms well suited for such problems.
Indeed, they extend the applicability of many statistical methods
initially designed for vectors to virtually any type of data, without
the need for explicit vectorization of the data. The price to pay for
this extension to non-vectors is the need to define a positive definite
kernel between the objects, formally equivalent to an implicit
vectorization of the data. The "art" of kernel design for various
objects has witnessed important advances in recent years, resulting in
many state-of-the-art algorithms in computational biology and chemistry,
as well as many other fields.
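To make the idea of a kernel as an implicit vectorization concrete, here is a minimal sketch, not taken from the course itself. It assumes a simple k-mer counting kernel on sequences (in the spirit of the "spectrum" kernels used for biological sequences): each sequence is implicitly mapped to the vector of counts of its length-k substrings, and the kernel is the inner product of two such vectors. The function names, the choice k=2 and the toy sequences are illustrative assumptions.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every contiguous substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(x, y, k=2):
    """Inner product of the k-mer count vectors of x and y.

    The count vectors are never built over the full k-mer alphabet;
    only the k-mers that actually occur in x are visited.
    """
    cx, cy = kmer_counts(x, k), kmer_counts(y, k)
    return sum(n * cy[kmer] for kmer, n in cx.items())


def gram_matrix(seqs, k=2):
    """Matrix of pairwise kernel values, the only input a kernel method needs."""
    return [[spectrum_kernel(a, b, k) for b in seqs] for a in seqs]


# Hypothetical toy sequences, chosen only for illustration.
sequences = ["GATTACA", "GATTTCA", "CCCGGG"]
K = gram_matrix(sequences)
for row in K:
    print(row)
```

Because the kernel is defined as an inner product of feature vectors, it is positive definite by construction, and the resulting Gram matrix could be handed to any kernel method that accepts a precomputed kernel.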
The goal of this short course is to present the mathematical foundations
of kernel methods, as well as the main approaches that have emerged so
far in kernel design. The relevance of these methods will be illustrated
by several examples in computational biology.