Extracting knowledge from large databases of examples through kernel methods has proved to be an effective approach, notably due to the versatility and good generalisation abilities of kernel algorithms such as the support vector machine. A basic ingredient of such methods is the kernel, which encodes a chosen measure of how similar two objects are. For complex objects, such as those encountered in biology (sequences, graphs, DNA chips), defining a valid kernel, i.e. tuning this similarity to be both meaningful and fast to compute, remains however a major issue. We review in this talk how a modelling approach through probabilistic models can be used to reach both objectives, presenting earlier work as well as recent results published in the Journal of Machine Learning Research by the author, Kenji Fukumizu (ISM, Tokyo) and Jean-Philippe Vert (ENSMP, Paris).
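To make the idea concrete, here is a minimal sketch of one standard way to build a kernel from a probabilistic model; it is not the construction from the JMLR paper, and the function names are illustrative. Each object (here, a sample of real numbers) is summarised by a fitted Gaussian, and the kernel between two objects is the Bhattacharyya affinity between the two fitted densities, a special case of the probability product kernel of Jebara and Kondor, which is positive definite and admits a closed form for Gaussians, so it is both meaningful and fast to compute.

```python
import math

def fit_gaussian(xs):
    """Fit a 1-D Gaussian to a sample by maximum likelihood
    (sample mean and variance), guarding against zero variance."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    return mu, max(var, 1e-12)

def bhattacharyya_kernel(xs, ys):
    """Kernel between two samples: fit a Gaussian to each, then
    return the Bhattacharyya affinity between the fitted densities,
        K = integral of sqrt(p(x) * q(x)) dx,
    which for two Gaussians N(mu1, v1), N(mu2, v2) equals
        sqrt(2 * sqrt(v1 * v2) / (v1 + v2))
        * exp(-(mu1 - mu2)**2 / (4 * (v1 + v2)))."""
    mu1, v1 = fit_gaussian(xs)
    mu2, v2 = fit_gaussian(ys)
    s = v1 + v2
    return (math.sqrt(2.0 * math.sqrt(v1 * v2) / s)
            * math.exp(-(mu1 - mu2) ** 2 / (4.0 * s)))

# Example usage: similar samples score close to 1, dissimilar ones near 0.
print(bhattacharyya_kernel([0.1, 0.3, 0.2], [0.2, 0.25, 0.15]))
print(bhattacharyya_kernel([0.1, 0.3, 0.2], [5.1, 4.9, 5.0]))
```

The design choice illustrated here is the general recipe discussed in the talk: rather than comparing raw objects directly, one compares the probabilistic models they induce, which yields a valid (positive definite) kernel whenever the affinity between densities is itself positive definite.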