Florence d'Alché Buc, Université d'Evry-Val d'Essonne, Evry, France.

Protein-protein network inference with regularized output and input kernel methods

Prediction of a physical interaction between two proteins has been addressed in the context of supervised learning, unsupervised learning and, more recently, semi-supervised learning, using various sources of information (genomic, phylogenetic, protein localization and function). The problem can be seen as a kernel matrix completion task if one defines a kernel that encodes similarity between proteins as nodes in a graph or, alternatively, as a binary supervised classification task where the inputs are pairs of proteins.
In this talk, we first review existing work (matrix completion, SVM for pairs, metric learning, training set expansion), identifying the relevant features of each approach. We then define the framework of output kernel regression (OKR), which uses the kernel trick in the output feature space. After recalling the results obtained so far with tree-based output kernel regression methods, we develop a new family of methods based on kernel ridge regression that benefit from the use of kernels in both the input feature space and the output feature space. The main advantage of such methods is that imposing various regularization constraints still leads to closed-form solutions. In particular, we show how this approach makes it possible to handle unlabeled data in a transductive setting of the network inference problem, and multiple networks in a multi-task-like inference problem.
New results on simulated data and yeast data illustrate the talk.
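
As a rough illustration of the closed-form property mentioned in the abstract, the following minimal sketch shows ordinary kernel ridge regression in NumPy. It is not the output kernel regression method presented in the talk: the Gaussian kernel, the function names (`gaussian_kernel`, `krr_fit`, `krr_predict`), the toy sine data and the values of `lam` and `gamma` are all assumptions made for the example.

```python
import numpy as np

def gaussian_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def krr_fit(X, Y, lam=1e-3, gamma=1.0):
    """Closed-form kernel ridge solution: alpha = (K + lam * I)^{-1} Y."""
    K = gaussian_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(K.shape[0]), Y)

def krr_predict(X_train, alpha, X_new, gamma=1.0):
    """Predict with f(x) = sum_i alpha_i k(x_i, x)."""
    return gaussian_kernel(X_new, X_train, gamma) @ alpha

# Toy data (assumed for illustration only): a noiseless sine curve.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
Y = np.sin(X)

alpha = krr_fit(X, Y, lam=1e-3, gamma=0.5)
pred = krr_predict(X, alpha, X, gamma=0.5)
train_mse = float(np.mean((pred - Y) ** 2))
print("training MSE:", train_mse)
```

The single linear solve in `krr_fit` is the whole training step; this is the sense in which a ridge-type penalty keeps the solution in closed form.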

Nir Friedman, The Hebrew University of Jerusalem, Jerusalem, Israel.

Exploring transcription regulation through cell-to-cell variability

The regulation of cellular protein levels is a complex process involving many regulatory mechanisms. These regulatory mechanisms introduce a cascade of stochastic events leading to variability of protein levels between cells. Previous studies have shown that perturbing genes involved in transcription regulation alters variability of protein levels, but to date, there has been no systematic characterization of these effects. Here we utilize single-cell expression levels of two fluorescent reporters under a wide range of genetic perturbations in Saccharomyces cerevisiae to identify proteins that affect expression variability.
We introduce a computational methodology to determine the variability introduced by each perturbation, and distinguish between global variability, affecting both reporters in a coordinated manner, and local variability, affecting individual reporters independently. Classifying genes by their variability phenotype identifies functionally coherent groups, which broadly correlate with the different stages of transcriptional regulation. Specifically, we find that perturbations of processes related to DNA maintenance, chromatin regulation and RNA synthesis affect local variability, while perturbations of processes related to protein synthesis and transport, cell morphology and cell size affect global variability. In addition, we find that perturbations of many processes related to chromatin regulation affect both global and local variability. Finally, we demonstrate that the variability phenotypes of different protein complexes provide insights into their cellular functions. Our methodology provides tools for examining emerging data on variability, and establishes the utility of this phenotype as a tool for dissecting the regulatory mechanisms involved in gene expression.

Ursula Kummer, BIOQUANT, University of Heidelberg, Germany.

Computational environments for modeling biochemical networks

Computational modeling is an integral and crucial part of systems biology. It relies on accessible and user-friendly software for model setup, model management and model analysis. Here, two systems are presented that have been implemented to meet these needs. The first, COPASI, has been available since 2004 and is a standalone software suite that encompasses many of the commonly used algorithms and approaches in computational modeling. Among other things, it supports parameter estimation for models on the basis of experimental data sets, using a variety of methods. The second, SYCAMORE, is a web-based application designed for database-driven modeling: it interacts directly with databases for enzymatic kinetics and with tools that estimate parameters from protein structural data. Both systems are continually refined and extended with new features.

Hans Lehrach, Max Planck Institute for Molecular Genetics, Berlin, Germany.

Deep sequencing and systems biology: steps on the way to an individualised treatment of cancer patients

Biological processes are driven by complex networks of interactions between molecular and cellular components. Predicting the outcome of potential disturbances is of prime importance in order to prevent disease, as well as to identify possible therapies for diseases that are already present. To predict the behaviour of such complex networks, we will have to develop general models of the processes involved, based on information on pathways derived from genetic and molecular approaches; to ‘individualise’ these models by applying ‘genomics’-scale analysis techniques (genomics, e.g. genome and/or transcriptome analysis by next-generation sequencing); and to explore the behaviour of these models computationally (systems biology). We are using a combination of high-throughput sequencing of the genome and transcriptome of both tumor and patient to establish predictive models (virtual patients), which ultimately will reflect the response of real patients to specific therapies in oncology and other areas of medicine.

Vebjorn Ljosa, The Broad Institute of MIT and Harvard, USA.

Automatic quantification of subtle cellular phenotypes in microscopy-based high-throughput experiments

Microscopy-based high-throughput experiments can provide a view into biological responses and states at the resolution of single cells. CellProfiler, our open-source image-analysis software, has become widely used by biologists to design custom analysis pipelines for complex high-throughput assays. I will discuss our work in progress to automatically quantify the prevalence of subtle cellular phenotypes in high-throughput samples of cultured cells. I will also touch briefly on the use of machine learning to improve the accuracy and robustness of CellProfiler's image segmentation.
Our classification tool, CellProfiler Analyst, enables a biologist to train a boosting classifier iteratively to detect rare, complex phenotypes, and its usefulness has been demonstrated in several high-throughput screens. Here, I will describe a method to learn phenotypes without requiring hand-labeled cells for training. Instead, a classifier is trained from negative and positive controls in the experiment, where the positives are known to be enriched in the phenotype of interest, even if only slightly (e.g., 55% vs. 45% penetrance). By nonlinearly projecting cells into a random feature space, we can use efficient linear methods but still benefit from nonlinear notions of similarity, and can overcome experimental noise by training on millions of cells. Using the resulting classifier to assign soft labels to each cell in the experiment, we can identify enriched samples ("hits") nonparametrically. Furthermore, we are developing techniques to automatically identify relevant cellular phenotypes in large-scale chemical profiling experiments.
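
The idea of projecting into a random feature space so that a linear method can capture nonlinear similarity can be sketched as follows. This is only an illustration and not the speaker's implementation: it assumes random Fourier features as the projection and a ridge-regularized least-squares classifier as the linear method, and it uses synthetic XOR-like points rather than per-cell measurements. The names `random_fourier_features`, `gamma`, `D` and `lam` are hypothetical choices for the example.

```python
import numpy as np

def random_fourier_features(X, W, b):
    """Map X to D random cosine features approximating a Gaussian kernel."""
    D = W.shape[1]
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

rng = np.random.default_rng(0)

# Synthetic stand-in for per-cell feature vectors (assumption): an XOR-like
# problem that no linear classifier on the raw features can solve.
X_train = rng.uniform(-1, 1, size=(2000, 2))
y_train = np.sign(X_train[:, 0] * X_train[:, 1])
X_test = rng.uniform(-1, 1, size=(1000, 2))
y_test = np.sign(X_test[:, 0] * X_test[:, 1])

# Random projection parameters for the kernel exp(-gamma * ||x - y||^2).
gamma, D, lam = 2.0, 300, 1e-3
W = rng.normal(scale=np.sqrt(2 * gamma), size=(2, D))
b = rng.uniform(0, 2 * np.pi, size=D)

Z_train = random_fourier_features(X_train, W, b)
Z_test = random_fourier_features(X_test, W, b)

# Linear ridge classifier in the random feature space, solved in closed form.
w = np.linalg.solve(Z_train.T @ Z_train + lam * np.eye(D), Z_train.T @ y_train)
accuracy = float(np.mean(np.sign(Z_test @ w) == y_test))
print("held-out accuracy:", accuracy)
```

The cost of training grows with the number of random features `D` rather than with the square of the number of samples, which is what makes this kind of approach attractive when training on very many cells.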

Oral presentations

The following papers have been accepted for oral presentation (ordered by first author's last name):

Sumeet Agarwal, Gabriel Villar and Nick Jones.
High throughput network analysis
Luigi Cerulo and Michele Ceccarelli.
On learning gene regulatory networks with only positive examples
Keith Harris and Mark Girolami.
An integrated generative and discriminative Bayesian model for binary classification
Anne-Claire Haury and Jean-Philippe Vert.
On the stability and interpretability of prognosis signatures in breast cancer
Vân Anh Huynh-Thu, Alexandre Irrthum, Louis Wehenkel and Pierre Geurts.
Inferring regulatory networks from expression data using tree-based methods
Marvin Meeng, Arno Knobbe, Arne Koopman, Jan Bert van Klinken and Sjoerd A. A. van den Berg.
Equation discovery for whole-body metabolism modelling
Sofia Mosci, Silvia Villa, Alessandro Verri and Lorenzo Rosasco.
A fast algorithm for structured gene selection
Axel-Cyrille Ngonga Ngomo.
Parameter-free clustering of protein-protein interaction graphs
Andrea Ocone and Guido Sanguinetti.
Inference in hierarchical transcriptional network motifs
Manfred Opper and Andreas Ruttor.
A note on Inference for reaction kinetics with monomolecular reactions
Hossein Rahmani, Hendrik Blockeel and Andreas Bender.
Collaboration-based function prediction in protein-protein interaction networks
Daniel Silk, Paul Kirk, Christopher Barnes and Michael Stumpf.
Automated detection of chaotic and oscillatory regimes
Hongyu Su, Markus Heinonen and Juho Rousu.
Multilabel prediction of drug activity
Andrea Szabóová, Ondrej Kuželka, Filip Železný and Jakub Tolar.
Prediction of DNA-binding proteins from structural features
Katerina Tashkova, Peter Korošec, Jurij Šilc, Ljupčo Todorovski and Sašo Džeroski.
Parameter estimation in an endocytosis model
Celine Vens, Etienne Danchin and Marie-Noelle Rosso.
Identifying proteins involved in parasitism by discovering degenerated motifs

Poster presentations

The following posters will be presented (ordered by first author's last name):

Stuart Aitken. Applications of nested sampling optimisation to systems biology models
Peter Antal, András Gézsi, András Millinghoffer, Gergely Hajós, Csaba Szalai and András Falus. On the applicability of Bayesian univariate methods as filters in complex GWAS analysis
Peter Antal, Peter Sárközy, Zoltan Balazs, Gergely Hajós, Csaba Szalai and András Falus. Haplotype- and pathway-based aggregations for the Bayesian analysis of rare variants
H.M. Shahzad Asif and Guido Sanguinetti. Large scale learning of combinatorial transcriptional dynamics from gene expression
Annalisa Barla, Sofia Mosci, Lorenzo Rosasco, Alessandro Verri, Paolo Fardin, Andrea Cornero, Massimo Acquaviva and Luigi Varesio. Combining l1-l2 regularization with biological prior for multi-level hypoxia signature in Neuroblastoma
Kim Batselier and Bart De Moor. Maximum likelihood and polynomial system solving
Nicolas Brunel and Florence d'Alché-Buc. Flow-based Bayesian estimation of differential equations for modeling biological networks
Borja Calvo and Rubén Armañanzas. Module network-based classifiers for redundant domains
Lore Cloots, Hui Zhao, Tim Van den Bulcke, Yan Wu, Riet De Smet, Valerie Storms, Pieter Meysman, Kristof Engelen and Kathleen Marchal. Query-based biclustering of gene expression data using probabilistic relational models
Luna De Ferrari, Stuart Aitken, Jano van Hemert and Igor Goryanin. Multi-label prediction of enzyme classes using InterPro signatures
Frank Dondelinger, Sophie Lebre and Dirk Husmeier. Reconstructing developmental gene networks using heterogeneous dynamic Bayesian networks with information sharing
Marco Grimaldi. On the assessment of variability factors in computational network inference
Katja Hansen, David Baehrens and Klaus-Robert Müller. Explaining kernel based predictions in drug design
Steven Kiddle, Richard Hickman, Katherine Denby and Sach Mukherjee. From gene expression to predicted gene regulation in the defence response of Arabidopsis to infection by Botrytis cinerea
Dragi Kocev, Bernard Ženko, Petra Paul, Coenraad Kuijl, Jacques Neefjes, and Sašo Džeroski. Predictive clustering relates gene annotations to phenotype properties extracted from images
Tomasz Konopka. Automated analysis of biological oscillator models
Ondrej Kuželka and Filip Železný. Shrinking covariance matrices using biological background knowledge
Viet Anh Nguyen and Pietro Liò. Filling in the gaps of biological network
Andreas Ruttor, Florian Stimberg and Manfred Opper. Comparing diffusion and weak noise approximations for inference in reaction models
Chloé Sarnowski, Pablo Carbonell, Mohamed Elati and Jean-Loup Faulon. Prediction of catalytic efficiency to discover new enzymatic activities
Marie Schrynemackers, Pierre Geurts, Louis Wehenkel and M. Madan Babu. Prediction of genetic interactions in yeast using machine learning
Kana Shimizu and Koji Tsuda. All pairs similarity search for short reads
Ivica Slavkov, Darko Aleksovski, Nigel Savage, Kimberley V. Walburg, Tom H.M. Ottenhoff and Sašo Džeroski. Discovering groups of genes with coordinated response to M. leprae infection
Helene Thygesen, Peter-Bram 't Hoen and A.H. Koos Zwinderman. A hierarchical Poisson model for next generation cDNA sequencing libraries
Jimmy Vandel, Simon De Givry, Brigitte Mangin and Matthieu Vignes. Gene regulatory network reconstruction with a combination of genetics and genomics data

Friday, 15 October

08:30-17:00  Registration desk open
09:00-09:10  Welcome
09:10-10:10  Invited talk: Nir Friedman. Exploring transcription regulation through cell-to-cell variability
10:10-11:00  Session 1
  10:10-10:35  Celine Vens, Etienne Danchin and Marie-Noelle Rosso. Identifying proteins involved in parasitism by discovering degenerated motifs
  10:35-11:00  Sumeet Agarwal, Gabriel Villar and Nick Jones. High throughput network analysis
11:00-11:30  Coffee break
11:30-12:30  Invited talk: Ursula Kummer. Computational environments for modeling biochemical networks
12:30-13:20  Session 2
  12:30-12:55  Marvin Meeng, Arno Knobbe, Arne Koopman, Jan Bert van Klinken and Sjoerd van den Berg. Equation discovery for whole-body metabolism modelling
  12:55-13:20  Katerina Tashkova, Peter Korošec, Jurij Šilc, Ljupčo Todorovski and Sašo Džeroski. Parameter Estimation in an Endocytosis Model
13:20-14:45  Session 3: Poster session A
  13:20-14:45  Poster presentations with catered lunch.
14:45-15:45  Invited talk: Hans Lehrach. Deep sequencing and systems biology: steps on the way to an individualised treatment of cancer patients
15:45-16:15  Coffee break
16:15-17:30  Session 4
  16:15-16:40  Vân Anh Huynh-Thu, Alexandre Irrthum, Louis Wehenkel and Pierre Geurts. Inferring regulatory networks from expression data using tree-based methods
  16:40-17:05  Andrea Ocone and Guido Sanguinetti. Inference in hierarchical transcriptional network motifs
  17:05-17:30  Luigi Cerulo and Michele Ceccarelli. On learning gene regulatory networks with only positive examples
17:30-18:00  Community meeting
19:30        Conference dinner

Saturday, 16 October

09:00-10:00  Invited talk: Florence d'Alché Buc. Protein-protein network inference with regularized output and input kernel methods
10:00-10:50  Session 5
  10:00-10:25  Hossein Rahmani, Hendrik Blockeel and Andreas Bender. Collaboration-based function prediction in protein-protein interaction networks
  10:25-10:50  Andrea Szabóová, Ondrej Kuželka, Filip Železný and Jakub Tolar. Prediction of DNA-binding proteins from structural features
10:50-11:20  Coffee break
11:20-12:20  Invited talk: Vebjorn Ljosa. Automatic quantification of subtle cellular phenotypes in microscopy-based high-throughput experiments
12:20-13:10  Session 6
  12:20-12:45  Anne-Claire Haury and Jean-Philippe Vert. On the stability and interpretability of prognosis signatures in breast cancer
  12:45-13:10  Sofia Mosci, Silvia Villa, Alessandro Verri and Lorenzo Rosasco. A fast algorithm for structured gene selection
13:10-14:15  Session 7: Poster session B
  13:10-14:15  Poster presentations with catered lunch.
14:15-15:05  Session 8
  14:15-14:40  Daniel Silk, Paul Kirk, Christopher Barnes and Michael Stumpf. Automated detection of chaotic and oscillatory regimes
  14:40-15:05  Manfred Opper and Andreas Ruttor. A note on Inference for reaction kinetics with monomolecular reactions
15:05-15:30  Coffee break
15:30-16:20  Session 9
  15:30-15:55  Hongyu Su, Markus Heinonen and Juho Rousu. Multilabel prediction of drug activity
  15:55-16:20  Keith Harris and Mark Girolami. An integrated generative and discriminative Bayesian model for binary classification
16:20        Closing remarks
