Read2Tree infers phylogenetic trees from raw sequencing reads, quickly and easily

• Authors: Christophe Dessimoz & Fritz Sedlazeck •

We just published a method to build phylogenetic trees directly from raw reads, bypassing time-consuming steps such as genome assembly. This post gives the short story and the backstory. In particular, find out below what Read2Tree has in common with “Smoke on the Water” from the band Deep Purple.

In biology, phylogenetic trees are everywhere. They help us understand the relationships between species, genes, or cells—how they evolved, and how they’re related.

The sequencing revolution provides the raw material to infer phylogenetic trees. However, building state-of-the-art trees requires a series of tedious steps: read curation, de novo assembly, gene annotation, ortholog identification, and finally tree inference. These steps can take many months to run (millions of CPU hours are not uncommon) and require specialised knowledge to oversee.

That’s where Read2Tree comes in. Our new approach to tree inference bypasses the usual steps of genome assembly, annotation, and orthology inference. Instead, it uses existing knowledge of the protein sequence universe to directly reconstruct comprehensive sequence alignments from raw sequencing reads.

The approach is vastly faster than traditional methods and in many cases more accurate (the exception being when sequencing coverage is high and reference species are very distant). Read2Tree is also flexible, working with genome and transcriptome data, short and long reads, and sequencing coverage as low as 0.2x.
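To give a feel for what a fractional sequencing coverage means, here is a minimal sketch of the standard depth-of-coverage estimate (total sequenced bases divided by genome size). The function name and the example numbers are illustrative assumptions, not values from the Read2Tree paper.

```python
def mean_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average depth of coverage: total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return num_reads * read_length / genome_size


# Hypothetical example: 40,000 reads of 150 bp from a 12 Mb genome.
coverage = mean_coverage(40_000, 150, 12_000_000)
print(f"{coverage}x")  # 0.5x
```

At a coverage below 1x, most positions in the genome are not sequenced even once, which is why inferring a tree from such data is notable.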

We were encouraged by the buzz the Read2Tree manuscript elicited on bioRxiv last year, and are delighted it has now been published in Nature Biotechnology.

What is Read2Tree good for?

A nice illustration of Read2Tree’s potential was the reconstruction of a phylogeny of coronaviruses, which placed on the same tree diverse Coronaviridae sequences as well as more than 10,000 raw SARS-CoV-2 datasets from the Sequence Read Archive. The reconstructed tree was consistent with the lineage classification obtained from the UniProt reference proteomes, accurately recovering the main coronavirus genera and all subgenera (Figure 1). At the same time, the same phylogeny accurately clustered the sequences according to the CDC variants-of-concern classification. These results demonstrate the versatility and scalability of Read2Tree, making it suitable for both zoonotic surveillance and human epidemiology.

10k-sample COVID tree

Figure 1—Zoomed-in display of a tree inferred using Read2Tree on 10,283 whole-genome SARS-CoV-2 samples. Classification in colour was obtained from [https://harvestvariants.info](https://harvestvariants.info); grey leaves are unclassified according to the CDC label. The colour clustering shows that the Read2Tree-based tree recovers a consistent classification.

The ability to reconstruct phylogenetic trees from raw reads has additional advantages. Some genomes are deposited with poor or even entirely absent protein annotation sets. Processing genomes directly from raw reads can avoid this limitation and also decrease biases that arise from relying too heavily on specific reference genomes. Although some efforts have been made to “dehumanize” non-human great ape genomes, other clades still face similar biases that can be significantly reduced by processing raw reads.

Who might find it useful?

We think Read2Tree will be especially useful for small labs with limited bioinformatics expertise and computational resources, allowing them to perform state-of-the-art phylogenomics on particular species or environments of interest.

But it’s not just small labs that can benefit from Read2Tree. Large consortia can also use it to regularly update their trees as new genomes are sequenced. This is especially important as more and more projects around comparative genomics are underway, such as the Earth BioGenome, the Darwin Tree of Life, or the European Reference Genome Atlas projects.

In addition, Read2Tree’s ability to infer trees from much lower coverage than traditional methods means it can also be useful for quality control early in the process. This makes it a valuable tool for environmental and metagenomic applications, especially when combined with genome binning techniques.

Overall, Read2Tree is a powerful method for inferring phylogenetic trees directly from raw sequencing reads. We hope it will help make phylogenetic tree inference faster, more accurate, and more accessible to a wider range of researchers.

What’s next?

Now that the introductory Read2Tree paper is published, we are excited to explore new potential applications that we haven’t been able to tackle so far. For instance, we have already received inquiries from researchers interested in using Read2Tree for ancient DNA applications or for monitoring systems that require fast turnaround time and low coverage.

Moving forward, we have two main goals. First, we aim to expand Read2Tree’s capabilities to handle multi-species samples, which will enable an even broader range of applications in the metagenomics field. While long-read applications may offer the most benefit, we are confident that Read2Tree’s ability to perform well with short reads will also prove valuable in disentangling multiple species.

Secondly, we plan to explore the use of Read2Tree in single-cell sequencing. This rapidly growing field involves sequencing individual cells, including cancer cells, and analysing their genetic information. Given Read2Tree’s ability to operate with low coverage levels (down to 0.2x), we believe it could facilitate fast and accurate characterization of tumour or cell evolution.

We hope that Read2Tree will help streamline and democratise comparative genomics analyses. We are excited to see how researchers will apply this tool to further advance our understanding of genetics and evolution.

What’s the backstory?

Our two labs (Fritz Sedlazeck’s and Christophe Dessimoz’s) have been collaborating for many years, and we’ve always enjoyed exchanging ideas even though our research interests are quite diverse. One of our long-standing interests has been how to combine our expertise in sequence analysis and ortholog comparison to develop new methodologies and gain new insights into biology.

It was during one of Fritz’s visits to Christophe’s lab in Lausanne, Switzerland, that we started brainstorming ideas for a project that led to Read2Tree. Our goal was to overcome the limitations and bottlenecks of comparative genomics. We had some amazing cheese risotto, and the beautiful scenery fueled our discussions further (Figure 2).

View of Lake Geneva from Montreux

Figure 2 — Fritz alleges that the epiphany of Read2Tree took place with this view from his hotel room in Montreux, Switzerland, during a collaborative visit to Christophe’s group. It’s not entirely implausible, considering this very view [inspired the song “Smoke on the Water” by Deep Purple](https://en.wikipedia.org/wiki/Smoke_on_the_Water#History).

 

David Dylus, the first author, was convinced that it was possible to bring our ideas to life, although he did not anticipate how much time and effort it would take (Figure 3). Even after he moved on to a new role in the pharmaceutical industry, he continued to work on Read2Tree after regular work hours. And when the COVID-19 pandemic hit, we had to face additional challenges, such as maintaining regular meetings and pushing the manuscript forward while not compromising on quality. We also faced technical issues, such as hard disk crashes and cluster updates that led to data loss, but David hung on.

Completing the paper was not an easy task, and one of the biggest challenges was organising and identifying all of the SRA data sets, including those related to yeast and COVID-19. Despite these challenges, we were able to bring the project to completion. It was a special joy to present the work at ISMB 2022, where Fritz and Christophe had the wonderful opportunity to meet in person, and we continued to discuss our work while enjoying good food and drinks by beautiful Lake Mendota in Madison, Wisconsin.

In summary, nice food and lakeside views were instrumental in the making of Read2Tree.

David miming at SIB 20th anniversary party

Figure 3 — First author David Dylus performing on stage (centre, crouching) on the occasion of SIB Swiss Institute of Bioinformatics’s 20th anniversary—a period of rapid progress in the development of Read2Tree. Though no-one is entirely certain, rumour has it that David is miming “sipping a cup of tea while looking into the distance”, in line with our theme of sustenance, inspiring landscapes, and scientific progress.

 

Note this blog post was first published on the Nature Communities blog here.

Reference

Dylus, D., Altenhoff, A., Majidian, S. et al. Inference of phylogenetic trees directly from raw sequencing reads using Read2Tree. Nat Biotechnol (2023). doi:10.1038/s41587-023-01753-4.

 

Share or comment:

To be informed of future posts, sign up to the low-volume blog mailing-list, subscribe to the blog's RSS feed, or follow us on Twitter. To read old posts, check out the index here.


This year, the MCEB international conference was held in Switzerland for the first time in its history.

For five days, from June 26th to 30th 2022, experts in mathematical and computational evolutionary biology from all over the world shared their passion in Château-d’Oex, in the heart of the welcoming mountains of the regional natural park of Gruyeres.

Here is a short summary of this annual not-to-be-missed event for evolutionary biology aficionados.

The 2022 edition of the international MCEB conference brought together a hundred scientists from diverse disciplines and at different stages of their career, from PhD students to world-renowned senior scientists. For five consecutive days, Château-d’Oex was thus the improbable scene of inspiring exchanges about evolution between mathematicians, computational biologists, evolutionary biologists, ecologists, epidemiologists and cancer biologists.


View from the conference venue (left) and cheese making social activity (right).

In total, 6 one-hour talks and 20 short talks presented epistemological perspectives, recent methodological advances and challenges yet to be addressed in reconstructing the evolutionary history of the genes, genomes, populations and species observed on Earth, past and present.

Reflecting the truly interdisciplinary nature of the event, a wide range of phylogenetic structures were discussed during these five days. Networks of gene flow, phylogenetic trees, and genealogical trees predicted by coalescent theory were all on this year’s menu.

Experts specialized in introgressive events such as horizontal gene transfers or endosymbioses provided insights into the methods and challenges of modelling reticulate evolution. In this respect, an inspirational talk was given on how ghost lineages, which went extinct in the past but nonetheless exchanged some DNA with ancestors of extant species, could mislead our interpretation of the directionality of gene flow within phylogenetic networks.

Lectures on the theoretical advances made over the last 50 years in the reconstruction of gene and species phylogenetic trees were then given by world leaders in the field of phylogeny. In particular, they provided mathematical evidence that, contrary to what is practised today to reconstruct species trees, neither the consensus tree of several gene trees nor the tree inferred from the concatenated alignment of these genes actually gives a good approximation of the phylogeny of the species encoding these genes. This highlights the urgent need to pursue methodological efforts to better model species evolution.

On shorter evolutionary timescales, numerous mathematical models to infer genealogical trees of human populations or cancer cell lineages were also presented during the conference.


Poster session (left) and group photo (right).

Finally, a strong focus was placed this year on methods for coupling phylogenetic inferences with phenotypic, ecological, archaeological, geographical, epidemiological and medical data in order to study how traits or diseases evolved across space and time. Striking examples of these integrative analyses were provided by methodologies that accurately retrace the evolution of the recent SARS-CoV-2 and MERS-CoV outbreaks over time and space.

In addition to these talks of exceptional scientific quality, two poster sessions led by junior researchers and students took place during the conference and were truly appreciated by all participants for the scientific excellence of the posters and the conviviality of the moment.

Overall, five days of scientific presentations, poster sessions, dinners, parties and social activities such as hiking in the Alps or visiting a cheese factory fostered scientific exchanges and informal discussions, and created new bonds within this community of researchers.

The MCEB 2022 conference was a great success, as it enabled a diverse group of scientists from all over the world to meet, discuss their work and build new collaborations!



Life as an academic: my 2016 in numbers

• Author: Christophe Dessimoz •

Life as an academic is varied and busy. Students sometimes believe that all we do is teach. In fact, we do quite a few other things. Here’s my 2016 in numbers.

  • number of papers published: 10
  • number of paper rejections: 7
  • number of books edited: 1
  • number of grant proposals submitted: 8
  • number of research contracts negotiated with the industry: 2
  • number of blog posts: 5
  • number of tweets: 474 (66% were retweets)
  • number of YouTube videos: 1
  • number of papers reviewed: 24
  • number of papers edited: 3
  • number of grants reviewed: 3
  • number of PhD theses examined: 2
  • number of emails received (excluding spam and mailing-lists): 12,695
  • number of emails written: 4,377 (!)
  • number of minutes videoconferencing on GoToMeeting: 13,236 (!!)
  • number of Geneva-London-Geneva roundtrips: 12
  • number of meetings with >50 attendees co-organised: 6
  • number of seminars hosted: 4
  • number of conferences attended: 3
  • number of talks given: 11
  • number of semester-long courses organised: 2
  • number of hours lectured: 32
  • number of 2000-word student papers marked: 47
  • number of summer students supervised: 4
  • number of overnight retreats attended: 4
  • number of work Christmas dinners attended: 3
  • number of annual reports written: 3 (this does not count)
  • number of Tête de Moine eaten at lab celebrations: 4
  • number of times moved home: 0 (noteworthy since we moved 5 times in the preceding 5 years…)

I wish you, Dear Reader, all the best in 2017!



Interview with Tunca Doğan, OMA Visiting Fellow 2016

• Author: Tunca Doğan •

Note: the “Life in the Lab” series features interviews of interns and visitors. This post is by our second 2016 OMA Visiting Fellow Tunca Doğan, who spent a month with us earlier this year. You can follow Tunca on Twitter at @tuncadogan. —Christophe

 

Please introduce yourself in a few sentences.

My name is Tunca Doğan. I received my PhD in 2013 with a thesis in bioinformatics and computational biology, in which we developed methods for clustering protein sequences using unsupervised machine learning techniques (Doğan and Karaçalı, 2013). I’ve since been working as a post-doctoral fellow at EMBL-EBI, UK, in the Protein Function Development team (UniProt Database) led by Dr Maria Martin. Here I’m developing new tools and methods for the automated functional annotation of protein records in UniProtKB using a variety of features including domain architectures (Doğan et al., 2016). I’m also conducting research in the field of computational drug discovery. As of 2016, I’m also affiliated with the Department of Health Informatics, METU, Turkey, both as a senior research fellow and a faculty candidate.

Why did you choose to apply to the OMA visiting fellowship programme?

The team behind OMA is world-leading in the field of phylogenomics and has authored many highly cited publications in this area. Moreover, OMA is considered to be one of the most reliable and comprehensive resources offering phylogenomic information on various species. I applied to this programme in order to develop my knowledge of phylogenomic research, particularly of how OMA is produced. My specific research aim was to investigate if and how the information in OMA can be utilized in order to increase the coverage and the quality of the automated functional annotation of proteins in the UniProt database.

Discussions on UNIL campus with Leonardo de Oliveira Martins, Surag Nair, Clément Train, David Dylus and Tunca Doğan (from l. to r.)

What project did you work on during your visit?

The project I worked on had two sides: 1) investigating novel ways of quality checking of the data produced in the OMA pipeline (especially HOGs) using the Domain Architecture Alignment and Classification (DAAC) method I previously developed in UniProt; 2) investigating the use of OMA groups and HOGs to propagate the functional annotation between the (homologous) member proteins of the same clusters/classes.
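To illustrate the second aim, here is a minimal sketch of annotation propagation within groups of homologous proteins. It is a deliberately naive illustration, not the method used in OMA or UniProt: the function name, the dictionary-based data format, and the example identifiers are all assumptions made for this example.

```python
from collections import defaultdict


def propagate_annotations(groups, annotations):
    """Copy every annotation seen in a group to all members of that group.

    groups: dict mapping group ID -> list of protein IDs (hypothetical format)
    annotations: dict mapping protein ID -> set of annotation terms
    Returns a new dict of protein ID -> set of terms; inputs are not modified.
    """
    result = defaultdict(set)
    for members in groups.values():
        # Pool the terms carried by any member of the group.
        pooled = set()
        for protein in members:
            pooled |= annotations.get(protein, set())
        # Assign the pooled terms to every member, annotated or not.
        for protein in members:
            result[protein] |= pooled
    return dict(result)


# Hypothetical groups and annotations, for illustration only.
groups = {"HOG1": ["P1", "P2", "P3"], "HOG2": ["P4"]}
annotations = {"P1": {"GO:0005524"}, "P3": {"GO:0016301"}}
propagated = propagate_annotations(groups, annotations)
print(sorted(propagated["P2"]))  # ['GO:0005524', 'GO:0016301']
```

A real pipeline would presumably be more selective, for example by weighting evidence or restricting propagation to closely related members, rather than pooling every term in a group.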

Was there any highlight or low point you’d like to share?

It was a great experience for me both professionally and socially. I learnt a great deal in just one month, and we are continuing our collaboration on the above-mentioned project. Everyone I met in the group (Christophe, Adrian, David, Leonardo and Clement) was so knowledgeable, helpful and friendly that I had a great time during my stay. It was a great pleasure to meet and to work with them all…

The UNIL/EPFL campus is just beautiful, on the shores of Lake Geneva. The campus is also well-equipped for all possible needs. This was also my first time in Switzerland and I was enchanted by the beauty of this country… The only downside for a foreign visitor could be the high cost of living in Switzerland, which was nonetheless manageable with a little prior investigation and planning.

Do you have any practical tip for future OMA visiting fellows?

I definitely recommend that any researcher (at PhD or post-doc level) with an interest in phylogenomics apply to this programme. You’ll learn a great deal and have a good time at the same time. Also (for the foreigners), do not forget to travel around this beautiful country in your spare time…

 

Editor’s note: If you are interested in the OMA visiting fellowship programme, consult this page.

References:

Doğan T, & Karaçalı B (2013). Automatic identification of highly conserved family regions and relationships in genome wide datasets including remote protein sequences. PloS one, 8 (9) PMID: 24069417

Doğan T, MacDougall A, Saidi R, Poggioli D, Bateman A, O’Donovan C, & Martin MJ (2016). UniProt-DAAC: domain architecture alignment and classification, a new method for automatic functional annotation in UniProtKB. Bioinformatics (Oxford, England), 32 (15), 2264-71 PMID: 27153729



Interview with Rosa Fernández, 2016 OMA Visiting Fellow

• Author: Rosa Fernández García •

Note: We are rebooting our “Life in the Lab” series, which features interviews of interns and visitors. This post is by our inaugural OMA Visiting Fellow Rosa Fernández García, who spent a month with us earlier this year. You can follow Rosa on Twitter at @Rosamygale. —Christophe

 

Please introduce yourself and your research interests.

I received my bachelor’s degree in Biology (major in Zoology) at Complutense University in Madrid, Spain. I got my master’s and PhD at the same university with a thesis about phylogeny and phylogeography of cosmopolitan earthworms. After that, I moved to the lab of Prof. Gonzalo Giribet at Harvard University where I was a postdoc for 3 years and a Research Associate for another year. In January 2017, I’ll move to Barcelona to work as a Research Fellow in the lab of Dr. Toni Gabaldón at the Center for Genomic Regulation.

My research addresses fundamental questions about evolution in invertebrates: in other words, I am fascinated by how, when and where biodiversity took its form, and why it is maintained. My main two animal groups of interest are terrestrial annelids (oligochaetes) and (pan)arthropods, particularly the earliest branching lineages and most scientifically neglected groups (chelicerates and myriapods).

How did biodiversity take its shape? Resolving the tree of life. Macroevolutionary patterns are generally what we see when we look at the large-scale history of life. They encompass the grandest trends and transformations in evolution, such as the origin of bilateral animals or the radiation of arthropods. In order to understand how lineages are related to each other, I study macroevolutionary patterns in several groups of invertebrates through phylogenetics and phylogenomics. I currently lead a fruitful line of research dealing with the phylogenomics of myriapods and chelicerates, having optimized protocols to successfully sequence single individuals of the rarest and smallest arthropods. We are getting closer to resolving the Arthropod Tree of Life!

Artist’s rendition of the Arthropod tree of life

When and where? I try to understand the mode and tempo of animal diversification patterns through the integration of phylogeography, biogeography and paleogeography.

Why? Comparative transcriptomics and genomics is a very powerful tool to shed light on very interesting evolutionary questions, such as arthropod terrestrialization - one of my favorite new lines of research.

Why did you choose to apply to the OMA visiting fellowship programme?

Orthology inference is one of the key steps in phylogenomics. I had been using OMA for a few years and I wanted to learn how I could use it more efficiently in my ongoing projects.

What project did you work on during your visit?

My project focused on optimizing OMA runs for some big and challenging data sets that I was having problems with. Also, I was interested in learning how I could exploit hierarchical orthogroups for comparative genomics studies in arthropods.

Rosa and David Dylus at coffee break (photo by Arthur Dessimoz)

Was there any highlight or low point you’d like to share?

It was a great experience to be in the Dessimoz lab for a month. As a systematist with a relatively limited bioinformatics background, it was absolutely great to exchange ideas with computer scientists interested in the same scientific problems but with a completely different perspective than mine. It was a very enriching experience.

Do you have any practical tip for future OMA visiting fellows?

One month was not enough for me, so try to stay longer if your project is ambitious. And ask Christophe to bring a Tête de Moine cheese on your last day, it’s delicious!

 

Editor’s note: If you are interested in the OMA visiting fellowship programme, consult this page.




Lausanne, Switzerland

• Author: Christophe Dessimoz •

The new academic year brings a big change to our lab. I am moving to the University of Lausanne, Switzerland, on a professorship grant from the Swiss National Science Foundation. The generous funding will enable us to expand our activities on computational methods dealing with mixtures of phylogenetic histories. Lausanne is a hub for life sciences and bioinformatics, so we will feel right at home there; indeed, we have already been collaborating with several groups there. I am joining the Center for Integrative Genomics and the Department of Ecology and Evolution. I also look forward to rejoining the Swiss Institute of Bioinformatics. At a personal level, this marks a return to the region in which I grew up, after 16 years in exile.

However, I keep a joint appointment at UCL, where part of the lab remains. I’ll be flying back regularly and keep some of my teaching activities. UCL is a very special place—one which would be too hard for me to leave entirely. For all the cynicism we hear about universities-as-businesses, the overriding priority at UCL clearly remains on outstanding scholarship. My departments (Genetics, Evolution, Environment and Computer Science) are both highly collegial and supportive. Compared to the previous institutions I have worked for, the organisational culture at UCL is very much bottom-up. The pervasive chaos is perceived as a shortcoming by some, but it’s actually a huge competitive advantage—one that leaves ample room for initiative and flexibility. One colleague once told me that I could build a nuclear reactor in my lab and no one would ask a question—provided I secure the funding for it of course…

So how are we going to manage working in two different sites? Well, the situation is not new. We have had a distributed lab for several years and have developed a system for remote collaboration. Currently, we have lab members primarily based in London, Zurich, Ghent, and Cambridge. Our weekly lab meeting and monthly journal club are done via videoconference (with GoToMeeting). I try to have at least fortnightly 1:1 meetings with all remote members. During the day, the lab stays in touch via instant messaging (using HipChat). We have shared code (git) and data (sshfs) repositories. We tend to write collaborative papers using Google Docs (with Paperpile as reference manager). Importantly, we have a lab retreat every four months where we meet in person, reflect on our work, and have fun. We supplement this with collaborative visits as needed. The system is not perfect—please share your experience if you’ve found other good ways of collaborating remotely—but overall it’s working quite well.



Quest for Orthologs 4

• Authors: Ed Chalstrey, Jan Koch, Clement Train & Lucas Wittwer •

On 25-27 May 2015, the lab attended the 4th international ‘Quest for Orthologs’ conference, held at the Center for Genomic Regulation (CRG) in Barcelona, Spain. The following blog entry summarises the conference experiences of Ed Chalstrey, Jan Koch, Clement Train, and Lucas Wittwer, who are interns and master’s students in the Dessimoz lab.

 

Quest for Orthologs (QfO) is a meeting of groups working on orthology detection and phylogenomic databases, with the aim of improving and standardising orthology predictions. This meeting was part of a series of conferences beginning in 2009, which have successfully brought together a community of researchers with shared goals. These goals include collaboration on benchmarking and the sharing of reference datasets.

As short project students in the group, QfO gave an excellent opportunity for those of us based at UCL to meet some of our colleagues from ETH (in Zurich) and Bayer CropScience (in Ghent) in person for the first time and to make contact with other scientists working in the field of ortholog prediction.

QFO picture 1QFO picture 2

As young scientists, some of the most important questions we face are: Will I be able to explain my project to established scientists and discuss it with them? Will I be able to understand the work of other scientists, even if their research topic falls outside my area of expertise? How can I have new ideas and be inspired to contribute to an area of research I’m new to? For us, most of whom had not attended a conference before, QfO was the perfect place to begin answering these questions.

The conference involved talks from each of the research groups and a poster session for students to display their contributions. Each of the postdocs and PhD students in the Dessimoz lab gave a short talk to introduce their posters, as did one of us (Clement).

Clement: “The talk and the poster were great practice for us to improve our communication skills by presenting to an audience composed of experts in related topics. This enabled us to adapt our talks depending on the kind of people we had in front of us and to exchange ideas with other people in constructive conversations. Also, attending talks on the many fields related to our work (orthology) was an amazing experience as interns, both in discovering new things and in helping us in our own projects with new ideas and other ways of thinking.”

QfO was a great opportunity for us to meet scientists from all over the world who have worked in the field for many years. We were able to benefit from their experience and from the advice they gave us after we talked with them about our own research projects, gaining a different perspective from that of our usual supervisors and colleagues. One of Jan’s discussions, with two researchers from Switzerland, may even lead to a future collaboration; they were interested in DLIGHT, a program that was developed by our group.

As well as discussing our current work, the conference also gave us the chance to think about future work opportunities and network with established scientists. One of the highlights for us was meeting Eugene Koonin and Sergei Mekhedov from the NCBI at the conference dinner. We had an amusing chat (about topics not necessarily related to orthology!) and an enjoyable evening. They even invited us to visit them at the NCBI!

All in all, we greatly benefited from our participation in the QfO conference.

Share or comment:

To be informed of future posts, sign up to the low-volume blog mailing-list, subscribe to the blog's RSS feed, or follow us on Twitter. To read old posts, check out the index here.


My internship experience, by Anna Sueki

• Author: Anna Sueki •

Note: this is another interview in our series “Life in the Lab”, which gives unedited accounts of students who have spent time with us. —Christophe

Please introduce yourself in a few sentences.

I am Anna Sueki, a third-year Biochemistry student at UCL. I did my summer internship in Dr. Dessimoz’s lab in summer 2014. I’m originally from Japan, but grew up in Singapore and Germany.

Why did you choose to join the lab?

During my second year, I took a module called “Computational Biology” and I really enjoyed learning programming and other computational aspects of biology. Dr. Dessimoz was one of the lecturers for that module, and since I found his lectures interesting, I applied to his lab for this summer internship.

What project did you work on (and for how long)?

My project was “Synteny visualisation in the OMA browser”. This was for the new release of the OMA browser, and I created the new synteny viewer function within the browser to show neighbouring genes of the entry gene and its ortholog relationships. I worked for 3 months (from June to September 2014) during my summer vacation.

What came out of it?

With the help of other members of the lab, I was able to produce the synteny viewer function, and it will be implemented in the new OMA browser. An explanation of this new function, with an example, was also included in the paper on the new OMA release.

Beyond this project outcome, I learned how computers and their systems work, how to program in Python, and how to use Python’s web framework Django.

OMA Synteny Viewer

New synteny viewer in the OMA Browser

Was there any highlight or low point you’d like to share?

Personally, I enjoyed the lab retreat in Zurich a lot. It was nice to see the Zurich members, whom I had only seen through the weekly video meeting. I also got a lot of help from the Zurich members through the chat system, especially Adrian and Clement.

What is your overall impression and would you do it again?

I really enjoyed my time in Dr. Dessimoz’s lab, and I learned a lot during these three months. I started with barely any knowledge of computer science, but through this summer I found out that I like learning coding and other computational skills. If I get another chance, I would love to work with Dr. Dessimoz and the other members again!


My internship experience, by Leslie Macmillan

• Author: Leslie Macmillan •

Note: this is the third interview in our new series “Life in the Lab”, which gives unedited accounts of students who have spent time with us. —Christophe

I just finished the M.Sc. Computer Science program at UCL, and did my summer project in the Dessimoz lab. I chose this lab because the project sounded interesting and the environment seemed supportive. Appearances turned out to be correct!

My project was increasing the speed of the existing Trees of Life program, which draws phylogenetic trees based on genetic distance within orthologous gene groups. At the end of the project, the overall speedup was 1.40X, and along the way I learned a lot about optimisation and many other topics: from UNIX-based systems to linear algebra to genetics.

Tree of Life

Sample output of the “TreeCollection” program

Overall I would really recommend working in this lab to anyone considering it. Though I was just here for three months, I felt very included and supported throughout the project. It was a great way to see the inner workings of a bioinformatics lab, and through this experience I was able to go to two conferences and a lab retreat in Switzerland!

Retreat 1 | Retreat 2


My internship experience, by Ferenc Galkó

• Author: Ferenc Galkó •

Note: this is the second interview in our new series “Life in the Lab”, which gives unedited accounts of students who have spent time with us. —Christophe

Please introduce yourself in a few sentences.

My name is Ferenc Galkó and I am a student from Hungary. I am 22 years old and will be graduating this December (2014) with a BSc in computer science from the Budapest University of Technology and Economics. I am really interested in doing an MSc and later possibly PhD abroad.

Why did you choose to join the lab?

As a BSc student I did not have many opportunities to contribute to the intriguing work of research labs, nor had I the time to move away from my home country for a longer period. I thought that an internship abroad would help to fulfil both of my aspirations.

What project did you work on (and for how long)?

I worked for about three months on the migration of various operations on sequences, including a vectorized version of the Smith-Waterman algorithm written in C. The existing, efficient code base had to be migrated using NumPy and Cython to make the main functionality available from Python.
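For readers unfamiliar with Smith-Waterman, here is a minimal sketch of the local alignment score it computes. This is not the lab's code: it is plain, unvectorized Python, it uses a simple linear gap penalty, and the function name and scoring values (`match`, `mismatch`, `gap`) are assumptions chosen purely for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    Illustrative only: scoring values are arbitrary assumptions and the
    gap penalty is linear (each gapped position costs `gap`).
    """
    # prev holds the previous row of the dynamic-programming matrix;
    # only two rows are needed when we want the score and not the alignment.
    prev = [0] * (len(b) + 1)
    best = 0
    for ch_a in a:
        curr = [0]  # column 0 is always 0 in local alignment
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            # Clamping at 0 is what makes the alignment local:
            # a poor prefix is dropped instead of dragging the score down.
            cell = max(0, diag, up, left)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("ACGT", "ACGT"))
    print(smith_waterman_score("ACGT", "ACT"))
```

The inner loop touches every cell of a `len(a)` by `len(b)` matrix, which is why optimised implementations such as the vectorized C code mentioned above matter when many sequence pairs have to be compared.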

What came out of it?

At the end of the internship we had a new Python package called Python Optimal Pairwise Alignment (PyOPA), which makes it possible to perform these operations with the efficiency of vectorized C code and the convenience of Python. We also plan to publish the package on PyPI in the very near future.

Was there any highlight or low point you’d like to share?

I started my internship just a day before the lab retreat in Switzerland. I think it was a great way to start, because I met almost everyone from the lab in person. During the retreat there was a special session aimed at pre-PhD students, where we discussed the advantages and the costs of doing a PhD. It was really great to hear unbiased advice from those who have already gone through this process or are currently going through it.

What is your overall impression and would you do it again?

I would definitely do it again; it was a great and unique experience altogether. I met interesting people from many countries and gained insight into a well-known research lab, which will surely shape my upcoming years of study in a positive way.


The Dessimoz Lab blog is licensed under a Creative Commons Attribution 4.0 International License.