I maintained and developed Genomation, a Bioconductor R package that provides a collection of functions for simplifying common tasks in genomic feature/interval analysis. It provides functions for reading BED and GFF files as GRanges objects and for summarizing genomic features over predefined windows, so users can compute the average enrichment of features over defined regions or produce heatmaps. It can also annotate given regions with other genomic features such as exons, introns and promoters.
People: Altuna Akalin, Vedran Franke and others
I contributed to methylKit, a Bioconductor R package for DNA methylation analysis and annotation from high-throughput bisulfite sequencing. The package is designed to handle sequencing data from RRBS and its variants, as well as target-capture methods such as Agilent SureSelect methyl-seq. In addition, methylKit can handle base-pair resolution data for 5hmC obtained from TAB-seq or oxBS-seq. It can also handle whole-genome bisulfite sequencing data if the proper input format is provided.
People: Altuna Akalin, Alex Blume and others
GitHub: https://github.com/al2na/methylKit
PiGx is a collection of genomics pipelines implemented in Snakemake, Python and R. All pipelines are configured with a simple sample sheet and a descriptive settings file. The result is a set of comprehensive, interactive HTML reports summarizing the findings for your samples.
People: Altuna Akalin, Ricardo Wurmus and others
Website: http://bioinformatics.mdc-berlin.de/pigx/
The motifActivity R package predicts the key transcription factors (TFs) driving changes in gene expression or epigenetic marks across the input samples, together with the activity profiles of those TFs. As input it uses a set of gene expression measurements (e.g. from RNA-seq) or epigenetic marks (e.g. from BS-seq, ChIP-seq or ATAC-seq) across samples, and a set of DNA motifs.
People: Katarzyna Wreczycka under the supervision of Altuna Akalin
GitHub: https://github.com/katwre/motifActivity
I contributed enhancements to the IGV web application (original source code), an interactive tool for the visual exploration of genomic data, implemented in JavaScript and Python.
Example view from the IGV web app displaying genomic data tracks.
The lattice protein hydrophobic-polar (HP) model, showing the global energy.
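As a rough illustration of what the energy in an HP model measures, here is a minimal sketch. It assumes the standard HP energy function on a 2D square lattice, where each pair of hydrophobic (H) residues that are lattice neighbours but not adjacent in the chain contributes -1. The function name `hp_energy` and the coordinate representation are hypothetical and are not taken from the linked repository, which may use a different lattice or scoring.

```python
def hp_energy(sequence, coords):
    """Energy of a 2D lattice conformation under the standard HP model (assumed).

    sequence: string of 'H' (hydrophobic) and 'P' (polar) residues.
    coords:   list of (x, y) lattice positions, one per residue, in chain order.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation overlaps itself")

    position = {xy: i for i, xy in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each neighbouring pair is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = position.get(neighbour)
            # Count H-H contacts between residues that are not chain neighbours.
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy


# A four-residue chain folded into a square brings its two H ends into contact.
folded = hp_energy("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)])
extended = hp_energy("HPPH", [(0, 0), (1, 0), (2, 0), (3, 0)])
```

In this example the folded conformation has energy -1 and the extended one has energy 0, so the fold with the H-H contact is the lower-energy state.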
GitHub: https://github.com/katwre/bioinformatics-projects/tree/master/Molecular_Dynamics
Modern short-read assembly algorithms construct a de Bruijn graph by
representing all k-mer prefixes and suffixes as nodes and then drawing
edges that represent k-mers having a particular prefix and suffix [1].
An Eulerian walk through the graph, which visits every edge exactly once,
reconstructs the DNA sequence from its fragments (k-mers) [2].
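The construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the linked repository: the function names are hypothetical, the input is assumed to be an error-free list of k-mers for which an Eulerian path exists, and the walk is found with Hierholzer's algorithm, which is one common choice.

```python
from collections import defaultdict


def de_bruijn_graph(kmers):
    """One edge per k-mer, from its (k-1)-mer prefix to its (k-1)-mer suffix."""
    graph = defaultdict(list)
    for kmer in kmers:
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Return the nodes of an Eulerian path (Hierholzer's algorithm).

    Assumes such a path exists in the graph.
    """
    adj = {node: list(targets) for node, targets in graph.items()}
    in_deg = defaultdict(int)
    for targets in adj.values():
        for target in targets:
            in_deg[target] += 1

    # Start at the node with one more outgoing than incoming edge, if any;
    # otherwise the path is a cycle and any node with edges will do.
    start = next(iter(adj))
    for node, targets in adj.items():
        if len(targets) - in_deg[node] == 1:
            start = node
            break

    stack, path = [start], []
    while stack:
        node = stack[-1]
        if adj.get(node):
            stack.append(adj[node].pop())  # follow an unused edge
        else:
            path.append(stack.pop())       # dead end: node is final in the path
    return path[::-1]


def assemble(kmers):
    """Spell the sequence along an Eulerian path through the de Bruijn graph."""
    path = eulerian_path(de_bruijn_graph(kmers))
    return path[0] + "".join(node[-1] for node in path[1:])


print(assemble(["ACG", "CGT", "GTT", "TTG", "TGC", "GCA"]))  # ACGTTGCA
```

When the sequence contains repeats longer than k-1, the graph can have several Eulerian paths, so the reconstruction is only guaranteed to contain the same k-mers as the input, not to match the original sequence exactly.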
[1] Phillip E. C. Compeau, Pavel A. Pevzner and Glenn Tesler (2011). How to apply de Bruijn graphs to genome assembly. Nature Biotechnology 29, 987–991.
[2] Pavel A. Pevzner, Haixu Tang and Michael S. Waterman (2001). An Eulerian path approach to DNA fragment assembly. Proc Natl Acad Sci U S A 98(17), 9748–9753.
GitHub: https://github.com/katwre/bioinformatics-projects/tree/master/genome_assembly
GitHub: https://github.com/katwre/Minesweeper
A Django-based server for Multiple Sequence Alignment (MSA) visualization
GitHub: https://github.com/freesci/MSA-vis-project
Phone application built with Django 1.5.1, the manifesto app and localStorage.
GitHub: https://github.com/katwre/phone_application
Find your best career matches from your personality profile, measured with the Big Five Aspects Scale. This interactive web tool lets users explore their personality traits and see how their profile aligns with different career paths. It runs Python directly in your browser via Pyodide, with no precompilation needed, and uses machine learning techniques such as PCA and clustering from the sklearn and pandas libraries to generate personalized visualizations and career matches.
Figure: PCA plot showing career matches based on your personality profile.