Skills
Collaboration & Communication :
Working independently and in cross-functional teams (clinicians, wet-lab scientists, data scientists, software engineers, UX/UI designers)
Gathering, evaluating, and synthesizing information into articles
Project management and mentoring
Strong communication and presentation skills
Strong problem-solving skills and adaptability
Curious and eager to learn new things
Programming & Tools :
Python - NumPy, pandas, scikit-learn, seaborn, PyTorch, Matplotlib, Plotly, Biopython
R - CRAN, Bioconductor
RMarkdown, Jupyter Notebooks
Unit testing - testthat (R), unittest and pytest (Python)
Bash
Machine Learning :
Statistical tests: incl. t-tests, Wilcoxon tests
Regression: incl. linear regression, logistic regression, Cox/survival analysis, elastic net/ridge/LASSO
Classification: incl. random forests, XGBoost, SVM, Linear Discriminant Analysis (LDA)
Clustering: K-means, EM algorithm
Probabilistic models: Hidden Markov models (HMMs), linear Gaussian state-space models
Dimensionality reduction & factorization: PCA, t-SNE, MOFA, NMF
Sampling & optimization: replica exchange Monte Carlo
Deep learning: variational autoencoders (VAEs), CNNs, transformers, retrieval-augmented generation (RAG), LLMs
Federated learning
MLOps :
Workflow languages - Nextflow, Snakemake
Docker, Singularity
SLURM, Grid Engine, Kubernetes
AWS, DigitalOcean
Databases : MySQL, SQLite, PostgreSQL
Version Control & Software Management : Linux/Unix systems, Git, SVN, conda, GNU Guix
Web Development : Django, CSS, JavaScript, HTML, jQuery, PHP