Máster en Bioinformática Aplicada a Medicina Personalizada y Salud (2024-2025 academic year)
- Day 1 (24.02.2025) [online]:
  - 2/3 Theory
  - 1/3 Project: Session I (project presentation)
- Day 2 (25.02.2025) [online]:
  - 1/2 Theory
  - 1/2 Project: Session II
- Day 3 (26.02.2025) [online]:
  - 2/3 Theory
  - 1/3 Project: Session III (with problem and dataset presentation HITO-1)
- Day 4 (27.02.2025):
  - Practice: Hands-On (Machine Learning Basics in Python with scikit-learn Part I)
- Day 5 (28.02.2025):
  - 2/3 Practice: Hands-On (Machine Learning Basics in Python with scikit-learn Part II)
  - 1/3 Practical tips for Machine Learning
- Day 6 (03.03.2025):
  - 1/2 Theory: presentation of a real case-study (the PolyDeep project)
  - 1/2 Project: Session IV
- Day 7 (04.03.2025):
  - Project: Session III
- Day 8 (05.03.2025):
  - Project: Session IV (with results presentation HITO-2)
Download and install Miniconda from: https://www.anaconda.com/download/success
Run the following command to create the Conda environment for the hands-on practice sessions:
conda env create -f environment.yml
And then activate it by running:
conda activate machine-learning
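Once the environment is active, a quick check that the expected packages can be imported may save time before the first hands-on session. The following is only a minimal sketch: the `missing_packages` helper and the package list (`sklearn`, `pandas`, `numpy`) are assumptions for illustration, not taken from `environment.yml`, so adjust the list to whatever that file actually declares.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Hypothetical package list: adjust it to match environment.yml
    expected = ["sklearn", "pandas", "numpy"]
    missing = missing_packages(expected)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All expected packages are importable.")
```

If any package is reported as missing, the most likely cause is that the `machine-learning` environment was not activated in the current shell.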
During the hands-on sessions (Machine Learning Basics in Python with scikit-learn), we are going to use the Breast Cancer Data available here.
This file came from the UCI Machine Learning Repository. More information about this dataset can be found here and here.
To download it again, run the following commands:
mkdir -p data
wget https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/wdbc.data -O data/wdbc.data
sed -i '1iid,diagnosis,radius_mean,texture_mean,perimeter_mean,area_mean,smoothness_mean,compactness_mean,concavity_mean,concave points_mean,symmetry_mean,fractal_dimension_mean,radius_se,texture_se,perimeter_se,area_se,smoothness_se,compactness_se,concavity_se,concave_points_se,symmetry_se,fractal_dimension_se,radius_worst,texture_worst,perimeter_worst,area_worst,smoothness_worst,compactness_worst,concavity_worst,concave points_worst,symmetry_worst,fractal_dimension_worst' data/wdbc.data
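After running the commands above, it can be worth confirming that the header line was added and that the file is complete. The sketch below uses only the Python standard library; the `summarize_wdbc` helper is a hypothetical name introduced here, and the path `data/wdbc.data` is the one used by the download commands. According to the UCI documentation, the dataset has 569 samples (357 benign, 212 malignant), and with the header above the file should have 32 columns.

```python
import csv
from collections import Counter
from pathlib import Path


def summarize_wdbc(path):
    """Return (n_rows, n_columns, diagnosis_counts) for a WDBC CSV file with a header row."""
    with Path(path).open(newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)  # first line is the header added by the sed command
        if "diagnosis" not in header:
            raise ValueError("no 'diagnosis' column: was the header line added?")
        diagnosis_idx = header.index("diagnosis")
        counts = Counter()
        n_rows = 0
        for row in reader:
            if not row:  # skip blank lines, e.g. a trailing newline
                continue
            if len(row) != len(header):
                raise ValueError(
                    f"data row {n_rows + 1} has {len(row)} fields, expected {len(header)}"
                )
            counts[row[diagnosis_idx]] += 1
            n_rows += 1
    return n_rows, len(header), dict(counts)


if __name__ == "__main__":
    data_file = Path("data/wdbc.data")
    if data_file.exists():
        rows, cols, counts = summarize_wdbc(data_file)
        # Compare against the UCI documentation: 569 samples, 357 B and 212 M
        print(f"{rows} rows, {cols} columns, diagnosis counts: {counts}")
```

A row count other than 569, or a `ValueError`, usually means the download was interrupted or the `sed` command was run more than once (which would add a second header line that shows up as an extra row).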
The information about the project is available here.
- Practical Statistics for Data Scientists: 50 Essential Concepts
- Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition
- Ten quick tips for machine learning in computational biology [10.1186/s13040-017-0155-3]
- DOME: recommendations for supervised machine learning validation in biology [10.1038/s41592-021-01205-4]
- The ABC recommendations for validation of supervised machine learning results in biomedical sciences [10.3389/fdata.2022.979465]