What is this about?
Data science & machine learning (ML) are transforming biology by helping us find hidden patterns in huge datasets, from DNA sequences to patient health records.
In bioinformatics, ML helps:
- Predict disease risk
- Classify cancer types
- Find new drug targets
- Forecast how microbes might evolve
Basics of supervised vs. unsupervised learning
| Type of learning | What it means | Examples in bioinformatics |
|---|---|---|
| Supervised learning | Model learns from labeled data (you know the answers) | Classify tumors as benign vs. malignant, predict patient survival. |
| Unsupervised learning | Model finds natural patterns in unlabeled data | Cluster gene expression profiles to find unknown subtypes of cancer. |
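
To make the difference concrete, here is a minimal sketch in Python with scikit-learn. Everything in it is an assumption for illustration: the data are synthetic (`make_blobs` standing in for 200 samples with 20 "genes" and two hidden groups), and logistic regression and k-means are just one possible choice for each type of learning.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import adjusted_rand_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for expression data: 200 samples, 20 "genes", 2 hidden groups.
X, y = make_blobs(n_samples=200, n_features=20, centers=2,
                  cluster_std=2.0, random_state=0)

# Supervised: the labels y are used during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
supervised_accuracy = clf.score(X_test, y_test)

# Unsupervised: KMeans never sees y. The labels are only used afterwards
# to check how well the discovered clusters match the true groups.
cluster_ids = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
cluster_agreement = adjusted_rand_score(y, cluster_ids)

print(f"Supervised test accuracy: {supervised_accuracy:.2f}")
print(f"Cluster vs. true group agreement (ARI): {cluster_agreement:.2f}")
```

The adjusted Rand index is used for the clustering check because cluster numbers are arbitrary (cluster 0 may correspond to group 1), so plain accuracy would be misleading.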
Mini workflow: how ML is used in bioinformatics
1. Collect data (gene expression, mutations, clinical data)
2. Preprocess data (clean, normalize, remove noise)
3. Choose ML method (supervised for prediction, unsupervised for clustering)
4. Train model
5. Test & validate
6. Biological interpretation (find key genes, pathways, predict outcomes)
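
The six steps above can be sketched as a small scikit-learn pipeline. This is only an illustration under stated assumptions: the expression matrix is simulated (log-normal values, with the first five genes raised in class 1), and the log transform, standardization, and logistic regression are example choices, not a recommended recipe.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

# 1. Collect data: simulated samples x genes matrix (assumption, not real data).
rng = np.random.default_rng(42)
n_samples, n_genes = 200, 50
y = np.repeat([0, 1], n_samples // 2)
log_expr = rng.normal(loc=5.0, scale=1.0, size=(n_samples, n_genes))
log_expr[y == 1, :5] += 2.0          # genes 0-4 are higher in class 1
X_raw = np.exp(log_expr)             # skewed, intensity-like raw values

# Hold out a test set before fitting anything.
X_train, X_test, y_train, y_test = train_test_split(
    X_raw, y, test_size=0.2, stratify=y, random_state=42)

# 2 + 3. Preprocess (log-transform, standardize each gene) and choose a method.
model = make_pipeline(
    FunctionTransformer(np.log1p),
    StandardScaler(),
    LogisticRegression(max_iter=1000),
)

# 4. Train, with 5-fold cross-validation on the training data only.
cv_scores = cross_val_score(model, X_train, y_train, cv=5)
model.fit(X_train, y_train)

# 5. Test & validate on the held-out samples.
test_accuracy = model.score(X_test, y_test)

# 6. Interpretation: rank genes by absolute model coefficient.
coefs = model.named_steps["logisticregression"].coef_[0]
top_genes = np.argsort(np.abs(coefs))[::-1][:5].tolist()

print(f"Mean CV accuracy: {cv_scores.mean():.2f}")
print(f"Held-out test accuracy: {test_accuracy:.2f}")
print("Highest-weighted gene indices:", top_genes)
```

Wrapping the preprocessing in the pipeline means the scaler is re-fit inside each cross-validation fold, so no information from the validation samples leaks into training.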
Short case study: Classifying cancer using gene expression data
Scenario
- Researchers have gene expression profiles of patients with different cancer types.
What they do
- Use Python (libraries like pandas and scikit-learn).
- Apply supervised learning (e.g. support vector machines or random forests) to train a model that classifies samples as breast, lung, or colon cancer based on expression levels.
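
A rough sketch of what such an analysis might look like with pandas and a random forest is shown here. The expression table is simulated, and the gene names (`gene_0`, `gene_1`, ...), sample counts, and the assumption that each cancer type has its own block of five up-regulated genes are all hypothetical; a real study would load measured profiles instead.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
cancer_types = ["breast", "lung", "colon"]
n_per_type, n_genes = 60, 100
gene_names = [f"gene_{i}" for i in range(n_genes)]  # hypothetical names

# Simulated log-expression table: each cancer type gets its own
# block of 5 up-regulated genes (an assumption for illustration).
frames = []
for k, cancer in enumerate(cancer_types):
    values = rng.normal(loc=5.0, scale=1.0, size=(n_per_type, n_genes))
    values[:, 5 * k: 5 * (k + 1)] += 2.5
    frame = pd.DataFrame(values, columns=gene_names)
    frame["cancer_type"] = cancer
    frames.append(frame)
expression = pd.concat(frames, ignore_index=True)

X = expression[gene_names]
y = expression["cancer_type"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Train a random forest on the labeled training samples.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)

# Evaluate on held-out samples: overall accuracy plus a per-class breakdown.
accuracy = forest.score(X_test, y_test)
matrix = pd.DataFrame(
    confusion_matrix(y_test, forest.predict(X_test), labels=cancer_types),
    index=cancer_types, columns=cancer_types)

print(f"Test accuracy: {accuracy:.2f}")
print(matrix)  # rows = true type, columns = predicted type
```

Swapping in a support vector machine would only change the model line (for example `sklearn.svm.SVC`), although SVMs usually benefit from scaling the features first.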
Why this matters
- Helps doctors diagnose faster or predict which patients will respond to a treatment, moving toward personalized medicine.
Short summary table
| ML method | Used for | Example bioinformatics application |
|---|---|---|
| Supervised (classification, regression) | Predict labels / outcomes | Predict cancer subtype from RNA-seq data. |
| Unsupervised (clustering, PCA) | Find hidden patterns, group samples | Discover new disease subtypes from gene profiles. |
| Feature selection | Find most important genes/features | Identify top 10 genes distinguishing cancer vs. normal. |
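
As an illustration of the feature selection row, the sketch below picks the 10 genes that best separate cancer from normal samples using a univariate F-test (`SelectKBest` with `f_classif`). The data are simulated so that the first 10 genes truly differ between groups; the gene names and sizes are made up, and an F-test is only one of several possible ranking criteria.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

rng = np.random.default_rng(1)
n_samples, n_genes = 120, 500
gene_names = np.array([f"gene_{i}" for i in range(n_genes)])  # hypothetical names

# Simulated data: 0 = normal, 1 = cancer. Only the first 10 genes
# are shifted in cancer samples; the other 490 are pure noise.
y = np.repeat([0, 1], n_samples // 2)
X = rng.normal(size=(n_samples, n_genes))
X[y == 1, :10] += 2.0

# Score every gene with a one-way ANOVA F-test and keep the 10 best.
selector = SelectKBest(score_func=f_classif, k=10).fit(X, y)
top_genes = gene_names[selector.get_support()].tolist()

print("Selected genes:", top_genes)
```

If the selected genes feed into a classifier, the selection step should sit inside the cross-validation loop (for example as a pipeline step); selecting on the full dataset first makes the reported accuracy look better than it is.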
