ML101: TP1

0575e100 · PLN (Algolia) · 855e62ce
Commit 0575e100 authored Jan 07, 2023 by PLN (Algolia)
20 changed files
--- a/slides/00-intro.md
+++ b/slides/00-intro.md
--- a/slides/01-choisir.md
+++ b/slides/01-choisir.md
@@ -13,26 +13,144 @@ paginate: true
 ---
 ## Perceptron
+<!-- Base fondamentale -->
+<!-- Little more than a learning thermostat -->
+---
+> _perceptron may eventually be able to learn, make decisions, and translate languages._
+**Frank Rosenblatt**, 1958
+---
+<!-- 
+Was researched until Minsky's killer book Perceptrons:
+> acknowledge some of the perceptrons' strengths while also showing major limitations
+-->
+...until Marvin Minsky and Seymour Papert's book
+## _Perceptrons_ (1969)
+![bg right:38% fit](https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Fweltbild.scene7.com%2Fasset%2Fvgw%2Fperceptrons-195682049.jpg&f=1&nofb=1&ipt=802cd208ee9f59170d9fb973f0c820f23265e3d6c23b60529d332b7536fec005&ipo=images)
+---
+<!---
+-> Focus on "Symbolic systems"
+-> huge limits to perceptrons abilities
+-> killed the field: almost stopped research on the connectionist paradigm for the next fifteen years, until its REVIVAL
+-->
+- Pour aller plus loin: [_WP History of AI | Perceptrons and the attack on connectionism_](https://en.wikipedia.org/wiki/History_of_artificial_intelligence#Perceptrons_and_the_attack_on_connectionism)
+--- 
+![bg fit](https://upload.wikimedia.org/wikipedia/en/5/52/Mark_I_perceptron.jpeg)
+<!-- Mark I Perceptron machine, 1958 | Cornell University Library website -->
+<!-- 
+It was connected to a camera with 20×20 cadmium sulfide photocells to make a 400-pixel image.
+The main visible feature is a patch panel that set different combinations of input features. 
+To the right, arrays of potentiometers that implemented the adaptive weights. -->
+---
+# Formalisme: un Perceptron
+<!-- _backgroundColor: 
+_color: -->
+$$ f(\mathbf{x}) = \begin{cases}1 & \text{if }\ \mathbf{w} \cdot \mathbf{x} + b > 0,\\0 & \text{otherwise}\end{cases} $$
+---
+# Intuition: apprentissage d'un Perceptron
+<!-- _backgroundColor: 
+_color: -->
+![bg right 95%](https://upload.wikimedia.org/wikipedia/commons/thumb/8/8a/Perceptron_example.svg/1280px-Perceptron_example.svg.png)
+<!-- Diagram by Elizabeth Goodspeed -->
+$$ f(\mathbf{x}) = \begin{cases}1 & \text{if }\ \mathbf{w} \cdot \mathbf{x} + b > 0,\\0 & \text{otherwise}\end{cases} $$
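A minimal sketch of the formula above in plain Python. `predict` implements the decision rule as written on the slide. `train_perceptron` uses the classic perceptron update rule (add `learning_rate * (target - prediction) * x` to the weights), which the slide does not spell out and is assumed here. The AND dataset, the learning rate and the number of epochs are illustrative choices.

```python
def predict(weights, bias, x):
    """Decision rule from the slide: 1 if w . x + b > 0, otherwise 0."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, labels, learning_rate=1.0, epochs=20):
    """Classic perceptron rule (assumed): shift w and b by the prediction error."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            error = target - predict(weights, bias, x)  # -1, 0 or +1
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Toy, linearly separable data chosen for illustration: logical AND
AND_INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
AND_LABELS = [0, 0, 0, 1]

weights, bias = train_perceptron(AND_INPUTS, AND_LABELS)
predictions = [predict(weights, bias, x) for x in AND_INPUTS]
print(weights, bias, predictions)
```

Because AND is linearly separable, the loop stops making corrections after a few epochs and the learned line separates `(1, 1)` from the three other points.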
+---
+![bg 90%](https://cdn-images-1.medium.com/max/1600/1*ZafDv3VUm60Eh10OeJu1vw.png)
+---
+![bg left fit](https://cdn-images-1.medium.com/max/1600/1*ZafDv3VUm60Eh10OeJu1vw.png)
+<!-- Image credit: Shruti Jadon, Introduction to Different Activation Functions for Deep Learning
+https://medium.com/@shrutijadon/survey-on-activation-functions-for-deep-learning-9689331ba092 -->
+- Aller plus loin: [CodeX - Ansh David: _intro to Perceptrons and different type of activation functions_
+](https://medium.com/codex/single-layer-perceptron-and-activation-function-b6b74b4aae66)
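To make the idea of an activation function concrete, here is a small sketch of four commonly used ones in plain Python. Which functions the linked figure actually shows is not stated in the slide, so this selection (step, sigmoid, tanh, ReLU) is an assumption.

```python
import math


def step(z):
    """Hard threshold, as used by the original perceptron."""
    return 1.0 if z > 0 else 0.0


def sigmoid(z):
    """Smooth squashing into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Smooth squashing into the interval (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Rectified linear unit: passes positive values, zeroes out the rest."""
    return max(0.0, z)


# Compare the four functions on a few sample pre-activations
for z in (-2.0, 0.0, 2.0):
    print(z, step(z), round(sigmoid(z), 3), round(tanh(z), 3), relu(z))
```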
+---
+## Multi Layer Perceptron
+![bg right:55% fit](https://rasbt.github.io/mlxtend/user_guide/classifier/NeuralNetMLP_files/neuralnet_mlp_1.png)
+<!-- Image credit & reference: Raschka, Sebastian (2018) MLxtend: Providing machine learning and data science utilities and extensions to Python's scientific computing stack.
+https://rasbt.github.io/mlxtend/
+ -->
+--- 
+## MLP == Neural Network!
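One way to see what the extra layer buys: XOR is not linearly separable, so a single perceptron cannot compute it, but two stacked layers can. The sketch below is a hand-wired illustration, not a trained model. The weights and biases are chosen by hand (the hidden units act as OR and AND), and all names are hypothetical.

```python
def step(z):
    """Perceptron activation: 1 if z > 0, otherwise 0."""
    return 1 if z > 0 else 0


def layer(inputs, weights, biases):
    """Apply one layer of perceptrons: one weight row and one bias per unit."""
    return [
        step(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hand-picked parameters (illustrative, not learned)
HIDDEN_WEIGHTS = [[1, 1], [1, 1]]  # unit 0 behaves like OR, unit 1 like AND
HIDDEN_BIASES = [-0.5, -1.5]
OUTPUT_WEIGHTS = [[1, -1]]         # fires when OR is true but AND is not
OUTPUT_BIASES = [-0.5]


def xor_mlp(x1, x2):
    """Forward pass through the two layers."""
    hidden = layer([x1, x2], HIDDEN_WEIGHTS, HIDDEN_BIASES)
    return layer(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES)[0]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_mlp(a, b))
```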
 ---
-## Neural Network
+## Recurrent Neural Networks
+<!-- Recurrence -> loops -> temporal & dynamic behavior -->
+<!-- internal state as input == kind of 'memory'
+-> Works for understanding streams of stuff, e.g. cursive handwriting, or stock market data, or audio -> speech to text, music analysis, etc  
+ -->
+![bg fit right](https://www.skynettoday.com/assets/img/overviews/2020-09-27-a-brief-history-of-neural-nets-and-deep-learning/34568.gif)
+- See [The Unreasonable Effectiveness of RNNs](http://karpathy.github.io/2015/05/21/rnn-effectiveness/)
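The "internal state as input" idea from the notes can be sketched in a few lines. This assumes the vanilla (Elman-style) update `h_t = tanh(W_xh x_t + W_hh h_{t-1} + b)`, which the slide does not state. The tiny weight matrices and the input sequence are made-up values for illustration.

```python
import math


def rnn_step(x, h_prev, w_xh, w_hh, b_h):
    """One recurrence step: mix the current input with the previous state."""
    h_new = []
    for i in range(len(h_prev)):
        total = b_h[i]
        total += sum(w_xh[i][j] * x[j] for j in range(len(x)))
        total += sum(w_hh[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_new.append(math.tanh(total))
    return h_new


def run_rnn(sequence, w_xh, w_hh, b_h):
    """Feed a sequence step by step, carrying the hidden state along."""
    h = [0.0] * len(b_h)  # the "memory" starts empty
    for x in sequence:
        h = rnn_step(x, h, w_xh, w_hh, b_h)
    return h


# Illustrative parameters: 1 input feature, 2 hidden units
W_XH = [[0.5], [-0.3]]
W_HH = [[0.1, 0.2], [0.0, 0.4]]
B_H = [0.0, 0.1]

forward = run_rnn([[1.0], [0.0], [-1.0]], W_XH, W_HH, B_H)
backward = run_rnn([[-1.0], [0.0], [1.0]], W_XH, W_HH, B_H)
print(forward)
print(backward)
```

Feeding the same values in reverse order yields a different final state, which is the sense in which the hidden state acts as a memory of the stream.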
 ---
 ## LSTMS
+<!-- How can RNNs handle long-term dependencies? -->
+<!-- Let's learn from Christopher Olah:
+co-founder of Anthropic, an AI lab focused on the safety of large models; previously led interpretability research at OpenAI,
+worked at Google Brain,
+and co-founded Distill, a scientific journal focused on outstanding communication.
+-->
+- See [Understanding LSTM networks](https://colah.github.io/posts/2015-08-Understanding-LSTMs/) by **Christopher Olah**
 ---
-## Deep Learning:
+# Deep Learning
 --- 
 ### Layers, Layers, Layers!
--- 
-### Convolutions, Capsules and other tricks
 --- 
 ### Attention! It's all you need
+- https://arxiv.org/pdf/1706.03762.pdf
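The core operation of the linked paper is scaled dot-product attention, `softmax(Q K^T / sqrt(d_k)) V`. Below is a dependency-free sketch of that formula on nested lists. It leaves out masking, multiple heads and learned projections, and the toy `QUERIES`, `KEYS` and `VALUES` are made-up values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, attention_weights) for lists of row vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query with every key, scaled by sqrt(d_k)
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors
        output = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights


# Toy example: two queries, two keys, two values
QUERIES = [[1.0, 0.0], [0.0, 1.0]]
KEYS = [[1.0, 0.0], [0.0, 1.0]]
VALUES = [[1.0, 2.0], [3.0, 4.0]]

outputs, attention_weights = scaled_dot_product_attention(QUERIES, KEYS, VALUES)
print(attention_weights)
print(outputs)
```

Each query puts more weight on the key it matches best, so its output is pulled toward the corresponding value vector.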
 --- 
 ### TRANSFORMERS
+![bg right fit](https://external-content.duckduckgo.com/iu/?u=http%3A%2F%2Fwww.wikialpha.org%2Fmediawiki%2Fimages%2Fthumb%2F1%2F17%2FSteppernightstick-united.jpg%2F480px-Steppernightstick-united.jpg&f=1&nofb=1&ipt=f023244f8ef45b4cba81ffdd819cada10f55dea2c207d715f0f10498b001582f&ipo=images)
 --- 
 ### Language Models
+- [BERT: Bidirectional
+Encoder Representations from Transformers](https://arxiv.org/pdf/1810.04805v2.pdf)
 ---
 ### Genetic Algorithms
 ---
@@ -42,11 +160,26 @@ paginate: true
 # Pratique : Choisir un Modèle
 ---
-### Selon la data
+# Supervisé ou non ?
+--- 
+![bg 93%](https://miro.medium.com/max/1400/0*p3zNp9D8YFmX32am.webp)
+<!-- Référence et source: https://medium.com/@dkatzman_3920/supervised-vs-unsupervised-learning-and-use-cases-for-each-8b9cc3ebd301 -->
+--- 
+### Selon l'objectif
+- [Overview of Supervised ML Algorithms](https://towardsdatascience.com/overview-of-supervised-machine-learning-algorithms-a5107d036296)
 ---
-### Selon les ressources
+![bg](https://miro.medium.com/max/1400/1*31IsgRs2QZ7H4kl8tNqjdQ.webp)
 ---
-### Selon l'usage
+### Selon le tooling à disposition
--- a/slides/03-tester.md
+++ b/slides/03-tester.md
@@ -15,7 +15,7 @@ paginate: true
 ---
 ## De manière continue
 ---
-### Quand la data chanfge
+### Quand la data change
 ---
 ### Quand le modèle évolue
 ---

--- a/slides/04-utiliser.md
+++ b/slides/04-utiliser.md
@@ -69,3 +69,4 @@ PyLint
 Black
 PyDantic
 MyPy
--- a/slides/img/00-alphago-movie.png
+++ b/slides/img/00-alphago-movie.png
--- a/slides/img/00-alphazero-learn.png
+++ b/slides/img/00-alphazero-learn.png
--- a/slides/img/00-alphazero.png
+++ b/slides/img/00-alphazero.png
--- a/slides/img/00-codex.png
+++ b/slides/img/00-codex.png
--- a/slides/img/00-copilot-verbatim.gif
+++ b/slides/img/00-copilot-verbatim.gif
--- a/slides/img/00-deepmind.png
+++ b/slides/img/00-deepmind.png
--- a/slides/img/00-huggingface.png
+++ b/slides/img/00-huggingface.png
--- a/slides/img/00-openai.png
+++ b/slides/img/00-openai.png
--- a/slides/img/00-turfu.png
+++ b/slides/img/00-turfu.png
--- a/tp/00-intro-tp.md
+++ b/tp/00-intro-tp.md
@@ -8,6 +8,12 @@ paginate: true
 footer: "ML101 | TP0: Introduction | Paul-Louis Nech | INTECH 2022-2023"
 ---
+<style>
+li  {
+  font-size: 0.8em;
+}
+</style>
 # <!-- fit --> TP1: Introduction
 <!-- ### Définitions & Histoire du ML  
 ### méthodes et outils ; 
@@ -42,7 +48,7 @@ Sur l'intranet ou à formation@nech.pl
 ###### Faites une phrase avec vos propres mots pour définir ce que veut dire:
 - "Apprendre"
- "Deep Learning"
+- "Machine Learning"
 - "Précision et Rappel"
 - "Overfit"
@@ -87,8 +93,8 @@ Lisez un peu sur cette personne, puis partagez ici quelque-chose qu'elle a dit o
 ### Lvl 3: Jouer avec des Image Models
- Ouvrez "ThisPersonDoesNotExist" et dites en quelques mots votre impression sur la qualité des images générées.
+- Ouvrez "[ThisPersonDoesNotExist](https://thispersondoesnotexist.com)" et dites en quelques mots votre impression sur la qualité des images générées.
- Ouvrez "ThisXDoesNotExist", choisissez un autre modèle, et commentez sa qualité.
+- Ouvrez "[ThisXDoesNotExist](https://thisxdoesnotexist.com)", choisissez un autre modèle, et commentez sa qualité.
 --- 

--- a/tp/01-choisir/mnist.py
+++ b/tp/01-choisir/mnist.py
+from datasets import load_dataset, DatasetDict, Dataset
+from torchvision.transforms import Compose, ColorJitter, ToTensor
+import matplotlib.pyplot as plt
+import numpy as np
+def main():
+    # Load
+    dataset = load_dataset("mnist")
+    # Inspect
+    # print(dataset)
+    # dataset["train"][0]['image'].show()
+    # dataset["test"][0]['image'].show()
+    # Prepare label mappings to go from class id to class name and back
+    labels = dataset["train"].features["label"].names
+    label2id, id2label = {}, {}
+    for i, label in enumerate(labels):
+        label2id[label] = i
+        id2label[i] = label
+    # Now check e.g. the label of the training image at index 2:
+    label_id = dataset["train"][2]["label"]
+    print(f"Image 2 is labelled {id2label[label_id]}")
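+def use_image_multi_class_classification():
+    """Load a pretrained image classifier and its preprocessor from the Hugging Face Hub."""
+    from transformers import AutoFeatureExtractor, AutoModelForImageClassification
+    extractor = AutoFeatureExtractor.from_pretrained("autoevaluate/image-multi-class-classification")
+    model = AutoModelForImageClassification.from_pretrained("autoevaluate/image-multi-class-classification")
+    # Hand both back to the caller so they can be used for inference
+    return extractor, model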
+if __name__ == '__main__':
+    main()
\ No newline at end of file
--- a/tp/01-choisir/requirements.txt
+++ b/tp/01-choisir/requirements.txt
+matplotlib
+datasets
+torchvision
+transformers
+evaluate
\ No newline at end of file
--- a/tp/01-tp-choisir.md
+++ b/tp/01-tp-choisir.md
+---
+marp: true
+theme: uncover
+color: #eee
+colorSecondary: #333
+backgroundColor: #111
+paginate: true
+footer: "ML101 | TP1: Choisir un modèle | Paul-Louis Nech | INTECH 2022-2023"
+---
+<style>
+li  {
+  font-size: 0.8em;
+}
+</style>
+# <!-- fit --> TP1: Choisir un modèle
+<!-- 
+Théorie : méthodes
+fondamentales
+Pratique : choisir un modèle
+adapté à son problème
+-->
+---
+Objectifs : 
+- Acquérir une intuition des différences entre grandes familles de modèles de Machine Learning
+- Savoir choisir un modèle adapté à son problème
+---
+Format: Rendu écrit (fichier Markdown ou Doc avec une section par _Level_)
+Sur l'intranet ou à formation@nech.pl
+<br />
+**DEADLINE : 23 Janvier 23:59:59**
+<br />
+> _Le cachet de mon mailserver faisant foi_.
+---
+## Lvl 0: La base
+![bg right:35% w:300](https://www.meme-arsenal.com/memes/a6effdba5a540560c7b5ee616ee0f1f3.jpg)
+<!-- Image credit: World of Warcraft Tutorial boar -->
+###### Faites une phrase avec vos propres mots pour définir ce que veut dire:
+- "Apprentissage Supervisé"
+- "Apprentissage Non Supervisé"
+- "Layer"
+- ""
+---
+## Lvl 1: Intro to Neural Nets by Google
+- [Google Colab: Exercises - Intro to Neural Nets](https://colab.research.google.com/github/google/eng-edu/blob/main/ml/cc/exercises/intro_to_neural_nets.ipynb?hl=en#scrollTo=g8HC-TDgB1D1)
+:rocket: simple linear regression model VS Neural network :rocket:
+---
+## Lvl 1.1 : baseline 
+**Terminez les sections jusqu'à "_Build a linear regression model as a baseline_"**
+<br />
+- Partagez les métriques `loss` et `mean_squared_error` obtenues avec votre baseline de régression linéaire
+Exemple: 
+```
+ Evaluate the linear regression model against the test set:
+300/300 [==============================] - 1s 2ms/step - loss: 0.4018 - mean_squared_error: 0.4018
+[0.40178823471069336, 0.40178823471069336]
+```
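In the sample output above, `loss` and `mean_squared_error` show the same value. That is what one would expect if the model is trained with mean squared error as its loss, which is an assumption about the notebook's configuration rather than something stated here. A minimal sketch of the metric itself, with made-up numbers:

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up targets and predictions, for illustration only
targets = [1.0, 2.0, 3.0]
predictions = [1.0, 2.0, 5.0]
print(mean_squared_error(targets, predictions))
```

A lower value means the predictions are closer to the targets on average, and larger errors are penalised more heavily because they are squared.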
+---
+## Lvl 1.2 : beat the baseline :metal: 
+**Terminez les sections jusqu'à "_Call the functions to build and train a deep neural net_"**
+<br />
+- Partagez les premières métriques `loss` et `mean_squared_error` obtenues avec votre modèle de réseau de neurones
+Exemple: 
+```
+Evaluate the new model against the test set:
+3/3 [==============================] - 0s 7ms/step - loss: 0.3705 - mean_squared_error: 0.3705
+[0.3705473244190216, 0.3705473244190216]
+```
+---
+## Lvl 2 : now beat yourself :smiling_imp: 
+**Terminez les sections jusqu'à "_Task 2: Optimize the deep neural network's topography_"**
+<br />
+- Partagez la définition finale de votre réseau de neurones
+- Partagez votre :muscle: **meilleur résultat** :muscle: de métriques `loss` et `mean_squared_error` obtenues avec votre modèle de réseau de neurones 
+- Répondez en quelques mots : qu'est-ce qui affectait la performance de votre réseau de neurones ? Quel impact sur sa vitesse d'apprentissage ?
+---
+## Lvl 3 : now beat the real world
+Comment faire mieux sur le _test_ set, et pas seulement bien apprendre le _training set_ ?
+**Terminez la dernière section "_Task 3: Regularize the deep neural network_"**
+<br />
+---
+## Lvl 3.1 : high score, real world
+- Partagez votre :brain: **meilleur résultat en situation réelle** :brain: de métriques `loss` et `mean_squared_error` obtenues avec votre modèle de réseau de neurones régularisé 
+---
+## Bonus: Mais alors mon modèle d'avant il vaut quoi ?
+- Répondez en quelques mots : en quoi la régularisation est-elle utile ? Qu'en concluez-vous sur votre évaluation initiale du modèle (les "meilleures métriques" du Lvl 2) ?
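As a hint for the regularization question, here is a sketch of one common technique, an L2 penalty added to the training loss. It is not necessarily the technique you will pick in Task 3. The `strength` value and the sample weights are hypothetical.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def l2_penalty(weights, strength):
    """Penalty that grows with the squared size of the weights."""
    return strength * sum(w * w for w in weights)


def regularized_loss(y_true, y_pred, weights, strength=0.01):
    """Data-fit term plus a cost for large weights."""
    return mean_squared_error(y_true, y_pred) + l2_penalty(weights, strength)


# Same fit on the data, but the second model uses much larger weights
small_weights = [0.3, -0.4]
large_weights = [3.0, -4.0]
loss_small = regularized_loss([1.0, 2.0], [1.1, 1.9], small_weights)
loss_large = regularized_loss([1.0, 2.0], [1.1, 1.9], large_weights)
print(loss_small, loss_large)
```

With identical predictions, the model with larger weights gets the higher loss, so training is nudged toward simpler solutions that tend to generalise better to the test set.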
--- a/tp/02-tp-entrainer.md
+++ b/tp/02-tp-entrainer.md
+---
+marp: true
+theme: uncover
+color: #eee
+colorSecondary: #333
+backgroundColor: #111
+paginate: true
+footer: "ML101 | TP2: Entrainer un modèle | Paul-Louis Nech | INTECH 2022-2023"
+---
+<style>
+li  {
+  font-size: 0.8em;
+}
+</style>
+# <!-- fit --> TP2: Entrainer un modèle
+<!-- 
+Théorie : outils
+fondamentaux
+Pratique : entrainer
+un modèle
+-->
+---
+Objectifs : 
+- Théorie : outils fondamentaux
+- Pratique : entrainer un modèle
+---
+Format: Rendu écrit (fichier Markdown ou Doc avec une section par _Level_)
+Sur l'intranet ou à formation@nech.pl
+<br />
+**DEADLINE : 24 Janvier 23:59:59**
+<br />
+> _Le cachet de mon mailserver faisant foi_.
+---
+## Lvl 0: La base
+![bg right:35% w:300](https://www.meme-arsenal.com/memes/a6effdba5a540560c7b5ee616ee0f1f3.jpg)
+<!-- Image credit: World of Warcraft Tutorial boar -->
+###### Faites une phrase avec vos propres mots pour définir ce que veut dire:
+FOO BAR BAZ
+---
+## Lvl 1: Multi-class classification with MNIST
+- [Google Colab: Exercises - Multi-Class Classification with MNIST](https://colab.research.google.com/github/google/eng-edu/blob/main/ml/cc/exercises/multi-class_classification_with_MNIST.ipynb?hl=en#scrollTo=XuKlphuImFSN)
+:rocket: simple linear regression model VS Neural network :rocket:
--- a/tp/04-tp-utiliser.md
+++ b/tp/04-tp-utiliser.md
--- a/tp/use/diffusion.py
+++ b/tp/use/diffusion.py
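+import torch
+from diffusers import DDPMPipeline
+def main():
+    # Load the pretrained unconditional face-generation pipeline
+    image_pipe = DDPMPipeline.from_pretrained("google/ddpm-celebahq-256")
+    # Use the GPU when available, otherwise fall back to the (much slower) CPU
+    device = "cuda" if torch.cuda.is_available() else "cpu"
+    image_pipe.to(device)
+    # Sample one image and save it to disk
+    image = image_pipe().images[0]
+    image.save("ddpm_generated_image.png")
+if __name__ == '__main__':
+    main()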
\ No newline at end of file