Commit 0575e100 by PLN (Algolia)

ML101: TP1

parent 855e62ce
@@ -6,9 +6,15 @@ colorSecondary: #333
backgroundColor: #111
paginate: true
transition: wipe
footer: "ML101 par Paul-Louis Nech | Présenté à l'INTECH Info | © 2022-2023"
---
<style>
li {
font-size: 0.8em;
}
</style>
# <!-- fit --> Hello World
ROBOTS ARE UPRISING. WHAT SIDE ARE YOU ON?
@@ -54,7 +60,7 @@ $$ {\displaystyle g(x):=f^{L}(W^{L}f^{L-1}(W^{L-1}\cdots f^{1}(W^{1}x)\cdots ))}
![bg right](https://uproxx.com/wp-content/uploads/2015/05/angry-bender.jpg?w=650)
<!-- Image Credit: Bender from Futurama, created by Matt Groening and David X. Cohen -->
- [the Fallacy of generalization from Fictional evidence](https://www.lesswrong.com/posts/rHBdcHGLJ7KvLJQPk/the-logical-fallacy-of-generalization-from-fictional)
<!-- The more details, the less likely! -->
---
@@ -62,13 +68,13 @@ $$ {\displaystyle g(x):=f^{L}(W^{L}f^{L-1}(W^{L-1}\cdots f^{1}(W^{1}x)\cdots ))}
![bg right grayscale](https://uproxx.com/wp-content/uploads/2015/05/angry-bender.jpg?w=650)
- [AGI Ruin: A List of Lethalities](https://www.lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a-list-of-lethalities)
---
## On va parler des derniers succès
- [Meta AI Research: CICERO](https://ai.facebook.com/blog/cicero-ai-negotiates-persuades-and-cooperates-with-people/)
<!-- Mais aussi:
- AlphaGo / AlphaZero
-
@@ -79,7 +85,7 @@ $$ {\displaystyle g(x):=f^{L}(W^{L}f^{L-1}(W^{L-1}\cdots f^{1}(W^{1}x)\cdots ))}
## Et d'échecs intéressants
- [MIT Tech review: 100s of AI tools built for covid... None helped.](https://www.technologyreview.com/2021/07/30/1030329/machine-learning-ai-failed-covid-hospital-diagnosis-pandemic/)
![bg left 100%](./img/01-covid-failed.png)
@@ -88,36 +94,75 @@ $$ {\displaystyle g(x):=f^{L}(W^{L}f^{L-1}(W^{L-1}\cdots f^{1}(W^{1}x)\cdots ))}
## Mais aussi de révolutions en cours
- [Midjourney](https://midjourney.com/)
![bg left](./img/01-midjourney.png)
<!-- Image Credit: MidJourney's Showcase -->
---
<!-- _footer: "" -->
![bg 93%](./img/01-diffusion.png)
<!-- Image Credit: Jay Allamar, The illustrated Stable Diffusion -->
<br />
<br />
<br />
<br />
<br />
- [The Illustrated Stable Diffusion by Jay Alamar](https://jalammar.github.io/illustrated-stable-diffusion/)
---
### Comment les modèles... peuvent raconter des histoires
- [AI Dungeon 2](https://aidungeon.cc/)
![bg left 100%](./img/01-dungeon.jpg)
---
### Comment les modèles... peuvent halluciner
- [DeepDream](https://en.wikipedia.org/wiki/DeepDream)
- [Pareidolia](https://en.wikipedia.org/wiki/Pareidolia)
![bg left 100%](../tp/img/intech_dream.png)
---
### Comment ils nous battent à plate couture...
![bg right fit](./img/00-alphago-movie.png)
- [AlphaGo](https://www.nature.com/articles/nature16961)
<!--
Mastering the game of Go with deep neural networks and tree search
The game of Go has long been viewed as the most challenging of classic games for artificial intelligence owing to its enormous search space and the difficulty of evaluating board positions and moves. Here we introduce a new approach to computer Go that uses ‘value networks’ to evaluate board positions and ‘policy networks’ to select moves. These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0. This is the first time that a computer program has defeated a human professional player in the full-sized game of Go, a feat previously thought to be at least a decade away.
-->
---
### ...puis se battent eux-mêmes
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
- [AlphaZero](https://arxiv.org/abs/1712.01815)
<!-- In December 2017, AlphaZero beat the 3-day version of AlphaGo Zero by winning 60 games to 40, and with 8 hours of training it outperformed AlphaGo Lee on an Elo scale. AlphaZero also defeated a top chess program (Stockfish) and a top Shōgi program (Elmo).[7][8] -->
![bg 40%](./img/00-alphazero-learn.png)
<!--
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
The game of chess is the most widely-studied domain in the history of artificial intelligence. The strongest programs are based on a combination of sophisticated search techniques, domain-specific adaptations, and handcrafted evaluation functions that have been refined by human experts over several decades. In contrast, the AlphaGo Zero program recently achieved superhuman performance in the game of Go, by tabula rasa reinforcement learning from games of self-play. In this paper, we generalise this approach into a single AlphaZero algorithm that can achieve, tabula rasa, superhuman performance in many challenging domains. Starting from random play, and given no domain knowledge except the game rules, AlphaZero achieved within 24 hours a superhuman level of play in the games of chess and shogi (Japanese chess) as well as Go, and convincingly defeated a world-champion program in each case.
-->
---
### Comment ils nous comprennent
<!-- Nous catégorisent -->
@@ -125,7 +170,7 @@ $$ {\displaystyle g(x):=f^{L}(W^{L}f^{L-1}(W^{L-1}\cdots f^{1}(W^{1}x)\cdots ))}
<!-- Ratent des trucs énormes (cohérence, spatial reasoning...) -->
<!-- Bref: sont ALIEN -->
- [OpenAI Jukebox: a neural net that generates music, including rudimentary singing](https://openai.com/blog/jukebox/)
![bg left 100%](./img/01-jukebox.png)
@@ -153,7 +198,7 @@ _color: black
<!-- Vous pouvez me contacter -->
### Paul-Louis Nech
- ###### ✉ etudiants@nech.pl
- ###### 🔗 LinkedIn.com/in/PLNech
- ###### 💡 GitHub.com/PLNech
@@ -169,9 +214,9 @@ _color: black
## Parcours
- **EPITA** 2016 | Spé **MTI**
- Software Engineer @**Algolia**
- Senior _ML_ Engineer @**Algolia**
![bg right](./img/01-me2.jpg)
---
@@ -212,71 +257,132 @@ _footer: ""
![bg fit](./img/00-questionnaire-feel.png)
<!-- Aucun 'ça peut pas marcher' ! On a passé cette phase -->
---
# Le positif
---
#### Le positif
> Dall-E: "_c'est trop bien **à condition de bien formuler** ce que l'on veut_"
---
#### Le positif
> GitHub Copilot: "_arrive parfois à deviner entièrement un paragraphe de code parfois **sans même faire d'efforts** supplémentaire_"
---
### Le positif
> la **rapidité** de suggestions de traduction et la **multitude** de propositions
---
## Le négatif
---
#### Le négatif
> AI-Dungeon: "la complexité parfois mal interprétée"
---
#### Le négatif
> des librairies python à n'en plus finir
---
#### Le négatif
> Niveau morale c'est pas forcément ouf car ils utilisent du code open source
---
#### Le négatif
> Ça reprenait vraiment bcp bcp bcp trop les exemples fournis. Les phrases était majoritairement reformulée mais pas nouvelles !
---
#### Le négatif
> Copilot: parfois des suggestions totalement inappropriées
---
#### Le négatif
> La quantité de donnée à télécharger pour que ce ça deviennent fiable.
---
#### Le négatif
> l'écriture intuitive de mon téléphone...
---
#### Le négatif
> le programme perdait vite le cours de la discussion
---
#### Le négatif
> les deepfakes où ils ont utilisé des personnes sans leur consentement dans des vidéos
---
#### Le négatif
> Mon ancienne entreprise voulait tellement "optimiser" le moteur de leurs jeux, qu'ils en ont cassé plus d'un (FDJ).
---
# Espoirs
---
### Espoirs
> la médecine
---
### Espoirs
> Permettre une communication fluide entre personnes de langages différentes, handicapés ou non
---
# Dangers
---
### Dangers
> les IA en machine learning qui ont été utilisé par de nombreuse personnes sur internet et qui ont été rendu inutile à cause des trolls
-> GPT4-chan, ou ChatGPT sur Stack Overflow
---
### Dangers
- [GPT4-chan](https://huggingface.co/ykilcher/gpt-4chan) and the [Ai Gating debate](https://medium.com/geekculture/gpt-4-chan-and-the-ai-gating-debate-41c3eb54ec32)
- ou ChatGPT sur **Stack Overflow**
> _Temporary policy: ChatGPT is banned_
<!-- En quoi c'est un problème ? -->
![bg right fit](https://thegradient.pub/content/images/size/w1600/2022/06/main-3.png)
<!-- Image credit: https://thegradient.pub/gpt-4chan-lessons/ -->
---
### Dangers
> par optimiser on peut entendre ajouter des fonctionnalités qui induisent une difficultés d'utilisation du produit à long terme...
---
### Dangers
> La non régression ne marche pas toujours
---
### Dangers
> Les robots tueurs
![bg right fit](https://i.ytimg.com/vi/Ofmg-4D5PoA/hqdefault.jpg)
---
@@ -284,6 +390,7 @@ Négatifs ?
<br />
### Les robots tueurs
<br />
- [Stop Killer Robots Campaign](https://www.stopkillerrobots.org/)
@@ -298,6 +405,32 @@ With growing digital dehumanisation, the Stop Killer Robots coalition works to e
- `A.I. AND RACE`
- `#KEEPCTRL`
<!-- DEHUMANIZATION:
From smart homes and targeted advertising to the use of robot dogs by police enforcement, artificial intelligence technologies and automated decision-making are now playing a significant role in our lives.
Technology can be amazing. But just because we can build something, it doesn’t mean we should.
Many technologies with varying degrees of autonomy are already being widely rolled out without pausing to consider the consequences of normalising their use. Why do we need to talk about this?
Because machines don’t see us as people, just another piece of code to be processed and sorted.
-->
<!-- AI AND RACE:
From social media to the use of robot dogs by police enforcement, artificial intelligence technologies and automated decision making are now playing a significant role in our lives. The prejudices in our society live in the algorithms we design, in the data sets we compile and in the labels we prescribe one another.
Emerging technologies like facial and vocal recognition draw on already biased training datasets and often fail in recognising people of colour, persons with disabilities and women, favouring light-skinned and outwardly masculine faces over darker-skinned and outwardly feminine faces.
And while efforts will be made to diversify data sets, this is not just a problem of unrepresentative data.
-->
<!-- KEEPCTRL:
Autonomy in weapons systems is a profoundly human problem. Killer robots change the relationship between people and technology by handing over life and death decision-making to machines. They challenge human control over the use of force, and where they target people, they dehumanise us – reducing us to data points.
But, technologies are designed and created by people. We have a responsibility to establish boundaries between what is acceptable and what is unacceptable. We have the capacity to do this, to protect our humanity and ensure that the society we live in, that we continue to build, is one in which human life is valued – not quantified.
-->
<br />
<br />
<br />
@@ -306,19 +439,35 @@ With growing digital dehumanisation, the Stop Killer Robots coalition works to e
---
### Dangers
> _Que le machine learning soit utiliser pour contrôler les humains_
<!-- non pas dans le sens d'un film de science-fiction mais dans le sens ou les personnes aux commandes pourrait fortement influencer les décisions de certaines personnes. Ex: les dernière élections présidentielles au Etats-Unis -->
---
### Dangers
> _TikTok / instagram_ :'(
---
### Dangers
> _Les pubs qui **s'adaptent** en fonction des recherches Internet des personnes._
---
### Dangers
> _la manipulation des fois non prévu des algorithmes comme la radicalisation de certaines personnes_
---
### Dangers
> _Que les machines contrôlent le monde ?_
---
### Dangers
> CYBERPUNK 2077
![bg right](https://preview.redd.it/fftof4hmu2b71.jpg?width=640&crop=smart&auto=webp&s=238d105bb0ce2441d6e20ad1a203a65cfc153ca2)
---
## C'est quoi ce cours ?
@@ -509,12 +658,23 @@ Comment je sais ce que je sais pas ?
---
## ⟶ Rasoir d'Occam
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<!--
-> Occam's razor! There's always a more complex rule that matches all observations perfectly.
-->
![bg 50%](https://probmods.org/assets/img/Curve_fitting.png)
<!-- Image source: Probabilistic models of Cognition - https://probmods.org/chapters/occams-razor.html -->
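La note du présentateur ci-dessus (« il existe toujours une règle plus complexe qui colle parfaitement aux observations ») se vérifie en quelques lignes : par n points quelconques passe un polynôme de degré n−1. Esquisse en Python pur, avec des observations jouets inventées pour l'exemple :

```python
# Illustration (esquisse) : une règle assez complexe colle parfaitement
# à n'importe quelles observations — ici, l'interpolation de Lagrange.
def lagrange(points, x):
    # P(x) = somme des y_i * produit (x - x_j) / (x_i - x_j) pour j != i
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# Observations bruitées autour de la tendance y ≈ x (données jouets) :
obs = [(0, 0.1), (1, 0.9), (2, 2.2), (3, 2.8), (4, 4.3)]
for xi, yi in obs:
    assert abs(lagrange(obs, xi) - yi) < 1e-9  # fit parfait sur CHAQUE observation
print(lagrange(obs, 5))  # hors des observations, la règle "parfaite" extrapole n'importe comment (≈ 11, loin de y ≈ 5)
```

La règle complexe ne se trompe jamais sur le passé — c'est justement le signe qu'elle n'a rien appris de généralisable : le rasoir d'Occam préfère la droite.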
---
# Failures of ML
@@ -564,13 +724,13 @@ REPLICATION CRISIS! Probleme commun dans toute la science, incentives, game theo
---
# Generative models
https://thisxdoesnotexist.com/
---
## FaceApp
![](https://cdn-images-1.medium.com/max/1024/1*XBEpvGfjv_xo7ebBYNVDNA.png)
<!-- Credit: https://laptrinhx.com/faceapp-or-how-i-learned-to-stop-worrying-and-love-the-machines-2993489094/ -->
@@ -578,7 +738,7 @@ https://thisxdoesnotexist.com/
---
## GANs
<br />
<br />
@@ -595,7 +755,7 @@ https://thisxdoesnotexist.com/
---
## GALACTICA
https://galactica.org/explore/
[Try it live ;)](https://huggingface.co/spaces/morenolq/galactica-base)
@@ -717,6 +877,109 @@ TAY
-> Interface matters!
---
# OpenAI Codex
# GitHub Copilot
...et autres modèles génératifs de code
---
![bg](https://venturebeat.com/wp-content/uploads/2021/06/Blog-Hero.png?w=1601&strip=all)
<!-- Image credit: https://venturebeat.com/business/openai-warns-ai-behind-githubs-copilot-may-be-susceptible-to-bias/ -->
<!--
> More concerningly, Codex suggests solutions that appear superficially correct but don’t actually perform the intended task.
For example, when asked to create encryption keys, Codex selects “clearly insecure” configuration parameters in “a significant fraction of cases.” The model also recommends compromised packages as dependencies and invokes functions insecurely, potentially posing a safety hazard.
-> STACKOVERFLOW COPYPASTA ON STEROIDS
-->
---
> _"I'm afraid this will create a mis-re-licensing hell"_
> ~[Bruno Hebling Vieira](https://twitter.com/HeblingVieira/status/1410972761241968641)
<!-- > I'm afraid this will create a mis-re-licensing hell, where everyone will claim they've been fooled by Copilot, and MS will say licensees should revise the code for licenses themselves. -->
---
- Efficacité == Intelligence ?
![bg right fit](https://www.encora.com/hs-fs/hubfs/Picturegit2.png?width=1280&name=Picturegit2.png)
---
## Recitation?
- [GitHub Copilot: Parrot or Crow?](https://github.blog/2021-06-30-github-copilot-research-recitation/)
<!-- "when it does, it mostly quotes code that everybody quotes"
-> Appel à la foule :facepalm:
"50 million smokers can't be wrong"
Cela dit, reconnaissent que:
> However, there’s still one big difference between GitHub Copilot reciting code and me reciting a poem: I know when I’m quoting
-->
---
> However, there’s still one big difference between
> **GitHub Copilot reciting code**
> and **me reciting a poem**:
>
> I _know_ when I’m quoting
<br />
~ _Albert Ziegler, GitHub Copilot: Parrot or Crow?_
---
![bg fit](https://i0.wp.com/user-images.githubusercontent.com/4434330/165667874-2f04b14c-909e-4bf5-9639-ff346da960f1.gif?ssl=1)
---
> "I don't want to say anything but that's not the right license Mr Copilot."
- [Tweet](https://twitter.com/i/status/1410886329924194309)
---
<!-- FAST INVERT SQUARE ROOT -->
<!-- QUAKE <3 -->
![bg fit](./img/00-copilot-verbatim.gif)
<!-- Image source: previous tweet -->
---
- Question de température ? Pas assez hot pour oser ses propres idées ?
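Pour l'intuition : la température divise les logits avant le softmax — basse, le modèle rejoue toujours la suite la plus probable ; haute, il « ose » des sorties plus variées. Esquisse du mécanisme en Python pur (valeurs jouets, pas l'implémentation réelle de Copilot) :

```python
import math

def softmax(logits, temperature=1.0):
    # Température basse -> distribution piquée ; haute -> distribution aplatie.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # soustraction du max pour la stabilité numérique
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
print(softmax(logits, temperature=0.1))   # quasi déterministe : ~[1, 0, 0]
print(softmax(logits, temperature=1.0))   # distribution intermédiaire
print(softmax(logits, temperature=10.0))  # presque uniforme : sorties plus "wild"
```

D'où l'observation de Max Woolf : même à température 1.0, si un motif domine écrasamment les données d'entraînement, il domine aussi la distribution.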
---
> I hacked the Copilot code temperature generation to 1.0 (to allow for **wilder** output),
> yet the original code **still persisted**,
> with the alternate approaches being **less helpful**.
- [Max Woolf](https://twitter.com/minimaxir/status/1411005120695865347/photo/1)
![bg fit right:40%](https://pbs.twimg.com/media/E5TlafWVcAAJNRZ?format=jpg&name=4096x4096)
<!-- Image source: Max woolf's tweet -->
---
<!-- _footer: '' -->
![bg fit](./img/00-codex.png)
<!-- Image source: Anonymous internet meme warrior -->
---
### Gaming the game: IAs flemmardes
@@ -729,6 +992,8 @@ TAY
---
### Short-term goals vs long-term goals
<!-- Chemin le plus court vers le goal, probleme ? -->
<!-- What did these robots fail to learn? -->
<!-- What is Netflix optimizing for? -->
@@ -770,7 +1035,7 @@ _color: white
### Conclusions sur le _Paperclip maximizer_
- Essayez vous-même avec un [Clicker game](https://www.decisionproblem.com/paperclips/index2.html) 😛
- Voir Nick Bostrom's [_Ethical Issues in Advanced Artificial Intelligence_](https://nickbostrom.com/ethics/ai) 💡
![bg crop brightness:0.2](./img/00-paperclips.png)
@@ -915,6 +1180,17 @@ Le Feature Engineering, tout un art
<br />
<br />
---
Exemple: a Neural Network
<br />
<br />
$$ {\displaystyle g(x):=f^{L}(W^{L}f^{L-1}(W^{L-1}\cdots f^{1}(W^{1}x)\cdots ))} $$
![invert w:15em](https://www.moonbooks.org/media/images/thumbnails_1000_1000/neural-network-01.PNG?lastmod=1600539785.861)
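La formule ci-dessus se lit directement en code : une composition de produits matriciels et de non-linéarités. Esquisse en Python pur (dimensions, poids aléatoires et choix de ReLU purement illustratifs) :

```python
import random

def relu(v):
    # non-linéarité f appliquée composante par composante
    return [max(0.0, x) for x in v]

def matvec(W, x):
    # produit matrice-vecteur W·x
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]

def g(weights, x):
    # g(x) = f_L(W_L · f_(L-1)( ... f_1(W_1 · x) ... ))
    for W in weights:
        x = relu(matvec(W, x))
    return x

random.seed(0)
W1 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]  # couche 1 : 3 -> 4
W2 = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(2)]  # couche 2 : 4 -> 2
print(g([W1, W2], [1.0, 0.5, -0.5]))  # un vecteur de sortie de dimension 2
```

Tout l'apprentissage consiste ensuite à ajuster les W_i pour que g(x) se rapproche des sorties attendues.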
---
## Features: which one?
@@ -989,6 +1265,8 @@ Supervisé ou non?
<!-- Supervisé : trouvez la règle de mon jeu d'Eleusis (variante triplets de nombres.) -->
<!-- Regle? E.g. differences croissantes ou egales d1 <= d2 <= d3 -->
<!-- Did it feel different? -->
---
<!--
@@ -1308,6 +1586,122 @@ _footer: ""
![bg right 120%](./img/01-lesswrong.png)
---
### But also...
['Death with Dignity' strategy](https://www.lesswrong.com/posts/j9Q8bRmwCgXRYAgcJ/miri-announces-new-death-with-dignity-strategy)
> tl;dr: It's obvious at this point that humanity **isn't going to solve the alignment problem**
> or even try very hard
> or even go out with much of a fight
---
### Also 'Death with Dignity'
### (╯°□°)╯︵ ┻━┻
['Death with Dignity' strategy](https://www.lesswrong.com/posts/j9Q8bRmwCgXRYAgcJ/miri-announces-new-death-with-dignity-strategy)
> Since **survival is unattainable**,
> we should shift the focus of our efforts to helping humanity
> **die** with **slightly more dignity**.
---
# Les Acteurs de l'Écosystème
---
## "Research" labs
- DeepMind
- OpenAI
- HuggingFace
---
### DeepMind
![bg right fit](./img/00-deepmind.png)
- [Lab](https://www.deepmind.com)
- [AI Safety Research](https://deepmindsafetyresearch.medium.com)
Exemples:
- [On Meta's CICERO model](https://www.deepmind.com/blog/ai-for-the-board-game-diplomacy)
- [On Policy Regularization](https://deepmindsafetyresearch.medium.com/your-policy-regulariser-is-secretly-an-adversary-14684c743d45)
---
### "Open"AI
![bg right fit](./img/00-openai.png)
- [Research](https://openai.com/research/)
- [AI Alignment](https://openai.com/alignment/)
<br />
---
Debate : [_Is OpenAI still Open?_](https://becominghuman.ai/is-openai-still-open-81f6839f756b)
<!--
This is a key turning point for OpenAI and everything following from the 1 Billion Dollar Investment by Microsoft to releasing GPT-3 as a commercial product
and now the exclusivity deal with Microsoft stem from this change in corporate structure and leadership.
-->
- API-based [examples](https://gpt3demo.com/apps/algolia)
- [Safety Research](https://openai.com/blog/our-approach-to-alignment-research/)
- [Yudkowsky on OpenAI](https://www.lesswrong.com/posts/oEC92fNXPj6wxz8dd/how-to-think-about-and-deal-with-openai)
- [Common misconceptions thread](https://www.lesswrong.com/posts/3S4nyoNEEuvNsbXt8/common-misconceptions-about-openai)
<br />
<!-- Above:
Nothing else Elon Musk has done can possibly make up for how hard the "OpenAI" launch trashed humanity's chances of survival;
previously there was a nascent spirit of cooperation, which Elon completely blew up to try to make it all be about who, which monkey, got the poison banana, and by spreading and advocating the frame that everybody needed their own "demon" (Musk's old term) in their house,
and anybody who talked about reducing proliferation of demons must be a bad anti-openness person who wanted to keep all the demons for themselves.
-->
![bg right fit](./img/00-openai.png)
---
## HuggingFace
> "The Underdogs"
- [Models](https://huggingface.co/models)
- [Datasets](https://huggingface.co/datasets)
- [APIs](https://huggingface.co/pricing)
![bg right fit](./img/00-huggingface.png)
---
## Cloud Vendors
<!-- Interet: ils vendent leur compute, nous habituent à leur plateforme, etc -->
- [AWS](https://aws.amazon.com/machine-learning/)
- [GCP](https://cloud.google.com/solutions/ai)
- [Azure](https://azure.microsoft.com/en-us/solutions/ai/#benefits)
![bg right fit](https://www.whizlabs.com/blog/wp-content/uploads/2020/07/aws-machine-learning-tools.png)
---
## Hardware shops
<!-- Interet: ils justifient leurs nouvelles chips! -->
- [NVIDIA](https://www.nvidia.com/en-us/deep-learning-ai/solutions/machine-learning/)
- [AMD, you still here?](https://www.amd.com/en/technologies/deep-machine-learning)
- [Acquisition strategy](https://www.xilinx.com/applications/video-imaging/video-ai-analytics.html)
- [Google?](https://cloud.google.com/tpu/)
---
## On résume
@@ -1350,3 +1744,49 @@ On se retrouve l'année prochaine pour voir tout ça ensemble :)
![bg right 90%](https://miro.medium.com/max/640/0*JWCLdKhz-0e_77tB.webp)
<!-- Image Credit: Steward Brandt - Whole Earth Catalog -->
---
### Outro
- AI Box experiments!
---
### Outro
- `tp-00`
https://nech.pl/ml101-00-tp
---
### Outro
- Let's play with [StableDiffusion](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/diffusers_intro.ipynb#scrollTo=e5MYkuhcRGAS)
- [using Diffusers](https://github.com/huggingface/diffusers)
- [Go further on your own!](https://github.com/Stability-AI/stablediffusion#image-inpainting-with-stable-diffusion)
---
### Outro
- Let's play with [Neural Networks](https://colab.research.google.com/github/google/eng-edu/blob/main/ml/cc/exercises/intro_to_neural_nets.ipynb?hl=en#scrollTo=9n9_cTveKmse)
---
### Outro
- Let's play with [Neural Networks: LVL2 🛠](https://www.sitepoint.com/keras-digit-recognition-tutorial/)
---
### Outro
- Let's play with [LSTM lyric generation](https://git.plnech.fr/pln/BabelZoo/tree/master/LeBoulbiNet)
- Or [Rebuild your own 🛠](https://colab.research.google.com/github/tensorflow/examples/blob/master/courses/udacity_intro_to_tensorflow_for_deep_learning/l10c03_nlp_constructing_text_generation_model.ipynb#scrollTo=2LmLTREBf5ng)
<!-- From https://www.udacity.com/course/intro-to-tensorflow-for-deep-learning--ud187 -->
![vertical bg right](https://git.plnech.fr/pln/BabelZoo/raw/master/LeBoulbiNet/boulbi.jpg)
![vertical bg right fit](https://git.plnech.fr/pln/BabelZoo/raw/master/LeBoulbiNet/first.png)
![vertical bg right fit](https://git.plnech.fr/pln/BabelZoo/raw/master/KoozDawa/first.png)
![vertical bg right fit](./img/00-turfu.png)
@@ -13,26 +13,144 @@ paginate: true
---
## Perceptron
<!-- Base fondamentale -->
<!-- Little more than a learning thermostat -->
---
> _perceptron may eventually be able to learn, make decisions, and translate languages._
**Frank Rosenblatt**, 1958
---
<!--
Was researched until Minsky's killer book Perceptrons:
> acknowledge some of the perceptrons' strengths while also showing major limitations
-->
...until Marvin Minsky's book
## _Perceptrons_ (1969)
![bg right:38% fit](https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Fweltbild.scene7.com%2Fasset%2Fvgw%2Fperceptrons-195682049.jpg&f=1&nofb=1&ipt=802cd208ee9f59170d9fb973f0c820f23265e3d6c23b60529d332b7536fec005&ipo=images)
---
<!---
-> Focus on "Symbolic systems"
-> huge limits to perceptrons abilities
-> killed the field: almost stopped research on the connectionist paradigm for the next fifteen years, until it got a REVIVAL
-->
- Pour aller plus loin: [_WP History of AI | Perceptrons and the attack on connectionism_](https://en.wikipedia.org/wiki/History_of_artificial_intelligence#Perceptrons_and_the_attack_on_connectionism)
---
![bg fit](https://upload.wikimedia.org/wikipedia/en/5/52/Mark_I_perceptron.jpeg)
<!-- Mark I Perceptron machine, 1958 | Cornell University Library website -->
<!--
It was connected to a camera with 20×20 cadmium sulfide photocells to make a 400-pixel image.
The main visible feature is a patch panel that set different combinations of input features.
To the right, arrays of potentiometers that implemented the adaptive weights. -->
---
# Formalisme: un Perceptron
<!-- _backgroundColor:
_color: -->
$$ f(\mathbf{x}) = \begin{cases}1 & \text{if }\ \mathbf{w} \cdot \mathbf{x} + b > 0,\\0 & \text{otherwise}\end{cases} $$
---
# Intuition: apprentissage d'un Perceptron
<!-- _backgroundColor:
_color: -->
![bg right 95%](https://upload.wikimedia.org/wikipedia/commons/thumb/8/8a/Perceptron_example.svg/1280px-Perceptron_example.svg.png)
<!-- Diagram by Elizabeth Goodspeed -->
$$ f(\mathbf{x}) = \begin{cases}1 & \text{if }\ \mathbf{w} \cdot \mathbf{x} + b > 0,\\0 & \text{otherwise}\end{cases} $$
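En code, la règle d'apprentissage du Perceptron tient en quelques lignes. Esquisse minimale en Python pur (poids initiaux nuls, taux d'apprentissage et porte logique AND choisis arbitrairement pour l'exemple) :

```python
def predict(w, b, x):
    # f(x) = 1 si w·x + b > 0, 0 sinon
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def train(samples, lr=0.1, epochs=20):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in samples:
            error = y - predict(w, b, x)  # -1, 0 ou +1
            # Mise à jour seulement si l'exemple est mal classé :
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]
            b += lr * error
    return w, b

# Données jouets : la porte logique AND, linéairement séparable
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train(AND)
print([predict(w, b, x) for x, _ in AND])  # -> [0, 0, 0, 1]
```

La mise à jour `w ← w + lr·(y − ŷ)·x` ne corrige les poids que sur les erreurs — c'est tout l'« apprentissage » du Perceptron, et il ne converge que si les classes sont linéairement séparables (d'où la critique de Minsky).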
---
![bg left fit](https://cdn-images-1.medium.com/max/1600/1*ZafDv3VUm60Eh10OeJu1vw.png)
<!-- Image credit: Shruti Jadon, Introduciton to Different Activation Functions for Deep Learning
https://medium.com/@shrutijadon/survey-on-activation-functions-for-deep-learning-9689331ba092 -->
- Aller plus loin: [CodeX - Ansh David: _intro to Perceptrons and different types of activation functions_](https://medium.com/codex/single-layer-perceptron-and-activation-function-b6b74b4aae66)
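Les trois activations les plus courantes du graphique ci-contre, en Python pur (esquisse) :

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))  # écrase R dans ]0, 1[

def tanh(x):
    return math.tanh(x)            # écrase R dans ]-1, 1[, centrée en 0

def relu(x):
    return max(0.0, x)             # 0 si x < 0, identité sinon

print(sigmoid(0.0), tanh(0.0), relu(-3.0), relu(3.0))  # -> 0.5 0.0 0.0 3.0
```

Le choix de l'activation change ce que le réseau peut apprendre et la facilité avec laquelle le gradient se propage.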
---
## Multi Layer Perceptron
![bg right:55% fit](https://rasbt.github.io/mlxtend/user_guide/classifier/NeuralNetMLP_files/neuralnet_mlp_1.png)
<!-- Image credit & reference: Raschka, Sebastian (2018) MLxtend: Providing machine learning and data science utilities and extensions to Python's scientific computing stack.
https://rasbt.github.io/mlxtend/
-->
---
## MLP == Neural Network!
---
## Recurrent Neural Networks
<!-- Recurrence -> loops -> temporal & dynamic behavior -->
<!-- internal state as input == kind of 'memory'
-> Works for understanding streams of stuff, e.g. cursive handwriting, or stock market data, or audio -> speech to text, music analysis, etc
-->
![bg fit right](https://www.skynettoday.com/assets/img/overviews/2020-09-27-a-brief-history-of-neural-nets-and-deep-learning/34568.gif)
- See [The Unreasonable Effectiveness of RNNs](http://karpathy.github.io/2015/05/21/rnn-effectiveness/)
---
## LSTMs
<!-- how can RNNS handle long term dependencies?” -->
<!-- Let's learn from Christopher Olah:
co-founders of Anthropic, an AI lab focused on the safety of large models. Previously, I led interpretability research at OpenAI
worked at Google Brain
co-founded Distill, a scientific journal focused on outstanding communication.
-->
- See [Understanding LSTM networks](https://colah.github.io/posts/2015-08-Understanding-LSTMs/) by **Christopher Olah**
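Pour fixer les idées, un pas de cellule LSTM « à la main » en dimension 1, suivant les équations de portes du billet d'Olah (poids jouets hypothétiques, pas un modèle entraîné) :

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, p):
    # p : dictionnaire de poids scalaires (valeurs jouets, hypothétiques)
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])       # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])       # input gate
    c_tilde = math.tanh(p["wc"] * x + p["uc"] * h_prev + p["bc"])  # candidat
    c = f * c_prev + i * c_tilde   # nouvel état de cellule : la "mémoire"
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])       # output gate
    h = o * math.tanh(c)           # nouvel état caché
    return h, c

p = {k: 0.5 for k in ["wf", "uf", "bf", "wi", "ui", "bi",
                      "wc", "uc", "bc", "wo", "uo", "bo"]}
h, c = 0.0, 0.0
for x in [1.0, -1.0, 0.5]:  # une petite séquence d'entrées
    h, c = lstm_step(x, h, c, p)
print(h, c)  # h reste borné dans ]-1, 1[ grâce au tanh
```

L'état de cellule `c` traverse le temps presque sans transformation (juste multiplié par la forget gate), ce qui permet de garder des dépendances longues là où un RNN simple les oublie.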
---
# Deep Learning
---
### Layers, Layers, Layers!
---
### Convolutions, Capsules and other tricks
---
### Attention! It's all you need
- https://arxiv.org/pdf/1706.03762.pdf
---
### TRANSFORMERS
![bg right fit](https://external-content.duckduckgo.com/iu/?u=http%3A%2F%2Fwww.wikialpha.org%2Fmediawiki%2Fimages%2Fthumb%2F1%2F17%2FSteppernightstick-united.jpg%2F480px-Steppernightstick-united.jpg&f=1&nofb=1&ipt=f023244f8ef45b4cba81ffdd819cada10f55dea2c207d715f0f10498b001582f&ipo=images)
---
### Language Models
- [BERT: Bidirectional Encoder Representations from Transformers](https://arxiv.org/pdf/1810.04805v2.pdf)
---
### Genetic Algorithms
---
@@ -42,11 +160,26 @@ paginate: true
# Pratique : Choisir un Modèle
---
### Selon la data
# Supervisé ou non ?
---
![bg 93%](https://miro.medium.com/max/1400/0*p3zNp9D8YFmX32am.webp)
<!-- Reference et source: https://medium.com/@dkatzman_3920/supervised-vs-unsupervised-learning-and-use-cases-for-each-8b9cc3ebd301 -->
---
### Selon l'objectif
- [Overview of Supervised ML Algorithms](https://towardsdatascience.com/overview-of-supervised-machine-learning-algorithms-a5107d036296)
---
### Selon les ressources
![bg](https://miro.medium.com/max/1400/1*31IsgRs2QZ7H4kl8tNqjdQ.webp)
---
### Selon le tooling à disposition
@@ -15,7 +15,7 @@ paginate: true
---
## De manière continue
---
### Quand la data change
---
### Quand le modèle évolue
---
@@ -69,3 +69,4 @@ PyLint
Black
PyDantic
MyPy
@@ -8,6 +8,12 @@ paginate: true
footer: "ML101 | TP0: Introduction | Paul-Louis Nech | INTECH 2022-2023"
---
<style>
li {
font-size: 0.8em;
}
</style>
# <!-- fit --> TP0: Introduction
<!-- ### Définitions & Histoire du ML
### méthodes et outils ;
@@ -42,7 +48,7 @@ Sur l'intranet ou à formation@nech.pl
###### Faites une phrase avec vos propres mots pour définir ce que veut dire:
- "Apprendre"
- "Deep Learning"
- "Machine Learning"
- "Précision et Rappel"
- "Overfit"
@@ -87,8 +93,8 @@ Lisez un peu sur cette personne, puis partagez ici quelque-chose qu'elle a dit o
### Lvl 3: Jouer avec des Image Models
- Ouvrez "[ThisPersonDoesNotExist](https://thispersondoesnotexist.com)" et dites en quelques mots votre impression sur la qualité des images générées.
- Ouvrez "[ThisXDoesNotExist](https://thisxdoesnotexist.com)", choisissez un autre modèle, et commentez sa qualité.
---
from datasets import load_dataset


def main():
    # Load the MNIST dataset from the Hugging Face Hub
    dataset = load_dataset("mnist")

    # Inspect
    # print(dataset)
    # dataset["train"][0]['image'].show()
    # dataset["test"][0]['image'].show()

    # Prepare labels mapping to be able to go from id to label
    labels = dataset["train"].features["label"].names
    label2id, id2label = {}, {}
    for i, label in enumerate(labels):
        label2id[label] = str(i)
        id2label[i] = label

    # Now check e.g. the label for image 2: look up its label id, then the name
    label_id = dataset["train"][2]["label"]
    print(f"Image 2 is labelled {id2label[label_id]}")


def use_image_multi_class_classification():
    from transformers import AutoFeatureExtractor, AutoModelForImageClassification

    extractor = AutoFeatureExtractor.from_pretrained("autoevaluate/image-multi-class-classification")
    model = AutoModelForImageClassification.from_pretrained("autoevaluate/image-multi-class-classification")


if __name__ == '__main__':
    main()
matplotlib
datasets
torchvision
transformers
evaluate
---
marp: true
theme: uncover
color: #eee
colorSecondary: #333
backgroundColor: #111
paginate: true
footer: "ML101 | TP1: Choosing a Model | Paul-Louis Nech | INTECH 2022-2023"
---
<style>
li {
font-size: 0.8em;
}
</style>
# <!-- fit --> TP1: Choosing a Model
<!--
Theory: fundamental methods
Practice: choosing a model suited to your problem
-->
---
Objectives:
- Build an intuition for the differences between the major families of Machine Learning models
- Learn how to choose a model suited to your problem
---
Format: Written deliverable (Markdown or Doc file with one section per _Level_)
On the intranet or to formation@nech.pl
<br />
**DEADLINE: January 23rd, 23:59:59**
<br />
> _My mailserver's timestamp being the authoritative record_.
---
## Lvl 0: The basics
![bg right:35% w:300](https://www.meme-arsenal.com/memes/a6effdba5a540560c7b5ee616ee0f1f3.jpg)
<!-- Image credit: World of Warcraft Tutorial boar -->
###### Write one sentence in your own words defining:
- "Supervised Learning"
- "Unsupervised Learning"
- "Layer"
---
## Lvl 1: Intro to Neural Nets by Google
- [Google Colab: Exercises - Intro to Neural Nets](https://colab.research.google.com/github/google/eng-edu/blob/main/ml/cc/exercises/intro_to_neural_nets.ipynb?hl=en#scrollTo=g8HC-TDgB1D1)
:rocket: simple linear regression model VS Neural network :rocket:
---
## Lvl 1.1 : baseline
**Complete the sections up to "_Build a linear regression model as a baseline_"**
<br />
- Share the `loss` and `mean_squared_error` metrics obtained with your linear regression baseline
Example:
```
Evaluate the linear regression model against the test set:
300/300 [==============================] - 1s 2ms/step - loss: 0.4018 - mean_squared_error: 0.4018
[0.40178823471069336, 0.40178823471069336]
```
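The Colab computes these numbers with Keras; the same kind of baseline can be sketched offline with scikit-learn on synthetic data (both the dataset and the library here are assumptions for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic regression data: a linear signal plus Gaussian noise
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.5, size=1000)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
baseline = LinearRegression().fit(X_train, y_train)
mse = mean_squared_error(y_test, baseline.predict(X_test))
print(f"baseline mean_squared_error: {mse:.4f}")
```

The baseline's error floor here is the noise variance (0.5² = 0.25): on average, no linear model can do better than that.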
---
## Lvl 1.2 : beat the baseline :metal:
**Complete the sections up to "_Call the functions to build and train a deep neural net_"**
<br />
- Share the first `loss` and `mean_squared_error` metrics obtained with your neural network model
Example:
```
Evaluate the new model against the test set:
3/3 [==============================] - 0s 7ms/step - loss: 0.3705 - mean_squared_error: 0.3705
[0.3705473244190216, 0.3705473244190216]
```
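Why can a neural network beat a linear baseline? A sketch with scikit-learn's `MLPRegressor` on a deliberately nonlinear target (the library and data are illustrative assumptions, not the Colab's Keras setup):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Nonlinear target: no linear model can capture sin(x0) * x1
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(1000, 2))
y = np.sin(X[:, 0]) * X[:, 1] + rng.normal(scale=0.1, size=1000)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
lin_mse = mean_squared_error(
    y_test, LinearRegression().fit(X_train, y_train).predict(X_test))
mlp = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000,
                   random_state=0).fit(X_train, y_train)
mlp_mse = mean_squared_error(y_test, mlp.predict(X_test))
print(f"linear: {lin_mse:.3f}  neural net: {mlp_mse:.3f}")
```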
---
## Lvl 2 : now beat yourself :smiling_imp:
**Complete the sections up to "_Task 2: Optimize the deep neural network's topography_"**
<br />
- Share the final definition of your neural network
- Share your :muscle: **best result** :muscle: for the `loss` and `mean_squared_error` metrics obtained with your neural network model
- Answer in a few words: what affected the performance of your neural network? What was the impact on its training speed?
---
## Lvl 3 : now beat the real world
How do you do better on the _test_ set, rather than merely learning the _training set_ well?
**Complete the final section "_Task 3: Regularize the deep neural network_"**
<br />
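The intuition behind Task 3 can be sketched outside Keras too: with few data points and many parameters, an unregularized model memorizes the training set, while an L2 penalty (Ridge) trades a little training error for better generalization. A scikit-learn sketch under illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Few points + degree-15 polynomial features = easy to overfit
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(40, 1))
y = X[:, 0] ** 2 + rng.normal(scale=0.1, size=40)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
for name, reg in [("no regularization", LinearRegression()),
                  ("L2 regularization", Ridge(alpha=1.0))]:
    model = make_pipeline(PolynomialFeatures(degree=15), reg)
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"{name}: train={train_mse:.3f} test={test_mse:.3f}")
```

Compare the gap between the train and test numbers in each case: that gap is the overfitting the regularizer is fighting.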
---
## Lvl 3.1 : high score, real world
- Share your :brain: **best real-world result** :brain: for the `loss` and `mean_squared_error` metrics obtained with your regularized neural network model
---
## Bonus: So what is my earlier model worth?
- Answer in a few words: how is regularization useful? What do you conclude about your initial evaluation of the model (the "best metrics" of Lvl 2)?
---
marp: true
theme: uncover
color: #eee
colorSecondary: #333
backgroundColor: #111
paginate: true
footer: "ML101 | TP1: Choosing a Model | Paul-Louis Nech | INTECH 2022-2023"
---
<style>
li {
font-size: 0.8em;
}
</style>
# <!-- fit --> TP1: Choosing a Model
<!--
Theory: fundamental methods
Practice: choosing a model suited to your problem
-->
---
Objectives:
- Theory: fundamental tools
- Practice: training a model
---
Format: Written deliverable (Markdown or Doc file with one section per _Level_)
On the intranet or to formation@nech.pl
<br />
**DEADLINE: January 24th, 23:59:59**
<br />
> _My mailserver's timestamp being the authoritative record_.
---
## Lvl 0: The basics
![bg right:35% w:300](https://www.meme-arsenal.com/memes/a6effdba5a540560c7b5ee616ee0f1f3.jpg)
<!-- Image credit: World of Warcraft Tutorial boar -->
###### Write one sentence in your own words defining:
FOO BAR BAZ
---
## Lvl 1: Multi-class classification with MNIST
- [Google Colab: Exercises - Multi-Class Classification with MNIST](https://colab.research.google.com/github/google/eng-edu/blob/main/ml/cc/exercises/multi-class_classification_with_MNIST.ipynb?hl=en#scrollTo=XuKlphuImFSN)
:rocket: simple linear regression model VS Neural network :rocket:
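Before opening the Colab, the same multi-class setup can be tried on scikit-learn's built-in 8x8 digits dataset (an assumption for illustration; the exercise itself uses the full MNIST in Keras):

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# 1797 grayscale 8x8 digit images, 10 classes (0-9)
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, random_state=0)

# A linear multi-class classifier already does well on this small dataset
clf = LogisticRegression(max_iter=2000).fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.3f}")
```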
from diffusers import DDPMPipeline


def main():
    # Download a pretrained denoising diffusion model (256x256 celebrity faces)
    image_pipe = DDPMPipeline.from_pretrained("google/ddpm-celebahq-256")
    image_pipe.to("cuda")

    # Sample one image from pure noise (slow: runs the full denoising loop)
    image = image_pipe().images[0]
    image.save("generated.png")


if __name__ == '__main__':
    main()