Want to understand the AI model that's actually behind Harry Potter by Balenciaga or the infamous image of the Pope in the puffer jacket? Well, diffusion models such as DALL-E 2, Midjourney, Imagen and Stable Diffusion get most of the credit, whereas the true unsung hero of the story is the U-Net architecture that diffusion models like these typically use under the hood. Don't get me wrong, diffusion models are awesome, but the U-Net is an absolute STAPLE of computer vision, and this video aims to break it down in an easy way. Originally designed for biomedical image segmentation, the U-Net has developed into so much more. Happy watching!
U-Net paper: https://arxiv.org/abs/1505.04597
Many thanks to the numerous online resources that helped me create this video.