Master 2 ID3D - Analyse, Traitement d’Image et Vision 3D #

Master 2 IA - Apprentissage Machine Et Image #

Master 2 DISS - Fundamentals Of Image Processing And Interpretation #

Teachers: Alexandre Meyer, Julie Digne, Nicolas Bonneel - LIRIS, Université Claude Bernard Lyon 1

Objective #

This page contains material for several courses on 'Machine Learning and Image/Geometry' in the Master of Computer Science at the University of Lyon 1 (ATIV3D for ID3D, and 'Learning and Image' for IA/DISS). The course takes place in autumn. Its aim is to provide an overview of machine learning (particularly deep learning) for image and geometry problems. The course begins with the classic image-related problems, such as classification, descriptor extraction, pattern recognition, object tracking and segmentation, then moves on to generative methods. A wide range of network types (CNN, auto-encoder, LSTM, GAN, Transformer, diffusion, etc.) is covered, focusing on image data but also on data such as point clouds, meshes, animation (skeletons), colour palettes, etc.

For the IA Master’s options, the slides are here.

Topics #

Deep learning and images #

  • Basics: training, latent space, regularization, etc.
  • Convolutional neural networks (CNN)
  • Segmentation (U-Net, etc.)
  • Tracking (YOLO)
  • Skeleton (OpenPose, XNect, etc.)
  • Transformers and attention for classification (ViT)
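To give a first intuition of what a convolutional layer computes (an illustrative sketch, not part of the course material), here is a minimal NumPy implementation of a valid 2-D convolution, applied with a Sobel-like edge-detection kernel:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation, as computed by a CNN layer (no padding, stride 1)."""
    h, w = kernel.shape
    H, W = image.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + h, j:j + w] * kernel)
    return out

# A step edge: the left columns are 0, the right columns are 1.
img = np.zeros((5, 5))
img[:, 2:] = 1.0

# Sobel-like kernel responding to vertical edges.
k = np.array([[-1., 0., 1.],
              [-2., 0., 2.],
              [-1., 0., 1.]])

resp = conv2d(img, k)  # strong response near the edge, zero on flat regions
```

In a real network the kernel weights are learned from data rather than hand-designed, and frameworks compute this far more efficiently, but the sliding dot product is exactly this.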

Generative deep learning #

  • Image generation
    • auto-encoder
    • GAN
    • Diffusion
  • Extended to 3D data
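As a flavour of the diffusion part, below is a minimal NumPy sketch of the DDPM forward (noising) process; the linear beta schedule values are assumed here purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear beta schedule (assumed values, for illustration only).
T = 100
betas = np.linspace(1e-4, 0.02, T)
alphas_bar = np.cumprod(1.0 - betas)  # cumulative signal-retention factor

def q_sample(x0, t):
    """Forward diffusion: x_t ~ N(sqrt(abar_t) * x0, (1 - abar_t) * I), in closed form."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * noise

x0 = np.ones(4)            # a toy "clean" signal
x_mid = q_sample(x0, T // 2)
x_end = q_sample(x0, T - 1)  # mostly noise by the last step
```

Training then amounts to teaching a network to predict the added noise at each step, so that generation can run this process in reverse.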

Deep learning and geometry #

  • Geometric data
    • Point clouds (PointNet, etc.)
    • Meshes (MeshConv, etc.)
    • Diffusion on surfaces
  • Implicit neural representations (IGR, SIREN)
  • Neural radiance fields (NeRF)
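As a taste of implicit neural representations, here is a tiny SIREN-style coordinate network sketched in NumPy: sine activations let such networks fit high-frequency signals. The weights below are random (an untrained sketch, not the course's implementation):

```python
import numpy as np

rng = np.random.default_rng(1)

def siren_layer(x, W, b, omega=30.0):
    """One SIREN layer: a linear map followed by a scaled sine activation."""
    return np.sin(omega * (x @ W + b))

# Random weights for a 2-layer network mapping 2-D coordinates to one value
# (e.g. a signed distance or an image intensity). Untrained, for shape only.
W1 = rng.uniform(-1, 1, (2, 16)) / 2
b1 = np.zeros(16)
W2 = rng.uniform(-1, 1, (16, 1)) * np.sqrt(6 / 16) / 30
b2 = np.zeros(1)

coords = np.array([[0.0, 0.0], [0.5, -0.5]])  # query points
h = siren_layer(coords, W1, b1)
value = h @ W2 + b2                            # one scalar per query point
```

The key idea is that the network itself *is* the representation: the shape or image is stored in the weights, and querying it means evaluating the network at a coordinate.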

Optimal transport #

  • Introduction to optimal transport
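In one dimension, optimal transport between two equally weighted empirical distributions has a closed form: the optimal plan simply matches sorted samples. A minimal NumPy sketch of the squared Wasserstein-2 distance that follows from this:

```python
import numpy as np

def w2_squared_1d(x, y):
    """Squared Wasserstein-2 distance between two equal-size 1-D samples:
    sort both, then average the squared gaps between matched values."""
    xs, ys = np.sort(x), np.sort(y)
    return np.mean((xs - ys) ** 2)

x = np.array([0.0, 1.0, 2.0])
y = np.array([3.0, 1.0, 2.0])  # a permutation of x shifted by one

d = w2_squared_1d(x, y)  # each sorted pair differs by exactly 1
```

In higher dimensions no such sorting shortcut exists, which is what makes the general optimal transport problem (and its entropic or sliced approximations) interesting.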

Timetable: autumn 2025 #

For IA and DISS, courses and labs are on Thursday afternoons from October to January. For ID3D, there are also some courses on Tuesday mornings starting in September; then, from October, the lectures are on Thursday afternoons and the labs on Tuesday mornings.

Evaluations #

  • ID3D: Lab exam (50%) + courses exam (50%)
    • Questions on all labs/courses including parts on 3D Vision
  • IA: Lab SkeletonDance (20%) + Lab exam (40%) + courses exam (40%)
  • DISS: Lab SkeletonDance (15%) + Lab exam (30%) + courses exam (30%) + articles presentations (25%)

Lab and course exam questions may cover all labs, including the “SkeletonDance” lab.

Dates #

  • 19/12/2025: submission of the “SkeletonDance” lab in TOMUSS (IA, DISS)
  • 08/01/2026: Lab(TP) exam (IA, DISS, ID3D), 14h-15h, Amphi Jordan (Braconnier)
  • 15/01/2026: Exam (IA, DISS, ID3D), 14h-15h30, Amphi Jordan (Braconnier)
  • 22/01/2026: Papers presentations (DISS), 14h, room C003 (all DISS students must be present)
  • 29/01/2026: Papers presentations (DISS), 14h, room C002 (all DISS students must be present)

Skeleton Dance Lab #

IA and DISS: the “Synthesis of a person’s image guided by posture” lab must be submitted. This lab can be done alone or in pairs.

Deadline: Friday 19th December. You will upload a ZIP file containing all your files to TOMUSS, including the trained network and a short demo video. No report is required, but the README.md must contain all the information:

  • a 2-minute video showing a demo of the code running;
  • how to run the code with the trained network;
  • how to train the networks;
  • an explanation, in English or French, of exactly what you did.

Papers presentation for DISS #

Instructions #

In TOMUSS, enter your group name (groups of 2 or 3 students). All members of a group must use exactly the same group name. Each group must choose one paper from the list below by selecting its identifier in TOMUSS. Attendance is mandatory for all students on both Thursday afternoons. Each group will present its chosen paper in front of all students: 15 minutes for the presentation, 10 minutes for questions.

Your presentation must include:

  • An introduction of the problem addressed by the paper.
  • An explanation of why the paper is important, including:
    • what was done before this paper (previous approaches or limitations).
  • A detailed explanation of the paper itself:
    • main ideas, methods, and contributions.
  • A discussion of what was done after the paper:
    • impact, follow-up work, or how it influenced later research.
  • Concrete examples of real applications based on or inspired by the paper. Important: one student in the group may take the role of experimenter and present what they were able to reproduce using the code available on GitHub.

Papers list #

(a1) An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale Alexey Dosovitskiy et al., ICLR 2021 https://openreview.net/forum?id=YicbFdNTTy

(a2) Masked Autoencoders Are Scalable Vision Learners Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross Girshick CVPR 2022 https://openaccess.thecvf.com/content/CVPR2022/html/He_Masked_Autoencoders_Are_Scalable_Vision_Learners_CVPR_2022_paper.html

(a3) Denoising Diffusion Probabilistic Models (DDPM), Ho et al., NeurIPS 2020, or Denoising Diffusion Implicit Models, Song et al., ICLR 2021 https://openreview.net/forum?id=St1giarCHLP

(a4) Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding, Chitwan Saharia, William Chan, et al., NeurIPS 2022 https://openreview.net/forum?id=08Yk-n5l2Al

(a5) Classifier-Free Diffusion Guidance, Ho et al., NeurIPS 2021 Workshop on Deep Generative Models and Downstream Applications. https://arxiv.org/abs/2207.12598

(a6) EfficientDet: Scalable and Efficient Object Detection, Mingxing Tan et al., CVPR 2020 (an alternative to YOLO)

(a7) Segmenter: Transformer for Semantic Segmentation Strudel et al. ICCV 2021 https://openaccess.thecvf.com/content/ICCV2021/papers/Strudel_Segmenter_Transformer_for_Semantic_Segmentation_ICCV_2021_paper.pdf

(a8) Radiance Surfaces: Optimizing Surface Representations with a 5D Radiance Field Loss. Ziyi Zhang, Nicolas Roussel, Thomas Muller, Tizian Zeltner, Merlin Nimier-David, Fabrice Rousselle, Wenzel Jakob

(a9) Learning Joint Surface Atlases https://arxiv.org/pdf/2206.06273.pdf

(a10) Generative Escher Meshes https://arxiv.org/pdf/2309.14564

(a11) Gaussian Cube: A Structured and Explicit Radiance Representation for 3D Generative Modelling https://arxiv.org/pdf/2403.19655

(a12) Tensor field networks: Rotation- and translation-equivariant neural networks for 3D point clouds https://arxiv.org/abs/1802.08219

(a13) High‑Resolution Image Synthesis with Latent Diffusion Models https://arxiv.org/abs/2112.10752

(a14) End‑to‑End Object Detection with Transformers (DETR) https://arxiv.org/abs/2005.12872

(a15) NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis https://arxiv.org/abs/2003.08934