This is an advanced seminar course that will focus on the latest research on transformer models for visual recognition. The course will consist of research paper presentations and a semester-long course project. Topics will include vision transformers, MLP-based models, self-supervised learning, multi-modal learning, and various image- and video-based applications. A background in deep learning is required.
Administrative Information
- Instructor: Gedas Bertasius
- Time: Mon & Wed 11 am - 12:15 pm
- Location: FB 009
- Office Hours: by appointment
- Canvas Site: https://uncch.instructure.com/courses/49024
Grading
- Class Participation: 10%
- Paper Critiques: 20%
- Paper Presentations: 30%
- Course Project: 40%
Course Policies
- Class Participation: Please come to class prepared for a paper discussion with your peers. Furthermore, please do not discuss the papers with your peers before the class. I'm interested in hearing your own opinion about the papers.
- Late Submissions: The class is structured around a tight paper presentation schedule. Therefore, late assignments will not be accepted.
- Academic Integrity: For your presentations and projects, you are allowed to use materials from external sources. However, you must clearly acknowledge those sources.