Gedas Bertasius

Assistant Professor

I am an Assistant Professor in the Computer Science department at the University of North Carolina, Chapel Hill. My research interests are in computer vision and machine learning. In particular, I'm interested in video understanding, human behavior modeling, and multi-modal deep learning. I'm also passionate about using computer vision for advanced sports analytics.

Research Overview

Video Recognition

Developing spatiotemporal models for automatic video analysis.

Virtual AI Assistants

Designing video-based AI models that can help people with various daily tasks.

Multimodal Learning

Building models that learn from video, audio, and text.

Computer Vision for Sports

Developing computer vision tools for advanced sports analytics.

Selected Projects

BIMBA: Selective-Scan Compression for Long-Range Video Question Answering

Md Mohaiminul Islam, Tushar Nagarajan, Huiyu Wang, Gedas Bertasius, Lorenzo Torresani

CVPR 2025 (1st Place Winner at CVPR 2025 Ego4D EgoSchema Challenge)

[arxiv] [project page] [code] [model] [demo] [bibtex]

BASKET: A Large-Scale Video Dataset for Fine-Grained Skill Estimation

Yulu Pan, Ce Zhang, Gedas Bertasius

CVPR 2025

[arxiv] [project page] [code] [data] [bibtex]

A Simple LLM Framework for Long-Range Video Question-Answering

Ce Zhang, Taixi Lu, Md Mohaiminul Islam, Ziyang Wang, Shoubin Yu, Mohit Bansal, Gedas Bertasius

EMNLP 2024

[arxiv] [code] [bibtex]

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Gedas Bertasius, ... , Michael Wray

CVPR 2024

[arxiv] [project website] [blog] [video] [bibtex]

Video ReCap: Recursive Captioning of Hour-Long Videos

Md Mohaiminul Islam, Ngan Ho, Xitong Yang, Tushar Nagarajan, Lorenzo Torresani, Gedas Bertasius

CVPR 2024 (Egocentric Vision (EgoVis) Distinguished Paper Award)

[arxiv] [project website] [code] [dataset] [bibtex]

VindLU: A Recipe for Effective Video-and-Language Pretraining

Feng Cheng, Xizi Wang, Jie Lei, David Crandall, Mohit Bansal, Gedas Bertasius

CVPR 2023

[arxiv] [code] [bibtex]

Is Space-Time Attention All You Need for Video Understanding?

Gedas Bertasius, Heng Wang, Lorenzo Torresani

ICML 2021

[arxiv] [code] [talk] [slides] [blog] [VentureBeat] [SiliconAngle] [bibtex]

All Publications
