- Video understanding: temporal modelling, optical flow (Lucas-Kanade, Farnebäck), two-stream networks
- Video architectures: 3D CNNs (C3D, I3D), TimeSformer, VideoMAE, SlowFast networks
- Action recognition and temporal action detection
- Video object tracking: SORT, DeepSORT, ByteTrack
- 3D vision: depth estimation (monocular, stereo), point clouds, NeRFs, 3D Gaussian splatting
- SLAM: visual odometry, feature-based SLAM (ORB-SLAM), LiDAR SLAM
- VR/AR: pose estimation, scene reconstruction, real-time rendering considerations