Video and 3D Vision

  • Video understanding: temporal modelling, optical flow (Lucas-Kanade, Farnebäck), two-stream networks
  • Video architectures: 3D CNNs (C3D, I3D, SlowFast), video transformers (TimeSformer, VideoMAE)
  • Action recognition and temporal action detection
  • Multi-object tracking in video: SORT, DeepSORT, ByteTrack
  • 3D vision: depth estimation (monocular, stereo), point clouds, neural radiance fields (NeRF), 3D Gaussian splatting
  • SLAM: visual odometry, feature-based SLAM (e.g. ORB-SLAM), LiDAR SLAM
  • VR/AR: pose estimation, scene reconstruction, real-time rendering considerations
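
As a concrete taste of the tracking topic, the sketch below shows the association step that tracking-by-detection methods such as SORT build on: matching existing track boxes to new detections by intersection over union (IoU). It is a simplified illustration, not the SORT algorithm itself. SORT pairs a Kalman-filter motion model with Hungarian assignment, whereas this sketch uses a greedy match and no motion model. The function names, the (x1, y1, x2, y2) box format, and the 0.3 threshold are assumptions chosen for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_match(tracks, detections, iou_threshold=0.3):
    """Greedily pair track boxes with detection boxes, highest IoU first.

    Returns a list of (track_index, detection_index) pairs. This is a
    hypothetical stand-in for the optimal (Hungarian) assignment that
    SORT-style trackers normally use.
    """
    # Score every track/detection pair and visit them from best to worst.
    pairs = sorted(
        (
            (iou(t, d), ti, di)
            for ti, t in enumerate(tracks)
            for di, d in enumerate(detections)
        ),
        reverse=True,
    )
    matches, used_tracks, used_dets = [], set(), set()
    for score, ti, di in pairs:
        if score < iou_threshold:
            break  # all remaining pairs overlap too little to match
        if ti in used_tracks or di in used_dets:
            continue  # each track and detection is matched at most once
        matches.append((ti, di))
        used_tracks.add(ti)
        used_dets.add(di)
    return matches


if __name__ == "__main__":
    tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
    detections = [(21, 21, 31, 31), (1, 1, 11, 11)]
    print(greedy_match(tracks, detections))
```

Tracks left unmatched would typically be aged out after a few frames, and unmatched detections would start new tracks. Both steps are omitted here to keep the sketch short.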