Hello, I'm Nikita Karaev

I am a PhD student at Meta AI and the Visual Geometry Group, University of Oxford, advised by Christian Rupprecht, Natalia Neverova and Andrea Vedaldi. Before my PhD, I did an engineering program at École Polytechnique in beautiful Paris. I also enjoy running and exploring mountains 🏔️


News

  • Mar 2025: We released VGGT, a feed-forward neural net that directly predicts all key 3D attributes of a scene. Try our HF demo!
  • Oct 2024: We're releasing CoTracker3, a new point tracking model trained on real data.
  • Jul 2024: CoTracker has been accepted at ECCV 2024!
  • Mar 2024: VGGSfM has been accepted at CVPR 2024 as a highlight!
  • Jan 2024: CoTracker now supports tracking of 10x more points.
  • Aug 2023: We released CoTracker, a model for tracking any pixel in a video.
  • Mar 2023: My first PhD paper, DynamicStereo, has been accepted at CVPR 2023!
  • Jan 2022: I started my PhD at Meta AI and Oxford!
  • Sep 2021: We climbed Mount Elbrus, the highest mountain in Europe! 🏔️ 5642m
  • Aug 2021: I completed my internship at FAIR.
  • May 2021: I started a research internship at Facebook AI Research with Natalia Neverova and Andrea Vedaldi.

Publications

VGGT: Visual Geometry Grounded Transformer

CVPR 2025

VGGT is a feed-forward neural net that directly predicts all key 3D attributes of a scene: cameras, point maps, depth maps, and 3D point tracks, from one, a few, or hundreds of its views, within seconds.

CoTracker3: Simpler and Better Point Tracking by Pseudo-Labelling Real Videos

In CoTracker3, we propose a simple and effective method for scaling synthetic-trained point trackers on real data.

Visual Geometry Grounded Deep Structure From Motion

CVPR 2024 (Highlight)

We propose a new fully differentiable Structure-from-Motion pipeline.

CoTracker: It is Better to Track Together

ECCV 2024

CoTracker bridges the gap between long-term point tracking and optical flow by jointly tracking multiple points (pixels) throughout an entire video.

DynamicStereo: Consistent Dynamic Depth from Stereo Videos

CVPR 2023

We introduce Dynamic Replica, a synthetic benchmark dataset for dynamic depth-from-stereo models, and propose DynamicStereo, a temporally consistent disparity estimation model that we train on this dataset.