On May 6, 2025, the BLISS Speaker Series presents Timothée Darcet, a PhD student at Meta AI and Inria, discussing "CAPI: Cluster and Predict Latent Patches for Improved Masked Image Modeling." This 45-minute talk introduces CAPI, a novel pure masked image modeling (MIM) framework that rethinks target representations, loss functions, and architectures. CAPI uses a clustering-based loss that is stable to train and shows promising scalability. With a ViT-L backbone, CAPI achieves 83.8% accuracy on ImageNet and 32.1% mIoU on ADE20K using simple linear probes, outperforming previous MIM methods and approaching the performance of the current state of the art, DINOv2. Attendees are encouraged to arrive early, as doors close promptly at 7:15 PM. The event includes networking opportunities and complimentary drinks. RSVP via Meetup is required for entry.
More: BLISS Speaker Series Episode #17