A guide to computer vision research at ICCV 2023

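This article covers the computer vision research that the center published at ICCV 2023, spanning frontier areas such as 3D annotation, action recognition, data representation, and video generation, along with specific topics including object detection, image segmentation, and machine unlearning.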

A quick guide to the center's ICCV 2023 papers

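From classic problems such as image segmentation and object detection to more theoretical topics such as data representation and "machine unlearning", the papers that the center's researchers published at ICCV show the diversity of their computer vision research.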

Papers by topic

3D annotation

HAL3D: Hierarchical active learning for fine-grained 3D part labeling Fenggen Yu, Yiming Qian, Francisca Gil Ureta, Brian Jackson, Eric Bennett, Richard Zhang

ImGeoNet: Image-induced geometry-aware voxel representation for multi-view 3D object detection Tao Tu, Shun-Po Chuang, Yu-Lun Liu, Cheng Sun, Ke Zhang, Donna Roy, Cheng-Hao Kuo, Min Sun

Action recognition

SkeleTR: Towards skeleton-based action recognition in the wild Haodong Duan, Mingze Xu, Bing Shuai, Davide Modolo, Zhuowen Tu, Joe Tighe, Alessandro Bergamo

Data representation

Linear spaces of meanings: Compositional structures in vision-language models Matthew Trager, Pramuditha Perera, Luca Zancato, Alessandro Achille, Parminder Bhatia, Stefano Soatto

Motion-guided masking for spatiotemporal representation learning David Fan, Jue Wang, Leo Liao, Yi Zhu, Vimal Bhat, Hector Santos, Rohith Mysore Vijaya Kumar, Xinyu (Arthur) Li

Dubbed-video generation

SIDGAN: High-resolution dubbed video generation via shift-invariant learning Urwa Muaz, Wondong Jang, Rohun Tripathi, Santhosh Mani, Wenbin Ouyang, Ravi Teja Gadde, Baris Gecer, Sergio Elizondo, Reza Madad, Naveen Nair

Geospatial foundation models

Towards geospatial foundation models via continual pretraining Matias Mendieta, Boran Han, Xingjian Shi, Yi Zhu, Chen Chen

Graph neural networks

Learning adaptive neighborhoods for graph neural networks Avi Saha, Oscar Mendez, Chris Russell, Richard Bowden

Image retrieval

FashionNTM: Multi-turn fashion image retrieval via cascaded memory Anwesan Pal, Sahil Wadhwa, Ayush Jaiswal, Xu Zhang, Yue Wu, Rakesh Chada, Pradeep Natarajan, Henrik I. Christensen

Image segmentation

Coarse-to-fine amodal segmentation with shape prior Jianxiong Gao, Xuelin Qian, Yikai Wang, Tianjun Xiao, Tong He, Zheng Zhang, Yanwei Fu

LD-ZNet: A latent diffusion approach for text-based image segmentation Koutilya PNVR, Bharat Singh, Pallabi Ghosh, Behjat Siddiquie, David Jacobs

Rethinking amodal video segmentation from learning supervised signals with object-centric representation Ke Fan, Jingshi Lei, Xuelin Qian, Miaopeng Yu, Tianjun Xiao, Tong He, Zheng Zhang, Yanwei Fu

Information extraction

DocTr: Document transformer for structured information extraction in documents Haofu Liao, Aruni RoyChowdhury, Weijian Li, Ankan Bansal, Yuting Zhang, Zhuowen Tu, Ravi Kumar Satzoda, R. Manmatha, Vijay Mahadevan

Machine unlearning

SAFE: Machine unlearning with shard graphs Yonatan Dukler, Ben Bowman, Alessandro Achille, Aditya Golatkar, Ashwin Swaminathan, Stefano Soatto

Object detection

Bidirectional alignment for domain adaptive detection with transformers Liqiang He, Wei Wang, Albert Chen, Min Sun, Cheng-Hao Kuo, Sinisa Todorovic

Unsupervised open-vocabulary object localization in videos Ke Fan, Zechen Bai, Tianjun Xiao, Dominik Zietlow, Max Horn, Zixu Zhao, Carl-Johann Simon-Gabriel, Mike Zheng Shou, Francesco Locatello, Bernt Schiele, Thomas Brox, Zheng Zhang, Yanwei Fu, Tong He

Object tracking

Object-centric multiple object tracking Zixu Zhao, Jiaze Wang, Max Horn, Yizhuo Ding, Tong He, Zechen Bai, Dominik Zietlow, Carl-Johann Simon-Gabriel, Bing Shuai, Zhuowen Tu, Thomas Brox, Bernt Schiele, Yanwei Fu, Francesco Locatello, Zheng Zhang, Tianjun Xiao

Scene text recognition

CLIPTER: Looking at the bigger picture in scene text recognition Aviad Aberdam, David Haim Bensaid, Alona Golts, Roy Ganz, Oren Nuriel, Royee Tichauer, Shai Mazor, Ron Litman

Towards models that can see and read Roy Ganz, Oren Nuriel, Aviad Aberdam, Yair Kittenplon, Shai Mazor, Ron Litman

Transfer learning

PADCLIP: Pseudo-labeling with adaptive debiasing in CLIP for unsupervised domain adaptation Zhengfeng Lai, Sol Vesdapunt, Ning Zhou, Jun Wu, Cong Phuoc Huynh, Xuelu Li, Kah Kuen Fu, Chen-Nee Chuah

Video retrieval

Audio-enhanced text-to-video retrieval using text-conditioned feature alignment Sarah Ibrahimi, Xiaohang Sun, Pichao Wang, Amanmeet Garg, Ashutosh Sanan, Mohamed Omar

Video segmentation

MEGA: Multimodal alignment aggregation and distillation for cinematic video segmentation Najmeh Sadoughi, Xinyu (Arthur) Li, Avijit Vajpayee, David Fan, Bing Shuai, Hector Santos, Vimal Bhat, Rohith Mysore Vijaya Kumar
