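A quick guide to the center's ICCV 2023 papers
From classic problems such as image segmentation and object detection to more theoretical topics such as data representation and "machine unlearning", the papers that the center's researchers presented at ICCV showcase the diversity of its computer vision research.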
Papers by topic
3D labeling
HAL3D: Hierarchical active learning for fine-grained 3D part labeling Fenggen Yu, Yiming Qian, Francisca Gil Ureta, Brian Jackson, Eric Bennett, Richard Zhang
ImGeoNet: Image-induced geometry-aware voxel representation for multi-view 3D object detection Tao Tu, Shun-Po Chuang, Yu-Lun Liu, Cheng Sun, Ke Zhang, Donna Roy, Cheng-Hao Kuo, Min Sun
Action recognition
SkeleTR: Towards skeleton-based action recognition in the wild Haodong Duan, Mingze Xu, Bing Shuai, Davide Modolo, Zhuowen Tu, Joe Tighe, Alessandro Bergamo
Data representation
Linear spaces of meanings: Compositional structures in vision-language models Matthew Trager, Pramuditha Perera, Luca Zancato, Alessandro Achille, Parminder Bhatia, Stefano Soatto
Motion-guided masking for spatiotemporal representation learning David Fan, Jue Wang, Leo Liao, Yi Zhu, Vimal Bhat, Hector Santos, Rohith Mysore Vijaya Kumar, Xinyu (Arthur) Li
Dubbed-video generation
SIDGAN: High-resolution dubbed video generation via shift-invariant learning Urwa Muaz, Wondong Jang, Rohun Tripathi, Santhosh Mani, Wenbin Ouyang, Ravi Teja Gadde, Baris Gecer, Sergio Elizondo, Reza Madad, Naveen Nair
Geospatial foundation models
Towards geospatial foundation models via continual pretraining Matias Mendieta, Boran Han, Xingjian Shi, Yi Zhu, Chen Chen
Graph neural networks
Learning adaptive neighborhoods for graph neural networks Avi Saha, Oscar Mendez, Chris Russell, Richard Bowden
Image retrieval
FashionNTM: Multi-turn fashion image retrieval via cascaded memory Anwesan Pal, Sahil Wadhwa, Ayush Jaiswal, Xu Zhang, Yue Wu, Rakesh Chada, Pradeep Natarajan, Henrik I. Christensen
Image segmentation
Coarse-to-fine amodal segmentation with shape prior Jianxiong Gao, Xuelin Qian, Yikai Wang, Tianjun Xiao, Tong He, Zheng Zhang, Yanwei Fu
LD-ZNet: A latent diffusion approach for text-based image segmentation Koutilya PNVR, Bharat Singh, Pallabi Ghosh, Behjat Siddiquie, David Jacobs
Rethinking amodal video segmentation from learning supervised signals with object-centric representation Ke Fan, Jingshi Lei, Xuelin Qian, Miaopeng Yu, Tianjun Xiao, Tong He, Zheng Zhang, Yanwei Fu
Information extraction
DocTr: Document transformer for structured information extraction in documents Haofu Liao, Aruni RoyChowdhury, Weijian Li, Ankan Bansal, Yuting Zhang, Zhuowen Tu, Ravi Kumar Satzoda, R. Manmatha, Vijay Mahadevan
Machine unlearning
SAFE: Machine unlearning with shard graphs Yonatan Dukler, Ben Bowman, Alessandro Achille, Aditya Golatkar, Ashwin Swaminathan, Stefano Soatto
Object detection
Bidirectional alignment for domain adaptive detection with transformers Liqiang He, Wei Wang, Albert Chen, Min Sun, Cheng-Hao Kuo, Sinisa Todorovic
Unsupervised open-vocabulary object localization in videos Ke Fan, Zechen Bai, Tianjun Xiao, Dominik Zietlow, Max Horn, Zixu Zhao, Carl-Johann Simon-Gabriel, Mike Zheng Shou, Francesco Locatello, Bernt Schiele, Thomas Brox, Zheng Zhang, Yanwei Fu, Tong He
Object tracking
Object-centric multiple object tracking Zixu Zhao, Jiaze Wang, Max Horn, Yizhuo Ding, Tong He, Zechen Bai, Dominik Zietlow, Carl-Johann Simon-Gabriel, Bing Shuai, Zhuowen Tu, Thomas Brox, Bernt Schiele, Yanwei Fu, Francesco Locatello, Zheng Zhang, Tianjun Xiao
Scene text recognition
CLIPTER: Looking at the bigger picture in scene text recognition Aviad Aberdam, David Haim Bensaid, Alona Golts, Roy Ganz, Oren Nuriel, Royee Tichauer, Shai Mazor, Ron Litman
Towards models that can see and read Roy Ganz, Oren Nuriel, Aviad Aberdam, Yair Kittenplon, Shai Mazor, Ron Litman
Transfer learning
PADCLIP: Pseudo-labeling with adaptive debiasing in CLIP for unsupervised domain adaptation Zhengfeng Lai, Sol Vesdapunt, Ning Zhou, Jun Wu, Cong Phuoc Huynh, Xuelu Li, Kah Kuen Fu, Chen-Nee Chuah
Video retrieval
Audio-enhanced text-to-video retrieval using text-conditioned feature alignment Sarah Ibrahimi, Xiaohang Sun, Pichao Wang, Amanmeet Garg, Ashutosh Sanan, Mohamed Omar
Video segmentation
MEGA: Multimodal alignment aggregation and distillation for cinematic video segmentation Najmeh Sadoughi, Xinyu (Arthur) Li, Avijit Vajpayee, David Fan, Bing Shuai, Hector Santos, Vimal Bhat, Rohith Mysore Vijaya Kumar