Publications

You can also browse my Google Scholar profile. * denotes equal contribution; + denotes corresponding author.


  1. Masked Diffusion Transformer is a Strong Image Synthesizer
    Shanghua Gao, Pan Zhou+, Ming-Ming Cheng, Shuicheng Yan
    ICCV, 2023, [PDF] [Code]
    State-of-the-art image generation model on ImageNet 256x256; 13x faster learning than DiT (the core of Sora)

  2. EditAnything: Empowering Unparalleled Flexibility in Image Editing and Generation
    Shanghua Gao, Zhijie Lin, Xingyu Xie, Pan Zhou+, Ming-Ming Cheng, Shuicheng Yan
    ACMMM, 2023, [PDF] [Code]
    one of the first methods for highly flexible image editing, e.g., cross-image dragging such as try-on, region-interactive editing, controllable layout generation, and virtual character replacement

  3. Consistent3D: Towards Consistent High-Fidelity Text-to-3D Generation with Deterministic Sampling Prior
    Zike Wu, Pan Zhou+, Xuanyu Yi, Xiaoding Yuan, Hanwang Zhang
    CVPR, 2024, [arXiv] [Code]
    the first ODE-sampling guided Score Distillation Sampling for 3D generation

  4. Prototypical Contrastive Learning of Unsupervised Representations
    Junnan Li, Pan Zhou, Caiming Xiong, Steven Hoi
    ICLR, 2021, [arXiv] [Bibtex] [Blog] [Code], 900+ citations
    the first clustering contrastive learning method to learn high-level semantics, i.e., data cluster structure

  5. MetaFormer Baselines for Vision
    Weihao Yu, Chenyang Si, Pan Zhou, Mi Luo, Yichen Zhou, Jiashi Feng, Shuicheng Yan, Xinchao Wang
    TPAMI & CVPR, 2023, [arXiv] [Code], 600+ citations
    replacing attention with simple pooling still achieves high performance, challenging "attention is all you need" and revealing a network design principle

  6. Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
    Xingyu Xie*, Pan Zhou*, Huan Li, Zhouchen Lin, Shuicheng Yan
    TPAMI, 2024, [PDF] [Code]
    2x faster, state-of-the-art optimizer on 15+ networks such as ResNet, ConvNeXt, ViT, Swin, MAE, BERT, GPT-2, LLaMA, DreamFusion, DiT, and PPO in RL. Included in popular deep-learning codebases such as NVIDIA NeMo for LLMs, HuggingFace timm and OpenMMLab for CV tasks, and Jittor from Tsinghua University for 3D.

  7. Win: Weight-Decay-Integrated Nesterov Acceleration for Faster Network Training
    Pan Zhou, Xingyu Xie, Zhouchen Lin, Kim-Chuan Toh, Shuicheng Yan
    JMLR & ICLR, 2024, [PDF] [Code]
    accelerates AdamW/Adam/LAMB/SGD by 1.5x on vision and language modeling tasks

  8. Towards Theoretically Understanding Why SGD Generalizes Better Than ADAM in Deep Learning
    Pan Zhou, Jiashi Feng, Chao Ma, Caiming Xiong, Steven Hoi, and Weinan E
    NeurIPS, 2020, [PDF] [SUPP] [arXiv] [Bibtex] [Code] [Slides] [Poster], 200+ citations
    the first theory to explain "why SGD generalizes better than ADAM in deep learning"


Full Publications

    2024

  1. Win: Weight-Decay-Integrated Nesterov Acceleration for Faster Network Training
    Pan Zhou, Xingyu Xie, Zhouchen Lin, Kim-Chuan Toh, Shuicheng Yan
    Journal of Machine Learning Research (JMLR), 2024
    [PDF] [Code]

  2. Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
    Xingyu Xie*, Pan Zhou*, Huan Li, Zhouchen Lin, Shuicheng Yan
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
    [PDF] [Code]

  3. Instant3D: Instant Text-to-3D Generation
    Ming Li, Pan Zhou, Jia-Wei Liu, Jussi Keppo, Min Lin, Shuicheng Yan, Xiangyu Xu
    International Journal of Computer Vision (IJCV), 2024
    [PDF] [Code]

  4. Enhancing Visual Grounding in Vision-Language Pre-Training with Position-Guided Text Prompts
    Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
    [PDF] [Code]

  5. Towards Understanding Convergence and Generalization of AdamW
    Pan Zhou, Xingyu Xie, Zhouchen Lin, Shuicheng Yan
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
    [PDF] [Supp]

  6. Unsupervised Modality Adaptation with Text-to-Image Diffusion Models for Semantic Segmentation
    Ruihao Xia, Yu Liang, Peng-Tao Jiang, Hao Zhang, Bo Li, Yang Tang, Pan Zhou
    Neural Information Processing Systems (NeurIPS), 2024
    [arXiv]

  7. LOVA3: Learning to Visual Question Answering, Asking and Assessment
    Hengyuan Zhao, Pan Zhou+, Difei Gao, Mike Zheng Shou
    Neural Information Processing Systems (NeurIPS), 2024
    [arXiv] [Code]

  8. 4-bit Shampoo for Memory-Efficient Network Training
    Sike Wang, Pan Zhou, Jia Li, Hua Huang
    Neural Information Processing Systems (NeurIPS), 2024
    [arXiv]

  9. MVGamba: Unify 3D Content Generation as State Space Sequence Modeling
    Xuanyu Yi, Zike Wu, Qiuhong Shen, Qingshan Xu, Pan Zhou, Joo Hwee Lim, Shuicheng Yan, Xinchao Wang, Hanwang Zhang
    Neural Information Processing Systems (NeurIPS), 2024
    [arXiv]

  10. Genixer: Empowering Multimodal Large Language Models as a Powerful Data Generator
    Henry Hengyuan Zhao, Pan Zhou+, Mike Zheng Shou
    European Conference on Computer Vision (ECCV), 2024
    [PDF] [Code]

  11. Efficient Cascaded Multiscale Adaptive Network for Image Restoration
    Yichen Zhou, Pan Zhou+, Teck Khim Ng
    European Conference on Computer Vision (ECCV), 2024

  12. Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Multimodal Humor Generation
    Shanshan Zhong, Zhongzhan Huang, Shanghua Gao, Wushao Wen, Liang Lin, Marinka Zitnik, Pan Zhou+
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
    [arXiv] [Code]

  13. InceptionNeXt: When Inception Meets ConvNeXt
    Weihao Yu, Pan Zhou, Shuicheng Yan, Xinchao Wang
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
    [arXiv] [Code]

  14. Consistent3D: Towards Consistent High-Fidelity Text-to-3D Generation with Deterministic Sampling Prior
    Zike Wu, Pan Zhou+, Xuanyu Yi, Xiaoding Yuan, Hanwang Zhang
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
    [arXiv] [Code]

  15. Friendly Sharpness-Aware Minimization
    Tao Li, Pan Zhou+, Zhengbao He, Xinwen Cheng, Xiaolin Huang
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
    [arXiv] [Code]

  16. Few-shot Learner Parameterization by Diffusion Time-steps
    Zhongqi Yue, Pan Zhou+, Richang Hong, Hanwang Zhang, Qianru Sun
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
    [arXiv] [Code]

  17. Diffusion Time-step Curriculum for One Image to 3D Generation
    Xuanyu Yi, Zike Wu, Qingshan Xu, Pan Zhou+, Joo Hwee Lim, Hanwang Zhang
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
    [arXiv] [Code]

    2023

  19. MetaFormer Baselines for Vision
    Weihao Yu, Chenyang Si, Pan Zhou, Mi Luo, Yichen Zhou, Jiashi Feng, Shuicheng Yan, Xinchao Wang
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
    [arXiv] [Code]

  20. ScaleLong: Towards More Stable Training of Diffusion Model via Scaling Network Long Skip Connection
    Zhongzhan Huang, Pan Zhou+, Shuicheng Yan, Liang Lin
    Neural Information Processing Systems (NeurIPS), 2023
    [arXiv] [Code]

  21. Masked Diffusion Transformer is a Strong Image Synthesizer
    Shanghua Gao, Pan Zhou+, Ming-Ming Cheng, Shuicheng Yan
    International Conference on Computer Vision (ICCV), 2023
    [PDF] [Code]

  22. EditAnything: Empowering Unparalleled Flexibility in Image Editing and Generation
    Shanghua Gao, Zhijie Lin, Xingyu Xie, Pan Zhou+, Ming-Ming Cheng, Shuicheng Yan
    ACM International Conference on Multimedia (ACMMM), 2023
    [PDF] [Code]

  23. STPrivacy: Spatio-Temporal Privacy-Preserving Action Recognition
    Ming Li, Xiangyu Xu, Hehe Fan, Pan Zhou, Jun Liu, Jia-Wei Liu, Jiahe Li, Jussi Keppo, Mike Zheng Shou, Shuicheng Yan
    International Conference on Computer Vision (ICCV), 2023
    [PDF]

  24. Contrastive Video Question Answering via Video Graph Transformer
    Junbin Xiao, Pan Zhou, Angela Yao, Yicong Li, Richang Hong, Shuicheng Yan, Tat-Seng Chua
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
    [PDF] [Code]

  25. Position-guided Text Prompt for Vision-Language Pre-training
    Alex Jinpeng Wang, Pan Zhou+, Mike Zheng Shou, Shuicheng Yan
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
    [arXiv] [Code]

  26. Win: Weight-Decay-Integrated Nesterov Acceleration for Adaptive Gradient Algorithms
    Pan Zhou, Xingyu Xie, Shuicheng Yan
    International Conference on Learning Representations (ICLR), 2023 (oral)
    [arXiv] [Code]

  27. Towards Understanding Why Mask Reconstruction Pretraining Helps in Downstream Tasks
    Jiachun Pan*, Pan Zhou*, Shuicheng Yan
    International Conference on Learning Representations (ICLR), 2023
    [arXiv]

  28. LPT: Long-tailed Prompt Tuning for Image Classification
    Bowen Dong, Pan Zhou, Shuicheng Yan, Wangmeng Zuo
    International Conference on Learning Representations (ICLR), 2023
    [arXiv] [Code]

  29. Iterative Graph Self-Distillation
    Hanlin Zhang, Shuai Lin, Weiyang Liu, Pan Zhou, Jian Tang, Xiaodan Liang, Eric P. Xing
    IEEE Transactions on Knowledge and Data Engineering (TKDE), 2023
    [arXiv]

    2022

  31. Inception Transformer
    Chenyang Si*, Weihao Yu*, Pan Zhou, Yichen Zhou, Xinchao Wang, Shuicheng Yan
    Neural Information Processing Systems (NeurIPS), 2022 (oral)
    [arXiv] [Code]

  32. Mugs: A Multi-Granular Self-Supervised Learning Framework
    Pan Zhou*, Yichen Zhou*, Chenyang Si*, Weihao Yu, Teck Khim Ng, Shuicheng Yan
    Workshop of Neural Information Processing Systems (NeurIPS), 2022
    [arXiv] [Code]
    Top linear probing and KNN performance on ImageNet without extra data

  33. DualFormer: Local-Global Stratified Transformer for Efficient Video Recognition
    Yuxuan Liang, Pan Zhou, Roger Zimmermann, Shuicheng Yan
    European Conference on Computer Vision (ECCV), 2022
    [arXiv] [Code]

  34. Video Graph Transformer for Video Question Answering
    Junbin Xiao, Pan Zhou, Tat-Seng Chua, Shuicheng Yan
    European Conference on Computer Vision (ECCV), 2022
    [arXiv] [Code]

  35. Self-Promoted Supervision for Few-Shot Transformer
    Bowen Dong, Pan Zhou, Shuicheng Yan, Wangmeng Zuo
    European Conference on Computer Vision (ECCV), 2022
    [arXiv] [Code]

  36. MetaFormer is Actually What You Need for Vision
    Weihao Yu, Mi Luo, Pan Zhou, Chenyang Si, Yichen Zhou, Xinchao Wang, Jiashi Feng, Shuicheng Yan
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022 (oral)
    [arXiv] [Code]

  37. Prototypical Graph Contrastive Learning
    Shuai Lin, Chen Liu, Pan Zhou, Zi-yuan Hu, Shuojia Wang, Ruihui Zhao, Yefeng Zheng, Liang Lin, Eric Xing, Xiaodan Liang
    IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022
    [arXiv] [Code]

    2021

  39. A Theory-Driven Self-Labeling Refinement Method for Contrastive Representation Learning
    Pan Zhou, Caiming Xiong, Xiaotong Yuan, Steven Hoi
    Neural Information Processing Systems (NeurIPS), 2021 (spotlight)
    [PDF] [SUPP] [arXiv] [Bibtex] [Code] [Slides] [Poster]

  40. Towards Understanding Why Lookahead Generalizes Better Than SGD and Beyond
    Pan Zhou, Hanshu Yan, Xiaotong Yuan, Jiashi Feng, Shuicheng Yan
    Neural Information Processing Systems (NeurIPS), 2021
    [PDF] [SUPP] [Bibtex] [Code] [Slides] [Poster]

  41. A Hybrid Stochastic-Deterministic Minibatch Proximal Gradient Method for Efficient Optimization and Generalization
    Pan Zhou, Xiaotong Yuan, Zhouchen Lin, and Steven Hoi
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
    [PDF] [SUPP] [Bibtex]

  42. Task Similarity Aware Meta Learning: Theory-inspired Improvement on MAML
    Pan Zhou, Yingtian Zou, Xiaotong Yuan, Jiashi Feng, Caiming Xiong, and Steven Hoi
    International Conference on Uncertainty in Artificial Intelligence (UAI), 2021 (NeurIPS'20 Meta Learning Workshop Paper)
    [PDF] [SUPP] [Code]

  43. Wav-BERT: Cooperative Acoustic and Linguistic Representation Learning for Low-Resource Speech Recognition
    Guolin Zheng, Yubei Xiao, Ke Gong, Pan Zhou, Xiaodan Liang, and Liang Lin
    Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021 (Findings)
    [arXiv]

  44. How Important is the Train-Validation Split in Meta-Learning?
    Yu Bai, Minshuo Chen, Pan Zhou, Tuo Zhao, Jason D. Lee, Sham Kakade, Huan Wang, Caiming Xiong
    International Conference on Machine Learning (ICML), 2021
    [arXiv]

  45. Prototypical Contrastive Learning of Unsupervised Representations
    Junnan Li, Pan Zhou, Caiming Xiong, and Steven Hoi
    International Conference on Learning Representations (ICLR), 2021
    [arXiv] [Bibtex] [Blog] [Code]

  46. Graph-Evolving Meta-Learning for Low-Resource Medical Dialogue Generation
    Shuai Lin, Pan Zhou, Xiaodan Liang, Jianheng Tang, Ruihui Zhao, Ziliang Chen and Liang Lin
    Association for the Advancement of Artificial Intelligence (AAAI), 2021
    [arXiv] [Bibtex] [Code]

  47. Adversarial Meta Sampling for Multilingual Low-Resource Speech Recognition
    Yubei Xiao, Ke Gong, Pan Zhou, Guolin Zheng, Xiaodan Liang and Liang Lin
    Association for the Advancement of Artificial Intelligence (AAAI), 2021
    [arXiv] [Bibtex]

  48. Efficient Gradient Support Pursuit with Less Hard Thresholding for Cardinality-Constrained Learning
    Fanhua Shang, Bingkun Wei, Hongying Liu, Yuanyuan Liu, Pan Zhou, and Maoguo Gong
    IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2021
    [PDF]

    2020

  50. Theory-Inspired Path-Regularized Differential Network Architecture Search
    Pan Zhou, Caiming Xiong, Richard Socher, and Steven Hoi
    Neural Information Processing Systems (NeurIPS), 2020 (oral)
    [PDF] [SUPP] [arXiv] [Bibtex] [Blog] [Code] [Slides] [Poster]

  51. Towards Theoretically Understanding Why SGD Generalizes Better Than ADAM in Deep Learning
    Pan Zhou, Jiashi Feng, Chao Ma, Caiming Xiong, Steven Hoi, and Weinan E
    Neural Information Processing Systems (NeurIPS), 2020
    [PDF] [SUPP] [arXiv] [Bibtex] [Code] [Slides] [Poster]

  52. Improving GAN Training with Probability Ratio Clipping and Sample Reweighting
    Yue Wu, Pan Zhou, Andrew Gordon Wilson, Eric Xing, and Zhiting Hu
    Neural Information Processing Systems (NeurIPS), 2020
    [PDF] [arXiv] [Bibtex] [Codes]

  53. Hybrid Stochastic-Deterministic Minibatch Proximal Gradient: Less-Than-Single-Pass Optimization with Nearly Optimal Generalization
    Pan Zhou and Xiaotong Yuan
    International Conference on Machine Learning (ICML), 2020
    [PDF] [arXiv] [Bibtex]

    2019

  55. Efficient Meta Learning via Minibatch Proximal Update
    Pan Zhou, Xiaotong Yuan, Huan Xu, Shuicheng Yan, Jiashi Feng
    Neural Information Processing Systems (NeurIPS), 2019 (spotlight)
    [PDF] [SUPP] [Bibtex] [Codes] [Slides] [Poster]

  56. Tensor Low-rank Representation for Data Recovery and Clustering
    Pan Zhou, Canyi Lu, Jiashi Feng, Zhouchen Lin, Shuicheng Yan
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019
    [PDF] [SUPP] [Bibtex] [Codes]

  57. Faster First-Order Methods for Stochastic Non-Convex Optimization on Riemannian Manifolds
    Pan Zhou, Xiaotong Yuan, Shuicheng Yan, Jiashi Feng
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019
    [PDF] [Bibtex]

  58. Generalized Majorization-Minimization for Non-Convex Optimization
    Hu Zhang, Pan Zhou, Yi Yang, Jiashi Feng
    International Joint Conference on Artificial Intelligence (IJCAI), 2019
    [PDF] [Bibtex]

  59. Faster First-Order Methods for Stochastic Non-Convex Optimization on Riemannian Manifolds
    Pan Zhou, Xiaotong Yuan, Jiashi Feng
    International Conference on Artificial Intelligence and Statistics (AISTATS), 2019
    [PDF] [Bibtex]

    2018

  61. Efficient Stochastic Gradient Hard Thresholding
    Pan Zhou, Xiaotong Yuan, Jiashi Feng
    Neural Information Processing Systems (NeurIPS), 2018
    [PDF] [Bibtex] [Codes]

  62. New Insight into Hybrid Stochastic Gradient Descent: Beyond With-Replacement Sampling and Convexity
    Pan Zhou, Xiaotong Yuan, Jiashi Feng
    Neural Information Processing Systems (NeurIPS), 2018
    [PDF] [Bibtex]

  63. Understanding Generalization and Optimization Performance of Deep CNNs
    Pan Zhou, Jiashi Feng
    International Conference on Machine Learning (ICML), 2018
    [PDF] [arXiv] [Bibtex]

  64. Deep Adversarial Subspace Clustering
    Pan Zhou, Yunqing Hou, Jiashi Feng
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018
    [PDF] [Codes] [Bibtex]

  65. Empirical Risk Landscape Analysis for Understanding Deep Neural Networks
    Pan Zhou, Jiashi Feng
    International Conference on Learning Representations (ICLR), 2018
    [PDF] [arXiv] [Bibtex]

  66. Task Relation Networks
    Jianshu Li, Pan Zhou, Yunpeng Chen, Jian Zhao, Sujoy Roy, Shuicheng Yan, Jiashi Feng, and Terence Sim
    IEEE Winter Conference on Applications of Computer Vision (WACV), 2019
    [PDF]

    2017

  68. Outlier-Robust Tensor PCA
    Pan Zhou, Jiashi Feng
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
    [PDF] [SUPP] [Codes] [Bibtex]

  69. Tensor Factorization for Low-Rank Tensor Completion
    Pan Zhou, Canyi Lu, Zhouchen Lin, Chao Zhang
    IEEE Transactions on Image Processing (TIP), 2017
    [PDF] [SUPP] [Codes] [Bibtex]

  70. Dictionary Learning with Structured Noise
    Pan Zhou, Cong Fang, Zhouchen Lin, Chao Zhang, Edward Y. Chang
    Neurocomputing, 2017
    [PDF] [Bibtex]

  71. Feature Learning via Partial Differential Equation with Applications to Face Recognition
    Cong Fang, Zhenyu Zhao, Pan Zhou, Zhouchen Lin
    Pattern Recognition (PR), 2017
    [PDF] [Codes] [Bibtex]

    2016

  73. Bilevel Model Based Discriminative Dictionary Learning for Recognition
    Pan Zhou, Chao Zhang, Zhouchen Lin
    IEEE Transactions on Image Processing (TIP), 2016
    [PDF] [SUPP] [Bibtex]

  74. Integrated Low-Rank-Based Discriminative Feature Learning for Recognition
    Pan Zhou, Zhouchen Lin, Chao Zhang
    IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2016
    [PDF] [SUPP] [Codes] [Bibtex]


Books and Patents
  1. Tensors for Data Processing
    Chapter 6 is contributed by Pan Zhou, Canyi Lu, Zhouchen Lin
    Elsevier, 2022. [PDF]

  2. Neural network based scene text recognition
    Pan Zhou, Peng Tang, Ran Xu, Chu Hong Hoi
    US Patent, 2022. [PDF]

  3. Systems and methods for contrastive learning with self-labeling refinement
    Pan Zhou, Caiming Xiong, Chu Hong Hoi
    US Patent, 2022. [PDF]

  4. System and method for differential architecture search for neural networks
    Pan Zhou, Chu Hong Hoi
    US Patent, 2021. [PDF]