Hello, I am Yidong Wang [i:doʊn wɑ:n] (王一栋). I have published several papers at top international AI conferences and journals. Details of my publications can be found on Google Scholar. If you are interested in collaboration, feel free to email me at yidongwang37[at]gmail.com (please replace [at] with @).

My Research Philosophy: The 3A Framework

Algorithm → Assessment → Application: the three work together to advance trustworthy AI (all publications mentioned below are my (co-)first-author works):

Novel algorithms to address real-world challenges
Curriculum thresholding for semi-supervised learning (SSL) (FlexMatch, NeurIPS 2021); Self-adaptive threshold optimization for SSL (FreeMatch, ICLR 2023); Imbalanced vision-language adaptation (VLMs+Decoder, IJCV 2024); Margin calibration for imbalanced learning (MARC, ACML 2022).
Rigorous assessment to ensure reliability
Unified SSL benchmark (USB, NeurIPS 2022); Privacy-preserving LLM-as-a-judge evaluation (PandaLM, ICLR 2024).
Practical applications to create societal value
Low-resource sentiment word extraction (TOWE, COLING 2022); LLMs for decoding gene-cell interactions (LLM4Genes, TIST 2024); Automated research review synthesis (AutoSurvey, NeurIPS 2024).

🔥 News

2023.09: 🎉🎉 I became a Ph.D. student at Peking University.
2023.08: 🎉🎉 I finished my internship at Westlake University.
2022.10: 🎉🎉 I finished my internship at MSRA.

📖 Education

2023.09 - 2027.06, doctoral student at the National Engineering Research Center for Software Engineering, Peking University, advised by Prof. Shikun Zhang and Prof. Wei Ye.
2020.09 - 2022.10, master's student in the Department of Information and Communications Engineering of Tokyo Institute of Technology, advised by Prof. Takahiro Shinozaki.
2015.09 - 2019.06, undergraduate student in the Department of Computer Science and Technology of Nanjing University, advised by Prof. Xinyu Dai.

💼 Internships

2022.05 - 2022.10, Microsoft Research Asia, advised by Dr. Jindong Wang.
2022.02 - 2022.05, Westlake University, advised by Prof. Yue Zhang.
2021.11 - 2022.02, Microsoft Research Asia, advised by Dr. Jindong Wang.

🔖 Selected Publications

(* means equal contribution)

(10) TrustJudge: Inconsistencies of LLM-as-a-Judge and How to Alleviate Them. [paper].

Yidong Wang*, Yunze Song*, Tingyuan Zhu, Xuanwang Zhang, Zhuohao Yu, Hao Chen, Chiyu Song, Qiufeng Wang, Cunxiang Wang, Zhen Wu, Xinyu Dai, Yue Zhang, Wei Ye, Shikun Zhang.

International Conference on Learning Representations 2026 (``ICLR 2026``).
(9) AutoSurvey: Large Language Models Can Automatically Write Surveys. [paper].

Yidong Wang*, Qi Guo*, Wenjin Yao, Hongbo Zhang, Xin Zhang, Zhen Wu, Meishan Zhang, Xinyu Dai, Min Zhang, Qingsong Wen, Wei Ye, Shikun Zhang, Yue Zhang.

Advances in Neural Information Processing Systems 2024 (``NeurIPS 2024``).

``It ranked 262 out of 4829 papers in terms of citations at NeurIPS 2024 (Top 6%).`` [citation evidence]
(8) PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization. [paper].

Yidong Wang*, Zhuohao Yu*, Wenjin Yao, Zhengran Zeng, Linyi Yang, Cunxiang Wang, Hao Chen, Chaoya Jiang, Rui Xie, Jindong Wang, Xing Xie, Wei Ye, Shikun Zhang, Yue Zhang.

International Conference on Learning Representations 2024 (``ICLR 2024``).

``It ranked 66 out of 2296 papers in terms of citations at ICLR 2024 (Top 3%).`` [citation evidence]
(7) How do Large Language Models understand Genes and Cells. [paper].

Chen Fang*, Yidong Wang*, Yunze Song, Qingqing Long, Wang Lu, Linghui Chen, Pengfei Wang, Guihai Feng, Yuanchun Zhou, Xin Li.

Transactions on Intelligent Systems and Technology (``TIST 2024``).
(6) Exploring Vision-Language Models for Imbalanced Learning. [paper].

Yidong Wang, Zhuohao Yu, Jindong Wang, Qiang Heng, Hao Chen, Wei Ye, Rui Xie, Xing Xie, Shikun Zhang.

International Journal of Computer Vision 2024 (``IJCV 2024``).

``It ranked 23 out of 300 papers in terms of citations among all papers in IJCV in 2024 (Top 8%).`` [citation evidence]
(5) FreeMatch: Self-adaptive Thresholding for Semi-supervised Learning. [paper].

Yidong Wang*, Hao Chen*, Qiang Heng, Wenxin Hou, Yue Fan, Zhen Wu, Jindong Wang, Marios Savvides, Takahiro Shinozaki, Bhiksha Raj, Bernt Schiele, Xing Xie.

International Conference on Learning Representations 2023 (``ICLR 2023``).

``It ranked 47 out of 1584 papers in terms of citations at ICLR 2023 (Top 3%).`` [citation evidence]
(4) USB: A Unified Semi-supervised Learning Benchmark for Classification. [paper].

Yidong Wang*, Hao Chen*, Yue Fan*, Wang Sun, Ran Tao, Wenxin Hou, Renjie Wang, Linyi Yang, Zhi Zhou, Lan-Zhe Guo, Heli Qi, Zhen Wu, Yu-Feng Li, Satoshi Nakamura, Wei Ye, Marios Savvides, Bhiksha Raj, Takahiro Shinozaki, Bernt Schiele, Jindong Wang, Xing Xie, Yue Zhang.

Advances in Neural Information Processing Systems 2022 (``NeurIPS 2022 D&B Track``).

``It ranked 167 out of 2834 papers in terms of citations at NeurIPS 2022 (Top 6%).`` [citation evidence]
(3) Exploiting Unlabeled Data for Target-Oriented Opinion Words Extraction. [paper].

Yidong Wang*, Hao Wu*, Ao Liu, Wenxin Hou, Zhen Wu, Jindong Wang, Takahiro Shinozaki, Manabu Okumura, Yue Zhang.

International Conference on Computational Linguistics 2022 (``COLING 2022``).
(2) Margin Calibration for Long-Tailed Visual Recognition. [paper].

Yidong Wang*, Bowen Zhang*, Wenxin Hou, Zhen Wu, Jindong Wang, Takahiro Shinozaki.

Asian Conference on Machine Learning 2022 (``ACML 2022``).

``It ranked 1 out of 103 papers in terms of citations at ACML 2022 (Top 1%).`` [citation evidence]
(1) FlexMatch: Boosting Semi-supervised Learning with Curriculum Pseudo Labeling. [paper].

Bowen Zhang*, Yidong Wang*, Wenxin Hou, Hao Wu, Jindong Wang, Manabu Okumura, Takahiro Shinozaki.

Advances in Neural Information Processing Systems 2021 (``NeurIPS 2021``).

``It ranked 16 out of 2334 papers in terms of citations at NeurIPS 2021 (Top 1%).`` [citation evidence]

📝 Publications

(39) TrustJudge: Inconsistencies of LLM-as-a-Judge and How to Alleviate Them. [paper].

Yidong Wang*, Yunze Song*, Tingyuan Zhu, Xuanwang Zhang, Zhuohao Yu, Hao Chen, Chiyu Song, Qiufeng Wang, Cunxiang Wang, Zhen Wu, Xinyu Dai, Yue Zhang, Wei Ye, Shikun Zhang.

International Conference on Learning Representations 2026 (``ICLR 2026``).
(38) Visual Thoughts: A Unified Perspective of Understanding Multimodal Chain-of-Thought. [paper].

Zihui Cheng, Qiguang Chen, Xiao Xu, Jiaqi Wang, Weiyun Wang, Hao Fei, Yidong Wang, Alex Jinpeng Wang, Zhi Chen, Wanxiang Che, Libo Qin.

Advances in Neural Information Processing Systems 2025 (``NeurIPS 2025``).
(37) SAEMark: Steering Personalized Multilingual LLM Watermarks with Sparse Autoencoders. [paper].

Zhuohao Yu, Xingru Jiang, Weizheng Gu, Yidong Wang, Shikun Zhang, Wei Ye.

Advances in Neural Information Processing Systems 2025 (``NeurIPS 2025``).
(36) Learning from “Silly” Questions Improves Large Language Models, But Only Slightly. [paper].

Tingyuan Zhu, Shudong Liu, Yidong Wang, Derek F Wong, Han Yu, Takahiro Shinozaki and Jindong Wang.

The Thirty-Ninth Association for the Advancement of Artificial Intelligence Conference Good-Data Workshop (``AAAI 2025 workshop``).
(35) Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity. [paper].

Cunxiang Wang*, Xiaoze Liu*, Yuanhao Yue*, Xiangru Tang, Tianhang Zhang, Cheng Jiayang, Yunzhi Yao, Wenyang Gao, Xuming Hu, Zehan Qi, Yidong Wang, Linyi Yang, Jindong Wang, Xing Xie, Zheng Zhang and Yue Zhang.

ACM Computing Surveys (``CSUR 2025``).
(34) XRDMatch: a semi-supervised learning framework to efficiently discover room temperature lithium superionic conductors. [paper].

Zheng Wan*, Zhenying Chen*, Hao Chen, Yizhi Jiang, Jinhuan Zhang, Yidong Wang, Jindong Wang, Hao Sun, Zhongjie Zhu, Jinhui Zhu, Linyi Yang, Wei Ye, Shikun Zhang, Xing Xie, Yue Zhang, Xiaodong Zhuang, Xiao He and Jinrong Yang.

Energy & Environmental Science (``Energy & Environmental Science 2024``).
(33) PromptRobust: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts. [paper].

Kaijie Zhu, Jindong Wang, Jiaheng Zhou, Zichen Wang, Hao Chen, Yidong Wang, Linyi Yang, Wei Ye, Neil Zhenqiang Gong, Yue Zhang, Xing Xie.

ACM Conference on Computer and Communications Security workshop on privacy and security (Lamp) 2024 (``CCS workshop 2024``).
(32) Survey on Knowledge Distillation for Large Language Models: Methods, Evaluation, and Application. [paper].

Chuanpeng Yang, Wang Lu, Yao Zhu, Yidong Wang, Qian Chen, Chenlong Gao, Bingjie Yan, Yiqiang Chen.

Transactions on Intelligent Systems and Technology (``TIST 2024``).
(31) RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation. [paper].

Xuanwang Zhang*, Yunze Song*, Yidong Wang, Shuyun Tang, Xinfeng Li, Zhengran Zeng, Zhen Wu, Wei Ye, Wenyuan Xu, Yue Zhang, Xinyu Dai, Shikun Zhang, Qingsong Wen.

The 2024 Conference on Empirical Methods in Natural Language Processing System Demonstration Track (``EMNLP 2024 Demo Track``).
(30) FreeEval: A Modular Framework for Trustworthy and Efficient Evaluation of Large Language Models. [paper].

Zhuohao Yu, Chang Gao, Wenjin Yao, Yidong Wang, Zhengran Zeng, Wei Ye, Jindong Wang, Yue Zhang, Shikun Zhang.

The 2024 Conference on Empirical Methods in Natural Language Processing System Demonstration Track (``EMNLP 2024 Demo Track``).
(29) PURE: Aligning LLM via Pluggable Query Reformulation for Enhanced Helpfulness.

Wenjin Yao, Yidong Wang, Zhuohao Yu, Rui Xie, Shikun Zhang, Wei Ye.

Findings of The 2024 Conference on Empirical Methods in Natural Language Processing (``EMNLP 2024 Findings``).
(28) Imprecise Label Learning: A Unified Framework for Learning with Various Imprecise Label Configurations. [paper].

Hao Chen, Ankit Shah, Jindong Wang, Ran Tao, Yidong Wang, Xing Xie, Masashi Sugiyama, Rita Singh, Bhiksha Raj.

Advances in Neural Information Processing Systems 2024 (``NeurIPS 2024``).
(27) AutoSurvey: Large Language Models Can Automatically Write Surveys. [paper].

Yidong Wang*, Qi Guo*, Wenjin Yao, Hongbo Zhang, Xin Zhang, Zhen Wu, Meishan Zhang, Xinyu Dai, Min Zhang, Qingsong Wen, Wei Ye, Shikun Zhang, Yue Zhang.

Advances in Neural Information Processing Systems 2024 (``NeurIPS 2024``).
(26) How do Large Language Models understand Genes and Cells. [paper].

Chen Fang*, Yidong Wang*, Yunze Song, Qingqing Long, Wang Lu, Linghui Chen, Pengfei Wang, Guihai Feng, Yuanchun Zhou, Xin Li.

Transactions on Intelligent Systems and Technology (``TIST 2024``).
(25) PIXEL: Prompt-based Zero-shot Hashing via Visual and Textual Semantic Alignment.

Zeyu Dong, Qingqing Long, Yihang Zhou, Zhihong Zhu, Yidong Wang, Xiao Luo, Pengyang Wang, Pengfei Wang, Yuanchun Zhou.

The 33rd ACM International Conference on Information and Knowledge Management (``CIKM 2024``).
(24) Enhancing In-Context Learning via Implicit Demonstration Augmentation. [paper].

Xiaoling Zhou, Wei Ye, Yidong Wang, Chaoya Jiang, Zhemg Lee, Rui Xie, Shikun Zhang.

Annual Meeting of the Association for Computational Linguistics 2024 (``ACL 2024``).
(23) KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models. [paper].

Zhuohao Yu, Chang Gao, Wenjin Yao, Yidong Wang, Wei Ye, Jindong Wang, Xing Xie, Yue Zhang, Shikun Zhang.

Annual Meeting of the Association for Computational Linguistics 2024 (``ACL 2024``).
(22) What Makes a Good Order of Examples in In-Context Learning. [paper].

Qi Guo, Leiyu Wang, Yidong Wang, Wei Ye, Shikun Zhang.

Findings of Annual Meeting of the Association for Computational Linguistics 2024 (``ACL 2024 Findings``).
(21) A General Framework for Learning from Weak Supervision. [paper].

Hao Chen, Jindong Wang, Lei Feng, Xiang Li, Yidong Wang, Xing Xie, Masashi Sugiyama, Rita Singh, Bhiksha Raj.

The Forty-first International Conference on Machine Learning (``ICML 2024``).
(20) Conv-Adapter: Exploring Parameter Efficient Transfer Learning for ConvNets. [paper].

Hao Chen, Ran Tao, Han Zhang, Yidong Wang, Wei Ye, Jindong Wang, Guosheng Hu, Marios Savvides.

Conference on Computer Vision and Pattern Recognition 2024 Workshop Prompting in Vision (``CVPR 2024 Workshop``).
(19) CoderUJB: An Executable and Unified Java Benchmark for Practical Programming Scenarios. [paper].

Zhengran Zeng, Yidong Wang, Rui Xie, Wei Ye, Shikun Zhang.

The ACM SIGSOFT International Symposium on Software Testing and Analysis 2024 (``ISSTA 2024``).
(18) PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization. [paper].

Yidong Wang*, Zhuohao Yu*, Wenjin Yao, Zhengran Zeng, Linyi Yang, Cunxiang Wang, Hao Chen, Chaoya Jiang, Rui Xie, Jindong Wang, Xing Xie, Wei Ye, Shikun Zhang, Yue Zhang.

International Conference on Learning Representations 2024 (``ICLR 2024``).
(17) Supervised Knowledge Makes Large Language Models Better In-context Learners. [paper].

Linyi Yang*, Shuibai Zhang*, Zhuohao Yu*, Guangsheng Bao, Yidong Wang, Jindong Wang, Ruochen Xu, Wei Ye, Xing Xie, Weizhu Chen, Yue Zhang.

International Conference on Learning Representations 2024 (``ICLR 2024``).
(16) A Survey on Evaluation of Large Language Models. [paper].

Yupeng Chang*, Xu Wang*, Jindong Wang, Yuan Wu, Kaijie Zhu, Hao Chen, Linyi Yang, Xiaoyuan Yi, Cunxiang Wang, Yidong Wang, Wei Ye, Yue Zhang, Yi Chang, Philip S. Yu, Qiang Yang, Xing Xie.

Transactions on Intelligent Systems and Technology (``TIST 2024``).
(15) Towards Optimization and Model Selection for Domain Generalization: A Mixup-guided Solution. [paper].

Wang Lu, Jindong Wang, Yidong Wang, Kan Ren, Yiqiang Chen, Xing Xie.

SIAM Conference on Data Mining 2024 (``SDM 2024``).
(14) Exploring Vision-Language Models for Imbalanced Learning. [paper].

Yidong Wang, Zhuohao Yu, Jindong Wang, Qiang Heng, Hao Chen, Wei Ye, Rui Xie, Xing Xie, Shikun Zhang.

International Journal of Computer Vision 2024 (``IJCV 2024``).
(13) Out-of-Distribution Generalization in Natural Language Processing: Past, Present, and Future. [paper].

Linyi Yang*, Yaoxiao Song*, Xuan Ren*, Chenyang Lyu, Yidong Wang, Lingqiao Liu, Jindong Wang, Jennifer Foster, Yue Zhang.

The 2023 Conference on Empirical Methods in Natural Language Processing (``EMNLP 2023``).
(12) Evaluating open question answering evaluation. [paper].

Cunxiang Wang*, Sirui Cheng*, Zhikun Xu, Bowen Ding, Yidong Wang, Yue Zhang.

Advances in Neural Information Processing Systems 2023 (``NeurIPS 2023``).
(11) Non-IID always Bad? Semi-Supervised Heterogeneous Federated Learning with Local Knowledge Enhancement. [paper].

Chao Zhang, Fangzhao Wu, Jingwei Yi, Derong Xu, Yang Yu, Jingdong Wang, Yidong Wang, Tong Xu, Xing Xie, Enhong Chen.

The Conference on Information and Knowledge Management 2023 (``CIKM 2023``).
(10) GLUE-X: Evaluating Natural Language Understanding Models from an Out-of-distribution Generalization Perspective. [paper].

Linyi Yang*, Shuibai Zhang*, Libo Qin, Yafu Li, Yidong Wang, Hanmeng Liu, Jindong Wang, Xing Xie, Yue Zhang.

Findings of Annual Meeting of the Association for Computational Linguistics 2023 (``ACL 2023 Findings``).
(9) On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective. [paper].

Jindong Wang, Xixu Hu*, Wenxin Hou*, Hao Chen, Runkai Zheng, Yidong Wang, Linyi Yang, Haojun Huang, Wei Ye, Xiubo Geng, Binxin Jiao, Yue Zhang, Xing Xie.

Workshop on Trustworthy and Reliable Large-Scale Machine Learning Models at ICLR 2023 (``RTML Workshop 2023``).
(8) FreeMatch: Self-adaptive Thresholding for Semi-supervised Learning. [paper].

Yidong Wang*, Hao Chen*, Qiang Heng, Wenxin Hou, Yue Fan, Zhen Wu, Jindong Wang, Marios Savvides, Takahiro Shinozaki, Bhiksha Raj, Bernt Schiele, Xing Xie.

International Conference on Learning Representations 2023 (``ICLR 2023``).
(7) SoftMatch: Addressing the Quantity-Quality Tradeoff in Semi-supervised Learning. [paper].

Hao Chen*, Ran Tao*, Yue Fan, Yidong Wang, Marios Savvides, Jindong Wang, Bhiksha Raj, Xing Xie, Bernt Schiele.

International Conference on Learning Representations 2023 (``ICLR 2023``).
(6) USB: A Unified Semi-supervised Learning Benchmark for Classification. [paper].

Yidong Wang*, Hao Chen*, Yue Fan*, Wang Sun, Ran Tao, Wenxin Hou, Renjie Wang, Linyi Yang, Zhi Zhou, Lan-Zhe Guo, Heli Qi, Zhen Wu, Yu-Feng Li, Satoshi Nakamura, Wei Ye, Marios Savvides, Bhiksha Raj, Takahiro Shinozaki, Bernt Schiele, Jindong Wang, Xing Xie, Yue Zhang.

Advances in Neural Information Processing Systems 2022 (``NeurIPS 2022``).
(5) Margin Calibration for Long-Tailed Visual Recognition. [paper].

Yidong Wang*, Bowen Zhang*, Wenxin Hou, Zhen Wu, Jindong Wang, Takahiro Shinozaki.

Asian Conference on Machine Learning 2022 (``ACML 2022``).
(4) Exploiting Unlabeled Data for Target-Oriented Opinion Words Extraction. [paper].

Yidong Wang*, Hao Wu*, Ao Liu, Wenxin Hou, Zhen Wu, Jindong Wang, Takahiro Shinozaki, Manabu Okumura, Yue Zhang.

International Conference on Computational Linguistics 2022 (``COLING 2022``).
(3) Exploiting Adapters for Cross-lingual Low-resource Speech Recognition. [paper].

Wenxin Hou, Han Zhu, Yidong Wang, Jindong Wang, Tao Qin, Renjun Xu, Takahiro Shinozaki.

IEEE/ACM Transactions on Audio, Speech and Language Processing 2022 (``TASLP 2022``).
(2) FlexMatch: Boosting Semi-supervised Learning with Curriculum Pseudo Labeling. [paper].

Bowen Zhang*, Yidong Wang*, Wenxin Hou, Hao Wu, Jindong Wang, Manabu Okumura, Takahiro Shinozaki.

Advances in Neural Information Processing Systems 2021 (``NeurIPS 2021``).
(1) Meta-Adapter: Efficient Cross-Lingual Adaptation With Meta-Learning. [paper].

Wenxin Hou, Yidong Wang, Shengzhou Gao, Takahiro Shinozaki.

IEEE International Conference on Acoustics, Speech, and Signal Processing 2021 (``ICASSP 2021``).

💻 Selected Projects

PandaLM stands for ReProducible and Automated Language Model Assessment. It aims to provide reproducible and automated comparisons between different large language models (LLMs). Given the same context, PandaLM can compare the responses of different LLMs and provide a reason for its decision, along with a reference answer. I am the main contributor to this repo and now lead the PandaLM team.
USB is a PyTorch-based Python package for semi-supervised learning (SSL). It is easy to use and extend, affordable, and comprehensive for developing and evaluating SSL algorithms. USB provides implementations of 14 SSL algorithms based on consistency regularization, and 15 evaluation tasks from the CV, NLP, and audio domains. I am the main contributor to this repo and now lead the USB team.
TorchSSL is an all-in-one PyTorch toolkit for semi-supervised learning (SSL). It currently implements 9 popular SSL algorithms to enable fair comparison and boost the development of SSL algorithms. I am the main contributor to this repo and now lead the TorchSSL team.

🎖 Honors and Awards

First Place in the Entrance Examination for PhD at the School of Software and Microelectronics, Peking University, 2023.
Outstanding Student Award, Tokyo Institute of Technology, 2022.
Stars of Tomorrow, Microsoft Research Asia, 2021 & 2022.
JASSO Scholarship, Tokyo Institute of Technology, 2020.
Excellence in Nanjing University Training Program of Innovation for Undergraduates, 2019.
Honorable Mention of Interdisciplinary Contest in Modeling, 2018.
Renmin Scholarship, Nanjing University, 2017 & 2018.

📄 Academic Services

Reviewer for Conferences: NeurIPS 2022, CVPR 2023, ICML 2023, ICCV 2023, NeurIPS 2023, AAAI 2024, ICLR 2024, CVPR 2024, ICML 2024, ECCV 2024, NAACL 2024, ACL 2024, COLM 2024, NeurIPS 2024, ICLR 2025.
Reviewer for Journals: IJCV, TIP, ACM TIST, JCST.

🏫 Teaching Experience

2024 Spring, Teaching Assistant, Natural Language Processing (自然语言处理), taught by Prof. Di He, Peking University.

🎤 Invited Talks

2023, Microsoft Research Asia, Sharing Internship Experience.
2023, The AI Talks, Advancing Semi-Supervised Learning: Methods and Benchmarks.
2023, East China Normal University, Introduction to Semi-supervised Learning.
2024, Nanjing University, Addressing Low-Resource Challenges in Machine Learning: Strategies from the Early Algorithm Era to the Age of Large Models.