Hello, I am Yidong Wang [i:doʊn wɑ:n] (王一栋). My research interests lie in semi-supervised learning, transfer learning, and imbalanced learning. I have published 10+ papers at top international AI conferences and journals, with 1000+ total Google Scholar citations.

🔥 News

  • 2023.09:  🎉🎉 I became a Ph.D. student at Peking University.
  • 2023.08:  🎉🎉 I finished my internship at Westlake University.
  • 2022.10:  🎉🎉 I finished my internship at MSRA.

📖 Education

  • 2023.09 - 2027.06, doctoral student at the National Engineering Research Center for Software Engineering, Peking University, advised by Prof. Shikun Zhang and Prof. Wei Ye.
  • 2020.09 - 2022.10, master's student in the Department of Information and Communications Engineering of Tokyo Institute of Technology, advised by Prof. Takahiro Shinozaki.
  • 2015.09 - 2019.06, undergraduate student in the Department of Computer Science and Technology of Nanjing University, advised by Prof. Xinyu Dai.

💼 Internships

  • 2022.05 - 2022.10, Microsoft Research Asia, advised by Dr. Jindong Wang.
  • 2022.02 - 2022.05, Westlake University, advised by Prof. Yue Zhang.
  • 2021.11 - 2022.02, Microsoft Research Asia, advised by Dr. Jindong Wang.

📝 Preprints

  • (6) Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity. [paper];

    Cunxiang Wang, Xiaoze Liu, Yuanhao Yue, Xiangru Tang, Tianhang Zhang, Cheng Jiayang, Yunzhi Yao, Wenyang Gao, Xuming Hu, Zehan Qi, Yidong Wang, Linyi Yang, Jindong Wang, Xing Xie, Zheng Zhang and Yue Zhang.

  • (5) PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization. [paper];

    Yidong Wang, Zhuohao Yu, Zhengran Zeng, Linyi Yang, Cunxiang Wang, Hao Chen, Chaoya Jiang, Rui Xie, Jindong Wang, Xing Xie, Wei Ye, Shikun Zhang, Yue Zhang.

  • (4) Imprecise Label Learning: A Unified Framework for Learning with Various Imprecise Label Configurations. [paper].

    Hao Chen, Ankit Shah, Jindong Wang, Ran Tao, Yidong Wang, Xing Xie, Masashi Sugiyama, Rita Singh, Bhiksha Raj.

  • (3) PromptBench: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts. [paper];

    Kaijie Zhu, Jindong Wang, Jiaheng Zhou, Zichen Wang, Hao Chen, Yidong Wang, Linyi Yang, Wei Ye, Neil Zhenqiang Gong, Yue Zhang, Xing Xie.

  • (2) Conv-Adapter: Exploring Parameter Efficient Transfer Learning for ConvNets. [paper].

    Hao Chen, Ran Tao, Han Zhang, Yidong Wang, Wei Ye, Jindong Wang, Guosheng Hu, Marios Savvides.

  • (1) An Embarrassingly Simple Baseline for Imbalanced Semi-Supervised Learning. [paper].

    Hao Chen, Yue Fan, Yidong Wang, Jindong Wang, Bernt Schiele, Xing Xie, Marios Savvides, Bhiksha Raj.

📝 Publications

  • (16) A Survey on Evaluation of Large Language Models. [paper];

    Yupeng Chang, Xu Wang, Jindong Wang, Yuan Wu, Kaijie Zhu, Hao Chen, Linyi Yang, Xiaoyuan Yi, Cunxiang Wang, Yidong Wang, Wei Ye, Yue Zhang, Yi Chang, Philip S. Yu, Qiang Yang, Xing Xie.

    Transactions on Intelligent Systems and Technology (TIST 2023).

  • (15) Out-of-Distribution Generalization in Text Classification: Past, Present, and Future. [paper].

    Linyi Yang, Yaoxiao Song, Xuan Ren, Chenyang Lyu, Yidong Wang, Lingqiao Liu, Jindong Wang, Jennifer Foster, Yue Zhang.

    The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023).

  • (14) Evaluating open question answering evaluation. [paper];

    Cunxiang Wang, Sirui Cheng, Zhikun Xu, Bowen Ding, Yidong Wang, Yue Zhang.

    Advances in Neural Information Processing Systems 2023 (NeurIPS 2023).

  • (13) Non-IID always Bad? Semi-Supervised Heterogeneous Federated Learning with Local Knowledge Enhancement. [paper];

    Chao Zhang, Fangzhao Wu, Jingwei Yi, Derong Xu, Yang Yu, Jingdong Wang, Yidong Wang, Tong Xu, Xing Xie, Enhong Chen.

    The Conference on Information and Knowledge Management 2023 (CIKM 2023).

  • (12) Towards Optimization and Model Selection for Domain Generalization: A Mixup-guided Solution. [paper].

    Wang Lu, Jindong Wang, Yidong Wang, Kan Ren, Yiqiang Chen, Xing Xie.

    KDD 2023 workshop on Causal Discovery, Prediction and Decision (CDPD 2023).

  • (11) Exploring Vision-Language Models for Imbalanced Learning. [paper];

    Yidong Wang, Zhuohao Yu, Jindong Wang, Qiang Heng, Hao Chen, Wei Ye, Rui Xie, Xing Xie, Shikun Zhang.

    International Journal of Computer Vision 2023 (IJCV 2023).

  • (10) GLUE-X: Evaluating Natural Language Understanding Models from an Out-of-distribution Generalization Perspective. [paper];

    Linyi Yang, Shuibai Zhang, Libo Qin, Yafu Li, Yidong Wang, Hanmeng Liu, Jindong Wang, Xing Xie, Yue Zhang.

    Findings of Annual Meeting of the Association for Computational Linguistics 2023 (ACL 2023 Findings).

  • (9) On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective. [paper];

    Jindong Wang, Xixu Hu, Wenxin Hou, Hao Chen, Runkai Zheng, Yidong Wang, Linyi Yang, Haojun Huang, Wei Ye, Xiubo Geng, Binxin Jiao, Yue Zhang, Xing Xie.

    Workshop on Trustworthy and Reliable Large-Scale Machine Learning Models at ICLR 2023 (RTML Workshop 2023).

  • (8) FreeMatch: Self-adaptive Thresholding for Semi-supervised Learning. [paper];

    Yidong Wang, Hao Chen, Qiang Heng, Wenxin Hou, Yue Fan, Zhen Wu, Jindong Wang, Marios Savvides, Takahiro Shinozaki, Bhiksha Raj, Bernt Schiele, Xing Xie.

    International Conference on Learning Representations 2023 (ICLR 2023).

  • (7) SoftMatch: Addressing the Quantity-Quality Tradeoff in Semi-supervised Learning. [paper];

    Hao Chen, Ran Tao, Yue Fan, Yidong Wang, Marios Savvides, Jindong Wang, Bhiksha Raj, Xing Xie, Bernt Schiele.

    International Conference on Learning Representations 2023 (ICLR 2023).

  • (6) USB: A Unified Semi-supervised Learning Benchmark for Classification. [paper];

    Yidong Wang, Hao Chen, Yue Fan, Wang Sun, Ran Tao, Wenxin Hou, Renjie Wang, Linyi Yang, Zhi Zhou, Lan-Zhe Guo, Heli Qi, Zhen Wu, Yu-Feng Li, Satoshi Nakamura, Wei Ye, Marios Savvides, Bhiksha Raj, Takahiro Shinozaki, Bernt Schiele, Jindong Wang, Xing Xie, Yue Zhang.

    Advances in Neural Information Processing Systems 2022 (NeurIPS 2022).

  • (5) Margin Calibration for Long-Tailed Visual Recognition. [paper];

    Yidong Wang, Bowen Zhang, Wenxin Hou, Zhen Wu, Jindong Wang, Takahiro Shinozaki.

    Asian Conference on Machine Learning 2022 (ACML 2022).

  • (4) Exploiting Unlabeled Data for Target-Oriented Opinion Words Extraction. [paper];

    Yidong Wang, Hao Wu, Ao Liu, Wenxin Hou, Zhen Wu, Jindong Wang, Takahiro Shinozaki, Manabu Okumura, Yue Zhang.

    International Conference on Computational Linguistics 2022 (COLING 2022).

  • (3) Exploiting Adapters for Cross-lingual Low-resource Speech Recognition. [paper];

    Wenxin Hou, Han Zhu, Yidong Wang, Jindong Wang, Tao Qin, Renjun Xu, Takahiro Shinozaki.

    IEEE/ACM Transactions on Audio, Speech and Language Processing 2022 (TASLP 2022).

  • (2) Flexmatch: Boosting Semi-supervised Learning with Curriculum Pseudo Labeling. [paper];

    Bowen Zhang, Yidong Wang (co-first author), Wenxin Hou, Hao Wu, Jindong Wang, Manabu Okumura, Takahiro Shinozaki.

    Advances in Neural Information Processing Systems 2021 (NeurIPS 2021).

  • (1) Meta-Adapter: Efficient Cross-Lingual Adaptation With Meta-Learning. [paper];

    Wenxin Hou, Yidong Wang, Shengzhou Gao, Takahiro Shinozaki.

    IEEE International Conference on Acoustics, Speech, and Signal Processing 2021 (ICASSP 2021).

💻 Selected Projects

  • PandaLM refers to ReProducible and Automated Language Model Assessment. PandaLM aims to provide reproducible and automated comparisons between different large language models (LLMs). Given the same context, PandaLM can compare the responses of different LLMs and provide a reason for its decision, along with a reference answer. I am the main contributor to this repo and now lead the PandaLM team.
  • USB is a PyTorch-based Python package for semi-supervised learning (SSL). It is easy to use and extend, affordable, and comprehensive for developing and evaluating SSL algorithms. USB provides implementations of 14 SSL algorithms based on consistency regularization, and 15 evaluation tasks from the CV, NLP, and audio domains. I am the main contributor to this repo and now lead the USB team.
  • TorchSSL is an all-in-one toolkit based on PyTorch for semi-supervised learning (SSL). We have currently implemented 9 popular SSL algorithms to enable fair comparison and boost the development of SSL algorithms. I am the main contributor to this repo and now lead the TorchSSL team.
  • Microsoft NeuralSpeech is a research project at Microsoft Research Asia focusing on neural-network-based speech processing, including automatic speech recognition (ASR), text-to-speech (TTS), etc. The code for Exploiting Adapters for Cross-lingual Low-resource Speech Recognition (TASLP 2022) has been moved to this repo.

🎖 Honors and Awards

  • First Place in the Entrance Examination for PhD at the School of Software and Microelectronics, Peking University, 2023.
  • Outstanding Student Award, Tokyo Institute of Technology, 2022.
  • Stars of Tomorrow, Microsoft Research Asia, 2021&2022.
  • JASSO Scholarship, Tokyo Institute of Technology, 2020.
  • Excellence in Nanjing University Training Program of Innovation for Undergraduates, 2019.
  • Honorable Mention, Interdisciplinary Contest in Modeling, 2018.
  • Renmin Scholarship, Nanjing University, 2017&2018.

📄 Academic Services

  • Reviewer for Conferences: NeurIPS 2022, CVPR 2023, ICML 2023, ICCV 2023, NeurIPS 2023, AAAI 2024, ICLR 2024.
  • Reviewer for Journals: IJCV, ACM TIST, JCST.