📝 Selected Publications

Preprint

OS-Oracle: A Comprehensive Framework for Cross-Platform GUI Critic Models 🔥🔥
Zhenyu Wu*, Jingjing Xie*, Zehao Li, Bowen Yang, Qiushi Sun, Zhaoyang Liu, Zhoumianze Liu, Yu Qiao, Xiangyu Yue, Zun Wang, Zichen Ding ✉️

(* denotes equal contribution; ✉️ denotes corresponding author.)

  • Check code, dataset & models at Our Github and Our HF Collections. 🤗
  • A scalable data pipeline for synthesizing cross-platform GUI critic data. 📊
  • An elaborate training recipe that integrates SFT with CP-GRPO. 🧠
  • A holistic benchmark for evaluating GUI critic models across Mobile, Web, and Desktop platforms. ⚖️
  • OS-Oracle-7B achieves SOTA results on OS-Critic Bench and supports the full CUA lifecycle. 🏆
Preprint

ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data 🔥🔥
Zhaoyang Liu*, Jingjing Xie*, Zichen Ding*, Zehao Li*, Bowen Yang*, Zhenyu Wu*, Xuehui Wang, Qiushi Sun, Shi Liu, Weiyun Wang, Shenglong Ye, Qingyun Li, Zeyue Tian, Gen Luo, Xiangyu Yue, Biqing Qi, Kai Chen, Bowen Zhou, Yu Qiao, Qifeng Chen, Wenhai Wang.

  • Check code, dataset & models at Our Github and Our HF Collections. 🤗
  • The first open-source framework and dataset for truly cross-platform Computer Use Agents. 🤖
  • Achieves SOTA results on benchmarks including MMBench-GUI, ScreenSpot-Pro, and WebArena-Lite-v2. 🏆
  • Provides a comprehensive training recipe to advance computer-use agents. 🚀
Preprint

MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents 🔥🔥
Xuehui Wang*, Zhenyu Wu*, Jingjing Xie*, Zichen Ding*, Bowen Yang*, Zehao Li*, Zhaoyang Liu*, Qingyun Li, Xuan Dong, Zhe Chen, Weiyun Wang, Xiangyu Zhao, Jixuan Chen, Haodong Duan, Tianbao Xie, Chenyu Yang, Shiqian Su, Yue Yu, Yuan Huang, Yiqian Liu, Xiao Zhang, Yanting Zhang, Xiangyu Yue, Weijie Su, Xizhou Zhu, Wei Shen, Jifeng Dai, Wenhai Wang.

  • Check code at Our Project. 🎬
  • A cross-platform, hierarchical benchmark designed to comprehensively evaluate GUI agents. 🔍
  • Introduces EQA to jointly assess the success and efficiency of agent behavior in online tasks. 🧐
COLM 2025

Breaking the Data Barrier – Building GUI Agents Through Task Generalization
Junlei Zhang*, Zichen Ding*, Chang Ma, Zijie Chen, Qiushi Sun, Zhenzhong Lan, Junxian He.

  • Check code at Our Project. 📽️
  • Provides insights into cross-domain knowledge transfer for GUI agents. 🤖
  • Offers a practical approach to addressing data scarcity in this emerging field. 💫
ACL 2025

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis [CCF-A]
Qiushi Sun*, Kanzhi Cheng*, Zichen Ding*, Chuanyang Jin*, Yian Wang, Fangzhi Xu, Zhenyu Wu, Chengyou Jia, Liheng Chen, Zhoumianze Liu, Ben Kao, Guohao Li, Junxian He, Yu Qiao, Zhiyong Wu.

  • Check demos at Our Website. 🌐
  • Shifts GUI data synthesis from task-driven to interaction-driven. 🤖
  • A manual-free data pipeline for synthesizing GUI agent trajectories. 🧬
EMNLP 2024

Automated Peer Reviewing in Paper SEA: Standardization, Evaluation, and Analysis [CCF-B]
Jianxiang Yu*, Zichen Ding*, Jiaqi Tan, Kangyang Luo, Zhenmin Weng, Chenghua Gong, Long Zeng, Renjing Cui, Chengcheng Han, Qiushi Sun, Zhiyong Wu, Yunshi Lan, Xiang Li.

  • Check demos at Our Website. 🌐
  • An innovative framework for automating peer review. 🌊

More preprints under review will be released soon; other papers are listed on Google Scholar. 📚✨🔍