Selected Publications

OS-Oracle: A Comprehensive Framework for Cross-Platform GUI Critic Models
Zhenyu Wu*, Jingjing Xie*, Zehao Li, Bowen Yang, Qiushi Sun, Zhaoyang Liu, Zhoumianze Liu, Yu Qiao, Xiangyu Yue, Zun Wang, Zichen Ding ✉️
(* denotes equal contribution; ✉️ denotes the corresponding author.)
- Check code, dataset & models at our GitHub and our HF Collections.
- A scalable data pipeline for synthesizing cross-platform GUI critic data.
- An elaborate training recipe that integrates SFT with CP-GRPO.
- A holistic benchmark for evaluating GUI critics across Mobile, Web, and Desktop platforms.
- OS-Oracle-7B achieves SOTA results on OS-Critic Bench and supports the full CUA lifecycle.

ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data
Zhaoyang Liu*, Jingjing Xie*, Zichen Ding*, Zehao Li*, Bowen Yang*, Zhenyu Wu*, Xuehui Wang, Qiushi Sun, Shi Liu, Weiyun Wang, Shenglong Ye, Qingyun Li, Zeyue Tian, Gen Luo, Xiangyu Yue, Biqing Qi, Kai Chen, Bowen Zhou, Yu Qiao, Qifeng Chen, Wenhai Wang.
- Check code, dataset & models at our GitHub and our HF Collections.
- The first open-source framework and dataset for truly cross-platform Computer Use Agents.
- Achieves SOTA results on MMBench-GUI, ScreenSpot-Pro, WebArena-Lite-v2, and other benchmarks.
- Provides a comprehensive training recipe to advance computer-use agents.

MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents
Xuehui Wang*, Zhenyu Wu*, Jingjing Xie*, Zichen Ding*, Bowen Yang*, Zehao Li*, Zhaoyang Liu*, Qingyun Li, Xuan Dong, Zhe Chen, Weiyun Wang, Xiangyu Zhao, Jixuan Chen, Haodong Duan, Tianbao Xie, Chenyu Yang, Shiqian Su, Yue Yu, Yuan Huang, Yiqian Liu, Xiao Zhang, Yanting Zhang, Xiangyu Yue, Weijie Su, Xizhou Zhu, Wei Shen, Jifeng Dai, Wenhai Wang.
- Check code at our project page.
- A cross-platform, hierarchical benchmark designed to comprehensively evaluate GUI agents.
- Introduces EQA to jointly assess the success and efficiency of agent behavior in online tasks.

Breaking the Data Barrier -- Building GUI Agents Through Task Generalization
Junlei Zhang*, Zichen Ding*, Chang Ma, Zijie Chen, Qiushi Sun, Zhenzhong Lan, Junxian He.
- Check code at our project page.
- Provides insights into cross-domain knowledge transfer for GUI agents.
- Offers a practical approach to addressing data scarcity in this emerging field.

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis [CCF-A]
Qiushi Sun*, Kanzhi Cheng*, Zichen Ding*, Chuanyang Jin*, Yian Wang, Fangzhi Xu, Zhenyu Wu, Chengyou Jia, Liheng Chen, Zhoumianze Liu, Ben Kao, Guohao Li, Junxian He, Yu Qiao, Zhiyong Wu.
- Check demos at our website.
- Shifts from task-driven to interaction-driven GUI data synthesis.
- A manual-free data pipeline for synthesizing GUI agent trajectories.

Automated Peer Reviewing in Paper SEA: Standardization, Evaluation, and Analysis [CCF-B]
Jianxiang Yu*, Zichen Ding*, Jiaqi Tan, Kangyang Luo, Zhenmin Weng, Chenghua Gong, Long Zeng, Renjing Cui, Chengcheng Han, Qiushi Sun, Zhiyong Wu, Yunshi Lan, Xiang Li.
- Check demos at our website.
- An innovative framework for automating peer review.

- [Preprint] OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows
  Qiushi Sun*, Mukai Li*, Zhoumianze Liu*, Zhihui Xie*, Fangzhi Xu, Zhangyue Yin, Kanzhi Cheng, Zehao Li, Zichen Ding, Qi Liu, Zhiyong Wu, Zhuosheng Zhang, Ben Kao, Lingpeng Kong.
- [Preprint] InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
  Weiyun Wang, Zhangwei Gao, Lixin Gu, Hengjun Pu, Long Cui, Xingguang Wei, Zhaoyang Liu, Linglin Jing, Shenglong Ye, Jie Shao, Zhaokai Wang, Zhe Chen, Hongjie Zhang, Ganlin Yang, Haomin Wang, Qi Wei, Jinhui Yin, Wenhao Li, Erfei Cui, Guanzhou Chen, Zichen Ding, et al.
- [WCUA@ICML 2025] ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows
  Qiushi Sun, Zhoumianze Liu, Chang Ma, Zichen Ding, Fangzhi Xu, Zhangyue Yin, Haiteng Zhao, Zhenyu Wu, Kanzhi Cheng, Zhaoyang Liu, Jianing Wang, Qintong Li, Xiangru Tang, Tianbao Xie, Xiachong Feng, Xiang Li, Ben Kao, Wenhai Wang, Biqing Qi, Lingpeng Kong, Zhiyong Wu.
- [ACL 2025] Let's Be Self-generated via Step by Step: A Curriculum Learning Approach to Automated Reasoning with LLMs [CCF-A]
  Kangyang Luo, Zichen Ding, Zhenmin Weng, Lingfeng Qiao, Meng Zhao, Xiang Li, Di Yin, Jinlong Shu.
- [IJCNLP-AACL 2025] SEAGraph: Unveiling the Whole Story of Paper Review Comments
  Jianxiang Yu*, Jiaqi Tan*, Zichen Ding, Jiapeng Zhu, Jiahao Li, Yao Cheng, Qier Cui, Yunshi Lan, Yao Liu, Xiang Li.
- [ICLR 2025 (Spotlight)] OS-ATLAS: A Foundation Action Model For Generalist GUI Agents [Core A*]
  Zhiyong Wu*, Zhenyu Wu*, Fangzhi Xu*, Yian Wang*, Qiushi Sun, Chengyou Jia, Kanzhi Cheng, Zichen Ding, Liheng Chen, Paul Pu Liang, Yu Qiao.
- [SIGKDD 2025] RELIEF: Reinforcement Learning Empowered Graph Feature Prompt Tuning [CCF-A]
  Jiapeng Zhu, Zichen Ding, Jianxiang Yu, Jiaqi Tan, Xiang Li, Weining Qian.
- [LLMAgents@ICLR 2024] OS-Copilot: Towards Generalist Computer Agents with Self-Improvement [Core A*]
  Zhiyong Wu*, Chengcheng Han*, Zichen Ding, Zhenmin Weng, Zhoumianze Liu, Shunyu Yao, Tao Yu, Lingpeng Kong.

More preprints under review will be released soon, and some papers can be found on Google Scholar.