AntGLM-Med-10B for PubMedQA
This project is affiliated with Ant Group and ranks in the Top 5 on the PubMedQA leaderboard.
-
From Beginner to Expert: Modeling Medical Knowledge into General LLMs
-
Author Information
Qiang Li, mangxiao.lq@antgroup.com (Co-first Author)
XiaoYan Yang, joyce.yxy@antgroup.com (Co-first Author)
HaoWen Wang, wanghaowen.whw@antgroup.com
Qin Wang, chengzhao.wq@antgroup.com
Lei Liu, haishen.ll@antgroup.com (Corresponding Author)
Junjie Wang, benge.wjj@antgroup.com
Yang Zhang, yaoling.zy@antgroup.com
Mingyuan Chu, chumingyuan.cmy@antgroup.com
Sen Hu, hs272483@antgroup.com
Yicheng Chen, yicheng.chen@antgroup.com
Yue Shen, zhanying@antgroup.com
Cong Fan, fancong.fan@antgroup.com
Wangshu Zhang, wangshu.zws@antgroup.com
Teng Xu, harvey.xt@antgroup.com
JinJie Gu, jinjie.gujj@antgroup.com
Jing Zheng, jing.zheng@antgroup.com
GuanNan Zhang, zgn138592@antgroup.com
-
Model Information
Model Name: AntGLM-Med
Model Size: 10B
Accuracy: 80.6
Affiliation: Ant Group (Please note this information!)
-
🚀 The main contribution of AntGLM-Med-10B is summarized as follows: we propose a 3-stage optimization procedure for adapting a general LLM to the medical domain, consisting of continual pre-training, instruction tuning, and task adaptation. The base model is AntGLM-10B, a general LLM developed by Ant Group. AntGLM-10B features a maximum sequence length of 2048, 48 layers, a hidden size of 4096, 64 attention heads, and a prefix-decoder architecture. For other technical details, please refer to the above-mentioned paper.
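For illustration only, the sketch below (not the official training or model code) gathers the architecture figures quoted above and the three optimization stages into a single Python configuration object; the class and field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AntGLMMedConfig:
    """Illustrative summary of the AntGLM-Med-10B setup described above (hypothetical names)."""
    max_seq_length: int = 2048        # maximum context length
    num_layers: int = 48              # transformer layers
    hidden_size: int = 4096           # hidden dimension
    num_attention_heads: int = 64     # attention heads per layer
    architecture: str = "prefix-decoder"
    # The 3-stage optimization procedure for medical-domain adaptation
    optimization_stages: List[str] = field(default_factory=lambda: [
        "continual pre-training",
        "instruction tuning",
        "task adaptation",
    ])


if __name__ == "__main__":
    cfg = AntGLMMedConfig()
    print(cfg)
```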