---
license: cc-by-4.0
datasets:
- zirui3/TSSB-3M-instructions
- conceptofmind/FLAN_2022
- zirui3/zhihu_qa
- zirui3/cMedQA2-instructions
tags:
- code
---
# summary
This model is bigcode/starcoder fine-tuned on a code-generation instruction dataset and on natural-language instruction datasets in Chinese and English.
# dataset
* codegen-instruct
* [zirui3/TSSB-3M-instructions](https://huggingface.co/datasets/zirui3/TSSB-3M-instructions) (Python code bug fixes)
* FLAN (English)
* [OIG](https://huggingface.co/datasets/laion/OIG) (Open-Assistant, English)
* [zirui3/zhihu_qa](https://huggingface.co/datasets/zirui3/zhihu_qa) (Chinese)
* [COIG](https://huggingface.co/datasets/BAAI/COIG) (Chinese)
* pCLUE (Chinese)
* [zirui3/cMedQA2-instructions](https://huggingface.co/datasets/zirui3/cMedQA2-instructions) (Chinese medical domain)