metadata
language:
- ko
metrics:
- accuracy
- f1
pipeline_tag: text-classification
XLM-Roberta-base --> 8emotions!
Label Dictionary
- label_dictionary
- emo2int = { "기쁨": 0, "당황": 1, "분노": 2, "불안": 3, "상처": 4, "슬픔": 5, "중립": 6 }
- kore2en = { "기쁨": "joy", "당황": "surprise", "분노": "anger", "불안": "fear", "상처": "hurt", "슬픔": "sadness", "중립": "neutral" }
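As a minimal sketch of how these two dictionaries might be used, the snippet below inverts `emo2int` so that a predicted class index can be decoded into its Korean label and English gloss. The names `int2emo` and `decode_label` are hypothetical helpers, not part of the model's API, and the sketch assumes the model's output indices follow `emo2int`.

```python
from typing import Tuple

# Label dictionaries as listed in this card.
emo2int = {"기쁨": 0, "당황": 1, "분노": 2, "불안": 3, "상처": 4, "슬픔": 5, "중립": 6}
kore2en = {"기쁨": "joy", "당황": "surprise", "분노": "anger", "불안": "fear",
           "상처": "hurt", "슬픔": "sadness", "중립": "neutral"}

# Hypothetical inverse mapping: class index -> Korean label.
int2emo = {idx: emo for emo, idx in emo2int.items()}


def decode_label(class_id: int) -> Tuple[str, str]:
    """Return (Korean label, English label) for a predicted class index."""
    korean = int2emo[class_id]
    return korean, kore2en[korean]


print(decode_label(6))
```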
Dataset
Emotional Dialogue Corpus (감성대화말뭉치, AI Hub)
- Input format (recommendation): this model was trained on a chatbot dialogue dataset.
- ref: https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&dataSetSn=86
Korean Continuous Dialogue Dataset with Emotion Information (한국어 감정 정보가 포함된 연속적 대화 데이터셋, AI Hub)
The first dataset (dataSetSn=86) has no neutral class, so this additional dataset was used.
ref: https://aihub.or.kr/aihubdata/data/view.do?dataSetSn=271
Finally, the two datasets were concatenated.
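The concatenation step could look roughly like the sketch below. It is illustrative only: the rows are made-up placeholders rather than real AI Hub records, the `(text, label)` layout and the names `emotion_rows`, `neutral_rows`, and `merge_datasets` are assumptions, and the actual preprocessing is not described in this card.

```python
from collections import Counter

emo2int = {"기쁨": 0, "당황": 1, "분노": 2, "불안": 3, "상처": 4, "슬픔": 5, "중립": 6}

# Placeholder rows standing in for the two AI Hub datasets (not real data).
emotion_rows = [("[USR] 오늘 너무 행복해.", "기쁨"), ("[USR] 정말 화가 나.", "분노")]
neutral_rows = [("[USR] 별일 없어.", "중립")]


def merge_datasets(*datasets):
    """Concatenate (text, label) rows and map each Korean label to its integer id."""
    merged = []
    for rows in datasets:
        for text, label in rows:
            merged.append((text, emo2int[label]))
    return merged


train_rows = merge_datasets(emotion_rows, neutral_rows)

# Count examples per class id to confirm the neutral class is now present.
label_counts = Counter(label for _, label in train_rows)
print(label_counts)
```

Checking the per-class counts after merging is a quick way to confirm that the neutral class (id 6) made it into the combined set.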
Input Format (please use the special tokens [USR] and [BOT] when calling the model API!)
(example) [USR] 안녕. [BOT] 안녕하세요! 무엇을 도와드릴까요? [USR] 별일 없어.
Please be sure to use these two special tokens.
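One way to build an input string in this format is sketched below. The helper `format_dialogue` is hypothetical and not shipped with the model. It only assumes that each turn is prefixed with `[USR]` or `[BOT]` and that turns are joined by single spaces, as in the example above.

```python
def format_dialogue(turns):
    """Join (speaker, utterance) pairs into one '[USR] ... [BOT] ...' string."""
    allowed = {"USR", "BOT"}
    parts = []
    for speaker, utterance in turns:
        if speaker not in allowed:
            raise ValueError(f"unknown speaker: {speaker!r}")
        # Prefix each utterance with its special token.
        parts.append(f"[{speaker}] {utterance.strip()}")
    return " ".join(parts)


turns = [
    ("USR", "안녕."),
    ("BOT", "안녕하세요! 무엇을 도와드릴까요?"),
    ("USR", "별일 없어."),
]
text = format_dialogue(turns)
print(text)
```

The resulting string is what would be passed to the model or its inference API as the text input.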