Training time, Resource and Training details

#1
by NLPKorea - opened

First of all, I would like to say thank you for your pioneering research on the Korean language.

I have two questions, as I am trying to replicate your approach of adding Korean language capability to the Yi model on my own.

The first question is about training time and the GPU resources you used.
I currently have 8× A100-80GB GPUs and plan to train on the same dataset as yours.
I would like to know whether this environment is sufficient to reproduce your work.
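For reference, here is the rough back-of-the-envelope estimate I am working from. The token count, MFU, and model size below are my own assumptions rather than figures from your work, so please correct me if they are far off:

```python
# Back-of-the-envelope training-time estimate for 8x A100-80GB.
# All figures below are my own assumptions, not numbers from this work.

params = 6e9          # assuming the 6B-parameter Yi base model
tokens = 40e9         # assumed size of the Korean corpus, in tokens
num_gpus = 8
peak_flops = 312e12   # A100 BF16 dense peak, FLOP/s per GPU
mfu = 0.4             # assumed model FLOPs utilization

# Common approximation: training costs ~6 FLOPs per parameter per token
total_flops = 6 * params * tokens
seconds = total_flops / (num_gpus * peak_flops * mfu)
print(f"Estimated wall-clock time: {seconds / 86400:.1f} days")
# -> roughly 17 days under these assumptions
```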

The second question is about how to feed Korean knowledge into the English-enhanced LLM.
I wonder whether continued pretraining on Korean data can be done in the same way as the original Yi pretraining process.
I am also curious whether the tokenizer requires separate training.
If so, I would appreciate it if you could share any related details.
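To make the second question concrete, the sketch below is roughly what I currently have in mind: train a Korean SentencePiece model separately, add its pieces to the Yi tokenizer, and resize the embeddings before continued pretraining. The corpus path, vocab size, and model name are placeholders, and simply calling `add_tokens` is a simplification rather than a true SentencePiece merge, so I would be grateful to know how your actual procedure differs:

```python
# Sketch of my current plan for extending the tokenizer with Korean tokens.
# Paths, vocab size, and model name are placeholders / my own assumptions.
import sentencepiece as spm
from transformers import AutoTokenizer, AutoModelForCausalLM

# 1) Train a separate Korean SentencePiece model on the Korean corpus
spm.SentencePieceTrainer.train(
    input="korean_corpus.txt",   # placeholder path
    model_prefix="ko_spm",
    vocab_size=20000,            # assumed size of the added Korean vocab
    model_type="bpe",
)

# 2) Add the new Korean pieces to the original Yi tokenizer
tokenizer = AutoTokenizer.from_pretrained("01-ai/Yi-6B")
sp = spm.SentencePieceProcessor(model_file="ko_spm.model")
new_tokens = [sp.id_to_piece(i) for i in range(sp.get_piece_size())]
num_added = tokenizer.add_tokens(new_tokens)

# 3) Resize the embeddings so continued pretraining can learn the new tokens
model = AutoModelForCausalLM.from_pretrained("01-ai/Yi-6B")
model.resize_token_embeddings(len(tokenizer))
print(f"Added {num_added} Korean tokens; new vocab size = {len(tokenizer)}")
```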

Thank you.
