freewheelin/free-evo-qwen72b-v0.8-re


metadata

language:
  - ko
  - en
license: mit

Model Card for free-evo-qwen72b-v0.8

1st place on the Open LLM Leaderboard on 4 May 2024, with an average score of 81.28.

However, the model was later removed from the leaderboard. Maybe our explanation was not sufficient.

Method

We were inspired by this Sakana project

Process

1. Two models with the same architecture are needed, so fine-tune a model to create a gap between the two.
1. Merge the original model and the fine-tuned one.
1. Evaluate the merged model.
1. Merge it again with the original model.
1. Evaluate again.
1. Keep going until the average evaluation score is higher than that of the original model.

That's it. Simple.
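
The loop above can be sketched roughly as follows. This is only an illustration, not the code used for this model: it assumes linear interpolation as the merge operator (the card does not say which merge method was used), and `merge`, `evolve`, `evaluate`, `alpha`, and `max_rounds` are hypothetical names. The weights are toy Python lists standing in for real checkpoints.

```python
from typing import Callable, Dict, List

# Toy stand-in for a model state dict: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def merge(a: Weights, b: Weights, alpha: float = 0.5) -> Weights:
    """Assumed merge operator: per-parameter linear interpolation."""
    return {
        name: [alpha * x + (1.0 - alpha) * y for x, y in zip(a[name], b[name])]
        for name in a
    }


def evolve(
    original: Weights,
    fine_tuned: Weights,
    evaluate: Callable[[Weights], float],
    max_rounds: int = 10,
) -> Weights:
    """Merge with the original and re-evaluate until the score beats the original."""
    baseline = evaluate(original)
    candidate = fine_tuned
    for _ in range(max_rounds):
        candidate = merge(original, candidate)  # steps 2 and 4: merge with the original
        if evaluate(candidate) > baseline:      # steps 3, 5, 6: evaluate and compare
            return candidate
    return candidate  # no improvement within max_rounds; last candidate is returned


# Toy usage: the "benchmark" rewards weights close to a target value of 1.0.
def toy_score(w: Weights) -> float:
    return -sum((x - 1.0) ** 2 for x in w["w"])


original = {"w": [0.0]}
fine_tuned = {"w": [4.0]}
result = evolve(original, fine_tuned, toy_score)
print(result)
```

In practice `evaluate` would be a full benchmark run averaged over tasks, which is by far the most expensive part of each round.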

Base Architecture

QWEN2

Base Models

Several QWEN2-based models