Jun 30

Continuation of:
https://huggingface.co/jukofyork/Dark-Miqu-70B/discussions/3
I've succeeded in removing slop from CR+ for both sfw and nsfw scenarios using control vectors. Strangely, sfw unslop control vector did not affect nsfw slop, and nsfw control vector made model extra horny, which in my opinion is an undesirable side effect. While sfw vector managed to stay coherent during my stress tests, nsfw vector caused poor commandr to disintegrate, it didn't know what to say without any of those overused phrases in erotic fiction that the control vector stopped from appearing. Looks like the issue for nsfw is at much deeper level: the data where the model gets it from is very monotonous, and when forced write in different style, it doesn't know what to do. This is what most likely makes it incredibly difficult to remove nsfw slop using regular prompting techniques.

BigHuggyD

Jun 30

Well darn...

ChuckMcSneed

Jul 2

I'm making more progress with control vectors!
https://huggingface.co/ChuckMcSneed/control_vectors/blob/main/command-r-plus/bio/control_vector-commandr-bio.gguf
I tuned this one on very descriptive biological language as positive and vague flowery prose as negative. Seems to make it more aware of the biology and surroundings of characters.
https://huggingface.co/ChuckMcSneed/control_vectors/blob/main/command-r-plus/incharacter/control_vector-commandr-incharacter.gguf
This one makes the model act slightly more in character, but the improvement is not very significant as commandr is already quite good at it.

Control vector discussion

gemma-2:9b [default]

gemma-2:9b [dark=1]

gemma-2:27b [default]

gemma-2:27b [dark=1]

gemma-2:27b [show_dont_tell=1]

Chapter 7: The Hanged Man

wizard-lm-2:8x22b [shivers__no_shivers=-5.0]

mistral-7b-instruct (no finetune)

mistral-7b-instruct aggressive lora finetune on down_proj

mistral-7b-instruct the same aggressive lora finetune on "q_proj", "k_proj", "v_proj", "o_proj","gate_proj", "up_proj", "down_proj". This is obviously worse, 'import jsonl' is wrong, and no main method.

USER

ASSISTANT

USER

ASSISTANT

USER

ASSISTANT

USER

ASSISTANT

USER

ASSISTANT

USER

ASSISTANT

USER

ASSISTANT

USER

ASSISTANT

USER

ASSISTANT

USER

ASSISTANT

USER

ASSISTANT

USER

ASSISTANT

With system prompt above

mistral-large

llama-3:405b

No system prompt

mistral-large

llama-3:405b

Directive

Goal-Action

Back-Tracking

USER

ASSISTANT

USER

ASSISTANT

USER

ASSISTANT

USER

ASSISTANT

USER

ASSISTANT

1:

2:

1: Correlated Choices

2: Heteroscedasticity

USER

ASSISTANT

USER

ASSISTANT

BUT...

So...

Softmax Calculation Using Gumbel Distributions

Overview

Setting Up the Integration Formula

Simplifying to the Softmax Function

Conclusion

Original LLM output (bimodal / skewed names)

LLM output when we specify the name is "Tom" (unimodal)

What we would like the LLM to output (bimodal / de-skewed names)

Unimodal

Bimodal

So why can't we just use samplers to solve this?

To be continued...

Aside: can we actually use Cross-Entropy Loss (aka "negative log-loss") like this with non-one-hot encoded targets???

So what could go wrong?

1. "Mode Collapse"

2. Difficulty identifying the bimodal vs unimodal cases