meta-llama/Llama-3.1-8B-Instruct · The model does not follow the instruction.

I am trying to fine-tune the model with LoRA on instruction data of my own. Below is the format of the inputs I fine-tune on. In the instruction, I ask the model to answer the question using only one word, 'Yes' or 'No'.
Instruction: ....
context: ....
question: ....
answer: Yes.
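
To make the layout concrete, here is a minimal sketch of how one example in this format could be assembled. The function name `build_example` and its parameters are hypothetical and only for illustration; this is not my exact preprocessing code.

```python
from typing import Optional


def build_example(instruction: str, context: str, question: str,
                  answer: Optional[str] = None) -> str:
    """Assemble one example in the Instruction/context/question/answer layout.

    If `answer` is None, the text stops at "answer:" so it can serve as an
    inference prompt. Otherwise the one-word answer is appended, followed by
    a period, as in the training format shown above.
    """
    lines = [
        f"Instruction: {instruction}",
        f"context: {context}",
        f"question: {question}",
        "answer:" if answer is None else f"answer: {answer}.",
    ]
    return "\n".join(lines)


# Hypothetical sample values, only to show the resulting text.
train_text = build_example(
    "Answer the question using only one word, 'Yes' or 'No'.",
    "The sky was clear all day.",
    "Did it rain?",
    "No",
)
prompt_text = build_example(
    "Answer the question using only one word, 'Yes' or 'No'.",
    "The sky was clear all day.",
    "Did it rain?",
)
```

The same function produces both the training text (with the answer) and the inference prompt (without it), so the two cannot drift apart in wording or spacing.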

However, the model does not follow the format or the instruction. Most of the time the answer it generates is nonsense, and when it does make sense it contains all sorts of random characters along with the 'Yes' or 'No' answer (for example, it outputs 'Answer:nbsp;Yes.'). Things I have done so far to fix the issue:

Trying different prompts.
Applying LoRA to different layers (right now it is applied to the query, key, and value layers).
Reducing the temperature to avoid hallucination.
Limiting the number of newly generated tokens to a small value.
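
For reference, the last three items correspond roughly to settings like the ones below. This is only a sketch under stated assumptions: all numeric values are placeholders rather than the exact ones from my runs, and `q_proj`, `k_proj`, and `v_proj` are the names of the query, key, and value projection modules in the Hugging Face Llama implementation. The dictionaries are meant to be passed as `peft.LoraConfig(**lora_kwargs)` and `model.generate(**inputs, **generation_kwargs)`.

```python
# LoRA settings: adapters on the attention query, key, and value projections.
# The rank, alpha, and dropout values are placeholders for illustration.
lora_kwargs = {
    "r": 8,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj"],
    "task_type": "CAUSAL_LM",
}

# Generation settings: low temperature and a small cap on new tokens.
# The values are placeholders for illustration.
generation_kwargs = {
    "do_sample": True,
    "temperature": 0.1,
    "max_new_tokens": 5,
}
```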

Any advice on how to solve the issue is much appreciated.

Thank you so much!