Hello!
Thank you for the excellent service you provide. I’ve been using Llama 3.1 70B for creative writing tasks, particularly for generating long-form content that takes advantage of its large context window.
However, I’ve noticed that the output tends to become repetitive as it gets longer, which limits the model's usefulness for sustained creative workflows. To address this, would it be possible to support additional sampling parameters such as min-p, frequency penalty, and presence penalty?
I believe many users would benefit from finer control over repetition, and these parameters are widely used to increase text variety without sacrificing coherence.
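
For reference, here is a rough sketch of what I understand these three parameters to do at each sampling step. It is only an illustration in plain Python, not a description of how your service or any particular inference engine implements them; the function names (`apply_penalties`, `min_p_filter`) are hypothetical, and the penalty formula follows the commonly used convention of subtracting `count * frequency_penalty + presence_penalty` from the logit of every token that has already been generated.

```python
import math
from collections import Counter

def apply_penalties(logits, generated_ids, frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the logits of tokens that already appear in the output.

    frequency_penalty scales with how often a token was generated;
    presence_penalty is a flat deduction applied once per seen token.
    """
    counts = Counter(generated_ids)
    adjusted = list(logits)
    for token_id, count in counts.items():
        adjusted[token_id] -= count * frequency_penalty + presence_penalty
    return adjusted

def softmax(logits):
    """Convert logits to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def min_p_filter(probs, min_p):
    """Drop tokens whose probability is below min_p times the top probability,
    then renormalize the remaining probabilities so they sum to 1."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]

# Toy vocabulary of three tokens; token 0 was generated twice, token 1 once.
logits = [2.0, 1.0, 0.0]
penalized = apply_penalties(logits, [0, 0, 1],
                            frequency_penalty=0.5, presence_penalty=0.1)
probs = min_p_filter(softmax(penalized), min_p=0.2)
```

In this toy case the penalties shrink the gap between the repeated tokens and the unused one, and min-p then removes only candidates that are very unlikely relative to the top choice. That combination is what I am hoping to be able to tune.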