When I use the notebook, the results are clear and well aligned with the prompts. With the inference client, however, I mostly get noise or responses that don't seem to understand the input, and the output often lasts only 3–4 seconds before cutting off.
I'm not sure whether this is the expected behavior of the base model (i.e., it needs fine-tuning for better results) or whether there's an issue on my end, such as microphone input problems degrading the model's output.
Any guidance would be greatly appreciated!