ComfyUI Prompt Enhancement Guide: Using Ollama and LLMs for Better AI Image Generation
Recently, I came across the concept of "enhancing prompts": using large language models (LLMs) such as Gemma and LLaVA to improve your prompts and even describe your images. We'll use this feature in the Stable Diffusion tool ComfyUI, thanks to the custom node "comfyui_IF_AI nodes," with Ollama serving the LLMs. Setup is straightforward, so let's get started!
This post covers the following:
- Setting up Ollama
- Installing custom nodes
- Downloading the workflow
- Generating images using different styles of prompts
- Conclusion
Setting up Ollama and downloading the LLM models
To follow this guide, you'll need ComfyUI. You can either run it locally or create a cloud instance with JarvisLabs.
First, install Ollama, then pull the model you want to use. Open a terminal and paste the commands below.
# Linux only. Writing to /usr/bin needs root, so prefix both commands with sudo if you are not root.
curl -L https://ollama.com/download/ollama-linux-amd64 -o /usr/bin/ollama
chmod +x /usr/bin/ollama
If you are using Windows or macOS, see the official Ollama docs for installation instructions.
Once Ollama is installed, paste the command below to start the Ollama server.
ollama serve
For this post, we will use the llava model because it can understand both text and images. The "ollama serve" command keeps running in the foreground, so open a second terminal and paste the command below to download llava.
ollama pull llava
If you have any issues setting up Ollama, see our blog post "How to deploy ollama LLM model" for more information, or ping us on chat.
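Before moving on, it can help to confirm that the server is reachable and that llava was actually pulled. The sketch below is one way to do that from Python, assuming Ollama is listening on its default address (http://localhost:11434) and using its /api/tags endpoint, which lists locally available models. The helper names are illustrative, not part of Ollama or the custom node.

```python
import json
import urllib.error
import urllib.request
from typing import List, Optional

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default address


def model_names(tags_response: dict) -> List[str]:
    """Extract model names from the JSON returned by GET /api/tags."""
    return [m["name"] for m in tags_response.get("models", [])]


def has_model(tags_response: dict, wanted: str) -> bool:
    """True if `wanted` (e.g. "llava") matches a pulled model, ignoring its tag."""
    return any(name.split(":")[0] == wanted for name in model_names(tags_response))


def fetch_tags(base_url: str = OLLAMA_URL, timeout: float = 5.0) -> Optional[dict]:
    """Return the parsed /api/tags response, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError):
        return None


if __name__ == "__main__":
    tags = fetch_tags()
    if tags is None:
        print("Ollama is not reachable; is `ollama serve` running?")
    elif has_model(tags, "llava"):
        print("llava is available")
    else:
        print("Server is up, but llava is missing; run `ollama pull llava`")
```

If the script reports that the server is unreachable, check that "ollama serve" is still running in its own terminal.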
Installing custom nodes
We need to connect ComfyUI to the Ollama server. We can do this with a custom node called "ComfyUI_IF_AI nodes." In the ComfyUI manager, search for "ComfyUI_IF_AI" and click the install button. After installation is complete, be sure to restart the ComfyUI server.
For a walkthrough of installing custom nodes with the ComfyUI manager, check out this YouTube video, or see the ComfyUI documentation for alternative methods.
Downloading the workflow
You can find the workflow in the comfyUI_IF_AI GitHub repo. Navigate to the workflows folder, open the JSON file, and download it. Load the workflow in ComfyUI, and that's all the setup we need. Now enter your prompt in the "if prompt to prompt" node and click Queue Prompt. You will see your prompt enhanced by the LLM.
Make sure to set the base IP to your localhost address. Leave the port at its existing value; there is no need to change it. You can also connect the node to your OpenAI or Anthropic account instead.
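To make the idea concrete, here is a minimal sketch of what prompt enhancement amounts to: sending a short idea to the LLM together with an instruction to expand it. This is not the custom node's actual implementation. The system text, function names, and default style are hypothetical, and the address assumes Ollama's default port on localhost. It uses Ollama's /api/generate endpoint with streaming turned off.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: base IP = localhost, default port

# Hypothetical instruction text; the custom node ships its own style presets.
SYSTEM_TEMPLATE = (
    "You rewrite short image ideas into one detailed {style} prompt for a "
    "text-to-image model. Reply with the prompt only."
)


def build_enhance_request(idea: str, style: str = "cinematic", model: str = "llava") -> dict:
    """Build a non-streaming /api/generate payload asking the LLM to expand `idea`."""
    return {
        "model": model,
        "system": SYSTEM_TEMPLATE.format(style=style),
        "prompt": idea.strip(),
        "stream": False,
    }


def enhance(idea: str, style: str = "cinematic", base_url: str = OLLAMA_URL) -> str:
    """Send the request to a running Ollama server and return the enhanced prompt."""
    body = json.dumps(build_enhance_request(idea, style)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)["response"].strip()
```

With the server running, a call such as enhance("a fox in the snow", style="anime") would return the model's expanded prompt as a string. The output varies from run to run.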
Generating images using different styles of prompts
You can change the prompt style in the "if prompt to prompt" node. This node offers multiple styles, such as product, cinematic, and anime, and it offers negative prompt styles too.
The workflow also includes a group of nodes for image-to-prompt generation. You can pass in a single image directly, or the path to a folder where your images are stored.
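The image-to-prompt step can be sketched in a similar way. A multimodal model such as llava accepts images through the "images" field of Ollama's /api/generate endpoint, encoded as base64 strings. The code below is an illustration under those assumptions, not the node's own code. The helper names, the instruction text, and the list of file extensions are hypothetical.

```python
import base64
import json
import urllib.request
from pathlib import Path
from typing import List

OLLAMA_URL = "http://localhost:11434"  # assumption: default Ollama address
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}  # assumption: formats to pick up


def list_images(path: str) -> List[Path]:
    """Return a single image path, or every image file directly inside a folder."""
    p = Path(path)
    if p.is_dir():
        return sorted(f for f in p.iterdir() if f.suffix.lower() in IMAGE_SUFFIXES)
    return [p]


def build_describe_request(image_bytes: bytes, model: str = "llava") -> dict:
    """Build an /api/generate payload; Ollama expects images as base64 strings."""
    return {
        "model": model,
        "prompt": "Describe this image as a prompt for an image generator.",
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def describe(image_path: Path, base_url: str = OLLAMA_URL) -> str:
    """Ask a running Ollama server to describe one image file."""
    payload = build_describe_request(Path(image_path).read_bytes())
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=300) as resp:
        return json.load(resp)["response"].strip()
```

Looping over list_images(path) and calling describe on each entry covers both the single-image and the folder case.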
For better results, use a strong checkpoint model. Here is a list of my favourite models:
- Turbovision
- Juggernaut XL
- Dreamshaper XL
- Deliberate V8
Additionally, you can add LoRA weights in the load LoRA node. Make sure the LoRA matches your checkpoint's base model. For example, an SD1.5 checkpoint doesn't support an SDXL LoRA, so pick a LoRA trained for the same base model.
If you are facing any issues, ping us on chat. We are here to help you.