Error requiring "flash_attn"
I'm trying to run MiniCPM-V, which according to the docs supports vLLM (https://github.com/OpenBMB/MiniCPM-V/tree/main?tab=readme-ov-file#inference-with-vllm), but when I run it I get:
ImportError: This modeling file requires the following packages that were not found in your environment: flash_attn. Run pip install flash_attn
Any help on how to overcome this error? I was trying to use the web UI to configure serverless.
Solution
It looks like you need the flash_attn Python module. In the repo's requirements.txt, the flash_attn line is commented out; uncomment it so the package actually gets installed.
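A sketch of the change (the actual line in the repo's requirements.txt may carry a version pin; this one is illustrative):

```diff
-# flash_attn
+flash_attn
```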
After making that change you will need to rebuild the image.
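As a sketch, the uncomment step can be scripted; the stand-in requirements.txt contents and the image tag below are illustrative, not the repo's actual file:

```shell
# Illustrative only: create a stand-in requirements.txt with the flash_attn
# line commented out, then uncomment it the way the fix describes.
printf '%s\n' 'torch' '# flash_attn' > requirements.txt
sed -i 's/^#[[:space:]]*flash_attn/flash_attn/' requirements.txt
cat requirements.txt
# After editing the real file, rebuild the image, e.g.:
# docker build -t my-minicpmv-worker .
```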
Thanks for the tip - is this the requirements.txt file in the root directory of the
runpod/worker-v1-vllm:v1.3.1stable-cuda12.1.0
image? I don't see a commented-out line there. I can add flash_attn to the file, but I'm wondering if I'm using the wrong image.

It is in the root directory of the GitHub repo you listed previously:
https://github.com/OpenBMB/MiniCPM-V/tree/main?tab=readme-ov-file