RRunPod
Created by gdimanov on 6/16/2024 in #⛅|pods
Ram issue
Okay, managed to get it working, ty guys, had to restrict --max-model-len
14 replies
could you reference me the documentation where you got "--tensor-parallel-size" from?
something new : "The model's max seq len (131072) is larger than the maximum number of tokens that can be stored in KV cache (74192). Try increasing gpu_memory_utilization or decreasing max_model_len when initializing the engine"
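The error above resolves once `--max-model-len` is restricted below the KV cache budget (74192 tokens here), as noted later in the thread. A minimal launch sketch, assuming the vLLM OpenAI-compatible server entrypoint; the value 65536 is an illustrative assumption, while the flags and model name come from this thread:

```shell
# Sketch of a vLLM launch with the context length capped under the
# 74192-token KV cache budget reported in the error message.
# 65536 is an assumed example value, not from the thread.
python -m vllm.entrypoints.openai.api_server \
  --host 0.0.0.0 \
  --model cognitivecomputations/dolphin-2.9.2-qwen2-7b \
  --max-model-len 65536 \
  --gpu-memory-utilization 0.95
```

Alternatively, raising `--gpu-memory-utilization` enlarges the KV cache budget itself, as the error message suggests.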
"--host 0.0.0.0 --model cognitivecomputations/dolphin-2.9.2-qwen2-7b --tensor-parallel-size 3"
The app also seems to be restarting non-stop, maybe due to the error; I get a "start container" message every 20 seconds.