riverfog7
RunPod
Created by John lanser on 2/25/2025 in #⛅|pods
Run commands on restart
Create a marker file in the start command, and run different commands (the restart commands) when the file already exists?
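A minimal sketch of the marker-file idea (the marker path is hypothetical; on RunPod you would put it under /workspace so it survives restarts, since the container disk is wiped):

```shell
# First boot: marker absent -> run one-time setup, create the marker.
# Every restart after that: marker present -> run restart commands only.
MARKER="${TMPDIR:-/tmp}/.first_boot_done"   # on RunPod: /workspace/.first_boot_done

if [ ! -f "$MARKER" ]; then
    echo "first start: running setup"          # one-time setup goes here
    touch "$MARKER"
else
    echo "restart: running restart commands"   # restart-only commands go here
fi
```

This whole script would be the pod's start command; the branch taken depends only on whether the marker file survived.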
3 replies
RunPod
Created by Kalpak on 2/22/2025 in #⚡|serverless
Help with deploying WhisperX ($35 bounty)
It says huggingface_access_token on the Readme
10 replies
RunPod
Created by Lattus on 1/22/2025 in #⚡|serverless
Serverless deepseek-ai/DeepSeek-R1 setup?
Is the model you are trying to run a GGUF quant? You'll need a custom script for GGUF quants, or if there are multiple models in a single repo
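A tiny sketch of what the custom handling amounts to (the file name is borrowed from the script posted later in this digest): detect a .gguf file and add the GGUF quantization flag that vLLM needs:

```shell
# Pick one quant file out of the repo, then branch on its extension
MODEL_FILE="DeepSeek-R1-Distill-Llama-70B-Q4_K_M.gguf"

case "$MODEL_FILE" in
    *.gguf) EXTRA_ARGS="--quantization gguf" ;;   # GGUF quant: tell vLLM explicitly
    *)      EXTRA_ARGS="" ;;                      # regular HF checkpoint
esac

echo "$EXTRA_ARGS"
```

The flag would then be appended to the `vllm serve` invocation, as the full script in the "Serveless quants" thread below does.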
41 replies
RunPod
Created by Bj9000 on 1/27/2025 in #⚡|serverless
Serveless quants
Change --tensor-parallel-size to your GPU count
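Rather than hard-coding the count, it can be derived at start time. A sketch, assuming `nvidia-smi` is on the PATH inside the pod (with a fallback to 1 when it isn't):

```shell
# One line of `nvidia-smi -L` per GPU; empty output -> 0 -> fall back to 1
GPU_COUNT=$(nvidia-smi -L 2>/dev/null | wc -l)
[ "$GPU_COUNT" -ge 1 ] || GPU_COUNT=1

echo "tensor-parallel-size: $GPU_COUNT"
# vllm serve "$MODEL_PATH" --tensor-parallel-size "$GPU_COUNT"
```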
10 replies
RunPod
Created by Bj9000 on 1/27/2025 in #⚡|serverless
Serveless quants
start_vllm.sh
VLLM_API_KEY="asdfasdf"
MODEL_REPO="bartowski/DeepSeek-R1-Distill-Llama-70B-GGUF"
ORIG_MODEL="deepseek-ai/DeepSeek-R1-Distill-Llama-70B"
MODEL_FILE="DeepSeek-R1-Distill-Llama-70B-Q4_K_M.gguf"
RANDOM_SEED=42

chmod +x /install_requirements.sh
source /install_requirements.sh
rm /install_requirements.sh

#huggingface-cli login --token $HF_TOKEN
if [ ! -f "/workspace/models/${MODEL_REPO}/${MODEL_FILE}" ]; then
    mkdir -p "/workspace/models/${MODEL_REPO}"
    huggingface-cli download "${MODEL_REPO}" "${MODEL_FILE}" --local-dir "/workspace/models/${MODEL_REPO}"
fi

vllm serve \
    "/workspace/models/${MODEL_REPO}/${MODEL_FILE}" \
    --port 80 --api-key "${VLLM_API_KEY}" \
    --enable-reasoning --reasoning-parser "deepseek_r1" \
    --tokenizer "${ORIG_MODEL}" --kv-cache-dtype "auto" \
    --max-model-len 16384 --pipeline-parallel-size 1 \
    --tensor-parallel-size 2 --seed "${RANDOM_SEED}" \
    --swap-space 4 --cpu-offload-gb 0 --gpu-memory-utilization 0.95 \
    --quantization "gguf" --device "cuda"
10 replies
RunPod
Created by Bj9000 on 1/27/2025 in #⚡|serverless
Serveless quants
install_requirements.sh
MINICONDA_URL="https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh"
VLLM_USE_NIGHTLY=1

apt-get update

if [ ! -x /usr/bin/sudo ]; then
    apt-get install -y sudo
fi

if [ ! -x /usr/bin/wget ]; then
    sudo apt-get install -y wget
fi

if [ ! -x /usr/bin/screen ]; then
    sudo apt-get install -y screen
fi

if [ ! -x /usr/bin/nvtop ]; then
    sudo apt-get install -y nvtop
fi

# install miniconda
mkdir -p ~/miniconda3
wget $MINICONDA_URL -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm ~/miniconda3/miniconda.sh
~/miniconda3/condabin/conda init bash
source ~/miniconda3/etc/profile.d/conda.sh   # sourcing ~/.bashrc often no-ops in non-interactive shells

# install vllm and dependencies
conda create -n vllm python=3.12 -y
conda activate vllm
python -m pip install --upgrade pip
pip install -U "huggingface_hub[cli]"

if [ "$VLLM_USE_NIGHTLY" -eq 1 ]; then
    pip install vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
else
    pip install vllm
fi
10 replies
RunPod
Created by Bj9000 on 1/27/2025 in #⚡|serverless
Serveless quants
I do have a running script
10 replies
RunPod
Created by Stewette on 2/8/2025 in #⚡|serverless
The default steps on the website for serverless create broken containers that I am charged for.
8x MI300X should work, with 1.5 terabytes of VRAM
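The arithmetic behind that figure (each MI300X carries 192 GB of HBM3):

```shell
# 8 GPUs x 192 GB each
echo "$((8 * 192)) GB total"   # 1536 GB, i.e. ~1.5 TB
```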
6 replies
RunPod
Created by Justin on 2/17/2025 in #⚡|serverless
Baking model into Dockerimage
vllm serve --model /path/to/model does not work. You have to pass the path positionally: vllm serve /path/to/model
6 replies
RunPod
Created by 자베르 on 2/14/2025 in #⚡|serverless
Github Serverless building takes too much
GitHub Actions is good. It's free to some extent, the image gets hosted in a Docker registry (ghcr.io), and builds never sit queued for an hour
9 replies
RunPod
Created by Aayush999 on 2/10/2025 in #⛅|pods
Docker image from Docker hub
Check the CMD command; it needs to block (stay in the foreground), otherwise the container exits
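A container lives only as long as its main process (PID 1). A quick demonstration of the failure mode, with `sleep` standing in for a server process:

```shell
# Non-blocking: the command backgrounds the work and returns at once.
# As a Docker CMD, this makes the container stop immediately.
start=$(date +%s)
bash -c 'sleep 5 &'
nonblocking_elapsed=$(( $(date +%s) - start ))
echo "returned after ${nonblocking_elapsed}s"

# Blocking shape a CMD needs: exec keeps the server in the foreground
# (commented out so this demo itself returns):
# bash -c 'exec sleep 5'
```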
15 replies
RunPod
Created by LAB AI on 2/20/2025 in #⛅|pods
Pod image for network storage management
maybe make a template that changes the Docker CMD command to something like
bash -c 'apt update && apt install -y rsync && /start.sh'
2 replies
RunPod
Created by KamKam on 2/21/2025 in #⛅|pods
Limit Memory Usage
Look up the OOM killer
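It's the kernel's OOM killer, not RunPod, that acts when memory runs out: it picks a victim process and kills it. A sketch of how to inspect and bias it (dmesg may need root):

```shell
# Per-process kill bias: -1000 (never kill) .. 1000 (kill first)
cat /proc/self/oom_score_adj

# Past OOM kills show up in the kernel log:
# dmesg | grep -i 'killed process'
```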
2 replies
RunPod
Created by const on 2/21/2025 in #⛅|pods
H100 pod not connecting to network drive of the same region
It works now
11 replies
RunPod
Created by const on 2/21/2025 in #⛅|pods
H100 pod not connecting to network drive of the same region
Same problem on CA region A40 pods
11 replies
RunPod
Created by freedomk520 on 2/21/2025 in #⛅|pods
4 x A40 never ready in CA
Same issue here (just got one to start up, but the volume disk / network volume doesn't work) (CA region)
19 replies
RunPod
Created by drazenz on 2/19/2025 in #⛅|pods
Downloading models causes the pod to freeze
Also, the RAM utilization reported by RunPod is 0 (which is not plausible; free -h says 22 GiB used)
8 replies
RunPod
Created by drazenz on 2/19/2025 in #⛅|pods
Downloading models causes the pod to freeze
but the memory usage is fine and doesn't cause any memory problems
8 replies
RunPod
Created by drazenz on 2/19/2025 in #⛅|pods
Downloading models causes the pod to freeze
I just tried: downloading to /root/ (container disk) works, but /workspace (volume or network volume) doesn't work and freezes
8 replies
RunPod
Created by drazenz on 2/19/2025 in #⛅|pods
Downloading models causes the pod to freeze
Does downloading with huggingface-cli work?
8 replies