Yasmin Posts - Answer Overflow

Yasmin

•Created by Yasmin on 10/16/2024 in #⛅｜pods

Llama

Hello! For those who tried, how much GPU is needed for inference only, and for fine-tuning of Llama 70B? How about the inference of the 400B version (for knowledge distillation)? Is the quality difference worth it? Thanks!
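
As a rough way to frame the question, the memory needed just to hold the weights can be estimated as parameter count times bytes per parameter. The sketch below applies that rule of thumb to the 70B and 400B sizes named in the post. The function name is hypothetical, and the estimate deliberately ignores the KV cache, activations, and (for fine-tuning) gradients and optimizer state, so real requirements will be higher.

```python
# Bytes per parameter for common weight precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory for the model weights alone, in GB (1 GB = 1e9 bytes).

    Hypothetical helper: excludes KV cache, activations, gradients,
    and optimizer state.
    """
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    # Model sizes as named in the post.
    for name, num_params in (("70B", 70e9), ("400B", 400e9)):
        for precision, nbytes in BYTES_PER_PARAM.items():
            gb = weight_memory_gb(num_params, nbytes)
            print(f"{name} @ {precision}: ~{gb:.0f} GB for weights")
```

Dividing the result by the memory of a single GPU gives a lower bound on how many GPUs are needed before any runtime overhead is counted.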

7 replies

RunPod

•Created by Yasmin on 6/17/2024 in #⛅｜pods

Terminate POD with SSH

Hello! I use the following command over SSH to stop and then terminate the pod. The pod stops, but the interface only marks it as "Exited", so the second command does not seem to run.

nohup bash -c "sleep 1h; runpodctl stop pod $RUNPOD_POD_ID && runpodctl remove pod $RUNPOD_POD_ID" &

I would like to fully terminate the pod from SSH so that it does not incur charges once the task has finished and I no longer need it. How can I adjust the command to fully terminate the pod? Thanks!
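
One possible explanation, offered as an assumption rather than confirmed behavior: `runpodctl stop pod` stops the container that the shell itself is running in, so the chained `runpodctl remove pod` never gets a chance to execute. Under that assumption, a sketch that skips the stop step and calls only `remove pod` (the same subcommand used in the post) after the delay could look like this. The helper names and the injectable `run` and `sleep` parameters are hypothetical, added only to keep the sketch testable.

```python
import os
import subprocess
import time


def build_remove_command(pod_id: str) -> list:
    """Build the `runpodctl remove pod <id>` command from the post."""
    if not pod_id:
        raise ValueError("pod id is empty; is RUNPOD_POD_ID set?")
    return ["runpodctl", "remove", "pod", pod_id]


def terminate_after(delay_seconds, pod_id=None,
                    run=subprocess.run, sleep=time.sleep):
    """Wait, then remove the pod directly (no separate stop step)."""
    if pod_id is None:
        # RUNPOD_POD_ID is the environment variable used in the post.
        pod_id = os.environ.get("RUNPOD_POD_ID", "")
    command = build_remove_command(pod_id)
    sleep(delay_seconds)
    return run(command, check=True)


# Example (not executed here): remove the pod after one hour.
# terminate_after(3600)
```

Saved to a script, this could be launched in the background with `nohup` in the same way as the original one-liner.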

8 replies

RunPod

•Created by Yasmin on 6/8/2024 in #⛅｜pods

Very Slow Mapping

Hello! I am trying to run dataset.map(). On Colab it finishes in a few minutes, but on every RunPod machine I have tried, the progress estimate shows several hours remaining. I reported this to Support, but there is no solution yet. Has anyone faced a similar issue, and how did you solve it? The code below pre-processes an audio dataset for Whisper fine-tuning. Thanks!

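# feature_extractor, tokenizer, and dataset are created earlier in the notebook.
def prepare_dataset(batch):
  audio = batch["audio"]

  # Compute the Whisper input features from the raw audio array.
  batch["input_features"] = feature_extractor(audio["array"],
                     sampling_rate=audio["sampling_rate"]).input_features[0]

  # Tokenize the target text into label ids.
  batch["labels"] = tokenizer(batch["translation"]).input_ids

  return batch

# num_proc=None runs the mapping in a single process.
dataset = dataset.map(prepare_dataset,
                      remove_columns=dataset.column_names["train"],
                      num_proc=None)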
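
One thing worth checking (an assumption, not a diagnosis of this case) is how many CPU cores the process may actually use on each machine, since `num_proc=None` runs `map()` in a single process and a container may expose fewer usable cores than the host reports. A minimal sketch follows; the helper name `usable_cpu_count` is hypothetical.

```python
import os


def usable_cpu_count() -> int:
    """CPUs this process may run on, according to its affinity mask.

    Note: this does not account for cgroup CPU quotas. Falls back to
    os.cpu_count() on platforms without sched_getaffinity.
    """
    try:
        return len(os.sched_getaffinity(0))  # available on Linux
    except AttributeError:
        return os.cpu_count() or 1


print("host CPUs:", os.cpu_count(), "| usable by this process:", usable_cpu_count())

# Possible use with the call from the post (not executed here):
# dataset = dataset.map(prepare_dataset,
#                       remove_columns=dataset.column_names["train"],
#                       num_proc=usable_cpu_count())
```

Running the same check on Colab and on RunPod would show whether the two environments differ in available cores before any other cause is investigated.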

11 replies

RunPod

•Created by Yasmin on 4/27/2024 in #⛅｜pods

Performance A100-SXM4-40GB vs A100-SXM4-80GB

Hello! On Google Colab Pro I have one GPU, an NVIDIA A100-SXM4-40GB. On RunPod I have one GPU, an NVIDIA A100-SXM4-80GB. My notebook successfully fine-tunes Whisper-Small on Google Colab (40GB) with batch size 32. However, when I run the same notebook on RunPod (80GB), I get a GPU out-of-memory error; it only works with batch size 16. Why can the A100-SXM4-80GB not run the batch size that works on the A100-SXM4-40GB, and how can I fix it? Thanks!
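
A first step could be to compare the two environments, since differences in library versions or in GPU memory already in use are possible causes (assumptions, not findings for this case). The sketch below prints a small report to run on both machines; the function names are hypothetical, and it assumes PyTorch is the framework in use.

```python
import platform


def to_gib(num_bytes: int) -> float:
    """Convert bytes to GiB (2**30 bytes)."""
    return num_bytes / 2**30


def gpu_report() -> dict:
    """Collect Python, PyTorch, and GPU memory details for comparison."""
    report = {"python": platform.python_version()}
    try:
        import torch  # assumed framework; absent installs are reported as None
    except ImportError:
        report["torch"] = None
        return report

    report["torch"] = torch.__version__
    if torch.cuda.is_available():
        props = torch.cuda.get_device_properties(0)
        free_bytes, _total_bytes = torch.cuda.mem_get_info(0)
        report.update(
            gpu=props.name,
            total_gib=round(to_gib(props.total_memory), 2),
            free_gib=round(to_gib(free_bytes), 2),
            allocated_gib=round(to_gib(torch.cuda.memory_allocated(0)), 2),
            reserved_gib=round(to_gib(torch.cuda.memory_reserved(0)), 2),
        )
    return report


print(gpu_report())
```

If the free memory at notebook start is far below the total on one machine, or the library versions differ, that narrows down where to look next.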

5 replies
