Yasmin
RunPod
Created by Yasmin on 6/17/2024 in #⛅|pods
Terminate POD with SSH
Hello! I use the following command over SSH to stop and then terminate the pod. The pod stops, but the interface only marks it as "Exited", so it seems that the second command does not run. `nohup bash -c "sleep 1h; runpodctl stop pod $RUNPOD_POD_ID && runpodctl remove pod $RUNPOD_POD_ID" &` I would like to fully terminate it from SSH so that it does not incur charges once the task has finished and I no longer need the pod. How can I adjust the command to fully terminate the pod? Thanks!
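One possible explanation (an assumption, not confirmed in this thread) is that `runpodctl stop pod` shuts down the container the command itself runs in, so the chained `runpodctl remove pod` never gets a chance to execute. Below is a minimal sketch that skips the stop step and calls remove directly after the delay. The function names and the `runner`/`sleeper` parameters are hypothetical and exist only to make the sketch easy to test.

```python
import os
import subprocess
import time


def build_remove_command(pod_id):
    """Return the argv list that removes (terminates) a pod with runpodctl."""
    if not pod_id:
        raise ValueError("pod id is empty; is RUNPOD_POD_ID set?")
    return ["runpodctl", "remove", "pod", pod_id]


def terminate_pod_after(delay_seconds, pod_id=None,
                        runner=subprocess.run, sleeper=time.sleep):
    """Wait delay_seconds, then remove the pod without stopping it first.

    Assumption: removing directly avoids the case where stopping the pod
    kills this process before the remove command can run.
    """
    pod_id = pod_id or os.environ.get("RUNPOD_POD_ID", "")
    command = build_remove_command(pod_id)
    sleeper(delay_seconds)
    return runner(command, check=True)
```

Inside the pod, a script that calls `terminate_pod_after(3600)` could be started in the background with `nohup`, in the same way as the original one-liner.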
8 replies
RunPod
Created by Yasmin on 6/8/2024 in #⛅|pods
Very Slow Mapping
Hello! I am trying to run dataset.map(). It takes only a few minutes on Colab, but on any RunPod machine the estimated time to finish is several hours. I reported this to Support, but there is no solution yet. I wonder if anyone has faced a similar issue and knows how to solve it. The code below pre-processes an audio dataset for Whisper fine-tuning. Thanks!
def prepare_dataset(batch):
    audio = batch["audio"]

    batch["input_features"] = feature_extractor(audio["array"],
        sampling_rate=audio["sampling_rate"]).input_features[0]

    batch["labels"] = tokenizer(batch["translation"]).input_ids

    return batch

dataset = dataset.map(prepare_dataset,
                      remove_columns=dataset.column_names["train"],
                      num_proc=None)
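With `num_proc=None`, `dataset.map()` runs in a single process. One thing worth trying (it is not confirmed as the cause of the slowdown here) is passing an explicit worker count based on the CPUs the pod can actually use. The sketch below shows one way to pick that number; `usable_cpu_count` and `pick_num_proc` are hypothetical helper names, and the cap of 8 workers is an arbitrary assumption.

```python
import os


def usable_cpu_count():
    """Number of CPUs this process may run on.

    os.sched_getaffinity is Linux-only, so fall back to os.cpu_count().
    Note: neither call is guaranteed to reflect container CPU quotas.
    """
    try:
        return len(os.sched_getaffinity(0))
    except AttributeError:
        return os.cpu_count() or 1


def pick_num_proc(max_workers=8):
    """Choose a worker count for dataset.map, capped at max_workers."""
    return max(1, min(usable_cpu_count(), max_workers))


# Hypothetical usage with the snippet from the post:
# dataset = dataset.map(prepare_dataset,
#                       remove_columns=dataset.column_names["train"],
#                       num_proc=pick_num_proc())
```

Comparing the value of `usable_cpu_count()` on Colab and on the pod may also help show whether the two environments differ in available CPUs.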
11 replies
RunPod
Created by Yasmin on 4/27/2024 in #⛅|pods
Performance A100-SXM4-40GB vs A100-SXM4-80GB
Hello! I have one GPU on Google Colab Pro (NVIDIA A100-SXM4-40GB) and one GPU on RunPod (NVIDIA A100-SXM4-80GB). My notebook successfully fine-tunes Whisper-Small on Google Colab (40GB) with batch size 32. However, when I run the same notebook on RunPod (80GB), I get a GPU out-of-memory error; it only works with batch size 16. Is there any explanation for why the A100-SXM4-80GB cannot run the same batch size that works on the A100-SXM4-40GB, and is there a solution? Thanks!
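This does not explain the out-of-memory error, but as a stopgap the same effective batch size of 32 can be kept while only batch size 16 fits, by accumulating gradients over several steps. The sketch below only does the arithmetic; `accumulation_steps` is a hypothetical helper, and it assumes the notebook uses a trainer that supports gradient accumulation (for example the `per_device_train_batch_size` and `gradient_accumulation_steps` arguments of Hugging Face `TrainingArguments`).

```python
def accumulation_steps(target_batch_size, per_device_batch_size, num_devices=1):
    """Gradient accumulation steps needed to reach target_batch_size."""
    per_step = per_device_batch_size * num_devices
    if per_step <= 0:
        raise ValueError("batch size and device count must be positive")
    if target_batch_size % per_step != 0:
        raise ValueError("target batch size must be a multiple of the per-step batch")
    return target_batch_size // per_step


# Batch size 16 on one GPU, aiming for the effective batch size of 32 from Colab.
steps = accumulation_steps(target_batch_size=32, per_device_batch_size=16)
print(steps)  # prints 2
```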
5 replies