PatrickR
RunPod
Created by vladfaust on 9/5/2024 in #⚡|serverless
I shouldn't be paying for this
How are you loading your model?
7 replies
RunPod
Created by Heartthrob10 on 8/1/2024 in #⚡|serverless
how to set a max output token
No description
12 replies
RunPod
Created by NERDDISCO on 7/25/2024 in #⚡|serverless
Llama 3.1 via Ollama
Reverted in the docs
19 replies
RunPod
Created by NERDDISCO on 7/25/2024 in #⚡|serverless
Llama 3.1 via Ollama
Downgrade the Docker image to 0.0.7.
19 replies
RunPod
Created by NERDDISCO on 7/25/2024 in #⚡|serverless
Llama 3.1 via Ollama
No description
19 replies
RunPod
Created by NERDDISCO on 7/25/2024 in #⚡|serverless
Llama 3.1 via Ollama
Also, the Docker image was just updated to version 0.0.9: pooyaharatian/runpod-ollama:0.0.9
19 replies
RunPod
Created by NERDDISCO on 7/25/2024 in #⚡|serverless
Llama 3.1 via Ollama
Yes. Like orca-mini or llama3.1
19 replies
RunPod
Created by NERDDISCO on 7/25/2024 in #⚡|serverless
Llama 3.1 via Ollama
Docs on that Docker image are now updated. Thanks for the ping!
19 replies
RunPod
Created by heyado on 7/10/2024 in #⚡|serverless
Can I use a golang handler with serverless?
17 replies
RunPod
Created by BadNoise on 7/5/2024 in #⚡|serverless
Pipeline is not using gpu on serverless
No description
70 replies
RunPod
Created by BadNoise on 7/5/2024 in #⚡|serverless
Pipeline is not using gpu on serverless
I rebuilt the Docker image based on another base image:
FROM runpod/base:0.6.1-cuda12.2.0

COPY builder/requirements.txt /requirements.txt
RUN python3.11 -m pip install --upgrade pip && \
    python3.11 -m pip install --upgrade -r /requirements.txt --no-cache-dir && \
    rm /requirements.txt

ADD . /

CMD python3.11 -u /src/handler.py
70 replies
RunPod
Created by BadNoise on 7/5/2024 in #⚡|serverless
Pipeline is not using gpu on serverless
Yes, the output of the device is GPU. BTW, I used the CLI tool runpodctl project create for faster iteration cycles and to avoid constantly rebuilding the Docker image.
70 replies
RunPod
Created by BadNoise on 7/5/2024 in #⚡|serverless
Pipeline is not using gpu on serverless
So I am getting the GPU to run through CUDA.
70 replies
RunPod
Created by BadNoise on 7/5/2024 in #⚡|serverless
Pipeline is not using gpu on serverless
Here is my Python code:
import torch
import runpod
from runpod.serverless.utils.rp_validator import validate
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Use the GPU when CUDA is available, otherwise fall back to CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

INPUT_SCHEMA = {
    'sequence': {
        'type': str,
        'required': True
    },
    'labels': {
        'type': list,
        'required': True,
    }
}

def classify_text(sequence, labels):
    model = AutoModelForSequenceClassification.from_pretrained(
        "facebook/bart-large-mnli",
        local_files_only=False  # False allows downloading if not cached locally
    ).to(device)
    tokenizer = AutoTokenizer.from_pretrained(
        "facebook/bart-large-mnli",
        local_files_only=False  # False allows downloading if not cached locally
    )

    classifier = pipeline(
        "zero-shot-classification",
        model=model,
        tokenizer=tokenizer,
        device=0,  # GPU index 0
    )

    return classifier(sequence, labels, multi_label=True)

async def handler(job):
    val_input = validate(job['input'], INPUT_SCHEMA)
    if 'errors' in val_input:
        return {"error": val_input['errors']}
    val_input = val_input['validated_input']

    classification_result = classify_text(val_input["sequence"], val_input["labels"])

    return {
        "classification_result": classification_result,
        "device": str(device)
    }

runpod.serverless.start({"handler": handler, "concurrency_modifier": lambda x: 1000})
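Worth noting: classify_text re-creates the model, tokenizer, and pipeline on every request, so each job pays the full model-load time even on a warm worker. A minimal sketch of loading them once at module scope instead (same model and handler shape as above; untested, so treat it as a starting point):
import torch
import runpod
from runpod.serverless.utils.rp_validator import validate
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

INPUT_SCHEMA = {
    'sequence': {'type': str, 'required': True},
    'labels': {'type': list, 'required': True},
}

# Loaded once when the worker starts, not on every request
model = AutoModelForSequenceClassification.from_pretrained("facebook/bart-large-mnli").to(device)
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-mnli")
classifier = pipeline(
    "zero-shot-classification",
    model=model,
    tokenizer=tokenizer,
    device=0 if torch.cuda.is_available() else -1,  # pipeline convention: -1 means CPU
)

async def handler(job):
    val_input = validate(job['input'], INPUT_SCHEMA)
    if 'errors' in val_input:
        return {"error": val_input['errors']}
    val_input = val_input['validated_input']
    result = classifier(val_input["sequence"], val_input["labels"], multi_label=True)
    return {"classification_result": result, "device": str(device)}

runpod.serverless.start({"handler": handler})
(The concurrency_modifier from the original is dropped here; a single shared pipeline is not something you'd want 1000 concurrent jobs hitting at once.)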
70 replies
RunPod
Created by BadNoise on 7/5/2024 in #⚡|serverless
Pipeline is not using gpu on serverless
Hey, so I went through this and I have this input:
{
    "input": {
        "sequence": "The weather is sunny today.",
        "labels": ["weather", "sports", "news"]
    }
}
and this output:
{
    "id": "test-822c3793-23b3-4464-8b65-972bb5776867",
    "status": "COMPLETED",
    "output": {
        "classification_result": {
            "sequence": "The weather is sunny today.",
            "labels": ["weather", "news", "sports"],
            "scores": [0.989009439945221, 0.24655567109584808, 0.008112689480185509]
        },
        "device": "cuda"
    }
}
70 replies
RunPod
Created by BadNoise on 7/5/2024 in #⚡|serverless
Pipeline is not using gpu on serverless
Risky click 😆
70 replies
RunPod
Created by BadNoise on 7/5/2024 in #⚡|serverless
Pipeline is not using gpu on serverless
That would be useful, yes! Would love to test it out and see what is going on.
70 replies
RunPod
Created by singhtanmay345 on 3/4/2024 in #⚡|serverless
IN-QUEUE Indefinitely
Do you have a repo I can check out? I would be interested in seeing what we can add to help support this.
26 replies
RunPod
Created by BadNoise on 7/5/2024 in #⚡|serverless
Pipeline is not using gpu on serverless
I have a feeling this line:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
is doing something funky. You should try adding a few prints right after it:
print(torch.cuda.is_available())
print(torch.cuda.device_count())
print(torch.cuda.memory_allocated())
print(torch.cuda.memory_reserved())
and see if your code thinks it is running on the CPU.
70 replies
RunPod
Created by ssssteven on 7/3/2024 in #⚡|serverless
network connections are very slow, Failed to return job results.
For longer jobs you'll want to use the run endpoint rather than runsync.
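For reference, a rough sketch of the difference, assuming the standard serverless HTTP API (the endpoint ID and API key below are placeholders): runsync blocks until the job finishes and can time out on long jobs, while run returns a job ID immediately that you poll via /status:
import time
import requests

API_KEY = "YOUR_RUNPOD_API_KEY"    # placeholder
ENDPOINT_ID = "YOUR_ENDPOINT_ID"   # placeholder
BASE = f"https://api.runpod.ai/v2/{ENDPOINT_ID}"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}
payload = {"input": {"sequence": "The weather is sunny today.", "labels": ["weather", "sports", "news"]}}

# /runsync: blocks until the job completes -- fine for short jobs only
# resp = requests.post(f"{BASE}/runsync", json=payload, headers=HEADERS)

# /run: returns immediately with a job id; poll /status until a terminal state
job = requests.post(f"{BASE}/run", json=payload, headers=HEADERS).json()
job_id = job["id"]
while True:
    status = requests.get(f"{BASE}/status/{job_id}", headers=HEADERS).json()
    if status["status"] in ("COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"):  # assumed terminal states
        break
    time.sleep(2)  # simple fixed-interval polling
print(status)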
40 replies