annasuhstuff
RunPod
•Created by annasuhstuff on 8/5/2024 in #⚡|serverless
HF_TOKEN question
But I really don't think that you @nerdylive are a bad community helper, because statistically people mostly make the most common mistakes and forget to check the basics)
26 replies
RunPod
•Created by annasuhstuff on 8/5/2024 in #⚡|serverless
HF_TOKEN question
'anna, try to check if the token has access to the model repo in HF' --
well, as I said in the beginning, if I run it without RunPod but log in with token '1234' and it WORKS, it means my token '1234' is valid
but when I put '1234' in RunPod, it somehow breaks down and ends with an error...
anyways, I have fixed it just by refreshing the page and inputting the token again)
But indeed, my friend @nerdylive, it was kinda meaningless to check the token a third or fourth time as you suggested)
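For anyone else debugging this: a quick way to check both the token's validity and its access to a specific repo in one go (a minimal sketch; the repo name is just the example from the LoRA thread below, substitute your own):

from huggingface_hub import HfApi, whoami

token = "hf_xxxx"  # the same token you pass to RunPod as HF_TOKEN
print(whoami(token=token))  # raises if the token itself is invalid

api = HfApi(token=token)
info = api.model_info("alsokit/eLM-mini-4B-4K-4bit-v01")  # raises if the token lacks access
print(info.id)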
26 replies
RunPod
•Created by annasuhstuff on 8/7/2024 in #⚡|serverless
"IN QUEUE" and nothing happeneds
@Encyrption dear friend, you have just explained everything I needed. Indeed I had forgotten the handler, silly me. Now everything works perfectly! Thank you!
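For anyone landing here with the same symptom: a serverless worker must register a handler with the RunPod SDK, roughly like this (a minimal sketch, not the actual code from this thread):

import runpod

def handler(job):
    # job["input"] carries the JSON sent in the request's "input" field
    prompt = job["input"].get("prompt", "")
    return {"output": f"echo: {prompt}"}

# Without this call the container starts but never pulls jobs,
# so every request sits "IN QUEUE" forever.
runpod.serverless.start({"handler": handler})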
7 replies
RunPod
•Created by annasuhstuff on 8/7/2024 in #⚡|serverless
"IN QUEUE" and nothing happeneds
@nerdylive sorry, but even when I do curl -X POST https://www.runpod.io/console/serverless/user/endpoint/jhfkefjljllckt -H 'Content-Type: application/json' -H 'Authorization: Bearer xxxxx' -d '{"input": {"prompt": "cancer is:"}}', it just sits in queue and nothing happens.
What exactly should I find in the documentation? Can you explain, please?
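One hedged observation: that URL is the RunPod web console, not the API. Serverless requests go to api.runpod.ai; a sketch of the same call in Python (endpoint ID and API key are placeholders):

import requests

ENDPOINT_ID = "xxxxxxxx"   # placeholder: your serverless endpoint ID
API_KEY = "xxxxx"          # placeholder: your RunPod API key

# /runsync waits for the result; /run returns a job ID you poll via /status
resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"input": {"prompt": "cancer is:"}},
)
print(resp.json())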
7 replies
RunPod
•Created by annasuhstuff on 8/5/2024 in #⚡|serverless
HF_TOKEN question
somehow it launches just from VS Code, but not here)
the token is the same
26 replies
RunPod
•Created by annasuhstuff on 8/5/2024 in #⚡|serverless
HF_TOKEN question
yes, this is the model we trained at my company)
26 replies
RunPod
•Created by annasuhstuff on 6/27/2024 in #⚡|serverless
Quantization method
now I have this error
9 replies
RunPod
•Created by annasuhstuff on 6/27/2024 in #⚡|serverless
Quantization method
2024-06-27T10:50:05.563358317Z ValueError: Quantization method specified in the model config (bitsandbytes) does not match the quantization method specified in the quantization argument (gptq).
9 replies
RunPod
•Created by annasuhstuff on 6/27/2024 in #⚡|serverless
Quantization method
Thank you so much, now I get it.
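For readers who hit the same ValueError: it means the quantization method recorded in the model's config.json must match whatever you pass as the quantization argument. A hedged sketch for checking what the config declares (the model name is a placeholder):

from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("your-org/your-model")  # placeholder repo
# quantization_config holds quant_method, e.g. "bitsandbytes" or "gptq";
# the engine's quantization argument has to agree with it.
print(getattr(cfg, "quantization_config", None))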
9 replies
RunPod
•Created by annasuhstuff on 6/26/2024 in #⚡|serverless
LoRA adapter on Runpod.io (using vLLM Worker)
yes
21 replies
RunPod
•Created by annasuhstuff on 6/26/2024 in #⚡|serverless
LoRA adapter on Runpod.io (using vLLM Worker)
above I have written the code which does not use PEFT
after merging the adapter and the base model, I have a config, but the model quality gets a lot worse
so I thought I could set up an API endpoint without a config
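For context, merging usually looks something like this (a minimal sketch using PEFT's merge_and_unload; repo names are taken from this thread): saving the merged model is what produces a full config.json. One hedged caveat: merging into a 4-bit base is lossy, which may explain the quality drop; merging into the full-precision base and re-quantizing afterwards is often suggested.

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("unsloth/phi-3-mini-4k-instruct-bnb-4bit")
model = PeftModel.from_pretrained(base, "alsokit/eLM-mini-4B-4K-4bit-v01")

# Fold the LoRA weights into the base weights and drop the PEFT wrapper
merged = model.merge_and_unload()
merged.save_pretrained("merged-model")  # writes config.json next to the weights
AutoTokenizer.from_pretrained("unsloth/phi-3-mini-4k-instruct-bnb-4bit").save_pretrained("merged-model")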
21 replies
RunPod
•Created by annasuhstuff on 6/26/2024 in #⚡|serverless
LoRA adapter on Runpod.io (using vLLM Worker)
Are there any ways to avoid this error? (no config found)
Yes, with PEFT I need it, but with the other method I don't
so, to enable an API endpoint, as I see it, a config is also a must-have?
21 replies
RunPod
•Created by annasuhstuff on 6/26/2024 in #⚡|serverless
LoRA adapter on Runpod.io (using vLLM Worker)
is there any way to set up an endpoint using my model without a config (since I only have adapter_config), or do I have to somehow change the model?
21 replies
RunPod
•Created by annasuhstuff on 6/26/2024 in #⚡|serverless
LoRA adapter on Runpod.io (using vLLM Worker)
2024-06-26T12:49:19.602715611Z OSError: alsokit/eLM-mini-4B-4K-4bit-v01 does not appear to have a file named config.json. Checkout 'https://huggingface.co/alsokit/eLM-mini-4B-4K-4bit-v01/tree/main' for available files.
21 replies
RunPod
•Created by annasuhstuff on 6/26/2024 in #⚡|serverless
LoRA adapter on Runpod.io (using vLLM Worker)
the problem is, I only have adapter_config
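A quick way to confirm that from the outside (a minimal sketch; pass token=... for a private repo): list the repo's files and look for config.json versus adapter_config.json.

from huggingface_hub import HfApi

files = HfApi().list_repo_files("alsokit/eLM-mini-4B-4K-4bit-v01")
print(files)                   # an adapter-only repo shows adapter_config.json
print("config.json" in files)  # False here, which is what the error above is about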
21 replies
RunPod
•Created by annasuhstuff on 6/26/2024 in #⚡|serverless
LoRA adapter on Runpod.io (using vLLM Worker)
oh, it seems like it works after doing these steps:
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps xformers "trl<0.9.0" peft accelerate bitsandbytes
from unsloth import FastLanguageModel
import torch
max_seq_length = 2048 # Choose any! We auto support RoPE Scaling internally!
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.
# 4bit pre quantized models we support for 4x faster downloading + no OOMs.
fourbit_models = [
"unsloth/mistral-7b-v0.3-bnb-4bit", # New Mistral v3 2x faster!
"unsloth/mistral-7b-instruct-v0.3-bnb-4bit",
"unsloth/llama-3-8b-bnb-4bit", # Llama-3 15 trillion tokens model 2x faster!
"unsloth/llama-3-8b-Instruct-bnb-4bit",
"unsloth/llama-3-70b-bnb-4bit",
"unsloth/Phi-3-mini-4k-instruct", # Phi-3 2x faster!
"unsloth/Phi-3-medium-4k-instruct",
"unsloth/mistral-7b-bnb-4bit",
"unsloth/gemma-7b-bnb-4bit", # Gemma 2.2x faster!
] # More models at https://huggingface.co/unsloth
model, tokenizer = FastLanguageModel.from_pretrained(
# model_name = "unsloth/mistral-7b-v0.3", # Choose ANY! eg teknium/OpenHermes-2.5-Mistral-7B
model_name = "unsloth/Phi-3-mini-4k-instruct",
max_seq_length = max_seq_length,
dtype = dtype,
load_in_4bit = load_in_4bit,
# token = "hf...", # use one if using gated models like meta-llama/Llama-2-7b-hf
)
Does RunPod serverless support LoRA adapters?
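Not an authoritative answer for the RunPod vLLM worker specifically, but vLLM itself can apply a LoRA adapter per request when LoRA support is enabled; a minimal sketch (model and adapter paths are placeholders):

from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# enable_lora lets the engine attach adapters at request time
llm = LLM(model="microsoft/Phi-3-mini-4k-instruct", enable_lora=True)

out = llm.generate(
    ["cancer is:"],
    SamplingParams(max_tokens=64),
    lora_request=LoRARequest("my-adapter", 1, "/path/to/adapter"),
)
print(out[0].outputs[0].text)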
21 replies
RunPod
•Created by annasuhstuff on 6/26/2024 in #⚡|serverless
LoRA adapter on Runpod.io (using vLLM Worker)
well, I also have an error
ValueError: Can't find 'adapter_config.json' at 'alsokit/eLM-mini-4B-4K-4bit-v01'
21 replies
RunPod
•Created by annasuhstuff on 6/26/2024 in #⚡|serverless
LoRA adapter on Runpod.io (using vLLM Worker)
using this:
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM
config = PeftConfig.from_pretrained("alsokit/eLM-mini-4B-4K-4bit-v01")
base_model = AutoModelForCausalLM.from_pretrained("unsloth/phi-3-mini-4k-instruct-bnb-4bit")
model = PeftModel.from_pretrained(base_model, "alsokit/eLM-mini-4B-4K-4bit-v01")
21 replies
RunPod
•Created by annasuhstuff on 6/25/2024 in #⚡|serverless
No config error
When I create an endpoint, I just pass my HF token in the environment variables
HF_TOKEN : hf_blablabla
(I copy the token from my HF account)
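For what it's worth, a sketch of how that variable is typically consumed inside a worker (hedged; recent huggingface_hub versions also pick HF_TOKEN up automatically):

import os
from transformers import AutoModelForCausalLM

# HF_TOKEN is the environment variable set on the endpoint;
# passing it explicitly avoids relying on automatic pickup.
token = os.environ.get("HF_TOKEN")
model = AutoModelForCausalLM.from_pretrained(
    "your-org/your-private-model",  # placeholder for a gated/private repo
    token=token,
)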
5 replies