annasuhstuff
RunPod
Created by annasuhstuff on 6/27/2024 in #⚡|serverless
Quantization method
now I have this error
9 replies
RunPod
Created by annasuhstuff on 6/27/2024 in #⚡|serverless
Quantization method
2024-06-27T10:50:05.563358317Z ValueError: Quantization method specified in the model config (bitsandbytes) does not match the quantization method specified in the quantization argument (gptq).
9 replies
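The error above means the endpoint's quantization setting (gptq) disagrees with the quantization method baked into the model's config.json (bitsandbytes); the two must match. A minimal sketch of the matching vLLM call, assuming a vLLM build with bitsandbytes support and using the bnb-4bit base model referenced later in this thread (on the RunPod vLLM worker this corresponds to its QUANTIZATION environment variable):

from vllm import LLM

llm = LLM(
    model="unsloth/phi-3-mini-4k-instruct-bnb-4bit",  # bnb-quantized repo from this thread
    quantization="bitsandbytes",   # must match the method named in the model's config.json
    load_format="bitsandbytes",    # bitsandbytes checkpoints also need this load format
)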
RunPod
Created by annasuhstuff on 6/27/2024 in #⚡|serverless
Quantization method
thank you so much, now I get it
9 replies
RunPod
Created by annasuhstuff on 6/26/2024 in #⚡|serverless
LoRA adapter on Runpod.io (using vLLM Worker)
yes
21 replies
RunPod
Created by annasuhstuff on 6/26/2024 in #⚡|serverless
LoRA adapter on Runpod.io (using vLLM Worker)
above I wrote the code that does not use PEFT after merging the adapter and the base model. I have a config, but the model quality gets a lot worse, so I thought I could set up the API endpoint without the config
21 replies
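A common cause of that quality drop is merging a LoRA trained on a 4-bit base back into dequantized weights. A sketch of unsloth's 16-bit merged export instead, assuming unsloth can resolve the adapter repo back to its base model as its docs describe; the output directory name is arbitrary:

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "alsokit/eLM-mini-4B-4K-4bit-v01",  # the adapter repo from this thread
    max_seq_length = 2048,
    load_in_4bit = True,
)
# Writes merged 16-bit weights plus a full config.json, which the endpoint needs
model.save_pretrained_merged("merged-16bit", tokenizer, save_method = "merged_16bit")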
RunPod
Created by annasuhstuff on 6/26/2024 in #⚡|serverless
LoRA adapter on Runpod.io (using vLLM Worker)
Are there any ways to avoid this error ("no config found")? Yes, with PEFT I need it, but with the other method I don't. So, to enable the API endpoint, as I see it, a config is also a must-have?
21 replies
RunPod
Created by annasuhstuff on 6/26/2024 in #⚡|serverless
LoRA adapter on Runpod.io (using vLLM Worker)
is there any way to set up an endpoint using my model without a config (since I only have adapter_config), or do I have to somehow change the model?
21 replies
RunPod
Created by annasuhstuff on 6/26/2024 in #⚡|serverless
LoRA adapter on Runpod.io (using vLLM Worker)
2024-06-26T12:49:19.602715611Z OSError: alsokit/eLM-mini-4B-4K-4bit-v01 does not appear to have a file named config.json. Checkout 'https://huggingface.co/alsokit/eLM-mini-4B-4K-4bit-v01/tree/main' for available files.
21 replies
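That OSError just means the repo contains only adapter files, not a standalone model. A quick way to confirm what is actually in the repo, using huggingface_hub:

from huggingface_hub import list_repo_files

# An adapter-only repo typically holds adapter_config.json and adapter weights,
# but no config.json, so it cannot be loaded as a standalone model.
print(list_repo_files("alsokit/eLM-mini-4B-4K-4bit-v01"))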
RunPod
Created by annasuhstuff on 6/26/2024 in #⚡|serverless
LoRA adapter on Runpod.io (using vLLM Worker)
the problem is, I only have adapter_config
21 replies
RunPod
Created by annasuhstuff on 6/26/2024 in #⚡|serverless
LoRA adapter on Runpod.io (using vLLM Worker)
oh, it seems like it works after doing these steps:

!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps xformers "trl<0.9.0" peft accelerate bitsandbytes

from unsloth import FastLanguageModel
import torch

max_seq_length = 2048  # Choose any! We auto support RoPE Scaling internally!
dtype = None  # None for auto detection. Float16 for Tesla T4, V100; Bfloat16 for Ampere+
load_in_4bit = True  # Use 4bit quantization to reduce memory usage. Can be False.

# 4bit pre-quantized models we support for 4x faster downloading + no OOMs.
fourbit_models = [
    "unsloth/mistral-7b-v0.3-bnb-4bit",           # New Mistral v3 2x faster!
    "unsloth/mistral-7b-instruct-v0.3-bnb-4bit",
    "unsloth/llama-3-8b-bnb-4bit",                # Llama-3 15 trillion tokens model 2x faster!
    "unsloth/llama-3-8b-Instruct-bnb-4bit",
    "unsloth/llama-3-70b-bnb-4bit",
    "unsloth/Phi-3-mini-4k-instruct",             # Phi-3 2x faster!
    "unsloth/Phi-3-medium-4k-instruct",
    "unsloth/mistral-7b-bnb-4bit",
    "unsloth/gemma-7b-bnb-4bit",                  # Gemma 2.2x faster!
]  # More models at https://huggingface.co/unsloth

model, tokenizer = FastLanguageModel.from_pretrained(
    # model_name = "unsloth/mistral-7b-v0.3",  # Choose ANY! eg teknium/OpenHermes-2.5-Mistral-7B
    model_name = "unsloth/Phi-3-mini-4k-instruct",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    # token = "hf...",  # use one if using gated models like meta-llama/Llama-2-7b-hf
)

Does RunPod serverless support LoRA adapters?
21 replies
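On the LoRA question: vLLM itself can serve adapters at runtime when LoRA support is switched on; whether the RunPod vLLM worker exposes that switch depends on the worker image, so the following is only a sketch of the underlying vLLM call, with a hypothetical local adapter path:

from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(model="unsloth/Phi-3-mini-4k-instruct", enable_lora=True)
outputs = llm.generate(
    "Hello, world",
    SamplingParams(max_tokens=32),
    # name, unique integer id, local path to the downloaded adapter (hypothetical)
    lora_request=LoRARequest("my_adapter", 1, "/runpod-volume/eLM-adapter"),
)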
RunPod
Created by annasuhstuff on 6/26/2024 in #⚡|serverless
LoRA adapter on Runpod.io (using vLLM Worker)
well, I also have an error: ValueError: Can't find 'adapter_config.json' at 'alsokit/eLM-mini-4B-4K-4bit-v01'
21 replies
RunPod
Created by annasuhstuff on 6/26/2024 in #⚡|serverless
LoRA adapter on Runpod.io (using vLLM Worker)
using this:

from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM

# Load the adapter config, the 4-bit base model, then attach the adapter
config = PeftConfig.from_pretrained("alsokit/eLM-mini-4B-4K-4bit-v01")
base_model = AutoModelForCausalLM.from_pretrained("unsloth/phi-3-mini-4k-instruct-bnb-4bit")
model = PeftModel.from_pretrained(base_model, "alsokit/eLM-mini-4B-4K-4bit-v01")
21 replies
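The standard PEFT route to a deployable repo is to merge the adapter and save the result, which writes the config.json the endpoint is complaining about. A sketch, assuming a full-precision base so merge_and_unload behaves well; the pushed repo id is hypothetical:

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("unsloth/Phi-3-mini-4k-instruct")
model = PeftModel.from_pretrained(base, "alsokit/eLM-mini-4B-4K-4bit-v01")
merged = model.merge_and_unload()        # folds the LoRA deltas into the base weights
merged.save_pretrained("merged-model")   # writes the weights *and* config.json
AutoTokenizer.from_pretrained("unsloth/Phi-3-mini-4k-instruct").save_pretrained("merged-model")
# merged.push_to_hub("your-username/your-merged-model")  # hypothetical repo id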
RunPod
Created by annasuhstuff on 6/25/2024 in #⚡|serverless
No config error
When I create an endpoint, I just pass my HF token in the environment variables, HF_TOKEN : hf_blablabla (I copy the token from my HF account)
5 replies
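A quick sanity check that the token is valid and actually picked up from the environment, using huggingface_hub in the same environment where HF_TOKEN is set:

import os
from huggingface_hub import whoami

# Prints your account details if the token stored in HF_TOKEN is valid
print(whoami(token=os.environ["HF_TOKEN"]))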