Sai Saurab Scorelabs
Modular
Created by Sai Saurab Scorelabs on 12/21/2024 in #questions
Are there benchmarks available for Llama 3.1 8B running on MAX?
14 replies
Are there any other parameters that I can configure? For example, how many CPU cores to use, how much memory to allocate, etc.?
@Brad Larson Thanks a lot. That worked.
@Brad Larson But I want to benchmark performance on unquantized weights. Is there a way to do that on CPU? I am able to run bfloat16 weights using IPEX on CPU. Is it not possible to do that using MAX?
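(For reference, a minimal sketch of the kind of bfloat16 CPU run described above, using Hugging Face Transformers with Intel Extension for PyTorch; the model ID matches the repo used with MAX, while the prompt and generation length are illustrative assumptions.)

# Sketch of bfloat16 CPU inference with IPEX (not MAX); assumes transformers
# and intel_extension_for_pytorch are installed.
import torch
import intel_extension_for_pytorch as ipex
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

# Apply IPEX kernel/graph optimizations for bfloat16 inference on CPU.
model = ipex.optimize(model, dtype=torch.bfloat16)

inputs = tokenizer("Hello, how are you?", return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))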
Can I disable quantization?
Could you please help me resolve this?
But I am getting an error:
Quantization encodings are not supported in safetensor format. Got: QuantizationEncoding.Q4_K
@Ehsan M. Kermani I am trying to run Modular MAX on CPU using:
magic run serve --huggingface-repo-id=meta-llama/Llama-3.1-8B-Instruct
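(Once the server is up, requests go to its OpenAI-compatible endpoint. A minimal client sketch follows; the localhost:8000 host/port and the /v1/chat/completions route are assumptions, so adjust them to whatever the server reports on startup.)

# Minimal client sketch against the served endpoint; host/port and route are
# assumptions, not confirmed defaults.
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "meta-llama/Llama-3.1-8B-Instruct",
        "messages": [{"role": "user", "content": "Say hello."}],
        "max_tokens": 32,
    },
)
print(resp.json()["choices"][0]["message"]["content"])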