Sai Saurab Scorelabs
Modular
Created by Sai Saurab Scorelabs on 12/21/2024 in #questions
Are there benchmarks available for Llama 3.1 8B running on MAX?
14 replies
Are there any other parameters that I can configure? For example, how many CPU cores to use, how much memory to allocate, etc.?
@Brad Larson Thanks a lot. That worked.
@Brad Larson But I want to benchmark performance on unquantized weights. Is there a way to do that on CPU? I am able to run bfloat16 weights using IPEX on CPU. Is it not possible to do that using MAX?
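(For reference, a minimal sketch of the kind of bfloat16 CPU run described above, using Hugging Face Transformers with Intel Extension for PyTorch; the model ID matches the repo used with MAX, while the prompt and generation length are illustrative assumptions.)

# Sketch of bfloat16 CPU inference with IPEX (not MAX); assumes transformers
# and intel_extension_for_pytorch are installed.
import torch
import intel_extension_for_pytorch as ipex
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

# Apply IPEX kernel/graph optimizations for bfloat16 inference on CPU.
model = ipex.optimize(model, dtype=torch.bfloat16)

inputs = tokenizer("Hello, how are you?", return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))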
Can I disable quantization?
Could you please help me resolve this?
But I am getting an error:
Quantization encodings are not supported in safetensor format. Got: QuantizationEncoding.Q4_K
@Ehsan M. Kermani I am trying to run Modular MAX on CPU using:
magic run serve --huggingface-repo-id=meta-llama/Llama-3.1-8B-Instruct
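(Once the server is up, requests go to its OpenAI-compatible endpoint. A minimal client sketch follows; the localhost:8000 host/port and the /v1/chat/completions route are assumptions, so adjust them to whatever the server reports on startup.)

# Minimal client sketch against the served endpoint; host/port and route are
# assumptions, not confirmed defaults.
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "meta-llama/Llama-3.1-8B-Instruct",
        "messages": [{"role": "user", "content": "Say hello."}],
        "max_tokens": 32,
    },
)
print(resp.json()["choices"][0]["message"]["content"])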