aikitoria
RRunPod
Created by 0x6d6178 on 10/27/2024 in #⛅|pods
When will H200's be available?
👀
so what you're saying is we need a script that tries to grab one every second
are there actually some? I've never seen it as "available"
why is it incorrectly listed under "previous gen"?
9 replies
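The "script that tries to grab one every second" idea can be sketched as a simple polling loop. This is a hypothetical sketch: `check_h200_available` is a placeholder, since the thread doesn't show RunPod's actual availability API — a real version would query RunPod's API for H200 stock.

```python
import time

def check_h200_available():
    """Placeholder availability check -- a real version would call
    RunPod's API and inspect H200 stock. Hypothetical, not a real API."""
    return False  # pretend nothing is available yet

def poll_for_gpu(check, interval_s=1.0, max_attempts=5):
    """Call `check` once per `interval_s` seconds until it returns True
    or we give up. Returns (attempts_made, found)."""
    for attempt in range(1, max_attempts + 1):
        if check():
            return attempt, True
        time.sleep(interval_s)
    return max_attempts, False
```

In practice you'd set `interval_s=1.0` and a large `max_attempts`, and launch the pod as soon as `check` succeeds.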
RRunPod
Created by Volko on 4/17/2024 in #⛅|pods
is AWQ faster than GGUF ?
you use aphrodite-engine or TensorRT-LLM (good luck!) for maximum speed on multiple GPUs
you use EXL2 for maximum speed on a single GPU
you use GGUF if you want to run on a very small GPU and have to keep some of the model on CPU only. it's for hybrid CPU/GPU inference
I've not used AWQ or GPTQ directly, those are older formats
10 replies
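The advice in this thread (TensorRT-LLM or aphrodite-engine across multiple GPUs, EXL2 when the model fits on one GPU, GGUF only when part of the model must be kept on the CPU) can be condensed into a small decision helper. The function name and the way VRAM is compared are illustrative only, not any tool's real API; real capacity planning also has to budget for KV cache and activation memory.

```python
def pick_format(model_vram_gb, gpu_vram_gb, num_gpus=1):
    """Rough decision helper mirroring the thread's advice.
    Thresholds are illustrative; real planning must also account for
    KV cache, activations, and quantization overhead."""
    total_vram = gpu_vram_gb * num_gpus
    if model_vram_gb <= gpu_vram_gb:
        return "EXL2"  # fastest on a single GPU
    if num_gpus > 1 and model_vram_gb <= total_vram:
        return "TensorRT-LLM / aphrodite-engine"  # multi-GPU speed
    return "GGUF"  # doesn't fit in VRAM: hybrid CPU/GPU offload
```

For example, a ~40 GB model on two 24 GB GPUs lands in the multi-GPU branch, while the same model on a single 24 GB GPU falls through to GGUF with CPU offload.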
RRunPod
Created by Dhruv Mullick on 4/5/2024 in #⛅|pods
TensorRT-LLM setup
world's least stable software
except if I build trtllm myself the built executable doesn't work
it's probably not that hard to add it
but my feature request died it seems https://github.com/NVIDIA/TensorRT-LLM/issues/1154
I definitely want min-p sampling for example
realized it would be more work than I have time for rn
I didn't get to the step of actually running triton
but you have to install trtllm the same way to get the tools to build the engine locally
then you should run it in the nvidia container image like I did there yeah
if you don't want to run triton that should work just fine
so I made a container off the nvidia one that runpod can launch, here https://discord.com/channels/912829806415085598/1211077936338178129/1211673633727057920
my original goal was to run tritonserver
53 replies
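The min-p sampling requested in that feature thread keeps only tokens whose probability is at least `p` times the probability of the most likely token, then renormalizes before sampling. A minimal reference sketch in plain Python (this is the general technique, not TensorRT-LLM's implementation):

```python
import random

def min_p_filter(probs, p=0.1):
    """Zero out tokens with probability < p * max(probs), renormalize.
    `probs` is a list of token probabilities summing to 1."""
    threshold = p * max(probs)
    kept = [q if q >= threshold else 0.0 for q in probs]
    total = sum(kept)
    return [q / total for q in kept]

def sample(probs, rng=random.random):
    """Draw an index from a categorical distribution."""
    r, acc = rng(), 0.0
    for i, q in enumerate(probs):
        acc += q
        if r < acc:
            return i
    return len(probs) - 1

# With p=0.2 and max prob 0.6, the cutoff is 0.12, so the
# 0.1 and 0.05 tokens are dropped before sampling.
filtered = min_p_filter([0.6, 0.25, 0.1, 0.05], p=0.2)
```

Unlike a fixed top-p cutoff, the threshold scales with the model's confidence: when the top token dominates, more of the tail is pruned.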