ngc tritonserver container image not usable?

I tried to create a pod on a server with CUDA >= 12.2 using this image: nvcr.io/nvidia/tritonserver:24.01-trtllm-python-py3. It loads up correctly, but the resulting server is not usable: I cannot connect over SSH (the window closes immediately after I type the passphrase). The same image works fine on servers from vast.ai. What's the issue?
9 Replies
justin
justin9mo ago
Modify the Docker image so that you install your own nginx / openssh stuff on it. Example of me starting up the openssh server from my own template (I tend to build off the RunPod base templates): https://github.com/justinwlin/Runpod-OpenLLM-Pod-and-Serverless/blob/main/start.sh Dependencies here: OpenSSH / nginx are probably what you need at minimum so that you can SSH and also port forward: https://github.com/runpod/containers/blob/af63c609f5ac84495fb0b8bc4779b17c1d4b21e0/README.md?plain=1#L23
aikitoria
aikitoriaOP9mo ago
I see, I will need to try that later. Why is this necessary when the image works fine as-is on vast, though?
justin
justin9mo ago
Dunno, vast.ai and RunPod could just be set up differently; I've never used vast.ai. Theoretically, if you SSH into it and you have your own public key or password-based authentication going on there, it should be fine. https://discord.com/channels/912829806415085598/1202555381126008852 Madiator has a pip package, for example, where he sets up password-based SSH, and I've set it up on my own containers / Dockerfiles. But I don't know your container well enough to comment on it, because I usually build mine.
ashleyk
ashleyk9mo ago
Also make sure your Docker image is actually keeping the container alive. It sounds like it is constantly restarting because it's not being kept alive. You can add bash -c 'sleep infinity' to your Docker command and see if that resolves it. You don't need nginx etc.
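For reference, ashleyk's suggestion can be baked into the image itself rather than the pod's Docker command (a minimal sketch; where you put this depends on your base image's entrypoint):

```
# Keep the container alive so the pod does not restart in a loop
CMD ["bash", "-c", "sleep infinity"]
```

With this in place, the container idles indefinitely and you can exec or SSH into it; without a long-lived foreground process, Docker considers the container exited and the pod restarts.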
aikitoria
aikitoriaOP9mo ago
Thanks for the advice. I cobbled these together with help from ChatGPT, and it seems to have produced a container that works on RunPod, with the original environment intact so the Triton executables can be launched from SSH: start.sh
#!/bin/bash
# Start sshd in the background (without -D it daemonizes), then keep the container alive
/usr/sbin/sshd
sleep infinity
Dockerfile
# Use triton as the base image
FROM nvcr.io/nvidia/tritonserver:24.01-trtllm-python-py3

# Avoid prompts from apt
ENV DEBIAN_FRONTEND=noninteractive

# Install OpenSSH Server
RUN apt-get update && \
apt-get install -y openssh-server && \
rm -rf /var/lib/apt/lists/*

# Configure SSH to disallow password authentication and allow key-based root login
RUN mkdir /var/run/sshd && \
echo 'PermitRootLogin prohibit-password' >> /etc/ssh/sshd_config && \
echo 'PasswordAuthentication no' >> /etc/ssh/sshd_config && \
echo 'PubkeyAuthentication yes' >> /etc/ssh/sshd_config

# Add the public key for the root user
RUN mkdir -p /root/.ssh && \
echo 'ssh-ed25519 <snip> test' > /root/.ssh/authorized_keys

# Set correct permissions for the SSH directory and keys
RUN chmod 700 /root/.ssh && chmod 600 /root/.ssh/authorized_keys

# Dump current environment into a file
RUN printenv > /etc/environment

# Ensure that SSH sessions source the global environment variables
RUN echo 'source /etc/environment' >> /root/.bashrc

# Copy the start script into the container
COPY start.sh /start.sh
RUN chmod +x /start.sh

# Expose the SSH port
EXPOSE 22

CMD ["/start.sh"]
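A quick way to sanity-check an image like this locally before deploying it (a sketch; the image tag, host port, and key path are illustrative assumptions, not from the thread):

```
# Build the image and run it, mapping container port 22 to host port 2222
docker build -t triton-ssh .
docker run -d --name triton-ssh-test -p 2222:22 triton-ssh

# Connect with the matching private key; printenv should include the Triton env vars
ssh -i ~/.ssh/id_ed25519 -p 2222 root@localhost printenv
```

If the key-based login and environment look right here, the same image should behave the same way once pushed and started as a pod.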
now to continue with what I actually wanted to test..
ashleyk
ashleyk9mo ago
Nice, you don't need to expose port 22 in your Dockerfile for RunPod though
aikitoria
aikitoriaOP9mo ago
I see, does it do that automatically? I like being able to use the direct connection for SSH so I can use tunnels; when using the RunPod proxy, it didn't seem to work properly.
justin
justin9mo ago
You can do it through the Runpod UI https://docs.runpod.io/pods/configuration/expose-ports
Expose ports | RunPod Documentation
Learn to expose your ports.
Geri
Geri8mo ago
hi - has anybody here run llama2 with tensorrt-llm and the Triton Inference Server backend?