ngc tritonserver container image not usable?

I tried to create a pod on a server with CUDA >= 12.2 using this image: nvcr.io/nvidia/tritonserver:24.01-trtllm-python-py3. It loads up correctly, but the resulting server is not usable: I cannot connect over SSH (the window closes immediately after I type the passphrase). The same image works fine on servers from vast.ai. What's the issue?
9 Replies
justin
justin9mo ago
Modify the Docker image so that you install your own nginx / openssh stuff on it. Example of me starting up the openssh server from my own template (I tend to build off the RunPod base templates): https://github.com/justinwlin/Runpod-OpenLLM-Pod-and-Serverless/blob/main/start.sh Dependencies here: OpenSSH / nginx are probably what you need at minimum so that you can SSH and also port forward: https://github.com/runpod/containers/blob/af63c609f5ac84495fb0b8bc4779b17c1d4b21e0/README.md?plain=1#L23
aikitoria
aikitoriaOP9mo ago
I see, I will need to try that later. Why is this necessary when the image works fine as-is on vast, though?
justin
justin9mo ago
Dunno, vast.ai and RunPod could just be set up differently; I've never used vast.ai. Theoretically, if you SSH into it and you have your own public key or password-based authentication going on there, it should be fine. https://discord.com/channels/912829806415085598/1202555381126008852 Madiator has a pip package, for example, where he sets up password-based SSH, and I've set it up on my own containers / Dockerfiles. But I don't know your container well enough to comment on it, because I usually build mine.
ashleyk
ashleyk9mo ago
Also make sure your Docker image is actually keeping the container alive. It sounds like it is constantly restarting because it's not being kept alive. You can add bash -c 'sleep infinity' to your Docker command and see if that resolves it. You don't need nginx etc.
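For reference, ashleyk's suggestion can be baked into the image itself rather than the pod's Docker command (a minimal sketch; where you put this depends on your base image's entrypoint):

```
# Keep the container alive so the pod does not restart in a loop
CMD ["bash", "-c", "sleep infinity"]
```

With this in place, the container idles indefinitely and you can exec or SSH into it; without a long-lived foreground process, Docker considers the container exited and the pod restarts.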
aikitoria
aikitoriaOP9mo ago
Thanks for the advice. I cobbled these together with help from ChatGPT, and it seems to have produced a container that works on RunPod, with the original environment intact so the Triton executables can be launched from SSH: start.sh
#!/bin/bash
# Start sshd in the background (without -D it daemonizes), then keep the container alive
/usr/sbin/sshd
sleep infinity
Dockerfile
# Use triton as the base image
FROM nvcr.io/nvidia/tritonserver:24.01-trtllm-python-py3

# Avoid prompts from apt
ENV DEBIAN_FRONTEND=noninteractive

# Install OpenSSH Server
RUN apt-get update && \
apt-get install -y openssh-server && \
rm -rf /var/lib/apt/lists/*

# Configure SSH to disallow password authentication and allow key-based root login
RUN mkdir /var/run/sshd && \
echo 'PermitRootLogin prohibit-password' >> /etc/ssh/sshd_config && \
echo 'PasswordAuthentication no' >> /etc/ssh/sshd_config && \
echo 'PubkeyAuthentication yes' >> /etc/ssh/sshd_config

# Add the public key for the root user
RUN mkdir -p /root/.ssh && \
echo 'ssh-ed25519 <snip> test' > /root/.ssh/authorized_keys

# Set correct permissions for the SSH directory and keys
RUN chmod 700 /root/.ssh && chmod 600 /root/.ssh/authorized_keys

# Dump current environment into a file
RUN printenv > /etc/environment

# Ensure that SSH sessions source the global environment variables
RUN echo 'source /etc/environment' >> /root/.bashrc

# Copy the start script into the container
COPY start.sh /start.sh
RUN chmod +x /start.sh

# Expose the SSH port
EXPOSE 22

CMD ["/start.sh"]
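A quick way to sanity-check an image like this locally before deploying it (a sketch; the image tag, host port, and key path are illustrative assumptions, not from the thread):

```
# Build the image and run it, mapping container port 22 to host port 2222
docker build -t triton-ssh .
docker run -d --name triton-ssh-test -p 2222:22 triton-ssh

# Connect with the matching private key; printenv should include the Triton env vars
ssh -i ~/.ssh/id_ed25519 -p 2222 root@localhost printenv
```

If the key-based login and environment look right here, the same image should behave the same way once pushed and started as a pod.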
now to continue with what I actually wanted to test..
ashleyk
ashleyk9mo ago
Nice, you don't need to expose port 22 in your Dockerfile for RunPod though
aikitoria
aikitoriaOP9mo ago
I see, does it do that automatically? I like being able to use the direct connection for SSH so I can use tunnels; when using the RunPod proxy, it didn't seem to work properly.
justin
justin9mo ago
You can do it through the Runpod UI https://docs.runpod.io/pods/configuration/expose-ports
Expose ports | RunPod Documentation
Learn to expose your ports.
Geri
Geri8mo ago
hi - has anybody here run llama2 with tensorrt-llm and the Triton Inference Server backend?