Beta test Llama3 serving and GUI on MAX Nightly

You can beta test the upcoming MAX serving and Llama3 chatbot GUI using the nightly release; I'd love feedback if you run into any problems. Start the GUI by running these commands on arm64 macOS or x86/arm64 Linux (note that the smallest Llama3 model is a 4.5GB download):
rm -rf ~/.modular
curl -s https://get.modular.com | sh -
modular auth
modular install nightly/max
MAX_NIGHTLY_PATH=$(modular config max-nightly.path)
SHELL_RC=~/.$(basename "$SHELL")rc
echo 'export MODULAR_HOME="'$HOME'/.modular"' >> $SHELL_RC
echo 'export PATH="'$MAX_NIGHTLY_PATH'/bin:$PATH"' >> $SHELL_RC
curl -fsSL https://pixi.sh/install.sh | $SHELL
source "$SHELL_RC"
git clone https://github.com/modularml/max.git ~/max
cd ~/max
git checkout nightly
cd examples/gui
pixi run gui
You can also SSH into a machine with VS Code and run the above commands in the terminal; VS Code will forward ports so you can open the GUI in your local browser. Work is underway to vastly improve the experience of getting up and running with MAX, so stay tuned.
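For anyone connecting over plain SSH rather than VS Code, the forwarding can be done by hand. This is a sketch only: the host names are placeholders, and the port is an assumption (8501 is Streamlit's default, which this kind of GUI commonly uses), so check the address `pixi run gui` prints and adjust accordingly.

```
# ~/.ssh/config entry (host and user are placeholders); forwards the
# GUI's port (8501 assumed) from the remote machine to your laptop.
Host max-gui
    HostName your.remote.host
    User your-user
    LocalForward 8501 localhost:8501
```

With that in place, `ssh max-gui` opens the tunnel and the GUI should be reachable at http://localhost:8501. The one-off equivalent is `ssh -L 8501:localhost:8501 your-user@your.remote.host`.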
Darin Simmons • 3mo ago
@Jack Clayton It appears you have permission to the max repo that I do not.
darin@home:~$ git clone git@github.com:modularml/max ~/max
Cloning into '/home/darin/max'...
The authenticity of host 'github.com (140.82.116.3)' can't be established.
ED25519 key fingerprint is SHA256:+DiY3wvvV6TuJJhbpZisF/zLDA0zPMSvHdkr4UvCOqU.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'github.com' (ED25519) to the list of known hosts.
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
Is it the git in the git@github.com that I haven't given permissions, or something else? Ubuntu 22.04, clean install. I simply changed it to git clone https://github.com/modularml/max.git and was able to move on.
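Darin's HTTPS workaround can also be applied to an existing clone without re-cloning, by repointing the remote. A minimal sketch, demonstrated on a throwaway repo so nothing touches a real checkout:

```shell
# If a clone made with the SSH URL hits "Permission denied (publickey)",
# point its remote at the HTTPS URL instead of deleting and re-cloning.
demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"
git remote add origin git@github.com:modularml/max
git remote set-url origin https://github.com/modularml/max.git
git remote get-url origin   # prints https://github.com/modularml/max.git
```

In a real checkout you would just run the `git remote set-url` line from inside ~/max.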
Jack Clayton • 3mo ago
Ok cool, thanks, I'll fix the command. Yes, you need an SSH key set up for that to work; thanks for raising it.
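For anyone who does want the SSH URL to work, a rough sketch of setting up a GitHub key. The filename and email below are illustrative, not required names:

```shell
# Generate an ed25519 key pair and print the public half to add at
# GitHub -> Settings -> SSH and GPG keys. A non-default filename is
# used here so nothing existing gets overwritten.
mkdir -p ~/.ssh
ssh-keygen -t ed25519 -C "you@example.com" -f ~/.ssh/id_ed25519_github -N ""
cat ~/.ssh/id_ed25519_github.pub
# After adding the printed key on GitHub, verify the connection with:
#   ssh -T git@github.com
```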
Darin Simmons • 3mo ago
🦙 Llama3: Select a quantization encoding to download the model from a predefined Model URL. If the model exists at Model Path, it won't be downloaded again. You can set a custom Model URL or Model Path that matches the quantization encoding. It does say a 30-minute download time btw, at 2.34MB/s. Is Telstra serving this up from mid-2005? 😉 Of course, a 5GB download would have been half a monthly quota at the time. I lived in Wollongong for a year.
Jack Clayton • 3mo ago
Yeah, it's an 8B model so 4.5GB, but with Q4_K it's very fast on CPU with high-quality outputs. Oh nice, my Dad is from the Gong, very nice area.
Darin Simmons • 3mo ago
Yes, we learned to put beets on our chicken burgers 🤯. I went to uni for a year while my wife worked, volunteered, and travelled. Anything you want me to try once it downloads?
Jack Clayton • 3mo ago
You can ask it any question and make sure it's keeping context for the conversation. The system prompt down the bottom says it's a coding assistant; you can change that without rebuilding the model.
Darin Simmons • 3mo ago
It's llama doing the llama thing. 🦙 I asked it for some Mojo code and it suggested import parallel and used let, but it's responding and working just fine. I'm asking about going to San Francisco now with a different context. I tried "When is the best day and time to travel into San Francisco?" and it started giving a very general answer. I asked another question before it was done, and it appears to be hung. Going out to dinner, will try some more when I get back.
Jack Clayton • 3mo ago
Cool, thanks for the feedback. I'll make it lock sending another message until the current stream is finished or cancelled, like ChatGPT. Much appreciated.
Darin Simmons • 3mo ago
I stopped it via the button in the upper right, chose q6_k, and it started the download process; I went to dinner and came back to...
Darin Simmons • 3mo ago
Killing the app and restarting worked fine (it defaulted to q4); I switched it to q6_k and it built fine and then (eventually) started. Old AMD Ryzen 3 1700, 64GB RAM.
Jack Clayton • 3mo ago
Thanks for that, cheers.
Darin Simmons • 3mo ago
rm -rf'd the whole thing, got back to q4, flipped it to q6, and it loaded fine and then ran fine.
Martin Dudek • 3mo ago
Installation went without issues on my M2 MacBook Pro, thanks for sharing this app. It works well except for the issue @Darin Simmons already mentioned. A cancel button would be great.
Jack Clayton • 3mo ago
Awesome, thanks for the report.