Llama2 Chatbot
I'll create a thread here with the source so it doesn't clutter the chat
I had to delete a few things about the model setup because the post was too long
I can share those as well.
do you have a github repo for the model setup?
Not yet. The model setup isn't super complicated, but you do need to request a download key from Facebook.
Sure, just trying to reproduce it locally. So it is outputting the LLM result correctly?
you just want the text to populate as it is generated?
Correct. @MaartenBreddels's fix worked.
I'm happy to share my process. It's just involved to setup right now because this is very WIP. Basically you need to sign up to get access to llama 2, download a couple hundred gigs of models, convert them to another format, install a few libraries from git... It's just a mess right now.
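For anyone curious, here's a rough sketch of one possible route (not necessarily the exact setup I used): converting the downloaded Llama 2 weights to a llama.cpp-compatible file and streaming tokens with llama-cpp-python. The model path and prompt are placeholders.

```python
# Sketch only: assumes the weights have already been converted to a
# llama.cpp-compatible file (e.g. via llama.cpp's conversion scripts)
# and that llama-cpp-python is installed.
from llama_cpp import Llama

# Placeholder path to the converted model file.
llm = Llama(model_path="./models/llama-2-13b-chat.gguf")

prompt = "Explain what a chatbot is in one sentence."
for chunk in llm(prompt, max_tokens=128, stream=True):
    # Each streamed chunk carries the newly generated text fragment.
    print(chunk["choices"][0]["text"], end="", flush=True)
print()
```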
I think we could probably create an example using some kind of streamlined functionality. For example, we could replicate this by streaming converted tokens with a delay. We might be able to modify the new AI example to achieve this. It would be a proof of concept for replicating the OpenAI UI that does the same thing, without having to run an actual model.
why is there no delay right now?
Ah, when I said delay, I meant add a small random delay to simulate the token generation speed of a large language model. It would be purely for visualization reasons.
Ah, why don't you get a delay from the model itself then? I'd expect the models to be slow, but they're not?
I think we might be talking past each other a bit haha.
What I'm trying to say is that the models are fairly difficult to set up and run. So setting one up for an easy-to-run example might not be so easy. We could show the ability to have "real time" streaming responses from an AI model by simulating the processing delays with a random sleep.
It would just be to show the ability to create an updating text output.
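Something like this is what I mean, purely for visualization; no real model involved, and the canned reply is just a placeholder:

```python
# Minimal sketch of the "fake streaming" idea: yield tokens from a canned
# reply with a small random sleep, so a UI can show text appearing in
# real time without running an actual model.
import random
import time
from typing import Iterator


def fake_token_stream(reply: str) -> Iterator[str]:
    for token in reply.split(" "):
        # Simulate the per-token generation speed of a large language model.
        time.sleep(random.uniform(0.02, 0.15))
        yield token + " "


if __name__ == "__main__":
    for token in fake_token_stream("This is a simulated streaming response."):
        print(token, end="", flush=True)
    print()
```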
Just to close the loop on my previous issue, here's a video of the final (working) solution
Ah, now I understand! Yes, we could show the UI that way until it's configured correctly
Same with using OpenAI: if you don't give a token, have some default reply. I like that idea.
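Rough idea for that fallback (just a sketch; the model name and canned reply are placeholders, and it assumes the openai Python package is installed):

```python
# Use the real OpenAI streaming API when a key is configured,
# otherwise stream a canned default reply with a fake delay.
import os
import random
import time


def stream_reply(prompt: str):
    if os.environ.get("OPENAI_API_KEY"):
        from openai import OpenAI  # requires the openai package

        client = OpenAI()
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",  # placeholder model name
            messages=[{"role": "user", "content": prompt}],
            stream=True,
        )
        for chunk in response:
            delta = chunk.choices[0].delta.content
            if delta:
                yield delta
    else:
        # No token given: fall back to a simulated stream.
        for word in "(demo mode) No API key set, so here is a canned reply.".split(" "):
            time.sleep(random.uniform(0.02, 0.1))
            yield word + " "
```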
are you planning to write an article on that?
I'm not, but I'd be happy to provide an example and make a Tweet. I really don't like writing articles. I probably should do it more often.