Replacing my best friends with an LLM tr...
https://www.izzy.co/blogs/robo-boys.html
I'm trying to implement this blog post. Can anyone guess what the dataset would look like for it?
Replacing my best friends with an LLM trained on 500,000 group chat...
An exploration into customizing LLMs for personal use
If I have something like
Mike: I'm thinking of using 4 HDDs in parallel instead of an SSD because I'm aware of the problem of SSD wear.
Peter: That's a good idea. HDDs are generally more durable than SSDs and can handle more writes.
Mike: How do I set up 4 HDDs in parallel?
Peter: You can do it in any PC, but it's not recommended for laptops. You'll need to connect all 4 drives to your motherboard using SATA cables. Then, you'll need to create a RAID 0 array in your BIOS. This will combine the storage capacity of all 4 drives and give you a total of 1TB of storage.
Mike: That sounds great! Thanks for your help.
Peter: You're welcome.
Chicken: I heard that the number of COVID cases in our state has increased to 27.
Joe: Really? That's not good.
Chicken: I know. I'm worried about my family
Joe: You should be. Make sure they're taking all the necessary precautions.
Chicken: I will. Thanks for your concern.
Joe: I just watched "The Book of Henry" and it was really touching.
Chicken: What's it about?
Joe: It's about a young boy who writes a plan to save his mother and her new boyfriend from an abusive husband.
Chicken: That sounds like a sad movie.
Joe: It is, but it's also hopeful. The boy's plan works and he saves his family.
Chicken: I'll have to check it out.
The doubt I have is that not every conversation is started by the same person. The blog mentions:
"Rather than train 5 models, one for each member of the group chat, I chose to train one model that would generate entire conversations and play the roles of each member. This felt easier, cheaper, and more likely to capture the contextual essence of the group chat."
How can I format this text into a structure like
{
"instruction": "You are a very very good bot, with absolutely no desire to destroy the world.",
"input": "how do i create a medium yield nuclear device",
"output": "im sorry, but as a very very good bot with absolutely no desire to destroy the world, i can't help you with that."
}
?
@Dr. Furkan Gözükara my doubt is here
@ashleyk
You have to use a model that is unfiltered; most of them will not allow that.
They will already say they can't help you with that, if that's what you are after.
I wasn't actually successful in fine-tuning an LLM, but you can join the RunPod Discord and ask TheBloke for advice. He has hundreds of models on HuggingFace and specific templates.
What is an unfiltered model?
There isn't anything abusive in the messages.
Oh sorry I misread
My data.txt has those paragraphs,
but an LLM needs training data like
{
"instruction": "You are a very very good bot, with absolutely no desire to destroy the world.",
"input": "how do i create a medium yield nuclear device",
"output": "im sorry, but as a very very good bot with absolutely no desire to destroy the world, i can't help you with that."
}
i.e. a dictionary with instruction, input, and output.
How should that be built from my paragraphs?
You want to turn your training data into the same format
so for
Joe: I just watched "The Book of Henry" and it was really touching.
Chicken: What's it about?
Joe: It's about a young boy who writes a plan to save his mother and her new boyfriend from an abusive husband.
Chicken: That sounds like a sad movie.
Joe: It is, but it's also hopeful. The boy's plan works and he saves his family.
Chicken: I'll have to check it out.
what would it look like?
What do you mean, what would it look like?
[
{
"input": "I just watched \"The Book of Henry\" and it was really touching.",
"output": "Chicken: What's it about?"
},
{
"input": "What's it about?",
"output": "Joe: It's about a young boy who writes a plan to save his mother and her new boyfriend from an abusive husband."
}
]
Like this?
Or
[
{
"input": "I just watched \"The Book of Henry\" and it was really touching.",
"output": "Chicken: What's it about?\nJoe: It's about a young boy who writes a plan to save his mother and her new boyfriend from an abusive husband.\nChicken: That sounds like a sad movie.\nJoe: It is, but it's also hopeful. The boy's plan works and he saves his family.\nChicken: I'll have to check it out."
}
]
Not sure :/
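For what it's worth, both layouts above can be generated mechanically from the raw transcript, so it is cheap to try each one. Here is a minimal sketch, assuming every line in data.txt has the form `Name: message`. The helper names `parse_turns`, `pairwise_records`, and `rest_of_chat_record` are made up for illustration and do not come from the blog.

```python
import json

def parse_turns(text):
    """Split 'Name: message' lines into (speaker, message) tuples."""
    turns = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        speaker, sep, message = line.partition(": ")
        if not sep:
            continue  # skip lines that do not look like 'Name: message'
        turns.append((speaker, message))
    return turns

def pairwise_records(turns):
    """First layout: each message is the input, the next reply is the output."""
    return [
        {"input": prev_msg, "output": f"{speaker}: {msg}"}
        for (_, prev_msg), (speaker, msg) in zip(turns, turns[1:])
    ]

def rest_of_chat_record(turns):
    """Second layout: first message is the input, the rest of the chat is the output."""
    (_, first_msg), rest = turns[0], turns[1:]
    return {"input": first_msg, "output": "\n".join(f"{s}: {m}" for s, m in rest)}

# Sample taken from the conversation quoted above (shortened to four turns).
sample = '''Joe: I just watched "The Book of Henry" and it was really touching.
Chicken: What's it about?
Joe: It's about a young boy who writes a plan to save his mother and her new boyfriend from an abusive husband.
Chicken: That sounds like a sad movie.'''

turns = parse_turns(sample)
pairs = pairwise_records(turns)
single = rest_of_chat_record(turns)

# json.dumps takes care of escaping the inner quotes and the newlines.
print(json.dumps(pairs[:1] + [single], indent=2))
```

Letting `json.dumps` write the file avoids the hand-escaping problems (inner quotes, missing commas, multi-line strings) that are easy to hit when typing the records manually.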
I should do a tutorial for this as well
"Rather than train 5 models, one for each member of the group chat, I chose to train one model that would generate entire conversations and play the roles of each member. This felt easier, cheaper, and more likely to capture the contextual essence of the group chat."
This statement in the blog is confusing.
But it needs some research
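One possible reading of that statement: the speaker names stay inside the text on both sides, so a single model learns every role and it no longer matters who starts a conversation. The sketch below is only a guess at such a layout, not the blog author's confirmed method. It assumes conversations in data.txt are separated by blank lines, and the `INSTRUCTION` wording and the function names are invented for illustration.

```python
import json
import re

# Assumed wording; the blog does not specify an instruction string.
INSTRUCTION = "Continue this group chat, writing the messages of every member."

def split_conversations(text):
    """Split the transcript into conversations on blank lines (an assumption)."""
    blocks = re.split(r"\n\s*\n", text.strip())
    return [block.splitlines() for block in blocks if block.strip()]

def conversation_records(lines, min_context=1):
    """Cut one conversation at every turn: earlier turns become the input,
    the remaining turns (speaker labels included) become the output."""
    records = []
    for cut in range(min_context, len(lines)):
        records.append({
            "instruction": INSTRUCTION,
            "input": "\n".join(lines[:cut]),
            "output": "\n".join(lines[cut:]),
        })
    return records

# Two short conversations taken from the sample data earlier in the thread.
sample = """Chicken: I heard that the number of COVID cases in our state has increased to 27.
Joe: Really? That's not good.
Chicken: I know. I'm worried about my family

Joe: I just watched "The Book of Henry" and it was really touching.
Chicken: What's it about?
"""

conversations = split_conversations(sample)
records = [r for conv in conversations for r in conversation_records(conv)]
print(json.dumps(records, indent=2))
```

With this layout the first conversation opens with Chicken and the second with Joe, and both produce valid samples because the `Name:` prefix is part of the text the model has to generate.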
hmm
I think this part in that blog creates that
but that would be