RunPod · 7mo ago
Omer

DATA LOSS IN EU-RO-1 - URGENT

Need someone to communicate ASAP, this is literally sev0. We have all of our data and work backed up on that network drive, and out of the blue the files just disappeared.
64 Replies
Madiator2011 · 7mo ago
Could you submit a ticket on the website with all the info?
Omer (OP) · 7mo ago
Ticket number 4667, we already opened one. @Papa Madiator I’m waiting?? Who's joining the ticket, y'all? We have models, datasets, and code on that drive. We need someone ASAP, and I cannot stress this enough.
nerdylive · 7mo ago
Hmm, same for me haha. My ticket ID is 4609, still no response yet. Trying to be patient.
Omer (OP) · 7mo ago
@nerdylive when did this happen to you? I’m not sure if I’m calmer or more anxious knowing that it's not just us lol. Just panicking at the moment tbh.
nerdylive · 7mo ago
Hmm like a few days ago
Omer (OP) · 7mo ago
And no response yet?? This is sick
nerdylive · 7mo ago
It happened 3 days ago. Still being investigated, they said. I deleted it and made a new one, and it was all gone, like a fresh new one.
Omer (OP) · 7mo ago
Ok now I’m officially freaking out
nerdylive · 7mo ago
Calm down, maybe they can recover it, because they replicate network drives (if I'm not wrong).
Omer (OP) · 7mo ago
I sure hope so.. there’s no way for us to recover from this if the driver is gone
nerdylive · 7mo ago
Wait what driver
Omer (OP) · 7mo ago
Network driver. Drive, sorry.
nerdylive · 7mo ago
Oh yeah. Let's just wait for now.
digigoblin · 7mo ago
Is this happening in a specific region or all regions?
nerdylive · 7mo ago
I'm not sure. What's your region, @Omer?
Omer (OP) · 7mo ago
EU-RO-1
nerdylive · 7mo ago
Oh same
digigoblin · 7mo ago
So it seems there's some issue in the RO region then 🙈 Let me check mine and see if my files are disappearing too.
nerdylive · 7mo ago
Yeah sure let us know too
digigoblin · 7mo ago
My A1111 volume is still fine, it didn't need to sync anything from Hugging Face
digigoblin · 7mo ago
I'll test ComfyUI as well.
xcxooxl · 7mo ago
Were you doing any heavy I/O when it happened, like downloading or uploading large files or creating lots of files?
nerdylive · 7mo ago
Hmm, I don't know exactly when it happened, but when I checked it was gone.
digigoblin · 7mo ago
I can confirm that I've lost files from my ComfyUI network volume in RO region as well 😱
nerdylive · 7mo ago
Wait really? 🤔
digigoblin · 7mo ago
Yep, lost about 320MB of data in the RO region for my ComfyUI storage. I resynced it from my NO storage, but I've removed RO from the list of my RabbitMQ consumers for now anyway until the issue is resolved. Now I need to back up my data to external cloud storage because RunPod is unreliable 😱
nerdylive · 7mo ago
Make a ticket too. They replicate network volumes, right? How can we be experiencing this?
digigoblin · 7mo ago
Supposedly. There is probably something wrong with the replication 🤷‍♂️ My ticket number is 4669.
Madiator2011 · 7mo ago
Forwarded. @Omer @digigoblin @nerdylive could you also provide info on which datacenter you had issues in, what templates you used, and whether there are any auto-syncing functions in the templates? That would help a lot.
digigoblin · 7mo ago
All RO
nerdylive · 7mo ago
yep
digigoblin · 7mo ago
No autosyncing. I used the PyTorch template to check, and the data was gone.
nerdylive · 7mo ago
Yes, but my template works well with the autosync. I used it just today and yesterday with a new network volume and it works just fine.. it didn't delete other data.
Madiator2011 · 7mo ago
We're kinda trying to get some intel on what might be going on.
digigoblin · 7mo ago
Yeah, I haven't had issues either, only checked when @Omer and @nerdylive said they lost data and I saw I lost data too.
nerdylive · 7mo ago
Somebody uses my template and said it was fine too btw
Madiator2011 · 7mo ago
btw was the NS attached to some pod or just lying around untouched when the data loss happened?
nerdylive · 7mo ago
nothing happened on the host side?
digigoblin · 7mo ago
Mine is attached to serverless not pods.
nerdylive · 7mo ago
Not sure when it happened actually, but it's attached to serverless, both CPU and GPU.
digigoblin · 7mo ago
I attached it to a pod just to check if data was lost, and discovered it was.
nerdylive · 7mo ago
2 also using pods to download before
digigoblin · 7mo ago
Other regions seem to be ok, I use NO, SE, CA and RO, and only RO seems to be affected.
Omer (OP) · 7mo ago
Ours was attached to a pod. @Papa Madiator the most important thing: is there any backup of the storage?
Madiator2011 · 7mo ago
For secure cloud it's possible (I do not have high-level access though). We kinda need to figure out why the data is gone in the first place.
nerdylive · 7mo ago
even when i deleted the empty ns?
Omer (OP) · 7mo ago
There are no network drives on community pods. And the loss happened in a network drive.
nerdylive · 7mo ago
yeap all secure cloud
digigoblin · 7mo ago
What do you mean by deleting the empty NS? Why would you want to delete it if it's already empty?
Omer (OP) · 7mo ago
We had a terabyte of storage in the drive. Time is of the essence, is there anyone available for a bridge?
digigoblin · 7mo ago
How much did you lose? everything?
Omer (OP) · 7mo ago
Every single bit
nerdylive · 7mo ago
well it still charges my account right?
digigoblin · 7mo ago
And it's still gone when you mount it to a new pod, @Omer?
Omer (OP) · 7mo ago
Yes
nerdylive · 7mo ago
I don't see a reason to use it anymore when it doesn't have the data and it seems to be broken lol
digigoblin · 7mo ago
Oh, I get what you mean now, sorry was confused
nerdylive · 7mo ago
yeah np
xcxooxl · 7mo ago
any updates?
nerdylive · 7mo ago
Not yet
digigoblin · 7mo ago
@xcxooxl did you log a ticket for it if you're also experiencing data loss?
haris · 7mo ago
@xcxooxl @digigoblin @Omer what templates were used when this happened?
nerdylive · 7mo ago
digigoblin used pytorch template to check, i think
digigoblin · 7mo ago
Yep