internal node discovery
Hello,
I am trying to deploy an Elixir-based app and am banging my head against the wall over clustering. I use DNS clustering with the internal network name as the DNS query, and that seems to find nodes. But the nodes can't connect to each other, and it's very hard to understand what's going on.
Is there some way to do internal service discovery between replicas that I can use?
16 Replies
Project ID:
cff55b2e-ba77-4df5-9e61-cb0fe7bbead4
can you explain what "dns clustering with the internal network" means to you?
Sure thing. Erlang has a feature called clustering: if you run multiple nodes of your application and they can discover each other, they connect and are able to do things like transfer jobs, share pubsub, etc.
But to get that set up you need some way for nodes to discover each other. Ways to do that are:
- A static list (obviously not Railway compatible)
- An orchestrator like k8s that can provide a list of nodes, example: https://www.gigalixir.com/docs/cluster
- Some way to broadcast/gossip discovery. In my case I tried DNS resolution, since we have the app.internal names we can use to communicate in the app's private network, an example: https://fly.io/docs/elixir/the-basics/clustering/
why wouldn't a static list of internal domains work?
A static list would be IPs
why wouldn't domains work?
I am not sure? It is very difficult to ascertain from logs alone what is going wrong in each test
But if there is a way to get the internal IPv6 address, that might be another thing I can try?
Like for example the fly.io example has
I did find mentions of a replica_id env variable, but using that as the name for each node doesn't seem to work
Yes, of course: you simply do an AAAA DNS lookup for a given private domain. But respectfully, it sounds like you may not fully understand how the Erlang side of things works, which means implementing it on Railway is going to be very challenging, especially since I don't know how the Erlang side of things works either
I mean I could try to get an interactive console on my nodes and try to figure out how the railway internal network is set up but I thought I'd ask if anyone has already dealt with this?
And things like undocumented env variables that you can only find through Discord. I was hoping to maybe figure out a way to get each replica's IP through an env variable, as that would help tremendously too
the only way would be to do a dns lookup
but if it helps, the private network is set up like a LAN: no firewalls, proxies or anything, just helpful DNS names
Okay, that is good to know! That sounds like if I figure out a way to get each node's own IPv6 address, I can make it work
Keep in mind that the IP address of a given service will change every time that service is redeployed
getting the ip address is the easy part lol
Redeploy is not a problem, the clustering handles all that automatically. I basically just need every node to know its own IP first and then it's all good. Might just have to dump the env to see if I can find what I need there
again, do a DNS lookup -
https://utilities.up.railway.app/dns-lookup?value=utilities.railway.internal&type=aaaa
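The same AAAA lookup can also be done from inside the app with Erlang's built-in resolver client. This is only a minimal sketch: the module name `PrivateDns` is hypothetical, and the hostname you pass in is whatever private domain your service has.

```elixir
defmodule PrivateDns do
  @moduledoc "Sketch: resolve a private hostname to IPv6 address strings."

  # Looks up AAAA records for the given hostname via :inet_res.
  # Returns a list of strings; an empty list means nothing resolved.
  def ipv6_addresses(hostname) when is_binary(hostname) do
    hostname
    |> String.to_charlist()
    |> :inet_res.lookup(:in, :aaaa)
    |> Enum.map(&format/1)
  end

  # Converts an address tuple such as {0, 0, 0, 0, 0, 0, 0, 1} to "::1".
  def format(address) when is_tuple(address) do
    address |> :inet.ntoa() |> to_string()
  end
end
```

Calling `PrivateDns.ipv6_addresses("utilities.railway.internal")` from a running replica should return one entry per address the private DNS hands back, which is a quick way to check what the nodes can actually see.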
I'm not sure what you expect to find in the environment variables -
https://utilities.up.railway.app/env-vars
Solution
I'd recommend using libcluster with the DNSPoll strategy: https://hexdocs.pm/libcluster/Cluster.Strategy.DNSPoll.html
This should find all replicas behind the internal DNS name (x.railway.internal) and perform a Node.connect/1 for all entries, establishing a cluster.
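For anyone landing here later, a DNSPoll topology might look roughly like the sketch below. The topology key `railway_dns`, the basename `my_app`, and the 5 second interval are placeholders I picked for illustration, and `x.railway.internal` stands in for your own service's private domain.

```elixir
# config/runtime.exs
import Config

config :libcluster,
  topologies: [
    # Placeholder topology name; any atom works.
    railway_dns: [
      strategy: Cluster.Strategy.DNSPoll,
      config: [
        # How often to re-query DNS, in milliseconds (assumed value).
        polling_interval: 5_000,
        # Placeholder: the private DNS name shared by all replicas.
        query: "x.railway.internal",
        # Placeholder: must match the part before "@" in every node's name.
        node_basename: "my_app"
      ]
    ]
  ]
```

DNSPoll tries to connect to `node_basename@<ip>` for every address the query returns, so each replica has to be started with a long name of exactly that shape (for example `my_app@<its own IP>`) and the same cookie. The topologies also have to be handed to `Cluster.Supervisor` in the application's supervision tree. Since the private network here is IPv6, the VM most likely also needs to run distribution over IPv6 (the Fly.io guide linked earlier does this with `-proto_dist inet6_tcp`); I'm assuming the same applies on Railway, so verify that before relying on it.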