Is there a way to get a prompt on a deployed service?

I'm trying to debug my observability setup (whether prometheus is accessible from my deployed app). Is this possible?

66 Replies

Percy•3mo ago

Project ID: c96ea2ca-1b03-415f-8931-5af0e51c87c4

what's weird is that i see an arrow from my app to postgres but not to my other deployments... is it possible that they are not on the same network?

addamssonOP•3mo ago

I tried this in local and it worked

Brody•3mo ago

Is there a way to get a prompt on a deployed service

there is not, railway does not provide a way to ssh into a service.

whether prometheus is accessible from my deployed app

it is as long as you are listening on ipv6 and using the correct port, as railway's private network is ipv6 only.

what's weird is that i see an arrow from my app to postgres but not to my other deployments

the arrows and their directions are dynamically detected when you use reference variables, you are likely hard coding variables when you shouldn't be - https://docs.railway.app/guides/variables#reference-variables

is it possible that they are not on the same network?

it's not possible, every service within a given environment in a project is in the same private network.

I tried this in local and it worked

local is likely ipv4, your services need to listen on an ipv6 host, or ideally they should all dual stack bind.
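A minimal sketch of what listening on an IPv6 host with a dual-stack bind can look like, assuming a Node.js service like the one in this thread. The helper names `dualStackOptions` and `startServer` are made up for illustration. `::` is the IPv6 wildcard address, and on most operating systems binding to it with `ipv6Only: false` also accepts IPv4 connections.

```typescript
import * as http from "node:http";

// Listen options for a dual-stack bind (hypothetical helper).
// "::" is the IPv6 wildcard; ipv6Only: false lets the same socket
// accept IPv4 connections where the operating system supports it.
function dualStackOptions(port: number): { host: string; port: number; ipv6Only: boolean } {
  return { host: "::", port, ipv6Only: false };
}

// Starts a tiny HTTP server on the wildcard address (hypothetical helper).
function startServer(port: number): http.Server {
  const server = http.createServer((_req, res) => {
    res.writeHead(200, { "content-type": "text/plain" });
    res.end("ok\n");
  });
  server.listen(dualStackOptions(port), () => {
    console.log(`listening on [::]:${port}`);
  });
  return server;
}
```

Calling `startServer(3000)` (any port of your choosing) at startup binds the wildcard address. Binding to `0.0.0.0` or `127.0.0.1` instead is IPv4-only, which is the setup that tends to work locally but is unreachable over an IPv6-only private network.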

addamssonOP•3mo ago

this is the first time i've heard about this (and i've been using aws a lot)... what's an ipv6 host and how do i listen on one? what's a dual stack bind? 😅 this worked literally everywhere before (local, aws, heroku) and it only fails on railway. are there docs on this? also, why does it work this way? it forces people to change their code to accommodate the quirks of railway. there must be a good reason. i don't even understand why ipv6 is a concern when i'm using domain names, eg: "http://prometheus:1234". i checked and there are no docs on this. it mentions that private networking uses wireguard and some sort of mesh (whatever those might be), but there are no details on what this means when i try to send metrics/traces.

Brody•3mo ago

IPv6 is not a quirk at all, and not something specific to Railway; you just happen to have always worked on IPv4 networks before. There are docs for this: https://docs.railway.app/reference/private-networking and https://docs.railway.app/guides/private-networking - and IPv6 still matters when you use domain names, because the domain names will resolve to an IPv6 address.
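To see what a private hostname actually resolves to, something like the following could be logged from inside the app. This is only a sketch assuming Node.js; `addressFamily` and `describeHost` are hypothetical helper names.

```typescript
import { promises as dns } from "node:dns";
import * as net from "node:net";

// Labels an address string by IP family (hypothetical helper).
function addressFamily(address: string): "IPv4" | "IPv6" | "not an IP" {
  switch (net.isIP(address)) {
    case 4:
      return "IPv4";
    case 6:
      return "IPv6";
    default:
      return "not an IP";
  }
}

// Looks up every address the resolver returns for a hostname and
// formats one line per address (hypothetical helper).
async function describeHost(hostname: string): Promise<string[]> {
  const results = await dns.lookup(hostname, { all: true });
  return results.map(
    (r) => `${hostname} -> ${r.address} (${addressFamily(r.address)})`,
  );
}
```

For example, `describeHost("prometheus.railway.internal").then(console.log)` prints each resolved address with its family. If only IPv6 addresses come back, the target service has to be listening on IPv6 for a connection to succeed.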

addamssonOP•3mo ago

so why can't i send metrics from prometheus and traces from tempo? I get that, but the fact is that my code worked everywhere else; i only have to change it because of railway.

Brody•3mo ago

yep, as mentioned everywhere else just happened to be IPv4

addamssonOP•3mo ago

you don't get my point, and this is a tendency here: you implement something, cut corners by not implementing ipv4, and then you imply that all of this is totally OK and your users are to blame for not thinking about the corners you cut 😅 and this is not the first case 😒 i'm not saying that you're not right, but from a user perspective it is a slap in the face.

Brody•3mo ago

not implementing IPv4 is not corner cutting, IPv6 was introduced over two decades ago

addamssonOP•3mo ago

in your opinion. see, that again: my program works everywhere but here, but all blame is assigned to me

Brody•3mo ago

I don't like to compare but Fly.io's private network is also IPv6 only

addamssonOP•3mo ago

haven't used it

Brody•3mo ago

Just an example to show that we are not the only ones who choose to use a newer standard

addamssonOP•3mo ago

i get that, but it is beside my point. i'm not arguing that this is a better way or more forward-compatible. so if i understand this correctly, the problem is that tempo / prometheus is not configured to listen on ipv6? i'm using a metric exporter. interestingly enough, grafana can connect to both.

Brody•3mo ago

If your app is attempting to connect to tempo / prometheus, they need to listen on IPv6, that is correct

addamssonOP•3mo ago

i'm using a fork of a railway template, that's why i'm assuming that it would work

Brody•3mo ago

you'd also need to be using the correct port when connecting to them, the same way you use ports even when developing locally

addamssonOP•3mo ago

i'm doing just that, in theory:

const NodeSdkLive = NodeSdk.layer(() => ({
    resource: { serviceName: "Larisel" },
    spanProcessor: new BatchSpanProcessor(
        new OTLPTraceExporter({
            url: "http://tempo:3100",
        }),
    ),
    instrumentations: [getNodeAutoInstrumentations()],
    // metricReader: new PrometheusExporter({ port: 9090 }),
    metricReader: new PeriodicExportingMetricReader({
        exportIntervalMillis: 500,
        exporter: new OTLPMetricExporter({
            url: "http://prometheus:9090",
        }),
    }),
}));

Brody•3mo ago

The vast majority of templates are user provided, it's possible they didn't fully test the template?

addamssonOP•3mo ago

i don't know, maybe they have some other use case

Brody•3mo ago

can you send me over some actual error messages?

addamssonOP•3mo ago

there are no errors, interestingly. i'm using the official otel library; i just don't see the metrics (nor the traces)

Brody•3mo ago

is there perhaps a verbose debug mode you can turn on?

addamssonOP•3mo ago

i'll take a look, but i think the main problem is that prom/tempo is not configured for ipv6. i'm trying to use the setup that I was using before (metrics through prom, traces through tempo, aggregated in grafana). once i figure this out i can share the code too if somebody is interested

Brody•3mo ago

I'm sure future readers would love that

addamssonOP•3mo ago

👍 maybe i can post an article about it. anyway, is it easier / more secure / other if you use ipv6? what's the rationale behind it?

Brody•3mo ago

I'm sorry I wouldn't know the exact reasons why it was chosen, I was not around for that discussion

addamssonOP•3mo ago

ah, ok. in the docs it says that i have to use x.railway.internal in order for this to work, but in the railway app it says "you can also call me at x". so would it make a difference if i used x.railway.internal:1234 instead of x:1234, or is it the same?

Brody•3mo ago

it shouldn't, I've never seen it make a difference

addamssonOP•3mo ago

have you ever used prometheus or tempo on railway?

Brody•3mo ago

I have not

addamssonOP•3mo ago

the search goes on then 😄

Brody•3mo ago

have you got your services to listen on ipv6?

addamssonOP•3mo ago

i had to enable the otlp write receiver, now i can see metrics in prom
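For reference, a sketch of how the metrics exporter URL might be built once Prometheus's OTLP receiver is on. Prometheus serves OTLP over HTTP under `/api/v1/otlp/v1/metrics`, and the receiver is off by default; the flag that enables it depends on the Prometheus version, so check the docs for your release. The helper name `prometheusOtlpUrl` is hypothetical, the host and port are the ones from the snippet earlier in the thread, and this assumes the exporter uses its `url` option as given, so the path has to be part of it.

```typescript
// Builds the OTLP/HTTP metrics URL for a Prometheus instance that is
// reachable by hostname (hypothetical helper; hostnames only, no
// bracketed IPv6 literals).
function prometheusOtlpUrl(host: string, port: number = 9090): string {
  const url = new URL(`http://${host}:${port}`);
  // Path under which Prometheus exposes its OTLP metrics receiver.
  url.pathname = "/api/v1/otlp/v1/metrics";
  return url.toString();
}

// Example: the value that could be passed as the exporter's `url` option.
console.log(prometheusOtlpUrl("prometheus"));
```

The resulting string would replace the bare `"http://prometheus:9090"` in the `OTLPMetricExporter` options, since a URL without the receiver path points at the Prometheus root rather than the OTLP endpoint.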

Brody•3mo ago

awesome!

addamssonOP•3mo ago

looks like the ipv6 settings crash tempo. i copied the settings from this page: https://grafana.com/docs/tempo/latest/configuration/network/ipv6/ but when i push this to railway i get a wall of errors

addamssonOP•3mo ago

this on repeat

Brody•3mo ago

link me to the specific service please

addamssonOP•3mo ago

what is the name of the network interface? i think eth0 doesn't exist

Brody•3mo ago

railnet0, but it shouldn't matter since you shouldn't be hardcoding any interfaces

addamssonOP•3mo ago

b502be2b-037a-47c7-8744-2fe5e2c93a2d. tempo does that, i think

Brody•3mo ago

bad practice 😬 what env?

addamssonOP•3mo ago

according to this: https://community.grafana.com/t/grafana-loki-cluster-failed-to-start-if-no-network-interface-name-eth0/59157/3

addamssonOP•3mo ago

staging

level=info ts=2024-09-09T19:40:18.412940643Z caller=main.go:121 msg="Starting Tempo" version="(version=r165-7421936, branch=r165, revision=7421936ba)"

level=info ts=2024-09-09T19:40:18.414789251Z caller=cache.go:55 msg="caches available to storage backend" parquet-footer=false bloom=false parquet-offset-idx=false parquet-column-idx=false trace-id-index=false parquet-page=false

level=info ts=2024-09-09T19:40:18.41779872Z caller=server.go:249 msg="server listening on addresses" http=[::]:3100 grpc=[::]:9095

level=info ts=2024-09-09T19:40:18.418135313Z caller=cache.go:55 msg="caches available to storage backend" parquet-footer=false bloom=false parquet-offset-idx=false parquet-column-idx=false trace-id-index=false parquet-page=false

level=warn ts=2024-09-09T19:40:18.42118818Z caller=netutil.go:90 msg="error getting addresses for interface" inf=eth0 err="route ip+net: no such network interface"

level=info ts=2024-09-09T19:40:18.421222206Z caller=memberlist_client.go:439 msg="Using memberlist cluster label and node name" cluster_label= node=e876d5c06828-2d8cd974

level=warn ts=2024-09-09T19:40:18.421335895Z caller=netutil.go:90 msg="error getting addresses for interface" inf=en0 err="route ip+net: no such network interface"

level=error ts=2024-09-09T19:40:18.421379281Z caller=main.go:124 msg="error running Tempo" err="failed to init module services: error initialising module: compactor: failed to create compactor: no useable address found for interfaces [eth0 en0]"

Brody•3mo ago

best to not hardcode an interface, but if you have to, it's railnet0 - https://utilities.up.railway.app/stats

now that i'm looking at this, i think i should note that the ipv4 address is used for public traffic.
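Rather than hardcoding a name, the interfaces can be listed at runtime. A rough sketch assuming Node.js; `ipv6InterfaceNames` and the `AddrInfo` type are hypothetical and only cover the fields this check needs.

```typescript
import * as os from "node:os";

// Minimal shape of an interface address entry (hypothetical type; a
// subset of what os.networkInterfaces() returns).
interface AddrInfo {
  family: string;
  internal: boolean;
}

// Returns the names of interfaces that carry at least one
// non-loopback IPv6 address (hypothetical helper).
function ipv6InterfaceNames(
  interfaces: Record<string, AddrInfo[] | undefined>,
): string[] {
  return Object.entries(interfaces)
    .filter(([, addrs]) =>
      (addrs ?? []).some((a) => a.family === "IPv6" && !a.internal),
    )
    .map(([name]) => name);
}

// Log what the current machine or container actually has.
console.log(ipv6InterfaceNames(os.networkInterfaces()));
```

Logging this once at startup shows which interface names exist in the container, which is useful when a tool such as Tempo expects `eth0` or `en0` and neither is present.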

addamssonOP•3mo ago

i don't hardcode stuff, but this comes from the tracing tool i've been using (Grafana Tempo). you mean the public api? i only added a public api so that i can check it with postman; that's how i figured out what the problem was with prom

Brody•3mo ago

i didn't mention a public api?

addamssonOP•3mo ago

i'm not sure what you meant 😅 ok, it seems the interface name is hardcoded in many places: https://github.com/grafana/tempo/issues/3590 lemme try this

Brody•3mo ago

look at the link i sent for context on what i said

addamssonOP•3mo ago

railnet0, right?

Brody•3mo ago

yes, please look at the link

addamssonOP•3mo ago

the stats page? let's see if this solves the issue. ok, tempo booted up. still no traces though

Brody•3mo ago

that's not ideal

addamssonOP•3mo ago

something is amiss, but i see no requests in the log, so it might be in my service. when i tamper with tempo from grafana i can see it in the logs. good news is that app -> prom -> grafana is now working

Brody•3mo ago

awesome!

addamssonOP•3mo ago

is there a way to expose a port on a deployment? i think tempo is listening on a hardcoded port

Brody•3mo ago

publicly?

addamssonOP•3mo ago

no, only on the private network

Brody•3mo ago

you don't need to do anything for that, there is no firewall or anything
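Without a shell on the service, one way to check whether a host and port are reachable over the private network is to probe from the app itself and log the result. A rough sketch assuming Node.js; `parseTarget` and `canConnect` are hypothetical helpers, and the host and port in the usage note are the ones from the snippet earlier in the thread.

```typescript
import * as net from "node:net";

interface Target {
  host: string;
  port: number;
}

// Splits "host:port" into its parts (hypothetical helper; hostnames
// only, no bracketed IPv6 literals).
function parseTarget(spec: string): Target {
  const idx = spec.lastIndexOf(":");
  if (idx <= 0) {
    throw new Error(`expected host:port, got "${spec}"`);
  }
  const port = Number(spec.slice(idx + 1));
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`invalid port in "${spec}"`);
  }
  return { host: spec.slice(0, idx), port };
}

// Resolves to true if a TCP connection opens before the timeout,
// false on error or timeout (hypothetical helper).
function canConnect(target: Target, timeoutMs: number = 3000): Promise<boolean> {
  return new Promise((resolve) => {
    const socket = net.connect({ host: target.host, port: target.port });
    const finish = (ok: boolean): void => {
      socket.destroy();
      resolve(ok);
    };
    socket.setTimeout(timeoutMs, () => finish(false));
    socket.once("connect", () => finish(true));
    socket.once("error", () => finish(false));
  });
}
```

For example, `canConnect(parseTarget("tempo.railway.internal:3100")).then((ok) => console.log("tempo reachable:", ok))` at startup puts the answer in the deploy logs. A `false` here only says the TCP connection failed; it does not distinguish a wrong port from a service that is not listening on IPv6.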

addamssonOP•3mo ago

oh, i'll try something then

addamssonOP•3mo ago

FUCK YES

addamssonOP•3mo ago

thanks for the help, everything seems to be working now!

Brody•3mo ago

what was the final fix?
