@Chaika Hey! I dont know if you remember

@Chaika Hey! I dont know if you remember, but you helped me with serving wildcard SSL certs on custom domains for customers a few weeks ago. I've recently been having an issue where if you constantly reload the custom domain site without cache (ctrl + shift + r), then there's ~50% chance for an "Invalid SSL Cert" error. Do you have any idea why this could be happening? I can DM you the website that this happens with if needed.
27 Replies
Chaika
Chaika•4mo ago
@x03 I don't have that bad of a memory lol, you don't have more than one server/Traefik container, right? This is happening on the same CF for SaaS setup, on a customer's custom domain?
x03
x03OP•4mo ago
Apologies for the late reply, something came up. I only have one server and use a single container with the catch-all traefik config. Yes, this is happening on the exact same one that we configured. The only variable that has changed since then is that the domain got heavy loads of traffic. On launch there were 20,000 unique visitors to the domain, nowadays it's ~300. I don't know if the traffic is what caused this, but nothing else really changed.
x03
x03OP•4mo ago
oh actually it's about 1k daily, not 300
x03
x03OP•4mo ago
just for that one domain
Chaika
Chaika•4mo ago
and is the issue on any custom hostname using CF for SaaS/the fallback cert, or any domain using Traefik at all, or just your own hostnames you have other configs for?
x03
x03OP•4mo ago
So this is only for custom SaaS domains. My *.tsar.app domains use the same traefik config too and don't have this issue.
Chaika
Chaika•4mo ago
hmm yea that is interesting, do you see anything in logs about the certificate? If you bypass the proxy and hit Traefik directly, do you see the cert being served reliably?
https://discord.com/channels/595317990191398933/1268644381418848326/1268647595421601874
weird that it would suddenly change to not reliably serving it
x03
x03OP•3mo ago
I'll test this out in a bit

@Chaika sorry for the long wait, I had a busy week and I'm only now taking a look at this. Since my last time checking (same day as my initial message), I have not tested this issue at all. Looking at it today, I cannot reproduce the issue that I was having. Normally when I would reload without cache, there would be a ~50% chance of it throwing an SSL error. Now it's 0%, every reload succeeds.

I don't remember changing any settings on Cloudflare OR the server. The only thing I've done was deploy a few updates, which should not have affected anything SSL related. The usage is about the same, with 1k unique visitors and 30k requests, with 60% cached. I also went through all my Coolify configs, and everything for my traefik proxy settings is default besides the container-specific config that we set up for the .app domain.

The wildest thing is that even though the custom domains started working properly again, for whatever reason my https://tsar.dev domain now throws SSL errors 100% of the time 😭 😭 I've literally changed NOTHING and this domain does not even use any fancy reverse proxy configs, it's literally the same as all my other domains. The only thing special about this domain is that it uses CF Zero Trust. This .dev domain literally worked a few days ago and now it's not working for whatever reason. My domain settings for the .dev seem to be fine, with the mode being set to strict.

I'm very confused as to why I get these random SSL errors out of nowhere, not too sure if it's a problem on my end or perhaps Coolify. I'll definitely keep a lookout for anything related to this from now on.
x03
x03OP•3mo ago
Ah, seems like my .dev certificate expired. Do you know of any way to avoid this, or at least automate renewal?
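A minimal sketch of an expiry check, in case it helps catch this earlier next time: it uses only Python's standard `ssl` and `socket` modules to read a served certificate's issuer and the number of days left before `notAfter`. The function names and the `example.com` host are illustrative assumptions, not anything from Coolify or Traefik. Two caveats: an already-expired certificate fails verification, so `fetch_cert` raises `ssl.SSLCertVerificationError` instead of returning, and for a hostname proxied through Cloudflare this sees the edge certificate rather than the one Traefik serves.

```python
import socket
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate's notAfter timestamp (negative once expired)."""
    expires = ssl.cert_time_to_seconds(not_after)  # e.g. "Jan  5 09:34:43 2018 GMT"
    if now is None:
        now = time.time()
    return (expires - now) / 86400


def issuer_org(cert: dict) -> Optional[str]:
    """Extract organizationName from the nested issuer tuples that getpeercert() returns."""
    for rdn in cert.get("issuer", ()):
        for key, value in rdn:
            if key == "organizationName":
                return value
    return None


def fetch_cert(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Open a verified TLS connection and return the peer certificate as a dict."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

Something like `cert = fetch_cert("example.com")` followed by `days_until_expiry(cert["notAfter"])` could run from a daily cron job and alert when the result drops below a chosen threshold.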
x03
x03OP•3mo ago
Also here's the output for the (now working) custom domain:
[image attachment]
x03
x03OP•3mo ago
Looking a bit more into this issue I'm like 80% sure it's Coolify. I've opened a post in their Discord to hopefully get some help with this.
Chaika
Chaika•3mo ago
Coolify/Traefik should automate renewal for you unless it's failing, would have to check logs and see why it did fail
I did try warning / CF origin certs are still an option if you can figure out how to get them to work with traefik lol
https://discord.com/channels/595317990191398933/1268644381418848326/1268655682966782125
x03
x03OP•3mo ago
i might have to look into that
x03
x03OP•3mo ago
@Chaika so I did a bit of digging, and I've set up Cloudflare as my traefik cert provider (not sure if this relates to the origin cert stuff or not). This still didn't work, so I kept digging through docs and found this, so I guess I'm gonna try and set this up
[image attachment]
x03
x03OP•3mo ago
Nevermind, there's no "setup" for this, I think it should work out of the box after I set my provider. Sadly it's still not working though. Here's what I added to my global proxy config:
...
services:
  traefik:
    ...
    environment:
      - CLOUDFLARE_DNS_API_TOKEN=***
    command:
      ...
      - '--certificatesresolvers.letsencrypt.acme.dnschallenge.provider=cloudflare'
Any way to check if the provider change worked? Running the curl command still shows 'Let's Encrypt', I assume because the old cert still hasn't expired.

I'm looking into CF's origin cert stuff, and is there any way to allow all the SaaS domains to use the cert, or do I need to add them all manually?

Man, working with this stuff is not fun...

Okay so the issue with the .dev domain turned out to be the fact that it was behind Zero Trust, so ACME failed to refresh the certificate 😅 This took me way too long to realize.

@Chaika okay so the first domain I had issues with was because of Zero Trust blocking ACME requests, the second domain was because I had "under attack" mode on (I use it as a bootleg anti-scrape on newer projects) and that mode was also blocking the ACME requests. Turns out this issue was pretty simple, but I had no idea how any of this certificate stuff worked and didn't even know what ACME was until I read deeper into it.
Chaika
Chaika•3mo ago
DNS Challenges like through the CF API would get around both of those
would be same as your certs rn, you'd just need one covering your own zone like example.com,*.example.com
x03
x03OP•3mo ago
Oh I see, thanks for the clarification
I'll look into setting up the Cloudflare origin certs later, I'm just glad everything works at the moment
Wdym by this?
Chaika
Chaika•3mo ago
if you still wanted to use the CF Resolver, might need to set that for the default as well, something like
- "traefik.tls.stores.default.defaultgeneratedcert.resolver=cloudflare"
I assume you could test it by changing the sans and adding something else/forcing it to refresh, but that might cause downtime

ACME has two main verification modes: HTTP and DNS. HTTP to a special path (.well-known/acme-challenge/ttt) can be blocked by things, as you've noticed, and is generally a bit more unstable. DNS adds TXT records via the Cloudflare API (or whatever DNS you use) and doesn't care about firewall/HTTP reachability
https://letsencrypt.org/docs/challenge-types/
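To make the DNS mode a bit more concrete, here is a rough sketch of what an ACME client works out for a DNS-01 challenge, as I understand RFC 8555 section 8.4: the TXT record lives at `_acme-challenge.<domain>`, and its value is the base64url-encoded SHA-256 digest of `token.thumbprint`. The function name and the sample `token` and `account_thumbprint` inputs are illustrative assumptions. In this setup Traefik's ACME client does all of this itself and creates and deletes the record through the Cloudflare API, so nothing here needs to be run by hand.

```python
import base64
import hashlib
from typing import Tuple


def b64url(data: bytes) -> str:
    """Base64url without padding, the encoding ACME uses."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def dns01_record(domain: str, token: str, account_thumbprint: str) -> Tuple[str, str]:
    """Return (record name, TXT value) for a DNS-01 challenge.

    `token` comes from the ACME server; `account_thumbprint` is the
    base64url JWK thumbprint of the account key. Both are hypothetical
    inputs here.
    """
    # A wildcard such as *.example.com is validated against the base domain.
    base = domain[2:] if domain.startswith("*.") else domain
    name = "_acme-challenge." + base
    key_authorization = token + "." + account_thumbprint
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    return name, b64url(digest)
```

Because validation only reads this TXT record from public DNS, Zero Trust or "under attack" mode in front of the site does not get in the way.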
x03
x03OP•3mo ago
Ohhhh interesting
So I'd need to add TXT records to all my domains?
What about SaaS domains?
Chaika
Chaika•3mo ago
well not manually lol, you'd have the integration do that for you using the cf api
x03
x03OP•3mo ago
oh
Chaika
Chaika•3mo ago
and you wouldn't need to verify your saas domains in this context, just need certs for your own
x03
x03OP•3mo ago
Alright I see, I'll read some docs to find out how to swap to DNS instead of HTTP
x03
x03OP•3mo ago
Traefik Let's Encrypt Documentation - Traefik
Learn how to configure Traefik Proxy to use an ACME provider like Let's Encrypt for automatic certificate generation. Read the technical documentation.
x03
x03OP•3mo ago
looks simple enough, I'll try and set it up
@Chaika any way to verify that swapping to DNS was a success? I ran a dig TXT _acme-challenge.yourdomain.com command and it all checks out
Thanks for all your help, everything is perfect now
Chaika
Chaika•3mo ago
logs?
> I ran a dig TXT _acme-challenge.yourdomain.com command and it all checks out
You'd have to be really quick to see it lol, it adds, verifies, and deletes
x03
x03OP•3mo ago
There were no logs, which I guess is a good sign 😭