[Workers] How to receive emails and get attachment

Hi, how could I create a simple worker which receives emails and uploads a PDF attachment to a server?

31 Replies

And an example email worker would be https://github.com/edevil/email_worker_parser and then you just take the attachments from it and upload to your server

AndromedaOP•2y ago

Forgive me for the incredibly basic question, but I can't resolve require("postal-mime") - is there some way to do that within the basic browser editing environment or would I have to have a local project of some kind which I can then publish? Alternatively, just looking at the postal-mime example on the NPM site, the following example was used:

const email = await parser.parse(`Subject: My awesome email 🤓
Content-Type: text/html; charset=utf-8

<p>Hello world 😵‍💫</p>`);

Which, along with the repository example above, suggests to me that the attachment data is somehow encoded into event.raw. Would it be a sensible/plausible idea to HTTP POST the entire content of event.raw to my webserver and do the parsing there, in an environment a bit less restrictive (and which I'm much more comfortable with) than Workers?

slight bump!

DaniFoldi•2y ago

The quick editor and the playground don't currently support importing npm modules, so you'd have to work on it locally and use wrangler deploy to publish to the edge.
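For reference, once the project is set up locally, picking the PDF out of the parsed message could look roughly like the sketch below. It assumes the parsed object has the shape postal-mime documents (an `attachments` array whose entries carry `filename`, `mimeType` and `content`). `pickPdfAttachments` is a hypothetical helper name, and the postal-mime call is only shown in a comment so the snippet stays self-contained.

```javascript
// Hypothetical helper: keep only the PDF attachments of a parsed email.
// `email` is assumed to be the result of postal-mime's parse step, e.g.
//   import PostalMime from "postal-mime";
//   const email = await PostalMime.parse(message.raw);
function pickPdfAttachments(email) {
  const attachments = email.attachments ?? [];
  return attachments.filter((att) => {
    const mime = (att.mimeType ?? "").toLowerCase();
    const name = (att.filename ?? "").toLowerCase();
    // Accept either a PDF MIME type or a .pdf file name, since some
    // senders label PDFs as application/octet-stream.
    return mime === "application/pdf" || name.endsWith(".pdf");
  });
}
```

Each returned entry's `content` could then be used as the body of an upload request to your server.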

AndromedaOP•2y ago

I see, thanks. Now, I'm incredibly lazy, so just a quick question: would there be any harm in just making event.raw the body of an HTTP request and posting that? It may very slightly cut down on CPU time too.

Chaika•2y ago

No, I'd say that's a good idea. The raw property of the incoming email message is a ReadableStream, which is fine to set as the body of a request, and it lets you stream the email through the Worker without buffering it, e.g.:

export default {
  async email(message, env, ctx) {
    await fetch("https://yourwebserver.example.com", {
      body: message.raw,
      method: "POST",
      headers: {
        "email-to": message.to,
        "email-from": message.from,
        "apikeyorsomeidentifieryouuse": env.secret,
      }
    });
  }
}
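A slightly more defensive variant of the same idea, as a sketch: it checks the server's response and rejects the email if the upload failed, so a failed delivery isn't silently dropped. `forwardRawEmail`, the `fetchImpl` parameter and the `x-api-key` header name are hypothetical; `setReject` is assumed to be available on the message as in the Email Workers runtime API.

```javascript
// Sketch: stream the raw email to a server and reject the message if the
// upload fails. `fetchImpl` is a hypothetical parameter so the function can
// be exercised outside Workers; inside a Worker the global fetch is used.
async function forwardRawEmail(message, env, fetchImpl = fetch) {
  const response = await fetchImpl("https://yourwebserver.example.com", {
    method: "POST",
    body: message.raw, // ReadableStream, passed through without buffering
    headers: {
      "email-to": message.to,
      "email-from": message.from,
      "x-api-key": env.secret, // hypothetical header name
    },
  });
  if (!response.ok) {
    // Signal to the sending server that the message was not accepted.
    message.setReject(`Upload failed with status ${response.status}`);
  }
  return response.ok;
}
```

Inside the Worker's `email(message, env, ctx)` handler this would be called as `await forwardRawEmail(message, env);`.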

AndromedaOP•2y ago

That's perfect, I'll give that a shot - thank you very much!

AndromedaOP•2y ago

Hmm, the Cloudflare docs and the allowlist template use this as an example

AndromedaOP•2y ago

However, it doesn't seem to actually work

AndromedaOP•2y ago

Since this is the format that message.from appears to take

AndromedaOP•2y ago

Is it safe to just parse for what's between the angle brackets? Is that guaranteed to be the actual sender's email address and it can't be spoofed?

Chaika•2y ago

That looks like the header from. message.from is the envelope from, and message.headers.get("from") is the header from: https://www.xeams.com/difference-envelope-header.htm

Email is insecure by design. It's not really a question of "can it be spoofed" but rather "if the sender properly sets everything up on their end, how easy is it for this stuff to be spoofed?" Afaik DKIM alignment checks should make both have to align. Usually clients like Gmail and such will parse the header from and use that for search and stuff. Parsing it isn't so easy though; you'll probably want to find a library for RFC5322.From parsing.
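To see the difference on a real message, a small sketch like this could log both values side by side. `describeSenders` is a hypothetical helper name; it only relies on `message.from` and `message.headers.get(...)` as used elsewhere in this thread.

```javascript
// Sketch: collect both sender values from an incoming email message.
function describeSenders(message) {
  return {
    // Envelope from: the address given in the SMTP transaction.
    envelopeFrom: message.from,
    // Header from: the From: line that mail clients display.
    headerFrom: message.headers.get("from"),
  };
}
```

Calling `console.log(describeSenders(message))` in the email handler should show that the two can differ, for example when mail is sent through a third-party mailing service.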

AndromedaOP•2y ago

oof, this is a little confusing - sorry, I'm being a little slow here 😅 You were correct in identifying that I used message.header.from instead of message.from - that's my bad, I think I might've nicked that from one of the templates or something. Can I trust that message.from accurately describes the sender then?

Chaika•2y ago

Not really. It's not what you'd see in Gmail or any email client. Those all show the header from, and they also use the header from for searches and stuff.

AndromedaOP•2y ago

hmm, why do the examples use it for whitelisting/blacklisting senders then?

Chaika•2y ago

If they didn't, they'd have to parse it with some parser. They used to use the header from before, and the examples just didn't work for people because they did an exact match.

AndromedaOP•2y ago

So what would your advice be if I want a worker which only accepts emails from a specific inbox I control? Should I look for the parsing library you mentioned above? Or maybe I should have the worker forward all email, and instead have the safe validation be handled on my server side

Chaika•2y ago

You control the sending inbox? It matters a lot less in that case. You could just hardcode it to either; it shouldn't change, and if you have DMARC set up right it would ensure alignment of both.

AndromedaOP•2y ago

Oh, right. To summarise:
- I want to only accept emails from a specific inbox
- I control the sending inbox
- I want this to be secure enough that no other inbox would be able to pass the filter

Would message.from be the correct attribute for me to access in this case?

Chaika•2y ago

My understanding is that the header from is the "more secure" option as long as you are using DKIM signing, as the entire contents of the message, including the header from, is signed with a key on that domain

AndromedaOP•2y ago

right... I'm still a bit lost then. Is it Good Enough™️ to just check for the part between the angle brackets or no?

Chaika•2y ago

If you have control over the sending inbox, why not match the whole thing? Or do you think you'd change your name? It's kind of silly but it removes any issues with parsing it

AndromedaOP•2y ago

I'd rather not hardcode the name part of it, because that means the system would break if I changed my display name in my email client.

Yep, the latter. Or rather, not me changing it - it'd be my organisation changing it.

My only concern with that is: does the email protocol allow someone to just insert angle brackets into their display name, therefore tricking the worker into thinking it's a legitimate address? I'm aware that it's very unlikely someone will figure that out, but I don't like the idea of security through obscurity.

Chaika•2y ago

The format for it is RFC5322.From. I don't know much about it other than that it's complex to write a parser for it.

AndromedaOP•2y ago

Goodness, yeah. I'm hoping that I can take a shortcut, given that I know for certain emails from my inbox will take the format of "Display name" <[email protected]>. Would something like this be secure?

message.headers.get('from').endsWith('<[email protected]>')

Chaika•2y ago

I have no clue. You could try reading over the spec yourself; it's in RFC 5322. I haven't really tried, other than understanding it's too much to do by hand.

Chaika•2y ago

Apparently I was off about alignment though: DMARC ensures at least one aligns with the header from. Ex, this is valid: X-Mail-From being the envelope from, which is authorized by records on spfmailtechno.com to send. From: is auth'd via DKIM, and aligns in this case. DMARC only requires one to pass and align for it to work. This describes it better than I can: https://support.google.com/a/answer/10032169?hl=en

Mail servers all have their own additional checks they put into their spam score though. Ex, Fastmail considers this to be misaligned and that somehow factors into their code, which is what I got confused by. Anyway, email is a mess. Afaik the from header is still better as long as you're using DKIM/DMARC, and it aligns with what you'd expect.

AndromedaOP•2y ago

Goodness, this is confusing. Honestly I'll just attempt this and hope and pray for the best.

Chaika•2y ago

lol sorry, it is a bit confusing but I probably made it worse than it is. I would just see it as: the header from is the right thing to use. In order to spoof the header from, you'd either need the domain to have an SPF policy that includes the sender's IP, or a DKIM record that publishes the public key matching the key that signed the message. Both things involve control over the domain. The envelope from is more free: only SPF cares about it, and you could fail it and still have the mail delivered as long as the DKIM record matched some other domain. The envelope from is used for auth, not identity.

AndromedaOP•2y ago

I see, thank you! Then if I'm understanding this correctly, my above method should actually be secure, right?

Chaika•2y ago

In theory, in terms of using the right field? Yea. In how you're checking? No clue lol. All the docs on the from header and parsing are here: https://datatracker.ietf.org/doc/html/rfc5322#section-3.6.3

It seems that maybe it would be fine, as long as Cloudflare is validating that the header is well-formed and rejecting anything wrong. <> are only allowed in quotes.
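A slightly more tolerant take on the endsWith check, as a sketch only. It assumes the From header is well-formed, as discussed above, and it is not a full RFC 5322 parser (address lists and comments are not handled). `ALLOWED_SENDER` and `isAllowedSender` are hypothetical names, and the address is a placeholder for the inbox you control.

```javascript
// Hypothetical allowlisted address; replace with the inbox you control.
const ALLOWED_SENDER = "sender@example.com";

// Sketch: compare the address in the trailing <...> of a From header
// (or the bare header value if there are no angle brackets) against
// the allowlisted address, ignoring case.
function isAllowedSender(fromHeader, allowed = ALLOWED_SENDER) {
  if (typeof fromHeader !== "string") return false;
  const value = fromHeader.trim();
  // Only the angle-bracketed part at the very end of the header counts,
  // so brackets inside a quoted display name are not picked up.
  const match = value.match(/<([^<>\s]+)>$/);
  const address = match ? match[1] : value;
  return address.toLowerCase() === allowed.toLowerCase();
}
```

In the email handler this could be used as `if (!isAllowedSender(message.headers.get("from"))) { message.setReject("Sender not allowed"); return; }`.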

AndromedaOP•2y ago

perfect, thank you! that's Good Enough™️ for me then, I'm considering this solved!
