[Workers] How to receive emails and get attachment
Hi, how could I create a simple worker which receives emails and uploads a PDF attachment to a server?
31 Replies
And an example email worker would be https://github.com/edevil/email_worker_parser and then you just take the attachments from it and upload to your server
Forgive me for the incredibly basic question, but I can't resolve require("postal-mime") - is there some way to do that within the basic browser editing environment or would I have to have a local project of some kind which I can then publish?
Alternatively, just looking at the postal-mime example on the NPM site, the following example was used:
Which along with the repository example above suggests to me that the attachment data is somehow encoded into event.raw. Would it be a sensible/plausible idea to HTTP POST the entire content of event.raw to my webserver and do the parsing there, in an environment a bit less restrictive (and which I'm much more comfortable with) than Workers?
slight bump!
The quick editor and the playground don't currently support importing npm modules, so you'd have to work on it locally and use wrangler deploy to publish to the edge.
I see, thanks
Now, I'm incredibly lazy, so just a quick question, would there be any harm in just making event.raw the body of an HTTP request and posting that?
It may very slightly cut down on CPU time too
no I'd say that's a good idea. the raw property of the EmailEvent is a ReadableStream which is fine to set as the body of a request, and it allows you to stream it through the Worker without buffering it, ex
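A minimal sketch of that idea, for anyone landing here later. It assumes the message object exposes from, to, raw and setReject as described in this thread; UPLOAD_URL, the X-Envelope-* header names and the forwardRawEmail helper are made up for illustration, and the interface is declared locally so the snippet stands on its own. The fetch function is passed in as a parameter only so the helper can be exercised outside a Worker.

```typescript
// Only the fields used below; declared locally so the sketch is self-contained.
interface IncomingEmail {
  from: string;                    // envelope sender
  to: string;                      // envelope recipient
  raw: ReadableStream<Uint8Array>; // the whole message, unparsed
  setReject(reason: string): void;
}

// Narrow description of the part of fetch() that this sketch relies on.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: ReadableStream<Uint8Array> },
) => Promise<{ ok: boolean; status: number }>;

// Hypothetical endpoint on your own server.
const UPLOAD_URL = "https://example.com/inbound-email";

// POSTs the raw message as the request body. The stream is handed over as-is,
// so the Worker never buffers or parses the email.
async function forwardRawEmail(
  message: IncomingEmail,
  url: string,
  fetchImpl: FetchLike,
): Promise<boolean> {
  const response = await fetchImpl(url, {
    method: "POST",
    headers: {
      "Content-Type": "message/rfc822",
      // Hypothetical header names for passing the envelope along.
      "X-Envelope-From": message.from,
      "X-Envelope-To": message.to,
    },
    body: message.raw,
  });
  return response.ok;
}

// Email handler; make this object the default export of the Worker entry file.
const worker = {
  async email(message: IncomingEmail): Promise<void> {
    const delivered = await forwardRawEmail(message, UPLOAD_URL, fetch);
    if (!delivered) message.setReject("Upload to backend failed");
  },
};
```

The server then receives the full MIME message and can extract the PDF with whatever mail parser it already has.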
That's perfect, I'll give that a shot - thank you very much!
However, it doesn't seem to actually work
Since this is the format that message.from appears to take
Is it safe to just parse for what's between the angle brackets? Is that guaranteed to be the actual sender's email address and it can't be spoofed?
That looks like the header from
message.from is the envelope from and message.headers.get("from") is the header from, https://www.xeams.com/difference-envelope-header.htm
Email is insecure by design, it's not really a question of "can it be spoofed" and instead "if the sender properly sets everything up on their end, how easy is it for this stuff to be spoofed?" afaik DKIM Alignment checks should make both have to align. Usually clients like gmail and such will parse header from and use that for search and stuff. Parsing it isn't so easy though, will probably want to find a library for RFC5322.FROM parsing
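To make the two fields concrete, here is a small sketch that reads both from the message. It assumes message.from holds the envelope sender and message.headers is a standard Headers object, as described above; the SenderFields interface and the describeSenders helper are names invented for this example.

```typescript
// Only the two fields being compared; declared locally for illustration.
interface SenderFields {
  from: string;     // envelope from: the address given in the SMTP transaction
  headers: Headers; // message headers, including the From: line mail clients display
}

// Returns both notions of "sender" side by side so they can be logged or compared.
function describeSenders(message: SenderFields): {
  envelopeFrom: string;
  headerFrom: string | null;
} {
  return {
    envelopeFrom: message.from,
    // Header lookup is case-insensitive; null if the message has no From header.
    headerFrom: message.headers.get("from"),
  };
}
```

The envelope value is a bare address, while the header value usually still carries the display name and angle brackets, which is why an exact string match against it tends to fail.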
oof, this is a little confusing - sorry, I'm being a little slow here 😅
You were correct in identifying that I used message.header.from instead of message.from - that's my bad, I think I might've nicked that from one of the templates or something
Can I trust that message.from accurately describes the sender then?
not really
It's not what you'd see in gmail or any email client
Those all show header from, and they also use header from for searches and stuff
hmm, why do the examples use it for whitelisting/blacklisting senders then?
If they didn't, they'd have to parse it and use some parser
They used to use header before and they just didn't work for people because they were an exact match
So what would your advice be if I want a worker which only accepts emails from a specific inbox I control? Should I look for the parsing library you mentioned above?
Or maybe I should have the worker forward all email, and instead have the safe validation be handled on my server side
You control the sending inbox? It matters a lot less in that case, you could just hardcode it to either, it shouldn't change, and if you have DMARC set up right it would ensure alignment of both
Oh, right
To summarise:
- I want to only accept emails from a specific inbox
- I control the sending inbox
- I want this to be secure enough that no other inbox would be able to pass the filter
Would message.from be the correct attribute for me to access in this case?
My understanding is that the header from is the "more secure" option as long as you are using DKIM signing, as the message body and the signed headers, which always include the header from, are signed with a key on that domain
right... I'm still a bit lost then
Is it Good Enough™️ to just check for the part between the angle brackets or no?
If you have control over the sending inbox, why not match the whole thing? Or do you think you'd change your name? It's kind of silly but it removes any issues with parsing it
I'd rather not hardcode the name part of it because then that means that the system would break if I changed my display name in my email client
yep, the latter
Or rather, not me changing it - it'd be my organisation changing it
My only concern with that is, does the email protocol allow for someone to just insert angle brackets into their display name, therefore tricking the worker into thinking it's a legitimate address?
I'm aware that it's very unlikely someone will figure that out, but I don't like the idea of security through obscurity
The format for it is RFC5322.FROM
I don't know much about it other than it's complex to write a parser for it
goodness, yeah
I'm hoping that I can take a shortcut given the fact that I know for certain emails from my inbox will take the format of "Display name" <[email protected]>
Would something like this be secure?
I have no clue, you could try reading over the spec yourself, it's in rfc5322. I haven't really tried, other than understanding it's too much to do by hand
apparently I was off about alignment though, dmarc ensures at least one aligns with the header from.
ex, this is valid:
X-Mail-From being the envelope from, which is authorized by records on spfmailtechno.com to send. From: is auth'd via DKIM, and aligns in this case. DMARC only requires one to pass & align for it to work. This describes it better than I can: https://support.google.com/a/answer/10032169?hl=en
Mail servers all have their own additional checks they put into their spam score though, ex fastmail considers this to be misaligned and that somehow factors into their score, which is what confused me.
Anyway email is a mess, afaik the from header is still better as long as you're using dkim/dmarc, and it aligns with what you'd expect
Goodness, this is confusing
Honestly I'll just attempt this and hope and pray for the best
lol sorry it is a bit confusing but I probably made it worse than it is.
I would just see it as:
Header From is the right thing to use.
In order to spoof header from, you'd either need the domain to have an SPF policy that includes your sending IP, or a DKIM record on the domain that publishes the public key matching the key that signed the message. Both things involve control over the domain.
Envelope from is more free, only SPF cares about it and you could fail it and still have the mail delivered as long as the dkim record matched some other domain. Envelope from is used for auth, not identity.
I see, thank you! Then if I'm understanding this correctly, my above method should actually be secure, right?
In theory of using the right field? Yea
In how you're checking? No clue lol. All the docs on the from header and parsing are here: https://datatracker.ietf.org/doc/html/rfc5322#section-3.6.3
It seems that maybe it would be fine, as long as Cloudflare is validating the header is right and rejecting anything wrong. <> are only allowed in quotes
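For anyone who wants the angle-bracket check spelled out, here is a rough sketch under the assumptions made in this thread: the header from has already passed DKIM/DMARC upstream, and legitimate mail always looks like "Display name" <address> or a bare address. ALLOWED_SENDER, extractAddress and isAllowedSender are hypothetical names, and this is deliberately not a full RFC 5322 parser: anything it does not recognise (comments, groups, several mailboxes) is rejected rather than guessed at.

```typescript
// Hypothetical address of the one inbox that is allowed to send.
const ALLOWED_SENDER = "reports@example.com";

// A bare address such as reports@example.com.
const BARE_ADDRESS = /^[^\s<>"@,]+@[^\s<>"@,]+$/;

// Optional display name followed by exactly one <address> at the very end.
// A quoted display name may contain angle brackets; an unquoted one may not,
// so a display name cannot smuggle in a second <address>.
const NAME_AND_ADDRESS = /^(?:"(?:[^"\\]|\\.)*"|[^<>",]*)\s*<([^\s<>"@,]+@[^\s<>"@,]+)>$/;

// Returns the lower-cased address, or null if the header is not in one of the
// two expected shapes.
function extractAddress(headerFrom: string): string | null {
  const value = headerFrom.trim();
  if (BARE_ADDRESS.test(value)) return value.toLowerCase();
  const match = NAME_AND_ADDRESS.exec(value);
  return match?.[1]?.toLowerCase() ?? null;
}

// True only when the header from resolves to the single allowed address.
function isAllowedSender(headerFrom: string | null): boolean {
  if (headerFrom === null) return false;
  return extractAddress(headerFrom) === ALLOWED_SENDER;
}
```

In the email handler this would be called as isAllowedSender(message.headers.get("from")), with message.setReject(...) for anything that returns false. Addresses are compared case-insensitively here, which is a simplification.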
perfect, thank you!
that's Good Enough™️ for me then, I'm considering this solved!