Colin Asquith
Cloudflare Developers
Created by Colin Asquith on 1/29/2024 in #workers-help
HTMLRewriter Replace text issue with HTML entities and <script>
I have a Worker which replaces some text in a webpage. In general it's working fine, but I seem to be caught between a couple of Worker quirks:

* I had an issue where the replace would sometimes cause problems with HTML entities and end up with weird ASCII characters, so I followed https://community.cloudflare.com/t/how-to-unescape-html-entities-in-strings/255667 to use html-entities and decode them. That solved the issue.
* I then started to have problems with text being replaced in too many tags, which was breaking scripts in the head, so I targeted the HTMLRewriter with the selector in the on() method to say what to process. This seemed to fix most issues for <head> scripts and for scripts just before the closing </body> tag.
* I now have a site where there is a <script> tag nested inside multiple div tags. I do not want to replace the text in the script, but because it is inside multiple divs it is processed as a child element of div. A less-than sign becomes &lt;, and the script can then no longer run as code.

So far, I've been trying to work out the perfect selector query to target what I am doing, and maybe there is a way to do that, but the possible solutions I can see are:

* Target the selector further (but I don't think that can handle the case above).
* If I didn't have the issue with HTML entities I would not have this problem at all. Is there another way to prevent this ASCII character issue creeping in when doing a replace?
* As far as I understand it, I'm going through the text for an element and its children. Is there some way to have more control over that?
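For the last two questions, one possible direction is sketched below. It is not a confirmed fix. It assumes that `text.text` in an HTMLRewriter text handler holds the source text with entities still encoded, and it relies on `text.replace(content, { html: true })` inserting the content without HTML-escaping it, so no decode step should be needed and `<` should not turn into `&lt;`. It also keeps a counter so that text inside `<script>` and `<style>` is left alone even when those tags sit inside a matched div. `makeHandlers`, `TextChunkLike`, `ElementLike`, and the "Acme"/"Example" strings are hypothetical names for illustration; the two interfaces only mirror the handful of HTMLRewriter members used here.

```typescript
// Minimal structural types covering only the members used in this sketch.
interface TextChunkLike {
  readonly text: string;
  readonly lastInTextNode: boolean;
  replace(content: string, options?: { html?: boolean }): unknown;
}

interface ElementLike {
  onEndTag(handler: () => void): void;
}

// Hypothetical helper: builds two handler objects that share one counter.
function makeHandlers(find: string, replacement: string) {
  // Number of <script>/<style> elements we are currently inside.
  let skipDepth = 0;

  // Attach to elements whose text must not be touched.
  const skip = {
    element(el: ElementLike): void {
      skipDepth++;
      el.onEndTag(() => {
        skipDepth--;
      });
    },
  };

  // Attach to the elements whose text should be rewritten.
  const replace = {
    text(chunk: TextChunkLike): void {
      if (skipDepth > 0) return; // inside a script or style block
      if (!chunk.text.includes(find)) return; // leave untouched chunks as-is
      const updated = chunk.text.split(find).join(replacement);
      // html: true asks the rewriter not to escape the content, so existing
      // entities and any "<" in the chunk are written back unchanged.
      chunk.replace(updated, { html: true });
    },
  };

  return { skip, replace };
}

// Wiring inside a Worker (assumed selectors, not executed in this sketch):
//
//   const { skip, replace } = makeHandlers("Acme", "Example");
//   return new HTMLRewriter()
//     .on("script", skip)
//     .on("style", skip)
//     .on("div", replace)
//     .transform(response);
```

Two caveats with this sketch: because the content is inserted as raw HTML, the replacement string itself has to be safe markup, and a search term that happens to be split across two text chunks will not match unless the chunks are buffered until `lastInTextNode` is true.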
4 replies