RegEx with match() and replace()

I have a regular expression that grabs the double quotes and value of an HTML tag attribute with str.match(RegEx). I then call str.replace(RegEx, replacementStr), where replacementStr is a span tag with a class that changes the color inside a pre block. It works like a charm when my HTML tag has only 1 attribute. But when it has 2 attributes (e.g. id and class), both matches show up in each attribute. I can't figure out why it is doubling, since the id and class values are different. Any insight would be great:
convertedArr.forEach(line => {
// Quotes
if (line.match(dblQuote)) {
result = line.replace(dblQuote, `<span class="${classes[1]}">${line.match(dblQuote)}</span>`);
} else {
result = line;
}
});
Here is my RegEx and some console.log:
const dblQuote = /(&quot;[.\w\/:*?-]*\w)(&quot;[.\w\/:*?-]*)/g;
console.log(line.match(dblQuote));
// ['&quot;something&quot;', '&quot;test&quot;']
// ['&quot;test-img&quot;', '&quot;./img/file.jpeg&quot;']
console.log(result)
// &lt;p id=<span class="light-blue">&quot;something&quot;,&quot;test&quot;</span> class=<span class="light-blue">&quot;something&quot;,&quot;test&quot;</span>&gt;words here&lt;/p&gt;
My regexr.com link: https://regexr.com/76539
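For readers following along, the doubling in the logged output can be reproduced in isolation. This is a minimal sketch, assuming a sample line modelled on the console output above and the literal class name light-blue in place of classes[1]; the variable names are illustrative and not taken from the original script.

```javascript
// Hypothetical sample line, modelled on the logged output above.
const line = '&lt;p id=&quot;something&quot; class=&quot;test&quot;&gt;words here&lt;/p&gt;';
const dblQuote = /(&quot;[.\w\/:*?-]*\w)(&quot;[.\w\/:*?-]*)/g;

// With the g flag, match() returns every match on the line as an array.
const matches = line.match(dblQuote);

// Interpolating an array into a template literal calls its toString(),
// which joins the elements with commas.
const replacement = `<span class="light-blue">${matches}</span>`;

// replace() with a g-flagged regex swaps each match for that same string,
// so every attribute ends up carrying both values.
const doubled = line.replace(dblQuote, replacement);
console.log(doubled);
```

The replacement string is built once, before replace() runs, so it cannot vary from one match to the next.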
13 Replies
MarkBoots
MarkBoots2y ago
I'm no hero with regex, but wouldn't it be easier to parse the string into an actual DOM element so everything is more easily accessible? Then you have much more control and you can do what you want
const line = '<span class="white">&lt;p id=&quot;test-one&quot; class=&quot;test-two&quot;&gt;words here&lt;/p&gt;</span><span class="white">&lt;title&gt;Code Formatter&lt;/title&gt;</span><span class="white">&lt;img src =&quot;./img/file.jpeg&quot; /&gt;</span>';

const replacements = [["&quot;", '"'], ["&lt;", "<"], ["&gt;", ">"]];
const temp = document.createElement("div");
// Decode the entities one pair at a time, then let the browser parse the markup
temp.innerHTML = replacements.reduce((text, [from, to]) => text.replaceAll(from, to), line);
const spans = temp.querySelectorAll("span");
MarkBoots
MarkBoots2y ago
here you have 3 span elements from that line string
Jochem
Jochem2y ago
I agree with Mark, HTML and regex don't mix very well. You could probably get this to work with regular expressions, but that's not always guaranteed with HTML, and there are much better tools, like Mark's code
Lofty!
Lofty!2y ago
or even better, ASTs, although that might be overkill here
Joao
Joao2y ago
You could also try prism.js for highlighting code (if it's code we're talking about): https://github.com/PrismJS/prism Or, as per this SO thread, you can create a dummy html element and parse it at will using DOM manipulation methods: https://stackoverflow.com/questions/10585029/parse-an-html-string-with-js
Kernix
KernixOP2y ago
I am outputting the result to the DOM and then using that as the code in my pre tag. My code actually grabs each line of code and first converts it to HTML entities. Then I use those lines to find the matches with my regex and output each match wrapped in a span with a class for the coloring I want. It works fine if I only have 1 match per line. I assume there is a package I could use, but I'd rather do it myself.
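The entity-conversion step described here could look roughly like the following. It is only a sketch: the helper name escapeHtml and the entity table are assumptions, since the original conversion code is not shown in the thread.

```javascript
// Hypothetical helper: the name and the entity table are assumptions.
const ENTITIES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;' };

function escapeHtml(source) {
  // A single pass avoids re-escaping the "&" of an entity that was just inserted.
  return source.replace(/[&<>"]/g, (ch) => ENTITIES[ch]);
}

const escaped = escapeHtml('<p id="something" class="test">words here</p>');
console.log(escaped);
```

Escaping first means any span tags added afterwards are the only real markup in the string, so the pre block shows the code as text and only the highlighting as HTML.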
Kernix
KernixOP2y ago
The html strings are in a js array, not in the DOM
Kernix
KernixOP2y ago
Yes, I have changed this a little, but it's a paragraph tag with 2 attributes, a title tag with no attributes (both with opening and closing tags), and an img tag with 1 attribute, though I had a class on that one as well. I'm getting blown away by the multiple arrays in the forEach block. I'm absolutely freaking out. There is no way that this should be this difficult. I must be totally missing something simple.
Kernix
KernixOP2y ago
Here are the regexr expressions I am working with: 1. html attr: https://regexr.com/767h9 2. html tags: https://regexr.com/765tq 3. double quotes: https://regexr.com/76539
Kernix
KernixOP2y ago
The problem was that my regex lacked the correct capture groups and that I wasn't using the $1 syntax for the output, at least for my double-quotes regex
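A sketch of what that can look like, for anyone landing here later. The simplified pattern below is an assumption, not the final regex from the CodePen, and light-blue stands in for classes[1].

```javascript
// Hypothetical sample line, modelled on the logged output earlier in the thread.
const line = '&lt;p id=&quot;something&quot; class=&quot;test&quot;&gt;words here&lt;/p&gt;';

// Simplified pattern (an assumption): one capture group around the whole quoted value.
const quoted = /(&quot;.*?&quot;)/g;

// $1 is filled in separately for each match, so each attribute keeps its own value.
const viaGroup = line.replace(quoted, '<span class="light-blue">$1</span>');

// The regex from the question also works if the replacement refers to
// the current match with $& instead of calling match() again.
const dblQuote = /(&quot;[.\w\/:*?-]*\w)(&quot;[.\w\/:*?-]*)/g;
const viaWholeMatch = line.replace(dblQuote, '<span class="light-blue">$&</span>');

console.log(viaGroup);
```

Both forms let replace() substitute per match, which is what the array-in-a-template-literal version could not do.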
Kernix
KernixOP2y ago
Yeah, that was it. I have the individual regex expressions working, now I have to combine them. Thanks for all the help. If you are interested, here is my codepen with my code up to this point: https://codepen.io/jim-kernicky/pen/xxJrwXz
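One possible way to combine several patterns is a single pass with alternation and named groups. This is a sketch only: the three patterns, the group names, and the class names tag-color and attr-color are all hypothetical and are not the expressions from the regexr links or the CodePen.

```javascript
// Hypothetical sample line, already converted to HTML entities.
const line = '&lt;p id=&quot;something&quot; class=&quot;test&quot;&gt;words here&lt;/p&gt;';

// Hypothetical patterns: tag delimiters, attribute names, quoted values.
const token = /(?<tag>&lt;\/?\w+|\/?&gt;)|(?<attr>[\w-]+)(?==)|(?<str>&quot;.*?&quot;)/g;

// Hypothetical class names, one per named group.
const classFor = { tag: 'tag-color', attr: 'attr-color', str: 'light-blue' };

const highlighted = line.replace(token, (...args) => {
  // When the regex has named groups, the groups object is the last argument.
  const groups = args[args.length - 1];
  // Exactly one alternative matched; find which one.
  const kind = Object.keys(groups).find((name) => groups[name] !== undefined);
  return `<span class="${classFor[kind]}">${groups[kind]}</span>`;
});

console.log(highlighted);
```

Doing everything in one replace() call means no pattern ever sees the span markup inserted for another, which can happen when separate replacements run one after the other (an attribute pattern could match the class= inside an inserted span, for example).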