How to get regex matches between optional start and end tokens
I have a massive string with occurrences of the strings do() and don't(). I have to find all text before a don't(), ignore all text after it until another do() or end of string, then take all text from that do() until another don't() or end of string. So far I have this: (.*)don't\(\).*do\(\)(.*) and I'm processing it like this:
I'm not getting the expected answer this way, and am deeply suspicious that my regex is missing something. The input string is 20k long so I'm battling a bit figuring this out by hand.
6 Replies
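On the regex itself: with greedy .* the pattern in the question matches at most once, and group 1 tends to run up to the last don't() that still has a do() after it, so earlier disabled stretches end up inside the capture. By default . also stops at newlines. If you want to stay with regex, one option is to delete the disabled stretches instead of capturing the enabled ones. Here is a minimal sketch, assuming Python (the thread doesn't say which language is in use); enabled_text is a made-up name for illustration.

```python
import re

def enabled_text(s: str) -> str:
    """Drop everything from each don't() up to the next do() or end of string."""
    # .*? is lazy, so each don't() pairs with the nearest following do().
    # \Z lets a trailing don't() run to the end when no do() follows it.
    # DOTALL lets '.' cross newlines in multi-line input.
    return re.sub(r"don't\(\).*?(?:do\(\)|\Z)", "", s, flags=re.DOTALL)

print(repr(enabled_text("a don't() b do() c don't() d")))  # 'a  c '
```

Surplus do() tokens inside an already enabled stretch are left in the output, which should be harmless if whatever processes the result ignores them.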
I think writing a simple parser might be a better idea here, tbh
It's for 1 part of an AoC day challenge. Writing a proper parser would be a nice exercise, but maybe for another time. What I can do here is just find instances of the 2 enclosing tokens and loop over the substrings between them. I am a little too regex fixated.
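The find-the-tokens idea could look roughly like this. It is only a sketch, assuming Python; enabled_segments and its parameter names are invented for illustration, and it assumes the text starts out enabled.

```python
def enabled_segments(s: str, stop: str = "don't()", start: str = "do()") -> list[str]:
    """Return the substrings that lie outside any don't()...do() stretch."""
    segments = []
    pos = 0  # start of the current enabled stretch
    while True:
        off = s.find(stop, pos)
        if off == -1:
            # No further don't(): the rest of the string is enabled.
            segments.append(s[pos:])
            break
        segments.append(s[pos:off])
        on = s.find(start, off + len(stop))
        if on == -1:
            # Disabled until the end of the string.
            break
        pos = on + len(start)
    return segments

print(enabled_segments("a don't() b do() c don't() d"))  # ['a ', ' c ']
```

Each returned segment can then be fed to whatever does the actual processing.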
It looks like it is supposed to. The problem statement says:
"Only the most recent do() or don't() instruction applies. At the beginning of the program, mul instructions are enabled."
It doesn't mention surplus "do"s like yours at "sit do()amet". The function that will process your string with the do() doesn't care about either of these two tokens.
I just finished writing a parser btw :KEKW:
I've never written a real one in my 25-year career
I wrote a Logo parser 1st year uni. That's about it
"Parser" just sounds grandiose. It's just a loop, really
Loop over the string and keep saving the chars; if you detect don't(), stop saving them, and start saving them again when you detect do()
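In code, that loop might be something like the following. It is a rough sketch in Python (an assumption, since nobody in the thread names a language), and keep_enabled is a hypothetical name.

```python
def keep_enabled(s: str) -> str:
    """Keep characters while enabled; don't() switches saving off, do() switches it on."""
    kept = []
    saving = True  # the puzzle statement says instructions start out enabled
    i = 0
    while i < len(s):
        if s.startswith("don't()", i):
            saving = False
            i += len("don't()")
        elif s.startswith("do()", i):
            saving = True
            i += len("do()")
        else:
            if saving:
                kept.append(s[i])
            i += 1
    return "".join(kept)

print(repr(keep_enabled("a don't() b do() c don't() d")))  # 'a  c '
```

Because only the most recent token matters, a single boolean is all the state needed, and repeated do() or don't() tokens are handled without special cases. The tokens themselves are dropped from the output.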
Yeah, thanks guys, I am doing it with a loop after all. Much simpler.