C
C#12mo ago
Gautzilla

❔ Match literal escape strings with a regex

Hi everyone, I'm trying to write a simple regex that matches each literal hexadecimal escape sequence. For example, in the string "hey there \x68", I'm interested in matching "\x68", and not the unescaped "h" character. I wrote the following regex: "\x[0-9a-f]{2}", but the Regex.Matches method still seems to treat it as if I'm trying to match characters using escaped notation, and I get the following error: "Invalid pattern '\x[0-9a-f]{2}' at offset 3. Insufficient hexadecimal digits." Does anyone know how I can specify that I want the literal strings of the form "\xnn" to be matched, rather than the corresponding unescaped characters? Thanks!
15 Replies
boiled goose
boiled goose12mo ago
Can you paste the code? In particular, which string literal are you passing to Regex, "" or @""? I ask because \ would be escaped.
Gautzilla
Gautzilla12mo ago
I tried both:
string charASCIICodeRegex = "\\x[0-9a-f]{2}";
var Matches = Regex.Matches(line, charASCIICodeRegex);
and:
string charASCIICodeRegex = @"\x[0-9a-f]{2}";
var Matches = Regex.Matches(line, charASCIICodeRegex);
without success.
artya
artya12mo ago
have you tried
@"\\x[0-9a-f]{2}"
you have two different regex there
ero
ero12mo ago
it's that one yeah
MODiX
MODiX12mo ago
ero
REPL Result: Success
Regex.IsMatch(@"\x68", @"\\x[0-9a-f]{2}")
Result: bool
True
Compile: 454.167ms | Execution: 30.453ms
cap5lut
cap5lut12mo ago
Since \ has an escape meaning in regex itself, you need to escape that as well. So to match a backslash you need to use @"\\" or "\\\\". But you already have an issue here: what if the content is @"\\x1"? It would match on the @"\x1" even though that backslash is itself escaped.
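The two layers of escaping described above can be sketched as follows. This is a minimal illustration, not code from the thread: `IsValidPattern` is a hypothetical helper added here only to show that the single-escaped pattern is rejected by the regex parser, and the sample `line` is taken from the original question.

```csharp
using System;
using System.Text.RegularExpressions;

// A verbatim literal with two backslashes and a regular literal with four
// both hand the regex engine the same pattern text: \\x[0-9a-f]{2}
string verbatim = @"\\x[0-9a-f]{2}";
string regular = "\\\\x[0-9a-f]{2}";
Console.WriteLine(verbatim == regular); // True

// The input contains a literal backslash followed by x, 6, 8.
string line = @"hey there \x68";
foreach (Match m in Regex.Matches(line, verbatim))
    Console.WriteLine(m.Value); // \x68

// Hypothetical helper: reports whether the regex parser accepts a pattern.
static bool IsValidPattern(string pattern)
{
    try { _ = new Regex(pattern); return true; }
    catch (ArgumentException) { return false; }
}

// With only one backslash, the engine reads \x as the start of a hex escape
// and then finds '[' instead of two hex digits, so the pattern is rejected.
Console.WriteLine(IsValidPattern(@"\x[0-9a-f]{2}")); // False
```

Both attempts in the question ("\\x..." in a regular literal and @"\x..." in a verbatim one) produce that last, single-backslash pattern, which would explain why they fail in the same way.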
Gautzilla
Gautzilla12mo ago
OK, thanks everyone! I just kinda missed one layer of escaping. That makes sense! Whoops, I didn't think about that. It's for a simple AdventOfCode puzzle, so I don't know if I'll stumble upon this issue, but that's a good point!
cap5lut
cap5lut12mo ago
Regexes are good for simple patterns, but you might need an actual parser here. It depends on what you want to achieve, though: you can also just use @"(\\+)x([0-9a-fA-F]{2})" and afterwards check whether the first group has an odd length (if so, the last \ isn't escaped) before going further.
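The odd-length check suggested above might look roughly like this. It is only a sketch under assumptions: `FindUnescapedHexEscapes` is a hypothetical name, and it assumes that inside the input a doubled backslash always stands for one literal backslash, which, as noted elsewhere in the thread, depends on context.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the \xNN sequences whose leading backslash
// is not itself escaped by a preceding backslash.
static List<string> FindUnescapedHexEscapes(string text)
{
    var results = new List<string>();
    foreach (Match m in Regex.Matches(text, @"(\\+)x([0-9a-fA-F]{2})"))
    {
        // Group 1 holds the whole run of backslashes before the 'x'.
        // An odd run means the last backslash is not paired off,
        // so it really does introduce the hex escape.
        if (m.Groups[1].Length % 2 == 1)
            results.Add(@"\x" + m.Groups[2].Value);
    }
    return results;
}

// One, two, and three backslashes before the 'x':
string sample = @"a\x68 b\\x10 c\\\x41";
Console.WriteLine(string.Join(", ", FindUnescapedHexEscapes(sample))); // \x68, \x41
```

Anything more involved than this (other escape kinds, quotes, nesting) is probably where a small hand-written parser becomes easier to reason about than a regex.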
ero
ero12mo ago
I don't understand this at all lol. How would that match when we have {2}?
cap5lut
cap5lut12mo ago
Aah, yeah, it doesn't. My focus was on the backslash, not on the hexadecimal number.
ero
ero12mo ago
ah
cap5lut
cap5lut12mo ago
Assume the content was \\x10 ;p It would match on the \x10, but that \ is actually escaped.
Gautzilla
Gautzilla12mo ago
Yup, I didn't even notice that you put only one hex digit, but I got the idea. Thanks!
cap5lut
cap5lut12mo ago
Note that, depending on context, \\x10 could still be resolved to a backslash plus the hexadecimal escape notation.