C#17mo ago
Chris TCC

❔ trim string according to regex

I've got this regex string, ^(?:\b\w+\b[\s\r\n]*){1,25}$ which is supposed to verify the string doesn't have more than 25 words. How do I use this regex (or modifications of it) to trim the string to 25 words if it's longer than that? (when the regex check fails)
42 Replies
Anton
Anton17mo ago
Well, remove the dollar sign, and then I guess slice the rest of the string with the help of the capture groups (they contain the start index information and such, I think). Or do ^(your_inner_expr)(.+)$ and then modify your match logic to check whether the last group is empty; if it's not, then that's the rest of the string. Come to think of it, you could just slice the match off of the original string, after removing the dollar from the regex of course. Either way, you have to remove the dollar for any of that to work.
Chris TCC
Chris TCC17mo ago
I was thinking along these lines: just chop off the end, no need to split or anything. The dollar sign is for prior logic; I'll remove it for the string processing.
Anton
Anton17mo ago
idk what you mean by splitting, but splitting by words might actually make your code way easier
Chris TCC
Chris TCC17mo ago
really? I thought there would be a simple way to just use regex and delete anything that isn't the regex
Anton
Anton17mo ago
well I mean assuming you probably will need the words at some point
Chris TCC
Chris TCC17mo ago
I just need the 25. Anything beyond that doesn't matter to me.
Anton
Anton17mo ago
if you split by words, you could do something along these lines:
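var words = line.SplitterWords(); // placeholder for whatever splits the line into words
var t = words.Skip(25);           // lazily skips the first 25 words
using var e = t.GetEnumerator();
if (e.MoveNext())                 // MoveNext is called on the enumerator, not on t
{
    // there are more than 25 words; e.Current is the 26th
}
else
{
    // 25 words or fewer
}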
Chris TCC
Chris TCC17mo ago
I don't understand this code. It uses enums?
Anton
Anton17mo ago
enumerators & enumerables and linq
Chris TCC
Chris TCC17mo ago
uh I have no idea what those are
Anton
Anton17mo ago
are you a beginner?
Chris TCC
Chris TCC17mo ago
very much also self-taught
Anton
Anton17mo ago
Well, an enumerable is a collection that you can loop through. An enumerator is an object that holds the state of enumeration. Example: an enumerable can be an array, and an enumerator would be a struct with a reference to the array and the current index. Since enumerables are designed for enumeration, they can be lazily computed. There are too many details to explain here; you'd better read some docs or articles.
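A small sketch of that distinction, for illustration only. The class and method names (`EnumerationDemo`, `SumWithForeach`, `SumWithEnumerator`) are made up; `IEnumerable<T>`, `IEnumerator<T>`, `GetEnumerator`, `MoveNext` and `Current` are the real .NET members. Both methods do the same thing; `foreach` is roughly shorthand for the manual loop.

```csharp
using System.Collections.Generic;

public static class EnumerationDemo
{
    // The enumerable is the thing you can loop over.
    public static int SumWithForeach(IEnumerable<int> numbers)
    {
        int sum = 0;
        foreach (int n in numbers)
        {
            sum += n;
        }
        return sum;
    }

    // The enumerator is the object that remembers where the loop currently is.
    public static int SumWithEnumerator(IEnumerable<int> numbers)
    {
        int sum = 0;
        using (IEnumerator<int> e = numbers.GetEnumerator())
        {
            while (e.MoveNext())   // advance to the next element, false when done
            {
                sum += e.Current;  // the element at the current position
            }
        }
        return sum;
    }
}
```

An array such as `new[] { 1, 2, 3 }` can be passed to either method, since arrays are enumerables.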
Chris TCC
Chris TCC17mo ago
I tried to learn enums on a few occasions I never understood them
Anton
Anton17mo ago
Not enums. Enums have nothing to do with it. "Enum" is short for "enumeration".
Chris TCC
Chris TCC17mo ago
oh I thought enums are enumerators
Anton
Anton17mo ago
These, on the other hand, are an -able and an -ator: enumerable and enumerator. Do you know what constants and static classes are?
Chris TCC
Chris TCC17mo ago
statics can't be changed no? read-only?
Anton
Anton17mo ago
Not classes. For classes it means they can't be instantiated and hence can only hold static things.
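A tiny sketch of the difference, assuming a hypothetical `Settings` class (the names are invented for illustration). A static class cannot be instantiated, but its static fields can still be changed unless they are declared `const` or `readonly`.

```csharp
public static class Settings
{
    // A constant: fixed at compile time, can never be reassigned.
    public const int MaxWords = 25;

    // A static field: shared by the whole program, but still writable.
    public static int TimesChecked = 0;

    public static void RecordCheck()
    {
        TimesChecked++; // allowed: static does not mean read-only
    }
}
```

Writing `new Settings()` would not compile, and neither would `Settings.MaxWords = 30;`.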
Chris TCC
Chris TCC17mo ago
yeah so read only right?
Anton
Anton17mo ago
what
Chris TCC
Chris TCC17mo ago
as in, it's essentially like a book
Anton
Anton17mo ago
no
Chris TCC
Chris TCC17mo ago
you can read from it but can't change the values
Anton
Anton17mo ago
Yeah, you seem to not know classes yet either. Well, I suggest you do some reading. Find a book, maybe.
Chris TCC
Chris TCC17mo ago
all my C# experience is programming some mobile app prototypes in unity and using it as a scripting language
Anton
Anton17mo ago
You can't make more or less complicated systems without this knowledge, though. Well, if it's simple scripts, I guess you'll manage. But if you ever want to go beyond that, take your time.
Chris TCC
Chris TCC17mo ago
Hm. As of now I've only been doing very simple things for the past 6 years. So is it possible to truncate a string based on number of words? The only way I could think of would be to split the string, stitch the words I want back together, and ditch the remaining array elements.
Anton
Anton17mo ago
the pseudocode would be kinda like this
var regex = ...;
string? match = regex.GetMatch(input);
if (match is not null)
return input[match.Length ..];
Chris TCC
Chris TCC17mo ago
But length is in characters, not words. Would I have to use regex to split the string into words first?
Anton
Anton17mo ago
Well yeah, that's what you need there. You wanted everything after the first 25 words.
Chris TCC
Chris TCC17mo ago
but words can have different character amounts, no?
Anton
Anton17mo ago
explain your logic
Chris TCC
Chris TCC17mo ago
"purple" and "christmas" have different character amounts so you can't really use character amounts to determine where a word boundary is, right?
Anton
Anton17mo ago
I don't see why that would matter in your problem. You just needed to slice off the first 25 words, you said.
Chris TCC
Chris TCC17mo ago
I want to keep the first 25 words and slice off the rest. Sorry if I wasn't clear with my wording.
Anton
Anton17mo ago
Well, then just take the match from the regex. The first group (group 0) is the whole match.
Chris TCC
Chris TCC17mo ago
string match = filter.Match(string);
print match;
?
Anton
Anton17mo ago
I don't remember the API exactly; you'll have to look that up. I almost never use regexes.
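For reference, a minimal sketch of the "remove the dollar sign and keep the match" idea using the real .NET API, `Regex.Match` from `System.Text.RegularExpressions`. The class and method names (`WordTrimmer`, `KeepFirst25Words`) are made up for illustration, and returning the input unchanged when nothing matches is an assumption, not something decided in this thread.

```csharp
using System.Text.RegularExpressions;

public static class WordTrimmer
{
    // The pattern from the question without the trailing '$',
    // so it can match just the start of a longer string.
    private static readonly Regex FirstWords =
        new Regex(@"^(?:\b\w+\b[\s\r\n]*){1,25}");

    public static string KeepFirst25Words(string input)
    {
        Match m = FirstWords.Match(input);

        // Assumption: if nothing matches (for example, the text starts
        // with a space or punctuation), return the input unchanged.
        if (!m.Success)
        {
            return input;
        }

        // m.Value is the whole match (group 0). The pattern also consumes
        // the whitespace after the last word, so trim it.
        return m.Value.TrimEnd();
    }
}
```

One caveat with this pattern: `\w+` does not match punctuation, so the match stops at the first comma or period. A text like "Hello, world" would be cut after "Hello".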
Florian Voß
Florian Voß17mo ago
I think you shouldn't be solving this with regex in the first place. Just check if the length or count is greater than 25, after splitting the string on whatever separates the words from each other.
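A minimal sketch of the split-based approach, assuming words are separated by whitespace. `WordLimiter` and `TakeWords` are hypothetical names; `string.Split`, `string.Join` and LINQ's `Take` are the real .NET calls.

```csharp
using System;
using System.Linq;

public static class WordLimiter
{
    public static string TakeWords(string input, int maxWords = 25)
    {
        // An empty separator array makes Split use whitespace as the delimiter;
        // RemoveEmptyEntries drops the empty strings produced by repeated spaces.
        string[] words = input.Split(new char[0], StringSplitOptions.RemoveEmptyEntries);

        // Short enough already: return the original text untouched.
        if (words.Length <= maxWords)
        {
            return input;
        }

        // Keep only the first maxWords words and stitch them back together.
        return string.Join(" ", words.Take(maxWords));
    }
}
```

Unlike the regex version, this keeps punctuation attached to its word, but when it truncates it rejoins the words with single spaces, so the original line breaks and extra spacing are lost.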
Chris TCC
Chris TCC17mo ago
that was the other option that I saw yeah