✅ Remove tags from string
So if I have a string like this
This is a string with <color="red">different colors</color>. And some cool <shake>other effects</shake>
How can I remove the tags from the string and leave the rest as is?
I was thinking of scanning the string and removing everything between < and >, but I can't seem to find a solution that would do that.
Any suggestions?
if you can use regex it's pretty easy.
Regex.Replace(inputString, "<.*?>", "");
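For reference, the one-liner above could be wrapped up roughly like this. It is only a sketch: the class name `TagStripper` and the method name `StripTags` are made up for illustration, and it assumes tags never contain a line break, since `.` does not match newlines by default.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagStripper
{
    // Lazy match: a '<', then as few characters as possible, then the nearest '>'.
    private static readonly Regex AnyTag = new Regex("<.*?>", RegexOptions.Compiled);

    // Returns the input with every "<...>" span removed and everything else untouched.
    public static string StripTags(string input)
    {
        return AnyTag.Replace(input, string.Empty);
    }
}
```

Calling `TagStripper.StripTags(...)` on the example string from the question should give `This is a string with different colors. And some cool other effects`.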
oooh I wasn't aware regex was in c#
that's nice I'll try that. Thanks
Pretty much every language has some sort of regex support
But... what if the string is <foo>two < three</foo>?
What if it's <foo>a -> b -> c</foo>?
True. There might be situations where it's desirable to have those characters inside the text, but I think I can live without that for now. I guess the string would need some kind of escape sequence.
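To make those two edge cases concrete, here is a small sketch comparing the pattern from the thread with a slightly stricter one. The stricter pattern is an assumption on my part, not something proposed in the thread: it only treats `<` as the start of a tag when a letter (optionally preceded by `/`) follows it directly. The names `TagEdgeCases`, `StripLoose` and `StripStrict` are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagEdgeCases
{
    // Pattern from the thread: any '<' up to the nearest following '>'.
    public static string StripLoose(string input) =>
        Regex.Replace(input, "<.*?>", "");

    // Assumed stricter variant: '<', an optional '/', a letter,
    // then no further '<' or '>' until the closing '>'.
    public static string StripStrict(string input) =>
        Regex.Replace(input, "</?[A-Za-z][^<>]*>", "");
}
```

With `<foo>two < three</foo>`, `StripLoose` returns `two ` because the stray `<` swallows everything up to the end of `</foo>`, while `StripStrict` returns `two < three`. With `<foo>a -> b -> c</foo>`, both return `a -> b -> c`, since a lone `>` never starts a match. The stricter pattern is still a heuristic: text like `a <b and c> d` would be mistaken for a tag, which is where an escape sequence or a real parser comes in.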
That's one of the reasons you never really sanitize HTML nowadays, you just escape it.
Instead of stripping all <script>s and the like, you escape them so they are displayed as the literal text <script>.
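As a quick illustration of escaping instead of stripping, .NET ships `System.Net.WebUtility.HtmlEncode`. The wrapper below is only a sketch, and the names `HtmlEscaping` and `Escape` are made up for the example.

```csharp
using System;
using System.Net;

public static class HtmlEscaping
{
    // Converts characters such as '<' and '>' into HTML entities,
    // so a browser shows them as text instead of treating them as markup.
    public static string Escape(string userText) => WebUtility.HtmlEncode(userText);
}
```

`HtmlEscaping.Escape("<script>alert(1)</script>")` returns `&lt;script&gt;alert(1)&lt;/script&gt;`, which a browser renders as the visible text `<script>alert(1)</script>` rather than running it.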
I'm not sanitizing html 😄
Doing a dialog system that has text styling tags in it
You're looking at making some parser, then
Or using something existing like BBCode
Sorry, should've added that in the question.
a custom parser yeah
Ah, well, you should tokenize it then, into something like
If you do that, no need to strip anything
And with a parser like this you already keep track of <s and >s, so it should be easy to tell which strings are not part of the tags at that stage.
True, I might actually do that. Because the more I think about it, the more confusing it gets.
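The tokenizing idea suggested above could be sketched along these lines. Everything here is an assumption made for illustration, not the code from the thread: the token kinds, the names `Token`, `DialogTokenizer`, `Tokenize` and `PlainText`, and the rule that a `<` only opens a tag when a letter (or `/` plus a letter) follows and a `>` arrives before the next `<`.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public enum TokenKind { Text, OpenTag, CloseTag }

public readonly struct Token
{
    public TokenKind Kind { get; }
    public string Value { get; }   // text content, or the tag body without '<', '/', '>'
    public Token(TokenKind kind, string value) { Kind = kind; Value = value; }
}

public static class DialogTokenizer
{
    // Splits the input into text runs and tag tokens in a single pass.
    public static List<Token> Tokenize(string input)
    {
        var tokens = new List<Token>();
        var text = new StringBuilder();
        int i = 0;
        while (i < input.Length)
        {
            if (input[i] == '<' && TryReadTag(input, i, out Token tag, out int next))
            {
                // Flush any pending text before emitting the tag.
                if (text.Length > 0)
                {
                    tokens.Add(new Token(TokenKind.Text, text.ToString()));
                    text.Clear();
                }
                tokens.Add(tag);
                i = next;
            }
            else
            {
                // Not a tag (for example a stray '<'): keep the character as text.
                text.Append(input[i]);
                i++;
            }
        }
        if (text.Length > 0) tokens.Add(new Token(TokenKind.Text, text.ToString()));
        return tokens;
    }

    // Tries to read "<name...>" or "</name...>" starting at 'start'.
    private static bool TryReadTag(string s, int start, out Token tag, out int next)
    {
        tag = default(Token);
        next = start;
        int bodyStart = start + 1;
        bool closing = bodyStart < s.Length && s[bodyStart] == '/';
        if (closing) bodyStart++;
        if (bodyStart >= s.Length || !char.IsLetter(s[bodyStart])) return false;

        int end = bodyStart;
        while (end < s.Length && s[end] != '>' && s[end] != '<') end++;
        if (end >= s.Length || s[end] != '>') return false;

        string body = s.Substring(bodyStart, end - bodyStart);
        tag = new Token(closing ? TokenKind.CloseTag : TokenKind.OpenTag, body);
        next = end + 1;
        return true;
    }

    // The visible text is just the Text tokens joined together.
    public static string PlainText(IEnumerable<Token> tokens)
    {
        var sb = new StringBuilder();
        foreach (Token t in tokens)
            if (t.Kind == TokenKind.Text) sb.Append(t.Value);
        return sb.ToString();
    }
}
```

With tokens in hand there is nothing to strip: `PlainText(Tokenize(...))` gives the displayable string, and the `OpenTag`/`CloseTag` tokens say where each effect starts and ends. A stray `<` as in `<foo>two < three</foo>` simply ends up inside the text token.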
Yeah, I'll have to take a closer look at that. To be honest, this has made me rethink my entire system, so I might just redo it. Thanks for the advice, everyone.