❔ Split on new line preserving empty line

So I'm trying to create a method that, given a string containing Environment.NewLine, \r\n, \r or \n, converts it to an array while preserving the new lines in the form of empty lines.
string[] newLineArray = { Environment.NewLine };
string[] textArray1 = text.Split(newLineArray, StringSplitOptions.None);
string[] textArray = text.Split(Environment.NewLine.ToArray(), StringSplitOptions.None);
While testing, I'm having a hard time understanding why the first two lines behave differently from the third. Given a string such as "First line\r\nAnd more in new line", the first two lines split it into an array of 2 strings, while textArray ends up with 3, including an empty entry. In the end I want my C# library for creating Word documents to let people provide strings with different kinds of new lines and have them handled properly. But for some reason Split on newLineArray delivers no empty lines, and the only time I can get empty lines is when using Environment.NewLine.ToArray().
private WordParagraph ConvertToTextWithBreaks(string text) {
    string[] newLineArray = { Environment.NewLine, "\n", "\r\n", "\n\r" };
    //string[] newLineArray = { Environment.NewLine };
    string[] textArray = text.Split(newLineArray, StringSplitOptions.None);
    //string[] textArray = text.Split(Environment.NewLine.ToArray(), StringSplitOptions.None);

    WordParagraph wordParagraph = null;
    foreach (string line in textArray) {
        if (line == "") {
            wordParagraph = AddBreak();
        } else {
            wordParagraph = new WordParagraph(this._document, this._paragraph, new Run());
            wordParagraph.Text = line;
            this._paragraph.Append(wordParagraph._run);
        }
    }
    return wordParagraph;
}
What am I missing?
43 Replies
phaseshift (2y ago)
Using Environment.NewLine.ToArray() means you're splitting on \r and then also on \n. So if the line is "blah\r\n" then you get "blah", "". Same as if you have "blah\ra\n" then you get "blah", "a"
ero (2y ago)
you get "blah", "", and ""
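A minimal sketch of the point above, assuming a plain console program (the class name is only for illustration): splitting on a char[] treats \r and \n as two independent separators, so a single \r\n pair produces an extra empty entry between the two characters.

```csharp
using System;

class CharSplitDemo
{
    static void Main()
    {
        // '\r' and '\n' are matched one character at a time.
        char[] separators = { '\r', '\n' };

        string[] parts = "blah\r\n".Split(separators, StringSplitOptions.None);

        // Entries: "blah", an empty entry between '\r' and '\n',
        // and an empty entry after the trailing '\n'.
        Console.WriteLine(parts.Length); // 3
        foreach (string part in parts)
        {
            Console.WriteLine($"[{part}]");
        }
    }
}
```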
phaseshift (2y ago)
Not sure how 'implementation defined' this is, but this might be what you want: .Split(new[]{"\r\n", "\r", "\n"}, StringSplitOptions.None); importantly "\r\n" is the first split string, so that's always checked first - at least in my local experiment.
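A small sketch of that suggestion, with a made-up input string: when "\r\n" is listed before "\r" and "\n", a Windows-style line ending is consumed as one separator, and an empty entry shows up only where the text really contains a blank line.

```csharp
using System;

class StringSplitDemo
{
    static void Main()
    {
        // "\r\n" comes first so it is tried before the single-character endings.
        string[] separators = { "\r\n", "\r", "\n" };

        // Mixed line endings, with one genuinely blank line after "two".
        string text = "one\r\ntwo\n\nthree\rfour";

        string[] lines = text.Split(separators, StringSplitOptions.None);

        // Entries: "one", "two", "", "three", "four"
        Console.WriteLine(lines.Length); // 5
        foreach (string line in lines)
        {
            Console.WriteLine($"[{line}]");
        }
    }
}
```

If "\r" were listed before "\r\n", the "\r" would likely match first at that position and the following "\n" would be split separately, which brings back the extra empty entry.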
ero (2y ago)
Splitting on just \r doesn't really make sense. I think you can have a \r without a new line.
przemyslawklys
That's not what I am seeing. It splits just once and properly shows an empty line in place of the newline for both \r\n and explicit. My problem with the first question is that, even though StringSplitOptions.None is supposed to act the same, I get 2 different results: the split works with text.Split(newLineArray, StringSplitOptions.None); but it's not preserving the empty entry.
przemyslawklys
This doesn't seem to work for me. I mean the Split itself does work; the problem is the result is not "blah", "", "something else" but "blah", "something else". The only time "blah", "", "something else" is retained is if I use text.Split(Environment.NewLine.ToArray(), StringSplitOptions.None); I have no clue why.
przemyslawklys
The breakpoint never gets hit if I define text.Split the suggested way.
przemyslawklys
It hits just fine when using the single .ToArray() approach. I guess the difference comes from var test = Environment.NewLine.ToArray(); because what it shows now is that test is actually char[]. That would mean char[] splits differently than string[].
phaseshift (2y ago)
It seems like you're expecting split to keep new lines or convert them. It just doesn't
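One way to act on this, sketched under assumptions: since Split only removes the separators, a break can be emitted between consecutive entries instead of waiting for an empty entry. The Console output below is a stand-in for the thread's AddBreak() and WordParagraph calls, which are not reproduced here; this is not presented as the library's actual approach.

```csharp
using System;

class BreakPlacementDemo
{
    static void Main()
    {
        string[] separators = { "\r\n", "\r", "\n" };
        string[] lines = "First line\r\nAnd more in new line"
            .Split(separators, StringSplitOptions.None);

        for (int i = 0; i < lines.Length; i++)
        {
            // Every boundary between two entries was a line ending in the input,
            // so the break goes here rather than in an 'empty entry' branch.
            if (i > 0)
            {
                Console.WriteLine("<break>"); // hypothetical stand-in for AddBreak()
            }
            Console.WriteLine($"text: {lines[i]}");
        }
    }
}
```

With this shape, a blank line in the input simply becomes an empty entry surrounded by two breaks, so no special case is needed for it.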
przemyslawklys
var paragraph1 = "First line\r\nAnd more in new line";

char[] testChars = Environment.NewLine.ToArray();
string[] testStrings = { Environment.NewLine };
string[] testStringsMultiple = { Environment.NewLine, "\n", "\r\n" };

string[] textArray1 = paragraph1.Split(testChars, StringSplitOptions.None);
string[] textArray2 = paragraph1.Split(testStrings, StringSplitOptions.None);
string[] textArray3 = paragraph1.Split(testStringsMultiple, StringSplitOptions.None);

Console.WriteLine(textArray1.Length);
Console.WriteLine(textArray2.Length);
Console.WriteLine(textArray3.Length);
phaseshift (2y ago)
It doesn't 'show empty line in place of new line'
przemyslawklys
The 3 console writes give 3, 2, 2, which means char[] is treated differently and preserves the new line on which we actually split. When a string[] is passed, the empty line is not preserved.
phaseshift (2y ago)
You're just doing different things. They're not treated differently. You give different input, you get a different answer.
przemyslawklys
I've edited the example above to show the same string split in 3 different but very similar ways. The first result gives 3, the second gives 2, the third gives 2. To my noob ass it looks the same 😉 when it comes to input,