❔ List contains 600+ lines, the file that it writes to contains only 130, why?

I've got a PDF with a few pages and a lot of lines. Anybody got an idea? When I stop in the debugger, wordsInPage contains 600+ lines. This is the entire code:
using (PdfDocument document = PdfDocument.Open(@"C:\Users\myuser\Downloads\test1.PDF"))
{
    var wordsToIntercept = new List<string>();
    var pages = document.GetPages().ToList();

    foreach (var page in pages)
    {
        var wordsInPage = page.GetWords().Select(w => w.Text).ToList();
        var contains = wordsInPage.Contains("TJMAX");
        foreach (var wordd in wordsInPage)
        {
            Console.WriteLine(wordd);
        }
        string docPath =
            Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);

        // Write the string array to a new file named "WriteLines.txt".
        using (StreamWriter outputFile = new StreamWriter(Path.Combine(docPath, "WriteLines.txt")))
        {
            foreach (string line in wordsInPage)
                outputFile.WriteLine(line.ToUpper());
        }
    }
}
61 Replies
codesandplays · 2y ago
What is the type of wordsInPage?
antimatter8189 (OP) · 2y ago
List<string>
codesandplays · 2y ago
foreach (var wordd in wordsInPage)
{
    Console.WriteLine(wordd);
}
How many lines are being printed to the console?
antimatter8189 (OP) · 2y ago
130-something
codesandplays · 2y ago
If the list contains 600 items, there is no way a for loop would print just 130 items
antimatter8189 (OP) · 2y ago
but wordsInPage.Count is 600+
codesandplays · 2y ago
can you share the PDF file?
antimatter8189 (OP) · 2y ago
yup that's what I'm having trouble understanding
https://www.analog.com/media/en/technical-documentation/data-sheets/ad9361.pdf
big ass pdf, has way more than 130 lines
codesandplays · 2y ago
What NuGet package are you using for PdfDocument?
antimatter8189 (OP) · 2y ago
PigPdf
antimatter8189 (OP) · 2y ago
PdfPig 0.1.7
Reads text content from PDF documents and supports document creation. Apache 2.0 licensed.
codesandplays · 2y ago
alright let me run this. give me a sec
antimatter8189 (OP) · 2y ago
you should see a count of 693, I think
codesandplays · 2y ago
I'm getting count 693, yes
antimatter8189 (OP) · 2y ago
That's odd, ain't it lol? Never seen such behavior. It can't have anything to do with the NuGet package, the words have already been extracted into a list of strings
Henkypenky · 2y ago
seems like your buffer is having some problems, try:
using (StreamWriter outputFile = new StreamWriter(Path.Combine(docPath, "WriteLines.txt")))
{
    outputFile.AutoFlush = true;
    foreach (string line in wordsInPage)
        outputFile.WriteLine(line.ToUpper());
}
antimatter8189 (OP) · 2y ago
on it
@Henkypenky that gave me 20 more lines, aka 150+, not the 693 I should be getting
Henkypenky · 2y ago
wait
codesandplays · 2y ago
I'm getting 152 lines, weird
antimatter8189 (OP) · 2y ago
yup exactly, something is so odd lol
codesandplays · 2y ago
very weird
Henkypenky · 2y ago
try
antimatter8189 (OP) · 2y ago
and I'm writing those lines just to test whether I see a certain value. It prob fucks up the loops after too, I'm guessing, unsure
Henkypenky · 2y ago
using (StreamWriter outputFile = new StreamWriter(Path.Combine(docPath, "WriteLines.txt")))
{
    foreach (string line in wordsInPage)
        outputFile.WriteLine(line.ToUpper());
    outputFile.Flush();
}
antimatter8189 (OP) · 2y ago
back to 136 lol
@Orannis ain't u some C# god? save us
Henkypenky · 2y ago
okay try this
try
{
    StreamWriter outputFile = new StreamWriter(Path.Combine(docPath, "WriteLines.txt");
    foreach (string line in wordsInPage)
        outputFile.WriteLine(line.ToUpper());
}
finally
{
    stream.Flush();
    stream.Close();
    stream.Dispose();
}
antimatter8189 (OP) · 2y ago
wait, that code's not correct XD can't paste it in
codesandplays · 2y ago
we are using "using" instead of try/finally here, yeah?
antimatter8189 (OP) · 2y ago
unsure what u mean?
Henkypenky · 2y ago
StreamWriter outputFile = new StreamWriter(Path.Combine(docPath, "WriteLines.txt"));

try
{
    foreach (string line in wordsInPage)
    {
        outputFile.WriteLine(line.ToUpper());
    }
}
catch (Exception e)
{
    //
}
finally
{
    outputFile.Flush();
    outputFile.Close();
    outputFile.Dispose();
}
antimatter8189 (OP) · 2y ago
136 lines, no change
codesandplays · 2y ago
and the week has barely started fml
antimatter8189 (OP) · 2y ago
I've never seen this behavior, didn't know it could even happen
Henkypenky · 2y ago
I have a final try:
using (StreamWriter outputFile = new StreamWriter(Path.Combine(docPath, "WriteLines.txt")))
{
    outputFile.AutoFlush = true;
    foreach (string line in wordsInPage)
        outputFile.WriteLine(line.ToUpper());
    outputFile.Close(); //outside foreach
}
antimatter8189 (OP) · 2y ago
same result lol, 136 lines
only your first method gave me +20 lines, but even that's not close to the full number heh
Henkypenky · 2y ago
in my experience I saw the last line being skipped and flush solved it, but this many lines is weird
antimatter8189 (OP) · 2y ago
skips about 550 lines, insane
Henkypenky · 2y ago
can you do
codesandplays · 2y ago
I think I've figured it out
Henkypenky · 2y ago
Console.WriteLine(wordsInPage.Count);
codesandplays · 2y ago
gimme a sec
Henkypenky · 2y ago
before the foreach
antimatter8189 (OP) · 2y ago
LOL
693 220 302 324 253 313 400 286 285 169 264 163 162 75 288 567 756 731 631 395 377 516 418 336 392 375 484 203 387 368 503 213 884 859 773 152
I think that's page related now
Henkypenky · 2y ago
did u add it inside the foreach?
antimatter8189 (OP) · 2y ago
there are about 30 pages
no, outside of it
it's in the foreach of the pages
there are 30 pages in that pdf
Henkypenky · 2y ago
so the 152 is working correctly
antimatter8189 (OP) · 2y ago
I don't think so, it stops at the first page on line 152
codesandplays · 2y ago
bro here is the problem
antimatter8189 (OP) · 2y ago
not on the page that contains 152 lines
codesandplays · 2y ago
your for loops are structured weird, let me explain
Henkypenky · 2y ago
I think you are overwriting everything
start by dismantling your problem, do page 1 only
codesandplays · 2y ago
henky you're right, that's exactly what's happening here
antimatter8189 (OP) · 2y ago
snappp makes sense! I'm recreating the file each time, the last iteration is the 136 one prob
Henkypenky · 2y ago
yesowo
antimatter8189 (OP) · 2y ago
or 152 w/e lol
I was sure it would be locked tho, or just add to it, not overwrite it
didn't know it behaves like this, good catch guys
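For reference, the overwrite behavior described above can be reproduced without PdfPig. This is only a minimal sketch: the temp-file name `WriteLinesDemo.txt` and the `pages` arrays are made up for illustration and stand in for the real per-page word lists. It relies on `new StreamWriter(path)` creating or truncating the file each time it is constructed, and on the `StreamWriter(string, bool)` overload with `append: true` adding to an existing file instead.

```cs
using System;
using System.IO;

// Hypothetical scratch file, used only for this demonstration.
string path = Path.Combine(Path.GetTempPath(), "WriteLinesDemo.txt");

// Made-up word lists standing in for the words of two PDF pages.
string[][] pages =
{
    new[] { "A", "B", "C" },
    new[] { "D", "E" },
};

// Variant 1: a new writer per page. The file is truncated on every iteration,
// so only the last page's words are left at the end.
foreach (var words in pages)
{
    using (var perPageWriter = new StreamWriter(path))
    {
        foreach (var word in words)
            perPageWriter.WriteLine(word);
    }
}
int overwritten = File.ReadAllLines(path).Length;

// Variant 2: one writer created before the page loop. Every page lands in the same file.
using (var sharedWriter = new StreamWriter(path))
{
    foreach (var words in pages)
        foreach (var word in words)
            sharedWriter.WriteLine(word);
}
int combined = File.ReadAllLines(path).Length;

// Variant 3: a new writer per page again, but opened in append mode.
File.Delete(path);
foreach (var words in pages)
{
    using (var appendWriter = new StreamWriter(path, append: true))
    {
        foreach (var word in words)
            appendWriter.WriteLine(word);
    }
}
int appended = File.ReadAllLines(path).Length;

File.Delete(path);

// Prints: overwrite per page: 2, single writer: 5, append per page: 5
Console.WriteLine($"overwrite per page: {overwritten}, single writer: {combined}, append per page: {appended}");
```

Variant 1 matches the symptom in this thread: the file ends up holding only the final page. Variant 2 is usually the simpler fix; variant 3 works too, but it keeps adding to whatever is already in the file from earlier runs unless the file is deleted first.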
Henkypenky · 2y ago
you need to not "using" the StreamWriter per page, so:
create the StreamWriter
do the page foreach
do the lines foreach
then flush, close and dispose the writer
antimatter8189 (OP) · 2y ago
Will do boss, thanks! Helped a lot, wouldn't have caught it for hours by myself probably
Henkypenky · 2y ago
dw, glad we could help
codesandplays · 2y ago
bro your wordsInPage count is off too
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using UglyToad.PdfPig;

// Collect the words of every page into one list.
List<string> totalWords = new List<string>();

using (PdfDocument document = PdfDocument.Open(@"/Users/raghu/samplepdf.pdf"))
{
    var wordsToIntercept = new List<string>();
    var pages = document.GetPages().ToList();
    foreach (var page in pages)
    {
        var wordsInPage = page.GetWords().Select(w => w.Text).ToList();
        foreach (var word in wordsInPage)
        {
            totalWords.Add(word);
        }
    }
}

// Open the output file once, after all pages have been read.
using (StreamWriter outputFile = new StreamWriter("/Users/raghu/myfile.txt"))
{
    int itemno = 1;
    foreach (string line in totalWords)
    {
        // Number each word so the total is easy to check.
        Console.WriteLine(itemno + ":" + line);
        outputFile.WriteLine(itemno + ":" + line);
        itemno += 1;
    }
}
Henkypenky · 2y ago
try this
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using UglyToad.PdfPig;

using (PdfDocument document = PdfDocument.Open(@"C:\Users\Foo\Desktop\ad9361.pdf"))
{
    var wordsToIntercept = new List<string>();
    var pages = document.GetPages().ToList();

    string docPath =
        Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);

    // Create the writer once, before the page loop, so every page goes into the same file.
    using (StreamWriter outputFile = new StreamWriter(Path.Combine(docPath, "WriteLines.txt")))
    {
        foreach (var page in pages)
        {
            var wordsInPage = page.GetWords().Select(w => w.Text).ToList();
            var contains = wordsInPage.Contains("TJMAX");
            foreach (var wordd in wordsInPage)
            {
                Console.WriteLine(wordd);
            }

            foreach (string line in wordsInPage)
            {
                outputFile.WriteLine(line.ToUpper());
            }
        }
    }
}