C#3y ago
LukeZurg22

CSV Invalid Header Reading [Answered]

I am currently reading a CSV with a header of "province;red;green;blue;name;x". However, my program expects to read CSVs whose rows are strictly int;int;int;int;string;x, as opposed to (oops!) all strings, as in that header. I cannot change the CSV; that is not allowed under any circumstance. I need to handle invalid entries like this, but I am uncertain how, especially in a way that won't bog down my program.
48 Replies
LukeZurg22
LukeZurg22OP3y ago
This is the given context in code:
public void PopulateAppendProvinceCSVData()
{
    var cfg = new CsvConfiguration(CultureInfo.InvariantCulture)
    {
        Delimiter = ";",
        HasHeaderRecord = false,
    };
    using var reader = new StreamReader(Directories.DefinitionCSVPath);
    using var csv = new CsvReader(reader, cfg);
    csv.Context.RegisterClassMap<MainCsvIndexSyntax>();
    var records = csv.GetRecords<ProvinceCSVDefinition>();
    {
        Parallel.ForEach(records, record =>
        {
            if (!string.IsNullOrWhiteSpace(record.provinceID.ToString()))
            {
                if (Provinces.ContainsKey(Convert.ToUInt32(record.provinceID)))
                    Provinces[Convert.ToUInt32(record.provinceID)].AppendFromCSV(record);
            }
        });
    }
    csv.Dispose();
}
I forgot to mention the error specifics,
System.AggregateException: 'One or more errors occurred. (The conversion cannot be performed.
Text: 'province'
MemberType: System.Int32
TypeConverter: 'CsvHelper.TypeConversion.Int32Converter'
IReader state:
ColumnCount: 0
CurrentIndex: 0
HeaderRecord:

IParser state:
ByteCount: 0
CharCount: 29
Row: 1
RawRow: 1
Count: 6
RawRecord:
province;red;green;blue;x;x
)'
province;red;green;blue;x;x as Row 0 (the first row) is obviously the problem, but since I am not allowing myself to physically change the file, I am at a loss.
mtreit
mtreit3y ago
Just skip the first line. Advance your StreamReader by one line before passing it to the CsvReader code (probably; I don't know what the CsvReader does... presumably it doesn't do something dumb like seek back to the origin).
FroH.LVT
FroH.LVT3y ago
var cfg = new CsvConfiguration(CultureInfo.InvariantCulture)
{
    Delimiter = ";",
    HasHeaderRecord = false, // Remove this
};
Your CSV has a header, so why set HasHeaderRecord = false?
mtreit
mtreit3y ago
heh, I didn't even notice that
LukeZurg22
LukeZurg22OP3y ago
Most files don't have province;red;green;blue;x;x; this is more or less just handling the exception. I figured I needed to skip the first line, but with this reader I'm not entirely sure how to detect when exactly I need to skip it, or when a CSV file's first line is perfectly readable as is.
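One way to detect whether the first line needs skipping, sketched with only the standard library: read the first line, test whether its first field parses as an integer, and skip that line only when it does not. The names `CsvHeaderProbe`, `LooksLikeHeader`, and `OpenAtFirstDataRow` are hypothetical, and the sketch assumes that a non-numeric first field always means a header row.

```csharp
#nullable enable
using System.Globalization;
using System.IO;

public static class CsvHeaderProbe
{
    // A row counts as a header when its first field is not an integer,
    // e.g. "province;red;green;blue;x;x" versus a made-up data row
    // such as "1;128;34;64;Stockholm;x".
    public static bool LooksLikeHeader(string? firstLine, char delimiter = ';')
    {
        if (string.IsNullOrWhiteSpace(firstLine)) return false;
        string firstField = firstLine.Split(delimiter)[0].Trim();
        return !int.TryParse(firstField, NumberStyles.Integer,
                             CultureInfo.InvariantCulture, out _);
    }

    // Opens the file and returns a reader positioned at the first data row.
    // The file is opened twice so that the stream never has to be rewound.
    public static StreamReader OpenAtFirstDataRow(string path)
    {
        bool hasHeader;
        using (var probe = new StreamReader(path))
        {
            hasHeader = LooksLikeHeader(probe.ReadLine());
        }

        var reader = new StreamReader(path);
        if (hasHeader)
        {
            reader.ReadLine(); // consume the header line
        }
        return reader;
    }
}
```

The returned reader could then be handed to `new CsvReader(reader, cfg)` with `HasHeaderRecord = false`, so the check runs once per file rather than once per record.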
FroH.LVT
FroH.LVT3y ago
Try using BadDataFound. If your CSV is good, you can just set it to null. You might get a blank record for the header line, if I'm not wrong.
LukeZurg22
LukeZurg22OP3y ago
Tried. Doesn't have any effect.
FroH.LVT
FroH.LVT3y ago
it still throws error?
LukeZurg22
LukeZurg22OP3y ago
Yeah.
FroH.LVT
FroH.LVT3y ago
At which line? Is it a CsvHelper exception or your code?
LukeZurg22
LukeZurg22OP3y ago
Line? I'm not sure, 1 second.
FroH.LVT
FroH.LVT3y ago
can you show your CSVConfiguration?
LukeZurg22
LukeZurg22OP3y ago
Yepyep.
var cfg = new CsvConfiguration(CultureInfo.InvariantCulture)
{
    Delimiter = ";",
    HasHeaderRecord = false,
    BadDataFound = x => Console.WriteLine($"Bad data: <{x.RawRecord}>")
};

using var reader = new StreamReader(Directories.DefinitionCSVPath);
using var csv = new CsvReader(reader, cfg);
csv.Context.RegisterClassMap<MainCsvIndexSyntax>();
var records = csv.GetRecords<ProvinceCSVDefinition>();
FroH.LVT
FroH.LVT3y ago
Hmm, have you tried debugging and looking at the records value before your Parallel loop? Is it an error? I mean, check the records value there. The error might happen because you're trying to convert those header fields to int inside your Parallel.ForEach. You need to clean them first or put in a check.
LukeZurg22
LukeZurg22OP3y ago
I cannot get the record's value, but I'm certain I know what it is, because the first record is province;red;green;blue;x. Formatted, this amounts to string;string;string;string;string;x, which is very bad, because I need int;int;int;int;string;x. I'm certain it's the first record that's the problem, given that the error kindly notes it. records shows null, due to the aforementioned bad record entry.
FroH.LVT
FroH.LVT3y ago
I mean you can check whether your record is valid before processing it:
Parallel.ForEach(records, record =>
{
    // Pseudocode: replace "record is not valid" with your own condition.
    if (record is not valid) return;

    if (!string.IsNullOrWhiteSpace(record.provinceID.ToString()))
    {
        if (Provinces.ContainsKey(Convert.ToUInt32(record.provinceID)))
            Provinces[Convert.ToUInt32(record.provinceID)].AppendFromCSV(record);
    }
});
LukeZurg22
LukeZurg22OP3y ago
is not valid?
FroH.LVT
FroH.LVT3y ago
put your own condition there
LukeZurg22
LukeZurg22OP3y ago
I'm not sure what condition to put there. An invalid record only exists at the beginning as well.
FroH.LVT
FroH.LVT3y ago
do you know int.TryParse?
LukeZurg22
LukeZurg22OP3y ago
Yes
FroH.LVT
FroH.LVT3y ago
Let's check whether it can parse provinceID as an int successfully; if so, continue, else bye bye.
LukeZurg22
LukeZurg22OP3y ago
Regardless, this check would be running 3000+ times when the problematic record only occurs once. It just feels... weird to me.
FroH.LVT
FroH.LVT3y ago
Er, remove "record is not". Does it still show the error?
LukeZurg22
LukeZurg22OP3y ago
The error is within the TryParse. Removing "record is not" has had no effect.
FroH.LVT
FroH.LVT3y ago
zzz. You have to remove that; it makes the syntax wrong. Take a look at it, man, calm down.
LukeZurg22
LukeZurg22OP3y ago
? Alright, if (int.TryParse(Convert.ToString(record.provinceID), out int value)) return; does not spit out an error. However, during testing, the same error still appears.
FroH.LVT
FroH.LVT3y ago
Instead of BadDataFound = x => Console.WriteLine($"Bad data: <{x.RawRecord}>"), set BadDataFound = null, then try a normal for loop, debug, and see.
LukeZurg22
LukeZurg22OP3y ago
A normal for-loop? Alrighty then.
FroH.LVT
FroH.LVT3y ago
I don't have much experience with Parallel.ForEach, so I can't tell.
LukeZurg22
LukeZurg22OP3y ago
From my understanding, this use of Parallel is just a for loop that is split between different threads. Meant to be faster, and boy, it sure is.
LukeZurg22
LukeZurg22OP3y ago
The same problem sadly still occurs. I deduce that it's within csv.GetRecords() that it actually begins to read the first line, and since it can't understand what that line is supposed to be, as its types are all out of whack, it throws a fit when records is actually used.
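That deduction matches how CsvHelper behaves: GetRecords returns a lazy sequence, so each row is converted only when the sequence is enumerated, and a check on record.provinceID inside the loop runs too late because the record has already failed to build. A minimal sketch of reading rows by hand and testing the raw first field before conversion follows. `ProvinceRow` and `ProvinceCsv.ReadRows` are hypothetical stand-ins, since ProvinceCSVDefinition and its class map are not shown in the thread.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using CsvHelper;
using CsvHelper.Configuration;
using CsvHelper.Configuration.Attributes;

// Hypothetical stand-in for ProvinceCSVDefinition.
public class ProvinceRow
{
    [Index(0)] public int ProvinceId { get; set; }
    [Index(1)] public int Red { get; set; }
    [Index(2)] public int Green { get; set; }
    [Index(3)] public int Blue { get; set; }
    [Index(4)] public string Name { get; set; } = "";
}

public static class ProvinceCsv
{
    public static List<ProvinceRow> ReadRows(TextReader input)
    {
        var cfg = new CsvConfiguration(CultureInfo.InvariantCulture)
        {
            Delimiter = ";",
            HasHeaderRecord = false,
        };

        var rows = new List<ProvinceRow>();
        using var csv = new CsvReader(input, cfg);
        while (csv.Read())
        {
            // Skip any row whose first field is not an integer,
            // which covers the stray header line.
            if (!csv.TryGetField<int>(0, out _)) continue;

            rows.Add(csv.GetRecord<ProvinceRow>());
        }
        return rows;
    }
}
```

Because the rows end up in a list, they could still be passed to Parallel.ForEach afterwards; the extra TryGetField call per row is one integer parse, which is small next to the file read itself.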
FroH.LVT
FroH.LVT3y ago
right click this, add watch
LukeZurg22
LukeZurg22OP3y ago
Ayeye. No option given to add watch.
FroH.LVT
FroH.LVT3y ago
hmhm what is there?
LukeZurg22
LukeZurg22OP3y ago
Alright, just got it sorted. Had to do it the old-fashioned way. Frankly I don't think cfg would be changing anytime soon, but here's what I got:
- cfg {CsvConfiguration { AllowComments = False, BadDataFound = CsvHelper.BadDataFound, BufferSize = 4096, CacheFields = False, Comment = #, CountBytes = False, CultureInfo = , Delimiter = ;, DetectDelimiter = False, DetectDelimiterValues = System.String[], DetectColumnCountChanges = False, DynamicPropertySort = , Encoding = System.Text.UTF8Encoding+UTF8EncodingSealed, Escape = ", ExceptionMessagesContainRawData = True, GetConstructor = CsvHelper.GetConstructor, GetDynamicPropertyName = CsvHelper.GetDynamicPropertyName, HasHeaderRecord = False, HeaderValidated = CsvHelper.HeaderValidated, IgnoreBlankLines = True, IgnoreReferences = False, IncludePrivateMembers = False, InjectionCharacters = System.Char[], InjectionEscapeCharacter = , IsNewLineSet = False, LeaveOpen = False, LineBreakInQuotedFieldIsBadData = False, MemberTypes = Properties, MissingFieldFound = CsvHelper.MissingFieldFound, Mode = RFC4180, NewLine =
, PrepareHeaderForMatch = CsvHelper.PrepareHeaderForMatch, ProcessFieldBufferSize = 1024, Quote = ", ReadingExceptionOccurred = CsvHelper.ReadingExceptionOccurred, ReferenceHeaderPrefix = , SanitizeForInjection = False, ShouldQuote = CsvHelper.ShouldQuote, ShouldSkipRecord = , ShouldUseConstructorParameters = CsvHelper.ShouldUseConstructorParameters, TrimOptions = None, UseNewObjectForNullReferenceMembers = True, WhiteSpaceChars = System.Char[] }} CsvHelper.Configuration.CsvConfiguration
FroH.LVT
FroH.LVT3y ago
BadDataFound = CsvHelper.BadDataFound. You didn't set it to null?
LukeZurg22
LukeZurg22OP3y ago
Ah, I had, but it made no difference.
FroH.LVT
FroH.LVT3y ago
That is weird, because setting BadDataFound = null will ignore the error and give you a blank record, if I'm not wrong. It worked fine for me before.
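A likely reason it made no difference: in CsvHelper, BadDataFound is invoked for malformed field data such as a stray quote in an unquoted field, whereas the error in this thread is a type conversion failure, which is reported through ReadingExceptionOccurred instead. The sketch below assumes a CsvHelper version in which that callback receives a ReadingExceptionOccurredArgs (the cfg dump above lists ReadingExceptionOccurred as a delegate); `ColourRow` and `TolerantCsv` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using CsvHelper;
using CsvHelper.Configuration;
using CsvHelper.Configuration.Attributes;
using CsvHelper.TypeConversion;

// Hypothetical record type used only for this sketch.
public class ColourRow
{
    [Index(0)] public int ProvinceId { get; set; }
    [Index(1)] public int Red { get; set; }
}

public static class TolerantCsv
{
    public static List<ColourRow> Read(TextReader input)
    {
        var cfg = new CsvConfiguration(CultureInfo.InvariantCulture)
        {
            Delimiter = ";",
            HasHeaderRecord = false,
            // Returning false skips a row that failed type conversion;
            // returning true lets any other reading error be thrown.
            ReadingExceptionOccurred = args => !(args.Exception is TypeConverterException),
        };

        using var csv = new CsvReader(input, cfg);
        // ToList forces enumeration while the reader is still open.
        return csv.GetRecords<ColourRow>().ToList();
    }
}
```

The trade-off is that this also silently drops any later data row that fails conversion, so it suits files where the header is the only expected offender.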
LukeZurg22
LukeZurg22OP3y ago
Strange. To me it seems like the crash stems from var records = csv.GetRecords<ProvinceCSVDefinition>(); since cfg does not change at any point, but records is certainly not meant to be null.
FroH.LVT
FroH.LVT3y ago
No, I didn't mean cfg would change, but cfg will affect the behavior of the CsvReader.
LukeZurg22
LukeZurg22OP3y ago
I think I may be getting in far over my head in trying to optimize something I shouldn't. I've sorted it out and fixed it. Turns out there's no real easy way to manage a CSV file like that with the way I had it, and I am going to have to contend with directly accessing the CSV variables from within ProvinceFile.
Accord
Accord3y ago
✅ This post has been marked as answered!