C#, 3w ago
Victor H

Parsing data from several sources into a common format.

I have several data sources that I get data from that I must parse into a common format. Say this is the common format (simplified example):
public record ProductDto(
    string ProductId,
    Price Price
);

public abstract record Price(decimal Value);
public sealed record FixedPrice(decimal Value) : Price(Value);
public sealed record PricePerUnit(decimal Value, string Unit) : Price(Value);
One approach would be to set up something like a public record Source1ProductDto for each data source, annotate its properties with JsonPropertyName, and let System.Text.Json.JsonSerializer handle the mapping. However, for more complex types like Price I might need to combine data from several fields of the original data to build my custom format. My current approach is a kind of strategy pattern per field, using this interface:
public interface IFieldParser<out T>
{
    T Parse(JsonElement json);
}
Then I do this (parsing simplified for illustration purposes):
public class Source1IdParser : IFieldParser<string>
{
    public string Parse(JsonElement json)
    {
        // Here it might be called "id"
        return json.GetProperty("id").GetString() ?? string.Empty;
    }
}

public class Source2IdParser : IFieldParser<string>
{
    public string Parse(JsonElement json)
    {
        // Here it might be called "productCode"
        return json.GetProperty("productCode").GetString() ?? string.Empty;
    }
}

public class Source1PriceParser : IFieldParser<Price>
{
    public Price Parse(JsonElement json)
    {
        // An empty or null "unit" means a fixed price; otherwise the price is per unit.
        var unit = json.GetProperty("unit").GetString() ?? "";
        var amount = json.GetProperty("price").GetString() ?? "";
        return unit == ""
            ? new FixedPrice(Convert.ToDecimal(amount))
            : new PricePerUnit(Convert.ToDecimal(amount), unit);
    }
}
You get the idea for Source2PriceParser. Can you give me advice on how to improve this? Possibly how to leverage .NET's strengths better.
17 Replies
Victor H (OP), 3w ago
Forgot to mention, from these IFieldParsers I can create parsers for each different source easily, and even test them quite easily which I guess is a benefit.
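A minimal sketch of how per-field parsers might be combined into one parser per source. The names ProductParser and LambdaFieldParser are hypothetical and not from the thread; the records and IFieldParser are repeated only so the snippet compiles on its own.

```csharp
using System;
using System.Text.Json;

public abstract record Price(decimal Value);
public sealed record FixedPrice(decimal Value) : Price(Value);
public sealed record PricePerUnit(decimal Value, string Unit) : Price(Value);
public record ProductDto(string ProductId, Price Price);

public interface IFieldParser<out T>
{
    T Parse(JsonElement json);
}

// Hypothetical helper: wraps a delegate so small field parsers (and test stubs)
// do not each need their own class.
public sealed class LambdaFieldParser<T> : IFieldParser<T>
{
    private readonly Func<JsonElement, T> _parse;

    public LambdaFieldParser(Func<JsonElement, T> parse) => _parse = parse;

    public T Parse(JsonElement json) => _parse(json);
}

// Hypothetical composite: one instance per source, built from that source's field parsers.
public sealed class ProductParser
{
    private readonly IFieldParser<string> _id;
    private readonly IFieldParser<Price> _price;

    public ProductParser(IFieldParser<string> id, IFieldParser<Price> price)
    {
        _id = id;
        _price = price;
    }

    // Every field parser receives the same source element and picks out what it needs.
    public ProductDto Parse(JsonElement json) =>
        new ProductDto(_id.Parse(json), _price.Parse(json));
}
```

With this shape, a source would be wired up as something like new ProductParser(new Source1IdParser(), new Source1PriceParser()), and each field parser can be unit-tested against a small JsonElement on its own.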
Mąż Zuzanny Harmider Szczęście
what would be the sources in this case?
Victor H (OP), 3w ago
JSON sources, but I could parametrize that too I suppose - but currently not necessary.
Pobiega, 3w ago
Assuming all your sources provide a string, I'd probably just make an interface along the lines of
public interface IProductProvider
{
    public ProductDto FromString(string input);
}
and let each source determine whether it's JSON, YAML, a file, or whatever. It feels a bit overkill to make a parser for each source and field. You only really care about it giving you a full, valid ProductDto, no?
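A rough sketch of what one such provider could look like for a JSON source. Source1ProductProvider is a hypothetical name, and the flat shape {"id": ..., "price": ..., "unit": ...} with "unit" treated as optional is an assumption based on the field names used earlier in the thread.

```csharp
using System;
using System.Globalization;
using System.Text.Json;

public abstract record Price(decimal Value);
public sealed record FixedPrice(decimal Value) : Price(Value);
public sealed record PricePerUnit(decimal Value, string Unit) : Price(Value);
public record ProductDto(string ProductId, Price Price);

public interface IProductProvider
{
    ProductDto FromString(string input);
}

// Hypothetical provider for source 1; all source-specific knowledge lives in one place.
public sealed class Source1ProductProvider : IProductProvider
{
    public ProductDto FromString(string input)
    {
        using var doc = JsonDocument.Parse(input);
        var root = doc.RootElement;

        var id = root.GetProperty("id").GetString() ?? string.Empty;

        // The price arrives as a string; parse it culture-independently.
        var rawAmount = root.GetProperty("price").GetString()
            ?? throw new FormatException("\"price\" must not be null.");
        var amount = decimal.Parse(rawAmount, CultureInfo.InvariantCulture);

        // Assumption: a missing, null, or empty "unit" means a fixed price.
        var unit = root.TryGetProperty("unit", out var u) ? u.GetString() : null;

        Price price = string.IsNullOrEmpty(unit)
            ? new FixedPrice(amount)
            : new PricePerUnit(amount, unit);

        return new ProductDto(id, price);
    }
}
```

A second source would get its own class implementing the same interface, so callers only ever depend on IProductProvider.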
Victor H (OP), 3w ago
Thanks for your response! Yeah, I also thought about one parser per source rather than one per field, and I do believe that might be the better approach too.
Pobiega, 3w ago
"Better" is too subjective, but "simpler" is often the way to go 🙂 Per-field makes sense if you need to be able to mix and match, like with a parser combinator.
Victor H (OP), 3w ago
Yeah, that is what I thought I might need at first. But it is probably overkill. It's one of those YAGNI moments hehe
Pobiega, 3w ago
That's my current opinion too, based on what I've seen, yep
Victor H (OP), 3w ago
Do you have any suggestions on how I can make System.Text.Json do as much of the lifting for me as possible? I'm not well versed in .NET, to be honest, but I did really enjoy the fact that I can extract a field very easily by adding [JsonPropertyName(string)]
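For illustration, one way the attribute-based approach might look: a per-source record that mirrors the wire format, plus a small mapping step for the Price logic. Source1ProductDto is the name suggested in the question; the property names RawPrice and Unit, the ToProduct method, and the assumed JSON shape are hypothetical.

```csharp
using System.Globalization;
using System.Text.Json;
using System.Text.Json.Serialization;

public abstract record Price(decimal Value);
public sealed record FixedPrice(decimal Value) : Price(Value);
public sealed record PricePerUnit(decimal Value, string Unit) : Price(Value);
public record ProductDto(string ProductId, Price Price);

// Mirrors the assumed shape of source 1: {"id": "...", "price": "...", "unit": "..."}.
// System.Text.Json fills this in; no manual GetProperty calls are needed.
public record Source1ProductDto(
    [property: JsonPropertyName("id")] string Id,
    [property: JsonPropertyName("price")] string RawPrice,
    [property: JsonPropertyName("unit")] string? Unit = null)
{
    // The only hand-written part: combining two source fields into one Price.
    public ProductDto ToProduct()
    {
        var amount = decimal.Parse(RawPrice, CultureInfo.InvariantCulture);
        Price price = string.IsNullOrEmpty(Unit)
            ? new FixedPrice(amount)
            : new PricePerUnit(amount, Unit);
        return new ProductDto(Id, price);
    }
}
```

Usage would then be along the lines of JsonSerializer.Deserialize<Source1ProductDto>(jsonString)?.ToProduct(), with one such record per source.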
Pobiega, 3w ago
hm, depends a bit on how complicated you want to get
Victor H (OP), 3w ago
Maybe I could turn this:
public class Source1PriceParser : IFieldParser<Price>
{
    public Price Parse(JsonElement json)
    {
        var unit = json.GetProperty("unit").GetString() ?? "";
        var amount = json.GetProperty("price").GetString() ?? "";
        return unit == ""
            ? new FixedPrice(Convert.ToDecimal(amount))
            : new PricePerUnit(Convert.ToDecimal(amount), unit);
    }
}
into this:
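public class Source1PriceParser : IFieldParser<Price>
{
    // Private DTO that mirrors the price-related fields of source 1.
    private record PriceDto(
        [property: JsonPropertyName("unit")] string? Unit,
        [property: JsonPropertyName("price")] string Price
    );

    public Price Parse(JsonElement json)
    {
        // A JsonElement can be deserialized directly via the Deserialize extension method (.NET 6+).
        var dto = json.Deserialize<PriceDto>()
            ?? throw new JsonException("Expected a JSON object with price fields.");
        return string.IsNullOrEmpty(dto.Unit)
            ? new FixedPrice(Convert.ToDecimal(dto.Price))
            : new PricePerUnit(Convert.ToDecimal(dto.Price), dto.Unit);
    }
}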
Pobiega, 3w ago
since you have 2 "types" of Price, and seemingly no discriminator, you'll need a custom converter, or do it via JsonElement/JsonObject
Victor H (OP), 3w ago
(I used the IFieldParser here still because it was just easier to copy the code)
Pobiega, 3w ago
So yeah, that second snippet there could work, but that involves doing quite a bit of "manual" JSON work. You could make it work without Parse, like simply JsonSerializer.Deserialize<ProductDto>(jsonString);, assuming you feed the serializer your custom converter that does the checking for the unit prop. Just look up "system text json custom converter". If your goal is to learn more about STJ, that's the way to go. If you just want this to work, JsonElement is okay.
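As a sketch of the custom converter route, assuming source 1 delivers a flat object with "id", "price" (as a string), and an optional "unit". Source1ProductConverter is a hypothetical name; JsonConverter<T>, Utf8JsonReader, and JsonDocument.ParseValue are standard System.Text.Json APIs. Buffering the value into a JsonDocument is a simplification compared to reading tokens one by one.

```csharp
using System;
using System.Globalization;
using System.Text.Json;
using System.Text.Json.Serialization;

public abstract record Price(decimal Value);
public sealed record FixedPrice(decimal Value) : Price(Value);
public sealed record PricePerUnit(decimal Value, string Unit) : Price(Value);
public record ProductDto(string ProductId, Price Price);

// Hypothetical converter for source 1 only; another source would get its own converter.
public sealed class Source1ProductConverter : JsonConverter<ProductDto>
{
    public override ProductDto Read(
        ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        // Buffer the current JSON value so properties can be looked up by name.
        using var doc = JsonDocument.ParseValue(ref reader);
        var root = doc.RootElement;

        var id = root.GetProperty("id").GetString() ?? string.Empty;
        var rawAmount = root.GetProperty("price").GetString()
            ?? throw new JsonException("\"price\" must not be null.");
        var amount = decimal.Parse(rawAmount, CultureInfo.InvariantCulture);

        // The "unit" check: no unit (missing, null, or empty) means a fixed price.
        var unit = root.TryGetProperty("unit", out var u) ? u.GetString() : null;
        Price price = string.IsNullOrEmpty(unit)
            ? new FixedPrice(amount)
            : new PricePerUnit(amount, unit);

        return new ProductDto(id, price);
    }

    // Writing is out of scope for this sketch.
    public override void Write(
        Utf8JsonWriter writer, ProductDto value, JsonSerializerOptions options) =>
        throw new NotSupportedException("This sketch only covers reading.");
}
```

The converter is registered through JsonSerializerOptions.Converters, after which JsonSerializer.Deserialize<ProductDto>(jsonString, options) produces the common format directly. Using one options instance per source keeps each source's converter separate.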
Victor H (OP), 3w ago
So the custom JSON converter from STJ (System.Text.Json, I assume) is the preferred way? Oh yeah, I've seen this one with the Utf8JsonReader.
Pobiega, 3w ago
Preferred? I'd say it depends on whether you want to integrate this with STJ in general. If ProductDto could suddenly be part of a larger JSON structure, then that would make more sense, but it also makes it a bit messier with multiple providers. Actually no, since you'd have a custom converter per provider.
Victor H (OP), 3w ago
Thanks, I will try!