C
C#14mo ago
joren

Data seeding database

So currently in my program.cs I have the following code:
// Seed the database, disable once done.
using (var scope = app.Services.CreateScope())
{
var services = scope.ServiceProvider;
try
{
DataContext context = services.GetRequiredService<DataContext>();

if(context.Database.EnsureCreated())
{
var dataSeeder = new DataSeeder(context);
dataSeeder.SeedData();
}
}
catch (Exception ex)
{
Console.WriteLine($"Error occurred while seeding the database: {ex.Message}");
}
}
// Seed the database, disable once done.
using (var scope = app.Services.CreateScope())
{
var services = scope.ServiceProvider;
try
{
DataContext context = services.GetRequiredService<DataContext>();

if(context.Database.EnsureCreated())
{
var dataSeeder = new DataSeeder(context);
dataSeeder.SeedData();
}
}
catch (Exception ex)
{
Console.WriteLine($"Error occurred while seeding the database: {ex.Message}");
}
}
Now, seeding only makes sense in a development environment and only has to be executed once. So would this be a proper way to go about it? My DataSeeder class basically creates objects of all my models, gives them mock values, and that I save to the database.
40 Replies
joren
jorenOP14mo ago
To my understanding, once I call SaveChanges() on my DataContext object, that's when they will staged for the next migration, please correct me if im wrong. For instance in my
public void SeedData()
{
SeedPokemonOwners();
_context.SaveChanges();
}
public void SeedData()
{
SeedPokemonOwners();
_context.SaveChanges();
}
The moment I do _context.SaveChanges();, all the changes I pushed to the context with: _context.TableName.AddRange( /* data here */ ) gets staged for the next migrations.
Baturhan
Baturhan14mo ago
Hello @joren, you can check your enviorenment in program.cs using app.Environment.IsDevelopment() if this returns true you are in development, if the case is only debuggin #if DEBUG #end if you can use this. So when you release the application your enviorenment is going to be production(or the setting you set before) and the db is not going to be seed anymore Savechanges method just redirect your query to the db, it's not related with migration at all
joren
jorenOP14mo ago
Migrations are strictly about model changes and its relations and nothing to do with the values inside those tables then I assume would make sense actually, as seeding wouldnt be migrating anything what dictates the value it returns, the profile we use to run it ?
joren
jorenOP14mo ago
so this one, in release it would be False then I assume
No description
joren
jorenOP14mo ago
Eh, I guess its the launchSettings.json that dictates it
{
"profiles": {
"http": {
"commandName": "Project",
"dotnetRunMessages": true,
"launchBrowser": true,
"launchUrl": "swagger",
"applicationUrl": "http://localhost:5116",
"environmentVariables": {
"ASPNETCORE_ENVIRONMENT": "Development"
}
},
"https": {
"commandName": "Project",
"dotnetRunMessages": true,
"launchBrowser": true,
"launchUrl": "swagger",
"applicationUrl": "https://localhost:7048;http://localhost:5116",
"environmentVariables": {
"ASPNETCORE_ENVIRONMENT": "Development"
}
},
}
{
"profiles": {
"http": {
"commandName": "Project",
"dotnetRunMessages": true,
"launchBrowser": true,
"launchUrl": "swagger",
"applicationUrl": "http://localhost:5116",
"environmentVariables": {
"ASPNETCORE_ENVIRONMENT": "Development"
}
},
"https": {
"commandName": "Project",
"dotnetRunMessages": true,
"launchBrowser": true,
"launchUrl": "swagger",
"applicationUrl": "https://localhost:7048;http://localhost:5116",
"environmentVariables": {
"ASPNETCORE_ENVIRONMENT": "Development"
}
},
}
Baturhan
Baturhan14mo ago
Yes, there is a environment variables you need to use
No description
joren
jorenOP14mo ago
"ASPNETCORE_ENVIRONMENT": "Development" Makes sense, so lets say its on development as it is now, would it try to seed it every time I run it ? I assume so, which would make the startup time pretty inefficient I might just add some compile time constant, that I can flip from True to False to disable seeding after the initial seeding locally...
Baturhan
Baturhan14mo ago
if that variable is set to development yes, If you don't want ta seed again and again, you could check the data with "any" method. If you try to execute your program in release mode it should be changed I guess, let me check it
joren
jorenOP14mo ago
Any() would be on a table, what if seeding was done manually and therefore is not 100% complete, then doing it on just one table would be wrong you'd have to do a .Any() check on each table, to see if it contains any data, if one of them does not, just seed it, else skip it Would be cool if I could somehow iterate over the types in the Models folder, call .Any() on each, not sure if thats achievable in C#. It has reflection, so it is doable I just checked
Baturhan
Baturhan14mo ago
are you even trying to execute the db commands parallel? Any() just returns, if there is any data in the table. IF you added it it will true, thus you dont need to add any data to seed and you will return/continue
joren
jorenOP14mo ago
I mean prior to seeding, when the program starts, lets assume the db is seeded already
Baturhan
Baturhan14mo ago
Even if you add your data manually when thread comes to Any() method it's going to return true or false according to it
joren
jorenOP14mo ago
I want to check if it's seeding yes, but that'd mean I'd have to call .Any on each table individually imagine you have 100 tables, no one will call .Any on each table to check if it has data. prior to seeding I could I guess, but then it always goes into the proces of seeding. Then I assume it should be seeded, and actually seed the table if its not populated
Baturhan
Baturhan14mo ago
Yes unfortunetly. Well in that case If you are adding any data to the 100 tables manually you already have long code, an any method wouldn't hurt ya? you can register your code to the applicationstart event, thus it's absolutely going to work once. it'd be useless if you define a variable to the program cs because it only works once already
joren
jorenOP14mo ago
yeah true, one more thing thats bugging me:
joren
jorenOP14mo ago
No description
joren
jorenOP14mo ago
got these models right, pretty standard stuff my current seeder looks like this:
public class DataSeeder
{
private readonly DataContext _context;

public DataSeeder(DataContext context)
{
_context = context;
}

public void SeedData()
{
SeedPokemonOwners();
_context.SaveChanges();
}

private void SeedPokemonOwners()
{
if (_context.PokemonOwners.Any())
{
Console.WriteLine("PokemonOwners data already exists. Skipping seeding.");
return;
}

var pokemonOwners = GetSamplePokemonOwners();
_context.PokemonOwners.AddRange(pokemonOwners);
}
public class DataSeeder
{
private readonly DataContext _context;

public DataSeeder(DataContext context)
{
_context = context;
}

public void SeedData()
{
SeedPokemonOwners();
_context.SaveChanges();
}

private void SeedPokemonOwners()
{
if (_context.PokemonOwners.Any())
{
Console.WriteLine("PokemonOwners data already exists. Skipping seeding.");
return;
}

var pokemonOwners = GetSamplePokemonOwners();
_context.PokemonOwners.AddRange(pokemonOwners);
}
private List<PokemonOwner> GetSamplePokemonOwners()
{
return new List<PokemonOwner>()
{
CreatePokemonOwner("Pikachu", "Jack London", "Kanto", "Electric", "Teddy", "Smith"),
CreatePokemonOwner("Squirtle", "Harry Potter", "Saffron City", "Water", "Teddy", "Smith"),
CreatePokemonOwner("Venasaur", "Ash Ketchum", "Millet Town", "Leaf", "Teddy", "Smith"),
};
}

private PokemonOwner CreatePokemonOwner(string pokemonName, string ownerName, string countryName,
string categoryName, string reviewerFirstName, string reviewerLastName)
{
var country = new Country() { Name = countryName };
var owner = new Owner() { FirstName = ownerName.Split()[0], LastName = ownerName.Split()[1], Gym = $"{ownerName}'s Gym", Country = country };
var category = new Category() { Name = categoryName };
var reviewer = new Reviewer() { FirstName = reviewerFirstName, LastName = reviewerLastName };
var pokemon = new Pokemon()
{
Name = pokemonName,
BirthDate = new DateTime(1903, 1, 1),
PokemonCategories = new List<PokemonCategory>() { new PokemonCategory { Category = category } },
Reviews = new List<Review>()
{
new Review { Title = pokemonName, Text = $"{pokemonName} is the best pokemon, because it is {categoryName}", Reviewer = reviewer }
}
};

return new PokemonOwner() { Pokemon = pokemon, Owner = owner };
}
private List<PokemonOwner> GetSamplePokemonOwners()
{
return new List<PokemonOwner>()
{
CreatePokemonOwner("Pikachu", "Jack London", "Kanto", "Electric", "Teddy", "Smith"),
CreatePokemonOwner("Squirtle", "Harry Potter", "Saffron City", "Water", "Teddy", "Smith"),
CreatePokemonOwner("Venasaur", "Ash Ketchum", "Millet Town", "Leaf", "Teddy", "Smith"),
};
}

private PokemonOwner CreatePokemonOwner(string pokemonName, string ownerName, string countryName,
string categoryName, string reviewerFirstName, string reviewerLastName)
{
var country = new Country() { Name = countryName };
var owner = new Owner() { FirstName = ownerName.Split()[0], LastName = ownerName.Split()[1], Gym = $"{ownerName}'s Gym", Country = country };
var category = new Category() { Name = categoryName };
var reviewer = new Reviewer() { FirstName = reviewerFirstName, LastName = reviewerLastName };
var pokemon = new Pokemon()
{
Name = pokemonName,
BirthDate = new DateTime(1903, 1, 1),
PokemonCategories = new List<PokemonCategory>() { new PokemonCategory { Category = category } },
Reviews = new List<Review>()
{
new Review { Title = pokemonName, Text = $"{pokemonName} is the best pokemon, because it is {categoryName}", Reviewer = reviewer }
}
};

return new PokemonOwner() { Pokemon = pokemon, Owner = owner };
}
As you can see, im only really creating objects of PokemonOwner, which contains basically all the other models. And somehow, for some reason, I suppose based on my relations with this model, it generates the other dummy data automatically for me like, the Country tables, Reviewer table, etc. Seems to me like EF Core is doing some magic. It makes sense, as PokemonOwner has an Owner, a Country, etc. So it not seeding the tables referenced wouldnt make much sense, though it looks a bit tricky and wonky.
Jimmacle
Jimmacle14mo ago
right, EF is an ORM which means it tries to sync your objects up with the database
Baturhan
Baturhan14mo ago
It needs to create, ef core acts right, otherwise ef core couldn't create the entity.
Jimmacle
Jimmacle14mo ago
if your object graph spans multiple entity types, they will all be tracked and updated
joren
jorenOP14mo ago
And assuming what I said is the case, a .Any call on PokemonOwner only would be sufficient then no? as PokemonOwner in my case, requires all those other tables, therefore its granted they are seeded?
Jimmacle
Jimmacle14mo ago
if data existing in that table is enough of an indicator for you that it's seeded, yes
Baturhan
Baturhan14mo ago
yes it' seems enough in this case
joren
jorenOP14mo ago
I mean, lets say the database is seeded except Owner, then PokemonOwner's table wouldnt be able to be filled properly it would have NULL I suppose, its possible in the database, would .Any() give True?
Jimmacle
Jimmacle14mo ago
why would it be seeded except for that table if your seeding code is guaranteed to add data to it?
joren
jorenOP14mo ago
well, lets assume they do not use my seeder, manually seed it locally, wouldnt know why but lets assume they did
"asdf", 12, "fg", NULL
"jh", 32, "fg", NULL
"sfd", 34, "fg", NULL
"asdf", 12, "fg", NULL
"jh", 32, "fg", NULL
"sfd", 34, "fg", NULL
given this is the PokemonOwner table, would .Any() return true?
Jimmacle
Jimmacle14mo ago
Any tells you if any rows exist, so yes
Baturhan
Baturhan14mo ago
catshy
joren
jorenOP14mo ago
ah I see, so to be waterproof some reflective way to iterate over the models, and to call .Any on each model would be better, in theory
Jimmacle
Jimmacle14mo ago
yes, if you want to avoid seeding if there is anything in the DB at all you'll have to check each set
joren
jorenOP14mo ago
any libs that implemented this that you know of
Jimmacle
Jimmacle14mo ago
it's basically just a loop
joren
jorenOP14mo ago
iterate over the Models folder, take the class type, call a method on it
Baturhan
Baturhan14mo ago
you can create an empty migration and can do whatever in there you want, there is no need to use library for only checking .Any()
Jimmacle
Jimmacle14mo ago
you'd be better off using the model built by the dbcontext than doing reflection
joren
jorenOP14mo ago
you mean in the OnModelCreation and the HasData()?
Jimmacle
Jimmacle14mo ago
i mean a dbcontext has a .Model property that gives you access to your db model where you can get the entity types and do something with them, etc
joren
jorenOP14mo ago
oh really that'd make it easy af lol let me check this one second
Jimmacle
Jimmacle14mo ago
the only problem is no non-generic way to get a dbset
joren
jorenOP14mo ago
// Get all DbSet properties of the DbContext
var dbSetProperties = context.GetType().GetProperties()
.Where(p => p.PropertyType.IsGenericType && p.PropertyType.GetGenericTypeDefinition() == typeof(DbSet<>));

// Iterate over DbSet properties
foreach (var property in dbSetProperties)
{
// Get the entity type from DbSet
var entityType = property.PropertyType.GetGenericArguments().FirstOrDefault();

// Perform check
bool hasAnyRows = (bool)typeof(QueryableExtensions)
.GetMethod("Any", BindingFlags.Public | BindingFlags.Static)
.MakeGenericMethod(entityType)
.Invoke(null, new object[] { context.Set(entityType) });

Console.WriteLine($"Table: {entityType.Name}, Has Rows: {hasAnyRows}");
}
// Get all DbSet properties of the DbContext
var dbSetProperties = context.GetType().GetProperties()
.Where(p => p.PropertyType.IsGenericType && p.PropertyType.GetGenericTypeDefinition() == typeof(DbSet<>));

// Iterate over DbSet properties
foreach (var property in dbSetProperties)
{
// Get the entity type from DbSet
var entityType = property.PropertyType.GetGenericArguments().FirstOrDefault();

// Perform check
bool hasAnyRows = (bool)typeof(QueryableExtensions)
.GetMethod("Any", BindingFlags.Public | BindingFlags.Static)
.MakeGenericMethod(entityType)
.Invoke(null, new object[] { context.Set(entityType) });

Console.WriteLine($"Table: {entityType.Name}, Has Rows: {hasAnyRows}");
}
wait, moment I think I messed it up Well I got it to work, here is the code that works. It correctly checks if any table is seeded, if not it re-seeds it fully according to my seeding input.
public bool IsFullySeeded()
{
// Get all DbSet properties of the DbContext
var dbSetProperties = _context.GetType().GetProperties()
.Where(p => p.PropertyType.IsGenericType && p.PropertyType.GetGenericTypeDefinition() == typeof(DbSet<>));

foreach (var property in dbSetProperties)
{
// Get the entity type from DbSet
var entityType = property.PropertyType.GetGenericArguments().FirstOrDefault();

// Get the DbSet instance
object? dbSet = property.GetValue(_context);

// Use LINQ Queryable.Any directly
MethodInfo anyMethod = typeof(Queryable)
.GetMethods()
.FirstOrDefault(m => m.Name == "Any" && m.GetParameters().Length == 1)
.MakeGenericMethod(entityType);

// Invoke Any method
bool hasAnyRows = (bool)anyMethod.Invoke(null, new object[] { dbSet });

Console.WriteLine($"Table: {entityType.Name}, Has Rows: {hasAnyRows}");

// If any table is not seeded, return false
if (!hasAnyRows)
{
Console.WriteLine("DB not fully seeded");
return false;
}
}

// If all tables are seeded, return true
return true;
}
public bool IsFullySeeded()
{
// Get all DbSet properties of the DbContext
var dbSetProperties = _context.GetType().GetProperties()
.Where(p => p.PropertyType.IsGenericType && p.PropertyType.GetGenericTypeDefinition() == typeof(DbSet<>));

foreach (var property in dbSetProperties)
{
// Get the entity type from DbSet
var entityType = property.PropertyType.GetGenericArguments().FirstOrDefault();

// Get the DbSet instance
object? dbSet = property.GetValue(_context);

// Use LINQ Queryable.Any directly
MethodInfo anyMethod = typeof(Queryable)
.GetMethods()
.FirstOrDefault(m => m.Name == "Any" && m.GetParameters().Length == 1)
.MakeGenericMethod(entityType);

// Invoke Any method
bool hasAnyRows = (bool)anyMethod.Invoke(null, new object[] { dbSet });

Console.WriteLine($"Table: {entityType.Name}, Has Rows: {hasAnyRows}");

// If any table is not seeded, return false
if (!hasAnyRows)
{
Console.WriteLine("DB not fully seeded");
return false;
}
}

// If all tables are seeded, return true
return true;
}
that is if anyone is interested in this, or would like to use it.

Did you find this page helpful?