C#•2y ago

Data seeding database

So currently in my program.cs I have the following code:

// Seed the database, disable once done.
using (var scope = app.Services.CreateScope())
{
    var services = scope.ServiceProvider;
    try
    {
        DataContext context = services.GetRequiredService<DataContext>();
        
        if(context.Database.EnsureCreated())
        {
            var dataSeeder = new DataSeeder(context);
            dataSeeder.SeedData();
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Error occurred while seeding the database: {ex.Message}");
    }
}

// Seed the database, disable once done.
using (var scope = app.Services.CreateScope())
{
    var services = scope.ServiceProvider;
    try
    {
        DataContext context = services.GetRequiredService<DataContext>();
        
        if(context.Database.EnsureCreated())
        {
            var dataSeeder = new DataSeeder(context);
            dataSeeder.SeedData();
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Error occurred while seeding the database: {ex.Message}");
    }
}

Now, seeding only makes sense in a development environment and only has to be executed once. So would this be a proper way to go about it? My DataSeeder class basically creates objects of all my models, gives them mock values, and that I save to the database.

40 Replies

jorenOP•2y ago

To my understanding, once I call SaveChanges() on my DataContext object, that's when they will staged for the next migration, please correct me if im wrong. For instance in my

public void SeedData()
{
    SeedPokemonOwners();
    _context.SaveChanges();
}

public void SeedData()
{
    SeedPokemonOwners();
    _context.SaveChanges();
}

The moment I do _context.SaveChanges();, all the changes I pushed to the context with: _context.TableName.AddRange( /* data here */ ) gets staged for the next migrations.

Baturhan•2y ago

Hello @joren, you can check your enviorenment in program.cs using app.Environment.IsDevelopment() if this returns true you are in development, if the case is only debuggin #if DEBUG #end if you can use this. So when you release the application your enviorenment is going to be production(or the setting you set before) and the db is not going to be seed anymore Savechanges method just redirect your query to the db, it's not related with migration at all

jorenOP•2y ago

Migrations are strictly about model changes and its relations and nothing to do with the values inside those tables then I assume would make sense actually, as seeding wouldnt be migrating anything what dictates the value it returns, the profile we use to run it ?

jorenOP•2y ago

so this one, in release it would be False then I assume

jorenOP•2y ago

Eh, I guess its the launchSettings.json that dictates it

{
  "profiles": {
    "http": {
      "commandName": "Project",
      "dotnetRunMessages": true,
      "launchBrowser": true,
      "launchUrl": "swagger",
      "applicationUrl": "http://localhost:5116",
      "environmentVariables": {
        "ASPNETCORE_ENVIRONMENT": "Development"
      }
    },
    "https": {
      "commandName": "Project",
      "dotnetRunMessages": true,
      "launchBrowser": true,
      "launchUrl": "swagger",
      "applicationUrl": "https://localhost:7048;http://localhost:5116",
      "environmentVariables": {
        "ASPNETCORE_ENVIRONMENT": "Development"
      }
    },
}

{
  "profiles": {
    "http": {
      "commandName": "Project",
      "dotnetRunMessages": true,
      "launchBrowser": true,
      "launchUrl": "swagger",
      "applicationUrl": "http://localhost:5116",
      "environmentVariables": {
        "ASPNETCORE_ENVIRONMENT": "Development"
      }
    },
    "https": {
      "commandName": "Project",
      "dotnetRunMessages": true,
      "launchBrowser": true,
      "launchUrl": "swagger",
      "applicationUrl": "https://localhost:7048;http://localhost:5116",
      "environmentVariables": {
        "ASPNETCORE_ENVIRONMENT": "Development"
      }
    },
}

Baturhan•2y ago

Yes, there is a environment variables you need to use

jorenOP•2y ago

"ASPNETCORE_ENVIRONMENT": "Development" Makes sense, so lets say its on development as it is now, would it try to seed it every time I run it ? I assume so, which would make the startup time pretty inefficient I might just add some compile time constant, that I can flip from True to False to disable seeding after the initial seeding locally...

Baturhan•2y ago

if that variable is set to development yes, If you don't want ta seed again and again, you could check the data with "any" method. If you try to execute your program in release mode it should be changed I guess, let me check it

jorenOP•2y ago

Any() would be on a table, what if seeding was done manually and therefore is not 100% complete, then doing it on just one table would be wrong you'd have to do a .Any() check on each table, to see if it contains any data, if one of them does not, just seed it, else skip it Would be cool if I could somehow iterate over the types in the Models folder, call .Any() on each, not sure if thats achievable in C#. It has reflection, so it is doable I just checked

Baturhan•2y ago

are you even trying to execute the db commands parallel? Any() just returns, if there is any data in the table. IF you added it it will true, thus you dont need to add any data to seed and you will return/continue

jorenOP•2y ago

I mean prior to seeding, when the program starts, lets assume the db is seeded already

Baturhan•2y ago

Even if you add your data manually when thread comes to Any() method it's going to return true or false according to it

jorenOP•2y ago

I want to check if it's seeding yes, but that'd mean I'd have to call .Any on each table individually imagine you have 100 tables, no one will call .Any on each table to check if it has data. prior to seeding I could I guess, but then it always goes into the proces of seeding. Then I assume it should be seeded, and actually seed the table if its not populated

Baturhan•2y ago

Yes unfortunetly. Well in that case If you are adding any data to the 100 tables manually you already have long code, an any method wouldn't hurt ya? you can register your code to the applicationstart event, thus it's absolutely going to work once. it'd be useless if you define a variable to the program cs because it only works once already

jorenOP•2y ago

yeah true, one more thing thats bugging me:

jorenOP•2y ago

got these models right, pretty standard stuff my current seeder looks like this:

public class DataSeeder
{
    private readonly DataContext _context;

    public DataSeeder(DataContext context)
    {
        _context = context;
    }

    public void SeedData()
    {
        SeedPokemonOwners();
        _context.SaveChanges();
    }

    private void SeedPokemonOwners()
    {
        if (_context.PokemonOwners.Any())
        {
            Console.WriteLine("PokemonOwners data already exists. Skipping seeding.");
            return;
        }

        var pokemonOwners = GetSamplePokemonOwners();
        _context.PokemonOwners.AddRange(pokemonOwners);
    }

public class DataSeeder
{
    private readonly DataContext _context;

    public DataSeeder(DataContext context)
    {
        _context = context;
    }

    public void SeedData()
    {
        SeedPokemonOwners();
        _context.SaveChanges();
    }

    private void SeedPokemonOwners()
    {
        if (_context.PokemonOwners.Any())
        {
            Console.WriteLine("PokemonOwners data already exists. Skipping seeding.");
            return;
        }

        var pokemonOwners = GetSamplePokemonOwners();
        _context.PokemonOwners.AddRange(pokemonOwners);
    }

    private List<PokemonOwner> GetSamplePokemonOwners()
    {
        return new List<PokemonOwner>()
        {
            CreatePokemonOwner("Pikachu", "Jack London", "Kanto", "Electric", "Teddy", "Smith"),
            CreatePokemonOwner("Squirtle", "Harry Potter", "Saffron City", "Water", "Teddy", "Smith"),
            CreatePokemonOwner("Venasaur", "Ash Ketchum", "Millet Town", "Leaf", "Teddy", "Smith"),
        };
    }

    private PokemonOwner CreatePokemonOwner(string pokemonName, string ownerName, string countryName,
                                            string categoryName, string reviewerFirstName, string reviewerLastName)
    {
        var country = new Country() { Name = countryName };
        var owner = new Owner() { FirstName = ownerName.Split()[0], LastName = ownerName.Split()[1], Gym = $"{ownerName}'s Gym", Country = country };
        var category = new Category() { Name = categoryName };
        var reviewer = new Reviewer() { FirstName = reviewerFirstName, LastName = reviewerLastName };
        var pokemon = new Pokemon()
        {
            Name = pokemonName,
            BirthDate = new DateTime(1903, 1, 1),
            PokemonCategories = new List<PokemonCategory>() { new PokemonCategory { Category = category } },
            Reviews = new List<Review>()
            {
                new Review { Title = pokemonName, Text = $"{pokemonName} is the best pokemon, because it is {categoryName}", Reviewer = reviewer }
            }
        };

        return new PokemonOwner() { Pokemon = pokemon, Owner = owner };
    }

    private List<PokemonOwner> GetSamplePokemonOwners()
    {
        return new List<PokemonOwner>()
        {
            CreatePokemonOwner("Pikachu", "Jack London", "Kanto", "Electric", "Teddy", "Smith"),
            CreatePokemonOwner("Squirtle", "Harry Potter", "Saffron City", "Water", "Teddy", "Smith"),
            CreatePokemonOwner("Venasaur", "Ash Ketchum", "Millet Town", "Leaf", "Teddy", "Smith"),
        };
    }

    private PokemonOwner CreatePokemonOwner(string pokemonName, string ownerName, string countryName,
                                            string categoryName, string reviewerFirstName, string reviewerLastName)
    {
        var country = new Country() { Name = countryName };
        var owner = new Owner() { FirstName = ownerName.Split()[0], LastName = ownerName.Split()[1], Gym = $"{ownerName}'s Gym", Country = country };
        var category = new Category() { Name = categoryName };
        var reviewer = new Reviewer() { FirstName = reviewerFirstName, LastName = reviewerLastName };
        var pokemon = new Pokemon()
        {
            Name = pokemonName,
            BirthDate = new DateTime(1903, 1, 1),
            PokemonCategories = new List<PokemonCategory>() { new PokemonCategory { Category = category } },
            Reviews = new List<Review>()
            {
                new Review { Title = pokemonName, Text = $"{pokemonName} is the best pokemon, because it is {categoryName}", Reviewer = reviewer }
            }
        };

        return new PokemonOwner() { Pokemon = pokemon, Owner = owner };
    }

As you can see, im only really creating objects of PokemonOwner, which contains basically all the other models. And somehow, for some reason, I suppose based on my relations with this model, it generates the other dummy data automatically for me like, the Country tables, Reviewer table, etc. Seems to me like EF Core is doing some magic. It makes sense, as PokemonOwner has an Owner, a Country, etc. So it not seeding the tables referenced wouldnt make much sense, though it looks a bit tricky and wonky.

Jimmacle•2y ago

right, EF is an ORM which means it tries to sync your objects up with the database

Baturhan•2y ago

It needs to create, ef core acts right, otherwise ef core couldn't create the entity.

Jimmacle•2y ago

if your object graph spans multiple entity types, they will all be tracked and updated

jorenOP•2y ago

And assuming what I said is the case, a .Any call on PokemonOwner only would be sufficient then no? as PokemonOwner in my case, requires all those other tables, therefore its granted they are seeded?

Jimmacle•2y ago

if data existing in that table is enough of an indicator for you that it's seeded, yes

Baturhan•2y ago

yes it' seems enough in this case

jorenOP•2y ago

I mean, lets say the database is seeded except Owner, then PokemonOwner's table wouldnt be able to be filled properly it would have NULL I suppose, its possible in the database, would .Any() give True?

Jimmacle•2y ago

why would it be seeded except for that table if your seeding code is guaranteed to add data to it?

jorenOP•2y ago

well, lets assume they do not use my seeder, manually seed it locally, wouldnt know why but lets assume they did

"asdf", 12, "fg", NULL
"jh", 32, "fg", NULL
"sfd", 34, "fg", NULL

"asdf", 12, "fg", NULL
"jh", 32, "fg", NULL
"sfd", 34, "fg", NULL

given this is the PokemonOwner table, would .Any() return true?

Jimmacle•2y ago

Any tells you if any rows exist, so yes

Baturhan•2y ago

jorenOP•2y ago

ah I see, so to be waterproof some reflective way to iterate over the models, and to call .Any on each model would be better, in theory

Jimmacle•2y ago

yes, if you want to avoid seeding if there is anything in the DB at all you'll have to check each set

jorenOP•2y ago

any libs that implemented this that you know of

Jimmacle•2y ago

it's basically just a loop

jorenOP•2y ago

iterate over the Models folder, take the class type, call a method on it

Baturhan•2y ago

you can create an empty migration and can do whatever in there you want, there is no need to use library for only checking .Any()

Jimmacle•2y ago

you'd be better off using the model built by the dbcontext than doing reflection

jorenOP•2y ago

you mean in the OnModelCreation and the HasData()?

Jimmacle•2y ago

i mean a dbcontext has a .Model property that gives you access to your db model where you can get the entity types and do something with them, etc

jorenOP•2y ago

oh really that'd make it easy af lol let me check this one second

Jimmacle•2y ago

the only problem is no non-generic way to get a dbset

jorenOP•2y ago

// Get all DbSet properties of the DbContext
var dbSetProperties = context.GetType().GetProperties()
    .Where(p => p.PropertyType.IsGenericType && p.PropertyType.GetGenericTypeDefinition() == typeof(DbSet<>));

// Iterate over DbSet properties
foreach (var property in dbSetProperties)
{
    // Get the entity type from DbSet
    var entityType = property.PropertyType.GetGenericArguments().FirstOrDefault();

    // Perform check
    bool hasAnyRows = (bool)typeof(QueryableExtensions)
        .GetMethod("Any", BindingFlags.Public | BindingFlags.Static)
        .MakeGenericMethod(entityType)
        .Invoke(null, new object[] { context.Set(entityType) });

    Console.WriteLine($"Table: {entityType.Name}, Has Rows: {hasAnyRows}");
}

// Get all DbSet properties of the DbContext
var dbSetProperties = context.GetType().GetProperties()
    .Where(p => p.PropertyType.IsGenericType && p.PropertyType.GetGenericTypeDefinition() == typeof(DbSet<>));

// Iterate over DbSet properties
foreach (var property in dbSetProperties)
{
    // Get the entity type from DbSet
    var entityType = property.PropertyType.GetGenericArguments().FirstOrDefault();

    // Perform check
    bool hasAnyRows = (bool)typeof(QueryableExtensions)
        .GetMethod("Any", BindingFlags.Public | BindingFlags.Static)
        .MakeGenericMethod(entityType)
        .Invoke(null, new object[] { context.Set(entityType) });

    Console.WriteLine($"Table: {entityType.Name}, Has Rows: {hasAnyRows}");
}

wait, moment I think I messed it up Well I got it to work, here is the code that works. It correctly checks if any table is seeded, if not it re-seeds it fully according to my seeding input.

public bool IsFullySeeded()
{
    // Get all DbSet properties of the DbContext
    var dbSetProperties = _context.GetType().GetProperties()
        .Where(p => p.PropertyType.IsGenericType && p.PropertyType.GetGenericTypeDefinition() == typeof(DbSet<>));

    foreach (var property in dbSetProperties)
    {
        // Get the entity type from DbSet
        var entityType = property.PropertyType.GetGenericArguments().FirstOrDefault();

        // Get the DbSet instance
        object? dbSet = property.GetValue(_context);

        // Use LINQ Queryable.Any directly
        MethodInfo anyMethod = typeof(Queryable)
            .GetMethods()
            .FirstOrDefault(m => m.Name == "Any" && m.GetParameters().Length == 1)
            .MakeGenericMethod(entityType);

        // Invoke Any method
        bool hasAnyRows = (bool)anyMethod.Invoke(null, new object[] { dbSet });

        Console.WriteLine($"Table: {entityType.Name}, Has Rows: {hasAnyRows}");

        // If any table is not seeded, return false
        if (!hasAnyRows)
        {
            Console.WriteLine("DB not fully seeded");
            return false;
        }
    }

    // If all tables are seeded, return true
    return true;
}

public bool IsFullySeeded()
{
    // Get all DbSet properties of the DbContext
    var dbSetProperties = _context.GetType().GetProperties()
        .Where(p => p.PropertyType.IsGenericType && p.PropertyType.GetGenericTypeDefinition() == typeof(DbSet<>));

    foreach (var property in dbSetProperties)
    {
        // Get the entity type from DbSet
        var entityType = property.PropertyType.GetGenericArguments().FirstOrDefault();

        // Get the DbSet instance
        object? dbSet = property.GetValue(_context);

        // Use LINQ Queryable.Any directly
        MethodInfo anyMethod = typeof(Queryable)
            .GetMethods()
            .FirstOrDefault(m => m.Name == "Any" && m.GetParameters().Length == 1)
            .MakeGenericMethod(entityType);

        // Invoke Any method
        bool hasAnyRows = (bool)anyMethod.Invoke(null, new object[] { dbSet });

        Console.WriteLine($"Table: {entityType.Name}, Has Rows: {hasAnyRows}");

        // If any table is not seeded, return false
        if (!hasAnyRows)
        {
            Console.WriteLine("DB not fully seeded");
            return false;
        }
    }

    // If all tables are seeded, return true
    return true;
}

that is if anyone is interested in this, or would like to use it.

Gaming

Programming

Data seeding database

Did you find this page helpful?