Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TPC does not work "out of the box" with Sqlite #31752

Closed
sjb-sjb opened this issue Sep 15, 2023 · 18 comments
Closed

TPC does not work "out of the box" with Sqlite #31752

sjb-sjb opened this issue Sep 15, 2023 · 18 comments
Labels
closed-no-further-action The issue is closed and no further action is planned. customer-reported

Comments

@sjb-sjb
Copy link

sjb-sjb commented Sep 15, 2023

As discussed on the inheritance page, store-generated integer primary keys cannot be used with Table Per Concrete Type in Sqlite due to the fact that Sqlite lacks named sequences.

On the other hand, if one uses Guids then the problem described in #13575 occurs. Briefly, EF will sometimes track an entity in the Added state even if the primary key was already set. This happens, for example, when setting a navigation to a stored principle entity from a tracked dependent entity; even though the principle entity already has a primary key defined, it is unconditionally tracked in the Added state. This results in errors during SaveChanges. Unfortunately the workaround described in #13575 does not work for Guids; there does not seem to be a direct workaround.

While each of these two points is understandable on its own, the net effect is that TPC is very difficult to use with Sqlite. I would respectfully submit that something be done to make TPC more useable with Sqlite.

@ajcvickers
Copy link
Contributor

@sjb-sjb What, specifically, are you suggesting?

@sjb-sjb
Copy link
Author

sjb-sjb commented Sep 15, 2023

Well... I think there are several possible approaches.

  1. One approach outlined in #13575 would be to change the way guid keys are assigned in Attach, so that it would either be in a shadow property or otherwise not assigned to the CLR property at the time Tracked is fired. This would re-enable the workaround described in #13575. On the other hand it would also be a breaking change and conceptually it is a bit weird perhaps to shadow something whose value is not going to change when it comes out of shadow and into the CLR key field.

  2. Another approach also referenced in #13575 would be to use Sqlite-generated (store generated) guid keys. While this may be possible to do today, it does not seem to be something that is ready "out of the box". So this approach would mean some code provided in EF to enable it more easily. The workaround in #13575 would then work with these server-generated keys.

  3. A third approach would be to solve the original problem of EF sometimes putting entities into the Added state even though they already have primary keys attached. I have not thought through the details but presumably this is what the AddOrAttach method proposed in #13575 is meant to accomplish. Hopefully this would mean the workaround is no longer needed.

  4. Another approach would be creating code in EF to address the problem of Sqlite not providing named autoincrement sequences that can be used to set the primary key. One could have a table containing the latest sequence number and increment this during ADD operations. Efficiency would have to be looked at. Given that sqlite is a single-user system (i.e. the file is locked) it might be possible for EF to keep the latest sequence number in memory, increment it client-side and add it to the entities just before sending them to the database. The updated sequence number would also have to be written but at least this way one would avoid alternating reads and writes on the sequence table. Instead there would just be a lot of writes to the table. Technically then these would be client-generated keys, but they would be integers that look to the EF user exactly as if they were store-generated / autoincremented keys. This workaround could be applied by EF whenever the model configuration calls for store-generated keys with integral type and the database provider is Sqlite.

On balance, and taking it with a grain of salt given that I am not familiar with the internals of either EF or Sqlite, my suggestion would be to investigate (4) and if that doesn't work then look at (2). If (4) works, it would address the core underlying problem and would be the most natural from a user perspective -- i.e. users would effectively have server-generated keys on the Sqlite platform as they do on other platforms.

Approach (1) would require a conceptual change that would also be breaking, although perhaps not a big/bad breaking change. I think (2) is reasonable although the counterargument would be that the point of guids is that the client can assign them, so it feels a bit funny to generate them on the server. I'm all for (3) but it would require users to understand the nuances of #13575 and adopt the new AddOrAttach user interface. Personally I'm a bit fuzzy on how it would address the Added state when setting a principle entity relationship -- since the user is not invoking Attach in that example they do not have a way to switch their call to AddOrAttach.

Whatever approach is used, the end goal would be for people to be able to switch their sqlite project from TPH to TPC with only minimal effort.

@roji
Copy link
Member

roji commented Sep 15, 2023

@sjb-sjb just noting that a lot of the above really is orthogonal to TPC and/or Sqlite. That is, the main issue you're describing seems to be about how to deal with change tracking and non-temporary keys being generated client-side. Client-generated keys don't seem to be a problem for most users - you don't have to depend on the key to know the state of things - but in any case this has nothing really to do with TPC on Sqlite. For example, many people use client-side GUID generation on SQL Server (without TPC), things work in the same way in that scenario and that doesn't seem to be blocking for the vast majority of them.

In other words, I'd separate between the general question of "how do I work with change tracking and client-generated keys" (something that most users don't seem to have a problem with), and other questions specifically about managing keys in SQLite (e.g. making server-generated GUIDs easier to configure, option (2) above).

Option (4) seems to amount to basically EF creating a custom named sequence mechanism ("One could have a table containing the latest sequence number and increment this during ADD operations"); we generally don't build this kind of workaround where a database lacks support for a feature.

@sjb-sjb
Copy link
Author

sjb-sjb commented Sep 16, 2023

Hi @roji, well, I guess you're saying that on the one hand it's not EF's mandate to fix or workaround database deficiencies, and on the other hand #13575 did not gain traction with a lot of people, relative to other issues. Both understandable points. All I can say is that as a user I was pretty surprised that something like primary keys and TPC did not work out of the box, it was a frustrating experience. I ended up resolving the absence of this feature by working to defeat other features -- doing my own TrackGraph instead of using Attach, and also throwing an exception if there is an attempt to assign a reference to an entity that is not already attached to the same context. So it felt like I was having to defeat the original design intentions of EF.

@ajcvickers
Copy link
Contributor

@sjb-sjb Would you have been prevented from going down this path if you knew that SQLite doesn't support the features you need? How did you choose to target SQLite?

@roji
Copy link
Member

roji commented Sep 16, 2023

@sjb-sjb don't get me wrong, I understand the pain. Our general recommendation for SQLite+TPC is "use GUIDs", which most people seem to be really OK with - it's very easy to set up and just works. This is a reason why we haven't even considered doing something like a "named sequence workaround" for SQLite - people seem to be happy enough with client-generated GUIDs, just like they're happy doing that on SQL Server for various other reasons.

It may make sense for SQLitePCLRaw (which Microsoft.Data.Sqlite uses) to bundle sqlite3-uuid (or similar) - in fact maybe it even already does so (@bricelam?). That would probably make all this easier.

@sjb-sjb
Copy link
Author

sjb-sjb commented Sep 16, 2023

@roji, @ajcvickers I guess I missed the memo :-). No, when I read the inheritance page it was easy to conclude that I needed to use guids, but I didn’t understand the client side vs server side implications. I didn’t realize that Attach would assign the guid’s for me and that this has different implications for the Tracked event compared to server-side key generation.

Look at the switch from autogenerated integers to guids: you switch the type of key in your entity, in both cases you pass the entity into Attach without a key, you set your entity properties and call SaveChanges… the key takes care of itself. Naively it seems to be pretty much the same. When the workaround for 13575 broke, though, that’s when the hours of digging and figuring-out came in.

The underlying piece that seems to be missing is a systematic way to determine whether or not an entity has been saved in the database. For store-generated keys obviously we can just look in the clr entity to see if the key is set. For client-side keys, if the user is managing the keys then I don’t think there’s much EF can do. But there is a programming philosophy (which I happen to subscribe to) that says users should work with navigation properties only and let EF and the database manage the keys.

If EF is managing the client-side keys, I think it would be fairly easy to do it in a way that supports a determination of whether or not the entity has been stored in the database. One could maintain the invariant that (a) the CLR key is set iff the entity is either Tracked or has been saved to the database and (b) for entities that are tracked, EF stores a flag saying whether or not it has been saved to the db (this could be accomplished by a shadow property). An implication would be that when entities stop being tracked, the key would be unset again if the entity was not saved. This would also solve # 13575.

In this approach, EF would offer a deal to users: if you let EF manage the keys then, regardless of the key type: (1) EF will correctly decide between Add and Unchanged at the time of tracking, and (2) DbContext will tell you whether or not a given tracked entity has been stored in the database, and (3) for an untracked entity the key will be set if and only if the entity has been stored to the database.

@roji
Copy link
Member

roji commented Sep 16, 2023

I understand the above, and I think there's merit there. At the same time, it could be argued that it isn't EF's job to tell you whether an entity being tracked - after all you can easily do this yourself, e.g. by checking the Guid before calling Attach or via your custom application logic. This maybe does require a bit of adaptation from you (since you have to check before Attach rather than after), but it seems reasonable enough that I'm not sure a big effort is warranted on the EF side. I suspect this is also why not many other users are asking for this.

What do you think @ajcvickers?

@sjb-sjb
Copy link
Author

sjb-sjb commented Sep 16, 2023

@ajcvickers to answer your question, the reason for using sqlite is it produces a local file and requires no separate setup/configuration. As far as I know it is the only choice for this kind of desktop use, so, I just have to live with the lumps. I’m using the db as a local file produced by the app. I do want to keep an eye in the possibility of using a central db server such as SQL server either for this app in the future or for a different app. From that point of view I’m trying to make my code base as db-agnostic as possible.

@sjb-sjb
Copy link
Author

sjb-sjb commented Sep 16, 2023

@roji I think you meant, whether or not an entity has been saved to the database — not whether or not it is being tracked.

Sure the user can do this themselves, as they could anything. Given that it simplifies the EF user’s life and solves related problems such as 13575, I would say it can reasonably be considered to be in EF scope. Whether to actually do it or not I suppose is a question of priority relative to other issues.

@roji
Copy link
Member

roji commented Sep 17, 2023

whether or not an entity has been saved to the database — not whether or not it is being tracked.

Yeah, sorry.

@ajcvickers
Copy link
Contributor

@sjb-sjb If you don't want keys to show up in the actual instances until SaveChanges, then use a temporary generator, but mark the values as not-temporary just before saving. For example:

using (var context = new SomeDbContext())
{
    await context.Database.EnsureDeletedAsync();
    await context.Database.EnsureCreatedAsync();
    
    context.AddRange(new Blog { Posts = { new(), new(), new() }},new Blog { Posts = { new(), new(), new() }});

    await context.SaveChangesAsync();
}

using (var context = new SomeDbContext())
{
    var blogs = context.Blogs.Include(e => e.Posts).ToList();
    Console.WriteLine(context.ChangeTracker.DebugView.LongView);
}

public class SomeDbContext : DbContext
{
    public SomeDbContext()
    {
        SavingChanges += (c, _) =>
        {
            foreach (var entry in ((DbContext)c!).ChangeTracker.Entries())
            {
                entry.Property("Id").IsTemporary = false;
            }
        };
    }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder
            .UseSqlite("Data Source=test.db")
            .LogTo(Console.WriteLine, LogLevel.Information)
            .EnableSensitiveDataLogging();

    public DbSet<Blog> Blogs => Set<Blog>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Blog>().Property(e => e.Id).HasValueGenerator<TemporaryGuidValueGenerator>();
        modelBuilder.Entity<Post>().Property(e => e.Id).HasValueGenerator<TemporaryGuidValueGenerator>();
    }
}

public class Blog
{
    public Guid Id { get; set; }
    public string? Name { get; set; }
    public List<Post> Posts { get; } = new();
}

public class Post
{
    public Guid Id { get; set; }
    public string? Title { get; set; }
    public Guid? BlogId { get; set; }
    public Blog Blog { get; set; } = null!;
}

@sjb-sjb
Copy link
Author

sjb-sjb commented Oct 1, 2023

@acjvickers that sounds like a really cool idea. You haven’t said what TemporaryGuidValueGenerator looks like but I’m guessing it somehow sets the guid into the ChangeTracker’s copy of the key, but leaves the entity’s copy unset (default value) and sets IsTemporary=true ?? Would be interested to see that.

I am not clear on why you would have to set IsTemporary back to false explicitly before SaveChanges. Won’t EF see that the value in CurrentValue is Modified relative to OriginalValue, send it to the db and then set IsTemporary to false after the SaveChanges? Or is that logic somehow defeated by IsTemporary?

@ajcvickers
Copy link
Contributor

You haven’t said what TemporaryGuidValueGenerator looks like

You could just look at the code...

Won’t EF see that the value in CurrentValue is Modified relative to OriginalValue, send it to the db and then set IsTemporary to false after the SaveChanges?

Nope. Have a read up on how temporary value work sometime.

@sjb-sjb
Copy link
Author

sjb-sjb commented Oct 1, 2023

@ajcvickers thanks. I did read that page and the linked page on temporary values and still had the question. And I guess the reason I missed the source code was that I looked for it using Google instead of GitHub. So… all in good faith.

@ajcvickers
Copy link
Contributor

We discussed this in triage and concluded that we are not going to implement features in EF Core to make up for limitations in the SQLite database. TPC with non-GUID generated keys is not something that SQLite is capable of out-of-the-box.

@ajcvickers ajcvickers closed this as not planned Won't fix, can't repro, duplicate, stale Oct 9, 2023
@ajcvickers ajcvickers added the closed-no-further-action The issue is closed and no further action is planned. label Oct 9, 2023
@sjb-sjb
Copy link
Author

sjb-sjb commented Dec 25, 2024

An update on this issue. The problem disappeared for me, at least for a while, when I applied the approach described by @roji above. That is, I had a user-defined Attach method that would check the client-side Guid key before attaching to the context. I did it by using TrackGraph and setting state to either Added if the key was null or Unchanged if the key was not null.

That approach proved brittle, however. It required that one never set a navigation between an untracked but already-saved entity and an already-tracked entity, because then EF would bypass the user-defined means of attaching entities and would track the already-saved entity in the Added state even though it already existed in the database.

I switched to the shadow client-generated GUID approach described by @ajcvickers. Overall it worked great!

Here is a summary for the benefit of others who may want to use TPC with Sqlite or who otherwise want to use client-generated keys in Sqlite with detached entities.

  • Use Guid's for primary keys and configure them like this in OnModelCreating:
    entityTypeBuilder
        .Property<Guid>("PrimaryKey")
        .IsRequired()
        .ValueGeneratedOnAdd()
        .HasValueGenerator<TemporaryGuidValueGenerator>()
    
    where "PrimaryKey" is the name of your primary key property. With this, adding the entity to the DbContext will result in a value being generated and stored in the ChangeTracker but not in the entity.
  • If you want to wrap the Guid in a struct, say ClientKey, then make a class that is a modification of TemporaryGuidValueGenerator to return the wrapped key, and use it in HasValueGenerator instead (and, obviously, change Guid to ClientKey in the Property call). The generator would look like this:
    public class TemporaryClientKeyValueGenerator: ValueGenerator<ClientKey>
    {
        public override ClientKey Next(EntityEntry entry) => new ClientKey(Guid.NewGuid());
        public override bool GeneratesTemporaryValues => true;
    }
    
  • In your DbContext, subscribe to the Tracked event and check whether the newly-tracked entity has the primary key already set (i.e. set in the entity object, not just in the context entry / change tracker). If the primary key is set and the entry state is Added, then set the entry state to Unchanged. This will address the question discussed in Consider stronger semantics around generated keys #13575 that when a navigation is assigned, the entity is tracked as Added even if the primary key was already set. And it will mean that Added entries are exactly those whose primary key is a temporary value i.e. only set in the ChangeTracker, not set in the entity object. Note that if you also subscribe to StateChanged then you will see an additional state transition from Added to Unchanged for some entities.
  • Just before SaveChanges, iterate over context.ChangeTracker.Entries and if entry.State == Added, set entry.Property("PrimaryKey").IsTemporary = false. This will copy the temporary key into the entity's primary key property for the purposes of saving the entity. It may also be possible to do this in an interceptor -- I haven't looked into that.
  • Catch any exceptions from SaveChanges and, since the context was not saved, iterate again over context.ChangeTracker.Entries and if entry.State == Added, reset the entry.Entity.PrimaryKey = default. This will help to maintain the invariant that the primary key is only set in the entity if the entity has been saved. You should also either set the entity state to Detached, clear the change tracker, or dispose the context. This way, the temporary value becomes moot since the ChangeTracker is no longer tracking the entity. If you don't do this, you likely will have wiped out the temporary value by setting the entity property to default.

@sjb-sjb
Copy link
Author

sjb-sjb commented Dec 26, 2024

... and, when you want to add an entity explicitly to the context then use either Attach if the primary key is set, or Add if the primary key is not set. Other entities may also be attached by including them in the navigations of entities that are already attached.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-no-further-action The issue is closed and no further action is planned. customer-reported
Projects
None yet
Development

No branches or pull requests

3 participants