-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consider stronger semantics around generated keys #13575
Comments
@sjb-sjb Please file a separate issue for that--it's not the same thing that this issue is tracking. |
“However, setting key values like this is typically used in special situations--most notably test data and seeding.“ Not in our case - it’s production code! We’ve worked around this by forcing the state to Unchanged through the ChangeTracker.Tracked event. |
@optiks If you're forcing the state to Unchanged, then I think your workaround is doing what I am suggesting we should do by default. That is, if we make this change, then if you set the key value, then we will always assume that this means the state of the entity should be Modified/Unchanged. Conversely, if the key value is not set, then we will always assume that the entity should be Added. The behavior today follows this except that:
|
@ajcvickers I have restated the issue in #13564, and added a question and a suggestion. |
I re-opened #13564 and we will discuss in triage. |
@sjb-sjb Our workaround may help you (you may need to tweak/make it generic if you don't use int ids):
|
@optiks Will that cause re-entry into the Tracked (or StateChanged) event handler? Or perhaps it will trigger a StateChanged event for an Added --> Unchanged transition? |
@ajcvickers I agree with your proposal, which I read after suggesting something similar in one of the comments in #13564. The only people that could find it problematic would be people who set store-generated keys themselves and, well, they just shouldn't be doing that in the first place! |
The main problem with making this change now is that it's going to be a break for a lot of seeding/tests, so while I am proposing this, I'm not convinced we will be able to do it. |
@sjb-sjb Unchanged is still tracked, so it shouldn't be re-entrant. If you're subscribing to StateChanged as well, then yes, I'd expect that to fire. |
@ajcvickers Good point about seeding. I am currently not seeding in the recommended way, instead I am making internal API calls and then doing a SaveChanges. It would make sense to me to go ahead with the changes you propose and then alter HasData so that seeding still works. Within HasData you can move the entities from Unchanged to Added. Outside of seeding, it seems like broken behavior to me that EF would Add an entity that already has a valid store-generated key. @optiks Thanks for the clarification. |
@ajcvickers FWIW, I agree that you should be limiting the number of breaking changes. In general, opt-in for behavioural changes is reasonable. |
It's hard to disagree with @optiks there. But, @ajcvickers, if an option is created to have the entity land in the Unchanged state instead of the Added state, then I would vote for an additional possibility which is that it stays Detached. In fact this is what I want it to do in my actual use case. This also is consistent with, as I commented in #13564, the generally desirable goal of giving the user fine-grained control over when / whether entities are implicitly attached to a context or not. My software design is using State assignments to attach entities and consistent with that I generally don't want other (reachable) entities to be attached implicitly. |
@sjb-sjb, I get what you're saying. Two other workarounds I can think of are:
Another thing to consider is whether disabling AutoDetectChanges might help in your case? |
Triage: we agreed on the general idea, but the breaking potential is great. Therefore, we will instead introduce two new methods:
These methods would work in the way described above when generated keys are being used. That is, if the instance has no key value set, then it will always be Added. If it has a key value set then it will be Unchanged for AddOrAttach or Modified for AddOrUpdate. For non-store generated keys, both will Add everything. We will make the breaking change in the DetectChanges behavior, but also add a hook to allow the behavior to be defined--see #13564 |
@optiks Thanks for your comments. I'm glad that you "got it" because in retrospect I sure didn't (and probably still don't). I tried to workaround by setting state back to Detached if it I didn't want the Detached --> Added transition. However for some reason EF then re-attached it again, resulting in an infinite cycle Detached --> Added --> Detached --> Added etc. This problem went away when I set the state to Unchanged instead of Detached. I do end up with an Unchanged entity attached to my context that I would rather not have attached, but this is not a big efficiency hit in my situation. @ajcvickers I can imagine this cropping up when you implement the Add-suppressing modification to the event, as suggested in your triage comment for #13564. I would guess it is solvable though. |
See also #14031 |
Just to provide a comment on something that I have found out through a bit more research. In the workaround suggested by @optiks (and I believe this is relevant to #14031 by @Rubbiroid as well), setting the state to Unchanged will result in a sequence Detached --> Added --> Unchanged. Some extra work is required to distinguish this from the Detached --> Added --> Unchanged transitions that occur as a result of SaveChanges on an Added entity. The only difference is that in the 'workaround' version we have Id > 0 during Detached --> Added, while in the normal case we have Id < 0. What we want to see, in effect, is Detached --> Unchanged in the 'workaround' case but we want to keep Detached --> Added --> Unchanged in the 'normal' case. To do this we need to carry information from the Tracked event forward to the StateChanged event. Some locking is needed since the state transition could be triggered from any thread. Putting it together we get something like this (replace ".Id" with your primary key):
|
DetectChanges bit for 3.0; Backlog for other parts. |
Split DetectChanges part into #14616 |
@ajcvickers , @AndriySvyryd : this issue has become more pressing for Sqlite users following the implementation of TPC. Since Sqlite does not have named id sequences, the use of TPC with Sqlite typically means using Guid's for the identity field. But for Guid key fields with Sqlite, the key value appears to be generated by EF Core during an Attach operation, and before the Tracked event is called. This differs from autogenerated ints or longs used as identity fields where the value is set into the entity following the save changes. Net/net, the workaround above no longer works. Testing the identity field during the Tracked event (arising due to an Attach call) no longer tells us whether or not this is an entity that is already in the database, because the Guid value will have been filled in by EF Core if it was not already set before Attach was called. Consequently we can no longer use this to determine whether the Added state genuinely reflects an entity that needs to be added to the database, vs an entity that already had an Id and was only put into the Added state due to EF's internal logic. Indeed as far as I can tell there is no value associated with the Tracked event args that can be used to test this condition. Edit: not sure how to work around this currently. Possibilities to explore are (a) server-side generation as suggested by @roji below or (b) perhaps stashing a copy of the guid at the time of materialization (assuming there is a hook for materialization). @optiks take note. |
@sjb-sjb you should be able to switch to server-side generation of UUIDs, which would make things very similar to how they work with sequence-drive ints. It seems that there's no nice native way to generate a random UUID in SQLite, but there's at least this extension which look pretty good. Could be related to #30753, which would in general change the default value generation strategy for GUIDs to be server-side. |
One way to resolve this would be to store EF-generated guid keys internally, the same way temporary integer ids are, and not store them in the entity field until after the entity has been saved. Treating EF-generated guids like temporary keys would make sense because… they ARE temporary keys, until they are stored in the database whereupon they become permanent keys. The fact that the temporary value is equal to the permanent value is beside the point. |
This is not the meaning of temporary keys in the EF context: temporary keys are about the key being a temporary placeholder until a different key gets fetched back from the database. Since client-generated keys really are the final key (they'll be sent to the database and persisted), they're not temporary (all other properties are "temporary" in the sense that you're using, simply since they haven't yet been saved). |
Sure, that’s one way to define it. Edit: sorry, my wording was a bit too short here. I should have said, "That's a fine definition for 'temporary' in the EF context. The similarity I was trying to bring forward was that in both cases a key is assigned in memory and does not become permanent until it is saved to the database. In that sense both are 'temporary' rather than permanently saved in the database." |
Importantly, this is the way it is defined for the EF Code. This isn't going to change. |
I'm fine with that. I'm just saying that having the Guid's not in the entity field due to the Attach would be one way to make a workaround possible here. The breaking impact would be very limited because the users who need to access the Guid could get it from the shadow property instead. The type of breaking is similar to that experienced in the transition to keeping temporary sequential id's in shadow instead of in the entity. A bit less of an impact because people who do want to see the Guid's in the entity could simply assign them themselves in the entity ctor. Otherwise I don't really know what the workaround could be. I didn't really see how to switch to server-side GUID generation without a large disruption, also if we did that then it would probably have a negative impact on the ability to migrate to other databases in the future. |
What kind of disruption are you thinking of, and why do you think this would impact your ability to migrate to other databases? I'm not aware of other database which don't support server-side generation of GUIDs (SQLite is actually the tricky one). |
It's just not an area that I have knowledge in and there seem to be a number of hurdles. The library you referenced says it only works with Linux and OS X, I'm not familiar with installing user functions in Sqlite, and I don't know what I would have to do to enable those functions to generate Id's in EF Core. I also don't know how to configure server-side generation of Guid's in SQL Server or in EF Core with SqlServer. In addition I have also read that there is a possibility of performance issues due to index fragmentation when using Guid's. Bear in mind, the problem that I'm really trying to solve is how to use TPC with Sqlite. There is a note on the TPC page saying that autoincremented integer identities don't work with TPC under Sqlite because there is an absence of named sequences in Sqlite. That's why I started investigating using Guid identifiers. But given the challenges with Guid's, I'm wondering whether to look again at the autoincrement question. I understand there is a table in the Sqlite db containing the current autoincrement values, and it seems to me that if I seeded the high-order bits with a different number for each concrete type then the autoincrement approach would work fine. EDIT: seeding the sqlite_sequence table did not work for me in this context. It seems that the sqlite_sequence table is either not generated or not used. EDIT: one possible workaround is to wrap the Guids in a struct. Then EF will not generate the Guid when the entity is passed to Attach, and correspondingly the test for Id == default in the Tracked event will correctly reflect whether the entity has been stored in the database. However in this case you must remember to set the Guid yourself in the Tracked event. To maintain consistently that the Guid is unset if the entity has not been saved, you also need to unset the Guid if the SaveChanges fails. I do sort of wish that TPC would work 'out of the box' with Sqlite. Maybe it does and I'm just missing something. |
Currently we avoid reasoning about store-generated keys in some places because it is possible to add a new entity with an explicit key value set even when the entity type is configured to use generated keys. The two most prominent examples of this are:
Both of these cause confusion.
However, setting key values like this is typically used in special situations--most notably test data and seeding. So adding to the complexity of the mental model and as a result generating confusion in the cases above may not be a good trade-off to make. Instead, if we assume a set store-generated key means that the entity already exists in the database, I believe this would make common things easier to both do and understand.
We could then allow a pattern something like this to temporarily enable setting explicit key values:
The text was updated successfully, but these errors were encountered: