-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bug: Two different keys that decode into the same field are not considered duplicate keys when decoding into struct #488
Comments
@benluddy Thanks for reporting this bug and sharing your insights!
I agree.
Yeah, this is about key equivalency for specific data models in section 2.2 of RFC 8949. User can:
So destination type is specific data model, and currently duplicate map key detection evaluates key based on key's destination type. For struct destination, if different map keys can be decoded to the same field, those keys should be considered duplicate. Also, potential json/v2 handles duplicate names depending on destination object.
I agree! Please feel free to open PR to modify
Do you also reject unknown struct fields Thoughts? |
Great! I'll get a PR open soon.
We sometimes reject and sometimes allow with a warning, but either way unknown fields must be detected. I think the optimization would be safe in general if mapKeys is used only for keys that don't match a struct field. You'd get a duplicate key error either when decoding into a field for the second time, or when two unknown fields are equal. If most or all keys map to fields in common inputs, most or all of the allocations related to mapKeys can be avoided. The existing limitation on which CBOR key types may be mapped to a struct field helps to rule out surprising cases, like having two identical strings with one enclosed in a tag. |
Closed by #492 |
What version of fxamacker/cbor are you using?
Current HEAD of master, 2da2d67.
Does this issue reproduce with the latest release?
Yes.
What OS and CPU architecture are you using (
go env
)?go env
OutputWhat did you do?
https://go.dev/play/p/jHzyzyfWCVD
What did you expect to see?
The caller of Unmarshal should have an indication that a map key in the input was effectively ignored. More specifically, if two map keys match the same struct field, they should be treated as equal for the purposes of detecting duplicate keys. This is not compatible with the current behavior of DupMapKeyEnforcedAPF, but since there is no way to tell that this has happened (other than doing a parallel decode into map), it seems reasonable to me to modify DupMapKeyEnforcedAPF rather than adding a new DupMapKeyMode.
I mentioned this in #432 in the context of case-insensitivity, but there may be other ways to reproduce it (possibly a separate issue, but I included in the playground link an example that relies on a keyasint field tag matching either an integer key or a string key containing the int encoded to base-ten).
What did you see instead?
No DupMapKeyError is returned from Unmarshal.
I understand that this raises interesting questions about consistency on the meaning of DupMapKeyError depending on whether the destination object has a struct type or a map type. I think some behavior differences are expected there. Today, for example, decoding into struct already forbids CBOR keys that are neither strings nor integers, even though the same keys may be permitted when decoding a map.
I'm especially interested in this because my use case is always configured to reject inputs with duplicate map keys. Populating the map used to detect duplicates (
cbor/decode.go
Line 1902 in 2da2d67
foundFldIdx []bool
(or similar) to recognize duplicates. I've explored that optimization, and my (application-specific) benchmarks suggest that allocations during decode can be reduced by more than 50% along with a more than 1.6x speedup.The text was updated successfully, but these errors were encountered: