Can create too many distinct numeric values #153

yoid2000 · 2025-01-29T16:16:54Z

For continuous numeric data that nevertheless has only a "medium" number of distinct values, SynDiffix can end up creating substantially more distinct values than the original data contains.

What happens is that there are enough distinct original values that many of them have relatively low counts (say 10-20), so when the column is combined with another column, some of the values get suppressed. Then, during microdata assignment, random values are assigned that are not original values, and more distinct values end up being created.
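A toy illustration of the effect, not SynDiffix code: the function name, the "every third row" suppression rule, and the two-decimal rounding are all assumptions made up for this sketch. It only shows that filling some rows with random in-range values inflates the distinct count.

```python
import random

def distinct_counts_after_random_fill(original, lo, hi, seed=0):
    """Return (distinct in original, distinct in synthetic).

    Hypothetical stand-in for suppression: every third row loses its
    original value and receives a uniform random value from [lo, hi].
    """
    rng = random.Random(seed)
    synthetic = [
        round(rng.uniform(lo, hi), 2) if i % 3 == 0 else value
        for i, value in enumerate(original)
    ]
    return len(set(original)), len(set(synthetic))

# 50 distinct values, each occurring 15 times (a "medium" number of
# distinct values with relatively low counts).
original = [float(v) for v in range(50) for _ in range(15)]
n_orig, n_syn = distinct_counts_after_random_fill(original, 0.0, 50.0)
print(n_orig, n_syn)
```

The synthetic column keeps all 50 original values and gains many new ones from the random draws.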

We need a mechanism whereby, when the original data values are not suppressed, microdata is assigned only from the original values.
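One possible shape for this, as a minimal sketch only: `assign_value` and `unsuppressed_values` are hypothetical names, not part of SynDiffix, and how the set of unsuppressed values would be determined is left open here.

```python
import bisect
import random

def assign_value(lo, hi, unsuppressed_values, rng):
    """Draw one microdata value for the range [lo, hi).

    unsuppressed_values is assumed to be a sorted list of original
    values that were not suppressed (hypothetical input).
    """
    left = bisect.bisect_left(unsuppressed_values, lo)
    right = bisect.bisect_left(unsuppressed_values, hi)
    candidates = unsuppressed_values[left:right]
    if candidates:
        # Original values are available in this range: reuse them only.
        return rng.choice(candidates)
    # No unsuppressed original value in range: fall back to a random value.
    return rng.uniform(lo, hi)

rng = random.Random(1)
values = [1.0, 2.5, 4.0, 7.5]
drawn = {assign_value(2.0, 5.0, values, rng) for _ in range(100)}
fallback = assign_value(5.0, 6.0, values, rng)
```

With this approach, new distinct values appear only in ranges where no original value survives suppression.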
