
Decimal hashing very slow, could be cached #58683

Closed
Jimbofbx mannequin opened this issue Apr 2, 2012 · 27 comments
Labels: performance (Performance or resource usage), stdlib (Python modules in the Lib dir)

Comments

Jimbofbx mannequin commented Apr 2, 2012

BPO 14478
Nosy @rhettinger, @jcea, @amauryfa, @mdickinson, @pitrou, @skrah, @JimJJewett, @serhiy-storchaka
Files
  • CachingDecimal_example.py
  • cachedDecimal.py
  • decimal_hash.diff: Cache the decimal hash using a slot
  • decimal_hash_2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/rhettinger'
    closed_at = <Date 2012-11-25.14:02:33.586>
    created_at = <Date 2012-04-02.17:16:26.216>
    labels = ['library', 'performance']
    title = 'Decimal hashing very slow, could be cached'
    updated_at = <Date 2012-11-25.14:02:33.586>
    user = 'https://bugs.python.org/Jimbofbx'

    bugs.python.org fields:

    activity = <Date 2012-11-25.14:02:33.586>
    actor = 'mark.dickinson'
    assignee = 'rhettinger'
    closed = True
    closed_date = <Date 2012-11-25.14:02:33.586>
    closer = 'mark.dickinson'
    components = ['Library (Lib)']
    creation = <Date 2012-04-02.17:16:26.216>
    creator = 'Jimbofbx'
    dependencies = []
    files = ['25099', '25100', '25152', '25169']
    hgrepos = []
    issue_num = 14478
    keywords = ['patch']
    message_count = 27.0
    messages = ['157369', '157375', '157376', '157377', '157446', '157447', '157553', '157603', '157684', '157720', '157731', '157732', '157741', '157943', '157955', '157956', '157957', '157959', '157962', '157965', '157966', '157967', '157969', '157991', '163602', '163610', '176355']
    nosy_count = 11.0
    nosy_names = ['rhettinger', 'jcea', 'amaury.forgeotdarc', 'mark.dickinson', 'pitrou', 'skrah', 'Jimbofbx', 'python-dev', 'Ramchandra Apte', 'Jim.Jewett', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'needs patch'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue14478'
    versions = ['Python 3.4']

Jimbofbx mannequin commented Apr 2, 2012

    Tested on 3.2

Note: I'm aware that Decimal is supposed to be faster in 3.3, but I thought I would bring this to light in case it's still relevant.

Decimal hashing is very slow, even for simple numbers. I found that by caching the hash result, there is a significant speedup whenever a Decimal value is reused.

    I created a class that inherits from Decimal and changed the __hash__ function to try/except a stored hash value as such:

    def __hash__(self):
        try:
            return self.myHash
        except:
            self.myHash = super().__hash__()
            return self.myHash

Example code (with the imports it needs; the loop variable is `_` so it doesn't rebind `i`):

    import time
    from decimal import Decimal

    d = dict()
    start = time.time()
    i, j, k = int(1), int(2), int(3)
    for _ in range(100000):
        d[i, j, k] = 5
    print(time.time() - start)

    d = dict()
    start = time.time()
    i, j, k = Decimal(1), Decimal(2), Decimal(3)
    for _ in range(100000):
        d[i, j, k] = 5
    print(time.time() - start)

    d = dict()
    start = time.time()
    i, j, k = CachingDecimal(1), CachingDecimal(2), CachingDecimal(3)
    for _ in range(100000):
        d[i, j, k] = 5
    print(time.time() - start)

    Output
    int: 0.04699993133544922
    Decimal: 2.312999963760376
    CachingDecimal: 0.1100001335144043

In a real-life example, I changed some of the values in my code from int to Decimal because non-whole numbers needed to be supported (this seemed like the easiest way to do so without breaking my == comparisons), and it slowed my code down massively. Changing to a CachingDecimal type sped up one code block from 92 seconds with Decimal to 7 seconds, much closer to the original int speed.

Note that all of the relevant operations have to be overloaded to return the CachingDecimal type, which is a pain, so it makes a lot of sense to implement this in the decimal module. My understanding is that altering a Decimal value always creates a new Decimal object, so there shouldn't be any issues with the cached hash getting out of sync with the correct hash. Someone may want to verify that, though.
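A self-contained sketch of such a subclass, mirroring the description above (the attribute name `_cached_hash` is hypothetical, and only one operator overload is shown):

```python
from decimal import Decimal

class CachingDecimal(Decimal):
    """Decimal subclass that caches its hash on first use."""

    def __hash__(self):
        try:
            return self._cached_hash
        except AttributeError:
            self._cached_hash = super().__hash__()
            return self._cached_hash

    # Every arithmetic operation must be overloaded like this to keep
    # results in the subclass, which is the "pain" mentioned above.
    def __add__(self, other):
        return CachingDecimal(super().__add__(other))
```

Since Decimal objects are immutable, the cached value can never go stale; the only cost is the extra attribute per instance.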

    Thanks,

    James

    @Jimbofbx Jimbofbx mannequin added stdlib Python modules in the Lib dir performance Performance or resource usage labels Apr 2, 2012
serhiy-storchaka (Member) commented:

It would be better if you provided a working script that demonstrates the issue. I have completed your example (attached).

    I ran the example on Python 3.1 and received:

    0.249999046326
    9.8737308979
    0.587980985641

    Then I ran on Python 3.3:

    0.2100839614868164
    0.8649246692657471
    0.6062228679656982

As you can see, the new implementation is much faster, and the benefit from caching has decreased. I suppose that if we implement caching in C, the difference will be larger.

Then I increased the loop count tenfold, and the times are almost equal (on Python 3.3):

    1.8386573791503906
    8.418540477752686
    8.355770826339722

I can't explain that. The time of the cached version increased disproportionately, by more than 10x.

Jimbofbx mannequin commented Apr 2, 2012

If I increase the loop count 10x on 3.2, I get:

    int: 0.4219999313354492
    Decimal: 24.20299983024597
    CachingDecimal: 1.7809998989105225

The sample you have provided is basically what I'm using. See attached.

What about worst-case hash calculation time for Decimal? How does the C code handle that? This is what I get if I increase the values themselves by 100x, with the same number of loops as above:

    int: 0.5309998989105225
    CachingDecimal: 2.078000068664551
    Decimal: 41.2979998588562

Jimbofbx mannequin commented Apr 2, 2012

    100x should be e100

skrah mannequin commented Apr 3, 2012

I agree that caching the hash would be useful for 3.2, but the request comes at an unfortunate time: 3.2.3 is about to be released, and there's no way that the change would go into it.

    So let's focus on the C version in 3.3. These are the timings on a
    64-bit machine with the C version in 3.3:

    int: 0.537806510925293
    CachingDecimal: 2.2549374103546143
    Decimal: 1.8158345222473145

    These are the timings with a hacked C version that caches the hash:

    int: 0.5755119323730469
    CachingDecimal: 2.3034861087799072
    Decimal: 0.4364290237426758

    The hash calculation time depends on the size of the coefficient
    of the Decimal and the exponent. Note that the context is not
    applied when using the Decimal constructor:

    >>> Decimal(1e100)
    Decimal('10000000000000000159028911097599180468360808563945281389781327557747838772170381060813469985856815104')

    So the numbers you are using have an unusually high precision for
    regular decimal floating point arithmetic.

    If you want well defined limits, I suggest using either:

    >>> Decimal('1e100')
    Decimal('1E+100')

    Or, if the input really must be a float:

    >>> c = getcontext()
    >>> c.create_decimal(1e100)
    Decimal('1.000000000000000015902891110E+100')

    In that latter case, of course the conversion is inexact and
    rounded (but hashing will be faster).
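The difference between the three constructions can be seen directly in the size of the coefficient, which (together with the exponent) is what drives the hash cost. A small sketch; the 28-digit figure assumes the default context precision:

```python
from decimal import Decimal, getcontext

exact = Decimal(1e100)                        # exact float conversion: huge coefficient
literal = Decimal('1e100')                    # string literal: one-digit coefficient
rounded = getcontext().create_decimal(1e100)  # rounded to the context's precision

# Coefficient sizes, via the (sign, digits, exponent) tuple:
print(len(exact.as_tuple().digits))    # around a hundred digits
print(len(literal.as_tuple().digits))  # 1
print(len(rounded.as_tuple().digits))  # 28 with the default context
```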

skrah mannequin commented Apr 3, 2012

> but hashing will be faster.

    I retract that. The exponent actually makes things worse.

RamchandraApte mannequin commented Apr 5, 2012

    I recommend that __hash__ should use functools.lru_cache for caching.

Jimbofbx mannequin commented Apr 5, 2012

I presume you mean in 3.2? Have you looked at the source code for that decorator? It's fundamentally a try/except, but with a lot more bloat than is needed for caching a single int result from a function with no arguments. It's actually a lot slower.

If this is likely going to see use in 3.3, then the cache would probably just be a C long, since 3.3 uses C. 0 would indicate "uncalculated".

    Hash function would have to be set up to never return 0.

    Also every function would need to be tested to make sure there isn't any "in-place" modification of the Decimal object that could alter the hash value.

    I like how the cached hash in 3.3 is faster than int for hashing. Shouldn't an int just return itself? Why would it be slower?

jimjjewett mannequin commented Apr 6, 2012

    @jimbofbx: You are correct that a value has to be reserved to indicate that the hash hasn't been (or can't be) computed, but python uses -1 instead of zero.

    An integer can't return itself because a hash is an unboxed integer; that said, there is a fast path for small enough integers. You can see the actual code at
    http://hg.python.org/cpython/file/388016438a50/Objects/longobject.c#l2581

    Since it is too late for 3.2, I suggest closing this and opening a separate 3.3 issue if the existing improvements are not sufficient.
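The reserved -1 sentinel mentioned above is easy to observe from Python (this is a CPython implementation detail, not a language guarantee):

```python
# CPython reserves hash value -1 internally to signal "error" (and, for
# cached hashes, "not yet computed"), so hash(-1) is silently mapped to -2.
print(hash(-1))   # -2
print(hash(-2))   # -2 as well
print(hash(7))    # 7: other small ints hash to themselves
```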

    @rhettinger rhettinger self-assigned this Apr 7, 2012
pitrou (Member) commented Apr 7, 2012

    I recommend that __hash__ should use functools.lru_cache for caching.

    Why would you do such a thing? A hash value is a single 64-bit slot, no need to add the memory consumption of a whole dictionary and the runtime cost of a LRU eviction policy when you can simply cache the hash in the object itself (like we already do for strings)...
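The "cache it in the object itself" pattern can be sketched in pure Python with a reserved sentinel, analogous to what CPython does for str objects in C (the class and attribute names here are hypothetical):

```python
class Point:
    """Per-object hash caching in a single slot."""
    __slots__ = ('x', 'y', '_hash')

    def __init__(self, x, y):
        self.x = x
        self.y = y
        self._hash = -1          # -1 means "not yet computed"

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        if self._hash == -1:
            h = hash((self.x, self.y))
            if h == -1:          # keep -1 reserved as the sentinel
                h = -2
            self._hash = h
        return self._hash
```

The memory cost is one extra slot per instance, versus a whole hash table plus eviction bookkeeping for an LRU cache.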

serhiy-storchaka (Member) commented:

> > I recommend that __hash__ should use functools.lru_cache for caching.
>
> Why would you do such a thing? A hash value is a single 64-bit slot, no need to add the memory consumption of a whole dictionary and the runtime cost of a LRU eviction policy when you can simply cache the hash in the object itself (like we already do for strings)...

It was a joke (I think), taking into account that the LRU cache uses a hash table and needs to calculate the hash of its argument (i.e., the Decimal self) to get the cached hash value.

pitrou (Member) commented Apr 7, 2012

    > > I recommend that __hash__ should use functools.lru_cache for caching.
    > Why would you do such a thing? A hash value is a single 64-bit slot, no need to add the memory consumption of a whole dictionary and the runtime cost of a LRU eviction policy when you can simply cache the hash in the object itself (like we already do for strings)...

    It was a joke (I think). Taking into account the fact that LRU cache
    uses a hashtable and need to calculate the hash of arguments (i.e., the
    Decimal self) to get the cached value of hash.

    Damn. Shame on me for not understanding Raymond's humour :-)

pitrou (Member) commented Apr 7, 2012

On Saturday, April 7, 2012 at 17:22 +0000, Raymond Hettinger wrote:

    ----------
    keywords: +patch
    Added file: http://bugs.python.org/file25152/decimal_hash.diff

    I think patching the C version of Decimal would be more useful :)

skrah mannequin commented Apr 10, 2012

    I think it would be a good idea to fix this in the Python version
    for the other implementations.

jimjjewett mannequin commented Apr 10, 2012

    Stefan Krah has a good point.

    Since the only cost is an extra slot, and this is for users who have already chosen to use Decimal instead of a more efficient (but possibly less accurate) representation, even without the native speedups to Decimal ... I guess I'm now +1.

    It is clearly an implementation detail, so I don't think the docs need to be changed. I'm not sure the tests do, nor am I sure *how* to test for it in a black-box fashion.

    I have reviewed the patch and have no objections.

    So unless a new objection comes up in the next few days, I would say this is ready to be checked in.

python-dev mannequin commented Apr 10, 2012

    New changeset a012d5df2c73 by Stefan Krah in branch 'default':
    Issue bpo-14478: Cache the hash of a Decimal in the C version.
    http://hg.python.org/cpython/rev/a012d5df2c73

skrah mannequin commented Apr 10, 2012

I changed the C version to cache the hash as well: for the submitted test case the speedup is only 5x, but hashing times vary greatly depending on the size of the coefficient and the exponent.

    BTW, the tests already call both hash() and __hash__() on the same
    number, so retrieving the cached value is actually tested.

skrah mannequin commented Apr 10, 2012

    The patch for the Python version looks good to me.

serhiy-storchaka (Member) commented:

> The patch for the Python version looks good to me

Oh, but the approach used by James Hutchison is faster. When I disable _decimal:

    Unpatched:
    int: 2.24056077003479
    CachingDecimal: 8.49468207359314
    Decimal: 187.68132972717285

    With rhettinger's LBYL patch:
    int: 2.1670639514923096
    CachingDecimal: 8.790924310684204
    Decimal: 10.426796436309814

    With EAFP patch:
    int: 2.1392786502838135
    CachingDecimal: 8.431122303009033
    Decimal: 8.263015270233154
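The two patterns being compared, LBYL versus EAFP, can be sketched as Decimal subclasses (hypothetical class names and cache attribute; the actual patches modified decimal.py itself):

```python
from decimal import Decimal

class LBYLDecimal(Decimal):
    """Look before you leap: check for the cached value explicitly."""
    def __hash__(self):
        if not hasattr(self, '_hash_cache'):
            self._hash_cache = super().__hash__()
        return self._hash_cache

class EAFPDecimal(Decimal):
    """Easier to ask forgiveness: assume the cache exists, catch the miss."""
    def __hash__(self):
        try:
            return self._hash_cache
        except AttributeError:
            self._hash_cache = super().__hash__()
            return self._hash_cache
```

EAFP wins on the hot path because a cache hit does no attribute-existence test at all; the exception cost is paid only once per object.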

Jimbofbx mannequin commented Apr 10, 2012

    In the patch:

    This:
    + except AttributeError:
    + pass

    should be:
    + except:
    <everything inside except statement>

    Checking for the AttributeError is very slightly slower. Not by a lot, but I think if we're going for speed we might as well be as fast as possible. I can't imagine any other exception coming from that try statement.

Using except: pass, as opposed to sticking everything inside the except statement, is also very slightly slower.

    Simple test case, 10 million loops:
    except: 7.140999794006348
    except AttributeError: 7.8440001010894775

    Exception code
    in except: 7.483999967575073
    after except/pass: 7.75
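A comparison like this can be reproduced with timeit. This sketch (hypothetical, not the original test case) times only the cache-miss path, since the handler type is consulted only when an exception is actually raised:

```python
import timeit

SETUP = "class C: pass\nobj = C()"

# obj.missing raises AttributeError every iteration, exercising the handler.
miss_bare = timeit.timeit(
    "try:\n obj.missing\nexcept:\n pass",
    setup=SETUP, number=100_000)
miss_attr = timeit.timeit(
    "try:\n obj.missing\nexcept AttributeError:\n pass",
    setup=SETUP, number=100_000)

print(f"bare except:           {miss_bare:.3f}s")
print(f"except AttributeError: {miss_attr:.3f}s")
```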

serhiy-storchaka (Member) commented:

> I can't imagine any other exception coming from that try statement.

    KeyboardInterrupt

amauryfa (Contributor) commented:

> Checking for the AttributeError is very slightly slower

Why are you trying to micro-optimize the Python version, when the C implementation does it better and faster?

serhiy-storchaka (Member) commented:

> Using except: pass as opposed to sticking everything inside the except statement is also very slightly slower as well

I ran the test several times, and the difference between the pass and in-except variants was no larger than the difference between runs of the same variant. If there is a real difference, that is a reason for an interpreter optimization.

rhettinger (Contributor) commented:

    I don't see a need to cache the hash value in the pure Python version.

serhiy-storchaka (Member) commented:

    The C version of decimal may not always be available. In particular, it is not compatible with C89. Therefore, efficiency of the pure Python version of decimal is important.

    Any chance to get it in Python 3.3?

mdickinson (Member) commented:

    I agree with Raymond: I don't see a real need to patch the Python version here. If we do want to patch the Python version, I'd go with Raymond's simple patch.

mdickinson (Member) commented:

    Closing as fixed (for the C version, at least).

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022