Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Saving a modules __dict__ attribute #41

Merged
merged 4 commits into from
May 28, 2014

Conversation

matsjoyce
Copy link
Contributor

Possibly a follow up for #35, but saving a session doesn't save the state of the modules:

Python 3.4.1 (default, May 19 2014, 17:23:49) 
[GCC 4.9.0 20140507 (prerelease)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> import test
>>> test.a = 2
>>> dill.dump_session()
>>> 
==========================Restart==========================
Python 3.4.1 (default, May 19 2014, 17:23:49) 
[GCC 4.9.0 20140507 (prerelease)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> dill.load_session()
>>> test
<module 'test' from 'test.py'>
>>> test.a
1
>>>

test.py:

def f():
    print(x)
a = 1

I think this code does the job, but it may need a few adjustments, depending which modules need to be pickled.

@mmckerns
Copy link
Member

Ok, I'm taking a look at your code. It would be nice if your test failed with an AssertionError instead of an AttributeError.

dude@hilbert>$ python test_module.py 
Traceback (most recent call last):
  File "test_module.py", line 23, in <module>
    assert module.a == 1234
AttributeError: 'module' object has no attribute 'a'

@matsjoyce
Copy link
Contributor Author

Oops...
I'll change it from

assert module.a == 1234

to

assert hasattr(module, "a") and module.a == 1234

I'll wait to see if there is anything else, though.

@mmckerns
Copy link
Member

should probably clean up the test_mixins.pyc file after the test is run.

I'm poking around on the different python versions and different tests now.

@matsjoyce
Copy link
Contributor Author

It would be nice if your test failed with an AssertionError instead of an AttributeError.
should probably clean up the test_mixins.pyc file after the test is run.

Done

@mmckerns
Copy link
Member

Python 3.3.5 (default, Mar 10 2014, 21:37:38) 
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> import test_mixins
>>> test_mixins.a = 2
>>> dill.dump_session()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mmckerns/lib/python3.3/site-packages/dill-0.2.1.dev-py3.3.egg/dill/dill.py", line 182, in dump_session
    pickler.dump(main_module)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 235, in dump
    self.save(obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 297, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/mmckerns/lib/python3.3/site-packages/dill-0.2.1.dev-py3.3.egg/dill/dill.py", line 772, in save_module
    state=_main_dict)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 430, in save_reduce
    save(state)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 297, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/mmckerns/lib/python3.3/site-packages/dill-0.2.1.dev-py3.3.egg/dill/dill.py", line 504, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 633, in save_dict
    self._batch_setitems(obj.items())
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 666, in _batch_setitems
    save(v)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 297, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/mmckerns/lib/python3.3/site-packages/dill-0.2.1.dev-py3.3.egg/dill/dill.py", line 772, in save_module
    state=_main_dict)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 430, in save_reduce
    save(state)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 297, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/mmckerns/lib/python3.3/site-packages/dill-0.2.1.dev-py3.3.egg/dill/dill.py", line 504, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 633, in save_dict
    self._batch_setitems(obj.items())
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 666, in _batch_setitems
    save(v)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 297, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/mmckerns/lib/python3.3/site-packages/dill-0.2.1.dev-py3.3.egg/dill/dill.py", line 467, in save_function
    obj.__dict__), obj=obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 412, in save_reduce
    save(args)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 297, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 553, in save_tuple
    save(element)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 297, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/mmckerns/lib/python3.3/site-packages/dill-0.2.1.dev-py3.3.egg/dill/dill.py", line 456, in save_code
    pickler.save_reduce(_unmarshal, (marshal.dumps(obj),), obj=obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 412, in save_reduce
    save(args)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 297, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 539, in save_tuple
    save(element)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 297, in save
    f(self, obj) # Call unbound method with explicit self
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 500, in save_bytes
    (str(obj, 'latin1'), 'latin1'), obj=obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 411, in save_reduce
    save(func)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 297, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Users/mmckerns/lib/python3.3/site-packages/dill-0.2.1.dev-py3.3.egg/dill/dill.py", line 618, in save_builtin_method
    pickler.save_reduce(_get_attr, (obj.__self__, obj.__name__), obj=obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 416, in save_reduce
    self.memoize(obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/pickle.py", line 255, in memoize
    assert id(obj) not in self.memo
AssertionError

Possibly an issue of getting rid of if is_dill(pickler)?

@matsjoyce
Copy link
Contributor Author

😩 Sorry, mistake in condition.

@matsjoyce
Copy link
Contributor Author

Do you want me to squash the commits?

@mmckerns
Copy link
Member

It still doesn't seem to work for me with dump_session.

Python 3.3.5 (default, Mar 10 2014, 21:37:38) 
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> import numpy as np
>>> np.sin = np.cos
>>> import test_mixins as module
>>> module.a = 2
>>> dill.dump_session('foo.pik')
>>> ^D
dude@hilbert>$ python
Python 3.3.5 (default, Mar 10 2014, 21:37:38) 
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> dill.load_session('foo.pik')
>>> np
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'np' is not defined
>>> module
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'module' is not defined

@matsjoyce
Copy link
Contributor Author

😩 Still the condition. I've combined it with the old condition, so the __main__ module ought to pickle correctly.

@matsjoyce
Copy link
Contributor Author

should probably clean up the test_mixins.pyc file after the test is run.

tests/test_nested.py doesn't clean up when run by python 3. Shall I file a separate issue?

@mmckerns
Copy link
Member

Looks like matsjoyce@7bcf040 works both in a file and with dump_session. It looks to work nicely in most cases I can think of.

I could accept it, as is… because it adds a feature and doesn't remove any… however, I'd like to ask your opinion on proceeding. Should the following be the correct behavior? It currently is, b/c "standard" modules are stored by reference… but it is definitely possible to add something to a standard module's dict.

>>> import dill
>>> import numpy as np
>>> np.min = np.max
>>> dill.dump_session()
>>> ^D
############ restart ############
>>> import dill
>>> dill.load_session()
>>> np.min([1,2,3,4,5])
1

@mmckerns
Copy link
Member

Note that this is also the current behavior, so maybe it makes sense? I'm not sure.

Python 2.7.6 (default, Nov 12 2013, 13:26:39) 
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill        
>>> import abc
>>> abc.ABCMeta.zzz = 1
>>> p = dill.copy(abc.ABCMeta)
>>> p
<class 'abc.ABCMeta'>
>>> p.zzz
1
>>> dill.dump_session('foo.pik')
>>> ^D
dude@hilbert>$ python
Python 2.7.6 (default, Nov 12 2013, 13:26:39) 
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> dill.load_session('foo.pik')
>>> p
<class 'abc.ABCMeta'>
>>> p.zzz
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'ABCMeta' has no attribute 'zzz'

@matsjoyce
Copy link
Contributor Author

The reason for excluding standard modules is that most of them don't pickle (sys, re, etc). If a list of picklable or non-picklable modules could be created, that behaviour could be changed to be more logical. Could a list be made though dill._objects?

from dill import _objects
{getattr(i.__class__, '__module__', "")  for i in _objects.x.values()}

@matsjoyce
Copy link
Contributor Author

Using a function like:

def _can_pickle_module(mod, _cache={}, _recursion_protection=[None]):
    if _recursion_protection[0] is not None:
        return mod is _recursion_protection[0]
    if mod not in _cache:
        _recursion_protection[0] = mod
        try:
            dumps(mod)
        except:
            _cache[mod] = False
        else:
            _cache[mod] = True
        finally:
            _recursion_protection[0] = None
    return _cache[mod]

and the condition in save_module being:

if _can_pickle_module(obj) or is_dill(pickler) and obj is pickler._main_module:

it'll work. However, this approach is less efficient.

@matsjoyce
Copy link
Contributor Author

The issue with abc.ABCMeta is that is it dumped like:

>>> dill.dumps(p)
T4: <class 'abc.ABCMeta'>
b'\x80\x03cabc\nABCMeta\nq\x00.'

The type is handed to pickle, which does not preserve the __dict__ attribute. An edit to dill.dill.save_type would be needed to change that.

@mmckerns
Copy link
Member

Right. What I'm saying, is what do you think the expected behavior should be?

My solution to abc.ABCMeta, is exactly the same as yours… but do you think it's the proper thing to do? I'm tending to think it might be.

My solution to an edited __dict__ on a builtin or "standard module" would be to pickle the object by reference like it is now, but also pickle it's existing __dict__… then reconstitute in unpicking. Make a _save_module that serializes the module as is done now, and also serializes the __dict__… and in unpicking, deserializes the module, then deserializes the associated __dict__ and applies it to the module before returning it to the user. Then it doesn't matter which modules serialize and which don't. The issue is serializing the associated __dict__.

@matsjoyce
Copy link
Contributor Author

My solution to abc.ABCMeta, is exactly the same as yours… but do you think it's the proper thing to do? I'm tending to think it might be.

Yes, as if dill.dump_session is supposed to restore the session exactly, the more that is saved, the better.

My solution to an edited __dict__ on a builtin or "standard module" would be to pickle the object by reference like it is now, but also pickle it's existing __dict__… then reconstitute in unpicking. Make a _save_module that serializes the module as is done now, and also serializes the __dict__… and in unpicking, deserializes the module, then deserializes the associated __dict__ and applies it to the module before returning it to the user. Then it doesn't matter which modules serialize and which don't. The issue is serializing the associated __dict__.

All the builtin modules (why do I keep calling them standard...) serialize by reference, but the __dicts__ don't:

>>> import dill, sys
>>> dill.dumps(sys.__dict__)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/matthew/GitHub/dill/dill/dill.py", line 143, in dumps
    dump(obj, file, protocol, byref)
  File "/home/matthew/GitHub/dill/dill/dill.py", line 136, in dump
    pik.dump(obj)
  File "/usr/lib/python3.4/pickle.py", line 410, in dump
    self.save(obj)
  File "/usr/lib/python3.4/pickle.py", line 477, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/matthew/GitHub/dill/dill/dill.py", line 504, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python3.4/pickle.py", line 812, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/lib/python3.4/pickle.py", line 838, in _batch_setitems
    save(v)
  File "/usr/lib/python3.4/pickle.py", line 522, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python3.4/pickle.py", line 600, in save_reduce
    save(func)
  File "/usr/lib/python3.4/pickle.py", line 477, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/matthew/GitHub/dill/dill/dill.py", line 827, in save_type
    StockPickler.save_global(pickler, obj)
  File "/usr/lib/python3.4/pickle.py", line 918, in save_global
    (obj, module_name, name))
_pickle.PicklingError: Can't pickle <class 'sys.flags'>: it's not the same object as sys.flags

If we have a way to test whether a module's __dict__ can be serialised, then it'll all be fine. I'm not sure if there is a cleverer way of checking that than _can_pickle_module above.

mmckerns added a commit that referenced this pull request May 28, 2014
Saving a modules __dict__ attribute
@mmckerns mmckerns merged commit 86e0bb8 into uqfoundation:master May 28, 2014
@mmckerns
Copy link
Member

This issues should stay open, or be a new issue.

@matsjoyce matsjoyce deleted the module_dict branch May 28, 2014 11:04
@mmckerns
Copy link
Member

And, with regard to your test above… you have to be extra careful with a __dict__… in many cases, it might actually be a weakref… so it will break unless you grab it the right way. I haven't investigated what kind of beast it is, however.

@matsjoyce
Copy link
Contributor Author

Shall I file new issues for:

  • type's __dict__s not being pickled
  • tests/test_nested.py doesn't clean up when run by python 3
  • builtin modules __dict__s are not pickled

@mmckerns
Copy link
Member

I was just typing the same. Sure.

@mmckerns
Copy link
Member

ref this issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants