[BUG] Test error on linux gpu with fastai #1047

Closed
miguelgfierro opened this issue Feb 10, 2020 · 7 comments

@miguelgfierro
Collaborator

miguelgfierro commented Feb 10, 2020

Description

    @pytest.mark.smoke
    @pytest.mark.gpu
    def test_fastai_smoke(notebooks):
        notebook_path = notebooks["fastai"]
        pm.execute_notebook(
            notebook_path,
            OUTPUT_NOTEBOOK,
            kernel_name=KERNEL_NAME,
>           parameters=dict(TOP_K=10, MOVIELENS_DATA_SIZE="100k", EPOCHS=1),
        )

tests/smoke/test_notebooks_gpu.py:72: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/data/anaconda/envs/nightly_reco_gpu/lib/python3.6/site-packages/papermill/execute.py:100: in execute_notebook
    raise_for_execution_errors(nb, output_path)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

nb = {'cells': [{'cell_type': 'code', 'metadata': {'inputHidden': True, 'hide_input': True}, 'execution_count': None, 'sour...nd_time': '2020-02-10T16:14:36.593733', 'duration': 14.501099, 'exception': True}}, 'nbformat': 4, 'nbformat_minor': 2}
output_path = 'output.ipynb'

    def raise_for_execution_errors(nb, output_path):
        """Assigned parameters into the appropriate place in the input notebook
    
        Parameters
        ----------
        nb : NotebookNode
           Executable notebook object
        output_path : str





E           --> 150         with pd.option_context('display.max_colwidth', -1):
E               151             display(HTML(df.to_html(index=False)))
E               152 
E           
E           /data/anaconda/envs/nightly_reco_gpu/lib/python3.6/site-packages/pandas/_config/config.py in __enter__(self)
E               405 
E               406         for pat, val in self.ops:
E           --> 407             _set_option(pat, val, silent=True)
E               408 
E               409     def __exit__(self, *args):
E           
E           /data/anaconda/envs/nightly_reco_gpu/lib/python3.6/site-packages/pandas/_config/config.py in _set_option(*args, **kwargs)
E               125         o = _get_registered_option(key)
E               126         if o and o.validator:
E           --> 127             o.validator(v)
E               128 
E               129         # walk the nested dict
E           
E           /data/anaconda/envs/nightly_reco_gpu/lib/python3.6/site-packages/pandas/_config/config.py in is_nonnegative_int(value)
E               842 
E               843     msg = "Value must be a nonnegative integer or None"
E           --> 844     raise ValueError(msg)
E               845 
E               846 
E           
E           ValueError: Value must be a nonnegative integer or None
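
Reading the last frame above, the installed pandas appears to validate `display.max_colwidth` as a nonnegative integer or `None`, so the `-1` passed on line 150 is rejected. The following is a minimal stand-in for that check, written from the error message alone and not copied from pandas; `validate_max_colwidth` and `is_accepted` are hypothetical names used only for illustration.

```python
def validate_max_colwidth(value):
    """Hypothetical stand-in for the check named in the traceback.

    Written from the error message alone; it is not the pandas source.
    """
    if value is None:
        return
    if isinstance(value, int) and value >= 0:
        return
    raise ValueError("Value must be a nonnegative integer or None")


def is_accepted(value):
    """Return True if the stand-in validator lets the value through."""
    try:
        validate_max_colwidth(value)
    except ValueError:
        return False
    return True


# Try a typical width, None, and the -1 seen in the traceback.
for candidate in (50, None, -1):
    print(candidate, "accepted" if is_accepted(candidate) else "rejected")
```

Under that reading, `None` would pass where `-1` does not; the error message itself lists `None` as an allowed value.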




link to log: https://dev.azure.com/best-practices/recommenders/_build/results?buildId=22669&view=logs&j=513fa82f-6bf5-5d01-2457-cb22781e601b&t=9e260265-5f8c-5335-1126-375960d54d1f

In which platform does it happen?

How do we replicate the issue?

Expected behavior (i.e. solution)

Other Comments

@miguelgfierro added the bug label on Feb 10, 2020
@anargyri
Collaborator

This link points to NNI (the same error as the other issue), not to FastAI.

@loomlike self-assigned this on Feb 11, 2020
@loomlike
Collaborator

@anargyri @miguelgfierro So is this an NNI-related issue, or a fastai issue with a wrong link?

@miguelgfierro
Collaborator Author

It is fastai; I fixed the link.

@anargyri
Collaborator

I can't replicate either this or the NNI issue. Both tests pass on another VM after generating the conda files from scratch. Most likely there is some problem with the build VM, not with the code. I suggest you try again on the build VM with a completely new and fresh conda environment.

@loomlike
Collaborator

Based on the error message, I believe the error is related to this: https://forums.fast.ai/t/fastai-v1-and-pandas-1-0-0-error-in-tabular-data-show-xys/63210

It seems the test is now passing. Has either the fastai or the pandas package version changed in reco_gpu.yaml?

@miguelgfierro
Collaborator Author

It is weird: the last five or so runs failed, and today it passes. I'm not sure why. I also saw that the master tests are not running daily https://dev.azure.com/best-practices/recommenders/_build/results?buildId=21076&view=results (the last run was on Jan 18). Did anyone change this?

@loomlike
Collaborator

Confirmed (and replicated) that the issue depends on the pandas package version:
pandas==1.0.0 causes the issue.
pandas==1.0.1 fixes it.

So, how do we want to handle this in the env file now?
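
One possible answer, offered as a sketch and not as a decision recorded in this thread, is to exclude only the release that was confirmed to fail, for example with a version constraint such as `pandas>=1.0.1` in the env file. The same rule can be written as a small Python guard; `AFFECTED_PANDAS`, `parse_release`, and `is_affected_pandas` are hypothetical names introduced here.

```python
import re

# Releases reported as failing in this thread; extend only with confirmed versions.
AFFECTED_PANDAS = {(1, 0, 0)}


def parse_release(version):
    """Extract (major, minor, patch) from a version string such as '1.0.1'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_affected_pandas(version):
    """Hypothetical helper: True if this pandas release is known to fail here."""
    return parse_release(version) in AFFECTED_PANDAS


# In practice the argument would be pandas.__version__;
# literals keep the sketch free of third-party imports.
for version in ("1.0.0", "1.0.1"):
    print(version, "affected" if is_affected_pandas(version) else "ok")
```

A guard like this could sit at the top of a test module to make a bad environment fail early with a clear message, while the env file constraint prevents the bad release from being installed in the first place.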
