Level 3 model fails due to not finding second-level results #101

Open
adswa opened this issue Jan 29, 2019 · 24 comments

@adswa
Contributor

adswa commented Jan 29, 2019

(migrating this from pybids where I mistakenly posted it before) pinging @effigies (@yarikoptic promised to give you a hydra account so that I can finally share the data).

We've stripped our dataset to a bare minimum: 3 subjects, each with 2 runs, and a model file with only one event in X to reduce computation time. You can find the dataset on hydra in the directory
/data/movieloc/backup_store/publish-BIDSSacc on branch slim (please disregard the master branch for now - it is a couple of weeks behind and I don't want to push the currently rather messy state of it).
The model (found under models/movel_v3_smdl_autocon_splitconv_oneevent.json) is invoked with

fitlins . Jan_28 'run' -m models/movel_v3_smdl_autocon_splitconv_oneevent.json --desc highpass --space 'MNI152NLin6Sym' -d $PWD -w 'Jan_28_wd' --n-cpus 3

Our model:
{
    "name": "FEF_localizer",
    "Input": {
        "session": "movie"
        },
    "Steps": [
        {
            "Level": "run",
            "Model": {
                "X": [
                    "amplitude_.RIGHT"
                    ]
            },
            "Contrasts": [],
            "AutoContrasts": true,
            "Transformations": [{
                "Name": "Split",
                "Input": ["amplitude_"],
                "By": ["trial_type"]
                },
                {
                "Name": "Convolve",
                "Input": ["amplitude_.RIGHT"],
                "Model": "spm"
                  }]
            },
        {
            "Level": "subject",
            "AutoContrasts": true
        },
        {
            "Level": "dataset",
            "AutoContrasts": true
        }]
}
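
To make the transformation chain above more concrete, here is a minimal sketch (plain pandas, not pybids itself; the events table and its values are made up) of what the Split step is expected to do before Convolve is applied: it turns the single amplitude_ column into one column per trial_type value, which is where the amplitude_.RIGHT name in X comes from.

    import pandas as pd

    # Hypothetical events for a single run (made-up onsets and amplitudes)
    events = pd.DataFrame({
        "onset": [0.0, 2.0, 4.0, 6.0],
        "amplitude_": [1.0, 1.0, 1.0, 1.0],
        "trial_type": ["RIGHT", "LEFT", "RIGHT", "LEFT"],
    })

    # "Split" by trial_type: one amplitude column per level of trial_type
    split = events.pivot_table(index="onset", columns="trial_type",
                               values="amplitude_", fill_value=0)
    split.columns = ["amplitude_.{}".format(c) for c in split.columns]
    print(split.columns.tolist())   # ['amplitude_.LEFT', 'amplitude_.RIGHT']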

The model fails on level 3 (dataset) with

ValueError: A second level model requires a list with atleast two first level models or niimgs

after producing seemingly sensible output on the first (run) and second (subject) level. As far as I can judge, the files that the level 3 model should ingest have been created.

Here is the traceback
Traceback (most recent call last):
  File "/home/adina/Repos/nipype/nipype/pipeline/plugins/multiproc.py", line 69, in run_node
    result['result'] = node.run(updatehash=updatehash)
  File "/home/adina/Repos/nipype/nipype/pipeline/engine/nodes.py", line 473, in run
    result = self._run_interface(execute=True)
  File "/home/adina/Repos/nipype/nipype/pipeline/engine/nodes.py", line 1254, in _run_interface
    self.config['execution']['stop_on_first_crash'])))
  File "/home/adina/Repos/nipype/nipype/pipeline/engine/nodes.py", line 1176, in _collate_results
    (self.name, '\n'.join(msg)))
Exception: Subnodes of node: l3_model failed:
Subnode 0 failed
Error: Traceback (most recent call last):

  File "/home/adina/Repos/nipype/nipype/pipeline/engine/utils.py", line 99, in nodelist_runner
    result = node.run(updatehash=updatehash)

  File "/home/adina/Repos/nipype/nipype/pipeline/engine/nodes.py", line 473, in run
    result = self._run_interface(execute=True)

  File "/home/adina/Repos/nipype/nipype/pipeline/engine/nodes.py", line 557, in _run_interface
    return self._run_command(execute)

  File "/home/adina/Repos/nipype/nipype/pipeline/engine/nodes.py", line 637, in _run_command
    result = self._interface.run(cwd=outdir)

  File "/home/adina/Repos/nipype/nipype/interfaces/base/core.py", line 369, in run
    runtime = self._run_interface(runtime)

  File "/home/adina/Repos/fitlins/fitlins/interfaces/nistats.py", line 174, in _run_interface
    model.fit(input, design_matrix=design_matrix)

  File "/home/adina/env/fitlins/local/lib/python3.5/site-packages/nistats/second_level_model.py", line 164, in fit
    raise ValueError('A second level model requires a list with at'

ValueError: A second level model requires a list with atleast two first level models or niimgs

Reproducibility sits in a bar somewhere and laughs its ass off, but I'm trying to also give an overview of the custom changes @yarikoptic and I made to the fitlins source code. I don't see any particular relevance of the changes we made to the issue at hand (mostly hardcoded quick fixes for issues that arose), but then again, I'm certainly not the one to judge what is relevant and what is not ;-) , and an attempt to rerun the model would fail without the additional space choice and the selection of only one of the two identical bold files being returned.

fitlins changes
diff --git a/fitlins/cli/run.py b/fitlins/cli/run.py
index 28dcdba..f64e14e 100755
--- a/fitlins/cli/run.py
+++ b/fitlins/cli/run.py
@@ -85,15 +85,15 @@ def get_parser():
     g_bids.add_argument('--derivative-label', action='store', type=str,
                         help='execution label to append to derivative directory name')
     g_bids.add_argument('--space', action='store',
-                        choices=['MNI152NLin2009cAsym', ''],
+                        choices=['MNI152NLin2009cAsym', '', 'MNI152NLin6Sym'],
                         default='MNI152NLin2009cAsym',
                         help='registered space of input datasets. Empty value for no explicit space.')

diff --git a/fitlins/interfaces/bids.py b/fitlins/interfaces/bids.py
index 919742f..a7b98a9 100644
--- a/fitlins/interfaces/bids.py
+++ b/fitlins/interfaces/bids.py
@@ -177,7 +177,7 @@ class LoadBIDSModel(SimpleInterface):
         selectors = self.inputs.selectors
 
         analysis = Analysis(model=self.inputs.model, layout=layout)
-        analysis.setup(drop_na=False, desc='preproc', **selectors)
+        analysis.setup(drop_na=False, desc='highpass', space='MNI152NLin6Sym', **selectors)
         self._load_level1(runtime, analysis)
         self._load_higher_level(runtime, analysis)
 
@@ -198,25 +198,37 @@ class LoadBIDSModel(SimpleInterface):
[...]
-            if len(preproc_files) != 1:
-                raise ValueError('Too many BOLD files found')
+            # ATM we could get multiple entries for the same file
+            # see https://github.com/bids-standard/pybids/issues/350
+            if len(set(f.path for f in preproc_files)) != 1:
+                raise ValueError(
+                    'Too many (%d) BOLD files found: %s'
+                    % (len(preproc_files), ', '.join(preproc_files))
+                )
 
             fname = preproc_files[0].path
 

Do you have any idea what I am missing here to figure out why the dataset level does not work?

@adswa
Contributor Author

adswa commented Jan 30, 2019

I have a preliminary update on this:
The failure occurs in

for name, weights, type in prepare_contrasts(self.inputs.contrast_info, names):
    # Need to add F-test support for intercept (more than one column)
    # Currently only taking 0th column as intercept (t-test)
    weights = weights[0]
    input = (np.array(filtered_files)[weights != 0]).tolist()
    design_matrix = pd.DataFrame({'intercept': weights[weights != 0]})
    model.fit(input, design_matrix=design_matrix)
    stat = model.compute_contrast(second_level_stat_type=type)
    stat_fname = os.path.join(runtime.cwd, '{}.nii.gz').format(name)
    stat.to_filename(stat_fname)
    contrast_maps.append(stat_fname)
    contrast_metadata.append({'contrast': name, **out_ents})

during the third-level model. The problem is that `self.inputs.contrast_info`, for a reason I do not yet understand, ingests entries from the participants.tsv file and turns them into contrasts.

This is what `self.inputs.contrast_info` looks like for a 3rd-level model:
(Pdb) self.inputs.contrast_info
[{'entities': {'session': 'movie', 'task': 'avmovie'}, 'weights': [{'forrest_av_rating': 1}], 'type': 't', 'name': 'forrest_av_rating'}, {'entities': {'session': 'movie', 'task': 'avmovie'}, 'weights': [{'forrest_seen_languages': 1}], 'type': 't', 'name': 'forrest_seen_languages'}, {'entities': {'session': 'movie', 'task': 'avmovie'}, 'weights': [{'hearing_problems_current': 1}], 'type': 't', 'name': 'hearing_problems_current'}, {'entities': {'session': 'movie', 'task': 'avmovie'}, 'weights': [{'forrest_seen_dist': 1}], 'type': 't', 'name': 'forrest_seen_dist'}, {'entities': {'session': 'movie', 'task': 'avmovie'}, 'weights': [{'forrest_av_feeling': 1}], 'type': 't', 'name': 'forrest_av_feeling'}, {'entities': {'session': 'movie', 'task': 'avmovie'}, 'weights': [{'age': 1}], 'type': 't', 'name': 'age'}, {'entities': {'session': 'movie', 'task': 'avmovie'}, 'weights': [{'forrest_seen_count': 1}], 'type': 't', 'name': 'forrest_seen_count'}, {'entities': {'session': 'movie', 'task': 'avmovie'}, 'weights': [{'forrest_av_storydepth': 1}], 'type': 't', 'name': 'forrest_av_storydepth'}, {'entities': {'session': 'movie', 'task': 'avmovie'}, 'weights': [{'forrest_ad_known': 1}], 'type': 't', 'name': 'forrest_ad_known'}, {'entities': {'session': 'movie', 'task': 'avmovie'}, 'weights': [{'vision_problems_current': 1}], 'type': 't', 'name': 'vision_problems_current'}, {'entities': {'session': 'movie', 'task': 'avmovie'}, 'weights': [{'forrest_av_fatigue': 1}], 'type': 't', 'name': 'forrest_av_fatigue'}, {'entities': {'session': 'movie', 'task': 'avmovie'}, 'weights': [{'amplitude_.RIGHT': 1}], 'type': 't', 'name': 'amplitude_.RIGHT'}, {'entities': {'session': 'movie', 'task': 'avmovie'}, 'weights': [{'handedness': 1}], 'type': 't', 'name': 'handedness'}, {'entities': {'session': 'movie', 'task': 'avmovie'}, 'weights': [{'hearing_problems_past': 1}], 'type': 't', 'name': 'hearing_problems_past'}, {'entities': {'session': 'movie', 'task': 'avmovie'}, 'weights': [{'forrest_seen': 1}], 'type': 't', 'name': 'forrest_seen'}, {'entities': {'session': 'movie', 'task': 'avmovie'}, 'weights': [{'vision_problems_past': 1}], 'type': 't', 'name': 'vision_problems_past'}, {'entities': {'session': 'movie', 'task': 'avmovie'}, 'weights': [{'gender': 1}], 'type': 't', 'name': 'gender'}, {'entities': {'session': 'movie', 'task': 'avmovie'}, 'weights': [{'forrest_av_artist_count': 1}], 'type': 't', 'name': 'forrest_av_artist_count'}]

where everything apart from the entry {'entities': {'session': 'movie', 'task': 'avmovie'}, 'weights': [{'amplitude_.RIGHT': 1}], 'type': 't', 'name': 'amplitude_.RIGHT'} stems from the participants.tsv file.

The loop

    for name, weights, type in prepare_contrasts(self.inputs.contrast_info, names):

will then assign weights of zero to the first contrast_info entry (as it should), but the following

    input = (np.array(filtered_files)[weights != 0]).tolist()
    design_matrix = pd.DataFrame({'intercept': weights[weights != 0]})
    model.fit(input, design_matrix=design_matrix)

then attempts to pass an empty list as input to model.fit(), which results in the failure I observed.
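
A minimal standalone illustration (plain numpy; the file names are made up) of that last step: for a contrast whose weights are all zero, such as one derived from a participants.tsv column, the boolean mask selects nothing and model.fit() receives an empty list.

    import numpy as np

    filtered_files = ['sub-01_speech.nii.gz', 'sub-02_speech.nii.gz',
                      'sub-03_speech.nii.gz']        # hypothetical inputs
    weights = np.zeros(len(filtered_files))          # all-zero contrast weights

    input = (np.array(filtered_files)[weights != 0]).tolist()
    print(input)   # [] -- an empty list, which SecondLevelModel.fit() rejects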

I'm not sure whether the problem lies in the fact that contrast_info is given contrast information that I did not consciously specify anywhere in my model.json (and it is weird that this only happens at the third, not the second, level), or in the fact that it will attempt to give an empty list as input. As a temporary fix, I modified it like this:

        for name, weights, type in prepare_contrasts(self.inputs.contrast_info, names):
            # Need to add F-test support for intercept (more than one column)
            # Currently only taking 0th column as intercept (t-test)
            weights = weights[0]
            if all(weights == np.zeros(len(names))):
                continue
            input = (np.array(filtered_files)[weights != 0]).tolist()
            design_matrix = pd.DataFrame({'intercept': weights[weights != 0]})

@adswa
Contributor Author

adswa commented Jan 30, 2019

(I could also be entirely wrong with all of this, but it runs without failure with this added conditional statement)

@effigies
Collaborator

Sorry, I'm having a lot of trouble getting this to work. I've forgotten... how did we resolve this:

Traceback (most recent call last):
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/plugins/multiproc.py", line 69, in run_node
    result['result'] = node.run(updatehash=updatehash)
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 473, in run
    result = self._run_interface(execute=True)
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 557, in _run_interface
    return self._run_command(execute)
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 637, in _run_command
    result = self._interface.run(cwd=outdir)
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/nipype/interfaces/base/core.py", line 369, in run
    runtime = self._run_interface(runtime)
  File "/src/fitlins/fitlins/interfaces/bids.py", line 180, in _run_interface
    analysis.setup(drop_na=False, **selectors)
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/bids/analysis/analysis.py", line 89, in setup
    b.setup(input_nodes, drop_na=drop_na, **selectors)
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/bids/analysis/analysis.py", line 211, in setup
    coll = apply_transformations(coll, self.transformations)
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/bids/analysis/analysis.py", line 508, in apply_transformations
    func(collection, cols, **kwargs)
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/bids/analysis/transformations/base.py", line 87, in __new__
    return t.transform()
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/bids/analysis/transformations/base.py", line 261, in transform
    result = self._transform(data[i], **kwargs)
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/bids/analysis/transformations/munge.py", line 318, in _transform
    return var.to_dense(sampling_rate=sampling_rate)
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/bids/variables/variables.py", line 325, in to_dense
    duration = int(math.ceil(sampling_rate * self.get_duration()))
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/bids/variables/variables.py", line 311, in get_duration
    return sum([r.duration for r in self.run_info])
  File "/opt/conda/envs/neuro/lib/python3.6/site-packages/bids/variables/variables.py", line 311, in <listcomp>
    return sum([r.duration for r in self.run_info])
AttributeError: 'str' object has no attribute 'duration'

@adswa
Contributor Author

adswa commented Jan 30, 2019

@yarikoptic, do you remember? I've seen this before many times but can't recollect exactly, and my computer is a bit tied up with the model running in the background. Might that have been related to #353 (but that's merged already...)?

@tyarkoni
Collaborator

tyarkoni commented Jan 30, 2019

Have not read everything carefully, but regarding the automatic inclusion of columns available in participants.tsv, scans.tsv, etc., that's by design—otherwise we would need to add a separate set of instructions to BIDS-StatsModels that governs where/how to get variables, and that's out of its scope. The idea is that if you explicitly want to exclude variables, you can use the Select transformation, passing in the names of only the variables you want to keep. So typically you would add that as the first transformation in the list at each new level, and thereafter you can be certain that you don't have unexpected variables popping up. (I believe there may also be a Delete or Remove that does the inverse, if it's preferred.)
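
For illustration, such a Select transformation at the higher levels might look roughly like the following (sketched as Python dicts mirroring the JSON Steps above; the variable name that propagates to each level is an assumption, so treat this as a shape to adapt rather than a tested model):

    # Sketch only: restrict each higher level to the variable of interest so
    # that AutoContrasts does not pick up columns from participants.tsv etc.
    subject_step = {
        "Level": "subject",
        "Transformations": [
            {"Name": "Select", "Input": ["amplitude_.RIGHT"]}
        ],
        "AutoContrasts": True,
    }

    dataset_step = {
        "Level": "dataset",
        "Transformations": [
            {"Name": "Select", "Input": ["amplitude_.RIGHT"]}
        ],
        "AutoContrasts": True,
    }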

(FWIW, this shouldn't only be happening at the 3rd level; it happens at all levels. E.g., if you have confounds.tsv or physio.tsv.gz files, you should also be seeing extra variables show up in your run-level models, unless you're explicitly selecting what you want in a transformation. If you're not seeing available run-level variables show up, please open a separate issue, as that would be a bug.)

@adelavega
Collaborator

Doesn't X in Model also operate as a Select transformation?

@effigies
Collaborator

It looks like the issue is with variables created by the Split transform. Digging deeper now.

@yarikoptic
Contributor

As Adina asked, isn't it this fix in pybids, @effigies: bids-standard/pybids#353? If not, it may be something similar; what is the string value there?

@tyarkoni
Collaborator

@adelavega yes, but that would only make sense for the first level; past that, there's no new model being fit, just contrasts applied to estimates propagated forward. So in the subject-level model, transformations would be needed to explicitly limit what autocontrasts gets applied to.

@adswa
Contributor Author

adswa commented Jan 30, 2019

Thanks for this information!
Yes, I noticed this automatic ingestion, and iirc every column name of something ingestible (e.g. regressors.tsv files on the run level) was showing up at some point of the analysis. But I only ran into a problem at the third level. I've been wondering whether this 'only third-level' issue has something to do with the Step of the model. If I understand the contrast_info-related code correctly, it is dependent on the analysis level; the participants.tsv file is specified in the root of the dataset, and its column names are the only ones showing up (no longer those from the run-specific regressors.tsv files, for example). So if I understand it correctly, this file would only be ingested during a dataset-level analysis step, which would explain to me why it did not cause trouble on the second level?

@tyarkoni
Collaborator

tyarkoni commented Jan 30, 2019

It definitely shouldn't be happening only at the third level. As @adelavega pointed out, the X field in the first-level model will implicitly drop any unnamed variables, so that's why you may not see it happening there. But it should also be happening at subsequent levels, assuming you have a scans.tsv or sessions.tsv file containing extra columns. If you don't have anything in those files, then the behavior you see is exactly as intended.

The mapping, per the BIDS spec, is that session-level analysis automatically pulls in scans.tsv, subject-level analysis pulls in sessions.tsv, and dataset-level analysis pulls in participants.tsv.

@effigies
Collaborator

As Adina asked, isn't it this fix in pybids, @effigies: bids-standard/pybids#353? If not, it may be something similar; what is the string value there?

'events'

@adswa
Contributor Author

adswa commented Jan 30, 2019

@yarikoptic we ran into the string 'events'.

@tyarkoni
Collaborator

I'm guessing that this line may be passing variables in the wrong order, so that source (which can be 'events') is getting read instead of run_info.
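
To illustrate the kind of mix-up suggested here (a made-up stand-in class, not the real pybids signature): if a constructor expects run_info before source and a caller passes the arguments positionally in the wrong order, the string 'events' lands where the run information belongs, and summing durations then fails exactly as in the traceback above.

    class FakeSparseRunVariable:
        # Hypothetical stand-in, not the real pybids class
        def __init__(self, name, values, run_info, source):
            self.name = name
            self.values = values
            self.run_info = run_info   # expected: list of RunInfo-like objects
            self.source = source       # expected: a string such as 'events'

        def get_duration(self):
            # mirrors the failing expression: sum over run durations
            return sum(r.duration for r in self.run_info)

    # Arguments swapped: 'events' lands in run_info, the run list in source
    var = FakeSparseRunVariable('amplitude_', [1, 0, 1], 'events', [])
    try:
        var.get_duration()
    except AttributeError as err:
        print(err)   # 'str' object has no attribute 'duration'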

@effigies
Collaborator

Yeah. That's where I'm looking. It looks like #353 was a partial fix. Though I'm not sure why they can run these models and I can't...

@adswa
Contributor Author

adswa commented Jan 30, 2019

It definitely shouldn't be happening only at the third level. As @adelavega pointed out, the X field in the first-level model will implicitly drop any unnamed variables, so that's why you may not see it happening there. But it should also be happening at subsequent levels, assuming you have a scans.tsv or sessions.tsv file containing extra columns. If you don't have anything in those files, then the behavior you see is exactly as intended.

The mapping, per the BIDS spec, is that session-level analysis automatically pulls in scans.tsv, subject-level analysis pulls in sessions.tsv, and dataset-level analysis pulls in participants.tsv.

Huh. I actually don't have these files and had no clue I needed them (sorry, I must have overlooked this when studying the BIDS spec, but I happily blame Michael for not having them in the sourcedata in the first place).

@adswa
Contributor Author

adswa commented Jan 30, 2019

@effigies would it be helpful if I pushed the local branch, in which I committed, merged, and cherry-picked all the changes we found relevant, to my fork here on GitHub?

@tyarkoni
Collaborator

tyarkoni commented Jan 30, 2019 via email

@effigies
Collaborator

@AdinaWagner Sure, you can go ahead and cherry-pick. I've fixed a couple things, so we're likely to have some clashes, but it would be good to see what you have.

@adswa
Contributor Author

adswa commented Jan 30, 2019

Okay, so my current software state is AdinaWagner/pybids, branch runinfo-merge, and AdinaWagner/fitlins, branch BIDSSacc.

@mgxd
Contributor

mgxd commented Mar 15, 2019

I'm also running into the same issue with an image built off current master (ca40343)

Crashfile
File: /om/project/voice/bids/scripts/fitlins/crash-20190315-170528-mathiasg-l3_model-2f07306c-9009-48f6-8009-8da1f86d937d.pklz
Node: fitlins_wf.l3_model
Working directory: /tmp/tmp4nwpaui8/fitlins_wf/l3_model

Node inputs:

contrast_info = [[{'entities': {'session': '1', 'subject': 'voice969', 'task': 'emosent'}, 'name': 'speech', 'type': 't', 'weights': [{'speech': 1}]}]]
stat_files =
stat_metadata =

Traceback:
Traceback (most recent call last):
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/plugins/multiproc.py", line 69, in run_node
    result['result'] = node.run(updatehash=updatehash)
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 473, in run
    result = self._run_interface(execute=True)
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 1253, in _run_interface
    self.config['execution']['stop_on_first_crash'])))
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 1175, in _collate_results
    (self.name, '\n'.join(msg)))
Exception: Subnodes of node: l3_model failed:
Subnode 0 failed
Error: Traceback (most recent call last):

  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/engine/utils.py", line 99, in nodelist_runner
    result = node.run(updatehash=updatehash)

  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 473, in run
    result = self._run_interface(execute=True)

  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 557, in _run_interface
    return self._run_command(execute)

  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 637, in _run_command
    result = self._interface.run(cwd=outdir)

  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/interfaces/base/core.py", line 375, in run
    runtime = self._run_interface(runtime)

  File "/src/fitlins/fitlins/interfaces/nistats.py", line 178, in _run_interface
    model.fit(input, design_matrix=design_matrix)

  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nistats/second_level_model.py", line 170, in fit
    raise ValueError('A second level model requires a list with at'

ValueError: A second level model requires a list with atleast two first level models or niimgs

@effigies
Collaborator

@mgxd Can you provide your model? If this is somewhere I can log in, it might be easiest to dig through your working directory.

@effigies
Collaborator

@mathias It looks like you have degenerate inputs. You have two files, which you collapse at level 2 into one file. Then at level 3, you only have one file.

@effigies
Collaborator

fitlins_wf/l3_model/mapflow/_l3_model0/_report/report.rst:

Node: nistats
=============


 Hierarchy : _l3_model0
 Exec ID : _l3_model0


Original Inputs
---------------


* contrast_info : [{'entities': {'session': '1', 'subject': 'voice969', 'task': 'emosent'}, 'name': 'speech', 'type': 't', 'weights': [{'speech': 1}]}]
* stat_files : [['/working/fitlins_wf/l2_model/mapflow/_l2_model0/speech.nii.gz']]
* stat_metadata : [[{'contrast': 'speech', 'session': '1', 'subject': 'voice969', 'suffix': 'stat', 'task': 'emosent'}]]
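
In other words, only one subject-level map reaches the dataset level here, and nistats will not fit a second-level model on fewer than two inputs. A small sketch of the situation (the path is copied from the report above; the length check paraphrases the nistats behaviour rather than quoting its code):

    # The single map that feeds the dataset-level (third-level) model
    second_level_input = [
        '/working/fitlins_wf/l2_model/mapflow/_l2_model0/speech.nii.gz'
    ]

    # Paraphrase of the guard nistats applies before fitting
    if len(second_level_input) < 2:
        print('A second level model requires at least two first level '
              'models or niimgs; got %d' % len(second_level_input))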
