clearer dataset requirements for mixscape #210

Closed
marcus-r-kelly opened this issue Feb 28, 2023 · 6 comments · Fixed by #222

@marcus-r-kelly

Description of feature

Hello! As a Python/scanpy user, I am happy to see that you are working on a scverse implementation of mixscape. I have attempted to follow along with this notebook: https://github.com/theislab/pertpy-tutorials/blob/main/mixscape.ipynb using a third-party dataset (this). I have been using pertpy 0.3.

I have had difficulty, however, because the names of some columns/features are occasionally required to have a certain value. For example, pt.tl.Mixscape.pert_sign apparently requires that the supplied AnnData object has a column in its .obs table of the format <gene_target>g<#> that has the same name as the nontargeting control. As far as I can tell, this is not documented.

Similarly, in pt.tl.Mixscape.lda, the value of the labels keyword argument must be 'gene_target' for the function to work properly, since that name is hard-coded elsewhere in the function.

These are just the functions that I have been able to make work. pt.pl.ms.barplot fails to work, and I cannot decipher the traceback to understand how to reconfigure my AnnData.obs table properly. I also cannot find how to make pt.pl.ms.lda work properly, since some array is getting reshaped during function runtime in ways that I cannot understand.

If this pipeline is working as designed, I would greatly appreciate a tool or guide to make sure that my AnnData.obs table has all required columns formatted properly.
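
Until such a guide exists, a pre-flight check along the following lines could flag missing columns before any Mixscape function is called. This is only a minimal sketch and not part of pertpy: the helper name `missing_obs_columns` and the example column names are assumptions made for illustration. Only `gene_target` comes from this issue, where it is reported as hard-coded.

```python
from typing import Iterable, List


def missing_obs_columns(obs_columns: Iterable[str], required: Iterable[str]) -> List[str]:
    """Return the required column names that are absent, in the order given."""
    present = set(obs_columns)
    return [name for name in required if name not in present]


# Hypothetical example; with an AnnData object one would pass list(adata.obs.columns).
obs_columns = ["perturbation", "guide_id", "n_counts"]
# "gene_target" is the name reported as hard-coded in this issue;
# "perturbation" is a placeholder, not a documented pertpy requirement.
required = ["gene_target", "perturbation"]

missing = missing_obs_columns(obs_columns, required)
if missing:
    print(f"adata.obs is missing required columns: {missing}")
```

Running a check like this first turns an opaque downstream traceback into a direct message about which column to add or rename.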

@marcus-r-kelly marcus-r-kelly added the enhancement New feature or request label Feb 28, 2023
@Zethson
Member

Zethson commented Feb 28, 2023

Dear @marcus-r-kelly,

Thank you very much for the issue!
Short answer: we'll turn these into parameters with defaults and document them properly.

Could you please share the error for pt.pl.ms.lda?

Cheers

@Zethson
Member

Zethson commented Mar 13, 2023

@marcus-r-kelly ?

@marcus-r-kelly
Author

marcus-r-kelly commented Mar 14, 2023

Sorry for the delay; this is kind of a scouting-ahead project for me and I've been swamped for the last two weeks.
Here is the traceback:

ValueError                                Traceback (most recent call last)
Cell In[111], line 1
----> 1 pt.pl.ms.lda(adata=mf_ad_good)

File ~/miniconda3/envs/pertpy/lib/python3.9/site-packages/pertpy/plot/_mixscape.py:540, in MixscapePlot.lda(adata, mixscape_class, mixscape_class_global, control, perturbation_type, lda_key, n_components, show, save, **kwds)
    535     raise ValueError(f'Did not find .uns["{lda_key!r}"]. Run `pt.tl.neighbors` first.')
    537 adata_subset = adata[
    538     (adata.obs[mixscape_class_global] == perturbation_type) | (adata.obs[mixscape_class_global] == control)
    539 ].copy()
--> 540 adata_subset.obsm[lda_key] = adata_subset.uns[lda_key]
    541 if n_components is None:
    542     n_components = adata_subset.uns[lda_key].shape[1]

File ~/miniconda3/envs/pertpy/lib/python3.9/site-packages/anndata/_core/aligned_mapping.py:151, in AlignedActualMixin.__setitem__(self, key, value)
    150 def __setitem__(self, key: str, value: V):
--> 151     value = self._validate_value(value, key)
    152     self._data[key] = value

File ~/miniconda3/envs/pertpy/lib/python3.9/site-packages/anndata/_core/aligned_mapping.py:215, in AxisArraysBase._validate_value(self, val, key)
    206 if (
    207     hasattr(val, "index")
    208     and isinstance(val.index, cabc.Collection)
    209     and not (val.index == self.dim_names).all()
    210 ):
    211     # Could probably also re-order index if it’s contained
    212     raise ValueError(
    213         f"value.index does not match parent’s axis {self.axes[0]} names"
    214     )
--> 215 return super()._validate_value(val, key)

File ~/miniconda3/envs/pertpy/lib/python3.9/site-packages/anndata/_core/aligned_mapping.py:52, in AlignedMapping._validate_value(self, val, key)
     50     if self.parent.shape[axis] != val.shape[i]:
     51         right_shape = tuple(self.parent.shape[a] for a in self.axes)
---> 52         raise ValueError(
     53             f"Value passed for key {key!r} is of incorrect shape. "
     54             f"Values of {self.attrname} must match dimensions "
     55             f"{self.axes} of parent. Value had shape {val.shape} while "
     56             f"it should have had {right_shape}."
     57         )
     58 if not self._allow_df and isinstance(val, pd.DataFrame):
     59     name = self.attrname.title().rstrip("s")

ValueError: Value passed for key 'mixscape_lda' is of incorrect shape. Values of obsm must match dimensions (0,) of parent. Value had shape (6380, 10) while it should have had (3523,).

This error is pretty puzzling to me. Here are the shapes of the relevant objects in mf_ad_good:

mf_ad_good.X
<7360x1535 sparse matrix of type '<class 'numpy.float32'>'
	with 2378731 stored elements in Compressed Sparse Column format>
mf_ad_good.layers['X_pert']
<7360x1535 sparse matrix of type '<class 'numpy.float32'>'
	with 7270151 stored elements in Compressed Sparse Column format>
mf_ad_good.obsm
AxisArrays with keys: X_pca, X_umap
mf_ad_good.obsm['X_pca'].shape
(7360, 50)
mf_ad_good.obsm['X_umap'].shape
(7360, 2)

What is supposed to have 6380 or 3523 rows? What is the scalar parent whose dimensions obsm is supposed to match?

@xinyuejohn
Collaborator

@marcus-r-kelly Thank you so much for finding these issues! Could you please share your current notebook? It would be very helpful.

@xinyuejohn xinyuejohn linked a pull request Mar 24, 2023 that will close this issue
@marcus-r-kelly
Author

Notebook attached. All the files referred to are the ones in the GEO dataset I linked in the OP.
scverse_reattempt.zip

@xinyuejohn
Collaborator

@marcus-r-kelly Thanks for your reply! I just checked, and it turns out that one argument, "control", is missing in your case. The error message is not raised by pertpy and is indeed not directly related to the actual cause. I will improve the documentation and also the function argument settings!
[Screenshot 2023-03-26 at 23 40 39]
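
For readers puzzling over the numbers in the traceback: the "parent" is `adata_subset`, which keeps only cells whose `mixscape_class_global` equals `perturbation_type` or `control` (3523 cells here), while the array stored under `.uns['mixscape_lda']` has 6380 rows. The toy sketch below uses plain Python lists instead of AnnData to show how the size of that subset depends on the `control` value. The function name `subset_size` and the labels "KO", "NP", "NT", and "non-targeting" are assumptions made for illustration, not values taken from the attached notebook.

```python
def subset_size(global_classes, perturbation_type, control):
    """Count cells kept by the filter: class equals perturbation_type or control."""
    return sum(1 for c in global_classes if c == perturbation_type or c == control)


# Made-up labels for 8 cells; the control cells are labelled "non-targeting".
global_classes = [
    "KO", "KO", "NP", "non-targeting",
    "non-targeting", "KO", "NP", "non-targeting",
]

# A control label that does not occur in the data keeps only the perturbed cells.
n_wrong = subset_size(global_classes, perturbation_type="KO", control="NT")
# Passing the label actually used in the data keeps perturbed and control cells.
n_right = subset_size(global_classes, perturbation_type="KO", control="non-targeting")

print(n_wrong, n_right)
```

Whether the row counts then line up also depends on which cells the LDA was computed on, so comparing the number of rows in `.uns['mixscape_lda']` against the size of this subset before plotting is a reasonable sanity check.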
