Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect atom typing? #170

Closed
mikemhenry opened this issue Jun 5, 2018 · 18 comments
Closed

Incorrect atom typing? #170

mikemhenry opened this issue Jun 5, 2018 · 18 comments

Comments

@mikemhenry
Copy link
Member

You will need to remove the extra .txt on these files.
pc71bm.hoomdxml.txt
opv_ff.xml.txt

pc71bm = mb.load("pc71bm.hoomdxml")
foyer_kwargs = {"assert_angle_params": False,
                "assert_dihedral_params": False}
pc71bm.save("pcbm_box_other.hoomdxml", forcefield_files="opv_ff.xml", overwrite=True, foyer_kwargs=foyer_kwargs)

The C atoms in the rings should be CA, and the other C atoms should be typed CT.

@mikemhenry
Copy link
Member Author

mikemhenry commented Jun 5, 2018

image

With def="C[r6,r5]" it looks like only two atoms are being miss typed

@summeraz
Copy link
Contributor

summeraz commented Jun 5, 2018

I think the issue is with the ring-finding, specifically with the use of NetworkX's cycle_basis. My guess is that for multi-cyclic systems like a fullerene, a valid cycle basis could be returned that does not include all of the rings in the system.

@summeraz
Copy link
Contributor

summeraz commented Jun 5, 2018

I think cycle_basis should be replaced with something like this, but would likely increase the time of atomtyping considerably...

However, if we set a break after >8 particles were added to a cycle, then perhaps this wouldn't be too terrible

@mikemhenry
Copy link
Member Author

mikemhenry commented Jun 5, 2018

I can take a look at implementing that, but in the mean time do you have any ideas for how I can type the system as a work around?

Regarding performance:
Maybe I don't understand how the residue map works, but it would be nice if I could type a pcbm and a p3ht, then pack them into a box, and then save them. Then foryer only needs to type 2 molecules which will really improve performance.

@mattwthompson
Copy link
Member

Then foryer only needs to type 2 molecules which will really improve performance.

That's the expected behavior if the residue map is used. The only caveat is that it looks are residue names only. So if your molecules are separate residues with identical names, it should be taking advantage of the residue map.

@mikemhenry
Copy link
Member Author

How do I set the residue name? I didn't see that as a compound attribute.

@mikemhenry
Copy link
Member Author

mikemhenry commented Jun 5, 2018

Residues are assigned by checking against Compound.name.

RTFM mike.

@mattwthompson
Copy link
Member

mattwthompson commented Jun 5, 2018

Residues aren't explicitly accounted for in mb.Compound. I would recommend converting it to a parmed.Structure first (mycompound.to_parmed(residues=['myresname_a', 'myresname_b'])). foyer does that conversion internally but it not guaranteed to interpret the hierarchy the way you would want it to.

@mikemhenry
Copy link
Member Author

mikemhenry commented Jun 5, 2018

Oh so can I not do a self.name = "pcbm" or a pcbm.name = "pcbm" to set the residue?

@mattwthompson
Copy link
Member

If your hierarchy is relatively simple, that should work.

@mikemhenry
Copy link
Member Author

"""forked from networkx dfs_edges function. Assumes nodes are integers, or at least
    types which work with min() and > ."""

"""cycle as a tuple in a deterministic order."""

These doc strings in that example code you linked to @summeraz I think hint at the problem, we need a deterministic way to walk through the cycle. I haven't really done any work with networkx, but could we use the atom index?

@summeraz
Copy link
Contributor

summeraz commented Jun 6, 2018

That could work. I don't actually think it should matter what node the cycle-finding starts with. If our goal is to find all of the cycles containing ≤ 8 particles that a given particle is within, then the atomtyping can be deterministic even if networkx isn't.

@mikemhenry
Copy link
Member Author

That PR fixes all the ring stuff, but I do have another question, how can I get the carbon that is connected to the tail of carbons, the top ring, and the fullerene cage, typed as CT? I'm not sure how it is getting typed as a CA since it isn't in a 5 or 6 ring.

@summeraz
Copy link
Contributor

summeraz commented Jun 13, 2018

Hmm, with the changes from #171, when I atom-type your system that bridging carbon is properly typed as CT:
viz.pdf

@summeraz
Copy link
Contributor

In your force field file you'll need to change the SMARTS for CA to [C;r5,r6]. Then the atom-typing should proceed as expected.

@mikemhenry
Copy link
Member Author

Sweet! Thanks so much, that fixed it. So does that string mean carbon in a ring of 5 elements or in a ring of 6 elements? Did C[r5,r6] mean something like carbon (bonded?) to any element in a ring of 5 elements or a ring of 6 elements?

@summeraz
Copy link
Contributor

Did C[r5,r6]mean something like carbon (bonded?) to any element in a ring of 5 elements or a ring of 6 elements?

Yes, I'm pretty sure that's what was happening. By including the C within the square brackets the SMARTS now reads as a carbon that is within a 5 or 6 membered ring.

@mattwthompson
Copy link
Member

Just to add in case it's useful,r indicates being in a ring of a given size but R can be used to indicate membership in a number of rings.

https://github.com/mosdef-hub/foyer/blob/master/foyer/smarts.py#L29

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants