Changes to annotation files submitted by GOA #185
Comments
+1 to the change, provided we have addressed the issues here: geneontology/go-annotation#1113 -- specifically, do we lose any annotations that are assigned to a non-GCRP when they should be assigned to a GCRP? May 9 is only a two-week lag. It may take some time for people to change their pipelines. One possibility is for the central GO publishing script (aka Mike's script) to produce both the old scheme and the new scheme to ease the transition. However, I would advocate against this. |
Checklist:
|
+10 ! |
One thought. a) I'm assuming that the "isoform file" will contain only the automated annotations on TrEMBL entries? In which case it isn't only isoforms, but also any truncated, alternative entry which isn't integrated into the UniProt record (UniProt may want to correct me here). b) Sometimes a GO annotator will make an experimental annotation to an isoform using the canonical entry, with the isoform ID in column 17 (UniProtKB:P12345-2)... I'm guessing/hoping these will still be in the main file (because they will have the canonical ID in column 2). So if the above is correct, the file naming/contents description might be confusing if people really are interested in isoform annotations. However, it's probably only a file description issue; the data split described is completely necessary. |
point a, isoforms: just realized I am not clear on what this is. "isoforms from the UniProt GCRP" sounds like a bit of a contradictory term, since unless I'm mistaken the GCRP by design avoids both true isoforms and alternate records. Perhaps this is just the current goa_$species, minus GCRP, RNA and complexes. In which case I would go with a name like "_altids" rather than "_isoforms" (sorry, can't think of a better suggestion now). We should also document a use case for this file. I agree "_isoforms" is potentially confusing if these are alternate records rather than true isoforms. b - my understanding is the same as yours. And I agree it's confusing to have the altid file called "isoforms" as it would lead people to believe they would not get isoforms from the default file. |
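For anyone wanting to check point b against a sample file, here is a minimal sketch in Python. It assumes a tab-separated GAF 2.x file with 17 columns, where column 2 is the DB Object ID and column 17 is the Gene Product Form ID; the function name `isoform_annotations` and the sample row are made up for illustration and are not part of any GOA tooling.

```python
def isoform_annotations(gaf_lines):
    """Yield (column 2, column 17) for rows whose column 17 is populated.

    Assumes GAF 2.x: tab-separated, 17 columns, comment lines start with '!'.
    """
    for line in gaf_lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\r\n").split("\t")
        # Ignore short rows and rows with an empty Gene Product Form ID.
        if len(cols) < 17 or not cols[16]:
            continue
        yield cols[1], cols[16]


if __name__ == "__main__":
    # Illustrative row only: the IDs, reference and date are placeholders.
    sample = "\t".join([
        "UniProtKB", "P12345", "SYMBOL", "", "GO:0005515", "PMID:0000000",
        "IPI", "", "F", "", "", "protein", "taxon:9606", "20160509",
        "UniProt", "", "UniProtKB:P12345-2",
    ])
    for canonical_id, form_id in isoform_annotations([sample]):
        print(canonical_id, form_id)
```

Running this over the main file would list the annotations that carry the canonical ID in column 2 but point at a specific isoform in column 17, which is the case that the "_isoforms" name could lead people to overlook.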
With regard to Tony's point 2:
I completely agree, and is the plan that this practice will be adopted consortium-wide? |
In the light of comments about timing, we've decided to postpone rolling out these changes until the GOA release scheduled for the week of 6th June; this gives people an extra four weeks to make any necessary preparations. |
I've created some sample files for human and mouse; you can find them at ftp://ftp.ebi.ac.uk/pub/contrib/goa/new-files/ |
Continued in #190 |
We (GOA) currently submit the following files to the GOC SVN repository:
GO-SVN/trunk/gene-associations/submission/
GO-SVN/trunk/gpad-gpi/submission/
where the species is one of: human, chicken, cow, dog, pig
At the recent GOC meeting in Geneva, we announced our intention to revise the set of files that we publish as part of our four-weekly release cycle in order to eliminate confusion that has arisen in the past about what the various files contain.
As of the GOA release that is scheduled for the week of 9th May, we intend to change the set of files that we publish and submit to the GOC repository to the following:
GO-SVN/trunk/gene-associations/submission/
GO-SVN/trunk/gpad-gpi/submission/
Again, the species is one of: human, chicken, cow, dog, pig
As you can see, as well as rationalising the contents of the files that we produce, we are also proposing changing their names; we're doing this for two reasons:
Comments / suggestions / objections?