-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor document #93
Conversation
…ta.document.construct_document
…ayer but allow add without annotations to create a new layer
This should be ready now. @ChristophAlt could you have a look? |
src/pytorch_ie/data/document.py
Outdated
def has_layer(self, name: str) -> bool: | ||
return name in self | ||
|
||
def add(self, name: str, annotation: Optional[Annotation] = None, create_layer: bool = False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this to complicated and it will certainly confuse the user. Just automatically create the layer if it's not present and then there's also no need to make an annotation optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'm good with removing the create_layer
parameter. However, there is the use case where we want to add just an empty layer. Do you think it makes more sense to provide an add_layer
method instead of having annotation
optional?
This is very interesting! I like it. Two remarks:
|
Thanks for the feedback! I addressed your first point above above. Regarding the
vs.
If |
It's not about the usage of as_{X} -- it's certainly convenient given the necessity for the casts. The problem is that the casts are necessary in the first place, which might suggest our current approach of document annotations and predictions is flawed. The same goes for the necessity of adding empty layers to control certain taskmodule behavior -- this doesn't seem the proper way to do it. This is not a critique of your implementation but highlights some missing pieces in the current implementation. |
…of new AnnotationCollection.create_layer method
@ChristophAlt I just implemented another version of this which uses explicit types: #95. I like the other one more, what do you think? |
This is discarded in favor of #95. |
This PR cleans up and simplifies the Document.
It contains the following changes:
annotations
andpredictions
as attributes ofDocument
with typeAnnotationCollection
to reuse its functionalityAnnotationCollection
as mapping from layer names toAnnotationLayer
s with some helper methods.AnnotationLayer
that derives fromList
, but with some sanity checks and typed getters (optional)construct_document
that allows to easily create a Document from a text, spans and binary_relations (both as mapping from names to iterables of the respective annotations). This was previously sitting intests.helpers.document_utils
, but seems to be useful also in other places than tests.TransformerRETextClassificationTaskModule
: usepytest.approx
to compare floats