new task: pointer network for joint ner and re #10
Conversation
Codecov Report

Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##             main      #10      +/-   ##
==========================================
- Coverage   95.64%   95.56%   -0.09%
==========================================
  Files          34       40       +6
  Lines        2459     3315     +856
==========================================
+ Hits         2352     3168     +816
- Misses        107      147      +40
```

☔ View full report in Codecov by Sentry.
It looks like the training is now aligned. When we replace the weights of the original model with the ones from the simple model, we get the following:

```python
# check the numbers
assert {layer_name: len(anns) for layer_name, anns in fp.items()} == {
    "labeled_spans": 101,
    "binary_relations": 63,
}
assert {layer_name: len(anns) for layer_name, anns in fn.items()} == {
    "labeled_spans": 93,
    "binary_relations": 73,
}
assert {layer_name: len(anns) for layer_name, anns in tp.items()} == {
    "labeled_spans": 41,
    "binary_relations": 9,
}
```

which is quite close to the result from the unmodified original model:

```python
# check the numbers
assert {layer_name: len(anns) for layer_name, anns in fp.items()} == {
    "labeled_spans": 126,
    "binary_relations": 72,
}
assert {layer_name: len(anns) for layer_name, anns in fn.items()} == {
    "labeled_spans": 88,
    "binary_relations": 73,
}
assert {layer_name: len(anns) for layer_name, anns in tp.items()} == {
    "labeled_spans": 46,
    "binary_relations": 9,
}
```

The result for the simple model we took the weights from is much worse:

```python
# check the numbers
assert {layer_name: len(anns) for layer_name, anns in fp.items()} == {
    "labeled_spans": 108,
    "binary_relations": 65,
}
assert {layer_name: len(anns) for layer_name, anns in fn.items()} == {
    "labeled_spans": 106,
    "binary_relations": 78,
}
assert {layer_name: len(anns) for layer_name, anns in tp.items()} == {
    "labeled_spans": 29,
    "binary_relations": 4,
}
```
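As a quick sanity check (not part of the PR; `prf` is a hypothetical helper of ours), such tp/fp/fn counts translate into precision/recall/F1 like this:

```python
def prf(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Precision, recall and F1 from raw true/false positive/negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# labeled_spans of the weight-transplanted model above: tp=41, fp=101, fn=93
print(tuple(round(x, 3) for x in prf(41, 101, 93)))  # (0.289, 0.306, 0.297)
```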
- …d with special ids (bos, eos, pad) and number of target_token_ids; fix PointerHead.forward() for non-default eos_id / label_ids
- …e_decoder_input_ids(), and prepare_decoder_position_ids(); add pad_input_id to prepare_decoder_input_ids() instead
- …rrectly parametrize
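For context on the `prepare_decoder_input_ids()` mentioned in the commit messages above: a minimal sketch of the usual right-shift logic, in the style of transformers' `shift_tokens_right` (the signature and the `-100` handling are assumptions, not the PR's actual code):

```python
import torch

def prepare_decoder_input_ids(labels: torch.Tensor, bos_id: int, pad_input_id: int) -> torch.Tensor:
    # shift targets one position to the right; the decoder starts with bos_id
    decoder_input_ids = labels.new_full(labels.shape, pad_input_id)
    decoder_input_ids[:, 1:] = labels[:, :-1].clone()
    decoder_input_ids[:, 0] = bos_id
    # replace label padding (-100, ignored by the loss) with the pad input id
    decoder_input_ids.masked_fill_(decoder_input_ids == -100, pad_input_id)
    return decoder_input_ids

# usage: labels of shape (batch_size, target_len)
labels = torch.tensor([[11, 12, 13, -100]])
print(prepare_decoder_input_ids(labels, bos_id=0, pad_input_id=1))
# tensor([[ 0, 11, 12, 13]])
```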
code is mostly taken from https://github.com/ArneBinder/pie-document-level/pull/26

Note: This adds the requirement `transformers = "^4.32.0"` so that we can import `BartPreTrainedModel` (breaking).

Requires:

- add `TaskModule.configure_model_metric(stage)` pytorch-ie#392
- add `BartModelWithDecoderPositionIds` base model #26 (for `decoder_position_id_pattern`, i.e. replaced positional encodings (RPE))

TODO:

- [x] … (`metric_intervals` parameter) (true: 2h)
- [ ] use `pytorch-ie = ">=0.29.5,<0.30.0"` when released (we require add `TaskModule.configure_model_metric(stage)` pytorch-ie#392)
- [x] constrained training (true: 4h)
- [x] move `PointerNetworkSpanAndRelationEncoderDecoder` back again into the taskmodule and improve structure (true: 10h)
- [ ] … (`return_overflowing_tokens=True`)
- [x] check that `cmp_src_rel` sorts the relations in a useful manner (note: we encode relations as `tail-head-label`). Edit: we sort first by head, then by tail. We could try the other way, which should be more in sync with the encoding mode, but the current setup seems to work. (a comparator sketch follows after this list)
- [x] … `sanitize_sequence` … `common` and `pointer_network` modules (true: 2h)
- [x] … `sanitize_sequence()` (… `end2end_re` with the current taskmodule variant). Edit: fixed in 3411b29
- [ ] `test_pointer_head.py` (`PointerHead`)
- [ ] `test_bart_as_pointer_network` (`BartAsPointerNetwork`)
- [ ] `test_simple_generative_pointer.py`
- [ ] `test_simple_generative_pointer_predict.py`
- [x] use mock models? Edit: use sshleifer/tiny-mbart; use sshleifer/bart-tiny-random
- [x] … `layernorm_decay` (… to set "encoder only layernorm parameters"). Edit: fixed in b8da7e3
- [x] … `GenerativeModel`. Edit: see new task: text-2-text #24
- [ ] … `BartModelWithDecoderPositionIds`, see add `BartModelWithDecoderPositionIds` base model #26
- [x] ~~move `prepare_decoder_position_ids` to `BartAsPointerNetwork`~~ Edit: no, this is fine
- [x] ~~use `special_tokens_mask` in `PointerHead`~~ Edit: no, just mask eos, bos, and pad positions in offset scores (a masking sketch follows at the end of this description)
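The comparator sketch referenced in the `cmp_src_rel` item above (our illustration, assuming an encoded relation is a (tail, head, label) pointer triple as the `tail-head-label` note suggests; not the PR's actual comparator):

```python
from typing import List, Tuple

# (tail_pointer, head_pointer, label_id) -- layout assumed from "tail-head-label"
Relation = Tuple[int, int, int]

def sort_relations(relations: List[Relation]) -> List[Relation]:
    # sort first by head, then by tail, as described above; a tail-first key
    # would match the tail-head-label encoding order more closely
    return sorted(relations, key=lambda rel: (rel[1], rel[0]))

print(sort_relations([(5, 2, 1), (3, 2, 0), (1, 4, 2)]))
# -> [(3, 2, 0), (5, 2, 1), (1, 4, 2)]
```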
Follow-up:
- check `transformers.AutoModelForSeq2SeqLM` for candidates
- `LongT5ForConditionalGeneration` (pszemraj/long-t5-tglobal-base-sci-simplify)
- use `BucketSampler`, see Allowed custom `BatchSampler`s when instantiated in `*_dataloader` hook Lightning-AI/pytorch-lightning#13640 (comment)