Es/lpt/lpt to ngraph fixes2 with master #2671

Merged

Changes from 250 commits

Commits (794 total)
75a6414
[LPT] Replace creation of dequantization with factory
eshoguli Aug 29, 2020
4cc1109
[ngraph][LPT] Add ScaleShift replace for dequantization operations
apertovs Aug 25, 2020
0ee5e9a
[LPT] SubtractMultiplyToMultiplyAdd refactoring
eshoguli Aug 29, 2020
6840fb8
[LPT] Code style fix
eshoguli Aug 29, 2020
e83e0d9
[LPT] Edit SubtractMultiplyToMultiplyAdd transformation for dequantiz…
eshoguli Aug 29, 2020
c25028b
[LPT] Linux compilation quick fix
eshoguli Aug 29, 2020
ff35f58
[LPT] [WIP] runtime info applying
eshoguli Aug 29, 2020
10d2a4a
[LPT] Concat transformation functional tests extending
eshoguli Aug 29, 2020
88ed077
[LPT] MultiplyToConvolution + Subtract to add fusing + improvements i…
vzinovie Aug 26, 2020
5e3b364
[LPT] linux compilation error fix
eshoguli Aug 29, 2020
09a8c7b
[LPT] compilation error
eshoguli Aug 30, 2020
178fbeb
Merge branch 'es/lpt/lpt_to_ngraph_integration' of https://github.com…
eshoguli Aug 30, 2020
db0c33c
[LPT] MultiplyToGroupConvolution fix: 5D support
eshoguli Aug 30, 2020
20bc449
[LPT] Multiply transformation extending: FQ weights support - wip
eshoguli Aug 30, 2020
0e9f6de
[LPT] FQ folding & precision selection
eshoguli Aug 30, 2020
c92a2aa
[LPT] code style fixes
eshoguli Aug 30, 2020
f4ffccf
[LPT] code style fixes
eshoguli Aug 30, 2020
d956008
[LPT] Linux compilation error fix
eshoguli Aug 30, 2020
7bfdf66
[LPT] SubtractMultiplyToMultiplyAdd: refactoring
eshoguli Aug 30, 2020
36119b5
[LPT] Tests fixes
eshoguli Aug 31, 2020
3155060
[LPT] MultiplyToGroupConvolution tests
vzinovie Aug 31, 2020
9a3b0e2
[LPT] Convert subtract with int inputs to Eltwise sub
vzinovie Aug 31, 2020
6d8b12b
[LPT] Constant folding fix for quant models
vzinovie Aug 31, 2020
715b8c4
[LPT] 1) Asymmetric quantization improvement 2) tests extending
eshoguli Aug 31, 2020
8092c93
[LPT] 2 fixes for se_resnext_50
vzinovie Sep 1, 2020
1e316cc
Merge remote-tracking branch 'upstream/master' into es/lpt/lpt_to_ngr…
eshoguli Sep 1, 2020
6bb3a8e
[LPT] Add transformation priority branch selection test
vzinovie Sep 1, 2020
0b6d600
[LPT] AddMultiplyFusion: legacy transformation quick fix
v-Golubev Sep 2, 2020
fd7d49b
Merge branch 'es/lpt/lpt_to_ngraph_integration' of https://github.com…
eshoguli Sep 2, 2020
7396092
[LPT] nGraph tests temporary disabling
eshoguli Sep 2, 2020
b561c9f
[LPT] Fix for eltwise inputs with multiple outputs
vzinovie Sep 3, 2020
fd4d3c1
[LPT] Fix for FQ fuse
vzinovie Sep 3, 2020
eebb795
[LPT] Reshape by channel, batch temporary disabled
vzinovie Sep 4, 2020
d192d11
[nGraph][LPT] MatMul fix for reading FP16 models
vzinovie Sep 4, 2020
7431fe5
[LPT] 1) Add (not after Convolution/GroupConvolution/MatMul with Cons…
eshoguli Sep 4, 2020
c31b307
Merge branch 'es/lpt/lpt_to_ngraph_integration' of https://github.com…
eshoguli Sep 4, 2020
1efc713
[LPT] DenseNet improvments: AddTransformation: Add to Subtract + tests
eshoguli Sep 5, 2020
59bcc86
[LPT] AddTransformarion refactoring
eshoguli Sep 5, 2020
1732260
[LPT] AddTransformation tests temporay disabled
eshoguli Sep 5, 2020
8c434cb
[LPT] ReshapeTransformation improvements: degradation fix
eshoguli Sep 6, 2020
550d293
[LPT] code style fix
eshoguli Sep 6, 2020
8b17bcf
[LPT] Concat tests temporary disabling
eshoguli Sep 6, 2020
ea1ebbd
[LPT] tests unification
v-Golubev Aug 28, 2020
0918751
[LPT] split & variadic split merge fix
v-Golubev Sep 4, 2020
4086f1c
[LPT] Clamp: added support for asymmetric quantization
v-Golubev Sep 3, 2020
f212ab8
[LPT] added DequantizationAttr run-time attribute
v-Golubev Sep 4, 2020
a627bcb
[LPT] debug info removal
eshoguli Sep 7, 2020
03ac1ed
[LPT] ConcatTransformation: zero point fix
eshoguli Sep 7, 2020
c07a393
Merge remote-tracking branch 'upstream/master' into es/lpt/lpt_to_ngr…
eshoguli Sep 7, 2020
95eec47
[LPT] CNNNetwork ReLU transformation quick fix
eshoguli Sep 7, 2020
7c25bef
[LPT]
v-Golubev Aug 10, 2020
2742592
[LPT]
v-Golubev Aug 13, 2020
a087194
[LPT] concat fix Ubuntu18
v-Golubev Sep 7, 2020
1607569
[LPT] Concat test fixes
eshoguli Sep 8, 2020
7b1f6b3
[LPT] Not fp32 FQ input support
vzinovie Sep 8, 2020
3b21edd
[LPT] MatMul Fix + separateInStandaloneBranch Fix
vzinovie Sep 8, 2020
b92cf5e
[LPT] Fix reference input types in mish fusion tests
vzinovie Sep 8, 2020
4836442
[LPT] Fix cpuFuncTests on CentOS building
vzinovie Sep 8, 2020
b866968
[nGraph][LPT] ScaleShift 2d, 3d nGraph conversion enabling
vzinovie Sep 9, 2020
4ee20d2
[LPT] 1) FullyConnected workaround removing 2) validate_nodes_and_inf…
eshoguli Sep 9, 2020
5037b51
[ngraph] Add check for childs for ConvertSubtract
apertovs Sep 1, 2020
0c50f11
[LPT] Squeeze/Unsqueeze tests unification
apertovs Sep 1, 2020
ee10cb9
[LPT] Squeeze/Unsqueeze change signature for getReference/getOriginal
apertovs Sep 9, 2020
3b3128c
[LPT] Mul & Add -> ScaleShift quick fix
eshoguli Sep 9, 2020
5d25c7e
[LPT] nGraph tests emporary disabling
eshoguli Sep 9, 2020
1852420
[LPT] code style fix
eshoguli Sep 9, 2020
76e6f2c
[LPT] code style fix #2
eshoguli Sep 9, 2020
07c206b
[LPT] nGraph tests temporary disabling
eshoguli Sep 9, 2020
e854a7c
[LPT] code styl fix #3
eshoguli Sep 9, 2020
e8b3207
[LPT] shared plugin tests temporary disabling
eshoguli Sep 9, 2020
a5ef448
[LPT] cleanup
eshoguli Sep 10, 2020
a46a7c8
[LPT] nGraph unit_tests tests temproary disabling
eshoguli Sep 10, 2020
a6c599a
[LPT] nGraph unit tests disabling #2
eshoguli Sep 10, 2020
165412d
[LPT] nGraph tests disabling
eshoguli Sep 10, 2020
3fbcac8
[LPT] nGraph tests temporary disabling
eshoguli Sep 10, 2020
d0e1068
[LPT] WA removing
eshoguli Sep 10, 2020
8a497a8
[LPT] CentOS compilation fix
eshoguli Sep 10, 2020
d730e4d
[LPT] KMB wa to avoid compilation error
eshoguli Sep 10, 2020
1994294
Merge remote-tracking branch 'upstream/master' into es/lpt/lpt_to_ngr…
eshoguli Sep 10, 2020
e0f635e
[LPT] functional test temporary disabling
eshoguli Sep 10, 2020
f879f31
[nGraph] code style fixes
eshoguli Sep 10, 2020
6b5208f
[LPT] ConcatTransformation: data movement operation as intermediate h…
eshoguli Sep 12, 2020
6398b8f
[LPT] FuseSubtractToFakeQuantize after VariadicSplit
eshoguli Sep 13, 2020
3958e55
[LPT] ConcatWithSplitTransformation functional test temporary disabling
eshoguli Sep 13, 2020
4d2cee9
[LPT] Clamp and ConcatWithDifferentPrecisionsOnChilds: tests fix
v-Golubev Sep 11, 2020
26f5b04
[LPT] MatMul: bert-nv-mlperf-quantized fix
v-Golubev Sep 12, 2020
8e2d7c8
[LPT] Add to convolution biases fuse fix
vzinovie Sep 11, 2020
1012b75
[LPT] GPU plugin tests fixes
v-Golubev Sep 14, 2020
9eec103
[LPT] Normalize GPU plugin tests fix
v-Golubev Sep 14, 2020
11a31f4
[LPT] test-commit
v-Golubev Sep 14, 2020
3f4c1fe
[LPT] CLDNN Plugin FP16 conversion
vzinovie Sep 14, 2020
5bc0afd
[LPT] AvgPool update precision if there is not FQ after + convolution
vzinovie Sep 9, 2020
b1102b6
[LPT] Convolution fixes
eshoguli Sep 12, 2020
07fe900
[LPT] FuseSubtractToFakequantize & FuseMultiplyToFakeQuantize improve…
eshoguli Sep 15, 2020
c4614cc
[LPT] FuseSubtractToFakeQuantize test fix
eshoguli Sep 15, 2020
ab5d479
[LPT] FuseSubtractToFakeQuantizeTransformation tests
eshoguli Sep 15, 2020
c9c4d35
[LPT] code style fix
eshoguli Sep 15, 2020
85b8779
[LPT] AvgPool child recursive extend
vzinovie Sep 14, 2020
be778b9
[LPT] AvgPool tests + fix
vzinovie Sep 15, 2020
be6a313
[LPT] compilation quick fix
eshoguli Sep 15, 2020
2982991
[LPT] Add to convolution biases fuse fix
vzinovie Sep 11, 2020
a40a011
Merge branch 'es/lpt/lpt_to_ngraph_integration' of https://github.com…
eshoguli Sep 15, 2020
9b1d968
Merge remote-tracking branch 'upstream/master' into es/lpt/lpt_to_ngr…
eshoguli Sep 15, 2020
cd1464f
[LPT] Linux issues: MatMulWithOptimizedConstantFakeQuantizeTransforma…
eshoguli Sep 16, 2020
c41b784
[LPT] Normalize GPU plugin tests fix
v-Golubev Sep 14, 2020
5e8eb39
[LPT] test-commit
v-Golubev Sep 14, 2020
5ebcbad
[LPT]
v-Golubev Sep 9, 2020
70c98df
[LPT] Tests Unification: GroupConvolution
v-Golubev Sep 11, 2020
4608510
[LPT] removed debug info
v-Golubev Sep 11, 2020
77dbbbc
[LPT] functional tests for Convolution & GroupConvolution extending
eshoguli Sep 16, 2020
f13e018
[LPT] [MatMul] Quick fix ubuntu error
apertovs Sep 16, 2020
4704a63
[LPT] MatMulTransformation quick test fix: one constant for both inte…
eshoguli Sep 16, 2020
49ba970
[nGraph] code style fix
eshoguli Sep 16, 2020
63b5607
[LPT] added output_precision to NormalizeIE
v-Golubev Sep 16, 2020
6c40457
[nGraph] NormalizeIE fix for LPT support
eshoguli Sep 17, 2020
e451cf2
[LPT] nGraph WA removal
eshoguli Sep 17, 2020
d7806ec
[LPT] fixed fillSubgraph for concat multi channels
v-Golubev Sep 16, 2020
f5b071f
[LPT] MatMul fix
eshoguli Sep 18, 2020
82be35b
Merge commit 'cb70ae064b02ec53456f1794cc24e38e3474dbd1' into move_tra…
eshoguli Sep 18, 2020
3df32a9
Merge branch 'move_transforms_to_common' into es/lpt/lpt_to_ngraph_wa…
eshoguli Sep 18, 2020
1e8618b
[nGraph] WA removal: 1) nGraph tests enabling 2) LPT extanding: not h…
eshoguli Sep 18, 2020
60d663a
[LPT] nGraph WA removal: function tests skip config rollback
eshoguli Sep 18, 2020
c3f5e92
[LPT] WA removal: precision propagation fix
eshoguli Sep 18, 2020
40713e1
[LPT] ConvertMulOrAddFinally transformation extending
eshoguli Sep 18, 2020
dbc0a4f
[nGraph] ConvolutionMultiplyFusion rollback (move from legacy to common)
eshoguli Sep 18, 2020
a5b4eaa
[nGraph] ConvertMulAddToScaleShiftOrPower: WA removal
eshoguli Sep 18, 2020
e206f1f
[nGraph] TypeRelaxed: WA removal
eshoguli Sep 18, 2020
7446e53
[nGraph] WA removal: TypeRelaxed
eshoguli Sep 19, 2020
ac97c83
[LPT] WA removal: ConcatTransformation
eshoguli Sep 19, 2020
a52f514
[nGraph] WA removal: Eltwise & ConvertMulOrAddFinally fixes to suppor…
eshoguli Sep 19, 2020
6f8d450
[nGraph] MulAddConversion fix: 2D & 3D ScaleShift are supproted
eshoguli Sep 19, 2020
c52fafb
[nGraph] VisualizeTree extending
eshoguli Sep 20, 2020
e29b505
[LPT] FakeQuantizeDequantization extending: check element wise dequan…
eshoguli Sep 20, 2020
4c600dc
[LPT] FakeQuantizeDequantization extending: SubtractMultiplyToMultipl…
eshoguli Sep 20, 2020
51ef2b2
[LPT] Convolution + test infrastructure update
eshoguli Sep 20, 2020
05a850b
Merge remote-tracking branch 'upstream/master' into es/lpt/lpt_to_ngr…
eshoguli Sep 20, 2020
0c0ed09
[LPT] GPU compilation error
eshoguli Sep 20, 2020
292b9ee
[nGraph] BatchNorm plugin tests: input tensor definition
eshoguli Sep 20, 2020
3a1e210
[LPT] LowPrecisionTransformer::isFunctionQuantized was added
eshoguli Sep 20, 2020
143673a
[nGraph] WA final cleanup
eshoguli Sep 20, 2020
70975d6
[nGraph] ScaleShiftIE quick fix
eshoguli Sep 20, 2020
9405359
[LPT] Functional tests: added test-cases "Concat with intermediate wi…
v-Golubev Sep 19, 2020
3e53922
Merge pull request #5 from eshoguli/es/lpt/lpt_to_ngraph_wa_removal_m…
eshoguli Sep 21, 2020
9e4e478
Merge remote-tracking branch 'upstream/master' into es/lpt/lpt_to_ngr…
eshoguli Sep 21, 2020
12f7338
[LPT] Transformer::isNetworkquantized fix
eshoguli Sep 21, 2020
a62255c
[LPT] SubtractMultiplyToMultiplyAdd zero Add remove: fix for ssd300 o…
vzinovie Sep 16, 2020
59922be
[LPT] MultiplyToGroupConvolution not transform on Const
vzinovie Sep 17, 2020
776ff3f
[LPT] workaround for negative scales
vzinovie Sep 17, 2020
dcdaa77
[LPT] Convert standalone dequantization Mul,Sub,Add to ScaleShift
vzinovie Sep 18, 2020
edd798d
[LPT] SubtractMultiplyToMultiplyAdd test fix
vzinovie Sep 21, 2020
4445ad2
[LPT] Clamp transformation: GPU tests fix
v-Golubev Sep 21, 2020
5595c4c
Merge remote-tracking branch 'upstream/master' into es/lpt/lpt_to_ngr…
eshoguli Sep 21, 2020
932ff7d
Merge pull request #4 from v-Golubev/vg/merge_to_integration
eshoguli Sep 22, 2020
637aeab
[LPT] Transformer tests
eshoguli Sep 22, 2020
a9d0019
Merge branch 'es/lpt/lpt_to_ngraph_integration' of https://github.com…
eshoguli Sep 22, 2020
36733ef
[LPT] FakeQuantizePrecisionSelectionTransformation was disabled for GPU
eshoguli Sep 22, 2020
271f169
Merge branch 'es/lpt/lpt_to_ngraph_integration' of https://github.com…
eshoguli Sep 22, 2020
5ea6624
[LPT] TransformerIsFunctionQuantized refactoring
eshoguli Sep 22, 2020
8d8bd5c
[nGraph] code style fix
eshoguli Sep 22, 2020
089e259
[LPT] mobilenet_v2_tf_depthwise test update
eshoguli Sep 22, 2020
4d0632c
[LPT] TMP: dequantization folding
eshoguli Sep 22, 2020
b03990b
[LPT] Elementwise transformation fix: dequantization operations const…
eshoguli Sep 22, 2020
82eaaec
[LPT] cleanup
eshoguli Sep 23, 2020
1c8c57c
[LPT] denormal values fix
eshoguli Sep 25, 2020
2463bc4
[LPT] FuseFakeQuantize test fixed + negative multiply case
vzinovie Sep 21, 2020
6e83091
[LPT] FP32 -> FP16 conversion info
vzinovie Sep 22, 2020
34bd7f3
[LPT] FQ dot interval support + swapMultiplyAdd safely division
vzinovie Sep 24, 2020
3278add
[LPT] test fix
vzinovie Sep 24, 2020
f86dc1d
[LPT] Tests for dot interval on FQ + tests for addTransformation enab…
vzinovie Sep 25, 2020
dbec4cf
[LPT] Clamp transformation fix
vzinovie Sep 25, 2020
0d93d6b
[LPT] FQ prec selection test fix
vzinovie Sep 25, 2020
1ec6102
Merge remote-tracking branch 'upstream/master' into es/lpt/lpt_to_ngr…
eshoguli Sep 26, 2020
6e2dacb
[LPT] Clamp test case
vzinovie Sep 26, 2020
5609275
[LPT] Concat division precision fix
vzinovie Sep 26, 2020
71f726f
[LPT] cleanup
eshoguli Sep 26, 2020
9823ded
[LPT] merge fix
eshoguli Sep 27, 2020
1194de1
[LPT] WIP: MatMul asymmetric quantization fix (BERT)
eshoguli Sep 27, 2020
53a672f
[LPT] MatMulWithOptimizedConstantFakeQuantizeTransformation disabled
eshoguli Sep 28, 2020
8e611e3
[LPT] GPU Plugin set config fix
vzinovie Sep 28, 2020
8a157c5
[LPT] Fix merge mistakes
vzinovie Sep 28, 2020
f4cf3a5
[LPT] Rollback device specific INT8
vzinovie Sep 28, 2020
07d1b56
Merge pull request #6 from vzinovie/es/lpt/lpt_to_ngraph_integration
dmitry-gorokhov Sep 28, 2020
f9486a2
[LPT] ReshapeFullyConnected fix: FullyConnected output fix
eshoguli Sep 28, 2020
a5b275a
[LPT] bert-base-chinese GPU fix
v-Golubev Sep 28, 2020
dc22ae5
[ngraph/LPT] Tests for fix convert_mul_or_add_finally with dequantiza…
apertovs Sep 27, 2020
3095459
[LPT] ScaleShift dim < 4 only dequantization conversion
vzinovie Sep 28, 2020
977ca2f
[LPT] MatMul transformation tests extensing
eshoguli Sep 30, 2020
7d0a070
[LPT] ReshapeFullyConnected legacy transformation: LPT test case addi…
eshoguli Sep 30, 2020
95579a3
[nGraph] VisualizeTree extending: property names displying to simplif…
eshoguli Sep 30, 2020
e0514b2
[LPT] getDequantization extending
eshoguli Sep 30, 2020
3f59f1c
[LPT] MulAddToScaleshiftOrPower: out precision fix & tests
v-Golubev Sep 29, 2020
1ae5655
[LPT] Multiply to ScaleShiftIE: Multiply transformation: remove DEQUA…
eshoguli Sep 30, 2020
00219df
[LPT] Concat test case
vzinovie Sep 29, 2020
28c87b9
[nGraph] try to fix opencv compatibility
vzinovie Sep 29, 2020
7c93a3a
[nGraph] nGraph code style fix
vzinovie Sep 30, 2020
69ec09f
[LPT] InPlace dequantization folding
vzinovie Sep 30, 2020
1c6a9c4
[LPT] Multiply constant folding test
vzinovie Sep 30, 2020
72c39ab
[LPT] Fix plugin test case for MatMulWithOptimizedConstantFakeQuantize
apertovs Sep 30, 2020
f93b5e4
[LPT] Convolution transformation: mulConst shape fix
v-Golubev Sep 30, 2020
939ca97
[LPT] INT8 Constant folding branch for elementwise ops optimization r…
eshoguli Oct 1, 2020
6ce619d
Merge remote-tracking branch 'upstream/master' into es/lpt/lpt_to_ngr…
eshoguli Oct 1, 2020
e0cc238
[LPT] eltwise for const branch fix
eshoguli Oct 1, 2020
c61e9ff
[LPT] linux fix
eshoguli Oct 1, 2020
10131b3
[LPT] Multiply test refactoring
eshoguli Oct 3, 2020
43b18f4
[LPT] Convert Fuse in Constant + tests
vzinovie Oct 2, 2020
58b276d
[LPT] function comparation: runtime info comparation rollback
eshoguli Oct 3, 2020
e0af769
[LPT] linux build fix
eshoguli Oct 4, 2020
fba49cd
[LPT] linux build fix2
eshoguli Oct 4, 2020
7ac56d0
[LPT] MatMul transformation limitation was added to be similar as CNN…
eshoguli Oct 4, 2020
d0efcc8
[LPT] Reshape transformation update: don't broadcast by batch
eshoguli Oct 4, 2020
b7af836
[LPT] MatMul transformation limitation was added to be similar as CNN…
eshoguli Oct 4, 2020
457639a
[LPT] MatMul transformation: transpose input tensors fix
eshoguli Oct 5, 2020
0c1b347
[LPT] checkElementwise for AddTransformation WA: should be moved to g…
eshoguli Oct 5, 2020
e56fcbc
Merge remote-tracking branch 'upstream/master' into es/lpt/lpt_to_ngr…
eshoguli Oct 5, 2020
b751053
[LPT] merge fix
eshoguli Oct 5, 2020
e1cdd4b
[LPT] MatMul fix & tests
eshoguli Oct 5, 2020
6435344
[LPT] AddTransformation tests
eshoguli Oct 5, 2020
c3056bb
[LPT] Interpolate transformation enabled
vzinovie Oct 5, 2020
9635131
[LPT] constant folding before LPT
eshoguli Oct 7, 2020
6e5c570
[LPT] WIP: not completed tests
eshoguli Oct 7, 2020
9debe58
Merge remote-tracking branch 'upstream/master' into es/lpt/lpt_to_ngr…
eshoguli Oct 7, 2020
3dba7bd
[LPT] GPU degradation fix
eshoguli Oct 7, 2020
a471c1f
[LPT] FuseConvert workaround
eshoguli Oct 7, 2020
540711b
[LPT] code cleanup
eshoguli Oct 7, 2020
a491844
[LPT] Interpolate GPU test quick fix
eshoguli Oct 7, 2020
66a0564
Merge remote-tracking branch 'upstream/master' into es/lpt/lpt_to_ngr…
eshoguli Oct 7, 2020
ac5e2c1
[LPT] GroupConvolution fix
eshoguli Oct 7, 2020
797f6f2
Merge remote-tracking branch 'upstream/master' into es/lpt/lpt_to_ngr…
eshoguli Oct 14, 2020
3cd084f
[LPT] Fix fusing multiply for non-dequantization layers
eshoguli Oct 14, 2020
1d93cd0
[LPT] GPU pipeline update: enableInt8 initialization place update
eshoguli Oct 13, 2020
5114f15
Merge remote-tracking branch 'upstream/master' into es/lpt/lpt_to_ngr…
eshoguli Oct 15, 2020
4636b97
[LPT] tests compilation fix
eshoguli Oct 15, 2020
985aed7
[LPT] merge fix
eshoguli Oct 15, 2020
87ddcce
[LPT] tests enabling
eshoguli Oct 15, 2020
bdf93e1
[LPT] merge issue resolving
eshoguli Oct 15, 2020
4021c8c
[LPT] LPT CNNNetwork usage macros: part #1: source code
eshoguli Oct 17, 2020
a0e4209
[LPT] LPT CNNNetwork usage macros: part #2: cmake files update and te…
eshoguli Oct 18, 2020
6eeaac2
[LPT] LPT workaround from nGraph core removing
eshoguli Oct 20, 2020
88242f1
[LPT] previous LPT version tests
eshoguli Oct 20, 2020
f524ef7
[LPT] inference_engine_lp_transformations was returned back
eshoguli Oct 20, 2020
2a3316e
[LPT] replace_node rollback
eshoguli Oct 20, 2020
e1a5cee
Merge remote-tracking branch 'upstream/master' into es/lpt/lpt_to_ngr…
eshoguli Oct 20, 2020
5126095
[LPT] ConvertSubtract fix
eshoguli Oct 20, 2020
6c52ee7
[LPT] GPU: baselineIsFP16 reuse fix
eshoguli Oct 21, 2020
3c1b711
[LPT] FakeQuantizeTransformation: GPU workaround: I32 -> FP32 Convert…
eshoguli Oct 13, 2020
840937e
[LPT] AvgPool output precision workaround
eshoguli Oct 13, 2020
00aef0d
[LPT] Group convolution precision + Subtract to ScaleShift const fix
vzinovie Oct 15, 2020
a29524d
[LPT] SubMulToMulAdd & Transpose: action-recognition-0001 fix
v-Golubev Oct 14, 2020
2a4ca0f
[LPT] Transpose: added test with per-tensor quantization
v-Golubev Oct 14, 2020
0a57b9d
Merge pull request #15 from eshoguli/es/lpt/lpt_to_ngraph_fixes2_with…
eshoguli Oct 21, 2020
7 changes: 7 additions & 0 deletions inference-engine/src/cldnn_engine/cldnn_config.cpp
@@ -181,6 +181,13 @@ void Config::UpdateFromMap(const std::map<std::string, std::string>& configMap)
} else {
THROW_IE_EXCEPTION << NOT_FOUND_str << "Unsupported property value by plugin: " << val;
}
} else if (key.compare(PluginConfigInternalParams::KEY_LP_TRANSFORMS_VERSION) == 0) {
if (val == PluginConfigInternalParams::LP_TRANSFORMS_CNNNETWORK)
lptVersion = LptVersion::cnnNetwork;
else if (val == PluginConfigInternalParams::LP_TRANSFORMS_NGRAPH)
lptVersion = LptVersion::nGraph;
else
THROW_IE_EXCEPTION << "Wrong value for property key " << PluginConfigInternalParams::KEY_LP_TRANSFORMS_MODE;
} else if (key.compare(CLDNNConfigParams::KEY_CLDNN_NV12_TWO_INPUTS) == 0) {
if (val.compare(PluginConfigParams::YES) == 0) {
nv12_two_inputs = true;
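
One detail worth flagging in the hunk above: the branch matches KEY_LP_TRANSFORMS_VERSION, but the error message reports KEY_LP_TRANSFORMS_MODE, which looks like a copy-paste slip. A corrected sketch of the same branch (all names taken from the diff):

} else if (key.compare(PluginConfigInternalParams::KEY_LP_TRANSFORMS_VERSION) == 0) {
    if (val == PluginConfigInternalParams::LP_TRANSFORMS_CNNNETWORK)
        lptVersion = LptVersion::cnnNetwork;
    else if (val == PluginConfigInternalParams::LP_TRANSFORMS_NGRAPH)
        lptVersion = LptVersion::nGraph;
    else
        // Report the key that was actually matched, not KEY_LP_TRANSFORMS_MODE.
        THROW_IE_EXCEPTION << "Wrong value for property key "
                           << PluginConfigInternalParams::KEY_LP_TRANSFORMS_VERSION;
}
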
6 changes: 6 additions & 0 deletions inference-engine/src/cldnn_engine/cldnn_config.h
@@ -19,6 +19,11 @@
namespace CLDNNPlugin {

struct Config {
enum LptVersion {
cnnNetwork,
nGraph
};

Config() : throughput_streams(1),
useProfiling(false),
dumpCustomKernels(false),
@@ -49,6 +54,7 @@ struct Config {
bool memory_pool_on;
bool enableDynamicBatch;
bool enableInt8;
LptVersion lptVersion = LptVersion::nGraph;
bool nv12_two_inputs;
bool enable_fp16_for_quantized_models;
cldnn::priority_mode_types queuePriority;
116 changes: 94 additions & 22 deletions inference-engine/src/cldnn_engine/cldnn_engine.cpp
@@ -33,7 +33,9 @@
#include <transformations/opset_conversions/convert_opset2_to_opset1.hpp>
#include <transformations/opset_conversions/convert_opset3_to_opset2.hpp>
#include <transformations/init_node_info.hpp>
#include <transformations/convert_precision.hpp>
#include <transformations/rt_info/fused_names_attribute.hpp>

#include <legacy/convert_function_to_cnn_network.hpp>
#include <legacy/ie_util_internal.hpp>
#include <legacy/graph_transformer.h>
@@ -42,6 +44,9 @@
#include "cldnn_executable_network.h"
#include "cldnn_custom_layer.h"

#include <transformations/low_precision/transformer.hpp>
#include <transformations/low_precision/mat_mul.hpp>

#ifdef __linux__
#include <dlfcn.h>
#endif
@@ -72,8 +77,10 @@ cldnn::device_info clDNNEngine::GetDeviceInfo(const std::map<std::string, std::s
return device_info;
}

InferenceEngine::ICNNNetwork::Ptr clDNNEngine::CloneAndTransformNetwork(const InferenceEngine::ICNNNetwork& network) const {
InferenceEngine::ICNNNetwork::Ptr clDNNEngine::CloneAndTransformNetwork(const InferenceEngine::ICNNNetwork& network, CLDNNPlugin::Config config) const {
Contributor comment:
Use const ref for the config.
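
A const-ref signature along the lines the reviewer suggests (a sketch, not part of the PR):

InferenceEngine::ICNNNetwork::Ptr CloneAndTransformNetwork(
    const InferenceEngine::ICNNNetwork& network,
    const CLDNNPlugin::Config& config) const;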

std::shared_ptr<ICNNNetwork> clonedNetwork = cloneNetwork(network);
bool baselineIsFP16 = false;

if (clonedNetwork->getFunction()) {
const auto transformations_callback = [](const std::shared_ptr<const ::ngraph::Node> &node) -> bool {
// Reshape->Permute->Reshape pattern in theory can change output rank, so this check is added to be sure
@@ -112,6 +119,12 @@ InferenceEngine::ICNNNetwork::Ptr clDNNEngine::CloneAndTransformNetwork(const In
return can_use_reduce;
}

if (auto add_op = std::dynamic_pointer_cast<const ngraph::opset1::Add>(node)) {
Contributor comment:
What is the purpose of this skip option? Should it prevent bias merge into conversion mode, or general linear-operation merge?
It looks like this skip option may have unexpected effects on other passes.

Contributor comment:
And what if Conv -(in:1)-> Add, i.e. the Convolution feeds the second input of the Add?

return ngraph::is_type<ngraph::opset1::Convolution>(add_op->get_input_node_shared_ptr(0)) ||
ngraph::is_type<ngraph::opset1::GroupConvolution>(add_op->get_input_node_shared_ptr(0)) ||
ngraph::is_type<ngraph::opset1::MatMul>(add_op->get_input_node_shared_ptr(0));
}
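
Per the comments above, the callback inspects only input 0 of the Add, so a pattern like Conv -(in:1)-> Add would not be matched. A hypothetical variant covering any input (a sketch, not part of the PR; it reuses only types and calls already present in the diff):

if (auto add_op = std::dynamic_pointer_cast<const ngraph::opset1::Add>(node)) {
    // Treat the Add as fusable if any of its inputs comes from
    // Convolution/GroupConvolution/MatMul, not only input 0.
    for (size_t i = 0; i < add_op->get_input_size(); ++i) {
        const auto parent = add_op->get_input_node_shared_ptr(i);
        if (ngraph::is_type<ngraph::opset1::Convolution>(parent) ||
            ngraph::is_type<ngraph::opset1::GroupConvolution>(parent) ||
            ngraph::is_type<ngraph::opset1::MatMul>(parent)) {
            return true;
        }
    }
    return false;
}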

return std::dynamic_pointer_cast<const ::ngraph::opset2::Gelu>(node) ||
std::dynamic_pointer_cast<const ::ngraph::opset3::ShuffleChannels>(node) ||
std::dynamic_pointer_cast<const ::ngraph::opset2::BatchToSpace>(node) ||
@@ -126,24 +139,75 @@ InferenceEngine::ICNNNetwork::Ptr clDNNEngine::CloneAndTransformNetwork(const In
// Disable shape inference (WA for generic operations)
::ngraph::op::GenericIE::DisableReshape noReshape(nGraphFunc);

// Note: instead of running all Conversion Transformations you can make up your own transformation pipeline
ngraph::pass::Manager manager;
manager.register_pass<ngraph::pass::InitNodeInfo>();
// WA: ConvertPriorBox must be executed before the 1st ConstantFolding pass
manager.register_pass<ngraph::pass::ConvertPriorBox>();
manager.register_pass<ngraph::pass::CommonOptimizations>();
manager.register_pass<ngraph::pass::ConvertOpSet3ToOpSet2>();
manager.register_pass<ngraph::pass::ConvertOpSet2ToOpSet1>();
manager.register_pass<ngraph::pass::ConvertOpSet1ToLegacy>();

manager.set_callback(transformations_callback);
manager.run_passes(nGraphFunc);

ngraph::pass::Manager ti_manager;
// Unroll will be called after all conversions
// temporarily switch back to plugin unroller from NGraph unroller until TI output names are corrected
// ti_manager.register_pass<ngraph::pass::UnrollTensorIterator>();
ti_manager.run_passes(nGraphFunc);
bool enableInt8;

{
// Note: instead of running all Conversion Transformations you can make up your own transformation pipeline
ngraph::pass::Manager manager;
manager.register_pass<ngraph::pass::InitNodeInfo>();
// WA: ConvertPriorBox must be executed before the 1st ConstantFolding pass
manager.register_pass<ngraph::pass::ConvertPriorBox>();
manager.register_pass<ngraph::pass::CommonOptimizations>();
manager.register_pass<ngraph::pass::ConvertOpSet3ToOpSet2>();
manager.register_pass<ngraph::pass::ConvertOpSet2ToOpSet1>();


manager.set_callback(transformations_callback);
manager.run_passes(nGraphFunc);

const auto fp16_callback = [&baselineIsFP16](const std::shared_ptr<const ::ngraph::Node> &node) -> bool {
if (!baselineIsFP16 && node->get_output_element_type(0) == ngraph::element::f16) {
baselineIsFP16 = true;
}

return true;
};

ngraph::pass::Manager conversion_manager;

enableInt8 = config.enableInt8 &&
Contributor comment:
enableInt8 has the same name as the conf field but a different meaning. It slightly confuses readers.

(config.lptVersion == Config::LptVersion::nGraph) &&
ngraph::pass::low_precision::LowPrecisionTransformer::isFunctionQuantized(nGraphFunc);
if (enableInt8) {
// [WA part1] Convert quantized FP16 model to FP32 to avoid possible overflow and mixed precision errors
conversion_manager.register_pass<ngraph::pass::ConvertPrecision>(ngraph::element::f16, ngraph::element::f32);
}

conversion_manager.set_callback(fp16_callback);
conversion_manager.run_passes(nGraphFunc);

ngraph::pass::Manager ti_manager;
// Unroll will be called after all conversions
// temporarily switch back to plugin unroller from NGraph unroller until TI output names are corrected
// ti_manager.register_pass<ngraph::pass::UnrollTensorIterator>();
ti_manager.run_passes(nGraphFunc);
}

using namespace ngraph::pass::low_precision;
if (enableInt8) {
auto params = LayerTransformation::Params(
true, // updatePrecisions
LayerTransformation::QuantizedTensorAlignment::UpdateLevel, // quantizedTensorAlignmentOnActivations
LayerTransformation::QuantizedTensorAlignment::None, // quantizedTensorAlignmentOnWeights
true); // supportAsymmetricQuantization
LowPrecisionTransformer transformer(LowPrecisionTransformer::getAllTransformations(params)
Contributor comment:
@eshoguli, @slyalin, @GlebKazantaev why does the API for low-precision transformations differ completely from the common transformations?
We use common transforms as follows:

manager.register_pass<ngraph::pass::ConvertOpSet2ToOpSet1>();
auto pass_config = manager.get_pass_config();
pass_config->disable<ngraph::pass::ConvertGELU>();
pass_config->set_callback<ngraph::pass::ConvertReduceMaxToPooling>(
                [](const_node_ptr &node) -> bool {
                    return disableReduceDecomposition<ngraph::opset1::ReduceMax>(node);
                });
manager.run_passes(nGraphFunc);
// etc

So I'd expect LPT to be used as follows:

manager.register_pass<ngraph::pass::LowPrecisionTransformations>();
pass_config.set_callback<ngraph::pass::MatMulTransformation>(/* callback that checks asymmetric quantization */);
// or
pass_config.disable<ngraph::pass::MatMulTransformationAsymmetric>();
manager.run_passes(nGraphFunc);

Is it possible to align LPT with the other transforms?

.add<MatMulTransformation, ngraph::opset1::MatMul>(LayerTransformation::Params(params).setSupportAsymmetricQuantization(false)));

transformer.transform(nGraphFunc);
}

{
ngraph::pass::Manager manager = ngraph::pass::Manager();
manager.register_pass<ngraph::pass::ConvertOpSet1ToLegacy>();
manager.set_callback(transformations_callback);
manager.run_passes(nGraphFunc);

ngraph::pass::Manager ti_manager;
// Unroll will be called after all conversions
// temporarily switch back to plugin unroller from NGraph unroller until TI output names are corrected
// ti_manager.register_pass<ngraph::pass::UnrollTensorIterator>();
ti_manager.run_passes(nGraphFunc);
Contributor comment:
Looks like a call with an empty transformation list, same as on line 183.

Can it be removed?

}

clonedNetwork = InferenceEngine::details::convertFunctionToICNNNetwork(nGraphFunc, *clonedNetwork);
}
@@ -155,6 +219,14 @@ InferenceEngine::ICNNNetwork::Ptr clDNNEngine::CloneAndTransformNetwork(const In
transformator.fullTrim();
}

if (baselineIsFP16) {
InputsDataMap inputsMap;
clonedNetwork->getInputsInfo(inputsMap);

auto input0 = getInputTo(inputsMap.begin()->second->getInputData());
Contributor comment:
This should be marked as a WA with some description; the meaning of this parameter is not fully clear to me.

input0.begin()->second->params["FP16"];
}

return clonedNetwork;
}
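
The baselineIsFP16 handling above is the writer half of a workaround whose reader half lives in cldnn_program.cpp (below): the engine tags the network as originally FP16 so that non-quantized layers can later be converted back. A hardened sketch that also answers the reviewers' "mark it as a WA" and empty-map concerns (hypothetical, reusing only names from the diff):

if (baselineIsFP16) {
    // [WA] Mark the first input consumer with an "FP16" flag so Program can
    // detect that the source model was FP16. Guard against empty maps.
    InputsDataMap inputsMap;
    clonedNetwork->getInputsInfo(inputsMap);
    if (!inputsMap.empty()) {
        auto consumers = getInputTo(inputsMap.begin()->second->getInputData());
        if (!consumers.empty()) {
            consumers.begin()->second->params["FP16"] = "";
        }
    }
}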

@@ -257,7 +329,7 @@ ExecutableNetworkInternal::Ptr clDNNEngine::LoadExeNetworkImpl(const InferenceEn

context = m_defaultContext;

return std::make_shared<CLDNNExecNetwork>(*CloneAndTransformNetwork(network), context, conf);
return std::make_shared<CLDNNExecNetwork>(*CloneAndTransformNetwork(network, conf), context, conf);
}

ExecutableNetworkInternal::Ptr clDNNEngine::LoadExeNetworkImpl(const InferenceEngine::ICNNNetwork &network,
@@ -281,7 +353,7 @@ ExecutableNetworkInternal::Ptr clDNNEngine::LoadExeNetworkImpl(const InferenceEn
conf.max_dynamic_batch = static_cast<int>(network.getBatchSize());
}

return std::make_shared<CLDNNExecNetwork>(*CloneAndTransformNetwork(network), casted, conf);
return std::make_shared<CLDNNExecNetwork>(*CloneAndTransformNetwork(network, conf), casted, conf);
}

RemoteContext::Ptr clDNNEngine::CreateContext(const ParamMap& params) {
@@ -324,7 +396,7 @@ QueryNetworkResult clDNNEngine::QueryNetwork(const ICNNNetwork& network,
for (auto&& node : function->get_ops()) {
originalOps.emplace(node->get_friendly_name());
}
auto clonedNetwork = CloneAndTransformNetwork(network);
auto clonedNetwork = CloneAndTransformNetwork(network, _impl->m_config);
std::unordered_set<std::string> supported;
std::unordered_set<std::string> unsupported;

3 changes: 2 additions & 1 deletion inference-engine/src/cldnn_engine/cldnn_engine.h
Original file line number Diff line number Diff line change
@@ -27,7 +27,8 @@ class clDNNEngine : public InferenceEngine::InferencePluginInternal,
CLDNNRemoteCLContext::Ptr m_defaultContext;

cldnn::device_info GetDeviceInfo(const std::map<std::string, std::string> &config) const;
InferenceEngine::ICNNNetwork::Ptr CloneAndTransformNetwork(const InferenceEngine::ICNNNetwork& network) const;
InferenceEngine::ICNNNetwork::Ptr CloneAndTransformNetwork(const InferenceEngine::ICNNNetwork& network,
CLDNNPlugin::Config config) const;
public:
clDNNEngine();

59 changes: 35 additions & 24 deletions inference-engine/src/cldnn_engine/cldnn_program.cpp
@@ -397,7 +397,39 @@ Program::Program(InferenceEngine::ICNNNetwork& network, std::shared_ptr<const cl
, p_currentOutputs({}) {
InitFormat(network);

InputsDataMap inputsMap;
network.getInputsInfo(inputsMap);

auto input0 = getInputTo(inputsMap.begin()->second->getInputData());

bool baselineIsFP16 = false;
if (input0.begin()->second->params.count("FP16") != 0) {
Contributor comment:
If getInputTo returns an empty map, this code will crash with a segmentation fault.

baselineIsFP16 = true;
}
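
The reader half of the same workaround, hardened per the segmentation-fault comment above (a sketch, not part of the PR):

bool baselineIsFP16 = false;
InputsDataMap inputsMap;
network.getInputsInfo(inputsMap);
if (!inputsMap.empty()) {
    const auto input0 = getInputTo(inputsMap.begin()->second->getInputData());
    // Dereference the consumer map only if it is non-empty.
    if (!input0.empty() && input0.begin()->second->params.count("FP16") != 0) {
        baselineIsFP16 = true;
    }
}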

bool fqFound = false;
bool allFQareSupported = true;
if (config.enableInt8) {
auto it = details::CNNNetworkIterator(&network);
auto end = details::CNNNetworkIterator();
while (it != end) {
auto& layer = *it;
if (layer->precision == Precision::FP16) {
baselineIsFP16 = true;
}

if (CaselessEq<std::string>()(layer->type, "FakeQuantize")) {
fqFound = true;
auto levels = layer->GetParamAsUInt("levels");
if (levels != 255 && levels != 256) {
allFQareSupported = false;
}
}
it++;
}
}

if (config.enableInt8 && (config.lptVersion == Config::LptVersion::cnnNetwork)) {
auto params = LayerTransformation::Params(true, // updatePrecisions
true, // quantizeOutputs
true, // weightsToConst
@@ -413,38 +445,17 @@ Program::Program(InferenceEngine::ICNNNetwork& network, std::shared_ptr<const cl
.add<FullyConnectedTransformation>(LayerTransformation::Params(params).setSupportAsymmetricQuantization(false), "FullyConnected")
.add<GemmTransformation>(LayerTransformation::Params(params).setSupportAsymmetricQuantization(false), "GEMM");

bool fqFound = false;
bool allFQareSupported = true;
bool baselineIsFP16 = false;
{
auto it = details::CNNNetworkIterator(&network);
auto end = details::CNNNetworkIterator();
while (it != end) {
auto& layer = *it;
if (layer->precision == Precision::FP16) {
baselineIsFP16 = true;
}

if (CaselessEq<std::string>()(layer->type, "FakeQuantize")) {
fqFound = true;
auto levels = layer->GetParamAsUInt("levels");
if (levels != 255 && levels != 256) {
allFQareSupported = false;
}
}
it++;
}
}

// [WA part1] Convert quantized FP16 model to FP32 to avoid possible overflow and mixed precision errors
if (fqFound && allFQareSupported) {
NetPass::ConvertPrecision(network, Precision::FP16, Precision::FP32);
}

LowPrecisionTransformer transformer(transforms);
transformer.transform(network);
}

// [WA part2] Try to find non-quantized layers and convert them back to FP16
if (config.enableInt8) {
Contributor comment:
If I understand correctly, you run this FP16 fallback only for the USE_CNNNETWORK_LPT path, which will lead to regressions on GPU with the new LPT.

if (fqFound && baselineIsFP16 && config.enable_fp16_for_quantized_models) {
auto layersSorted = BFSSort(network);

@@ -22,13 +22,17 @@ class INFERENCE_ENGINE_API_CLASS(Eltwise) : public Op {

Eltwise(const Output<Node>& data1,
const Output<Node>& data2,
const ELTWISE_TYPE eltwise_type);
const ELTWISE_TYPE eltwise_type,
const element::Type output_type = element::undefined);
Contributor comment:
Is it valid to change the signature of legacy operations? As far as I can see, this change integrates data-type cast semantics into Eltwise (and many others).


void validate_and_infer_types() override;

std::shared_ptr<Node> clone_with_new_inputs(const OutputVector& new_args) const override;

ELTWISE_TYPE eltwise_type;

private:
element::Type m_output_type;
};

} // namespace op
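
The new optional output_type argument lets a caller pin the element type of the Eltwise output, with element::undefined presumably preserving the previous type-inference behavior. A hypothetical usage sketch (ELTWISE_TYPE::Sum is assumed from the legacy enum):

auto a = std::make_shared<ngraph::opset1::Parameter>(ngraph::element::i32, ngraph::Shape{1, 3});
auto b = std::make_shared<ngraph::opset1::Parameter>(ngraph::element::i32, ngraph::Shape{1, 3});
// Force an FP32 output even though both inputs are I32.
auto sum = std::make_shared<ngraph::op::Eltwise>(a, b, ELTWISE_TYPE::Sum, ngraph::element::f32);
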
@@ -29,17 +29,21 @@ class INFERENCE_ENGINE_API_CLASS(FullyConnected) : public Op {
FullyConnected(const Output<Node> & A,
const Output<Node> & B,
const Output<Node> & C,
const Shape & output_shape);
const Shape & output_shape,
const element::Type output_type = element::undefined);

void validate_and_infer_types() override;

std::shared_ptr<Node> clone_with_new_inputs(const OutputVector& new_args) const override;

size_t get_out_size() { return m_output_size; }
size_t get_out_size() const { return m_output_size; }

element::Type get_output_type() const { return m_output_type; }
Contributor comment:
You don't need this method, as you can get the output type directly from the node output.

Contributor comment:
Especially since no other operation has such a method.


private:
size_t m_output_size = 0;
Shape m_output_shape = {};
element::Type m_output_type;
};

} // namespace op
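
The alternative the reviewers point to is the standard nGraph accessor, which makes the bespoke getter redundant (fc below is a hypothetical node pointer):

const ngraph::element::Type out_type = fc->get_output_element_type(0);
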
@@ -25,7 +25,8 @@ class INFERENCE_ENGINE_API_CLASS(NormalizeIE) : public Op {
const Output<Node>& weights,
float eps,
bool across_spatial,
bool channel_shared);
bool channel_shared,
const ngraph::element::Type output_type);
Contributor comment:
If we introduce output_type even for legacy ops, we need the same default value and the same logic for setting the output type.


float get_eps() const { return m_eps; }
bool get_channel_shared() const { return m_channel_shared;}
Expand All @@ -39,6 +40,7 @@ class INFERENCE_ENGINE_API_CLASS(NormalizeIE) : public Op {
float m_eps;
bool m_across_spatial;
bool m_channel_shared;
ngraph::element::Type m_output_type;
};

} // namespace op
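
As the comment above notes, NormalizeIE is the only op in this set whose output_type has no default. Aligning it with the others would be a one-line change (a sketch; parameter names before output_type are partly assumed):

NormalizeIE(const Output<Node>& data,
            const Output<Node>& weights,
            float eps,
            bool across_spatial,
            bool channel_shared,
            const ngraph::element::Type output_type = ngraph::element::undefined);
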
@@ -19,13 +19,16 @@ class INFERENCE_ENGINE_API_CLASS(PowerIE) : public Op {
const NodeTypeInfo& get_type_info() const override { return type_info; }

PowerIE(const Output<Node>& data_batch,
const float power, const float scale, const float shift);
const float power, const float scale, const float shift, const element::Type output_type = element::undefined);

void validate_and_infer_types() override;

std::shared_ptr<Node> clone_with_new_inputs(const OutputVector& new_args) const override;

float scale, power, shift;

private:
element::Type m_output_type;
};

} // namespace op
Expand Down