forked from openvinotoolkit/openvino
Low Latency transformation (openvinotoolkit#2869)
* initial draft of adding sinks to ngraph::Function
* style fixes
* code style fixes
* code style fixes
* code style fix
* review fix+build fix
* code style fix
* fix build
* API changed according to latest discussion
* review fixes
* review fixes + tests
* initial draft of adding sinks to ngraph::Function
* style fixes
* code style fixes
* code style fixes
* code style fix
* review fix+build fix
* code style fix
* fix build
* API changed according to latest discussion
* review fixes
* review fixes + tests
* added 1 more ctor
* style fixes
* used new api in ir parser
* fixed build
* update low latency transformation, fix unroll transformation, add unit tests, modify subgraph tests
* fix low latency transformation
* Update low latency transformation, unit and sub-graph tests
* update LowLatency transformation and tests
* ngraph codestyle
* fix build, update description
* resolve review remarks

Co-authored-by: Svetlana Dolinina <[email protected]>
Showing 24 changed files with 1,023 additions and 93 deletions.
@@ -0,0 +1,56 @@
// Copyright (C) 2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

/**
 * @brief This header file defines the list of public transformations.
 *
 * @file ie_transformations.hpp
 */

#pragma once

#include <ie_api.h>
#include <cpp/ie_cnn_network.h>

namespace InferenceEngine {

/**
 * @brief The transformation finds all TensorIterator layers in the network, processes all back
 * edges that describe a connection between the Result and the Parameter of the TensorIterator body,
 * inserts a ReadValue layer between the Parameter and the layers that follow it,
 * and inserts an Assign layer after the layers that precede the Result layer.
 * Supported platforms: CPU, GNA.
 *
 * The example below describes the changes to the inner part (body, back edges) of the TensorIterator layer.
 * [] - TensorIterator body
 * () - new layer
 *
 * before applying the transformation:
 * back_edge_1 -> [Parameter -> some layers ... -> Result ] -> back_edge_1
 *
 * after applying the transformation:
 * back_edge_1 -> [Parameter -> (ReadValue layer) -> some layers ... -> (Assign layer) ]
 *                                                                      \
 *                                                                       -> Result ] -> back_edge_1
 *
 * It is recommended to use this transformation together with the Reshape feature to set the sequence
 * dimension to 1, and with the UnrollTensorIterator transformation.
 * For convenience, the UnrollTensorIterator transformation is already executed unconditionally when the
 * LowLatency transformation is used with the CPU and GNA plugins, so no extra action is required there.
 * After applying both of these transformations, the resulting network can be inferred step by
 * step, and the states are stored between inferences.
 *
 * An illustrative example (not the real API):
 *
 * network->reshape(...) // Set the sequence dimension to 1, recalculating shapes. Optional, depends on the network.
 * LowLatency(network)   // Apply the LowLatency and UnrollTensorIterator transformations.
 * network->infer(...)   // Calculate new values for the states.
 *                       // All states are stored between inferences via Assign and ReadValue layers.
 * network->infer(...)   // Using the stored states, calculate new values for the states.
 *
 * @param network A network to which the LowLatency transformation is applied
 */
INFERENCE_ENGINE_API_CPP(void) LowLatency(InferenceEngine::CNNNetwork& network);
}  // namespace InferenceEngine
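To make the illustrative flow in the header comment concrete, the following is a minimal, hedged sketch of how this new public function could be driven from the classic Inference Engine C++ API. It is not part of the commit; the model path, the device name, and the commented-out reshape step are placeholder assumptions.

// Hedged usage sketch (not from this commit). Assumes the classic InferenceEngine
// C++ API; "model.xml" and the "CPU" device are placeholders.
#include <ie_core.hpp>
#include <ie_transformations.hpp>

int main() {
    InferenceEngine::Core core;
    InferenceEngine::CNNNetwork network = core.ReadNetwork("model.xml");

    // Optional: set the sequence dimension to 1 via CNNNetwork::reshape() before the
    // transformation; whether this is needed depends on the network.

    // Insert ReadValue/Assign layers on the TensorIterator back edges.
    InferenceEngine::LowLatency(network);

    InferenceEngine::ExecutableNetwork executable = core.LoadNetwork(network, "CPU");
    InferenceEngine::InferRequest request = executable.CreateInferRequest();

    request.Infer();  // step 1: state values are computed and kept
    request.Infer();  // step 2: continues from the stored states
    return 0;
}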
16 changes: 16 additions & 0 deletions
inference-engine/src/inference_engine/ie_transformations.cpp
@@ -0,0 +1,16 @@
// Copyright (C) 2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#include "ie_transformations.hpp"
#include <ngraph/pass/low_latency.hpp>
#include <ngraph/pass/manager.hpp>

using namespace InferenceEngine;

void InferenceEngine::LowLatency(InferenceEngine::CNNNetwork &network) {
    auto function = network.getFunction();
    ngraph::pass::Manager manager;
    manager.register_pass<ngraph::pass::LowLatency>();
    manager.run_passes(function);
}
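For comparison with the wrapper above, here is a hedged sketch of running the underlying nGraph passes directly on an ngraph::Function, pairing LowLatency with the UnrollTensorIterator pass that the header comment says the CPU/GNA plugins run afterwards. The helper name and the UnrollTensorIterator include path are assumptions, not taken from this commit.

// Hedged sketch (not from this commit): apply LowLatency and then UnrollTensorIterator
// with the nGraph pass manager. The UnrollTensorIterator header location is an
// assumption and may differ between OpenVINO versions.
#include <memory>

#include <ngraph/function.hpp>
#include <ngraph/pass/low_latency.hpp>
#include <ngraph/pass/manager.hpp>
#include <transformations/control_flow/unroll_tensor_iterator.hpp>  // assumed path

void ApplyLowLatencyAndUnroll(const std::shared_ptr<ngraph::Function>& function) {
    ngraph::pass::Manager manager;
    manager.register_pass<ngraph::pass::LowLatency>();            // insert ReadValue/Assign on back edges
    manager.register_pass<ngraph::pass::UnrollTensorIterator>();  // then unroll the TensorIterator body
    manager.run_passes(function);
}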
5 changes: 5 additions & 0 deletions
...e-engine/src/transformations/include/transformations/common_optimizations/low_latency.hpp
@@ -0,0 +1,5 @@
// Copyright (C) 2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#include <ngraph/pass/low_latency.hpp>