
[xpu] support fp16 data pricision #9080

Merged: 34 commits into PaddlePaddle:develop on Jun 28, 2022

Conversation

xiuxin121 (Contributor)

No description provided.

@paddle-bot-old

Thanks for your contribution!

}
}

void XPUStaticKernelPickPass::GetScore(PrecisionType precision,
Collaborator

Should the case where the data type is LOD_TENSOR_ARRAY be considered separately?
When the type is LOD_TENSOR_ARRAY, the precision is kUnk, while the registered kernels declare concrete precisions.
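The mismatch the reviewer describes can be sketched in a minimal, self-contained form. The enum values and the `PrecisionCompatible` helper below are illustrative stand-ins, not Paddle-Lite's actual types: a LOD_TENSOR_ARRAY variable reports kUnk, so a plain equality check against a kernel's concrete precision would never match, and the pick logic would need a special case.

```cpp
#include <cassert>

// Hypothetical sketch (not Paddle-Lite's real types) of the case raised
// above: a LOD_TENSOR_ARRAY variable reports precision kUnk, while
// registered kernels declare a concrete precision, so a naive equality
// match would never accept it.
enum class PrecisionType { kUnk, kFloat, kFP16, kAny };
enum class DataType { kLoDTensor, kLoDTensorArray };

// Returns true when the variable's declared precision should be treated
// as compatible with the kernel's declared precision.
bool PrecisionCompatible(DataType var_type, PrecisionType var_precision,
                         PrecisionType kernel_precision) {
  // LOD_TENSOR_ARRAY carries kUnk, so fall back to trusting the
  // kernel's concrete declaration instead of requiring equality.
  if (var_type == DataType::kLoDTensorArray &&
      var_precision == PrecisionType::kUnk) {
    return true;
  }
  return var_precision == kernel_precision ||
         kernel_precision == PrecisionType::kAny;
}
```

As the follow-up comments note, this case is deferred here because the FP32 path would need the same handling and the relevant op has not been merged yet.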

Contributor Author

Could you give a concrete model as an example?

Contributor Author

The modified pick algorithm currently targets FP16, whereas the LOD_TENSOR_ARRAY case also needs to be handled under FP32. In addition, the op reverse, which uses the LOD_TENSOR_ARRAY data type, has not yet been merged into lite; it will be added in a follow-up, so it is not covered in this change.

size_t score_tmp = 0;
if (kernel.GetInputDeclType(tmp)->precision() == PrecisionType::kAny) {
GetScore(PrecisionType::kAny, &score_tmp);
VLOG(6) << "match input data presion:kAny";
Collaborator

Typo: "presion" should be "precision".

kernel.GetInputDeclType(tmp)->precision() ||
xpu_output_type_[in_names[i]] == PrecisionType::kAny) {
GetScore(xpu_output_type_[in_names[i]], &score_tmp);
VLOG(6) << "match input data presion";
Collaborator

Same as above: "presion" should be "precision".
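The scoring logic under review can be sketched as follows. The weight values and the exact signature are assumptions for illustration, not Paddle-Lite's real implementation; the point is that an exact FP16 match should outscore a kAny wildcard so FP16 kernels win the pick when the data really is FP16.

```cpp
#include <cassert>
#include <cstddef>

// Hedged sketch of the kernel-pick scoring discussed above; the weights
// are illustrative assumptions, not Paddle-Lite's actual values.
enum class PrecisionType { kUnk, kFloat, kFP16, kAny };

// Award a higher score for an exact FP16 match than for a kAny
// wildcard, so FP16 kernels are preferred when the data is FP16.
void GetScore(PrecisionType precision, size_t* score) {
  if (precision == PrecisionType::kFP16) {
    *score += 4;  // exact fp16 match: highest weight
  } else if (precision == PrecisionType::kAny) {
    *score += 1;  // wildcard match: lowest weight
  } else {
    *score += 2;  // other concrete precision match
  }
}
```

A kernel whose input declaration is kAny still accumulates some score (as in the snippet above), but less than one declaring the exact precision of the incoming data.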

@xiuxin121 xiuxin121 requested a review from shentanyue June 21, 2022 06:50
@@ -107,6 +107,9 @@ USE_MIR_PASS(__xpu__bigru_fuse_pass);
USE_MIR_PASS(__xpu__dynamic_lstm_fuse_pass);
USE_MIR_PASS(__xpu__multi_softmax_fuse_pass);
USE_MIR_PASS(__xpu__max_pooling_pad_zero_detect_fuse_pass);
#ifdef LITE_WITH_XPU
Collaborator

__xpu__static_kernel_pick_pass.cc is itself already guarded by this macro, so the extra macro guard should not be needed here.

Contributor Author

OK, will fix it in the next commit.

Collaborator

Suggest removing it.

Contributor Author

OK, done.

@@ -93,6 +93,11 @@ class SSAGraph : GraphBase {

std::string dump();

#ifdef LITE_WITH_XPU
void CopyScope(const Scope *scope) { scope_ = scope; }
Collaborator

Would SetScope be a better name for this function?

Contributor Author

It only copies the scope; it does not set the scope.

Collaborator

The graph can obtain the scope in a similar way; adding a new interface is not recommended. In principle, make the minimal change, for example:

for (auto& any_op_node : graph->StmtTopologicalOrder()) {
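The reviewer's suggestion, recovering the scope from an existing statement node rather than adding a new CopyScope/SetScope accessor to SSAGraph, can be sketched with simplified stand-in types. All structs below are hypothetical placeholders for Paddle-Lite's Scope, SSAGraph, and Node:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-ins for Paddle-Lite's Scope / SSAGraph / Node types,
// used only to illustrate the reviewer's suggestion.
struct Scope { std::string name; };
struct Op { const Scope* scope; };
struct Node { Op stmt; };
struct Graph {
  std::vector<Node> nodes;
  const std::vector<Node>& StmtTopologicalOrder() const { return nodes; }
};

// Grab the scope from the first statement node; all ops built from the
// same program share one scope, so no new graph interface is needed.
const Scope* ScopeFromGraph(const Graph& graph) {
  for (const auto& node : graph.StmtTopologicalOrder()) {
    return node.stmt.scope;  // first statement's scope suffices
  }
  return nullptr;  // empty graph: no scope available
}
```

This keeps the SSAGraph interface unchanged, which matches the "minimal change" principle the reviewer cites.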

Contributor Author

Done.



@@ -47,9 +50,13 @@ std::unique_ptr<RuntimeProgram> Optimizer::Run(Program&& program) {
graph.reset(new mir::SSAGraph);
graph->Build(program, valid_places_, block_idx);
graph->SetValidPlaces(valid_places_);

#ifdef LITE_WITH_XPU
Collaborator

This modification is not recommended.

@hong19860320 (Collaborator) left a comment

LGTM

@zhupengyang zhupengyang merged commit 9f8f7a6 into PaddlePaddle:develop Jun 28, 2022
newway pushed a commit to newway/Paddle-Lite that referenced this pull request Aug 23, 2022
newway added a commit that referenced this pull request Aug 29, 2022
* [XPU] fixed the bug of tile op in large input and add XPU implementation. (#9102)

* [XPU] Fixed the bug of reuse of reshape2's output in xpu_memory_optimize_pass (#9178)

* [x86][XPU] Add the support tensorarray of slice on x86 and xpu (#9134)

* [XPU] Fixed the error on stack op binding on float. (#9204)

* [XPU] Stop supporting xpu conv autotune config with paddlelite c api. (#9316)

* [XPU] Support pre-LN encoder (#9159)

* [xpu] support fp16 data pricision (#9080)

* [xpu] delete kernel.precision()==float (#9189)

* [XPU] support fp16 data pression (#9228)

* [XPU] support fc per channel quant (#9323)

Co-authored-by: wbn <[email protected]>
Co-authored-by: Jinchen Han <[email protected]>
Co-authored-by: TingShenXD <[email protected]>
Co-authored-by: quwei03 <[email protected]>
5 participants