Support CPU - WebAssembly scenario of the op level execution use case #156
Comments
(Cross-linking @wchao1115’s comment on the WebAssembly.Memory object in #149 (comment).)
To better understand this use case, I recently experimented with implementing the conv2d op of the TF.js Wasm backend with the WebNN API. The implementation is in conv2d_impl.cc, where the WebNN calls are conditionally guarded. Based on the prototype, there are some findings:
@huningxin That is correct, the TF.js Wasm backend is synchronous; the computationally heavy ops are executed with web workers for multi-threading. And TF.js can be run in a web worker to achieve asynchronous execution.
@pyu10055, thanks for the clarification. BTW, the webnn-native code to reproduce the conv2d perf numbers of #156 (comment) is in webmachinelearning/webnn-native#10 for review. Feel free to check it out.
While doing issue gardening, I noticed this issue had been fixed by #174.
Opening this issue to follow up on the operation-specific APIs discussion from the 3/18 WebML CG call. @pyu10055 @wchao1115 @anssiko @jbingham, please take a look.
Use case
This is one scenario of the framework op-level execution use case (more details can be found in the operation-specific API proposal). A JavaScript ML framework executes ops on the CPU device with WebAssembly. For compute-intensive ops, such as conv2d or matmul, the framework also wants to use the WebNN API to execute the op (as a single-op MLGraph) with ML-specific instructions, such as Vector Neural Network Instructions (VNNI), on the same CPU device.
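To make the scenario concrete, here is a naive reference conv2d in plain JavaScript, a sketch of the computation a framework would wrap in a single-op MLGraph. This is only an illustration of the op's semantics (single channel, stride 1, no padding); the point of the use case is that a WebNN backend could run this same computation with hardware-optimized kernels (e.g. VNNI) instead.

```javascript
// Naive reference conv2d: single input channel, single filter,
// stride 1, no padding. Input and filter are row-major Float32Arrays.
// A single-op WebNN MLGraph would offload exactly this computation
// to an optimized CPU kernel.
function conv2d(input, inH, inW, filter, fH, fW) {
  const outH = inH - fH + 1;
  const outW = inW - fW + 1;
  const out = new Float32Array(outH * outW);
  for (let y = 0; y < outH; y++) {
    for (let x = 0; x < outW; x++) {
      let acc = 0;
      // Accumulate the filter window over the input patch at (y, x).
      for (let fy = 0; fy < fH; fy++) {
        for (let fx = 0; fx < fW; fx++) {
          acc += input[(y + fy) * inW + (x + fx)] * filter[fy * fW + fx];
        }
      }
      out[y * outW + x] = acc;
    }
  }
  return out;
}
```

For example, a 3x3 input convolved with a 2x2 all-ones filter produces a 2x2 output of the four window sums.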
Requirements
- WebNN should allow frameworks to create an MLContext for the CPU device. This would avoid unnecessary data copying across devices when frameworks use WebAssembly (CPU) to execute the other ops.
- WebNN should allow frameworks to control when the output data is available for access. This would avoid unnecessary tensor layout conversions between the native ML API and WebNN. For example, a user of TensorFlow.js may execute 3 conv2d ops but only access the output of the last one. A potential WebNN implementation would then only need to do the memory layout conversion and put the data into an ArrayBufferView when h.data() is invoked.
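The deferred-access requirement above can be sketched as a lazy handle pattern in plain JavaScript. All names here (makeHandle, raw, data) are hypothetical illustrations, not WebNN or TF.js API: each op returns a handle holding the backend's internal result, and the expensive layout conversion runs only when data() is actually called, so the intermediate outputs of chained ops are never converted.

```javascript
// Counts layout conversions, purely for illustration.
let conversions = 0;

// Hypothetical lazy tensor handle: computeRaw produces the backend's
// internal (e.g. blocked-layout) result on demand; data() performs the
// layout conversion to a user-visible ArrayBufferView only when called.
function makeHandle(computeRaw) {
  let raw = null;
  return {
    raw() {
      if (raw === null) raw = computeRaw(); // compute once, cache
      return raw;
    },
    data() {
      // The layout conversion (e.g. blocked -> NHWC) happens lazily here.
      conversions++;
      return Float32Array.from(this.raw());
    },
  };
}

// Chain three hypothetical ops; only the final data() call triggers
// a conversion, mirroring the "execute 3 conv2d, read only the last
// output" example above.
const f = makeHandle(() => [1, 2]);
const g = makeHandle(() => f.raw().map(v => v * 2));
const h = makeHandle(() => g.raw().map(v => v + 1));
const out = h.data(); // exactly one conversion has occurred
```

The intermediate handles f and g exchange data in the backend's internal representation via raw(), which is the behavior the requirement asks WebNN to enable.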