Modularize Discovery Protocols as Pods #198
Comments
I'll try to add some thoughts I have on my mind. I personally don't know gRPC; I've only used old-style RPC in the past, so I don't have anything against it. I'd just like to point out some benefits of REST over gRPC:
Not strictly about REST:
I'm in the opposite boat, more familiar with gRPC than REST; however, I can see how REST may be better, especially since discovery handlers would only need to support two endpoints: one for `discover` and one for `are_shared`. For testing gRPC, there is the grpcurl tool that @DazWilkin blogged about using to inspect Akri's Device Plugins, which are gRPC over UDS. For your point about brokers not always being necessary, I agree. Brokers are currently optional, and Akri can just discover devices (which could mean reaching out to an external broker that exposes the REST/gRPC interface) and expose them to the cluster as Kubernetes resources without deploying a broker to the device. Is this what you were thinking? As for push vs pull, I was wondering the same. Pulling/polling makes sense with the current flow, where applying a Configuration kicks off action in Akri. In a pull scenario, the Agent would see a Configuration and then reach out to some endpoint specified in the Configuration to discover devices. If it were a push scenario, what would the Agent do in response to seeing a Configuration? Would it be ready to receive information about devices from an endpoint specified in the Configuration?
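For concreteness, the two endpoints mentioned above could be sketched as a gRPC service like the following. This is a hypothetical proto sketch, not Akri's actual definitions; the service, RPC, and message names are illustrative, chosen to mirror the `discover` and `are_shared` methods described in the issue body.

```proto
syntax = "proto3";

package v0;

// Hypothetical discovery-handler interface; names are illustrative.
service DiscoveryHandler {
  // Returns the devices this discovery handler can currently see.
  rpc Discover (DiscoverRequest) returns (DiscoverResponse);
  // Reports whether discovered devices are shared
  // (i.e. visible to multiple nodes).
  rpc AreShared (AreSharedRequest) returns (AreSharedResponse);
}

message DiscoverRequest {
  // Protocol-specific filter details taken from the Configuration,
  // e.g. a udev rule for local video devices.
  string discovery_details = 1;
}

message Device {
  string id = 1;
  map<string, string> properties = 2;
}

message DiscoverResponse {
  repeated Device devices = 1;
}

message AreSharedRequest {}

message AreSharedResponse {
  bool shared = 1;
}
```

In a pull model the Agent would call `Discover` on a schedule; in a push model the handler would instead initiate delivery of results, which is the trade-off debated below.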
Yes exactly
Good question indeed. Does it need to do something, though? It could maybe have only one endpoint/interface for receiving discovered devices, and a new configuration would simply mean that more device types are accepted. Otherwise it could reject unknown configurations. I don't know if this is the right issue, but I have a concern on my mind with CoRE. CoRE via CoAP basically works this way:
I can register the device, and it has a generic name. This flow is reasonable, but I wonder whether it could be improved, or which use cases cannot be solved. What if I want a broker that can aggregate all temperature measurements from the devices to give a more accurate result? By allowing the discovery handler to push new devices instead of polling, maybe we can support a large number of device types. I hope this makes sense and adds another perspective to this issue.
Maybe you crossed out this line because you came to the same conclusion, but while a discovery protocol is specified in a Configuration, a Configuration is specific to a device type, which is why you also specify filters in a Configuration. For example, for a Configuration to discover local video devices, you specify the udev protocol and the filter (udev rule). That's a good point that, currently, we do discovery for each Configuration/device type, so the more device types/Configurations, the more polling the Agent has to do.
Are you wondering whether we could have one broker for multiple instances? Currently, we only have one "broker deployment strategy": one broker per instance. This is limiting, and we'd love to support other strategies, such as the one you mention, which we describe as "Instance Pooling" in our broker deployment strategies proposal.
If we use the approach of discovery handler pods pushing discovery results to the Agent, how do we prevent a denial-of-service attack, or the Agent being overloaded by too many results?
I'm familiar with both gRPC and REST. REST is conventionally JSON over HTTP but need not be; Constrained RESTful Environments (CoRE) is REST over UDP. gRPC is conventionally protobuf messages over HTTP/2 but need not be. It is true that there are more developer tools available for REST, but this is partly because REST is more common. Whether a REST-like or RPC-like approach is chosen, Akri should consider providing SDKs for the popular languages (via OpenAPI or protos) so that Akri discovery protocol developers can use a known-good SDK rather than be exposed to what should be an implementation detail. Request-response, polling, streaming, bidirectional, etc. are implementation decisions and don't affect, for example, where handlers and agents may run (both permit running handlers remotely). gRPC is probably better if:
REST is probably better if:
Kubernetes' components frequently use gRPC for service-to-service communication, but the API Server (client-to-server) is almost exclusively REST.
I think k8s is a good example of gRPC vs REST indeed. For internal communication between components, gRPC is more immediate because it's closer to the SDK/language itself. For higher-level communication between heterogeneous components, REST has more advantages. So I agree that gRPC vs REST can be decided later, after choosing the agent architecture.
I crossed it out because I couldn't think of anything better 😄 I still feel that a Configuration is more about a discovery protocol than a device type, and you can filter what is discovered. For instance, to support CoRE I need to add a new Configuration, but CoRE is not a device type; it's a discovery protocol. Actually, I don't even think there is a concept of device types in CoRE: both CoRE and CoAP talk about resource types, because a device might provide different sensor data, like both temperature and humidity. By talking about resource types instead of device types, the latter stop mattering. A camera could provide both video and audio resources, or environment temperature could be provided by different device types under the same resource type.
If you mean a malicious DoS, I believe it's out of the scope of Akri. If you allow external services to communicate in any way with your cluster, it's up to you to make it safe. Overloading could instead be solved by horizontal scaling, maybe? In the case of many results, even the pull approach would have severe issues. It might even be harder to scale horizontally, because you would have to decide how to distribute the polling operations among the agents.
I put together a spec on HackMD, pulling a lot of this discussion into a design that I am working on implementing now. Would love to hear any comments people have on HackMD.
Good job!
I've added some preliminary feedback. I need to better educate myself on the message flows between an Agent and remote discovery handlers.
Thanks for the feedback @DazWilkin and @jiayihu, it has led me to pivot a bit. It makes sense for the response to an Agent's discovery call to be streamed. Specifically, in the proto file, what was a unary response:
Would have a streamed response:
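The original proto snippets were not preserved in this thread, so the following is a hypothetical reconstruction of the change being described (the RPC and message names are illustrative). The only substantive difference is the `stream` keyword on the response:

```proto
// Hypothetical sketch of the pivot described above; names are
// illustrative, not the actual Akri proto definitions.

// Before: a unary response, returning one batch of results per call,
// which fits a polling model.
rpc Discover (DiscoverRequest) returns (DiscoverResponse);

// After: a server-side stream, so the discovery handler can keep the
// call open and push updated results to the Agent as devices appear
// and disappear.
rpc Discover (DiscoverRequest) returns (stream DiscoverResponse);
```

With server streaming, the Agent still initiates the connection (preserving the Configuration-driven flow discussed earlier), while the handler controls when new results are delivered.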
Is your feature request related to a way you would like Akri extended? Please describe.
Currently, protocol discovery is all implemented in the Akri Agent, which, as explained in the protocol extension proposal, makes the Agent bigger than needed (if someone is not using all the protocols), creates a larger attack surface, and requires discovery protocols to be written in Rust.
Akri should be able to be deployed with only the desired discovery protocols, and discovery protocols should be able to be added without changing Akri's core components (the Agent, Controller, and Configuration CRD). This would make extending Akri easier.
Describe the solution you'd like
This could be done by implementing idea 3 of the protocol extension proposal, wherein protocols are implemented as pods and exposed over a gRPC interface. The DiscoveryHandler's two methods, `are_shared` and `discover`, could be exposed over this interface. Additionally, the Configuration CRD would need to be modified so protocols are generic rather than explicitly defined in the CRD, potentially as a map, as mentioned in this thread.
Also, the Agent should still be able to be compiled with DiscoveryHandlers built in, in case users want the smaller footprint of one Agent pod rather than an Agent pod plus one pod per DiscoveryHandler.
Describe alternatives you've considered
Other ideas listed in the protocol extension proposal, such as exposing the Agent as a library.
Additional context
The Zeroconf protocol #163 increases the Agent's size considerably and would benefit from running in its own Pod.