Has anyone seen a flow where the user encrypts the prompt with FHE, the model processes it in encrypted form, and the output is also returned in encrypted form?
I think such a flow would be ideal for future models.
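To make the flow concrete, here is a minimal sketch using a toy Paillier scheme, which is only additively homomorphic, not full FHE (real FHE schemes like CKKS or TFHE also support multiplication on ciphertexts). All names and parameters below are illustrative assumptions; the point is just the shape of the flow: the client encrypts, the server computes on ciphertexts without ever seeing plaintext, and the client decrypts the result.

```python
import math
import random

# Toy Paillier scheme: additively homomorphic only, NOT full FHE.
# Demo-sized primes; real deployments use ~2048-bit keys.
P, Q = 10007, 10009
N, NSQ = P * Q, (P * Q) ** 2
LAM = math.lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)  # modular inverse (Python 3.8+)

def encrypt(m):
    """Enc(m) = (N+1)^m * r^N mod N^2 with random r."""
    r = random.randrange(1, N)
    return pow(N + 1, m, NSQ) * pow(r, N, NSQ) % NSQ

def decrypt(c):
    """Dec(c) = L(c^lam mod N^2) * mu mod N, where L(x) = (x-1)//N."""
    return (pow(c, LAM, NSQ) - 1) // N * MU % N

# --- client side: encrypt the "prompt" (here, just two numbers) ---
c1, c2 = encrypt(42), encrypt(58)

# --- server side: compute on ciphertexts; never sees 42 or 58 ---
c_sum = c1 * c2 % NSQ  # Enc(m1) * Enc(m2) = Enc(m1 + m2)

# --- client side: decrypt the encrypted result ---
print(decrypt(c_sum))  # 100
```

A real encrypted-inference pipeline would need a scheme supporting multiplications (for matmuls) and approximations of nonlinearities, which is exactly where the heavy compute cost comes from.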
@hello1 you would have to fine-tune the model with encrypted data to facilitate such things. I have seen folks experiment with this in AWS SageMaker, but there does not appear to be wide adoption yet.
-Coby
Thanks Coby, I’ve already started writing some example code. I will use a GPT-2 model as an example to see how it works.
Andrej Karpathy’s GPT-2 C code is a good starting point for this.
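As a toy illustration of what "processing in encrypted form" could mean for a GPT-2-style model, the sketch below evaluates a linear layer's dot product (the dominant op in a transformer) over encrypted activations, using an assumed additively homomorphic Paillier scheme with plaintext integer weights. This is a simplification, not a real FHE inference path: only non-negative quantized integers are handled, and nonlinearities are out of scope.

```python
import math
import random

# Toy Paillier scheme (additively homomorphic, NOT full FHE) -- for
# illustrating an encrypted dot product only.
P, Q = 10007, 10009
N, NSQ = P * Q, (P * Q) ** 2
LAM = math.lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)

def encrypt(m):
    r = random.randrange(1, N)
    return pow(N + 1, m, NSQ) * pow(r, N, NSQ) % NSQ

def decrypt(c):
    return (pow(c, LAM, NSQ) - 1) // N * MU % N

def enc_dot(enc_x, w):
    """Dot product of an encrypted vector with plaintext non-negative
    integer weights: prod(c_i ** w_i) mod N^2 = Enc(sum(w_i * x_i))."""
    acc = 1
    for c, wi in zip(enc_x, w):
        acc = acc * pow(c, wi, NSQ) % NSQ
    return acc

x = [3, 1, 4]  # quantized activations (client side, encrypted before sending)
w = [2, 5, 7]  # quantized plaintext weights (server side)
enc_x = [encrypt(v) for v in x]
print(decrypt(enc_dot(enc_x, w)))  # 2*3 + 5*1 + 7*4 = 39
```

Every homomorphic weight application here is a modular exponentiation, which hints at why accelerating such workloads in hardware is an interesting question.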
I found this blog post, which shows I’m not the only one who is interested:
I will try to put something together on my research day, but my question is more about the SambaNova hardware infrastructure and how capable it is of accelerating such workloads.
Maybe an engineer can explain what is possible, or an intern with access to the hardware can benchmark the code.
I think it would be interesting for inference providers, as the technology provides a guarantee of data privacy.
@hello1 I can submit the topic and interest to the product management teams, but I cannot guarantee that something like this will be on the near-term roadmap.
-Coby
Thanks Coby, that’s perfect. I will also do my personal experiments.
@hello1 Thanks for sharing this. I hadn’t heard about it until now.
Just curious, what applications are you using this for?
In my Web 4.0 wallet. I think if we want to build an infrastructure for privacy, then AI workloads are no exception. Whoever solves it will rule the market.
I think privacy is a more important feature than speed itself, given the rising number of cybersecurity incidents.
It protects not only the users, but also the inference providers.