At this point ONNX Runtime is the most mature inference engine for mobile.
I've had the joy of playing around with TFLite, PyTorch Mobile, and GGML for work, and nothing came close to ONNX in terms of stability across a wide array of devices.
ONNX is really helpful for shipping models from development to production environments. Its standard is designed to allow only "safe operations"; for anything to do with text manipulation, for example, you'd have to write your own operator or glue logic.
Unfortunately, at least as of the last time I evaluated it, it still can't handle certain things that are currently only done with pickling in the Python ecosystem: much of what's covered by scikit-learn's feature_extraction package, for example.
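To make the "glue logic" point concrete, here is a minimal sketch of text feature extraction kept outside the model graph, with its fitted state stored as JSON rather than pickled. Everything in it (`tokenize`, `fit_vocabulary`, `transform`) is a hypothetical stand-in written for illustration. It is not scikit-learn's API and it makes no attempt to match what `feature_extraction` actually does.

```python
import json
import re
from collections import Counter

# Hypothetical tokenizer: lowercase, keep runs of letters and digits.
TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return TOKEN_RE.findall(text.lower())


def fit_vocabulary(docs):
    """Map each distinct token to a column index, in sorted token order."""
    tokens = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def transform(text, vocab):
    """Return a dense count vector; tokens missing from vocab are ignored."""
    counts = Counter(tokenize(text))
    vec = [0.0] * len(vocab)
    for tok, n in counts.items():
        idx = vocab.get(tok)
        if idx is not None:
            vec[idx] = float(n)
    return vec


vocab = fit_vocabulary(["the cat sat", "the dog sat down"])

# The fitted state is plain data, so JSON is enough to ship it.
state = json.dumps(vocab)
restored = json.loads(state)

# This vector is what you would hand to the exported model as its input.
features = transform("The cat and the dog", restored)
```

The trade-off is that the same tokenizer has to be reimplemented wherever the model runs, which is exactly the effort pickle lets people skip.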
Not, I think, for any reason that's inherent to what those components are doing; a lot of it's just that much of the existing Python ML ecosystem was not engineered with robust productionization in mind. Possibly because the very existence of pickle means everyone has an easy (if horrifying) way to get the job done for zero effort. As the sklearn maintainers remind people every time they close an issue that asks for it, robust and secure model serialization would have had to be designed into the project from day one, and doing it now would essentially require a rewrite.
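For anyone wondering why pickle gets called horrifying, here is a small and deliberately harmless sketch. `Payload` is a made-up class. The point is only that `pickle.loads` calls whatever callable the byte stream names, so loading an untrusted pickle amounts to running untrusted code.

```python
import pickle


class Payload:
    """Hypothetical class whose pickled form asks the loader to call a function."""

    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call len('abc')".
        # A malicious pickle could name any importable callable here instead.
        return (len, ("abc",))


data = pickle.dumps(Payload())

# Unpickling runs the callable and returns its result, not a Payload instance.
result = pickle.loads(data)
```

Here `result` ends up as the integer that `len` returned, which shows the stream controlled what ran during loading.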
Model conversions are also a breeze.
Shameless self-promotion, but I wrote a little bit about calling ONNX from Scala here: https://tajd.co.uk/2023/10/15/onnx-interface-scala
If you want in on the fun of pickle: https://checkoway.net/musings/pickle/