v4; motivation and initial thoughts #951
# Conflicts: # protobuf-net.sln # src/Benchmark/Benchmark.csproj # src/BenchmarkBaseline/BenchmarkBaseline.csproj # src/BuildToolsUnitTests/BuildToolsUnitTests.csproj # src/Directory.Build.props # src/Examples/Examples.csproj # src/LongDataTests/LongDataTests.csproj # src/NativeGoogleTests/NativeGoogleTests.csproj # src/protobuf-net.AspNetCore/protobuf-net.AspNetCore.csproj # src/protobuf-net.BuildTools.Legacy/protobuf-net.BuildTools.Legacy.csproj # src/protobuf-net.BuildTools/protobuf-net.BuildTools.csproj # src/protobuf-net.Core/protobuf-net.Core.csproj # src/protobuf-net.FSharp.Test/protobuf-net.FSharp.Test.fsproj # src/protobuf-net.FSharp/protobuf-net.FSharp.csproj # src/protobuf-net.MSBuild.Test/protobuf-net.MSBuild.Test.csproj # src/protobuf-net.MSBuild/protobuf-net.MSBuild.csproj # src/protobuf-net.MessagePipes/protobuf-net.MessagePipes.csproj # src/protobuf-net.NodaTime/protobuf-net.NodaTime.csproj # src/protobuf-net.Protogen/protobuf-net.Protogen.csproj # src/protobuf-net.Reflection.Test/protobuf-net.Reflection.Test.csproj # src/protobuf-net.ServiceModel/protobuf-net.ServiceModel.csproj # src/protobuf-net.Test/protobuf-net.Test.csproj # src/protobuf-net/protobuf-net.csproj # src/protogen.site/protogen.site.csproj # src/protogen/protogen.csproj
Hey @mgravell, thanks for your work. Is there any news about it?
Hi @mgravell, I know that you have a lot of work + family + fighting criminals at night, but I think this is the best protobuf library for dotnet, and since .NET 8 Microsoft has been pushing hard on performance + trimming + AOT + source generators, so I wanna ask:
Anyway thank you for your work!
Hi; no hard ETA, but definitely still in progress; I'm very aware of the AOT work, and the hope is for the Dapper.AOT learnings to lead into the protobuf-net work; there exists an AOT branch for the analyzer pieces, but I think a lot of it will need some significant rework, but: I'm also a little distracted by Google's recent discussion of "edition 2024", and the "group" changes, which I also want to integrate (parser now works, so... yay!). This is relevant because the "editions" work and the "AOT" work need to interact, so understanding both pieces at the same time is essential. As for MSFT time: my MSFT time is focused on cache work at the moment, but: let's see how it goes a little later in the year.
About AOT, it seems that AssemblyBuilder.Save will work in .NET 9. I know generating C# code is a better solution, but would this be supported? Generating serializer assemblies for AOT in some "model.csproj" after-build step?
@michaldobrodenka if AssemblyBuilder.Save starts working, I'll happily light up that API, and if that unblocks some scenarios: great! However, that will be unrelated to and tangential to the intended AOT route, which I hope to be codegen based.
Any news?
First off, thank you for your work! It is great! I know this is not a rushed change (family, day job, etc.), but I was curious what could be done to help this PR along? Are there API improvements for code generators in .NET 9 that can be taken advantage of now?
The APIs haven't changed hugely (I don't think interceptors give us much); but I do need to revisit this from the ground up, using our learnings here as a foundation - the object model needs a lot of rework based on my learnings from Roslyn incremental generators over the last few years; the approach here is naive. Doable: yes. But it needs dedicated time.
Thanks for getting back so quickly, Marc! Ah, I see. So then would there be an issue / milestone with TODOs etc. to give a roadmap of what needs to be done so that we could help contribute where able?
I started to play with generators and created a demo for protobuf generated serializers/deserializers from protobuf-net attributes. https://github.com/michaldobrodenka/GProtobuf It's far from usable; only deserialization is supported, with only a handful of types. Not tested/used. Just a proof of concept. Maybe I will return to it sometime. But when it's working, deserialization is crazy fast.
That's neat, @michaldobrodenka! I am working on converting some code to be NativeAoT compliant and unfortunately haven't found a way to keep the NativeAoT runtime from trimming away But it did give me the idea (I haven't looked too deeply at this repo to see how feasible it is)--what if the I am sure there are reasons that this wouldn't work, but with .NET 9 giving full NativeAoT support for iOS, I am seeing a lot of movement towards NativeAoT to get off of MonoAoT.
Eesh, I should just dust this off and ship something, even if it is incomplete. My plans are wider than my calendar, it seems.
Is there any chance v4 could bring back support for I'm trying to find a good serializer for Unity and ProtoBuf v2 is the only one I've found which meets all my needs except that I can't seem to use it in Android builds due to IL2CPP requiring AOT compilation so it would be a huge shame to find a solution to that problem only to lose such a useful feature.
@KybernetikGames did you look at the cysharp repos? They develop games with Unity and they are the creators of R3 (observables) and [Message/Memory]Pack (serializers), both developed to be compatible with Unity.
If you need a solution now, you can check my protobuf-net 2 fork - with precompile you can prepare the serializer in a post-build step as a dll. I'm using it in production. And you don't need the old .NET Framework. It works with net6+ https://github.com/michaldobrodenka/protobuf-net
@Dona278 I briefly tried MessagePack and MemoryPack but ran into issues with each of them (here and here) which would have required me to refactor quite a bit of my code base. ProtoBuf v2 seemed like a silver bullet which handled everything I need to do with it right up until I tried to use it in a runtime build. But if I can't get it going then I'll definitely be revisiting the cysharp systems. @michaldobrodenka I found your repo earlier today and have been trying to get it to work in Unity with no success so far and there's no Issues page so I wasn't sure how to contact you. Do you have a preferred contact method?
@KybernetikGames have you checked
I genuinely do have plans to revisit the AOT work. I just need the world to switch to a 36 hour day so I have enough hours in each...
Well, I just wanted to ask if you maybe had an outline of the work (that you know of so far) that needed to be done so that anyone who has the time and could contribute would (I have been looking into I know I am certainly interested in contributing.
This PR covers some initial exploration into v4
Key Motivations
Motivations 2 and 3 (performance and memory usage) are most likely addressed by way of a new reader/writer API with additional optimizations; motivation 1 (AOT) is most likely addressed via new build tools that integrate with the outputs from 2 and 3.
Improve AOT Support
Currently the core engine is focused on runtime reflection-based IL emit. The library conceptually supports AOT scenarios, including library separation of the core and reflection-based aspects, and attribute based annotation support for manually-written serializers, but none of the tools currently generate code-based serializers. We aim to provide both code-first and contract-first AOT scenarios, typically using Roslyn generators (either based on the discovered code model, or the .proto files parsed - the machinery ahead of these bits already exists).
Additionally:
Improve performance
Profiling has shown that the existing API is sub-optimal; discovery work has been done ahead of this PR to investigate a "from first principles" re-imagining of the core reader/writer API. It is fundamentally not possible to achieve all of the aims here without a new API, although it may be possible to reuse the new API from within the older API, with the older API acting as a wrapper layer.
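To give a feel for the kind of primitive a span-based reader/writer layer is built from, here is a minimal sketch of base-128 varint encoding and decoding over Span<byte>. The varint format itself is standard protobuf; the type and method names (VarintSketch, WriteVarint, ReadVarint) are made up for illustration and are not the API explored in this PR.

```csharp
using System;

// Hypothetical helper names; not the protobuf-net v4 API.
public static class VarintSketch
{
    // Writes value as a base-128 varint into destination; returns the number of bytes written.
    public static int WriteVarint(Span<byte> destination, ulong value)
    {
        int count = 0;
        while (value >= 0x80)
        {
            // Low 7 bits of the value, plus the continuation bit.
            destination[count++] = (byte)((value & 0x7F) | 0x80);
            value >>= 7;
        }
        destination[count++] = (byte)value; // final byte, continuation bit clear
        return count;
    }

    // Reads a varint from the start of source; reports how many bytes were consumed.
    public static ulong ReadVarint(ReadOnlySpan<byte> source, out int bytesRead)
    {
        ulong result = 0;
        int shift = 0;
        for (int i = 0; i < source.Length && shift < 64; i++, shift += 7)
        {
            byte b = source[i];
            result |= (ulong)(b & 0x7F) << shift;
            if ((b & 0x80) == 0)
            {
                bytesRead = i + 1;
                return result;
            }
        }
        throw new FormatException("Truncated or over-long varint.");
    }
}

public static class VarintDemo
{
    public static void Main()
    {
        Span<byte> buffer = stackalloc byte[10]; // 10 bytes is the maximum for a 64-bit varint
        int written = VarintSketch.WriteVarint(buffer, 300);
        ulong value = VarintSketch.ReadVarint(buffer.Slice(0, written), out int read);
        Console.WriteLine($"{written} bytes written, read back {value} from {read} bytes");
    }
}
```

Working directly against spans avoids a stream abstraction and per-call allocations on the hot path, which is one plausible reason a redesigned API would be shaped this way.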
These changes include:
Support Additional Memory Usage Scenarios
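The scenario this section targets, many small byte[] payloads versus slices of one shared buffer, can be sketched with BCL types alone. This is an illustration only: the toy layout (a 1-byte length followed by the payload) and the names PayloadSlicingSketch, ReadAsArrays and ReadAsSlices are assumptions, not the protobuf wire format and not the PR's code.

```csharp
using System;
using System.Collections.Generic;

public static class PayloadSlicingSketch
{
    // Assumed toy layout: each payload is a 1-byte length followed by that many bytes.

    // Copying approach: one new byte[] is allocated per payload.
    public static List<byte[]> ReadAsArrays(byte[] buffer)
    {
        var items = new List<byte[]>();
        int offset = 0;
        while (offset < buffer.Length)
        {
            int length = buffer[offset++];
            items.Add(buffer.AsSpan(offset, length).ToArray()); // allocates and copies
            offset += length;
        }
        return items;
    }

    // Slicing approach: every payload is a view over the same underlying buffer.
    public static List<ReadOnlyMemory<byte>> ReadAsSlices(ReadOnlyMemory<byte> buffer)
    {
        var items = new List<ReadOnlyMemory<byte>>();
        int offset = 0;
        while (offset < buffer.Length)
        {
            int length = buffer.Span[offset++];
            items.Add(buffer.Slice(offset, length)); // no copy, no per-item array
            offset += length;
        }
        return items;
    }
}

public static class PayloadSlicingDemo
{
    public static void Main()
    {
        // Three payloads with lengths 2, 1 and 3.
        byte[] data = { 2, 0xAA, 0xBB, 1, 0xCC, 3, 1, 2, 3 };
        Console.WriteLine(PayloadSlicingSketch.ReadAsArrays(data).Count);
        Console.WriteLine(PayloadSlicingSketch.ReadAsSlices(data).Count);
    }
}
```

The trade-off is that any surviving slice keeps the whole oversized buffer alive, so whether slicing wins depends on payload sizes and object lifetimes.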
Some models are inherently "allocatey"; consider, for example, a model with a repeated chunk of multiple sub-items, each of which has a bytes payload, resulting in large numbers of small byte[] chunks. The idea here is to facilitate more efficient scenarios here; e.g. we could generate ReadOnlyMemory<byte> instead of byte[], and allow multiple leaf levels to be slices of the same underlying oversized buffer. The existing PR explores this scenario. Note, however, that profiling is mixed on the outcome of this. We want to enable this, but as an option, allowing us to play with multiple options with real data.
Smaller Outputs
Right now the runtime library needs to contain chains for things it might need: niche code paths for obscure and esoteric models. Because this discovery is done via reflection, these edge cases are largely not trimmable (in the AOT sense), because determining whether or not they are reached is basically impossible. By moving to an AOT path, it is very clear at build time what code is reached, with no reflection gunk involved. This means we don't need all the reflection dependencies, and we don't need the dependencies for everything the model doesn't use. This saving can be significant.
Likely implementation
We need to consider code-first and contract-first separately here. Let's consider a simple scenario:
Currently, this can be used to generate something akin to the same contract, as seen from a code-first perspective:
What we want to achieve is that, whether starting code-first or contract-first, we generate code that includes the actual serialization code, either at the same time as generating the type (contract-first), or in an additional partial class (code-first). Typical output code is shown in the exploration work in the PR.
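As a rough illustration of the code-first "additional partial class" idea, the sketch below pairs a hand-written type with a second partial declaration that stands in for generator output. Everything here is hypothetical: the Person type, the choice of field number 1 for Id, and the WriteTo method are assumptions for the example, and the actual generated code in the PR will differ.

```csharp
using System;
using System.IO;

// Hand-written half: what the user authors in a code-first model.
public partial class Person
{
    public int Id { get; set; }
}

// Second half: stands in for what a generator could emit into a separate file.
// Written by hand here; the method name and shape are assumptions.
public partial class Person
{
    public void WriteTo(Stream output)
    {
        if (Id != 0) // default values are skipped
        {
            output.WriteByte(0x08); // tag: field 1, wire type 0 (varint)
            ulong value = unchecked((ulong)(long)Id); // int32 is sign-extended to 64 bits
            while (value >= 0x80)
            {
                output.WriteByte((byte)((value & 0x7F) | 0x80));
                value >>= 7;
            }
            output.WriteByte((byte)value);
        }
    }
}

public static class PartialClassDemo
{
    public static void Main()
    {
        using var stream = new MemoryStream();
        new Person { Id = 150 }.WriteTo(stream);
        Console.WriteLine(BitConverter.ToString(stream.ToArray()));
    }
}
```

Because the serialization logic is ordinary compiled code on the type itself, nothing has to be discovered via reflection at runtime, which is what makes this shape friendly to AOT and trimming.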
The key point here, though, is that code-first and contract-first start from completely different code models - contract-first (and the existing code-gen) starts from the FileDescriptorSet view, whereas code-first starts from a Roslyn view. The actual code-gen should not have to contend with this, and we do not intend duplication, so: the proposal instead is to create a new source-agnostic API that the new code-gen tools should use, and populate the source-agnostic API from the specific scenarios.
For example, we could have:
So here, we would generate the equivalent of
So; the initial work items:
FileDescriptorSet contract-first model to populate the new model
It is not a goal of the current stage to emit code for the old serializer API from the new model; while that might be a nice feature in the future, it is not seen as solving an immediate need, and will only add support costs.
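The source-agnostic model proposed above might be shaped something like the sketch below: two front ends populate one neutral model, and the emitter only ever sees that model. All names here (FieldModel, MessageModel, IModelSource, the two stand-in sources, SerializerEmitter) are invented for illustration; the real object model is still to be designed, and the stand-in sources return hard-coded data rather than walking a FileDescriptorSet or Roslyn symbols.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical source-agnostic model; all type and member names are assumptions.
public sealed record FieldModel(int Number, string Name, string TypeName);
public sealed record MessageModel(string Name, IReadOnlyList<FieldModel> Fields);

// Each front end (contract-first or code-first) would implement this.
public interface IModelSource
{
    IEnumerable<MessageModel> GetMessages();
}

// Stand-in for the contract-first path; a real one would walk a FileDescriptorSet.
public sealed class ContractFirstSource : IModelSource
{
    public IEnumerable<MessageModel> GetMessages()
    {
        yield return new MessageModel("Person", new[] { new FieldModel(1, "id", "int32") });
    }
}

// Stand-in for the code-first path; a real one would walk Roslyn symbols.
public sealed class CodeFirstSource : IModelSource
{
    public IEnumerable<MessageModel> GetMessages()
    {
        yield return new MessageModel("Person", new[] { new FieldModel(1, "id", "int32") });
    }
}

public static class SerializerEmitter
{
    // The emitter sees only the neutral model, never the original source.
    public static string Describe(MessageModel message)
    {
        var parts = new List<string>();
        foreach (FieldModel field in message.Fields)
        {
            parts.Add($"{field.Number}:{field.Name}:{field.TypeName}");
        }
        return message.Name + " { " + string.Join(", ", parts) + " }";
    }
}

public static class ModelDemo
{
    public static void Main()
    {
        foreach (IModelSource source in new IModelSource[] { new ContractFirstSource(), new CodeFirstSource() })
        {
            foreach (MessageModel message in source.GetMessages())
            {
                Console.WriteLine(SerializerEmitter.Describe(message));
            }
        }
    }
}
```

The point of the indirection is that equivalent inputs from either front end yield the same emitter output, so the code generation logic is written once.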
High level tasks
FileDescriptorSet
FileDescriptorSet
Test skeleton; somehow set up a multi-input test (folder-based?) that takes a corpus of examples