I have project A and project B. They may be written in different programming languages. Project A exposes an API defined in proto files, and project B uses those files to generate the API code in its own programming language.
But where are the proto files stored? What is the conventional way to do it with protobuf? Do you add the files generated from the proto files to version control?
If you store a copy of the proto files in both project A and project B, then whenever project A changes its API, project B has to copy the files over again. This approach doesn't work well when many projects use the API exposed by project A.
You can solve the above issue with a separate project, project C, containing the shared proto files. But then how do you generate code from the proto files in project A and project B?
I would suggest storing the .proto files in a separate project. These are the contract between your two projects, and they are not necessarily "owned" by either one. Storing them in a separate project provides neutral ground for members of both projects to negotiate changes to the files - for example through a pull/merge request process with reviewers from both projects.
As for generating code from the proto files, I would do this in the projects that need it. So in your case project C would only contain the .proto files, and projects A and B would pull the .proto files in and generate the code that they need. I think it has to be this way, since projects A and B are the ones consuming the generated code. If the code were generated in project C, projects A and B would still have to pull the generated code in to use it, and since project C is technically decoupled from A and B, it would not be obvious which languages to generate for - all of them? Just the two that are needed?
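As a rough illustration of "pull the .proto files in and generate locally", here is a minimal Python sketch of a build step that a consuming project might run. It assumes project C has been checked out inside the consuming project (for example as a git submodule) at a hypothetical path such as third_party/project-c/proto, and that protoc is installed. The helper name build_protoc_command and the directory names are made up for this example; only --proto_path and the --*_out flags come from protoc itself.

```python
from pathlib import Path

# Output flags built into protoc. Other languages (e.g. Go) need a
# separate plugin and are left out of this sketch.
LANG_FLAGS = {
    "python": "--python_out",
    "java": "--java_out",
    "cpp": "--cpp_out",
}


def build_protoc_command(proto_root, out_dir, lang):
    """Return the protoc argument list for every .proto file under proto_root.

    proto_root: directory holding project C's .proto files (hypothetical layout).
    out_dir:    directory in the consuming project for the generated code.
    lang:       one of the keys in LANG_FLAGS.
    """
    proto_root = Path(proto_root)
    # Collect all .proto files recursively, in a stable order.
    protos = sorted(p.as_posix() for p in proto_root.rglob("*.proto"))
    if not protos:
        raise FileNotFoundError(f"no .proto files under {proto_root}")
    return [
        "protoc",
        f"--proto_path={proto_root.as_posix()}",
        f"{LANG_FLAGS[lang]}={Path(out_dir).as_posix()}",
        *protos,
    ]
```

A project could pass the result to subprocess.run(..., check=True) from its build script, after creating the output directory, since protoc expects it to exist. Each consumer picks its own language flag, which is the point of generating in A and B rather than in C.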
By creating project C you are creating a place that could potentially hold more .proto files for other projects. Thinking to the future, you may have many projects that share common base message types. To manage an architecture with many interconnected projects, it makes a lot of sense to consolidate message definitions, and this would be difficult or impossible if each project maintained its own definitions, and even worse (as you say) if there were duplicate copies. Storing them in one location allows new projects to pick up existing definitions and expand them (within evolutionary guidelines), and allows for more rigor in managing and maintaining the set of definitions, e.g. a set of experienced reviewers making sure that everything is done consistently and sensibly - be it from a modeling, namespacing or versioning perspective.