rfc6749 says, "Before initiating the protocol, the client registers with the authorization server". It describes the client_id as REQUIRED. It gives some hints as to why this is the case, but leaves some questions open (to me). It is also unclear to what extent these reasons apply to PKCE, because rfc7636 only mentions that the client_id should be considered public.
With the client_id being public, an attacking application that tries to gain authorization can easily send the client_id of the real application. That means the authorization server cannot use the client_id alone to trust the client, nor can it use the client_id to show the user which client is trying to gain authorization.
rfc6749 mentions that the client_id is used to select the redirection endpoint. On the other hand, the malicious application can override that endpoint in the authorization request, so the authorization server cannot rely on the endpoint being correct and must ask the user whether it is. It also seems redundant to select an endpoint by client_id if it can be specified in the authorization request anyway. Besides, selecting an endpoint by client_id is only possible if the endpoint depends only on the client and not on the user, but I'm interested in the case where each user runs a local-network installation of the client on a user-chosen address. Unless I force each user to register a separate client (which does not seem to be the intention of registration and client_id), each user has a different redirection endpoint, so it has to be specified in the authorization request and cannot be selected by client_id alone.
rfc6749 also mentions the use of client_id during the token request: "In the "authorization_code" "grant_type" request to the token endpoint, an unauthenticated client MUST send its "client_id" to prevent itself from inadvertently accepting a code intended for a client with a different "client_id"." I honestly don't understand what this means. Assume that the attacking client uses a different client ID to gain authorization (which in itself would be a serious problem, but let's ignore that for a moment) and then convinces the genuine application to use that code to get a token. Sending client_id would prevent that, but let's see what happens if we skip that check. Now the genuine application has a token which it did not ask for, giving it different permissions than it asked for. How is that a problem? The genuine application either has all the permissions it needs (plus some) and works as intended, or it lacks some permissions and will at some point fail with an error. To me this does not seem to do any more harm than annoying the user.
Given all that, I do not understand why client registration and the use of a client_id are required. Suppose the client and the authorization server just skipped that and implemented the whole process without registration and client_id. What kind of attack would become possible?
I misunderstood two things about rfc6749. Clearing these two things up helped me understand why the client_id is specified to be required.
I misunderstood the redirect_uri parameter and thought that the endpoint configured during registration was just a default. Actually, the redirect_uri must be one of the configured endpoints; the parameter only allows selecting one of multiple registered endpoints. An attacker cannot redirect to their own URL just by setting that parameter. An attacker could still set up their own client, with their own client_id, which looks similar to the genuine application, but that should be prevented in the first place, and security considerations about the registration process are out of scope for rfc6749.
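To make the redirect_uri rule concrete, here is a minimal sketch of the check an authorization server might perform. The registry contents, the client_id value, the URLs and the function name are all made up for illustration, and the exact-match comparison is an assumption on my part, not a quote from any particular server.

```python
# Hypothetical registry: client_id -> redirect URIs fixed at registration time.
REGISTERED_REDIRECT_URIS = {
    "genuine-app": {
        "https://app.example.com/callback",
        "https://app.example.com/callback-mobile",
    },
}


def resolve_redirect_uri(client_id, requested_uri=None):
    """Return the URI to redirect to, or raise ValueError for an invalid request."""
    registered = REGISTERED_REDIRECT_URIS.get(client_id)
    if registered is None:
        raise ValueError("unknown client_id")
    if requested_uri is None:
        # Omitting redirect_uri is only unambiguous with exactly one registered URI.
        if len(registered) == 1:
            return next(iter(registered))
        raise ValueError("redirect_uri is required for this client")
    # The parameter selects among registered URIs; it cannot introduce a new one.
    if requested_uri not in registered:
        raise ValueError("redirect_uri is not registered for this client")
    return requested_uri
```

With this check in place, an attacker who sends the genuine client_id together with their own URL gets an error instead of a redirect.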
I mentioned that I have to deal with local-network installations using a priori unknown addresses. This would fall under section 2.4 (unregistered clients) and is therefore out of scope for rfc6749, too.
I assumed that if the attacker is granted an access token and can cause the genuine application to use it, nothing bad could happen, because at worst the token does not imply sufficient permission. What actually happens in many applications is that a request authorized with the access token can omit one or more of its parameters because they are implicitly determined by the access token. An obvious candidate for such a parameter is the user ID. This can cause the genuine application, when using the wrong token, to store sensitive information in a place that was prepared by, and belongs to, the attacker.
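A toy sketch of that failure mode, assuming a resource server that derives the user from the token. The token strings, the in-memory tables and the function name are hypothetical and only serve to show where the data ends up.

```python
# Hypothetical resource-server state: which user each access token was issued for.
TOKEN_OWNER = {
    "token-for-victim": "victim",
    "token-for-attacker": "attacker",
}

# Hypothetical per-user storage.
STORAGE = {"victim": [], "attacker": []}


def upload_document(access_token, document):
    """Store a document; the target user is implied by the token, not passed explicitly."""
    user_id = TOKEN_OWNER[access_token]
    STORAGE[user_id].append(document)
    return user_id


# The genuine application was tricked into holding the attacker's token,
# so the victim's document lands in the attacker's storage.
stored_for = upload_document("token-for-attacker", "victim's tax return")
```

Nothing fails from the application's point of view: the request is properly authorized, it just acts on the attacker's account.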
As for the "how", consider that access tokens are often not just random strings but JWTs that contain multiple data fields, guarded by a digital signature. It becomes immediately obvious that an application could take these fields from the JWT instead of from explicit parameters. If the JWT only served to grant permission, applications would have to expect all these fields again as parameters and check them for correctness (e.g. equality), and that would make the whole OAuth2 specification hard and error-prone to implement, and thus less secure overall.
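To illustrate how fields can be read straight out of such a token, here is a sketch that decodes the payload segment of a JWT using only the standard library. The claim names and values are invented, the example token is unsigned, and signature verification is deliberately left out; a real application must verify the signature before trusting any claim.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    """Base64url-encode without padding, as used for JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Base64url-decode, restoring the padding that JWTs strip."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


def read_claims(jwt: str) -> dict:
    """Decode the payload segment WITHOUT verifying the signature (illustration only)."""
    _header_b64, payload_b64, _signature_b64 = jwt.split(".")
    return json.loads(b64url_decode(payload_b64))


# Build an unsigned example token with made-up claims.
header = b64url_encode(json.dumps({"alg": "none"}).encode())
payload = b64url_encode(json.dumps({"sub": "attacker", "scope": "files.write"}).encode())
example_token = f"{header}.{payload}."  # empty signature segment

claims = read_claims(example_token)
user_id = claims["sub"]  # the user ID comes from the token, not from a request parameter
```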