GCP PubSub - broadcast message - only relevant subscriber handles message

I'm using GCP PubSub as a backend for Web Sockets in a load-balanced environment. The current implementation has a topic for each server behind the load balancer and a mapping between end-users and servers. When I wish to send a message to a particular user, I use the mapping to determine which topic to publish it on.

This works, but it has a lot of moving parts, and requires cleanup of topics when servers are removed by downscaling or rolling out an updated version of the application.

I'm now exploring a more sophisticated implementation, whereby there is only one topic. Since each server knows its end users, in theory I could publish a message to this single topic and each server could inspect the message, comparing it to its list of users. Only the server which currently has a Web Socket connection for the end user specified in the message would handle it.

Which brings me to my question(s) - how do I achieve this with PubSub?

Currently I'm using Pull subscriptions. Do I need to use a Push subscription instead? Perhaps that would allow simultaneous delivery to all subscribers?
In the Pull model, if a subscriber doesn't .ack() a message, presumably the message is redelivered, but it could take a long time to reach the appropriate subscriber, which defeats the purpose of Web Sockets ("real time" updates). Is this a fair assessment of how a Pull subscription would behave?
Am I using the wrong tool for the job? It's possible, but I'm hoping I just need to use the current tool differently.

Solution

In this case I would use this design:

Create a single topic to which all messages are published
When a VM starts, it creates its own pull subscription on the PubSub topic
When the VM shuts down, it deletes that subscription (in a shutdown script, for example)
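
The subscription lifecycle above could be sketched as follows. This is only a minimal illustration, not a definitive implementation: `subscriber` is assumed to be a `google.cloud.pubsub_v1.SubscriberClient` (any object exposing the same `create_subscription` and `delete_subscription` methods works), and the `ws-<vm_name>` naming scheme and the helper names are hypothetical.

```python
from contextlib import contextmanager


def subscription_path(project_id: str, vm_name: str) -> str:
    # Hypothetical naming scheme: one subscription per VM.
    return f"projects/{project_id}/subscriptions/ws-{vm_name}"


@contextmanager
def vm_subscription(subscriber, project_id: str, topic_id: str, vm_name: str):
    """Create a per-VM pull subscription on entry and delete it on exit.

    `subscriber` is assumed to behave like pubsub_v1.SubscriberClient.
    """
    sub_path = subscription_path(project_id, vm_name)
    topic_path = f"projects/{project_id}/topics/{topic_id}"
    # On startup: attach this VM's own subscription to the shared topic.
    subscriber.create_subscription(request={"name": sub_path, "topic": topic_path})
    try:
        yield sub_path
    finally:
        # On shutdown: remove the subscription so it stops accumulating messages.
        subscriber.delete_subscription(request={"subscription": sub_path})
```

A server would wrap its main loop in `with vm_subscription(...) as sub_path:`. If the process is killed before the `finally` block runs, the subscription stays behind, so a shutdown script (as suggested above) is still a useful safety net.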

Then, when a message arrives, it is published to the single PubSub topic and fanned out to all the active subscriptions. Each VM pulls messages continuously. When a message arrives in a VM's subscription:

The VM checks whether the message is for one of its users
- if not, it acks the message (this removes it from that VM's subscription only, not from the others)
- if so, it processes the message and then acks it
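
The check-then-ack logic above might look like the sketch below. It assumes the publisher sets a `user_id` attribute on every message (an assumption, not something PubSub does for you) and that the message object exposes `.attributes`, `.data` and `.ack()`, as the messages passed to a `google-cloud-pubsub` streaming-pull callback do. `make_callback` and `deliver` are hypothetical names.

```python
from typing import Callable, Set


def make_callback(local_users: Set[str], deliver: Callable[[str, bytes], None]):
    """Build a callback that acks every message but delivers only local ones."""

    def callback(message) -> None:
        # Assumption: the target user is carried in a "user_id" attribute.
        user_id = message.attributes.get("user_id")
        if user_id in local_users:
            # This server holds the Web Socket for that user: process it.
            deliver(user_id, message.data)
        # Ack in both cases. This only removes the message from this VM's
        # own subscription; the copies in other subscriptions are untouched.
        # If deliver() raises, the ack is skipped and the message is redelivered.
        message.ack()

    return callback
```

With the real client, the returned function would be passed as the `callback` argument of `SubscriberClient.subscribe(subscription_path, callback=...)`. Because `local_users` is read on every call, the server can mutate that set as users connect and disconnect.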

With this design, you minimize latency and publish to only one topic. However, every message is duplicated once per subscription, and each VM spends processing power discarding the messages that are not relevant to it.

EDIT 1

The principle is the following: you publish one message to the topic, and the message is then copied into every subscription. The subscribers on a subscription (one or many) each receive a subset of that subscription's messages, or all of them if the subscription has only one subscriber.

That's why, in my proposal:

Each VM creates its own subscription and is the only subscriber on it, so it receives a copy of every message published to the topic.
Irrelevant messages are acknowledged to remove them from the queue. They are removed only from the current VM's subscription.
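
To make the difference between subscriptions and subscribers concrete, here is a toy in-memory model of the principle described in this edit. It does not use PubSub at all: `ToyTopic` is a hypothetical class, and the round-robin split between subscribers of the same subscription is a simplification (the real service does not promise any particular distribution).

```python
from collections import defaultdict
from itertools import cycle


class ToyTopic:
    """Toy model: each subscription gets every message; subscribers share it."""

    def __init__(self):
        self._subscriptions = {}  # subscription name -> list of subscriber names

    def add_subscription(self, name, subscribers):
        self._subscriptions[name] = list(subscribers)

    def publish_all(self, messages):
        received = defaultdict(list)
        for subscribers in self._subscriptions.values():
            if not subscribers:
                continue  # nobody is pulling from this subscription
            # Every subscription sees every message (fan-out) ...
            turn = cycle(subscribers)
            for msg in messages:
                # ... but inside one subscription each message goes to one subscriber.
                received[next(turn)].append(msg)
        return dict(received)


topic = ToyTopic()
topic.add_subscription("sub-vm-a", ["vm-a"])        # one subscriber: gets everything
topic.add_subscription("sub-shared", ["w1", "w2"])  # two subscribers: split the load
result = topic.publish_all(["m1", "m2", "m3", "m4"])
# result["vm-a"] == ["m1", "m2", "m3", "m4"]
# result["w1"] == ["m1", "m3"] and result["w2"] == ["m2", "m4"]
```

This is why the design gives every VM its own subscription: sharing one subscription between VMs would hand each message to only one of them, which may not be the VM holding the user's Web Socket.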