In AWS, a VPC subnet can only reside in one availability zone. I'm curious what the reason behind this restriction is.
I don't know how AWS implements VPC under the hood, but VPC implementations are usually based on an overlay network such as VXLAN. Taking VXLAN as an example, two endpoints in a virtual subnet can technically communicate as long as the two physical hosts hosting those endpoints can communicate. It shouldn't matter whether these two hosts are in the same availability zone or not.
So I'm wondering what's the reason for the limitation. Is it due to performance or some other network limitations?
This is by design: the subnet a resource is associated with is the indicator of which availability zone that resource is located in.
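To make the subnet-to-zone relationship concrete, here is a minimal sketch that groups subnets by availability zone. It assumes subnet records shaped like the entries in the EC2 DescribeSubnets response (`SubnetId`, `AvailabilityZone`, `CidrBlock`); the IDs, zones, and CIDR ranges in `SAMPLE_SUBNETS` and the helper name `subnets_by_az` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample data, shaped like entries of a DescribeSubnets response.
SAMPLE_SUBNETS = [
    {"SubnetId": "subnet-aaa", "AvailabilityZone": "eu-west-1a", "CidrBlock": "10.0.0.0/24"},
    {"SubnetId": "subnet-bbb", "AvailabilityZone": "eu-west-1b", "CidrBlock": "10.0.1.0/24"},
    {"SubnetId": "subnet-ccc", "AvailabilityZone": "eu-west-1a", "CidrBlock": "10.0.2.0/24"},
]


def subnets_by_az(subnets):
    """Group subnet IDs by the single availability zone each one belongs to."""
    grouped = defaultdict(list)
    for subnet in subnets:
        # Each subnet record carries exactly one AvailabilityZone value.
        grouped[subnet["AvailabilityZone"]].append(subnet["SubnetId"])
    return dict(grouped)


if __name__ == "__main__":
    for zone, subnet_ids in sorted(subnets_by_az(SAMPLE_SUBNETS).items()):
        print(zone, subnet_ids)
```

An availability zone can contain many subnets, but each subnet maps to exactly one zone, so knowing a resource's subnet is enough to know its zone.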
When planning for resilience and high availability, it is key for users to be able to guarantee isolation between their resources. If a subnet could span several availability zones, they could no longer guarantee that their infrastructure would not all be brought down by a single event, such as a power cut.
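The isolation argument can be sketched as a simple check: because every subnet sits in one zone, a resource's subnet tells you whether it survives the loss of a given zone. This is an illustrative sketch only; the resource names, subnet IDs, zone names, and the helpers `survivors_if_az_fails` and `tolerates_any_single_az_failure` are hypothetical and not part of any AWS API.

```python
# Hypothetical mapping of subnet -> availability zone (one zone per subnet).
SUBNET_TO_AZ = {
    "subnet-aaa": "eu-west-1a",
    "subnet-bbb": "eu-west-1b",
}

# Hypothetical placement of resources into subnets.
PLACEMENTS = {
    "web-1": "subnet-aaa",
    "web-2": "subnet-bbb",
    "db-1": "subnet-aaa",
}


def survivors_if_az_fails(placements, subnet_to_az, failed_az):
    """Return the sorted names of resources that are not in the failed zone."""
    return sorted(
        name
        for name, subnet_id in placements.items()
        if subnet_to_az[subnet_id] != failed_az
    )


def tolerates_any_single_az_failure(placements, subnet_to_az):
    """True if at least one resource survives the loss of any one zone."""
    zones = {subnet_to_az[subnet_id] for subnet_id in placements.values()}
    return all(
        survivors_if_az_fails(placements, subnet_to_az, zone) for zone in zones
    )


if __name__ == "__main__":
    print(survivors_if_az_fails(PLACEMENTS, SUBNET_TO_AZ, "eu-west-1a"))
    print(tolerates_any_single_az_failure(PLACEMENTS, SUBNET_TO_AZ))
```

If a subnet could span zones, the `subnet_to_az` lookup would no longer be a single value, and this kind of reasoning about failure domains would not be possible from the subnet alone.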
On the other hand, some people want the lowest possible latency between resources within their VPC. By deploying them all to the same subnet, they can guarantee that traffic stays inside the same logical data centre (the availability zone), which gives them the best latency between services.