After looking at uop caches for a while, I am under the impression that each micro-op is 8 bytes. My question is: are all micro-ops the same size, including fused-domain micro-ops?
x86 chip vendors do not document this detail. However, uops need to be simple enough to be decoded within a fraction of a cycle, in contrast to x86 instructions, where decoding an instruction takes at least one cycle (although multiple instructions can be decoded in the same cycle). Giving all uops the same size and a fairly uniform format greatly helps achieve this. I think fused-domain and unfused-domain uops are most probably all the same size on most x86 processors.

In Intel processors, uops in the uop cache can be of different sizes depending on whether a uop has an immediate and/or a displacement operand. On the other hand, the IDQ can hold a fixed number of uops regardless of what the uops are, which suggests that each uop in the IDQ occupies the same amount of space.

The size of a fused-domain uop might still differ from that of an unfused-domain uop. But for micro-fusion to be of any use, a fused-domain uop must be strictly smaller than two unfused-domain uops combined. I think we can also reasonably say that a fused-domain uop is at least as large as an unfused-domain uop.
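To make the last two bounds concrete, here is a minimal sketch of the arithmetic. It assumes uop sizes are whole numbers of bits, and the 64-bit input is only the question's 8-byte impression, not a documented figure. The function name `fused_uop_size_bounds` is made up for illustration.

```python
from typing import Tuple


def fused_uop_size_bounds(unfused_bits: int) -> Tuple[int, int]:
    """Return the inclusive (min, max) size in bits of a fused-domain uop.

    Encodes the two constraints argued above:
      fused >= unfused       (a fused uop carries at least as much information)
      fused <  2 * unfused   (otherwise micro-fusion saves no space)
    Assumes sizes are whole numbers of bits.
    """
    if unfused_bits <= 0:
        raise ValueError("unfused_bits must be positive")
    lower = unfused_bits          # at least as large as one unfused uop
    upper = 2 * unfused_bits - 1  # strictly smaller than two unfused uops
    return lower, upper


# Hypothetical input: 64 bits, taken from the question's 8-byte impression.
print(fused_uop_size_bounds(64))
```

Under that assumption, a fused-domain uop would fall somewhere between 64 and 127 bits. The real figure is not public, so this only narrows the range.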