How can I make sure that a cloud HSM provider is really using a physical Hardware Security Module and not simulation software?

This is exactly what I want to know. Cloud-based HSMs are expensive, and I need to be sure of what I am paying for.

Solution

Primarily through public-key cryptography. HSMs contain keypairs whose certificates chain back to trusted public roots, in the same way that SSL/TLS certificates do. You can get a signed certificate from the device and then verify that the certificate was signed by the manufacturer. (This is the same way you'd verify that an HSM in your physical possession is real and not a counterfeit.)
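To make the chain-of-trust idea concrete, here is a minimal toy sketch in Python using only the standard library. It is not how any particular HSM vendor does it: it uses textbook RSA with no padding, fixed Mersenne primes, and a simplified `Cert` record in place of X.509. All names (`make_keypair`, `issue`, `verify_chain`, the subject strings, the intermediate "manufacturing CA") are hypothetical and chosen for illustration only.

```python
import hashlib
from dataclasses import dataclass
from typing import Sequence

E = 65537  # conventional RSA public exponent


@dataclass(frozen=True)
class PublicKey:
    n: int
    e: int = E


@dataclass(frozen=True)
class KeyPair:
    public: PublicKey
    d: int  # private exponent; on a real HSM this never leaves the device


@dataclass(frozen=True)
class Cert:
    subject: str
    subject_key: PublicKey
    signature: int  # issuer's signature over the subject name and key


def make_keypair(p: int, q: int) -> KeyPair:
    # Toy key generation from two fixed primes; real keys use large random primes.
    return KeyPair(PublicKey(p * q), pow(E, -1, (p - 1) * (q - 1)))


def _digest(message: bytes, n: int) -> int:
    # Textbook RSA with no padding: hash, then reduce modulo n. Not secure.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def _to_be_signed(subject: str, key: PublicKey) -> bytes:
    return f"{subject}|{key.n}|{key.e}".encode()


def issue(issuer: KeyPair, subject: str, subject_key: PublicKey) -> Cert:
    n = issuer.public.n
    sig = pow(_digest(_to_be_signed(subject, subject_key), n), issuer.d, n)
    return Cert(subject, subject_key, sig)


def verify_chain(chain: Sequence[Cert], trusted_root: PublicKey) -> bool:
    """Check each certificate against the key certified just above it.

    `chain` is ordered from the certificate issued by the root down to the
    device certificate. An empty chain proves nothing, so it is rejected.
    """
    if not chain:
        return False
    issuer_key = trusted_root
    for cert in chain:
        expected = _digest(_to_be_signed(cert.subject, cert.subject_key), issuer_key.n)
        if pow(cert.signature, issuer_key.e, issuer_key.n) != expected:
            return False
        issuer_key = cert.subject_key
    return True


# Mersenne primes keep the demo deterministic; they are unsuitable for real keys.
root = make_keypair(2**521 - 1, 2**607 - 1)          # manufacturer root (hypothetical)
intermediate = make_keypair(2**107 - 1, 2**127 - 1)  # manufacturing CA (hypothetical)
device = make_keypair(2**61 - 1, 2**89 - 1)          # key held inside the HSM
emulator_ca = make_keypair(2**19 - 1, 2**31 - 1)     # key made up by a fake provider

genuine_chain = [
    issue(root, "manufacturing-ca", intermediate.public),
    issue(intermediate, "hsm-serial-0001", device.public),
]
forged_chain = [
    issue(emulator_ca, "manufacturing-ca", emulator_ca.public),
    issue(emulator_ca, "hsm-serial-0001", device.public),
]

genuine_ok = verify_chain(genuine_chain, root.public)
forged_ok = verify_chain(forged_chain, root.public)
print("genuine chain accepted:", genuine_ok)
print("forged chain accepted:", forged_ok)
```

The point of the sketch is that the only thing you need to trust in advance is the manufacturer's root public key, obtained out of band. A provider running an emulator can mint any certificates it likes, but it cannot produce one that verifies against that root. With real X.509 certificates you would use standard tooling for the same check, for example `openssl verify` with the manufacturer's root as the trusted CA file.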

You're also trusting that the manufacturer, and the certifying agencies it works with, are correct in their statements about the device's security, but that's no different from what you have to do with an HSM in your possession.

The only difference between the threat models of on-prem and cloud HSMs is that in a cloud situation, your cloud provider can watch traffic and attempt to run commands on the HSM. However, the traffic is all encrypted and the device is at the very least password-protected (and will usually zeroize itself after too many failed login attempts), so your cloud provider can't actually access any cryptographic material or see what you're doing. The most they could see is how much traffic you're sending to the HSM, and if you're really paranoid you can have a system that adds random traffic to obfuscate usage patterns.
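As a rough illustration of the traffic-obfuscation idea, the sketch below pads each time interval with dummy operations up to a fixed budget, so an observer counting requests sees the same volume every interval. Note that this is constant-rate padding, one simple variant, rather than purely random cover traffic. The function names and the per-interval budget are assumptions made for the example, and actually sending dummy requests to an HSM is left out.

```python
from typing import List, Sequence


def dummy_ops_needed(real_ops: int, target_ops: int) -> int:
    """Number of dummy operations to add so the interval totals target_ops."""
    # If real traffic already exceeds the budget, nothing can be hidden by padding.
    return max(0, target_ops - real_ops)


def observed_traffic(real_ops_per_interval: Sequence[int], target_ops: int) -> List[int]:
    """Request counts an outside observer would see after padding each interval."""
    return [real + dummy_ops_needed(real, target_ops) for real in real_ops_per_interval]


real = [3, 0, 7, 10]          # hypothetical real HSM operations per interval
padded = observed_traffic(real, target_ops=10)
print(padded)
```

The budget has to be at least as large as your busiest interval, otherwise the peaks still show through. The dummy operations also cost real HSM time, so there is a trade-off between privacy and throughput.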

For more information, AWS answers this question in its CloudHSM FAQs under "How do I know that I can trust CloudHSM appliances?" (You need to scroll down a bit; the docs don't support linking to a specific question, just a section.)