I'm curious how it's able to get the IDs of UI elements in Android activities and simulate different events. Is it the same story for iOS?
On iOS, an HTTP server is embedded in your app. That HTTP server hosts routes that can be used for querying the view hierarchy and the internal state of your app. The server also serves as a bridge between your app and the UIAutomation JavaScript API; this is how gestures are performed.
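
To make the "embedded HTTP server with query routes" idea concrete, here is a rough sketch in Python. It is not the actual server or its real routes: the `/query` path, the `type` parameter, the `VIEW_TREE` data and all function names are made up for illustration. It only shows the general pattern of a test client asking an in-process server about the view hierarchy.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical stand-in for the app's live view hierarchy.
VIEW_TREE = {
    "id": "root", "type": "Window",
    "children": [
        {"id": "login_button", "type": "Button", "children": []},
        {"id": "username", "type": "TextField", "children": []},
    ],
}


def find_views(node, view_type):
    """Depth-first search returning the ids of all views of the given type."""
    matches = [node["id"]] if node["type"] == view_type else []
    for child in node["children"]:
        matches.extend(find_views(child, view_type))
    return matches


class QueryHandler(BaseHTTPRequestHandler):
    """Serves GET /query?type=<ViewType> (an assumed route, not a real one)."""

    def do_GET(self):
        url = urlparse(self.path)
        if url.path != "/query":
            self.send_error(404)
            return
        view_type = parse_qs(url.query).get("type", [""])[0]
        body = json.dumps(find_views(VIEW_TREE, view_type)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def start_server(port=0):
    """Start the server on a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), QueryHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query(server, view_type):
    """What a test client would do: ask the running app for matching views."""
    host, port = server.server_address[:2]
    conn = HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", f"/query?type={view_type}")
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()
```

In this sketch the test script calls `query(server, "Button")` and gets back the ids of the matching views. In the real setup the server runs inside the app process, so it can inspect the actual views rather than a hard-coded dictionary.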