Tags: windows, arm, grpc, hololens

What could be causing grpc exec_ctx starting_cpu to be a garbage value?


I'm trying to debug an access violation in a library that uses grpc. The bug seems to occur only in arm32 builds running on an arm64 Windows device (Hololens 2). The symptom is that the starting_cpu_ member of exec_ctx holds a garbage value immediately after the context is created, which causes an access violation once grpc's internal methods use it.

For example, this happens at line 485 of core\lib\surface\channel.cc, inside grpc_channel_create_registered_call(grpc_channel * channel, grpc_call * parent_call, unsigned int propagation_mask, grpc_completion_queue * completion_queue, void * registered_call_handle, gpr_timespec deadline, void * reserved).

There, exec_ctx.starting_cpu_ ends up with a value of 1610612848 (0x60000070), which is clearly not a valid CPU index. This only happens in one particular scenario: after we tear down a session and restart it (which involves shutting down the completion queue and creating a new one).

What could be modifying the starting_cpu_ value (or, as I suspect, the internal exec_ctx pointer thread local storage) right after creating the context?

I'm using grpc v1.29.1 on client and service.

Thanks!


Solution

  • This one was a bit of a doozy. It turns out there was an OS/silicon-level bug that caused calls to GetCurrentProcessorNumber() to return garbage values, but only when running user-mode ARM32 apps on ARM64 Snapdragon 845/850 devices. The specific bug has been fixed upstream, and the fix will roll out soon in Windows builds.
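
Until a fixed Windows build is available, one defensive option is to range-check whatever CPU index the OS reports before using it to index per-CPU storage. The following is only a minimal sketch of that idea, not gRPC's actual code: sanitize_cpu_index, PerCpuCounters and demo_garbage_index are hypothetical names, and it assumes the garbage index is what leads to the out-of-bounds access.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of gRPC or the Windows API): keep the
// reported CPU index if it is in [0, num_cores), otherwise fall back to 0.
constexpr unsigned sanitize_cpu_index(unsigned reported_cpu,
                                      unsigned num_cores) {
  return (num_cores != 0 && reported_cpu < num_cores) ? reported_cpu : 0u;
}

// The garbage value from the question, written in hex.
static_assert(1610612848u == 0x60000070u, "same value, different notation");

// Stand-in for any per-CPU storage that is indexed by CPU number.
struct PerCpuCounters {
  explicit PerCpuCounters(unsigned num_cores)
      : slots(num_cores == 0 ? 1 : num_cores, 0) {}

  void increment(unsigned reported_cpu) {
    assert(!slots.empty());
    // Without the range check, a garbage index would read or write far
    // outside the vector, which is undefined behavior.
    unsigned cpu =
        sanitize_cpu_index(reported_cpu, static_cast<unsigned>(slots.size()));
    slots[cpu] += 1;
  }

  std::vector<long> slots;
};

// Simulates one valid CPU index and one garbage index on an 8-core device.
long demo_garbage_index() {
  PerCpuCounters counters(8);
  counters.increment(3);            // valid index, lands in slot 3
  counters.increment(1610612848u);  // garbage index, falls back to slot 0
  return counters.slots[0];
}
```

Falling back to slot 0 trades a little accuracy in per-CPU bookkeeping for not crashing. In a real build the reported index would come from the OS call, and the core count from whatever the library already uses to size its per-CPU arrays.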