c++ javacard apdu c++-chrono winscard

Best way to measure the time of an APDU command to a Java Card in c++


I'm trying to mount a timing attack against a Java Card, so I need a way to measure the time that elapses between sending a command and receiving the response. I'm using the winscard.h interface from C++, and I wrote a wrapper around it to make my work easier. For example, this is the code I currently use to send an APDU, and it seems to work. Based on this answer I updated my code:

 byte pbRecvBuffer[258];
long rv;
if (this->sessionHandle >= this->internal.vSessions.size())
    throw new SmartCardException("There is no card inserted");
SCARD_IO_REQUEST pioRecvPci;
pioRecvPci.dwProtocol = (this->internal.vSessions)[sessionHandle].dwActiveProtocol;
pioRecvPci.cbPciLength = sizeof(pioRecvPci);

LPSCARD_IO_REQUEST pioSendPci;
if ((this->internal.vSessions)[sessionHandle].dwActiveProtocol == SCARD_PROTOCOL_T1)
    pioSendPci = (LPSCARD_IO_REQUEST)SCARD_PCI_T1;
else
    pioSendPci = (LPSCARD_IO_REQUEST)SCARD_PCI_T0;
word expected_length = 258;//apdu.getExpectedLen();
word send_length = apdu.getApduLength();
CardSession session = (this->internal.vSessions).operator[](sessionHandle);
byte * data = const_cast<Apdu&>(apdu).getNonConstantData();
auto start = Timer::now();
rv = SCardTransmit(session.hCard, pioSendPci,data,
    send_length, &pioRecvPci, pbRecvBuffer,&expected_length);
auto end = Timer::now();
auto duration = (float)(end - start) / Timer::ticks();
return *new ApduResponse(pbRecvBuffer, expected_length,duration);

class Timer
{
public:
static inline int ticks()
{
    LARGE_INTEGER ticks;
    QueryPerformanceFrequency(&ticks);
    return ticks.LowPart;
}

static inline __int64 now()
{
    struct { __int32 low, high; } counter;

    __asm cpuid
    __asm push EDX
    __asm rdtsc
    __asm mov counter.low, EAX
    __asm mov counter.high, EDX
    __asm pop EDX
    __asm pop EAX

    return *(__int64 *)(&counter);
}

};

My code fails with the error "The value of ESP was not properly saved across a function call. This is usually a result of calling a function declared with one calling convention with a function pointer declared with a different calling convention." My guess is that the rdtsc instruction is not supported by my Intel processor (a Broadwell 5500U). I'm looking for a proper way to do this kind of measurement and, eventually, to get more accurate timings.


Solution

  • The error message that you provided

    The value of ESP was not properly saved across a function call. This is usually a result of calling a function declared with one calling convention with a function pointer declared with a different calling convention.

    indicates a mistake in the inline assembly of the function you call. Assuming the default calling convention is used, the function is fundamentally flawed: cpuid overwrites ebx, which is a callee-saved register. Furthermore, the code pushes only one value onto the stack but pops two, so the second pop most likely removes the function's return address or the base pointer saved as part of the stack frame. As a result, the function either fails when it executes ret, because it no longer has a valid address to return to, or the runtime detects that the value of esp (which is restored from the value saved at the beginning of the function) is invalid. This has nothing to do with the CPU you're using, since every x86 CPU since the original Pentium supports rdtsc. However, on some processors the rate at which the counter advances may vary with the CPU's current speed state, which is why using the instruction directly is discouraged; OS facilities should be preferred, as they compensate for the differences between implementations of the instruction.

    Since you appear to be using C++11 (judging by the use of auto), use std::chrono to measure time intervals. If that doesn't work for some reason, use the facilities provided by your OS (this looks like Windows, so QueryPerformanceCounter is probably the one to use). If that still doesn't satisfy you, you can emit rdtsc through the __rdtsc intrinsic function and not worry about inline assembly.
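
As a rough illustration of the std::chrono suggestion, here is a minimal sketch that times a single call with std::chrono::steady_clock. The names time_call_us and fake_transmit are hypothetical and not part of the original code; fake_transmit merely sleeps for about 2 ms as a stand-in for the real SCardTransmit call, so the sketch compiles without a card reader.

```cpp
#include <cassert>
#include <chrono>
#include <thread>
#include <utility>

// Hypothetical helper (an assumption, not from the original post):
// runs `fn` once and returns the elapsed time in microseconds.
template <typename Fn>
double time_call_us(Fn&& fn) {
    using clock = std::chrono::steady_clock;  // monotonic clock, suited to intervals
    const auto start = clock::now();
    std::forward<Fn>(fn)();                   // the operation being timed
    const auto end = clock::now();
    assert(end >= start);                     // steady_clock never goes backwards
    // Convert the clock's native tick count to fractional microseconds.
    return std::chrono::duration<double, std::micro>(end - start).count();
}

// Hypothetical stand-in for the SCardTransmit call: sleeps ~2 ms
// to simulate the card's response latency.
inline void fake_transmit() {
    std::this_thread::sleep_for(std::chrono::milliseconds(2));
}
```

In the wrapper from the question, the lambda passed to time_call_us would presumably wrap the SCardTransmit call, and the returned value would replace the rdtsc-based duration. Note that the resolution of steady_clock depends on the standard library implementation, so it is worth checking whether it is fine enough for the timing differences you expect.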