PHP 5.6 Laravel API application, hosted in an Azure App Service.
No extensions are loaded other than the defaults, so it does use WinCache and the PHP SQL Server extension.
99% of requests are fine, but 1% (which is a lot) end in a 500 error. This is intermittent, and there is nothing "obviously" wrong with the app. You cannot reproduce it: the same request works when tried again, or ran fine the 1000 times it ran previously.
We have experienced poor stability with PHP in Azure before on other projects, and the fix then was to disable the opcode cache in WinCache, but on the current version it is disabled by default.
The actual error is not logged by PHP, and little or none of the app's code executes: nothing makes it into the application log either, and writing to that log is pretty much the app's first non-framework activity.
Failed request tracing looks like a lot of useless info, but perhaps I don't know how to interpret it. Here are the highlights:
<failedRequest url="http://app:80/api/orders?with=orderItem&search=user_id:addf7e91a98a4eb3a623e65a38a2f646"
siteId="1758523661"
appPoolId="app-rest"
processId="4704"
verb="GET"
authenticationType="NOT_AVAILABLE" activityId="{00000000-0000-0000-6B06-0080000000F7}"
failureReason="STATUS_CODE"
statusCode="200"
triggerStatusCode="500"
timeTaken="250"
xmlns:freb="http://schemas.microsoft.com/win/2006/06/iis/freb"
>
...
<Event
xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="WWW Server" Guid="{3A2A4E84-4C21-4981-AE10-3FDA0D9B0F83}"/>
<EventID>0</EventID>
<Version>1</Version>
<Level>3</Level>
<Opcode>18</Opcode>
<Keywords>0x100</Keywords>
<TimeCreated SystemTime="2017-08-05T08:52:22.727Z"/>
<Correlation ActivityID="{00000000-0000-0000-6B06-0080000000F7}"/>
<Execution ProcessID="4704" ThreadID="28092"/>
<Computer>RD0004FFD742D0</Computer>
</System>
<EventData>
<Data Name="ContextId">{00000000-0000-0000-6B06-0080000000F7}</Data>
<Data Name="ErrorDescription">D:\Program Files (x86)\PHP\v5.6\php-cgi.exe - The FastCGI process exited unexpectedly</Data>
</EventData>
<RenderingInfo Culture="en-US">
<Opcode>SET_RESPONSE_ERROR_DESCRIPTION</Opcode>
<Keywords>
<Keyword>RequestNotifications</Keyword>
</Keywords>
</RenderingInfo>
<ExtendedTracingInfo
xmlns="http://schemas.microsoft.com/win/2004/08/events/trace">
<EventGuid>{002E91E3-E7AE-44AB-8E07-99230FFA6ADE}</EventGuid>
</ExtendedTracingInfo>
</Event>
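For anyone else wading through these trace files, the only part that mattered here was the ErrorDescription data item. Below is a minimal Python sketch, not an official tool, that pulls the timestamp, opcode name and error description out of events like the one above. The function name `error_descriptions` and the trimmed `SAMPLE_EVENT` string are my own for illustration. It assumes each trace contains `Event` elements in the namespace shown in the excerpt.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the <Event> element in the trace excerpt.
EVENT_NS = {"ev": "http://schemas.microsoft.com/win/2004/08/events/event"}
EVENT_TAG = "{%s}Event" % EVENT_NS["ev"]

# Trimmed copy of the event from the post, used only as sample input.
SAMPLE_EVENT = r"""<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
 <System>
  <TimeCreated SystemTime="2017-08-05T08:52:22.727Z"/>
 </System>
 <EventData>
  <Data Name="ContextId">{00000000-0000-0000-6B06-0080000000F7}</Data>
  <Data Name="ErrorDescription">D:\Program Files (x86)\PHP\v5.6\php-cgi.exe - The FastCGI process exited unexpectedly</Data>
 </EventData>
 <RenderingInfo Culture="en-US">
  <Opcode>SET_RESPONSE_ERROR_DESCRIPTION</Opcode>
 </RenderingInfo>
</Event>"""


def error_descriptions(xml_text):
    """Return (timestamp, opcode, description) for each event with an ErrorDescription."""
    root = ET.fromstring(xml_text)
    rows = []
    # iter() also yields the root itself, so a lone <Event> or a wrapper both work.
    for event in root.iter(EVENT_TAG):
        desc = event.find("ev:EventData/ev:Data[@Name='ErrorDescription']", EVENT_NS)
        if desc is None:
            continue  # most events in a trace carry no error description
        time = event.find("ev:System/ev:TimeCreated", EVENT_NS)
        opcode = event.find("ev:RenderingInfo/ev:Opcode", EVENT_NS)
        rows.append((
            time.get("SystemTime") if time is not None else None,
            opcode.text if opcode is not None else None,
            desc.text,
        ))
    return rows


if __name__ == "__main__":
    for timestamp, opcode, description in error_descriptions(SAMPLE_EVENT):
        print(timestamp, opcode, description)
```

Run against the sample, this surfaces the one useful line in the trace: php-cgi.exe exited unexpectedly, which matches the PHP log being empty.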
Grateful for any ideas or experiences that might explain or suggest ways to debug this before we move to AWS.
Thank you for anything you can offer.
The fix for me turned out to be rather simple and undocumented.
It was not WinCache, as I originally suspected.
In the App Service's App settings, add WEBSITE_DYNAMIC_CACHE with the value 0.
This has so far had no negative effect on our setup. But please bear in mind that I am not totally clear what this setting is supposed to achieve; it was indicated to be related to file system caching.
https://github.com/projectkudu/kudu/wiki/Configurable-settings
Turning on the 'dynamic cache' feature.
Full content caching: caches both file content and directory/file metadata (timestamps, size, directory content):
WEBSITE_DYNAMIC_CACHE=1
Directory metadata caching: will not cache content of the files, only the directory/file metadata (timestamps, size, directory content). That results in much less local disk use:
WEBSITE_DYNAMIC_CACHE=2
Note that it is actually on by default, and in our case we want to turn it off: WEBSITE_DYNAMIC_CACHE=0
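To confirm which mode a given instance is actually running with, a rough sketch like the one below can help. It relies on App Service exposing app settings to the site's processes as environment variables. The helper name `describe_dynamic_cache` is hypothetical, and the descriptions are just paraphrased from the wiki text quoted above. It also assumes a Python interpreter is available where you run it; from the PHP app itself the same value can be read with `getenv('WEBSITE_DYNAMIC_CACHE')`.

```python
import os

# Meanings paraphrased from the Kudu wiki text quoted above.
MODES = {
    "0": "dynamic cache off",
    "1": "full content caching (file content and directory/file metadata)",
    "2": "directory metadata caching only",
}


def describe_dynamic_cache(environ=None):
    """Describe the WEBSITE_DYNAMIC_CACHE value found in the given environment mapping."""
    environ = os.environ if environ is None else environ
    value = environ.get("WEBSITE_DYNAMIC_CACHE")
    if value is None:
        # The answer above reports the feature as on by default when unset.
        return "not set (platform default applies)"
    return MODES.get(value.strip(), "unrecognised value: %r" % value)


if __name__ == "__main__":
    print(describe_dynamic_cache())
```

If this still reports "not set" after the app setting has been saved, the process is probably one that started before the change.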