Windows Networking Troubleshooting 4: Slow Response of .NET Framework Applications
Like Java, .Net Framework on applications are executed and parsed on IL. With the !sos debug extension from Microsoft, we can basically have a clear and complete understanding of .Net Framework applications. This article describes how to solve a network problem with some users’ .Net Framework applications. Hopefully, it can inspire you to find many better solutions.
Problem
We have received feedback from a user indicating that the Windows applications seem to have slow access to MySQL and are behaving strangely. The architecture is like this:
Client — — — — Internet — -> Windows IIS server (.Net Framework application) — — — Internet — — -> MySQL server
User feedback:
- The client experienced slow access to the IIS server.
- The IIS server also had slow access to the same page by using its own Internet IP (VPC machines and EIP).
The user suspected that the reason was the slow access to the MySQL server on Windows and tried many IIS and MySQL optimizations. However, this problem never seemed to be resolved. In the meantime, the user tried to use another server to access MySQL, and ruled out MySQL itself as the cause of the performance issue.
Troubleshooting
At first glance, all these issues seemingly indicate that the Windows IIS server has slow access to the MySQL server when using an Internet IP. Therefore, we generate a waterfall chart of network requests to
Find out how slow the client access to the IIS server is.
This chart clearly shows the following information:
- The client indeed had slow access to the IIS server.
- The total latency is around six seconds .
- Each access to GetAllCountrys takes around 3.25s (TTFB) until the server returns the first response byte.
Is this caused by network problems between the client and the server? We can rule out this possibility by looking at the waterfall chart of network requests that we have obtained from the IIS server.
The information that we obtain through http://localhost
from the server also indicates that all components are running properly.
Another access test with the same content and a similar size, indicates that the response is as expected, almost with zero latency.
At this point, we can basically rule the product itself out as a cause of the slow access problem. To pin point the causes of this latency, we need to capture data packets for further analysis.
The following are some of our findings:
1. The server did receive the request 3.25s ago, but the response was delayed.
2. Judging from the captured data packets, after the server received the request, it immediately connected to the MySQL server and obtained the required content (that is, information about the country). We can easily find this by comparing the data packet content exported from Wireshark with the content received on the client.
3. After the server received the data, it did not immediately return the HTTP response. As shown in the chart, an ARP request occurred during this period. Logically, these additional operations have nothing to do with whether IIS returns a HTTP response or not. However, we can always find similar behaviors when we capture data packets several times. Intuitively, we think that this may be related to the 3.25s latency.
4. By taking a closer look, we can find that the ARP request on the server uses the Internet IP address of the client. However, this ARP request is definitely not initiated by the Windows operating system. If the server tried to communicate with the client at that time, it should request the Gateway address. Therefore, the ARP request is logically incorrect. This indirectly indicates that the ARP has something to do with the high latency.
5. Capturing a packet on Localhost can also be performed to determine whether or not an ARP request exists. However, it is also normal if nothing is captured since this is related to local traffic.
Unfortunately, Windows does not provide any logs related to ARP requests. To obtain definitive evidence, consider the following:
1. Use Process Monitor logs to collect information about the running processes of applications. Try to find some useful information by using Callstack in Process Monitor. Note that this kind of analysis takes a long time. We may resort to this when we don’t have any other clues. Currently, our analysis has been relatively clear and we specifically want to know what .Net applications are doing when the latency problem occurs .
2. Collect the Network Trace log in the .Net Framework. This log provides more information than a packet capture.
How to Configure Network Tracing:
http://msdn.microsoft.com/en-us/library/ty48b824(VS. 90).aspx.aspx)
3. Obtain the user dump for w3wp.exe on the IIS server during 3.25s of server unresponsiveness. Logically, IIS unresponsiveness is caused by an application waiting for a certain resource. However, it is not easy to capture this dump. Sysinternals provides a very powerful tool — Process Dump. We can use procdump.exe to monitor w3wp.exe and generate a dump file every one or two seconds to obtain the callstack when the problem occurs.
https://docs.microsoft.com/en-us/sysinternals/downloads/procdump
procdump -w -s 2 -n 10 w3wp.exe c:tempw3wp.dump
For detailed instructions, see the Microsoft document.
w3wp.exe User Dump Analysis
It is a relatively complex task to collect Network Trace in the .Net Framework. In addition, we need users’ cooperation to modify the profile. Therefore, we directly collect user dump file for further analysis. For a .Net program, we load the SOS debugging extension once we use Windbg to open the dump file.
For more information about the SOS debugging extension, visit https://docs.microsoft.com/en-us/dotnet/framework/tools/sos-dll-sos-debugging-extension
By analyzing w3wp.exe thread information, we locate the #27 thread.
27 Id: 1b4.654 Suspend: 0 Teb: 00000078`1eca8000 Unfrozen
Child-SP RetAddr : Args to Child : Call Site
00000078`1fabd668 00007ff8`45a8175e : 00000000`00000068 00000000`00000020 00000000`00000014 00000078`00000008 : ntdll! NtDeviceIoControlFile+0x14
00000078`1fabd670 00007ff8`433cf70c : 00000078`1fabd890 00000078`1fabd8ac 00000000`00000002 00000078`1fabd8ac : nsi! NsiGetAllParameters+0xfe
00000078`1fabd760 00007ff8`433def36 : 00007ff7`d82f1398 00000000`00000000 00000000`924b782a 00000000`00000000 : IPHLPAPI! ResolveIpNetEntry2+0xfc
00000078`1fabd850 00007ff7`d84ab0fb : 00000000`00000000 000001ab`927bcb48 00000078`1fabdb58 000001ab`927b4658 : IPHLPAPI! SendARP+0x86
00000078`1fabd930 00007ff7`d84a70b6 : 00000000`924b782a 00000000`00000000 00000078`1fabdb48 00000078`1fabdb40 : 0x00007ff7`d84ab0fb
00000078`1fabda30 00007ff7`d84a7d84 : 000001aa`92a02a20 00000078`1fabefe0 00000000`0000fdff 000001aa`917e9304 : 0x00007ff7`d84a70b6
00000078`1fabdb70 00007ff7`d80dffc7 : 000001ab`927bcb48 000001ab`9279da98 000001ab`927bcb90 000001aa`92541420 : 0x00007ff7`d84a7d84
00000078`1fabdcf0 00007ff7`d82e010b : 000001ab`92794d18 000001ab`9279da98 00007ff7`d8281638 00000000`00000001 : 0x00007ff7`d80dffc7
00000078`1fabddf0 00007ff7`d80df338 : 000001ab`927b4870 000001ab`92794d18 000001ab`927b4710 00000078`1fabd8a4 : 0x00007ff7`d82e010b
00000078`1fabde30 00007ff7`d80df1e3 : 000001ab`9279e1e8 000001ab`92795110 000001ab`9279f728 000001ab`9279f728 : 0x00007ff7`d80df338
00000078`1fabde80 00007ff7`d80df121 : 000001ab`92795890 000001ab`92795110 000001ab`9279e1e8 000001ab`9279f728 : 0x00007ff7`d80df1e3
00000078`1fabdec0 00007ff7`d80defda : 000001ab`927b42d8 000001ab`927b44c8 000001ab`927b443c 00000000`00000000 : 0x00007ff7`d80df121
00000078`1fabdef0 00007ff7`d80de9f0 : 000001ab`92795890 000001ab`927b43e0 000001ab`9279e1e8 00000000`00000000 : 0x00007ff7`d80defda
00000078`1fabdf20 00007ff7`d80de7ee : 000001ab`927b40c0 000001ab`927b43e0 000001ab`927b4034 00000078`1fabe2a8 : 0x00007ff7`d80de9f0
00000078`1fabdf60 00007ff7`d80de52a : 000001ab`927b42b8 000001ab`927b43e0 000001ab`927b4034 00000078`1fabe2a8 : 0x00007ff7`d80de7ee
00000078`1fabdfd0 00007ff7`d80de480 : 000001ab`92795890 000001ab`927b3fd8 000001ab`9279f5c4 00000078`1fabe2a8 : 0x00007ff7`d80de52a
00000078`1fabe000 00007ff7`d80d8757 : 000001ab`9279f6e0 000001ab`927b3fd8 000001ab`9279f5c4 00000078`1fabe2a8 : 0x00007ff7`d80de480
00000078`1fabe040 00007ff7`d80d8946 : 000001ab`9279e3e8 000001ab`927b3fd8 000001ab`9279f5c4 00000078`1fabe2a8 : 0x00007ff7`d80d8757
00000078`1fabe090 00007ff7`d80d898b : 000001ab`92795890 000001ab`9279f568 000001ab`92795c24 00000078`1fabe2a8 : 0x00007ff7`d80d8946
00000078`1fabe0c0 00007ff7`d80d8b79 : 000001ab`92795710 000001ab`9279f568 000001ab`92795c24 00000078`1fabe2a8 : 0x00007ff7`d80d898b
00000078`1fabe0f0 00007ff7`d80d89f4 : 000001ab`92739e08 000001ab`92795008 00000078`1fabe1f0 000001ac`b7af5d90 : 0x00007ff7`d80d8b79
00000078`1fabe120 00007ff7`d80d8b79 : 000001ab`92794d18 000001ab`92795bc8 000001ab`927952c4 00000078`1fabe2a8 : 0x00007ff7`d80d89f4
00000078`1fabe160 00007ff7`d80d8a76 : 000001ab`92795210 000001ab`92795bc8 000001ab`927952c4 00000078`1fabe2a8 : 0x00007ff7`d80d8b79
00000078`1fabe190 00007ff7`d80d8ae9 : 000001ab`92794d18 000001ab`92795268 000001ab`92795064 00000078`1fabe2a8 : 0x00007ff7`d80d8a76
00000078`1fabe1c0 00007ff7`d80d8b79 : 000001ab`927938c8 000001ab`92795268 000001ab`92795064 00000078`1fabe2a8 : 0x00007ff7`d80d8ae9
00000078`1fabe200 00007ff7`d80d8bc6 : 000001ab`92794fb0 000001ab`92795268 000001ab`92795064 00000078`1fabe2a8 : 0x00007ff7`d80d8b79
00000078`1fabe230 00007ff8`14ea57ba : 000001ab`92791c60 000001ab`92795008 00000078`1fabe2b0 00000078`1fabe2a8 : 0x00007ff7`d80d8bc6
00000078`1fabe260 00007ff8`15574bb3 : 000001ab`92739e08 000001ab`92739e08 00000078`1fabe470 00000078`1fabe2b0 : System_Web_ni+0x2f57ba
00000078`1fabe2f0 00007ff8`14e66590 : 000001ab`92739e08 000001ab`92739e08 00000078`1fabe470 00000078`1fabe2b0 : System_Web_ni+0x9c4bb3
00000078`1fabe330 00007ff8`14e7ac6c : 000001ab`92733810 000001ab`92739e08 00000078`1fabe470 00000000`00000000 : System_Web_ni+0x2b6590
00000078`1fabe380 00007ff8`14e67089 : 000001ab`92739a90 00007ff8`14e7a0a6 00000000`00000000 00000078`1fabe570 : System_Web_ni+0x2cac6c
00000078`1fabe4c0 00007ff8`14e7909b : 00000078`1fabe4c0 00007ff8`14ed40f5 00000078`1fabe558 00007ff8`14c0a6d0 : System_Web_ni+0x2b7089
00000078`1fabe510 00007ff8`14e681bd : 000001aa`92799ef8 000001ab`9272e020 000001ab`9272e378 00000000`00000000 : System_Web_ni+0x2c909b
00000078`1fabe590 00007ff8`14e67dc3 : 00000000`00000000 000001aa`91d2aa08 00000000`00000000 00000000`00000000 : System_Web_ni+0x2b81bd
00000078`1fabe730 00007ff8`15563592 : 000007f0`e084fce1 00000000`00000000 00007ff8`2f4c0df0 00007ff8`2f511d38 : System_Web_ni+0x2b7dc3
00000078`1fabe770 00007ff8`3758221e : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : System_Web_ni+0x9b3592
00000078`1fabe7e0 00007ff8`2f51183f : 000001aa`91d2aa08 00000078`00000000 000001aa`91d2ae38 000001aa`91d2aeb4 : clr! UMThunkStub+0x6e
00000078`1fabe870 00007ff8`2f5204ba : 00007ff8`00000000 00007ff8`2f4b3ed0 000001aa`91d2aa08 000001aa`91d2aa08 : webengine4! W3_MGD_HANDLER::ProcessNotification+0x8e
00000078`1fabe8a0 00007ff8`2f5115fb : 00007ff8`2f4b3ed0 00000000`00000080 000001aa`91d2aa08 00000000`00000000 : webengine4! W3_MGD_HANDLER::DoWork+0x3a5
00000078`1fabe910 00007ff8`2f573b21 : 00000000`00000000 00000000`00000000 0000d54a`f3ef32bb 000001aa`91d29180 : webengine4! RequestDoWork+0x3fc
00000078`1fabe9b0 00007ff8`2f4bd3d9 : 00000000`00000000 00000000`00000000 000001aa`91b1fd30 ffffffff`00000000 : webengine4! CMgdEngHttpModule::OnExecuteRequestHandler+0x21
00000078`1fabe9f0 00007ff8`2f4bcd7c : 000001aa`91b1fd30 00000078`1fabeb01 00000000`00000016 000001aa`91d29190 : iiscore! NOTIFICATION_CONTEXT::RequestDoWork+0xb9
00000078`1fabea40 00007ff8`2f4bcbc6 : 00000000`00000000 00000000`00000000 000001aa`91b1fd30 00000000`00000000 : iiscore! NOTIFICATION_CONTEXT::CallModulesInternal+0x19c
00000078`1fabeb50 00007ff8`2f4bde08 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : iiscore! NOTIFICATION_CONTEXT::CallModules+0x36
00000078`1fabebb0 00007ff8`2f4b7956 : 00007ff8`2f4b7880 00007ff8`14f87b88 000001ac`b7af5d90 000001aa`91b1fd30 : iiscore! NOTIFICATION_MAIN::DoWork+0x3d8
00000078`1fabeed0 00007ff8`2f56f481 : 000001aa`91d29180 00007ff8`2f4b7880 00000078`1fabf060 ffffffff`fffffffe : iiscore! W3_CONTEXT_BASE::IndicateCompletion+0xd6
00000078`1fabef50 00007ff8`2f5720c2 : 00000078`1fabf1b0 00000078`1fabf1b0 00000078`1fabf500 00000000`00000000 : webengine4! W3_MGD_HANDLER::IndicateCompletion+0x5d
00000078`1fabef80 00007ff8`14ed4f9e : 00000000`00000000 00007ff8`375846d5 00000078`1fabefa0 00006202`e851b47c : webengine4! MgdIndicateCompletion+0x22
00000078`1fabefb0 00007ff8`14e68385 : 000001aa`91d2aa08 00000078`1fabf1b0 00000078`1fabf018 00000078`1fabf178 : System_Web_ni+0x324f9e
00000078`1fabf070 00007ff8`14e67dc3 : 00000000`00000000 000001aa`91d2aa08 00000078`1fabf2b0 00000000`00000000 : System_Web_ni+0x2b8385
00000078`1fabf210 00007ff8`15563592 : 000001ac`b7a94820 000001aa`91d2aa08 00000000`00000003 00000000`00000004 : System_Web_ni+0x2b7dc3
00000078`1fabf250 00007ff8`37582473 : ffffffff`ffffffff 00007ff8`00000001 000001aa`91cbb501 00007ff8`3758a528 : System_Web_ni+0x9b3592
00000078`1fabf2c0 00007ff8`3764cfba : 00460054`0045004e 00007ff8`37585b35 ffffffff`fffffffe 00007ff8`3758a270 : clr! UM2MThunk_WrapperHelper+0x43
00000078`1fabf300 00007ff8`3758a9c8 : 00000078`1fabf428 00000000`00000000 000001ac`b7af5d90 00007ff8`375866af : clr! UM2MThunk_Wrapper+0x60
00000078`1fabf350 00007ff8`3764d065 : 00000078`1fabf501 00000000`00000002 000007f0`e084ee51 00000078`1fabf560 : clr! Thread::DoADCallBack+0x13d
00000078`1fabf510 00007ff8`3758241d : 000001aa`92293c30 ffffffff`ffffffff 00000000`00000003 00000000`00000000 : clr! UM2MDoADCallBack+0xb3
00000078`1fabf590 00007ff8`2f51183f : 00000000`00000000 000001aa`91d2aa08 00000000`00000003 00000000`00000004 : clr! UMThunkStub+0x26d
00000078`1fabf620 00007ff8`2f511792 : 000001aa`00000004 000001aa`91d2aa08 00007ff8`37ec4410 00000078`1fabf800 : webengine4! W3_MGD_HANDLER::ProcessNotification+0x8e
00000078`1fabf650 00007ff8`375899e7 : 000001aa`91d2aa08 00007ff8`2f511750 00007ff8`37ec4410 000001ac`b7af5d90 : webengine4! ProcessNotificationCallback+0x42
00000078`1fabf680 00007ff8`3758819c : 00007ff8`37589a70 00000078`1fabf700 00000078`1fabf801 00000000`00bbd38e : clr! UnManagedPerAppDomainTPCount::DispatchWorkItem+0x1bc
00000078`1fabf720 00007ff8`37587f45 : 00000001`00010002 00000000`00000001 000001ac`b7af5d90 00000000`00000001 : clr! ThreadpoolMgr::ExecuteWorkRequest+0x64
00000078`1fabf750 00007ff8`37632e8f : 00000000`00000000 00000000`00000001 00000000`00000001 00000000`00000804 : clr! ThreadpoolMgr::WorkerThreadStart+0xf5
00000078`1fabf7f0 00007ff8`45878364 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : clr! Thread::intermediateThreadProc+0x86
00000078`1fabfa30 00007ff8`47967091 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : kernel32! BaseThreadInitThunk+0x14
00000078`1fabfa60 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll! RtlUserThreadStart+0x21
We can easily locate IPHLPAPI! SendARP, which is literally the API that sends the ARP request. However, !sos is required to parse the long series of 0x00007ff7`d84ab0fb .
From the .Net String object on the stack, we can find the ARP request’s target address.
We can even print IL code for further analysis or save the DLL from the dump and use Reflector to decompile it.
Postscript
From the callstack in .Net, we can clearly understand why the latency problem occurs:
A user’s .Net application needs to write back access logs after it obtains information from the MySQL server. However, the application collects the MAC address information and therefore causes the system to continuously send out ARP requests, which finally leads to high latency. This code does not cause trouble in a local subnet. However, ARP requests can’t always receive responses across network segments, and subsequent response is blocked. Therefore, avoid this unnecessary operation whenever possible.
Later, when performing IL analysis of the .Net application, from the code we also see the call to GetWebClientHostname. In fact, this is the call to DNS to do reverse PTR query (We can also see the DNS PTR query in the previous packet capture. Unfortunately, I did not notice this before). Contrary to an ARP request, whether the query succeeds or fails, response will always be received if the DNS server is alive. Therefore, it does not have much impact on the latency. However, this operation is not very meaningful for client access from the Internet. So it is recommended to use Request.UserHostName instead to avoid unnecessary latency.