{"id":18,"date":"2010-05-04T13:03:56","date_gmt":"2010-05-04T18:03:56","guid":{"rendered":"http:\/\/capnbry.net\/blog\/?p=18"},"modified":"2010-06-04T12:02:35","modified_gmt":"2010-06-04T17:02:35","slug":"debugging-a-delphi-lockup-with-windbg","status":"publish","type":"post","link":"http:\/\/capnbry.net\/blog\/?p=18","title":{"rendered":"Debugging a Delphi Lockup with Windbg"},"content":{"rendered":"<p>While the Delphi debugger is quite good, there are some instances when it isn&#8217;t quite good enough.<\/p>\n<ul>\n<li><strong>Your application is crashing remotely<\/strong> The worst place for an application to fail is when there is no debugger available.  Usually these problems can be corrected via the &#8220;<em>Works Fine Here<\/em>&#8221; solution. Unfortunately most bug-tracking systems lack this option to close tickets.<\/li>\n<li><strong>Lack of symbol information<\/strong> The best trace you can get in the debugger shows a thread balls deep into Windows API calls for no apparent reason.  Some resolution can be gained from the DLL export addresses but without proper symbol information this isn&#8217;t enough detail.<\/li>\n<\/ul>\n<p>The solution to both these scenarios is to use Microsoft&#8217;s free debugger WinDbg, available as part of the <a href=\"http:\/\/www.microsoft.com\/whdc\/DevTools\/Debugging\/default.mspx\">Debugging Tools for Windows<\/a> package.  You&#8217;ll need the version for the host platform where the trace will occur.  This means get the 64-bit if you&#8217;re debugging on Win64 even if your app is 32-bit.  You&#8217;ll also need the <a href=\"http:\/\/code.google.com\/p\/map2dbg\/\">map2dbg<\/a> utility, originally by Lucian Wischik.<br \/>\n<!--more--><\/p>\n<h2>Prepare Your Project<\/h2>\n<p>First, you&#8217;ll need to prepare your project so that it exports its symbols to windbg, which is going to run externally from Delphi.  To do this you need to change the Linker options to generate a map file with a list of Publics.  
Compile your app and run map2dbg against the exe to generate a .dbg file.  You can have Delphi do this automatically by adding a Post-build step to the project options.  If you&#8217;re debugging a 32-bit application in a 64-bit environment, however, windbg expects the .dbg file to be named image00000000`00400000.dbg, so the rename can be added to the same step.  DLL .dbg files can keep the same name as the module.<\/p>\n<pre class=\"brush:shell\">\r\nC:\\SysTools\\map2dbg.exe \"$(OUTPUTPATH)\"\r\nmove \"$(OUTPUTDIR)$(OUTPUTNAME).dbg\" \"image00000000`00400000.dbg\"\r\n<\/pre>\n<div id=\"attachment_25\" style=\"width: 607px\" class=\"wp-caption alignnone\"><a href=\"http:\/\/capnbry.net\/blog\/wp-content\/uploads\/2010\/05\/delphisetup.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-25\" src=\"http:\/\/capnbry.net\/blog\/wp-content\/uploads\/2010\/05\/delphisetup.png\" alt=\"Delphi WinDbg Project Option\" title=\"Delphi WinDbg Project Option\" width=\"597\" height=\"270\" class=\"size-full wp-image-25\" srcset=\"http:\/\/capnbry.net\/blog\/wp-content\/uploads\/2010\/05\/delphisetup.png 597w, http:\/\/capnbry.net\/blog\/wp-content\/uploads\/2010\/05\/delphisetup-300x135.png 300w, http:\/\/capnbry.net\/blog\/wp-content\/uploads\/2010\/05\/delphisetup-150x67.png 150w\" sizes=\"(max-width: 597px) 100vw, 597px\" \/><\/a><p id=\"caption-attachment-25\" class=\"wp-caption-text\">Delphi Project Options<\/p><\/div>\n<h2>Using WinDbg<\/h2>\n<p>Fire up windbg and launch your application using File | Open Executable&#8230;. Then set your Symbol Path, also from the File menu.  Start with the directory where your application&#8217;s .dbg file is located, then any directories holding DLL .dbg files, and finish with &#8220;srv*&#8221;.  
This special token instructs WinDbg to use the Microsoft symbol server, which will automatically download the symbols you need and cache them in %ProgramData%\\sym\\dbg.<br \/>\n<div id=\"attachment_29\" style=\"width: 581px\" class=\"wp-caption alignnone\"><a href=\"http:\/\/capnbry.net\/blog\/wp-content\/uploads\/2010\/05\/symbolpath.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-29\" src=\"http:\/\/capnbry.net\/blog\/wp-content\/uploads\/2010\/05\/symbolpath.png\" alt=\"Symbol Search Path\" title=\"Symbol Search Path\" width=\"571\" height=\"209\" class=\"size-full wp-image-29\" srcset=\"http:\/\/capnbry.net\/blog\/wp-content\/uploads\/2010\/05\/symbolpath.png 571w, http:\/\/capnbry.net\/blog\/wp-content\/uploads\/2010\/05\/symbolpath-300x109.png 300w, http:\/\/capnbry.net\/blog\/wp-content\/uploads\/2010\/05\/symbolpath-150x54.png 150w\" sizes=\"(max-width: 571px) 100vw, 571px\" \/><\/a><p id=\"caption-attachment-29\" class=\"wp-caption-text\">Symbol Search Path<\/p><\/div><br \/>\nIf you&#8217;re having trouble getting your application&#8217;s symbols located, turn on noisy symbol loading and windbg will trace where it looks for each .dbg file.<\/p>\n<pre class=\"brush:plain\">\r\n0:008:x86> !sym noisy\r\nnoisy mode - symbol prompts on\r\n0:008:x86> .reload\r\n<\/pre>\n<p>The hotkeys are the standard Microsoft ones, so F5 means run, or you can type &#8216;g&#8217; into the command window.  For this example I&#8217;ll be debugging a 32-bit lockup on my 64-bit development system. Once I&#8217;ve got my application wedged, I hit the pause button.  
If your development environment is the same as mine, the first thing you&#8217;ll notice is that all your threads are in the wow64cpu dll:<\/p>\n<pre class=\"brush:plain\">\r\n0:001> k\r\nChild-SP          RetAddr           Call Site\r\n00000000`0069f0f8 00000000`74db282c wow64cpu!CpupSyscallStub+0x9\r\n00000000`0069f100 00000000`74e2d07e wow64cpu!WaitForMultipleObjects32+0x32\r\n00000000`0069f1c0 00000000`74e2c549 wow64!RunCpuSimulation+0xa\r\n00000000`0069f210 00000000`773ed177 wow64!Wow64LdrpInitialize+0x429\r\n00000000`0069f760 00000000`773a308e ntdll! ?? ::FNODOBFM::`string'+0x2bfe4\r\n00000000`0069f7d0 00000000`00000000 ntdll!LdrInitializeThunk+0xe\r\n<\/pre>\n<p>This is because windbg always shows the architecture of the context where the break occurred.  If you hit Pause, you get 64-bit.  If a breakpoint in your code pops, you get 32-bit.  Luckily, you can toggle architectures using the wow64exts extension by typing !wow64exts.sw, or simply !sw. <\/p>\n<pre class=\"brush:plain\">\r\n0:008:x86> k\r\nChildEBP RetAddr  \r\n04daf56c 770b0816 ntdll32!NtWaitForSingleObject+0x15\r\n04daf5d8 76211184 KERNELBASE!WaitForSingleObjectEx+0x98\r\n04daf5f0 76211138 kernel32!WaitForSingleObjectExImplementation+0x75\r\n04daf604 763188df kernel32!WaitForSingleObject+0x12\r\n04daf628 7643a819 ole32!GetToSTA+0xad\r\n04daf658 7643c05f ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+0x140\r\n04daf738 7632d0e5 ole32!CRpcChannelBuffer::SendReceive2+0xef\r\n...\r\n<\/pre>\n<p>Ah, that&#8217;s better.  Now we see a common problem: a COM object is trying to call into an STA, and when something hangs like this, it is generally because the STA has stopped pumping messages (calls to STAs are serialized using windows messages).  Let&#8217;s do a quick check to see what the threading model of this blocked thread is, which requires a little inside knowledge of how Windows keeps track of OLE threads.  
When you call CoInitialize()\/CoInitializeEx() for the first time on a thread, the threading model you request is recorded via the thread&#8217;s <a href=\"http:\/\/en.wikipedia.org\/wiki\/Win32_Thread_Information_Block\">Thread Environment Block<\/a> or TEB.  <em>Sidenote: Windows 95 called this the Thread Information Block (TIB), and you&#8217;ll sometimes see the terms used interchangeably although some of the data contained in the block differed between the 95 and NT platforms.<\/em>  The per-thread OLE data structure is allocated by COleTls::TLSAllocData; a pointer to it is stored at offset 0xf80 in the TEB, and the threading flags are at offset 0x0c within the structure.  From this knowledge, I can use !wow64exts.info to get the TEB address, add 0xf80, dereference that pointer, then inspect the flags to determine the threading model.  Note that I use 0x notation for hexadecimal values for clarity, but windbg takes numbers to be hex by default anyway, so the 0x is optional.<\/p>\n<pre class=\"brush:plain\">\r\n0:008:x86> !wow64exts.info\r\n...\r\nTEB32: 0x7efa0000\r\n0:008:x86> dd 0x7efa0000+0xf80 L1\r\n7efa0f80  006cf638\r\n0:008:x86> dd 0x6cf638+0x0c L1\r\n006cf644  00000143\r\n<\/pre>\n<p>Using some more Win32 knowledge, I happen to know that the TEB is also pointed to by the FS segment register.  I can use this information, the POI() debugger macro, and my ability to count to 0x0c to reduce this operation to a single quick command.<\/p>\n<pre class=\"brush:plain\">\r\n0:008:x86> dd poi(fs:f80)\r\n006cf638  00000000 00000000 00000000 00000143\r\n<\/pre>\n<p>In these flags, if the bits 0x140 are present, the thread is in the MTA.  If the bits 0x080 are present, the thread is an STA.  If you get ????????, this means that CoInitialize() has not been called.  The flags here are 0x143, indicating MTA.  So our thread here is MTA, trying to call an STA. 
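<\/p>\n<p>As a rough sketch (not part of the original session), the flag check above can probably be folded into two expression evaluations.  This assumes the standard MASM dwo() operator, which reads a DWORD from an address, and the bitwise and operator, and it simply reuses the masks quoted in this post rather than any documented constants.  A non-zero result from the first command and a zero result from the second would match the MTA reading of 0x143; if CoInitialize() has not been called on the thread, expect a memory access error instead.<\/p>\n<pre class=\"brush:plain\">\r\n0:008:x86> ? dwo(poi(fs:f80)+c) and 140\r\n0:008:x86> ? dwo(poi(fs:f80)+c) and 80\r\n<\/pre>\n<p>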
Let&#8217;s look further into the backtrace to see which thread this is and where my code ends and OLE32 begins.<\/p>\n<pre class=\"brush:plain\">\r\n0:008:x86> k\r\nChildEBP RetAddr\r\n...\r\n04daff08 00510119 ScriptDPC!ScriptHost.TScriptHostImpl.Unlock+0x6\r\n04daff30 0054a986 image00000000_00400000!DPCHandler.TDPCHandler.DoShutdown+0x35\r\n04daff74 004055e6 image00000000_00400000!MsgConBase.TMsgConBase.ActionShutdown+0x36\r\n04daff88 76213677 image00000000_00400000!System.ThreadWrapper+0x2a\r\n<\/pre>\n<p>The last bits of my code are in the ScriptDPC dll, so I&#8217;ll set a breakpoint in the Unlock method to step through and see what&#8217;s going on.<\/p>\n<pre class=\"brush:plain\">\r\n0:008:x86> bp ScriptDPC!ScriptHost.TScriptHostImpl.Unlock\r\nSyntax:         Module     Unit         Class      Method\r\n<\/pre>\n<p>Use bp to create a breakpoint, bl to list them, and bc to remove (clear) one.  Single stepping through the function shows that while clearing up the script context, global COM objects referenced by the script are getting released.  Hrm, so it seems one of our COM objects is being Release()d by this thread, but it was created by an STA, so OLE is being nice and trying to RPC to that thread to release the interface.  All inter-apartment calls are considered RPC regardless of their activation context, be they in-process, out-of-process, or on another machine activated by DCOM\/COM+.  Looking through all of my threads using the method above, I see only one is an STA: the main thread.  So what I am probably looking for is the main thread creating a COM object in my script and not releasing it, with the release deferred until the script context is deleted.<\/p>\n<p>Finding where this occurs is a bit trickier, so I examine the points where the UI might interface to my scripting component.  
While poking around examining the interactions, I notice this DLL load trace coming up in the debugger when I execute a context menu on a certain component:<\/p>\n<pre class=\"brush:plain\">\r\nC:\\Windows\\SysWOW64\\RpcRtRemote.dll\r\nc:\\IE\\DPCs\\Test\\DBServices.dll\r\n<\/pre>\n<p>The first line there indicates that the RPC engine is being loaded into my process.  This creates an MTA thread and a pool of worker threads to service method invocations for objects, and is a dead giveaway that the UI&#8217;s STA thread has just created a free-threaded object, the calls to which must be marshalled cross-apartment.  Some quick tracing through the source code turned up this bit of code being called from the main thread:<\/p>\n<pre class=\"brush:delphi\">\r\nprocedure TScriptDPCBase.ScriptScriptObjectsNeeded(Sender: TObject);\r\nbegin\r\n  RegisterScriptObject('IEProxy', IEProxy);\r\n\r\n  if not Assigned(FDBServices) then\r\n    FDBServices := CoCDataBaseServices.Create;\r\n  RegisterScriptObject('DBServices', FDBServices);\r\nend;\r\n<\/pre>\n<p>CDataBaseServices is flagged as tmFree.  There&#8217;s our MTA object.  A little rearranging moved the CDataBaseServices creation to the initialization of the script thread and made sure it isn&#8217;t being accessed by the UI, and the lockup was solved.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>While the Delphi debugger is quite good, there are some instances when it isn&#8217;t quite good enough. Your application is crashing remotely The worst place for an application to fail is when there is no debugger available. Usually these problems can be corrected via the &#8220;Works Fine Here&#8221; solution. 
Unfortunately most bug-tracking systems lack this [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5],"tags":[],"_links":{"self":[{"href":"http:\/\/capnbry.net\/blog\/index.php?rest_route=\/wp\/v2\/posts\/18"}],"collection":[{"href":"http:\/\/capnbry.net\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/capnbry.net\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/capnbry.net\/blog\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/capnbry.net\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=18"}],"version-history":[{"count":18,"href":"http:\/\/capnbry.net\/blog\/index.php?rest_route=\/wp\/v2\/posts\/18\/revisions"}],"predecessor-version":[{"id":57,"href":"http:\/\/capnbry.net\/blog\/index.php?rest_route=\/wp\/v2\/posts\/18\/revisions\/57"}],"wp:attachment":[{"href":"http:\/\/capnbry.net\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=18"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/capnbry.net\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=18"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/capnbry.net\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=18"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}