Restoring OneNote data from the local cache (08 Oct 2017)

Phew, that was close. I almost lost two months of notes in OneNote, but was able to recover them from local cache files.

This is how it all started: After withstanding the constant nagging for a while, I finally gave in to those prompts to upgrade to Office 2016 on my developer laptop. During the upgrade, OneNote 2016 warned me that it may lose those few notes which had not been synced to the cloud yet. This is because OneNote 2016 will not use an older installation's local cache, but instead create a new empty local cache during installation, which will then fill incrementally by downloading notes from the cloud.

I shrugged off the warning because I was quite certain I had not made any significant changes recently, so everything should have been synced already. Of course I was totally wrong.

But after installation, OneNote 2016 would not display about two months' worth of notes, even though I had taken them (in OneNote 2013) on the very system on which I performed the upgrade. It turned out that OneNote 2013 had indeed not synced any of those notes to the cloud - for a period of almost two months.

The bad news is that I will probably never find out why and how synchronisation failed for such a long time, and why I never noticed any warnings about it. I guess I will be quite paranoid about synchronisation for a while now.

But the good news is that I managed to restore my notes. OneNote 2016 had created a new and empty local cache below %LOCALAPPDATA%\Microsoft\OneNote\16.0, but it had not deleted the old OneNote 2013 cache (below %LOCALAPPDATA%\Microsoft\OneNote\15.0), and this saved my bacon.

Others have fallen into the same or similar traps before, of course, and so there are a number of related discussions out there on the topic, for example at https://answers.microsoft.com/en-us/msoffice/forum/msoffice_onenote-mso_other-mso_2010/recover-information-from/8bf30713-316b-49cb-abc3-a8ce8e4b310d. A number of approaches are mentioned, such as:

  • Restore OneNote sections from the C:\Users\<name>\AppData\Local\Microsoft\OneNote\15.0\Backup directory
    • This worked only partially because the latest backup was already several days old.
  • Extract notes from the old OneNoteOfflineCache.onecache file in C:\Users\<name>\AppData\Local\Microsoft\OneNote\15.0\ by running onenote.exe /forcerepair on it.

The approach which did work in the end was as follows:

  • I installed OneNote 2013 on a separate Windows VM.
  • Then I copied over the cache files from my developer laptop to the Windows VM, i.e. both the OneNoteOfflineCache.onecache file and the OneNoteOfflineCache_Files directory (which holds all the attachments), overwriting the default local cache files of the OneNote 2013 installation on the VM.
  • After starting OneNote 2013 on the VM, it displayed all notes just fine. Big sigh of relief.
  • Syncing those notes from the VM to the cloud would not work, though. I first had to move all the notes to a new section in the affected notebook, and then wait until all notes had been synced.
  • And now, finally, the notes reappeared in my OneNote 2016 installation on my developer laptop as well.

I also could have uninstalled OneNote 2016 on my developer laptop and replaced it with the older OneNote 2013, and in fact I tried, but the OneNote 2013 installer told me to uninstall all of Office 2016 first, from which I shied away.
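
For illustration, the copy step could be scripted roughly as follows. This is a minimal Python sketch rather than the procedure actually used above; the file and directory names come from this post, while the function name and the target location are made up for the example. It is presumably wise to close OneNote before copying.

```python
import shutil
from pathlib import Path

# Names taken from the post; everything else here is illustrative.
CACHE_FILE = "OneNoteOfflineCache.onecache"
ATTACHMENT_DIR = "OneNoteOfflineCache_Files"


def copy_onenote_cache(source_dir, target_dir):
    """Copy the cache file and its attachment directory to target_dir."""
    source, target = Path(source_dir), Path(target_dir)
    cache_file = source / CACHE_FILE
    if not cache_file.is_file():
        raise FileNotFoundError(f"no {CACHE_FILE} in {source}")

    target.mkdir(parents=True, exist_ok=True)
    shutil.copy2(cache_file, target / CACHE_FILE)

    attachments = source / ATTACHMENT_DIR
    if attachments.is_dir():
        # dirs_exist_ok (Python 3.8+) allows refreshing an earlier copy.
        shutil.copytree(attachments, target / ATTACHMENT_DIR, dirs_exist_ok=True)
    return target
```

On the laptop, the source would be the 15.0 directory below %LOCALAPPDATA%\Microsoft\OneNote, and the target any folder that can be carried over to the VM.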


CoCreate Modeling: How do you start an interactive cmd.exe? (05 Jul 2016)

Hot off the press from the customer forum for CoCreate Modeling (aka PTC Creo Elements/Direct Modeling): How do you programmatically start an interactive instance of cmd.exe from within CoCreate Modeling, i.e. a cmd.exe complete with its DOS prompt window?

The obvious first attempt is

  (oli:sd-sys-exec "cmd.exe")

But then CoCreate Modeling appears to hang, and nothing visible happens.

cmd.exe is a command-line program. So it is perfectly normal that nothing happens ("graphically") when you start cmd.exe as an external program without parameters, for example via sd-sys-exec. cmd.exe then simply waits in the background for further input and does nothing else.

If you want to start cmd.exe in its own terminal window (colloquially known as "DOS window", "command shell" or "command prompt") and run it interactively, here is how:

  (oli:sd-sys-exec "start cmd.exe")

(For the command-line parameters and quirks of the start helper, see http://ss64.com/nt/start.html.)

Bonus question: If cmd.exe is a command-line program without a graphical user interface, why does a terminal window open when you start cmd.exe from Windows Explorer?

Answer: Because Explorer is preconfigured accordingly. In this case, it does not simply run cmd.exe internally, but rather the moral equivalent of start cmd.exe.

Bonus question 2: How does Windows know where cmd.exe lives, anyway? Don't you have to specify a path such as C:\Windows\System32\cmd.exe?

Background: The forum question used such a hard-coded path.

Answer: The directory which contains cmd.exe is listed in the PATH environment variable, which Windows consults when starting programs. So there is no need to specify a path explicitly. What's more, doing so is counterproductive and error-prone, because the Windows directory is not located at C:\Windows on every system.
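
To make the PATH lookup a little more tangible, here is a small Python sketch of the idea. It is a simplified model only: the function name is invented for this example, and the real Windows search also checks a few other places (such as the system directory) besides PATH.

```python
import os


def find_on_path(program, path=None, pathext=None):
    """Return the first match for `program` in the PATH directories, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    if pathext is None:
        pathext = os.environ.get("PATHEXT", "")

    # Try the name as given, then with each PATHEXT extension appended.
    extensions = [ext for ext in pathext.split(os.pathsep) if ext]
    candidates = [program] + [program + ext for ext in extensions]

    for directory in path.split(os.pathsep):
        if not directory:
            continue
        for name in candidates:
            full_name = os.path.join(directory, name)
            if os.path.isfile(full_name):
                return full_name
    return None
```

On a Windows system, find_on_path("cmd.exe") should report wherever cmd.exe really lives, without any hard-coded C:\Windows in sight.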

Bonus question 3: What is it actually good for to start an interactive instance of cmd.exe from a CAD application?

Head-scratching first answer: Any pertinent hints are gratefully received :-D

Second answer, after the requested hints had arrived: Apparently, the ultimate goal was to start a small interactive command-line program; starting cmd.exe was merely a first test for that.


Scripting VPN connections (20 Aug 2011)

Like many other companies, my company provides VPN access to its employees so that we can stay connected from our home offices or on the road. Most of the time, I connect to the company network through a web portal which downloads, installs and runs Juniper's "Network Connect" software on the Windows client system. That's all fine and dandy, except that I am a command-line guy and find it way too clumsy to fire up a web browser just in order to "dial in".

Fortunately, Juniper's Network Connect client has a command-line interface, and so here is a trivial DOS batch script which can be used to establish a connection in "I-don't-need-no-stinkin'-buttons" mode.

The script assumes that the Network Connect client has been installed and run in the usual manner (i.e. from the web portal) at least once. It will attempt to auto-detect the VPN host and user name, so in most cases all you have to specify is password information. Oh, and the script assumes you want to connect to the "SecurID(Network Connect)" realm by default, which requires entering a PIN and a number displayed on your RSA SecurID token.

@echo off
REM Launch Juniper Network Connect client from the command line
REM Written by Claus Brod in 2011, see
REM http://www.clausbrod.de/Blog/DefinePrivatePublic20110820JuniperNetworkConnect

REM --------------------------------------------------------
setlocal enableextensions

call :find_juniper_client NCCLIENTDIR
if "x%NCCLIENTDIR%"=="x" (
  echo ERROR: Cannot find Network Connect client.
  goto :end
)

rem CONFIGURE: Set your preferred VPN host here.
set url=define-your-vpn-host-here
ping -n 1 %url% >nul
if not errorlevel 1 goto :validhost

rem Try to auto-detect the VPN host from the config file
set NCCLIENTCONFIG="%NCCLIENTDIR%\..\Common Files\config.ini"
if exist %NCCLIENTCONFIG% for /f "delims=[]" %%A in ('findstr [[a-z0-9]\. %NCCLIENTCONFIG% ^| findstr /V "Network Connect"') do set url=%%A
ping -n 1 %url% >nul
if errorlevel 1 (
  echo ERROR: Host %url% does not ping. Please check your configuration.
  goto :end
)

:validhost
call :read_no_history url %url% "VPN host"

set user=guest
call :read_no_history user %user% "Username"

rem CONFIGURE: Set your preferred realm here. By default, the script
rem assumes two-stage authentication using a PIN and RSA SecurID.

set realm="SecurID(Network Connect)"
call :read_no_history realm %realm% "Realm"

REM TODO: Hide password input
set password=""
call :read_no_history password %password% "Enter PIN + token value for user %user%:"
if x%password%==x"" (
  echo ERROR: No password specified
  goto :end
)

cls

echo Launching Juniper Network Connect client in
echo   %NCCLIENTDIR%...
"%NCCLIENTDIR%\nclauncher.exe" -url %url% -u %user% -p %password% -r %realm%
goto :end

REM --------------------------------------------------------
:find_juniper_client
setlocal
set CLIENT=

rem search registry first
for /f "tokens=1* delims=       " %%A in ('reg query "HKLM\SOFTWARE\Juniper Networks" 2^>nul') do set LATESTVERSION="%%A"
if x%LATESTVERSION%==x"" goto :eof
for /f "tokens=2* delims=        " %%A in ('reg query %LATESTVERSION% /v InstallPath 2^>nul ^| findstr InstallPath') do set CLIENT=%%B

rem if nothing found, check filesystem
if "x%CLIENT%"=="x" for /d %%A in ("%ProgramFiles(x86)%\Juniper Networks\Network Connect*") do set CLIENT=%%A
if "x%CLIENT%"=="x" for /d %%A in ("%ProgramFiles%\Juniper Networks\Network Connect*") do set CLIENT=%%A

endlocal & set "%~1=%CLIENT%"
goto :eof


REM --------------------------------------------------------
REM read_no_history promptvar default promptmessage
:read_no_history
setlocal
set msg=%~3
if not "x%~2"=="x" (
  set msg="%~3 (default: %~2): "
)
set /P RNH_TEMP=%msg% <nul
set RNH_TEMP=

REM call external script to avoid adding to our own command history
set RNH_CMDFILE=%TEMP%\temp$$$.cmd
  (
    echo @echo off
    echo set var_=%2
    echo set /p var_=
    echo echo %%var_%%
  )> "%RNH_CMDFILE%"

for /f "delims=," %%A in ('%RNH_CMDFILE%') do set RNH_TEMP=%%A
del %RNH_CMDFILE%
endlocal & if not x%RNH_TEMP%==x set "%~1=%RNH_TEMP%"
goto :eof


REM --------------------------------------------------------
:end
endlocal

The above script is meant to be used along with the Windows version of the Network Connect client. For the Linux client, Paul D. Smith provides an excellent script and great instructions at http://mad-scientist.us/juniper.html.
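
The batch script still carries a TODO about hiding the password input. Outside of batch, this is easy to do; the following Python sketch shows one possible way, using getpass for the prompt and the same nclauncher.exe options (-url, -u, -p, -r) as the script above. The function names, the installation directory and the host name in the usage note are hypothetical.

```python
import getpass
from pathlib import PureWindowsPath


def build_nclauncher_command(client_dir, url, user, password,
                             realm="SecurID(Network Connect)"):
    """Return the argument list for nclauncher.exe, as used in the batch script."""
    # PureWindowsPath applies Windows path rules regardless of the host OS.
    launcher = PureWindowsPath(client_dir) / "nclauncher.exe"
    return [str(launcher), "-url", url, "-u", user, "-p", password, "-r", realm]


def prompt_for_password(user):
    """Ask for PIN + token value without echoing the input."""
    return getpass.getpass(f"Enter PIN + token value for user {user}: ")
```

The resulting list could then be handed to subprocess.run(), for example build_nclauncher_command(client_dir, "vpn.example.com", "guest", prompt_for_password("guest")). Just like in the batch version, the password still travels on the command line of nclauncher.exe.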

See below for the direct download link for the script.

PS: The code is now available from github as well, see https://github.com/clausb/nclauncher.

PS/2: Paul D. Smith's instructions are unavailable as of November 2015; the Wayback archive still has a copy at http://web.archive.org/web/20150908095435/http://mad-scientist.us/juniper.html.


How to Detect Mergers & Acquisitions in Code (01 Sep 2009)

Let’s suppose you had written this test case for low-level DDE communication in your product, and that this test talks to Internet Explorer via DDE.

Let’s assume you’d do this by sending a URL to IE via DDE, and that you’d then verify the result by asking IE which page it actually loaded.

Let’s say that you’d use the URL of your company’s website, http://www.cocreate.com.

The day your QA people start yelling at you because the test fails miserably, you know that your company has been acquired, and that all accesses to http://www.cocreate.com have been automatically redirected to http://www.ptc.com ;-)


head replacement on Windows (16 Jun 2009)

A co-worker needed to convert a Cygwin-dependent script to something that runs on a bare-bones Windows system. The interesting part of the task was finding a replacement for the good ol' head command-line utility.

Fortunately, this is fairly simple using a few lines of VBScript and the Windows Script Host. First, here's the VBScript code:

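' Number of lines to print, taken from the first command-line argument
lines = CLng(WScript.Arguments(0))
' Copy stdin to stdout until the input ends or the count is used up
Do Until WScript.StdIn.AtEndOfStream Or lines <= 0
  WScript.Echo WScript.StdIn.ReadLine
  lines = lines - 1
Loop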

This is an extremely stripped-down version of head's original functionality, of course. For example, the code above can only read from standard input, and things like command-line argument validation and error handling are left as an exercise for the reader :-D

Assuming you'd save the above into a file called head.vbs, this is how you can display the first three lines of a text file called someinputfile.txt:

   type someinputfile.txt | cscript /nologo head.vbs 3
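
For comparison, and for systems where Python happens to be available, here is a sketch of the same idea which also covers the argument validation mentioned above. It is not a translation of the VBScript code; the function names and the head.py file name are only assumptions for this example.

```python
import itertools
import sys


def parse_line_count(arguments):
    """Validate the single command-line argument and return it as an int."""
    if len(arguments) != 1:
        raise ValueError("expected exactly one argument: the number of lines")
    count = int(arguments[0])  # raises ValueError for non-numeric input
    if count < 0:
        raise ValueError("the number of lines must not be negative")
    return count


def head(stream, count):
    """Return an iterator over at most `count` lines of `stream`."""
    return itertools.islice(stream, count)


def main(arguments, stdin=sys.stdin, stdout=sys.stdout):
    """Copy the first N lines from stdin to stdout; return an exit code."""
    try:
        count = parse_line_count(arguments)
    except ValueError as error:
        print(f"head.py: {error}", file=sys.stderr)
        return 2
    stdout.writelines(head(stdin, count))
    return 0
```

With sys.exit(main(sys.argv[1:])) added at the bottom and the file saved as head.py, the call would be: type someinputfile.txt | python head.py 3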

Enjoy!


Sniffing for crashes (09 Jan 2008)

A few months ago, I blogged about crash reporting under Windows XP and Vista. The protocol between Windows and Microsoft's Watson servers is undocumented, but contains useful hints on how the mechanism works. Back then, I hinted that I had intercepted the crash reporting traffic to learn more about it, but I didn't fully describe how I did that.

If only I could remember, that is :-D

Well, I remember the basic approach, and I even took some notes back then which I will regurgitate here and now. However, don't expect step-by-step instructions.

My goal was to take control of the crash reporting process in my application. When an application crashes, Microsoft's official recommendation is that it should not try to catch the fatal exception, but instead simply bail out, let Windows perform its crash reporting rites, and then terminate. For various reasons, we needed more control over the process, and so I set out on a discovery tour through the Windows Error Reporting APIs which Microsoft introduced in Windows Vista. The details of this epic saga can be found in my earlier posts on crash reporting.

For a long time, I simply couldn't get crash reporting to work as I expected it to. I had pretty much exhausted most of the official documentation, so I needed to dig deeper.

When an application crashes and the user agrees to send the crash data to Microsoft, Windows contacts Microsoft's Watson servers. For some reason, my crash reports weren't accepted there, while the usual plain vanilla crashme.exe could successfully dump its debris to Microsoft's servers. Hence, the idea was to look at the network traffic and find the differences between my crash reports and those produced by the proverbial crashme.exe.

I could have run the usual network sniffing suspects to decode LAN traffic including all the email exchanged between my boss and his boss, of course. But there is an even easier and less controversial approach: For corporate environments, where admins often need more control over the crash reporting process, Microsoft introduced corporate error reporting (CER), where crashing systems can contact a local server rather than sending all those confidential access violations to Microsoft.

There are registry entries to set the server name and port for corporate error reporting:

  • HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\Windows Error Reporting\CorporateWERServer: Name of the local crash reporting server
  • HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\Windows Error Reporting\CorporateWERPortNumber: Port number to be used for communication

On my Vista system, I modified CorporateWERServer so that it referred to my laptop. I did not set the port number explicitly; using procmon, I found that the default port is 1273.
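
How the registry value gets set does not matter much; regedit will do. For repeatable experiments, a .reg file can be handy, and the following Python sketch generates one. The helper name is invented, and the sketch assumes that the server name is a string value and the port number a DWORD.

```python
WER_KEY = r"HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\Windows Error Reporting"


def render_reg_file(key, values):
    """Render a .reg file that sets `values` (name -> str or int) under `key`."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    for name, value in values.items():
        if isinstance(value, int):
            # Integers become DWORD values, written as eight hex digits.
            lines.append(f'"{name}"=dword:{value:08x}')
        else:
            # Strings become REG_SZ; backslashes and quotes need escaping.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'"{name}"="{escaped}"')
    return "\r\n".join(lines) + "\r\n"
```

For the setup described here, render_reg_file(WER_KEY, {"CorporateWERServer": "mylaptop"}) would be enough. The text can be written with open("cer.reg", "w", encoding="utf-16", newline="") and then imported from an elevated prompt.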

On my laptop, I installed netcat and had it listen on port 1273 (nc -l -p 1273 or something like that). Once the port was open, Vista started to send HTTP POST requests to it - so the CER server really is a specialized HTTP server listening on port 1273! Here's a typical request (slightly polished and anonymized) following a crash in a sample app I was writing back then:

POST /stage2.htm HTTP/1.1
User-Agent: MSDW
Host: mylaptop:1273
Content-Length: 1110
Connection: Keep-Alive

<?xml version="1.0" encoding="UTF-16"?>
<WERREPORT xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
   <MACHINEINFO machinename="myvistabox" os="6.0.6000.2.0.0.256.16" lcid="1033"/>
   <USERINFO username="myvistabox\clausb"/>
   <APPLICATIONINFO appname="werapitest.exe" apppath="C:\tmp\werapitest.exe"/>
   <EVENTINFO reporttype="1" eventtime="128267449141252896" eventtype="werapitest (eventType)" 
      friendlyeventname="werapitest (friendly event name)" eventdescription="Critical runtime problem"/>
   <SIGNATURE>
   </SIGNATURE>
</WERREPORT>

By comparing this kind of payload with the traffic generated by a plain vanilla crashme.exe program, I could experiment with the various WER APIs and settings until I had finally figured out how to use them. Without crash report sniffing, I'd probably still be experimenting...
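
If netcat is not at hand, a few lines of Python can play the same role. The sketch below is not what was used back then: it accepts a single connection, collects whatever the client sends, and splits the result into request line, headers and body for easier comparison. Port 1273 is the default observed above; the function names are made up, and since no HTTP response is ever sent, the client will presumably give up after the first request.

```python
import socket


def capture_one_request(port=1273, host="", timeout=5.0):
    """Accept a single connection and return everything the client sends."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind((host, port))
        listener.listen(1)
        connection, _peer = listener.accept()
        with connection:
            connection.settimeout(timeout)
            chunks = []
            try:
                while True:
                    data = connection.recv(4096)
                    if not data:
                        break
                    chunks.append(data)
            except socket.timeout:
                pass  # Keep-Alive client went quiet; use what we have
    return b"".join(chunks)


def split_request(raw):
    """Split raw HTTP request bytes into (request line, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("latin-1").split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return lines[0], headers, body
```

Calling split_request(capture_one_request()) on the listening machine and then crashing the test application should yield the POST line, the MSDW user agent and the XML payload as separate pieces.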


Elementary, my dear Watson! (03 Jul 2007)

When an application crashes under Windows Vista, it will contact Microsoft's Windows Error Reporting servers; crashdump files will only be generated if those servers request more data about the problem. Bad luck if you're a developer talking to an enraged customer who just lost his data, and you first have to go through a lengthy process of registering your application on Winqual, mapping the right application versions, and waiting until the first customer crashdumps show up on the server. Chances are that your customer will have taken his money elsewhere while waiting for you to fix his problem.

In Think globally, dump locally, I listed the following workarounds:

  • Using the ForceQueue registry entry to force WER to always produce a crashdump (at the price of losing UI interaction)
  • Disable the Internet connection before the crash occurs
  • (Ab-)Using the CorporateWERServer registry entry
  • Deploy Microsoft Operations Manager 2007, including Agentless Exception Monitoring (which replaces the older "Corporate Error Reporting" servers)
  • Copy Dr. Watson from XP to the Vista system and install it as system JIT debugger
  • Install a top-level crash filter in your application using SetUnhandledExceptionFilter

In Don't dump Vista just yet..., I presented a little-known Task Manager option which creates a user dump file from any running process.

And in this installment of the seemingly never-ending series on Vista crashdumps, we'll explore yet another option. What if we had a system JIT debugger which can be easily installed on a customer system and automatically produces minidump files for any crashing application? Basically a homebrew version of good ol' Dr. Watson, stripped down to bare essentials?

The following code illustrates this approach. This skeleton application is called mydearwatson and installs as a JIT debugger. When an app crashes, it attaches to it and asks the user what kind of minidump (normal or full dump) should be generated. The resulting crashdump file goes to the current user's desktop folder, readily available for sending it off to the developer of the application.

// mydearwatson
//
// Minimal JIT debugger; attaches to a crashing process 
// and generates minidump information. Intended to be used
// as the rough skeleton for a poor man's Dr. Watson 
// replacement on Vista.
//
// Written by Claus Brod, http://www.clausbrod.de/Blog

#include <windows.h>
#include <DbgHelp.h>
#pragma comment(lib, "DbgHelp.lib")
#include <Psapi.h>
#include <shlobj.h>
#include <atlbase.h>

#include <stdio.h>
#include <string.h>

#define MSGHDR "mydearwatson\n" \
               "(C) 2007 Claus Brod, http://www.clausbrod.de/Blog\n\n"

bool uninstall(void)
{
  CRegKey key;
  if (ERROR_SUCCESS != key.Open(HKEY_LOCAL_MACHINE,
    "Software\\Microsoft\\Windows NT\\CurrentVersion\\AeDebug\\",
    KEY_READ | KEY_WRITE))
    return false;

  // check for old debugger registration
  char debuggerCommandLine[MAX_PATH+256];
  ULONG nChars = _countof(debuggerCommandLine);
  LONG ret = key.QueryStringValue("PreMydearwatsonDebugger",
    debuggerCommandLine, &nChars);
  if (ret == ERROR_SUCCESS) {
    ret = key.SetStringValue("Debugger", debuggerCommandLine);
    if (ret == ERROR_SUCCESS) {
      ret = key.DeleteValue("PreMydearwatsonDebugger");
    }
  }

  return ret == ERROR_SUCCESS;
}

bool install(char *programName)
{
  CRegKey key;
  if (ERROR_SUCCESS != key.Open(HKEY_LOCAL_MACHINE,
    "Software\\Microsoft\\Windows NT\\CurrentVersion\\AeDebug\\",
    KEY_READ | KEY_WRITE))
    return false;

  char debuggerCommandLine[MAX_PATH+256];
  ULONG nChars = _countof(debuggerCommandLine);
  if (ERROR_SUCCESS == key.QueryStringValue("Debugger", debuggerCommandLine, &nChars)) {
    _strlwr_s(debuggerCommandLine, _countof(debuggerCommandLine));
    if (!strstr(debuggerCommandLine, "mydearwatson")) {
      // save command line for previously installed debugger
      key.SetStringValue("PreMydearwatsonDebugger", debuggerCommandLine);
    }
  }

  char debuggerPath[MAX_PATH];
  strcpy_s(debuggerPath, programName);  // preset with default
  ::GetModuleFileName(GetModuleHandle(0), debuggerPath, _countof(debuggerPath));

  _snprintf_s(debuggerCommandLine, _countof(debuggerCommandLine), _TRUNCATE,
    "\"%s\" -p %%ld -e %%ld", debuggerPath);
  return ERROR_SUCCESS == key.SetStringValue("Debugger", debuggerCommandLine);
}

char *getMinidumpPath(void)
{
  static char dumpPath[MAX_PATH];
  if (dumpPath[0] == 0) {
    SHGetSpecialFolderPath(NULL, dumpPath, CSIDL_DESKTOPDIRECTORY, FALSE);
  }
  return dumpPath;
}

char *getMinidumpFilename(DWORD pid)
{
  static char minidumpFilename[MAX_PATH];
  if (!minidumpFilename[0]) {
    _snprintf_s(minidumpFilename, MAX_PATH, _TRUNCATE,
      "%s\\mydearwatson_pid%d.mdmp", getMinidumpPath(), pid);
  }
  return minidumpFilename;
}

bool dumpHelper(HANDLE hDumpFile,
                HANDLE processHandle, DWORD pid,
                HANDLE threadHandle, DWORD tid,
                EXCEPTION_RECORD *exc_record, MINIDUMP_TYPE miniDumpType)
{
  bool ret = false;

  CONTEXT threadContext;
  threadContext.ContextFlags = CONTEXT_ALL;
  if (::GetThreadContext(threadHandle, &threadContext)) {
    __try {
      MINIDUMP_EXCEPTION_INFORMATION exceptionInfo;
      exceptionInfo.ThreadId = tid;
      EXCEPTION_POINTERS exc_ptr;
      exc_ptr.ExceptionRecord = exc_record;
      exc_ptr.ContextRecord = &threadContext;
      exceptionInfo.ExceptionPointers = &exc_ptr;
      exceptionInfo.ClientPointers = FALSE;

      if (MiniDumpWriteDump(processHandle,
        pid, hDumpFile, miniDumpType, &exceptionInfo, NULL, NULL))
        ret = true;
    } __except(EXCEPTION_EXECUTE_HANDLER) { }
  }

  return ret;
}

struct HandleOnStack
{
  HANDLE m_h;
  HandleOnStack(HANDLE h) : m_h(h) { }
  ~HandleOnStack() { if (m_h && m_h != INVALID_HANDLE_VALUE) CloseHandle(m_h); }
  operator HANDLE() { return m_h; }
};

bool createMinidump(HANDLE processHandle, DWORD pid, DWORD tid,
                    EXCEPTION_RECORD *exc_record)
{
  HandleOnStack hDumpFile(CreateFile(getMinidumpFilename(pid), GENERIC_WRITE, 0, NULL,
    CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL));
  if (hDumpFile == INVALID_HANDLE_VALUE)
    return false;

  HandleOnStack
    threadHandle(OpenThread(THREAD_GET_CONTEXT | THREAD_SUSPEND_RESUME, 0, tid));
  if (!threadHandle)
    return false;

  if (-1 == SuspendThread(threadHandle))
    return false;

  MINIDUMP_TYPE miniDumpType = MiniDumpNormal;
  if (IDYES == MessageBox(NULL, MSGHDR
    "By default, minimal crashdump information is generated.\n"
    "Do you want full crashdump information instead?",
    "mydearwatson", MB_YESNO|MB_ICONQUESTION|MB_DEFBUTTON2)) {
      miniDumpType = MiniDumpWithFullMemory;
  }

  bool ret = dumpHelper(hDumpFile, processHandle, pid,
    threadHandle, tid, exc_record, miniDumpType);
  if (ret) {
    char buf[1024];
    _snprintf_s(buf, _countof(buf), _TRUNCATE, MSGHDR
      "Minidump information has been written to\n%s.\n",
      getMinidumpFilename(pid));
    MessageBox(NULL, buf, "mydearwatson", MB_OK|MB_ICONINFORMATION);
  }

  ResumeThread(threadHandle);
  return ret;
}

int debuggerLoop(DWORD pid, HANDLE eventHandle)
{
  // attach to debuggee
  if (!DebugActiveProcess(pid)) {
    fprintf(stderr, "Could not attach to process %lu\n", pid);
    return 1;
  }

  HANDLE processHandle = 0;
  while (1) {
    DEBUG_EVENT de;
    if (WaitForDebugEvent(&de, INFINITE)) {
      switch(de.dwDebugEventCode)
      {
      case CREATE_PROCESS_DEBUG_EVENT:
        processHandle = de.u.CreateProcessInfo.hProcess;
        printf("Attaching to process %p...\n", processHandle);
        break;

      case EXCEPTION_DEBUG_EVENT:
        printf("Exception reported: code=%lx, dwFirstChance=%lu\n",
          de.u.Exception.ExceptionRecord.ExceptionCode, de.u.Exception.dwFirstChance);

        if (de.u.Exception.ExceptionRecord.ExceptionCode == EXCEPTION_BREAKPOINT) {
          SetEvent(eventHandle);
        } else {
          createMinidump(processHandle, de.dwProcessId, de.dwThreadId,
            &de.u.Exception.ExceptionRecord);
          ContinueDebugEvent(de.dwProcessId, de.dwThreadId,
            DBG_EXCEPTION_NOT_HANDLED); // required?
          DebugActiveProcessStop(pid);
          printf("Detached from process, terminating debugger...\n");
          return 0;
        }
        break;

      default:
        // printf("debug event code = %d\n", de.dwDebugEventCode);
        break;
      }

      ContinueDebugEvent(de.dwProcessId, de.dwThreadId, DBG_CONTINUE);
    }
  } // while (1)

  return 1;
}


void usage(char *programName)
{
  fprintf(stderr, "To install as JIT debugger:\n");
  fprintf(stderr, "  %s -i\n", programName);
  fprintf(stderr, "  %s\n", programName);
  fprintf(stderr, "To uninstall:\n");
  fprintf(stderr, "  %s -u\n", programName);
  fprintf(stderr, "Call as JIT debugger:\n");
  fprintf(stderr, "  %s -p pid -e eventhandle\n", programName);
}

int main(int argc, char *argv[])
{
  DWORD pid = 0;
  HANDLE eventHandle = (HANDLE)0;
  bool uninstallationMode = false;
  bool installationMode = false;

  if (argc == 1) {
    if (IDYES == MessageBox(NULL, MSGHDR
      "Do you want to install mydearwatson as the system JIT debugger?",
      "mydearwatson", MB_YESNO|MB_ICONQUESTION)) {
        installationMode = true;
    }
  }

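  for (int i=1; i<argc; i++) {
    // -p and -e take a value; only read it if one is actually present
    if (!_stricmp(argv[i], "-p") && i+1 < argc) {
      pid = atol(argv[i+1]);
    }
    if (!_stricmp(argv[i], "-e") && i+1 < argc) {
      eventHandle = (HANDLE)(INT_PTR)atol(argv[i+1]);
    }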
    if (!_stricmp(argv[i], "-i")) {
      installationMode = true;
      uninstallationMode = false;
    }
    if (!_stricmp(argv[i], "-u")) {
      uninstallationMode = true;
      installationMode = false;
    }
  }

  if (installationMode) {
    if (!install(argv[0])) {
      fprintf(stderr, "Could not register as a JIT debugger.\n");
      return 1;
    }
    return 0;
  }

  if (uninstallationMode) {
    if (!uninstall()) {
      fprintf(stderr, "Could not uninstall.\n");
      return 1;
    }
    return 0;
  }

  if (!pid || !eventHandle) {
    usage(argv[0]);
    return 2;
  }

  return debuggerLoop(pid, eventHandle);
}

To compile and build this code, open a Visual Studio command prompt and enter

  cl mydearwatson.cpp

To install mydearwatson, run the executable in elevated mode and confirm the installation message box. Now configure Windows Error Reporting to always ask the user before submitting crash data: "Problem Reports and Solutions/Change Settings/Ask me to check if a problem occurs". Alternatively, create a registry value called Auto (REG_SZ) in HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\AeDebug and set its value to "1".

The next time an application crashes, the usual WER dialog will appear; click the "Debug" option in that dialog. Another message box will be displayed asking what kind of crashdump information should be written. Make your choice, and the crashdump file will magically appear on your desktop.

To uninstall, run mydearwatson with the -u option. mydearwatson tries to remember which JIT debugger was installed before, and will reinstall that JIT debugger. The mechanism for doing this is far from perfect, though.

If you look at the code, you'll notice that it basically implements a minimal debugger, using Win32 debugging APIs such as DebugActiveProcess or WaitForDebugEvent. I've never written a debugger before, so I'd assume there are a few subtleties and bugs hidden in this code, but it did work for me on both XP and Vista systems. Test results most welcome.
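
One detail that is easy to miss in install() is the doubled percent signs in the debugger command line. The following Python sketch replays the two formatting steps and the argument parsing outside of C++; it is only an illustration, with invented function names and a hypothetical installation path. (Python's % operator happens to accept %ld and treats it like %d.)

```python
def registered_command(debugger_path):
    """Mirror install(): the doubled %% leaves two literal %ld placeholders."""
    return '"%s" -p %%ld -e %%ld' % debugger_path


def expand_command(registered, pid, event_handle):
    """Mimic what Windows does at crash time: fill in process ID and event handle."""
    return registered % (pid, event_handle)


def parse_debugger_arguments(arguments):
    """Parse '-p pid -e eventhandle', ignoring an option that lacks its value."""
    pid = event_handle = 0
    # Stop one short of the end so that arguments[index + 1] always exists.
    for index, argument in enumerate(arguments[:-1]):
        if argument.lower() == "-p":
            pid = int(arguments[index + 1])
        elif argument.lower() == "-e":
            event_handle = int(arguments[index + 1])
    return pid, event_handle
```

So the registry ends up holding a template such as "C:\tools\mydearwatson.exe" -p %ld -e %ld, and the JIT debugger is later started with the actual process ID and event handle in place of the two placeholders.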



Don't dump Vista just yet; instead, let Vista do the dumping! (30 Jun 2007)

The other day, I was contemplating various approaches to retrofit a feature into Vista which its new implementation of Windows Error Reporting apparently took away from us. What I and a couple of folks in the Windows Error Reporting discussion forum were desperately missing were crashdumps. On Vista, WER only produces minidumps if Microsoft's Winqual servers ask it to. So if a customer reports a crash, and your application isn't registered with Winqual, it becomes a lot more difficult than on XP systems to get a crash dump file to analyse the problem.

I already suggested a couple of workarounds, but they all came with their own quirks or inconveniences. None of them was 100% satisfying.

Which actually was good news to me, as this inspired me to develop a ruthless, but lucrative plot: I would sue the living daylights out of the poor folks in Redmond, based on a charge of malicious removal of vital features from their operating system, win the trial with fanfare, and then retreat to my newly-acquired beach villa on a remote island, my only connection to my former life being the ultra-hyper-gargantomanically-fast direct Internet connection via my own private satellite parked in stationary orbit 36000 km above Brod mansion.

It wasn't meant to be.

The bubble burst after a mouse click at the right time in the wrong place. For all eternity, cursed shall be the day I was taught to use the right mouse button.

Once more, I had run my crashme.exe test application which causes an access violation on purpose. I had the Windows Task Manager running. The WER dialog popped up, but for some reason, instead of using that dialog, I right-clicked the process entry in the Task Manager window.

And there it was: The "Create Dump File" option, which I had not noticed anytime before.


And indeed, this option does what it promises: It produces a dump file which can be loaded into the debugger to inspect the cause of the crash.

So if an application crashes on Vista, and you want to create crashdump information and send it to the developer of the application right away, here's how:

  • Configure Windows Error Reporting so that it asks you explicitly what to do when a crash occurs ("Problem Reports and Solutions/Change Settings/Ask me to check if a problem occurs")
  • Run and test the app until it crashes
  • Do not click any of the options in the WER dialog just yet. Instead, open the Task Manager and go to the "Processes" tab. Right-click the entry for the crashing app, and select "Create Dump File".

There's even a Knowledge Base article on this feature.

Sigh. There goes the beach villa.


Think globally, dump locally (27 Jun 2007)

These days, I spend quite some time in Microsoft's Windows Error Reporting forum, which is where David Ching, who is a Microsoft MVP, posed an interesting problem this week.

On Vista, Windows Error Reporting will create and transmit minidump files only if the WER servers request them. At least this seems to be the default behavior which both David and I have observed on Vista systems. David, however, wanted to make sure that whenever an application crashes, a minidump file is generated which the user or tester can then send directly to the developers of the application for analysis - even if Microsoft's WER servers never actually request the minidumps, which, as far as I can tell, is the default for applications which have not been explicitly registered with and mapped at Winqual.

My first idea was to force the system into queuing mode. When crash reports are queued, minidumps are always generated and stored locally, so that they can be transmitted to the error reporting server later on. Queuing is enabled by setting HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\Windows Error Reporting\ForceQueue (DWORD) to 1. (See WER Settings for documentation on this and other WER-related registry keys.) Crash report data will be stored in directories such as c:\Users\someusername\AppData\Local\temp and C:\ProgramData\Microsoft\Windows\WER\ReportQueue.

That works, but it also suppresses the WER UI, which isn't ideal either. Isn't there some way to have the cake and eat it, too?

Let's see: A variation of the above approach is to disable the Internet connection before the crash occurs. You'll get the dialogs, but WER won't be able to connect to the Microsoft servers, and so it should then also queue the crash information. Alternatively, and this is something that I have tried myself a few times, you could set HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\Windows Error Reporting\CorporateWERServer (string) to the name of some non-existing system. When a crash occurs, WER will try to contact that server, find that it's not responding, and then store all crash data locally so that it can be re-sent when the connection is later established.

Or you could go all the way and actually install such a Corporate Error Reporting server on one of your systems. Probably one of the best solutions, since this gives you direct access to minidump files within your organization.

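But this blog isn't about IT, it's about hacking and coding ;-) Here's an idea how David's goals could be accomplished without implementing a full-blown crash handler:

  • Install a top-level exception filter using SetUnhandledExceptionFilter().
  • In the filter, write a minidump to a local file using MiniDumpWriteDump().
  • Return EXCEPTION_CONTINUE_SEARCH from the filter so that the exception is passed on and the usual error reporting can still kick in.

And here's the demo code which demonstrates this technique: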

// Demo program using SetUnhandledExceptionFilter() and
// MiniDumpWriteDump().
//
// Claus Brod, http://www.clausbrod.de/Blog

#include <windows.h>
#include <DbgHelp.h>
#pragma comment(lib, "DbgHelp.lib")
#include <stdio.h>

static LONG WINAPI myfilter(_EXCEPTION_POINTERS *exc_ptr)
{
  static const char *minidumpFilename = "myminidump.mdmp";
  HANDLE hDumpFile = CreateFile(minidumpFilename, GENERIC_WRITE, 0, NULL,
    CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);

  if (hDumpFile != INVALID_HANDLE_VALUE) {
    __try {
      MINIDUMP_EXCEPTION_INFORMATION exceptionInfo;
      exceptionInfo.ThreadId = GetCurrentThreadId();
      exceptionInfo.ExceptionPointers = exc_ptr;
      exceptionInfo.ClientPointers = false;

      BOOL ret = MiniDumpWriteDump(GetCurrentProcess(),
        GetCurrentProcessId(), hDumpFile, MiniDumpNormal, &exceptionInfo, NULL, NULL);
      if (ret) {
        printf("Minidump information has been written to %s.\n", minidumpFilename);
      }
    } __except(EXCEPTION_EXECUTE_HANDLER) { }
    CloseHandle(hDumpFile);
  }

  return EXCEPTION_CONTINUE_SEARCH;
}

static int wedding_crasher(int *pp)
{
  *pp = 42;
  return 42;
}

int main(void)
{
  SetUnhandledExceptionFilter(myfilter);

  wedding_crasher(0);
  return 0;
}

And finally, here's a really weird idea from Dmitry Vostokov: Resurrecting Dr. Watson on Vista ;-) If you're into exception handling and crash analysis, Dmitry's http://www.dumpanalysis.org/ web site is a fantastic resource. This guy lives in an exception filter :-D


Crashing with style on Vista, part II (25 Jun 2007)

In the first part of this mini-series, I demonstrated the ReportFault API and why it didn't fit my needs on Vista. Last time around, I discussed my first attempt to use the new Windows Error Reporting (WER) APIs instead, which failed to produce any crash reports on Microsoft's Winqual site.

When the curtain fell last time, I had a WER test application which, on the surface, appeared to work, but didn't manage to get any crash reports through to Winqual. Also, entries for crash reports produced by this application looked a little funny in Vista's Problem History window:


In particular, the Bucket ID value stands out. What are bucket IDs? Essentially, the Winqual site combines various attributes of the crash report (application, signatures, crash address etc.) and creates a unique integer value from them, which then becomes an identifier for this particular type of crash.

All my WER-induced crash reports submitted from Vista clients always had a bucket ID of 8, regardless of which test application I used and how exactly I provoked the crash. Also, I knew from earlier, successful attempts to talk to the Winqual servers what real bucket IDs usually look like (much larger integers). Something fishy was going on here.

The application I tested was properly registered, signed and mapped at the Winqual site, and crash reports submitted from XP systems made it to the Winqual servers just fine. Hence, registration issues could be ruled out. I posted to the Windows Error Reporting forum and asked for help and clarification. Saar Picker responded: "We filter out unknown event types. Since your report is not of a recognized event type, it is being rejected. The Bucket ID 8 event is reporting the rejection to us."

So my crash reports were not of a recognized event type. What's a poor crash report supposed to do to be recognized?

The first parameter for WerReportCreate is an event type. The documentation says: "wzEventType - A pointer to a Unicode string that specifies the name of the event." Hmmm, so maybe this is the event type that Saar mentioned. If so, what kind of event are we talking about? Win32 events? Events like the ones captured in the Windows event log?

None of those, as it turns out. Instead, error reporting servers can define types of error events that they want to capture. Microsoft's Winqual servers, for example, are configured to accept event types which represent application or operating system crashes.

So what is the magic event type which represents an application crash?

Hint 1: The werapi.h header file defines an undocumented macro constant called APPCRASH_EVENT.

  #define APPCRASH_EVENT L"APPCRASH"

Hint 2: When a crash report is submitted using WerReportSubmit, this API tries to contact the error reporting server. In Vista, the protocol is based on XML snippets which the client sends to the server via HTTP. One of the attributes in the initial XML that is transmitted is called eventtype, and for applications which do not try to handle fatal crashes themselves, the value of that attribute is indeed "APPCRASH".

So I modified my WER code to use "APPCRASH" instead of some arbitrary string. And indeed, this made a difference, although not the one I had hoped for: With the new event type, WerReportSubmit() now returned an error (E_FAIL), where it previously succeeded...

To debug the problem, I intercepted the XML exchange between the client and the server, and looked at the differences between a non-WER client and my own test code. (If you're interested in the interception details, drop me a line.) The non-WER client transmitted additional data (so-called "signature parameters"), and it also specified a "report type" of 2 instead of 1. So my strategy was to eliminate the differences one by one by working the WER APIs.

The extra parameters sent by the non-WER client were things like the application's name, version and timestamp; the faulting module's name, version and timestamp; and the exception code and address offset. And now, finally, I understood the purpose of the underdocumented WerReportSetParameter API - depending on the server's setup, it expects certain extra parameters to safely identify an event, and those can be set using WerReportSetParameter:

static void wer_report_set_parameters(HREPORT hReportHandle,
                                      EXCEPTION_POINTERS *exc_ptr)
{
  TCHAR moduleName[1024];
  get_module_name(NULL, moduleName, _countof(moduleName));
  pWerReportSetParameter(hReportHandle, 0, L"Application Name", moduleName);

  TCHAR buffer[1024];
  get_module_file_version(moduleName, buffer, _countof(buffer));
  pWerReportSetParameter(hReportHandle, 1, L"Application Version", buffer);

  HMODULE hModule = GetModuleHandle(0);
  DWORD timeStamp = GetTimestampForLoadedLibrary(hModule);
  _sntprintf_s(buffer, _countof(buffer), _TRUNCATE,
    __T("%x"), timeStamp);
  pWerReportSetParameter(hReportHandle, 2, L"Application Timestamp", buffer);

  // determine module name from crash address
  moduleName[0] = 0;
  void *exceptionAddress = exc_ptr->ExceptionRecord->ExceptionAddress;
  if (GetModuleHandleEx(GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS |
    GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT,
    (LPCTSTR)exceptionAddress, &hModule)) {
    get_module_name(hModule, moduleName, _countof(moduleName));
  }
  pWerReportSetParameter(hReportHandle, 3, L"Fault Module Name", moduleName);

  get_module_file_version(moduleName, buffer, _countof(buffer));
  pWerReportSetParameter(hReportHandle, 4, L"Fault Module Version", buffer);

  timeStamp = GetTimestampForLoadedLibrary(hModule);
  _sntprintf_s(buffer, _countof(buffer), _TRUNCATE,__T("%x"), timeStamp);
  pWerReportSetParameter(hReportHandle, 5, L"Fault Module Timestamp", buffer);

  _sntprintf_s(buffer, _countof(buffer), _TRUNCATE,
    __T("%08x"), exc_ptr->ExceptionRecord->ExceptionCode);
  pWerReportSetParameter(hReportHandle, 6, L"Exception Code", buffer);

  INT_PTR offset = (char *)exceptionAddress - (char *)hModule;
  // %p expects a pointer argument, so cast the integer offset explicitly
  _sntprintf_s(buffer, _countof(buffer), _TRUNCATE, __T("%p"), (void *)offset);
  pWerReportSetParameter(hReportHandle, 7, L"Exception Offset", buffer);
}

The other significant change was to use the undocumented WerReportApplicationCrash constant as the "report type" parameter for WerReportCreate. After these changes, the Winqual servers finally started talking to me: I received bucket IDs, sometimes also requests to transmit minidump data - and after a few days, the crash reports appeared on the Winqual site! Whoopee!

The full demo code is attached. To build, open a Visual Studio command prompt and run the compiler:

  cl werapitest.cpp

My special thanks to Saar Picker and Jason Hardester at Microsoft for their help!

Now that I've achieved my original goal (reporting crashes using the WER APIs under Vista), let me spoil the fun by warning you never to use this approach. Why? Because this is clearly not the way Microsoft recommends handling application crashes. Now, while I'm not sure whether Microsoft as a whole has an official recommendation, the documentation and the postings in newsgroups and blogs clearly suggest that an application shouldn't actually even try to handle a crash explicitly - instead, it should just crash and let the OS do the reporting. The basic rationale behind this is that an application is probably already deeply confused when a crash occurs, and some of its data may already have been damaged. This makes crash recovery a difficult and unreliable endeavor.

There are circumstances where an application needs to keep control of the reporting process, but Microsoft expects such cases to be very rare. Which explains a lot of the initial communication disconnects that I experienced while discussing my case with Saar and Jason.

There's a reason why it's called "WER" (Windows Error Reporting) and not "WCR" (Windows Crash Reporting). Apparently, Microsoft doesn't expect us to use those APIs for crash reporting, but rather for more generic "error" or "event" reporting. For example, this U.S. patent claim discusses how the WER APIs can be used to report failures in handwriting recognition. (By the way, there's also a patent for WER itself, see http://www.freepatentsonline.com/20060271591.html.)


Crashing with style on Vista (18 Jun 2007)

A few days ago, I wrote about the peculiarities of the ReportFault API, particularly on Windows Vista, and how those peculiarities drove me to give in to Microsoft's sound advice and use the new and shiny Windows Error Reporting (WER) APIs on Vista.

ReportFault() is a great one-stop shopping API: A one-liner will display all required dialogs, ask the user if he wants to contact Microsoft, create report data (including minidumps) if required, and send the whole report off to Microsoft.

The new WER APIs in Vista are slightly more complex, but also provide more control over the details of error reporting. Well, if you know how to handle the APIs, that is. Apparently, I do not know how to handle them since I still haven't solved all the problems around them.

More on this in a moment. Let's first take a look at the core of a test application I wrote:

static bool report_crash(_EXCEPTION_POINTERS *inExceptionPointer)
{
  // Set up parameters for WerReportCreate()
  WER_REPORT_INFORMATION werReportInfo;
  memset(&werReportInfo, 0, sizeof(werReportInfo));
  werReportInfo.dwSize = sizeof(werReportInfo);
  wcscpy_s(werReportInfo.wzFriendlyEventName,
    _countof(werReportInfo.wzFriendlyEventName),
        L"werapitest (friendly event name)");
  wcscpy_s(werReportInfo.wzApplicationName,
    _countof(werReportInfo.wzApplicationName), L"");
  wcscpy_s(werReportInfo.wzDescription,
    _countof(werReportInfo.wzDescription), L"Critical runtime problem");

  PCWSTR eventType = L"werapitest (eventType)"; // APPCRASH
  HREPORT hReportHandle;
  if (FAILED(pWerReportCreate(eventType, WerReportCritical,
    &werReportInfo, &hReportHandle)) || !hReportHandle) {
      return false;
  }

  bool ret = false;

  WER_EXCEPTION_INFORMATION werExceptionInformation;
  werExceptionInformation.bClientPointers = FALSE;
  werExceptionInformation.pExceptionPointers = inExceptionPointer;
  bool dumpAdded = SUCCEEDED(pWerReportAddDump(hReportHandle, ::GetCurrentProcess(),
    ::GetCurrentThread(), WerDumpTypeMiniDump, &werExceptionInformation, NULL, 0));
  if (!dumpAdded) {
    FATAL_ERROR("Minidump generation failed.\n");
  }

  DWORD submitOptions = WER_SUBMIT_OUTOFPROCESS | WER_SUBMIT_NO_CLOSE_UI;
  WER_SUBMIT_RESULT submitResult;
  if (SUCCEEDED(pWerReportSubmit(hReportHandle, WerConsentNotAsked,
    submitOptions, &submitResult))) {
      switch(submitResult)
      {
        // ... decode result ...

      }
  }
  pWerReportCloseHandle(hReportHandle);

  return ret;
}

static int filter_exception(EXCEPTION_POINTERS *exc_ptr)
{
  report_crash(exc_ptr);
  return EXCEPTION_EXECUTE_HANDLER;
}

static void wedding_crasher(void)
{
  __try {
    int *foo = (int *)0;
    *foo = 42;
  } __except(filter_exception(GetExceptionInformation())) {
    printf("Now in exception handler, process is still alive!\n");
  }
  Sleep(5000);
}

int main()
{
  HMODULE hWer = LoadLibrary("Wer.dll");
  if (hWer) {
    pWerReportCreate =
      (pfn_WERREPORTCREATE)GetProcAddress(hWer, "WerReportCreate");
    pWerReportSubmit =
      (pfn_WERREPORTSUBMIT)GetProcAddress(hWer, "WerReportSubmit");
    pWerReportCloseHandle =
      (pfn_WERREPORTCLOSEHANDLE)GetProcAddress(hWer, "WerReportCloseHandle");
    pWerReportAddDump =
      (pfn_WERREPORTADDDUMP)GetProcAddress(hWer, "WerReportAddDump");
  }

  if (!pWerReportCreate || !pWerReportSubmit ||
    !pWerReportCloseHandle || !pWerReportAddDump) {
      printf("Cannot initialize WER API.\n");
      return 1;
  }

  wedding_crasher();
  return 0;
}

The fundamental approach is still the same as for the ReportFault test program presented recently:

  • A structured exception block is established using __try and __except.
  • Code provokes an access violation.
  • The exception filter filter_exception is consulted by the exception handling infrastructure to find out how to proceed with the exception.
  • The filter calls the WER APIs to display the crash dialog(s), and to give the user options to debug the problem, ignore it, or report it to Microsoft.
  • The exception filter returns EXCEPTION_EXECUTE_HANDLER to indicate that its associated exception handler should be called.

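The following WER APIs are used to create and send a crash report:

  • WerReportCreate creates the report and returns a report handle.
  • WerReportAddDump adds a minidump of the crashing process to the report.
  • WerReportSubmit submits the report, displaying the error reporting dialogs as needed.
  • WerReportCloseHandle closes the report handle.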

The WER APIs do indeed solve a problem that I found with ReportFault on Vista: They don't force the calling process to be terminated, and allow me to proceed as I see fit. That's really good news.

The problem I haven't resolved yet is this: Even though I call WerReportAddDump, I have no idea whether minidump data are actually generated and sent. In fact, from the feedback provided by the system, it seems likely that those data are not generated.

To illustrate my uncertainties, I wrote a test program called werapitest. The code is attached as a ZIP file; unpack it into a directory, open a Visual Studio command prompt window, and build the code as follows:

  cl werapitest.cpp

Run the resulting executable, then open up the "Problem Reports and Solutions" control panel and click on "View problem history". On my system, I get something like this:

werapitest_history.jpg

Double-clicking on the report entry leads to this:

werapitest_entry.png

The problem history entry does not mention any attached files, such as minidump data!

When a crash occurs, the system also writes entries into the event log; those log entries claim there are additional data in paths such as C:\Users\clausb\AppData\Local\Microsoft\Windows\WER\ReportArchive\Report0f8918ad, and indeed, such directories exist and each contain a file called Report.wer, which holds data such as:

Version=1
EventType=werapitest (eventType)
EventTime=128266502225896608
ReportType=1
Consent=1
UploadTime=128266502257542112
Response.BucketId=8
Response.BucketTable=5
Response.type=4
DynamicSig[1].Name=OS Version
DynamicSig[1].Value=6.0.6000.2.0.0.256.16
DynamicSig[2].Name=Locale ID
DynamicSig[2].Value=1033
UI[3]=werapitest.exe has stopped working
UI[4]=Windows can check online for a solution to the problem.
UI[5]=Check online for a solution and close the program
UI[6]=Check online for a solution later and close the program
UI[7]=Close the program
State[0].Key=Transport.DoneStage1
State[0].Value=1
State[1].Key=DataRequest
State[1].Value=Bucket=8/nBucketTable=5/nResponse=1/n
FriendlyEventName=werapitest (friendly event name)
ConsentKey=werapitest (eventType)
AppName=werapitest.exe
AppPath=C:\tmp\werapitest.exe
ReportDescription=Critical runtime problem

So again, the minidump is not mentioned anywhere.
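Since Report.wer is just a list of Key=Value lines, checking it for traces of a minidump can be automated. The following is only a rough sketch, not part of werapitest: all function names are made up, the file content is assumed to have been read into a narrow string already (the on-disk encoding may differ), and I am assuming that EventTime is a Windows FILETIME value, i.e. a count of 100 ns ticks since 1601-01-01 UTC.

```cpp
// Sketch only: parse "Key=Value" lines as found in Report.wer.
// Assumption: EventTime is a FILETIME (100 ns ticks since 1601-01-01 UTC).
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: split the text into lines, and each line at its first '='.
std::map<std::string, std::string> parse_wer_report(const std::string &text)
{
  std::map<std::string, std::string> entries;
  std::istringstream in(text);
  std::string line;
  while (std::getline(in, line)) {
    if (!line.empty() && line.back() == '\r') line.pop_back(); // tolerate CRLF
    const std::string::size_type eq = line.find('=');
    if (eq == std::string::npos) continue; // not a key/value line
    entries[line.substr(0, eq)] = line.substr(eq + 1);
  }
  return entries;
}

// Hypothetical helper: FILETIME ticks -> seconds since 1970-01-01 UTC.
// Only valid for times after 1970.
std::int64_t filetime_to_unix_seconds(std::uint64_t ticks)
{
  const std::uint64_t ticks_per_second = 10000000ULL;
  const std::uint64_t epoch_difference = 116444736000000000ULL; // 1601 -> 1970
  return static_cast<std::int64_t>((ticks - epoch_difference) / ticks_per_second);
}

// Hypothetical helper: true if any key or value contains the substring "dmp"
// (case-sensitive), which would hint at an attached *.dmp or *.mdmp file.
bool mentions_dump(const std::map<std::string, std::string> &entries)
{
  for (const auto &kv : entries) {
    if (kv.first.find("dmp") != std::string::npos ||
        kv.second.find("dmp") != std::string::npos) {
      return true;
    }
  }
  return false;
}
```

Fed with the report shown above, mentions_dump() comes back false. Under the FILETIME assumption, the EventTime value converts to a time on 18 June 2007 (UTC), which at least fits the date of this post.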

Now let's try some minimal code which uses neither ReportFault nor the new WER API:

  int main(void)
  {
    int *p = (int *)0;
    *p = 42;
    return 0;
  }

After running this code and letting it crash and report to Microsoft, I get the following problem history entry:

crashme.jpg

This problem report contains a lot more data than the one for werapitest, and it even refers to a minidump file which was apparently generated by the system and probably also sent to Microsoft.

So the lazy code which doesn't do anything about crashes gets full and proper service from the OS, while the application which tries to deal with a crash in an orderly manner and elaborately goes through all the trouble of using the proper APIs doesn't get its message across to Microsoft. I call this unfair ;-)

Oh, and in case you're wondering: Yes, we've registered with Microsoft's Winqual site where the crash reports are supposed to be sent to, and we established "product mappings" there, and the whole process seems to work for XP clients just fine.

I'm pretty sure that I'm just missing a couple of details with the new APIs, or maybe I'm misinterpreting the feedback from the system. I ran numerous experiments and umpteen variations, I've searched the web high and low, read the docs, consulted newsgroups here and there - and now I'm running out of ideas. Any hints most welcome...

PS: I did indeed receive some hints. For updated WER code, along with an explanation on why the above failed, see Crashing with style on Vista, part II.


The end is nigh (for my process) (16 Jun 2007)

How can you tell that you're the control freak type of Windows programmer? Easy: You feel that irresistible urge to install top-level exception handlers which report application crashes to the end user and provide useful options on how to proceed, such as to report the issue to the software vendor, save the currently loaded data, inspect the issue in more detail, or call the police.

reportfault_xp.jpg

In fact, this is pretty much what Windows Error Reporting is all about, only that the crash reports are sent to Microsoft first (to their Winqual site, that is), from where ISVs can then download them for further analysis. Oh, and the other difference is that Microsoft dropped the "call the police" feature in order to get Vista done in time.

One of the applications that I'm working on already had its own top-level crash handler which performed some of the services also provided by Windows Error Reporting. It was about time to investigate Microsoft's offerings in this area and see how they can replace or augment the existing crash handler code.

The first option I looked at was the ReportFault API. Microsoft's documentation says that the function is obsolete, and we should rather use a different set of APIs collectively called the "WER functions". However, understanding them requires a lot more brain calories than the trivial ReportFault call which you can simply drop into an exception filter, and you're done.

The required code is pretty trivial and looks roughly like this:

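#include <windows.h>
#include <errorrep.h>   // ReportFault, EFaultRepRetVal
#include <tchar.h>
#include <stdio.h>
#pragma comment(lib, "faultrep.lib")

int filter_exception(EXCEPTION_POINTERS *exc_ptr)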
{
  EFaultRepRetVal repret = ReportFault(exc_ptr, 0);
  switch (repret)
  {
         // decode return value...
         //
  }
  return EXCEPTION_EXECUTE_HANDLER;
}

int main(void)
{
  __try {
    int *foo = (int *)0;
    *foo = 42;
  } __except(filter_exception(GetExceptionInformation())) {
    _tprintf(__T("Nothing to see here, move on, process is still alive!\n"));
  }
  Sleep(5000);
}

Sequence of events:

  • A structured exception block is established using __try and __except.
  • Code provokes an access violation.
  • The exception filter filter_exception is consulted by the exception handling infrastructure to find out how to proceed with the exception.
  • The filter calls ReportFault to display the crash dialog as shown above, and to give the user options to debug the problem, ignore it, or report it to Microsoft.
  • After performing its menial reporting duties, the exception filter returns EXCEPTION_EXECUTE_HANDLER to indicate that its associated exception handler should be called.

That exception handler is, in fact, essentially the _tprintf statement which spreads the good news about the process still being alive.

reportfault_vista.jpg

On XP, that is. On Vista, the _tprintf statement may actually never execute. You'll still get a nice reporting dialog, such as the one in the screenshot to the right, but when you click the "Close program" button, the calling process will be terminated immediately, i.e. ReportFault never really returns to the caller!

I debugged into ReportFault on my Vista machine and found that ReportFault spawns off a process called wermgr.exe which performs the actual work. My current hypothesis is that it is wermgr.exe which terminates the calling process if the user chooses "Close program".

If you want to try it yourself, click here to download the demo code. To compile, simply run it through cl.exe:

  cl.exe reportfault.cpp

Now, can we complain about this, really? After all, you can't call it surprising if a program closes after hitting the "Close program" button. Still, the behavior differs from the old XP dialog - and it is inconsistent even on Vista. What I just described is the behavior that I found with the default error reporting settings in Vista. By default, Vista "checks for solutions automatically" and doesn't ask the user what to do when a crash occurs. This can be configured in the "Problem Reports and Solutions" control panel:

vista_settings.jpg

After changing the report settings as shown above ("Ask me") and then running the test application again, the error reporting dialog looks like this:

reportfault_vista_ask.jpg

When I click on "Close program" now, guess what happens - the process does not terminate, and the _tprintf statement in my exception handler is executed, just like on XP! So that "Close program" button can mean two different things on Vista...

It's not just this inconsistency which bugged me. I also don't like the idea of letting the error reporting dialog pull the rug from under my feet. Sure, I'd like to use the dialog's services, but when it returns, I want to make my own decisions about how to proceed. For example, I could try and save the currently loaded data in my application, or I could add my own special reporting. Or call the cops.

ReportFault won't let me do that on Vista. And so I set out to burn those extra brain calories anyway and learn about the new WER APIs which were introduced with Windows Vista.

And burn calories I did, oh yes. More on this hopefully soon.


Getting rid of nul, or: How I learnt to love UNC (29 Apr 2006)

Every now and then, some tool on my system runs berserk and starts to generate files called nul. This is a clear indication that there's something going wrong with output redirection in a script, but I still have to figure out exactly what's going on. Until then, I need at least a way to get rid of those files.

Yes, that's right, you cannot delete a file called nul that easily - neither using Windows Explorer nor via the DOS prompt. nul is a very special filename for Windows - it is an alias for the null device, i.e. the bit bucket where all the redirected output goes, all those cries for help from software which we are guilty of ignoring all the time.

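UNC path notation to the rescue (strictly speaking, the \\.\ prefix selects the Win32 device namespace, which merely looks like a UNC path): To remove a file called nul in, say, c:\temp, you can use the DOS del command as follows: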

  del \\.\c:\temp\nul

Works great for me. But since I rarely use UNC syntax, I sometimes forget what it looks like. Worse, the syntax requires specifying the full path of the nul file, and I hate typing those long paths. So I came up with the following naïve batch file which does the job for me. It takes one argument which specifies the relative or absolute path of the nul file. Examples:

   rem remove nul file in current dir
   delnul.bat nul
   rem remove nul file in subdir
   delnul.bat foo\nul
   rem remove nul file in tempdir
   delnul.bat c:\temp\nul

For the path completion magic, I'm using the for command which has so many options that my brain hurts whenever I read its documentation. I'm pretty sure one could build a Turing-complete language using just for...

  @echo off
  set fullpath=
  for %%i IN (%1x) DO set fullpath=%%~di%%~pi

  set filename=
  for %%i IN (%1x) DO set filename=%%~ni
  if not "x%filename%" == "xnulx" (echo Usage: %0 [somepath\]nul && goto :eof)
 
  echo Deleting %fullpath%nul...
  del \\.\%fullpath%nul

DelinvFile takes this a lot further; it has a Windows UI and can delete many other otherwise pretty sticky files - nul is not the only dangerous file name; there's con, aux, prn and probably a couple of other magic names which had a special meaning for DOS, and hence also for Windows.
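For scripts or tools which want to detect such sticky names up front, a small check helps. This is a sketch only, not taken from DelinvFile or from my batch file: the function name is made up, and the list of names (CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9) is the one I know from Microsoft's file naming documentation. The function expects a bare file name, not a full path.

```cpp
// Sketch: does a file name collide with a reserved DOS device name?
// is_reserved_device_name is a hypothetical helper for illustration.
#include <algorithm>
#include <cctype>
#include <string>

bool is_reserved_device_name(const std::string &filename)
{
  // A reserved name followed by an extension (e.g. "nul.txt") is still
  // treated as the device, so only look at the part before the first dot.
  std::string base = filename.substr(0, filename.find('.'));

  // Device names are case-insensitive.
  std::transform(base.begin(), base.end(), base.begin(),
                 [](unsigned char c) { return static_cast<char>(std::toupper(c)); });

  if (base == "CON" || base == "PRN" || base == "AUX" || base == "NUL") {
    return true;
  }
  // COM1..COM9 and LPT1..LPT9
  if (base.size() == 4 &&
      (base.compare(0, 3, "COM") == 0 || base.compare(0, 3, "LPT") == 0)) {
    return base[3] >= '1' && base[3] <= '9';
  }
  return false;
}
```

With this, "nul", "NUL.txt" and "Com1" are flagged, while "null" or "console" pass as ordinary file names.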


Pasting my own dogfood, part 4 (15 Apr 2006)

In the epic "dogfood" series, the hero comes to realize he needs systematic tests for clipboard code. On his quest, he briefly gives in to the sweet song of the Scripting Sirens, but escapes and makes it to the safer shores of the Win32 API - but only to realize that fate has even more trials in place for him.

When we left our hero the last time, he had just figured out that not all Win32 handles are made alike. Data for most clipboard formats is held in a global memory buffer (allocated via GlobalAlloc); GetClipboardData returns a handle to the memory block, and all you need to do in order to decode the data is to interpret the handle as a memory handle and then read from that block of memory.

However, there are some formats which won't reveal their inner selves that easily, such as bitmaps. Hence, it's about time we form a circle, take each other by the hands, and meditate over clipboard formats.

Microsoft lists clipboard formats here and distinguishes the following main classes of clipboard formats:

  • Standard clipboard formats, i.e. predefined formats such as CF_BITMAP, CF_TEXT, CF_ENHMETAFILE, CF_HDROP, CF_PALETTE, CF_WAVE etc.
  • Registered clipboard formats, i.e. application-defined formats which are registered at runtime. RTF is a prominent example of such a format.
  • Private clipboard formats: Another special kind of application-defined formats.

For the purpose of code which tries to retrieve arbitrary data from the clipboard, however, it is more useful to use a different classification.

Memory-based clipboard formats

These are all the formats for which the handle returned from GetClipboardData can be interpreted as a memory handle.

The text formats (CF_TEXT and cousins) are in this class, and probably most application-defined formats ("registered formats"), although it is entirely up to the application to decide on how the data in the clipboard needs to be decoded. Other examples: CF_LOCALE, CF_WAVE, CF_TIFF.

These formats can be posted to the clipboard and read from there using code similar to what I posted last time.

Handle-based clipboard formats

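These are the formats for which the handle returned from GetClipboardData is not a plain memory handle, but some other kind of handle which needs format-specific treatment. Examples of such formats: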

  • CF_ENHMETAFILE, CF_DSPENHMETAFILE
  • CF_METAFILEPICT, CF_DSPMETAFILEPICT
  • CF_BITMAP, CF_DSPBITMAP
  • CF_PALETTE

I found that I had to provide format-specific code to interpret these formats.

Metafiles

CF_ENHMETAFILE and CF_DSPENHMETAFILE data can be copied by interpreting the clipboard handle as a metafile handle (HENHMETAFILE) and using the CopyEnhMetaFile API in Win32 to directly create a file on the disk.

CF_METAFILEPICT and CF_DSPMETAFILEPICT differ slightly from this. The clipboard handle is a memory handle to a METAFILEPICT data structure. That structure has a member called hMF which is the actual metafile handle; pass this handle to CopyMetaFile, and you'll get a metafile on the disk.

Bitmaps

The clipboard handle really is a bitmap handle of type HBITMAP. HBITMAPs refer to device-dependent bitmap data (DDB), which first need to be converted into device-independent format (DIB); then you can add a bitmap header and write the whole shebang to the disk in a format which can be read as *.BMP by image viewers.
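The "add a bitmap header" step is the part which can be shown without any Win32 calls. The sketch below builds the 54 header bytes of an uncompressed 24-bit *.BMP file, following the layout of the BITMAPFILEHEADER and BITMAPINFOHEADER structures. It is not the code from my tools: the helper names are invented, the 24-bit depth is an assumption, and the DDB-to-DIB conversion itself (typically done with an API such as GetDIBits) is left out.

```cpp
// Sketch: header bytes for an uncompressed 24-bit BMP file.
// All function names are hypothetical; values are written little-endian.
#include <cstdint>
#include <vector>

static void put_u16(std::vector<std::uint8_t> &out, std::uint16_t v)
{
  out.push_back(static_cast<std::uint8_t>(v & 0xFF));
  out.push_back(static_cast<std::uint8_t>((v >> 8) & 0xFF));
}

static void put_u32(std::vector<std::uint8_t> &out, std::uint32_t v)
{
  for (int i = 0; i < 4; ++i) {
    out.push_back(static_cast<std::uint8_t>((v >> (8 * i)) & 0xFF));
  }
}

// Each pixel row in a DIB is padded to a multiple of four bytes.
std::uint32_t dib_row_stride(std::uint32_t width, std::uint32_t bits_per_pixel)
{
  return ((width * bits_per_pixel + 31) / 32) * 4;
}

std::vector<std::uint8_t> make_bmp_header(std::uint32_t width, std::uint32_t height)
{
  const std::uint32_t header_size = 14 + 40;
  const std::uint32_t image_size = dib_row_stride(width, 24) * height;

  std::vector<std::uint8_t> out;
  // BITMAPFILEHEADER (14 bytes)
  out.push_back('B');
  out.push_back('M');
  put_u32(out, header_size + image_size); // bfSize: total file size
  put_u16(out, 0);                        // bfReserved1
  put_u16(out, 0);                        // bfReserved2
  put_u32(out, header_size);              // bfOffBits: where pixel data starts
  // BITMAPINFOHEADER (40 bytes)
  put_u32(out, 40);                       // biSize
  put_u32(out, width);                    // biWidth
  put_u32(out, height);                   // biHeight (positive: bottom-up rows)
  put_u16(out, 1);                        // biPlanes
  put_u16(out, 24);                       // biBitCount
  put_u32(out, 0);                        // biCompression (BI_RGB)
  put_u32(out, image_size);               // biSizeImage
  put_u32(out, 0);                        // biXPelsPerMeter
  put_u32(out, 0);                        // biYPelsPerMeter
  put_u32(out, 0);                        // biClrUsed
  put_u32(out, 0);                        // biClrImportant
  return out;
}
```

The pixel rows of the DIB would then be appended right after these header bytes, each row padded to dib_row_stride() bytes.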

Palettes

The clipboard handle must be interpreted as an HPALETTE handle. The GetPaletteEntries API can be used to retrieve the data behind such a handle; then we can dump the palette entries in the data structure to a file in any format we choose; for example, a simple integer specifying the number of entries, followed by PALETTEENTRY structures.
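Such a "count, then entries" format could look roughly like the following. This is a sketch under assumptions, not the actual file format of my tools: I assume a 32-bit little-endian count, and PaletteEntry is a stand-in for the Win32 PALETTEENTRY structure so that the code also builds without the Windows headers.

```cpp
// Sketch: dump palette entries as "count, then entries".
// The 32-bit little-endian count and all names are assumptions.
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for the Win32 PALETTEENTRY structure (four bytes per entry).
struct PaletteEntry {
  std::uint8_t red, green, blue, flags;
};

std::vector<std::uint8_t> serialize_palette(const std::vector<PaletteEntry> &entries)
{
  std::vector<std::uint8_t> out;
  const std::uint32_t count = static_cast<std::uint32_t>(entries.size());
  for (int i = 0; i < 4; ++i) {
    out.push_back(static_cast<std::uint8_t>((count >> (8 * i)) & 0xFF));
  }
  for (const PaletteEntry &e : entries) {
    out.push_back(e.red);
    out.push_back(e.green);
    out.push_back(e.blue);
    out.push_back(e.flags);
  }
  return out;
}

// Reads the format back; returns an empty vector if the data is truncated.
std::vector<PaletteEntry> deserialize_palette(const std::vector<std::uint8_t> &data)
{
  std::vector<PaletteEntry> entries;
  if (data.size() < 4) return entries;

  std::uint32_t count = 0;
  for (int i = 0; i < 4; ++i) {
    count |= static_cast<std::uint32_t>(data[i]) << (8 * i);
  }
  if ((data.size() - 4) / 4 < count) return entries; // fewer entries than promised

  for (std::uint32_t i = 0; i < count; ++i) {
    const std::size_t pos = 4 + static_cast<std::size_t>(i) * 4;
    entries.push_back({data[pos], data[pos + 1], data[pos + 2], data[pos + 3]});
  }
  return entries;
}
```

On Windows, the entries would come from GetPaletteEntries when writing, and the deserialized entries would be used to build a new palette when going in the other direction.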

Other formats

When dragging and dropping files (or copying them in Explorer), information about these files is transmitted in CF_HDROP format. The handle returned by GetClipboardData can be interpreted as an HDROP, which can be passed to DragQueryFile to learn more about the files in the clipboard.

(I tried to come up with testcases for formats such as CF_PENDATA and CF_DSPTEXT, but could not find any. If anyone comes across these formats, please let me know.)

Finally: New toys!

With the above findings, I was ready to extend my very simplistic original test code. The result: Two useful tools which can be used to copy data from the clipboard into files (ClipboardToFile.exe), and to copy data from files into the clipboard (FileToClipboard.exe). And I'm even sharing this code ;-)

Source code and executables can be downloaded here - and here are some hints on how to use the toolset.

ClipboardToFile

ClipboardToFile does exactly what the name hints at: It enumerates the formats which are currently in the clipboard, and writes files containing the clipboard data in that format.

So to create a set of test files, simply run your favorite apps on any system and use them to copy data in various formats to the clipboard. For example:

  • Run Paint, load a file and copy it to the clipboard
  • Run ClipboardToFile to save whatever Paint added to the clipboard
  • Run Word, type some text, and copy it to the clipboard
  • Run ClipboardToFile again to get clipboard extracts in various text formats.

An example of how to run ClipboardToFile:

 ClipboardToFile c:\temp\clipboarddata

With the above command line, ClipboardToFile will produce files in the directory c:\temp\clipboarddata. Those files are named after the clipboard format from which they were produced. Typical names are "CF_TEXT", "CF_BITMAP", "CF_DIB" and so on. Repeat the process with other apps on your system until you have a library of clipboard data files which you can use for unit tests!

FileToClipboard

FileToClipboard is a command-line tool which takes any file you throw at it, reads the file's contents and copies them to the clipboard in (almost) any format you specify:

   FileToClipboard foo.wmf CF_ENHMETAFILE
   FileToClipboard foo.bmp CF_BITMAP
   FileToClipboard foo.txt CF_UNICODETEXT

So the basic idea is to prepare a couple of test files (here: foo.bmp, foo.wmf and foo.txt), dump them into your unit test directory, and use them to prepare the clipboard. Then you run your application's "Edit/Paste" functionality and verify that it works as expected. Since FileToClipboard is a command-line utility, you can automate such tests easily; also, the executable is very small and can be installed everywhere simply by copying the exe file.

In the case of text and bitmap files, it is easy to see where you can get sample test data. However, some formats are only used for clipboard transfer and are never persisted to files. As an example, the CF_LOCALE format indicates that locale data is in the clipboard. In the case of CF_LOCALE, it would be easy to fudge a binary file: A single integer is used to encode a locale ID. So you could create such a file with a hex editor or by writing a one-liner C program or whatever, and then feed it into FileToClipboard in CF_LOCALE mode.
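A minimal sketch of how such a CF_LOCALE test file could be produced, assuming the locale ID is stored as a 32-bit integer with the least significant byte first (the layout on x86 Windows). The function name WriteLocaleFile and the file name in the usage note are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Writes a locale ID as four bytes, least significant byte first.
   Returns 1 on success, 0 on failure. */
int WriteLocaleFile(const char *path, uint32_t lcid)
{
  assert(path != NULL);

  unsigned char bytes[4];
  for (int i = 0; i < 4; i++) {
    bytes[i] = (unsigned char)((lcid >> (8 * i)) & 0xFF);
  }

  FILE *f = fopen(path, "wb");
  if (!f) {
    return 0;
  }
  size_t written = fwrite(bytes, 1, sizeof(bytes), f);
  /* fclose can still report a write error, so check it as well */
  return (fclose(f) == 0) && (written == sizeof(bytes));
}
```

Calling WriteLocaleFile("locale.bin", 0x0409) would produce a file for the US English locale, which could then be passed to FileToClipboard together with CF_LOCALE.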

However, there are many other formats which are not quite that simple. Worse, any application out there can define its own undocumented clipboard formats at any time. Fortunately, we already have a tool which fills this gap: ClipboardToFile.

The end of the saga? Not quite!

These tools cover a wide variety of clipboard formats, including many registered formats - most of them seem to be memory-based. For the most part, I'm a happy camper now. I can automate all my tests, and move on to greener pastures.

Well, I could, except that the whole Win32 approach which I took is still fundamentally flawed, really.

When I fire up Paint, then select an image area and copy it to the clipboard, ClipboardToFile will report the following:

ClipboardToFile - (C) 2006 Claus Brod, http://www.clausbrod.de

Clipboard format 49161 successfully written to DataObject.
Clipboard format 49163 successfully written to Embed Source.
Clipboard format 49156 successfully written to Native.
ERROR: Cannot write data to OwnerLink
Clipboard format 49166 successfully written to Object Descriptor.
Clipboard format 3 successfully written to CF_METAFILEPICT.
Clipboard format 8 successfully written to CF_DIB.
Clipboard format 49171 successfully written to Ole Private Data.
Clipboard format 14 successfully written to CF_ENHMETAFILE.
Clipboard format 2 successfully written to CF_BITMAP.
Clipboard format 17 successfully written to CF_DIBV5.
ERROR: Failure to enumerate clipboard and store all formats.

Apparently, there's a funny format on the clipboard called OwnerLink which, for some reason, ClipboardToFile cannot read properly. When debugging into this case, it turns out that GetClipboardData returns a null handle for this format. Hmmm... what is this format used for? Why doesn't it contain any data? Or are there other ways to retrieve the data than GetClipboardData?

And what are these other formats such as DataObject, Embed Source, Native, Object Descriptor and Ole Private Data for?

Indeed, there is much more to the clipboard than was dreamt of in my philosophy. More on this (hopefully) soon.


Pasting my own dogfood, part 3 (14 Apr 2006)

Last time around, I discussed a slightly kludgy approach for automatic testing of clipboard code, which was based on clipbrd.exe and some VBScript code. That wasn't bad, but I wasn't really satisfied. After all, the original goal was to write rock-solid unit tests for clipboard code. I wanted a more reliable tool to copy data from and to the clipboard in arbitrary formats. I needed more control. And, most important of all, I was in the mood for reinventing wheels (really fancy ones, of course).

The key Win32 APIs for clipboard handling are SetClipboardData and GetClipboardData. Their signatures are as follows:

  HANDLE SetClipboardData(UINT uFormat, HANDLE hMem);

  HANDLE GetClipboardData(UINT uFormat);

So when you post to the clipboard, all you need to specify is a format and a memory handle, as it seems! This looks so trivial that I had my strategy laid out almost immediately: I would allocate a global memory block using GlobalAlloc. Then I would read some clipboard data from a file into that block, and finally call SetClipboardData - like in the following code:

#include <windows.h>
#include <tchar.h>
#include <stdio.h>

HANDLE ReadFileToMemory(const TCHAR *filename)
{
  FILE *f;
  errno_t error = _tfopen_s(&f, filename, _T("rb"));
  if (error) {
    _ftprintf(stderr, _T("ERROR: Cannot open %s\n"), filename);
    return 0;
  }

  // get size of file
  fseek(f, 0, SEEK_END);
  long size=ftell(f);
  fseek(f, 0, SEEK_SET);

  // allocate memory
  HANDLE hMem = ::GlobalAlloc(GMEM_MOVEABLE, size);
  if (!hMem) {
    fclose(f);
    return 0;
  }

  LPVOID mem = ::GlobalLock(hMem);
  if (!mem) {
    fclose(f);
    ::GlobalFree(hMem);
    return 0;
  }

  // read the file into memory
  size_t bytes_read = fread(mem, 1, size, f);
  fclose(f);
  ::GlobalUnlock(hMem);  // unlock via the handle, not the locked pointer
  if (size < 0 || bytes_read != (size_t)size) {
    ::GlobalFree(hMem);
    return 0;
  }

  return hMem;
}

bool FileToClipboard(const TCHAR *filename, UINT clipid, HWND ownerWindow)
{
  if (!::OpenClipboard(ownerWindow)) {
    return false;
  }

  HANDLE h = NULL;
  HANDLE hClip = ReadFileToMemory(filename);
  if (hClip) {
    ::EmptyClipboard();
    h = ::SetClipboardData(clipid, hClip);
    if (!h) {
      ::GlobalFree(hClip);  // the clipboard did not take ownership
    }
  }
  ::CloseClipboard();
  ::DestroyWindow(ownerWindow);
  return h != NULL;
}

And the reverse code to get data from the clipboard and save it to a file would be just as simple:

  bool ret = false;
  FILE *f = fopen(filename, "wb");
  if (f) {
    HANDLE hClip = GetClipboardData(clipID);
                   // clipID culled from looping over
                   // available formats using EnumClipboardFormats
    void *pData = hClip ? ::GlobalLock(hClip) : NULL;
    if (pData) {
      // GlobalSize expects the handle, not the locked pointer
      SIZE_T sz = ::GlobalSize(hClip);
      if (sz) {
        size_t written = fwrite(pData, 1, sz, f);
        ret = (written == sz);
      }
      ::GlobalUnlock(hClip);
    }
    fclose(f);
  }

Piece of cake! Mission accomplished! I slapped the usual boilerplate code for a console app onto the above, and ran my first successful tests: I could read text from the clipboard, and post text to it just fine.

However, several formats stubbornly refused to cooperate. In particular, the really useful stuff, like metafiles. Or bitmaps. What was going on?

When calling GetClipboardData for these formats, the code that interprets the returned handle as a global memory handle flatly falls on its face. It turns out that I had jumped to conclusions way too early when I read the first few lines of the SetClipboardData documentation - some of those clipboard handles are actually anything but memory handles! Examples of such formats:

  • CF_ENHMETAFILE, CF_DSPENHMETAFILE
  • CF_METAFILEPICT, CF_DSPMETAFILEPICT
  • CF_BITMAP, CF_DSPBITMAP
  • CF_PALETTE

And then, of course, there are whole classes of clipboard formats which I had not even explored yet, such as application-defined formats.

Next time: How I learnt to peacefully coexist with all the various classes of clipboard formats.


Pasting my own dogfood, part 2 (09 Apr 2006)

Yesterday, I introduced the problem of how to automatically test Windows clipboard code in applications. The idea is to move from manual and error-prone clickety-click style testing to an automatic process which produces reliable results.

Unbeknownst to many, Windows ships with a fairly interesting tool called the ClipBook Viewer (clipbrd.exe), which monitors what the clipboard contains, and will even display the formats it knows about.

This is quite helpful while developing and debugging clipboard code. However, ClipBook Viewer can even help with test automation since it can save the current clipboard contents to *.CLP files and load them back into the clipboard later.

Which, in fact, is almost all we need to thoroughly and reliably test clipboard code: We run some apps which produce a good variety of clipboard formats which our own application needs to deal with. We select some data, copy them to the clipboard, then save the clipboard contents as a *.CLP file from ClipBook Viewer.

Once we have created a reasonably-sized clipboard file library, we run ClipBook Viewer and load each one of those clipboard files in turn. After loading, we switch to our own app, paste the data and check whether the incoming data makes sense to us. Not bad at all!

But alas, I could not find a way to automate the ClipBook Viewer via the command-line or COM interfaces. If someone knows about such interfaces, I'm certainly most interested to hear about them.

Luck would have it that only recently, I blogged about poor man's automation via SendKeys. The idea is to write a small shell script which runs the target application (here: clipbrd.exe), and then simulate how a user presses keys to use the application.

clipbrd.exe can be started with the name of a *.CLP file in its command line, and will then automatically load this file. However, before it pushes the contents of the file to clipboard, it will ask the user for confirmation in a message box. Well, in fact, first it will try to establish NetDDE connections, and will usually waste quite a bit of time for this. The following script tries to take this into account:

Set WshShell = WScript.CreateObject("WScript.Shell")
WshShell.Run("clipbrd.exe c:\temp\clip.CLP")
WScript.Sleep 5000 ' Wait for "Clear clipboard (yes/no)?" message box
WshShell.SendKeys "{ENTER}"

Now we could add some more scripting glue code to control our own application, have it execute its "Paste" functionality and verify that the data arrives in all its expected glory.

The above code is not quite that useful if we need to run a set of tests in a loop; the following modified version is better suited for that purpose. It assumes that all *.CLP files are stored in c:\temp\clipfiles.

Set WshShell = WScript.CreateObject("WScript.Shell")
WshShell.Run("clipbrd.exe")
WScript.Sleep 20000

startFolder="c:\temp\clipfiles"
set folder=CreateObject("Scripting.FileSystemObject").GetFolder(startFolder)
for each file in folder.files
  WScript.Echo "Now testing " & file.Path
  OpenClipFile(file.Path)
  ' Add here:
  ' - Activate application under test
  ' - Have it paste data from the clipboard
  ' - Verify that the data comes in as expected
next

' Close clipbrd.exe
WshShell.AppActivate("ClipBook Viewer")
WshShell.SendKeys "%F"
WScript.Sleep 1000
WshShell.SendKeys "x"

Sub OpenClipFile(filename)
  WshShell.AppActivate("ClipBook Viewer")
  WshShell.SendKeys "%W"  ' ALT-W for Windows menu
  WScript.Sleep 500
  WshShell.SendKeys "1"   ' Activate Clipboard window
  WScript.Sleep 500
  WshShell.SendKeys "%F"  ' ALT-F for File menu
  WScript.Sleep 1000
  WshShell.SendKeys "O"
  WScript.Sleep 1000
  WshShell.SendKeys filename
  WScript.Sleep 1000
  WshShell.SendKeys "{ENTER}"
  WScript.Sleep 1000  ' Wait for "Clear clipboard (yes/no)?"
  WshShell.SendKeys "{ENTER}"
End Sub

I'm sure a VBScript hacker could tidy this up considerably and use it to form a complete test suite. However, while this approach finally gives us some degree of automation, it is still lacking in several ways:

  • The format for the *.CLP file is undocumented, so we cannot add clipboard data of our own, unless we first copy it to the clipboard, then save it from there using ClipBook Viewer.
  • Automation via sending keys is a very brittle approach. For instance, the above code was written for the English version of clipbrd.exe. The German or French or Lithuanian versions of clipbrd.exe might have completely different keyboard shortcuts.
  • I shudder when looking at those magic delay time values which the code is riddled with - what if we run on a really slow system? On a system which has even more networking problems than the one which I tested the code with?
  • Any process (or user) stealing the window focus while the test is running will break the test.

Hence, next time: Look Ma, no SendKeys!


Pasting my own dogfood (08 Apr 2006)

Simplicity often breeds success. The Windows clipboard is undoubtedly an example of this. Even though fairly limited and brittle, it is probably the most popular mechanism of data exchange in a computer user's daily life - every time I copy and paste some text from here to there, I'm using the clipboard.

(This is a tempting opportunity to gripe about clipboard inheritance. In my time as a programmer, I have certainly found way more jaw-dropping instances of boneheaded copy-paste programming than I'd ever wish for. But then, considering all the stuff I've written and forgotten about, who knows if I'm really in the right position to cast the first stone! And today doesn't feel like soapbox day, anyway. No, today I'll try to be constructive, just for a change!)

Pretty much every application under the sun supports the clipboard, and if you want to write a new application, you'll also want to be able to export application-specific data to the clipboard in some popular formats, or to import data from Office, your favorite audio ripper software, or simply from Notepad.

Clipboard code isn't too tricky to write. However, it isn't all that obvious how you can test it effectively.

You can of course test clipboard functionality in your application by mousing around: Run your app, select some data, hit CTRL-C to copy data to the clipboard, hit CTRL-V to paste it back into your own application again, select some other data, rinse and repeat. While this roundtrip test certainly covers a lot of ground, it has two fundamental weaknesses: First, it assumes that your application understands the same clipboard input formats which it produces for output, which is often not the case. Second, this approach only verifies that your "copy" code is just as broken as your "paste" code, i.e. that they make the same assumptions about the clipboard and the format of the data stored there. If the transfer works as expected, it either means that both copy and paste code are correct, or it means that both code areas have symmetric bugs!

So to fully test clipboard functionality in an application, you'd better try to interact with other applications. After all, this is the whole original purpose of the clipboard. If exchanging data with other apps works, then you know that you interpret certain clipboard formats the same way other applications do, and can claim with confidence that your application is interoperable.

However, running other applications as part of clipboard unit tests poses its own challenges: For example, the remote application might be difficult to automate because it does not have an automation API. Also, to run the unit tests on any given test system, you'd have to install the remote application on that test system first. Not exactly a tempting thought if the Windows Installer file for that application fills 100 MB, or if the installation process requires you to enter license codes.

This is the kind of situation I found myself in recently. In the next few blog entries, I'll discuss a few ideas on how to tackle this problem.


Don't quote me on this (18 Mar 2006)

Let us assume that I'm a little backward and have a peculiar fondness for the DOS command shell. Let us further assume that I also like blank characters in pathnames. Let us conclude that therefore I'm hosed.

But maybe others out there are hosed, too. Blank characters in pathnames are not exactly my exclusive fetish; others have joined in as well (C:\Program Files, C:\Documents and Settings). And when using software, you might be running cmd.exe without even knowing it. Many applications can run external helper programs upon user request, be it through the UI or through the application's macro language.

The test environment is a directory c:\temp\foo bar which contains write.exe (copied from the Windows system directory) and two text files, one of them with a blank in its filename.

Now we open a DOS shell:

C:\>dir c:\temp\foo bar
 Volume in drive C is IBM_PRELOAD
 Volume Serial Number is C081-0CE2

 Directory of c:\temp

File Not Found

 Directory of C:\

File Not Found

C:\>dir "c:\temp\foo bar"
 Volume in drive C is IBM_PRELOAD
 Volume Serial Number is C081-0CE2

 Directory of c:\temp\foo bar

03/18/2006  03:08 PM    <DIR>          .
03/18/2006  03:08 PM    <DIR>          ..
01/24/2006  11:19 PM             1,516 foo bar.txt
01/24/2006  11:19 PM             1,516 foo.txt
03/17/2006  09:44 AM             5,632 write.exe
               3 File(s)          8,664 bytes
               2 Dir(s)  17,448,394,752 bytes free

Note that we had to quote the pathname to make the DIR command work. Nothing unusual here; quoting is a fact of life for anyone out there who ever used a DOS or UNIX shell.

Trying to start write.exe by entering c:\temp\foo bar\write.exe in the DOS shell fails; again, we need to quote:

C:\>"c:\temp\foo bar\write.exe"

And if we want to load foo bar.txt into the editor, we need to quote the filename as well:

C:\>"c:\temp\foo bar\write.exe" "c:\temp\foo bar\foo bar.txt"

Still no surprises here.

But let's suppose we want to run an arbitrary command from our application rather than from the command prompt. The C runtime library provides the system() function for this purpose. It is well-known that under the hood system actually runs cmd.exe to do its job.

#include <stdio.h>
#include <process.h>

int main(void)
{
  char *exe = "c:\\temp\\foo bar\\write.exe";
  char *path = "c:\\temp\\foo bar\\foo bar.txt";

  char cmdbuf[1024];
  _snprintf(cmdbuf, sizeof(cmdbuf), "\"%s\" \"%s\"", exe, path);

  int ret = system(cmdbuf);
  printf("system(\"%s\") returns %d\n", cmdbuf, ret);
  return 0;
}

When running this code, it reports that system() returned 0, and write.exe never starts, even though we quoted both the name of the executable and the text file name.

What's going on here? system() internally runs cmd.exe like this:

  cmd.exe /c "c:\temp\foo bar\write.exe" "c:\temp\foo bar\foo bar.txt"

Try entering the above in the command prompt: No editor to be seen anywhere! So when we run cmd.exe programmatically, apparently it parses its input differently than when we use it in an interactive fashion.

I remember this problem drove me up the freakin' wall when I first encountered it roughly two years ago. With a lot of experimentation, I found the right magic incantation:

  _snprintf(cmdbuf, sizeof(cmdbuf), "\"\"%s\" \"%s\"\"", exe, path);
  // originally: _snprintf(cmdbuf, sizeof(cmdbuf), "\"%s\" \"%s\"", exe, path);

Note that I quoted the whole command string another time! Now the executable actually starts. Let's verify this in the command prompt window: Yes, something like cmd.exe /c ""c:\temp\foo bar\write.exe" "c:\temp\foo bar\foo bar.txt"" does what we want.
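The quoting rule can be wrapped in a small helper. This is only a sketch under a few assumptions: BuildQuotedCommand is a name made up here, it uses the standard C99 snprintf (which always terminates the buffer) rather than _snprintf, and it simply rejects paths that already contain a double quote, because this scheme cannot represent them.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Builds ""exe" "arg"" in buf: each path is quoted, and the whole
   command is wrapped in one additional pair of quotes.
   Returns 1 on success, 0 if the result did not fit into buf or if
   one of the paths already contains a double quote. */
int BuildQuotedCommand(char *buf, size_t bufsize,
                       const char *exe, const char *arg)
{
  assert(buf != NULL && exe != NULL && arg != NULL);

  if (strchr(exe, '"') != NULL || strchr(arg, '"') != NULL) {
    return 0;
  }
  int n = snprintf(buf, bufsize, "\"\"%s\" \"%s\"\"", exe, arg);
  return n >= 0 && (size_t)n < bufsize;
}
```

The resulting string is what would be handed to system(); checking the return value avoids passing a silently truncated command line to cmd.exe.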

I was reminded of this weird behavior when John Scheffel, long-time user of our flagship product OneSpace Designer Modeling and maintainer of the international CoCreate user forum, reported funny quoting problems when trying to run executables from our app's built-in Lisp interpreter. John also found the solution and documented it in a Lisp version.

Our Lisp implementation provides a function called sd-sys-exec, and you need to invoke it thusly:

(setf exe "c:/temp/foo bar/write.exe")
(setf path "c:/temp/foo bar/foo bar.txt")
(oli:sd-sys-exec (format nil "\"\"~A\" \"~A\"\"" exe path))

Kudos to John for figuring out the Lisp solution. Let's try to decipher all those quotes and backslashes in the format statement.

Originally, I modified his solution slightly by using ~S instead of ~A in the format call and thereby saving one level of explicit quoting in the code:

  (oli:sd-sys-exec (format nil "\"~S ~S\"" exe path))

This is much easier on the eyes, yet I overlooked that the ~S format specifier not only produces enclosing quotes, but also escapes any backslash characters in the argument that it processes. So if path contains a backslash (not quite unlikely on a Windows machine), the backslash will be doubled. This works surprisingly well for some time, until you hit a UNC path which already starts with two backslashes. As an example, \\backslash\lashes\back turns into \\\\backslash\\lashes\\back, which no DOS shell will be able to grok anymore.
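To make the effect of ~S visible outside Lisp, here is a rough C imitation of what it does to a string argument: wrap it in double quotes and put a backslash in front of every backslash and double quote. PrintLikeTildeS is a hypothetical name, and the sketch covers only plain strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Imitates how ~S prints a string: enclosing double quotes, plus a
   backslash before each backslash or double quote.
   Returns 1 on success, 0 if out is too small. */
int PrintLikeTildeS(const char *in, char *out, size_t outsize)
{
  assert(in != NULL && out != NULL);

  if (outsize < 3) {
    return 0;  /* need room for two quotes and the terminator */
  }
  size_t o = 0;
  out[o++] = '"';
  for (const char *p = in; *p != '\0'; p++) {
    size_t need = (strchr("\\\"", *p) != NULL) ? 2 : 1;
    if (o + need + 2 > outsize) {
      return 0;  /* keep room for the closing quote and terminator */
    }
    if (need == 2) {
      out[o++] = '\\';
    }
    out[o++] = *p;
  }
  out[o++] = '"';
  out[o] = '\0';
  return 1;
}
```

Feeding it the UNC path \\backslash\lashes\back yields "\\\\backslash\\lashes\\back", including the enclosing quotes, which matches the doubling described above.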

John spotted this issue as well. Maybe he should be writing these blog entries, don't you think? :-)

From those Lisp subtleties back to the original problem: I never quite understood why the extra level of quoting is necessary for cmd.exe, but apparently, others have been in the same mess before. For example, check out this XEmacs code to see how complex correct quoting can be. See also an online version of the help pages for CMD.EXE for more information on the involved quoting heuristics applied by the shell.

PS: A very similar situation occurs in OneSpace Designer Drafting as well (which is our 2D CAD application). To start an executable write.exe in a directory c:\temp\foo bar and have it open the text file c:\temp\foo bar\foo bar.txt, you'll need macro code like this:

LET Cmd '"C:\temp\foo bar\write.exe"'
LET File '"C:\temp\foo bar\foo bar.txt"'
LET Fullcmd (Cmd + " " + File)
LET Fullcmd ('"' + Fullcmd + '"')  { This is the important line }
RUN Fullcmd

Same procedure as above: If both the executable's path and the path of the data file contain blank characters, the whole command string which is passed down to cmd.exe needs to be enclosed in an additional pair of quotes...

PS: See also http://blogs.msdn.com/b/twistylittlepassagesallalike/archive/2011/04/23/everyone-quotes-arguments-the-wrong-way.aspx and http://daviddeley.com/autohotkey/parameters/parameters.htm


When asked for a TWiki account, use your own or the default TWikiGuest account.
http://xkcd.com/1638/

-- ClausBrod - 27 Mar 2016


Blame CoCreate, for instance (14 Feb 2006)

The company I work for is called CoCreate. The name was chosen because the company's mission is all about collaboratively creating things. That's all nice and dandy, but I guess the team who picked the name didn't include a programmer, and so they overlooked something pretty obvious which causes mild confusion every now and then.

Most programmers, when confronted with our company name, think of COM. After all, one of the most important functions in all of the COM libraries prominently displays our company name: CoCreateInstance. Now, if a programmer thinks about COM (and hence software) when she hears about us, that's probably fine, because, after all, we're in the business to make and sell software.

However, customers are not necessarily that technology-savvy, nor should they have to be.

A while ago, a customer complained that our software was sloppy because it wouldn't uninstall itself properly and leave behind traces in the system. Our installer/uninstaller tests didn't seem to confirm that. So we asked the customer why he thought we were messing with his system. "Well", he said, "even after I uninstall your stuff, I still get those CoCreate error messages."

The customer sent a screenshot - it showed a message box, displayed by an application which shall remain unnamed, saying that "CoCreateInstance failed" and mumbling some COM error codes!

It took us a while to explain to the customer that no, we did not install this CoCreateInstance thing on the system, and that it is a system function, and if we actually tried to uninstall it along with our application as he requested (kind of), he wouldn't be terribly happy with his system any longer, and that the other app was actually trying to report to the customer that it had found a problem with its COM registration, and that this should be looked after, not our uninstaller. Phew.

Now if only we had the time-warping powers of the publishers of "The Hitchhiker's Guide To The Galaxy", we'd send our company marketing materials back in time to before Microsoft invented COM, and then sue the living daylights out of them. Well, if we were evil, that is :-D

My memory took a little longer to swap back in, but while writing the above, it dawned on me that this incident wasn't the only one of its kind: Somebody had upgraded to a new PC and installed all applications except CoCreate's. Then, while syncing to his Palm Pilot, he got an "OLE CoCreateInstance Failed" error message, and started to search high and low on his shiny new PC for traces of CoCreate applications or components.

Puzzled, he posted to a newsgroup, and I replied with tongue-in-cheek:

Let me explain: When we kicked off CoCreate as a company, we sat together and thought about awareness strategies for the new company. So we called our buddies from Microsoft and asked them to name some API functions after us, and in exchange we would port our software to Windows NT. Neat scheme, and as you discovered on your system, the cooperation between the two companies worked just fine.

[... skipping explanation of the technical issue and hints on how to fix registry issue on the system ...]

The next step for CoCreate towards world domination will be to talk to some of our buddies in, say, Portugal, and offer them to develop a Portuguese version of our application if they name their country after us.

Would I get away with a response like this if I was a support engineer? Maybe not. One more thing to like about being a software developer ;-) (Everybody in the newsgroup had a good chuckle back then.)


Environmental unconsciousness (04 Feb 2006)

Chances are that - by looking at my earlier blog entry on batch files - you think I'm a DOS lamer. Nothing could be further from the truth, because I'm really a UNIX lamer. (OK, so what really shaped my thinking even before that was the phrase "38911 bytes free". But I digress.)

So I still write little one-off scripts using bash, typically in a Cygwin environment. One of these scripts recently ran berserk, reporting lots of errors like this one:

  ./foo.sh: line 42: /usr/bin/find: Resource temporarily unavailable

I couldn't really figure out what resources the shell was talking about. Memory? Certainly not - the test system had ample memory, and was hardly using any. Files or disk space? Nope, lots of free disk space everywhere, and no one was fighting over access to shared files or so. Too many processes? Process Explorer wouldn't think so. Hmmm...

This test script then revealed the truth:

typeset -i limit=2200

# Create a file with 2200 environment variable definitions
rm -f exportlist
typeset -i i=0
while [ $i -lt $limit ]
do
  echo "export FOO$i=$i" >>exportlist
  let i=i+1
done

# Import the environment definitions
source ./exportlist 

# Are we still alive?
env | wc
find . -name exportlist

Run this script and watch it balk miserably about unavailable resources. So it's the environment which filled up and caused my scripts to fail! And indeed, the system for which the problem was originally reported uses a lot of environment variables for some reason, and this broke my script.

Once I had found out that much, it was easy to google for the right search terms and learn more: In this Cygwin mailing list discussion, Mike Sieweke explains that we are actually suffering from a Windows limitation here - apparently, the environment cannot grow larger than 32K. Christopher Faylor, chief maintainer of Cygwin, even recommends a workaround, but I haven't tested that one yet; instead, I helped to clean up the polluted environment on the affected PC, and henceforth, no more Waldsterben (forest dieback) on that system.
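To get a feeling for how close an environment is to that limit, its size can be estimated by adding up the NAME=VALUE strings. The sketch below assumes the usual block layout (each string followed by a NUL, plus one final NUL) and counts characters; EnvBlockSize is a made-up name, and whether the 32K limit mentioned in the discussion is measured in exactly this way is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Estimates the size of an environment block: every NAME=VALUE
   string plus its terminating NUL, plus one NUL that ends the block.
   env must be a NULL-terminated array of strings. */
size_t EnvBlockSize(char *const *env)
{
  assert(env != NULL);

  size_t total = 1;  /* final NUL terminating the whole block */
  for (; *env != NULL; env++) {
    total += strlen(*env) + 1;
  }
  return total;
}
```

By this count, the 2200 variables FOO0 to FOO2199 created by the test script above add about 26 KB on their own, so together with whatever the system already defines, the 32K mark is within easy reach.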

32K - this would have filled almost all of those 38911 memory bytes assigned for BASIC programs on my good ol' 64...


Audio kills the video star (29 Jan 2006)

Movie podcasts are the next big thing after blogging and podcasting. For the time being, I'll stick to my blog, thanks very much for asking, but I do listen to a lot of podcasts while commuting or exercising. Occasionally, I also watch some of the Channel 9 videos where Microsoft engineers and employees talk about their work. No matter what you think about the company in general, everybody knows that Microsoft hires smart people, so there is a lot to learn from them.

Many of those videos contain demos or at least feature casually-dressed geeks scribbling frantically on whiteboards, which, of course, is a must-see (ahem). But quite a few videos could be enjoyed almost just as well in pure audio format. Unfortunately, most of the Channel 9 content is in video format (*.wmv) only, which will neither fit nor play on my 512 MB MP3 player.

I'm pretty much a newbie in all things video, and so I was glad that Minh Truong suggested a way to convert WMV to WMA using Windows Media Encoder.

This actually works fine, but it's a lot of settings to remember (see the screenshots below), and it produces WMA instead of MP3 or OGG format which I'd prefer.

Fortunately, I found that Windows Media Encoder actually ships with a script called WMCmd.vbs which takes a gazillion parameters and automates the conversion process! And indeed, the following trivial command line produces a WMA audio file from a WMV video:

cd c:\Program Files\Windows Media Components\Encoder
cscript.exe WMCmd.vbs -input c:\temp\foo.wmv -output c:\temp\foo.wma -audioonly

There are a number of options to control the quality and encoding of the output which I haven't explored at all.

So now I only need to find a reasonable WMA-to-MP3 converter which can be used from the command line. batchenc and dBpowerAMP Music Converter look like they could help with that part of the job, but I'm not sure. Sounds like I have a plan for next weekend :-D


Batch-as-batch-can! (27 Jan 2006)

So here I confess, not without a certain sense of pride: Sometimes I boldly go where few programmers like to go - and then I write a few lines in DOS batch language.

Most of the time, it's not as bad as many people think. Its bad reputation mostly stems from the days of DOS and Windows 95, but since the advent of Windows NT, the command processor has learnt quite a few new tricks. While its syntax remains absurd enough to drive some programmers out of their profession, you can now actually accomplish most typical scripting tasks with it. In particular, the for statement is quite powerful.

Anyway - a while ago, one of my batch files started to act up. The error message was "The system cannot find the batch label specified - copyfile". The batch file in question had a structure like this:

@echo off
rem copy all pdb files in the current directory into a backup directory
set pdbdir=c:\temp\pdbfiles

for /r %%c in (*.pdb) do call :copyfile "%%c" %pdbdir%
if errorlevel 1 echo Error occurred while copying pdb files

echo All pdb files copied.
goto :eof

rem copyfile subroutine
:copyfile
  echo Copying %1 to %2...
  copy /Y %1 %2 >nul
  goto :eof

I know what you're thinking - no, this is not the problem. This is how you write subroutines in DOS batch files. Seriously. And yes, the above script can of course be replaced by a single copy command. The original script couldn't; it performed a few extra checks for each and every file to be copied in the :copyfile subroutine, but it also contained a lot of extra fluff which distracts from the actual problem, so what you're seeing here is a stripped-down version.

The error message complained that the label copyfile could not be found. Funny, because the label is of course there. (The leading colon identifies it as a label.) And in fact, the very same subroutine could be called just fine from elsewhere in the same batch file!

For debugging, I removed the @echo off statement so that the command processor would log all commands it executes; this usually helps to find most batch file problems. But not this one - removing the echo "fixed" the bug. I added the statement again - now I got the error again. Removed the echo statement - all is fine.

Oh great. It's a Heisenbug. So I added the echo statement back in again and stared at the script hoping to find the problem by the old-fashioned method of "flash of inspiration".

No inspiration in sight, though. Not knowing what to do, I added a few empty lines between the for and the if errorlevel statement and ran the script again - no error message! Many attempts later, I concluded that it's the sheer length of the script file which made the difference between smooth sailing and desperation. By the way, the above demo script works just fine, of course, because I stripped it down for publication.

Google confirmed my suspicion: Apparently, there are cases where labels cannot be found even though they are most certainly in the batch file. Maybe the length of the label names matters - Microsoft Knowledge Base Article 63071 suggests that only the first eight characters of the label are significant. However, copyfile has exactly eight characters!

I still haven't solved this puzzle. If you're a seasoned batch file programmer sent to this place by Google and can shed some light on this, I could finally trust that script again...

-- ClausBrod - 27 Jan 2006


"How bad is the Windows command line really?"

-- ClausBrod - 01 Apr 2016

Thanks a lot, Reinder!

-- ClausBrod - 05 Apr 2015

From http://help.wugnet.com/windows/system-find-batch-label-ftopict615555.html, I tentatively conclude that you need two preconditions for this to hit you:

  • the batch file must not use CRLF line endings
  • the label you jump to must span a block boundary

As to your remark "And in fact, the very same subroutine could be called just fine from elsewhere in the same batch file": in my experience, the subroutine gets called just fine when you get this error.

Regards,

Reinder

-- Reinder - 20 Jun 2008
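
The two preconditions from the comment above can be checked mechanically. The following is only a rough sketch: the helper names are made up for illustration, and the block size is an assumption passed in by the caller, since the buffer size the command processor actually uses is not stated anywhere in this discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// True if the text contains a line feed which is not preceded by a
// carriage return, i.e. at least one line does not end in CRLF.
bool hasBareLineFeed(const std::string& text)
{
  for (std::size_t i = 0; i < text.size(); ++i) {
    if (text[i] == '\n' && (i == 0 || text[i - 1] != '\r')) return true;
  }
  return false;
}

// Returns all label lines (":name") whose bytes straddle a multiple of
// blockSize. blockSize is a hypothetical parameter, not a documented value.
std::vector<std::string> labelsSpanningBlocks(const std::string& text,
                                              std::size_t blockSize)
{
  assert(blockSize > 0);
  std::vector<std::string> hits;
  std::size_t start = 0;
  while (start < text.size()) {
    std::size_t end = text.find('\n', start);
    if (end == std::string::npos) end = text.size();
    std::string line = text.substr(start, end - start);
    if (!line.empty() && line.back() == '\r') line.pop_back();
    // A single leading colon marks a label; "::" is commonly used for comments.
    if (line.size() > 1 && line[0] == ':' && line[1] != ':') {
      std::size_t last = start + line.size() - 1;
      if (start / blockSize != last / blockSize) hits.push_back(line);
    }
    start = end + 1;
  }
  return hits;
}
```

Reading the batch file in binary mode and passing its contents to both functions would show whether a script is a candidate for this problem, under the assumed block size.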


It's official: Microsoft's compiler is twice as good as the HP-UX compiler! (24.1.2006)

A few days ago, I dissed good ol' aCC on the HP-UX platform, but for political correctness, here's an amusing quirk in Microsoft's compiler as well. Consider the following code:

typedef struct foobar gazonk;
struct gazonk;

The C++ compiler which ships with VS.NET 2003 is quite impressed with those two lines:

  fatal error C1001: INTERNAL COMPILER ERROR
  (compiler file 'msc1.cpp', line 2701)
  Please choose the Technical Support command on the Visual C++
  Help menu, or open the Technical Support help file for more information

What the compiler really wants to tell me is that it does not want me to redefine gazonk. The C++ compiler in VS 2005 gets this right.

If you refer back to the previous blog entry, you'll find that it took me only one line to crash aCC on HP-UX. It took two lines in the above example to crash Microsoft's compiler. Hence, I conclude that their compiler is only half as bad as the HP-UX compiler.

If you want to argue with my reasoning, let me tell you that out there in the wild, I have rarely seen a platform-vs-platform discussion based on facts which were any better than that. Ahem... :-D



Getting organized (21.1.2006)

I've been meaning to install and tame Minimo on my PDA for some time. Finally, I can tap my way through my first blog posting from this cute little browser. I had to switch off the SSR (Small Screen Rendering) feature, but now, at last, my PocketPC displays web pages in a manner that is suitable for human consumption. I never understood why IE, after quite a number of releases of the Windows Mobile/CE platform, still sucks that badly as a browser.

Typing with the stylus is a pain in the youknowwhere, so I probably won't be blogging from my PDA that often :-D But still, I love the Minimo browser, even though it is in its early infancy. It seems to do a much better job at displaying most web sites than IE; it groks the CSS-based layout of my own web site; it makes use of the VGA screen on my PDA; and I can now even use TWiki's direct editing facilities from my organizer. This project really makes me wonder what it would take to start developing for the PocketPC platform...

And then there is Rory Blyth's Tiny Things podcast which is also whetting my appetite for those little gadgets. But even if you couldn't care less about mobile platforms, I hereby guarantee that listening to the intro section of each of the episodes will give you a good chuckle. Well, let's say that I'll guarantee a chuckle only if you promise that you won't sue me for making such potentially groundless claims. You'll know what I mean once you've listened in to Rory's show.

Man, I long for a real keyboard now.

-- ClausBrod - 21 Jan 2006



A syntax error in a comment, or: The case of the vanishing parenthesis (10 Jan 2006)

The other day, Visual C++ would not compile this code, reporting lots and lots of errors:

  if (!strncmp(text, "FOO", 3)) 
  {
    foobar(text); //üüü 
  }
  else 
  {
    gazonk();
  }

Now this was funny because that code had not changed in ages, and so far had compiled just fine. At first, I couldn't explain what was going on. Hmmmm... note the funny u-umlauts in the comment. Why would someone use a comment like that? Well, the above code was inherited from source code originally written on an HP-UX system. For long years, the default character encoding on HP-UX systems has been Roman8. In that encoding, the above comment looked like this:

    foobar(text); //■■■

(If your browser cannot interpret the Unicode codepoint U+25A0, it represents a filled box.)

So the original programmer used this special character for graphically highlighting the line. In Roman8, the filled box has a character code of 0xFC. On a Windows system in the US or Europe, which defaults to displaying characters according to ISO8859-1 (aka Latin1), 0xFC will be interpreted as the German u-umlaut ü.
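
To make the two readings of the same byte concrete, here is a tiny sketch. It only knows the facts mentioned above (0xFC is the filled box in Roman8 and ü in Latin1); the function name and the U+FFFD fallback for unmapped Roman8 bytes are assumptions of this example, not a complete conversion table.

```cpp
#include <cassert>

enum class Encoding { Roman8, Latin1 };

// Returns the Unicode code point for a single byte in the given encoding.
// Only the mappings discussed in the text are implemented; any other
// Roman8 byte above 0x7F yields U+FFFD (replacement character) here.
char32_t decodeByte(unsigned char byte, Encoding enc)
{
  if (byte < 0x80) return byte;                 // ASCII range is shared
  if (enc == Encoding::Latin1) return byte;     // ISO 8859-1 maps 1:1 to U+0080..U+00FF
  if (byte == 0xFC) return 0x25A0;              // Roman8: filled box
  return 0xFFFD;                                // not covered by this sketch
}
```

So decodeByte(0xFC, Encoding::Roman8) gives U+25A0 (the box), while decodeByte(0xFC, Encoding::Latin1) gives U+00FC (ü): same byte, different character.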

So far, so good, but why the compilation errors?

On the affected system, I ran the code through the C preprocessor (cpp), and ended up with this preprocessed version:

  if (!strncmp(text, "FOO", 3)) 
  {
    foobar(text);
  else 
  {
    gazonk();
  }

Wow - the preprocessor threw away the comment, as expected, but also the closing brace } on the next line! Hence, the braces in the code are now unbalanced, which the compiler complains bitterly about.

But why would the preprocessor misbehave so badly on this system? Shortly before, I had installed the Windows multi-language UI pack (MUI) to run tests in Japanese; because of that, the system defaulted to a Japanese locale. In the default Japanese locale, Windows assumes that all strings are encoded according to the Shift-JIS standard, which is a multi-byte character set (MBCS).

Shift-JIS tries to tackle the problem of representing the several thousands of Japanese characters. The code positions 0-127 are identical with US ASCII. In the range from 128-255, some byte values indicate "first byte of a two-byte sequence" - and 0xFC is indeed one of those indicator bytes.

So the preprocessor reads the line until it finds the // comment indicators. The preprocessor changes into "comment mode" and reads all characters until the end of the line, only to discard them. (The compiler doesn't care about the comments, so why bother it with them?)

Now the preprocessor finds the first 0xFC character, and - according to the active Japanese locale - assumes that it is the first byte of a two-byte character. Hence, it reads the next byte (also 0xFC, the second "box"), converts the sequence 0xFC 0xFC into a Japanese Kanji character, and throws that character away. Then the next byte is read, which again is 0xFC (the third "box" in the comment), and so the preprocessor will slurp another byte, interpreting it as the second byte of a two-byte character.

But the next byte in the file after the third "box" is a 0x0A, i.e. the line-feed character which indicates the end of the line. The preprocessor reads that byte, forms a two-byte character from it and its predecessor (0xFC), discards the character - and misses the end of the line.

The preprocessor doesn't have a choice now but to continue searching for the next LF, which it finds in the next line, but only after the closing brace. Which is why that closing brace never makes it to the compiler. Hocus, pocus, leavenotracus.
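
The effect can be reproduced with a toy comment stripper. This is a simplified model, not the real preprocessor: stripLineComments and isLeadByte are made-up names, the lead-byte ranges are those of Windows code page 932 (0x81-0x9F and 0xE0-0xFC), and the model assumes, like the description above, that a lead byte always swallows the byte which follows it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Lead bytes of a two-byte sequence in code page 932 (Shift-JIS).
static bool isLeadByte(unsigned char c)
{
  return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

// Removes "//" comments up to (but not including) the terminating LF.
// With multiByte set, a lead byte consumes the following byte as well,
// even if that byte happens to be the LF which ends the line.
std::string stripLineComments(const std::string& src, bool multiByte)
{
  std::string out;
  std::size_t i = 0;
  while (i < src.size()) {
    if (src[i] == '/' && i + 1 < src.size() && src[i + 1] == '/') {
      i += 2;
      while (i < src.size() && src[i] != '\n') {
        if (multiByte && isLeadByte(static_cast<unsigned char>(src[i]))) {
          i += 2;   // lead byte plus whatever follows it
        } else {
          i += 1;
        }
      }
    } else {
      out += src[i++];
    }
  }
  return out;
}
```

Feeding it the bytes "  foobar(text); //\xFC\xFC\xFC\n  }\n  else\n" keeps the line with the closing brace in single-byte mode, but drops it in multi-byte mode because the third 0xFC eats the line feed.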

So special characters in comments are not a particularly brilliant idea; not just because they might be misinterpreted (in our case, displayed as ü instead of the originally intended box), but because they can actually cause the compiler to fail.

If you think this could only happen in a Roman8 context, consider this variation of the original code:

  if (!strncmp(text, "MENU", 4)) 
  {
    display_ui(text); //Menü
  } 
  else 
  {
    gazonk();
  }

Here, we're simply using the German translation for menu in the comment; we're not even trying to be "graphical" and draw boxes in our comments. But even this is enough to cause the same compilation issue as with my original example.

Now, in my particular case, the affected code isn't likely to be compiled in Japan or China anytime soon, except in that non-standard situation when I performed my experiments with the MUI pack and a Japanese UI. But what if your next open-source project attracts hundreds of volunteers around the world who want to refine the code, and some of those volunteers happen to be from Japan? If you're trying to be too clever (or too patriotic) in your comments, they might have to spend more time on finding out why the code won't compile than on adding new features to your code.



To cut a long filename short, I lost my mind (8.1.2006)

Yesterday, I explained how easy it is to inadvertently load the same executable twice into the same process address space - you simply run it using its short DOS-ish filename (like Sample~1.exe) instead of its original long filename (such as SampleApplication.exe). For details, please consult the original blog entry.

I mentioned that one fine day I might report how exactly this happened to us, i.e. why in the world our app was started using its short filename. Seems like today is such a fine day :-)

Said application registers itself as a COM server, and it does so using the services of the ATL Registrar. Upon calling RegisterServer, the registrar will kindly create all the required registry entries for a COM server, including the LocalServer entry which contains the path and filename of the server. Internally, this will call the following code in atlbase.h:

inline HRESULT WINAPI CComModule::UpdateRegistryFromResourceS(UINT nResID, 
  BOOL bRegister, struct _ATL_REGMAP_ENTRY* pMapEntries)
{
   USES_CONVERSION;
   ATL::CRegObject ro;
   TCHAR szModule[_MAX_PATH];
   GetModuleFileName(_pModule->GetModuleInstance(), szModule, _MAX_PATH);

   // Convert to short path to work around bug in NT4's CreateProcess
   TCHAR szModuleShort[_MAX_PATH];
   GetShortPathName(szModule, szModuleShort, _MAX_PATH);
   LPOLESTR pszModule = T2OLE(szModuleShort);
   ...

Aha! So ATL deliberately converts the module name (something like SampleApplication.exe) into its short-name equivalent (Sample~1.exe) to work around an issue in the CreateProcess implementation of Windows NT. MSKB:179690 describes this problem: CreateProcess could not always handle blanks in pathnames correctly, and so the ATL designers had to convert the path into its short-path version which converts everything into an 8+3 filename and hence guarantees that the filename contains no blanks.
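
Why are blanks such a headache in the first place? When a program path is passed as part of an unquoted command line, every blank is a possible end of the program name. The sketch below simply lists those candidate interpretations. It is an illustration of the general ambiguity only, not a reproduction of the specific NT4 bug from MSKB:179690, and candidateProgramPaths is a made-up helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Lists every prefix of an unquoted command line which could be taken
// for the program path: one candidate per blank, plus the full string.
std::vector<std::string> candidateProgramPaths(const std::string& commandLine)
{
  std::vector<std::string> candidates;
  for (std::size_t i = 1; i < commandLine.size(); ++i) {
    // A blank (not preceded by another blank) may terminate the program name.
    if (commandLine[i] == ' ' && commandLine[i - 1] != ' ') {
      candidates.push_back(commandLine.substr(0, i));
    }
  }
  candidates.push_back(commandLine);
  return candidates;
}
```

For c:\Program Files\My App\app.exe this yields three candidates, starting with c:\Program. A short path such as c:\PROGRA~1\MYAPP~1\app.exe contains no blanks, so only one interpretation remains, which is what the ATL workaround relies on.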

Adding insult to injury, MSKB:201318 shows that this NT-specific bug fix in ATL has a bug itself... and, of course, our problem is, in fact, caused by yet another bug in the bug fix (see earlier blog entry).

For my application, the first workaround was to use a modified version of atlbase.h which checks the OS version; if it is Windows 2000 or later, no short-path conversion takes place. Under Windows NT, however, we're caught in a pickle: Either we use the original ATL version of the registration code and thus map the executable twice into the address space, or we apply the same fix as for Windows 2000, and will suffer from the bug in CreateProcess if the application is installed in a path which has blanks in the pathname.

In my case, this was not a showstopper issue because the application is targeting Windows 2000 and XP only, so I simply left it at that.

Another approach is to use the AddReplacement and ClearReplacements APIs of the ATL registrar to set our own conversion rules for the module name and thereby override ATL's own rules for the module name:

#include <atlbase.h>
#include <statreg.h>

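void RegisterServer(const wchar_t *widePath, bool reg)
{
  ATL::CRegObject ro;
  // Substitute the full (long) path for the Module placeholder
  // in the registry script, instead of ATL's short-path version
  ro.AddReplacement(L"Module", widePath);
  // IDR_REGISTRY is the resource ID of the project's registry script
  if (reg)
    ro.ResourceRegister(widePath, IDR_REGISTRY, L"REGISTRY");
  else
    ro.ResourceUnregister(widePath, IDR_REGISTRY, L"REGISTRY");
}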



Doppelgänger modules, or The Curse of EightPlusThree (7.1.2006)

Windows, even in its latest incarnations, still exhibits quite a bit of quirky behavior which is due to its DOS roots, or at least due to the attempt to remain compatible with code which was created for DOS. Most of the time, I am not even surprised anymore when I come across 16-bit limitations or similar reminiscences of the past. But sometimes, I only become aware of them when my code crashes.

This happened some time ago with an application I am working on. When I started the app in a certain way, it would simply crash very early during startup. It took a while to break this down into the following trivial code example which consists of a main executable and a DLL which is loaded into the executable via LoadLibrary, i.e. dynamically. Here is the code for the main executable, SampleApp.cpp:

#include <stdio.h>
#include <conio.h>
#include <windows.h>
#include <psapi.h>

static void EnumModules(const char *msg)
{
  printf("\n==========================================================\n");
  printf("List of modules in the current process %s:\n", msg);

  HMODULE hMods[1024];
  DWORD cbNeeded;
  HANDLE hProcess = GetCurrentProcess();

  // inquire modules loaded into process
  if( EnumProcessModules(hProcess, hMods, sizeof(hMods), &cbNeeded)) {
    // print name and handle for each module
    for ( unsigned int i = 0; i < (cbNeeded/sizeof(HMODULE)); i++ ) {
      char szModName[MAX_PATH];
      if ( GetModuleFileNameEx( hProcess, hMods[i], szModName, sizeof(szModName))) {
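        // %p prints the module handle correctly on 32-bit and 64-bit builds
        printf("  %s (%p)\n", szModName, (void *)hMods[i] );
      }
    }
  }

  // GetCurrentProcess() returns a pseudo handle which need not be closed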
}

extern "C" __declspec(dllexport) int functionInExe(void)
{
  printf("Now in functionInExe()\n");
  return 42;
}

int main(void)
{
  EnumModules("before loading DLL");
  HMODULE hmod = LoadLibrary("SampleDLL.dll");
  EnumModules("after loading DLL");

  printf("\nPress key to exit.\n");
  _getch();
  return 0;
}

This code loads a DLL called SampleDLL.dll. Before and after loading the DLL, it enumerates the modules which are currently loaded into the process; this is only to demonstrate the effect which led to the crash in the other app I was working on.

SampleDLL.dll is built from this code (SampleDLL.cpp):

extern "C" __declspec(dllimport) int functionInExe(void);

extern "C" __declspec(dllexport) void gazonk(void)
{
  int i = functionInExe();
}

The main executable exports a function called functionInExe, and the DLL calls this function, and so it has an explicit reference to the main executable which the linker needs to resolve. This is an important piece of the puzzle.

And here is a simple makefile which shows how to build the two modules:

all: SampleApplication.exe SampleDLL.dll

clean:
   del *.obj *.exe *.dll

SampleApplication.exe: SampleApp.obj
   link /debug /out:SampleApplication.exe SampleApp.obj psapi.lib

SampleApp.obj: SampleApp.cpp
   cl /Zi /c SampleApp.cpp

SampleDLL.dll: SampleDLL.obj
   link /debug /dll /out:SampleDLL.dll SampleDLL.obj SampleApplication.lib

SampleDLL.obj: SampleDLL.cpp
   cl /Zi /c SampleDLL.cpp

Let's assume that the above files (SampleApp.cpp, SampleDLL.cpp and makefile) are all in a directory c:\temp\dupemod, and that we built the code by running nmake in that directory. Now let's run the code as shown in the screenshots below.

[Screenshots: the program started via its long filename (left) and via its short filename (right)]


Take a close look at the command shell window on the right: After loading the DLL, the process maps both c:\temp\dupemod\Sample~1.exe and c:\temp\dupemod\SampleApplication.exe into its address space. Both refer, of course, to the same file, which means that we have loaded the executable twice!

This happens only if we run the executable using its 8+3 DOS name, i.e. Sample~1.exe. When run with a long filename, everything works as expected. Here is what seems to be going on: When we load SampleDLL.dll, the OS loader tries to resolve the references which this DLL makes to other modules. One of those modules is SampleApplication.exe. The OS loader should be able to map this reference to the instance of the executable which is already mapped into the address space. However, it seems that the OS loader cannot figure out that Sample~1.exe and SampleApplication.exe are actually the same file, and therefore loads another instance of the executable!
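
The observed behavior is consistent with a loader which recognizes already-loaded modules by name only. The toy model below illustrates that idea. It is pure speculation about the mechanism: ToyLoader and modulesAfterStartup are invented for this example, and nothing here claims to reflect how the Windows loader is really implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy module list which identifies modules purely by their name string.
class ToyLoader {
public:
  // Maps the module unless a module with the same name is already loaded.
  void load(const std::string& name)
  {
    for (std::size_t i = 0; i < modules_.size(); ++i) {
      if (modules_[i] == name) return;   // already mapped, reuse it
    }
    modules_.push_back(name);
  }
  std::size_t count() const { return modules_.size(); }
private:
  std::vector<std::string> modules_;
};

// Simulates starting the executable under the name startedAs and then
// loading SampleDLL.dll, which imports from the executable importedExe.
std::size_t modulesAfterStartup(const std::string& startedAs,
                                const std::string& importedExe)
{
  ToyLoader loader;
  loader.load(startedAs);        // the process image itself
  loader.load("SampleDLL.dll");  // LoadLibrary call in main()
  loader.load(importedExe);      // resolving the DLL's import reference
  return loader.count();
}
```

Started as SampleApplication.exe, the model ends up with two modules. Started as Sample~1.exe, the import reference to SampleApplication.exe does not match any loaded name, and the model ends up with three.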

BTW, this happens both on Windows 2000 and Windows XP systems. In this trivial example, the only damage done is probably just that the main executable consumes twice the virtual address space. In a large application, the consequences can be more severe, and in our case they were. Microsoft also documents some effects of this issue in Knowledge Base articles, for example KB218475 and KB193513.

The only workarounds I see are:

  • Rename the executable so that it uses a name which fits into the 8+3 format.
  • Make sure that nobody ever runs the executable using its short name.

We basically went with the latter approach - by making sure that end users always run the application by double-clicking shortcuts which contain the full executable name, and by fixing an interesting related bug in ATL which, hopefully, I may have the time and the nerves to describe in more detail one fine day...





r1.20 - 08 Oct 2017 - 21:18 - ClausBrod


Copyright © 1999-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.