Getting rid of nul, or: How I learnt to love UNC (29 Apr 2006)

Every now and then, some tool on my system runs berserk and starts to generate files called nul. This is a clear indication that there's something going wrong with output redirection in a script, but I still have to figure out exactly what's going on. Until then, I need at least a way to get rid of those files.

Yes, that's right, you cannot delete a file called nul that easily - neither using Windows Explorer nor via the DOS prompt. nul is a very special filename for Windows - it is an alias for the null device, i.e. the bit bucket where all the redirected output goes, all those cries for help from software which we are guilty of ignoring all the time.

UNC path notation to the rescue: To remove a file called nul in, say c:\temp, you can use the DOS del command as follows:

  del \\.\c:\temp\nul

Works great for me. But since I rarely use UNC syntax, I sometimes forget how it looks like. Worse, the syntax requires to specify the full path of the nul file, and I hate typing those long paths. So I came up with the following naïve batch file which does the job for me. It takes one argument which specifies the relative or absolute path of the nul file. Examples:

   rem remove nul file in current dir
   delnul.bat nul
   rem remove nul file in subdir
   delnul.bat foo\nul
   rem remove nul file in tempdir
   delnul.bat c:\temp\nul

For the path completion magic, I'm using the for command which has so many options that my brain hurts whenever I read its documentation. I'm pretty sure one could build a Turing-complete language using just for...

  @echo off
  set fullpath=
  for %%i IN (%1x) DO set fullpath=%%~di%%~pi

  set filename=
  for %%i IN (%1x) DO set filename=%%~ni
  if not "x%filename%" == "xnulx" (echo Usage: %0 [somepath\]nul && goto :eof)
 
  echo Deleting %fullpath%nul...
  del \\.\%fullpath%nul

DelinvFile takes this a lot further; it has a Windows UI and can delete many other otherwise pretty sticky files - nul is not the only dangerous file name; there's con, aux, prn and probably a couple of other magic names which had a special meaning for DOS, and hence also for Windows.


A subset of Ruby (16 Apr 2006)

A programming language which inspires an author to write something like why's (poignant) guide to Ruby must be truly special. So I gave in readily to the temptation to try the language, especially since Ruby's dynamic type system and lambda-like expressions appeal to the Lisp programmer in me.

A while ago, I blogged about the subset sum problem, and so writing a version of that algorithm in Ruby was an obvious choice.

$solutions = 0
$numbers = []
$flags = []

def find_solutions(k, target_sum)
  if target_sum == 0
    # found a solution!
    (0..$numbers.length).each { |i| if ($flags[i]) then print $numbers[i], " "; end }
    print "\n"
    $solutions = $solutions + 1
  else
    if k < $numbers.length
      if target_sum >= $numbers[k]
        $flags[k] = true
        find_solutions k+1, target_sum-$numbers[k]
        $flags[k] = false
      end
      find_solutions k+1, target_sum
    end
  end
end

def find_subset_sum(target_sum)
  print "\nNow listing all subsets which sum up to ", target_sum, ":\n"
  $solutions = 0
  (0..$numbers.length()).each { |i| $flags[i] = false }
  find_solutions 0, target_sum
  print "Found ", $solutions, " different solutions.\n"
end

def subset_sum_test(size)
  total = 0
  target_sum = 0
  (0..size).each { |i| $numbers[i] = rand(1000); total += $numbers[i]; print $numbers[i], " " }
  target_sum = total/2
  find_subset_sum target_sum
end

subset_sum_test 25

Comparing this to my other implementations in various languages, this solution is shorter than the Lisp version, and roughly the same length as the VB code I wrote. I'm pretty sure that as I learn more about Ruby I will be able to improve quite a bit on the above naïve code, so it will probably become even shorter.

But even after the first few minutes of coding in this language, I'm taking away the impression that I can produce code which looks a lot cleaner than, say, Perl code. Well, at least cleaner than the Perl code which I usually write ... big grin


Pasting my own dogfood, part 4 (15 Apr 2006)

In the epic "dogfood" series, the hero comes to realize he needs systematic tests for clipboard code. On his quest, he briefly gives in to the sweet song of the Scripting Sirens, but escapes and makes it to the safer shores of the Win32 API - but only to realize that fate has even more trials in place for him.

When we left our hero the last time, he had just figured out that not all Win32 handles are made alike. Data for most clipboard formats is held in a global memory buffer (allocated via GlobalAlloc); GetClipboardData returns a handle to the memory block, and all you need to do in order to decode the data is to interpret the handle as a memory handle and then read from that block of memory.

However, there are some formats which won't reveal their inner selfs that easily, such as bitmaps. Hence, it's about time we form a circle, take each other by the hands, and meditate over clipboard formats.

Microsoft lists clipboard formats here and distinguishes the following main classes of clipboard formats:

  • Standard clipboard formats, i.e. predefined formats such as CF_BITMAP, CF_TEXT, CF_ENHMETAFILE, CF_HDROP, CF_PALETTE, CF_WAVE etc.
  • Registered clipboard formats, i.e. application-defined formats which are registered at runtime. RTF is a prominent example for such a format.
  • Private clipboard formats: Another special kind of application-defined formats.

For the purpose of code which tries to retrieve arbitrary data from the clipboard, however, it is more useful to use a different classification.

Memory-based clipboard formats

These are all the formats for which the handle returned from GetClipboardData can be interpreted as a memory handle.

The text formats (CF_TEXT and cousins) are in this class, and probably most application-defined formats ("registered formats"), although it is entirely up to the application to decide on how the data in the clipboard needs to be decoded. Other examples: CF_LOCALE, CF_WAVE, CF_TIFF.

These formats can be posted to the clipboard and read from there using code similar to what I posted last time.

Handle-based clipboard formats

Examples for such formats:

  • CF_ENHMETAFILE, CF_DSPENHMETAFILE
  • CF_METAFILEPICT, CF_DSPMETAFILEPICT
  • CF_BITMAP, CF_DSPBITMAP
  • CF_PALETTE

I found that I had to provide format-specific code to interpret these formats.

Metafiles

CF_ENHMETAFILE and CF_DSPENHMETAFILE data can be copied by interpreting the clipboard handle as a metafile handle (HENHMETAFILE) and using the CopyEnhMetaFile API in Win32 to directly create a file on the disk.

CF_METAFILEPICT and CF_DSPMETAFILEPICT differ slightly from this. The clipboard handle is a memory handle to a METAFILEPICT data structure. That structure has a member called hMF which is the actual metafile handle; pass this handle to CopyMetaFile, and you'll get a metafile on the disk.

Bitmaps

The clipboard handle really is a bitmap handle of type HBITMAP. HBITMAPs refer to device-dependent bitmap data (DDB), which first need to be converted into device-independent format (DIB); then you can add a bitmap header and write the whole shebang to the disk in a format which can be read as *.BMP by image viewers.

Palettes

The clipboard handle must be interpreted as a HPALETTE handle. The GetPaletteEntries API can be used to retrieve the data behind such a handle; then we can dump the palette entries in the data structure to a file in any format we choose; for example, a simple integer specifying the number of entries, followed by PALETTEENTRY structures.

Other formats

When dragging and dropping files (or copying them in Explorer), information about these files is transmitted in CF_HDROP format. The handle returned by GetClipboardData can be interpreted as an HDROP, which can be passed to DragQueryFile to learn more about the files in the clipboard.

(I tried to come up with testcases for formats such as CF_PENDATA and CF_DSPTEXT, but could not find any. If anyone comes across these formats, please let me know.)

Finally: New toys!

With the above findings, I was ready to extend my very simplistic original test code. The result: Two useful tools which can be used to copy data from the clipboard into files (ClipboardToFile.exe), and to copy data from files into the clipboard (FileToClipboard.exe). And I'm even sharing this code .-)

Source code and executables can be downloaded here - and here are some hints on how to use the toolset.

ClipboardToFile

ClipboardToFile does exactly what the name hints at: It enumerates the formats which are currently in the clipboard, and writes files containing the clipboard data in that format.

So to create a set of test files, simply run your favorite apps on any system and use them to copy data in various formats to the clipboard. For example:

  • Run Paint, load a file and copy it to the clipboard
  • Run ClipboardToFile to save whatever Paint added to the clipboard
  • Run Word, type some text, and copy it to the clipboard
  • Run ClipboardToFile again to get clipboard extracts in various text formats.

An example of how to run ClipboardToFile:

 ClipboardToFile c:\temp\clipbboarddata

With the above command line, ClipboardToFile will produce files in the directory c:\temp\clipboarddata. Those files are named after the clipboard format from which they were produced. Typical names are "CF_TEXT", "CF_BITMAP", "CF_DIB" and so on. Repeat the process with other apps on your system until you have a library of clipboard data files which you can use for unit tests!

FileToClipboard

FileToClipboard is a command-line tool which takes any file you throw at it, reads the file's contents and copies them to the clipboard in (almost) any format you specify:

   FileToClipboard foo.wmf CF_ENHMETAFILE
   FileToClipboard foo.bmp CF_BITMAP
   FileToClipboard foo.txt CF_UNICODETEXT

So the basic idea is to prepare a couple of test files (here: foo.bmp, foo.wmf and foo.txt), dump them into your unit test directory, and use them to prepare the clipboard. Then you run your application's "Edit/Paste" functionality and verify that it works as expected. Since FileToClipboard is a command-line utility, you can automate such tests easily; also, the executable is very small and can be installed everywhere simply by copying the exe file.

In the case of text and bitmap files, it is easy to see where you can get sample test data. However, some formats are only used for clipboard transfer and are never persisted to files. As an example, the CF_LOCALE format indicates that locale data is in the clipboard. In the case of CF_LOCALE, it would be easy to fudge a binary file: A single integer is used to encode a locale ID. So you could create such a file with a hex editor or by writing a one-liner C program or whatever, and then feed it into FileToClipboard in CF_LOCALE mode.

However, there are many other formats which are not quite that simple. Worse, any application out there can define its own undocumented clipboard formats at any time. Fortunately, we already have a tool which fills this gap: ClipboardToFile.

The end of the saga? Not quite!

These tools cover a wide variety of clipboard formats, including many registered formats - most of them seem to be memory-based. For the most part, I'm a happy camper now. I can automate all my tests, and move on to greener pastures.

Well, I could, except that the whole Win32 approach which I took is still fundamentally flawed, really.

When I fire up Paint, then select an image area and copy it to the clipboard, ClipboardToFile will report the following:

ClipboardToFile - (C) 2006 Claus Brod, http://www.clausbrod.de

Clipboard format 49161 successfully written to DataObject.
Clipboard format 49163 successfully written to Embed Source.
Clipboard format 49156 successfully written to Native.
ERROR: Cannot write data to OwnerLink
Clipboard format 49166 successfully written to Object Descriptor.
Clipboard format 3 successfully written to CF_METAFILEPICT.
Clipboard format 8 successfully written to CF_DIB.
Clipboard format 49171 successfully written to Ole Private Data.
Clipboard format 14 successfully written to CF_ENHMETAFILE.
Clipboard format 2 successfully written to CF_BITMAP.
Clipboard format 17 successfully written to CF_DIBV5.
ERROR: Failure to enumerate clipboard and store all formats.

Apparently, there's a funny format on the clipboard called OwnerLink which, for some reason, ClipboardToFile cannot read properly. When debugging into this case, it turns out that GetClipboardData returns a null handle for this format. Hmmm... what is this format used for? Why doesn't it contain any data? Or are there ways to retrieve the data than GetClipboardData?

And what are these other formats such as DataObject, Embed Source, Native, Object Descriptor and Ole Private Data for?

Indeed, there is much more to the clipboard than was dreamt of in my philosophy. More on this (hopefully) soon.


Pasting my own dogfood, part 3 (14 Apr 2006)

Last time around, I discussed a slightly kludgy approach for automatic testing of clipboard code, which was based on clipbrd.exe and some VBscript code. That wasn't bad, but I wasn't really satisfied. After all, the original goal was to write rock-solid unit tests for clipboard code. I wanted a more reliable tool to copy data from and to the clipboard in arbitrary formats. I needed more control. And, most important of all, I was in the mood for reinventing wheels (really fancy ones, of course).

The key Win32 APIs for clipboard handling are SetClipboardData and GetClipboardData. Their signatures are as follows:

  HANDLE SetClipboardData(UINT uFormat, HANDLE hMem);

  HANDLE GetClipboardData(UINT uFormat);

So when you post to the clipboard, all you need to specify is a format and a memory handle, as it seems! This looks so trivial that I had my strategy laid out almost immediately: I would allocate a global memory block using GlobalAlloc. Then I would read some clipboard data from a file into that block, and finally call SetClipboardData - like in the following code:

HANDLE ReadFileToMemory(const TCHAR *filename)
{
  FILE *f;
  errno_t error = _tfopen_s(&f, filename, _T("rb"));
  if (error) {
    _ftprintf(stderr, _T("ERROR: Cannot open %s\n"), filename);
    return 0;
  }

  // get size of file
  fseek(f, 0, SEEK_END);
  long size=ftell(f);
  fseek(f, 0, SEEK_SET);

  // allocate memory
  HANDLE hMem = ::GlobalAlloc(GMEM_MOVEABLE, size);
  if (!hMem) {
    fclose(f);
    return 0;
  }

  LPVOID mem = ::GlobalLock(hMem);
  if (!mem) {
    fclose(f);
    ::GlobalFree(hMem);
    return 0;
  }

  // read the file into memory
  size_t bytes_read = fread(mem, 1, size, f);
  fclose(f);
  ::GlobalUnlock(mem);
  if (bytes_read != size) {
    ::GlobalFree(hMem);
    return 0;
  }

  return hMem;
}

bool FileToClipboard(TCHAR *filename, UINT clipid, HWND ownerWindow)
{
  if (::OpenClipboard(ownerWindow)) {
    HANDLE hClip = ReadFileToMemory(filename);
    ::EmptyClipboard();
    HANDLE h = ::SetClipboardData(clipid, hClip);
    ::CloseClipboard();
    ::DestroyWindow(owner);
    return true;
  }
  return false;
}

And the reverse code to get data from the clipboard and save it to a file would be just as simple:

  FILE *f = fopen(filename, "wb");
  if (f) {
    HANDLE hClip = GetClipboardData(clipID);  
                   // clipID culled from looping over
                   // available formats using EnumClipboardFormats
    void *pData = (void*)GlobalLock(hClip);
    if (pData) {
      SIZE_T sz = ::GlobalSize(pData);
      if (sz) {
        size_t written = fwrite(pData, 1, sz, f);
        ret = (written == sz);
      }
    }
    ::GlobalUnlock(hClip);
    fclose(f);
  }

Piece of cake! Mission accomplished! I slapped the usual boilerplate code for a console app onto the above, and ran my first successful tests: I could read text from the clipboard, and post text to it just fine.

However, several formats stubbornly refused to cooperate. In particular, the really useful stuff, like metafiles. Or bitmaps. What was going on?

When calling GetClipboardData for these formats, the code that interprets the returned handle as a global memory handle flatly falls on its face. It turns out that I had jumped to conclusions way too early when I read the first few lines of the SetClipboardData documentation - some of those clipboard handles are actually anything but memory handles! Examples for such formats:

  • CF_ENHMETAFILE, CF_DSPENHMETAFILE
  • CF_METAFILEPICT, CF_DSPMETAFILEPICT
  • CF_BITMAP, CF_DSPBITMAP
  • CF_PALETTE

And then, of course, there are whole classes of clipboard formats which I had not even explored yet, such as application-defined formats.

Next time: How I learnt to peacefully coexist with all the various classes of clipboard formats.


Pasting my own dogfood, part 2 (09 Apr 2006)

Yesterday, I introduced the problem of how to automatically test Windows clipboard code in applications. The idea is to move from manual and error-prone clickety-click style testing to an automatic process which produces reliable results.

Clipboard viewer Unbeknownst to many, Windows ships with a fairly interesting tool called the ClipBook Viewer (clipbrd.exe), which monitors what the clipboard contains, and will even display the formats it knows about.

This is quite helpful while developing and debugging clipboard code. However, ClipBook Viewer can even help with test automation since it can save the current clipboard contents to *.CLP files and load them back into the clipboard later.

Which, in fact, is almost all we need to thoroughly and reliably test clipboard code: We run some apps which produce a good variety of clipboard formats which our own application needs to deal with. We select some data, copy them to the clipboard, then save the clipboard contents as a *.CLP file from ClipBook Viewer.

Once we have created a reasonably-sized clipboard file library, we run ClipBook viewer and load each one of those clipboard files in turn. After loading, we switch to our own app, paste the data and check whether the incoming data makes sense to us. Not bad at all!

But alas, I could not find a way to automate the ClipBook Viewer via the command-line or COM interfaces. If someone knows about such interfaces, I'm certainly most interested to hear about them.

Luck would have it that only recently, I blogged about poor man's automation via SendKeys. The idea is to write a small shell script which runs the target application (here: clipbrd.exe), and then simulate how a user presses keys to use the application.

clipbrd.exe can be started with the name of a *.CLP file in its command line, and will then automatically load this file. However, before it pushes the contents of the file to clipboard, it will ask the user for confirmation in a message box. Well, in fact, first it will try to establish NetDDE connections, and will usually waste quite a bit of time for this. The following script tries to take this into account:

Set WshShell = WScript.CreateObject("WScript.Shell")
WshShell.Run("clipbrd.exe c:\temp\clip.CLP")
WScript.Sleep 5000 ' Wait for "Clear clipboard (yes/no)?" message box
WshShell.SendKeys "{ENTER}"

Now we could add some more scripting glue code to control our own application, have it execute its "Paste" functionality and verify that the data arrives in all its expected glory.

The above code is not quite that useful if we need to run a set of tests in a loop; the following modified version is better suited for that purpose. It assumes that all *.CLP files are stored in c:\temp\clipfiles.

Set WshShell = WScript.CreateObject("WScript.Shell")
WshShell.Run("clipbrd.exe")
WScript.Sleep 20000

startFolder="c:\temp\clipfiles"
set folder=CreateObject("Scripting.FileSystemObject").GetFolder(startFolder)
for each file in folder.files
  WScript.Echo "Now testing " & file.Path
  OpenClipFile(file.Path)
  ' Add here:
  ' - Activate application under test
  ' - Have it paste data from the clipboard
  ' - Verify that the data comes in as expected
next

' Close clipbrd.exe
WshShell.AppActivate("ClipBook Viewer")
WshShell.SendKeys "%F"
WScript.Sleep 1000
WshShell.SendKeys "x"

Sub OpenClipFile(filename)
  WshShell.AppActivate("ClipBook Viewer")
  WshShell.SendKeys "%W"  ' ALT-W for Windows menu
  WScript.Sleep 500
  WshShell.SendKeys "1"   ' Activate Clipboard window
  WScript.Sleep 500
  WshShell.SendKeys "%F"  ' ALT-F for File menu
  WScript.Sleep 1000
  WshShell.SendKeys "O"
  WScript.Sleep 1000
  WshShell.SendKeys filenam9e
  WScript.Sleep 1000
  WshShell.SendKeys "{ENTER}"
  WScript.Sleep 1000  ' Wait for "Clear clipboard (yes/no)?"
  WshShell.SendKeys "{ENTER}"
End Sub

I'm sure a VBscript hacker could tidy this up considerably and use it to form a complete test suite. However, while this approach finally gives us some degree of automation, it is still lacking in several ways:

  • The format for the *.CLP file is undocumented, so we cannot add clipboard data of our own, unless we first copy it to the clipboard, then save it from there using ClipBook Viewer.
  • Automation via sending keys is a very brittle approach. For instance, the above code was written for the English version of clipbrd.exe. The German or French or Lithuanian versions of clipbrd.exe might have completely different keyboard shortcuts.
  • I shudder when looking at those magic delay time values which the code is ridden with - what if we run on a really slow system? On a system which has even more networking problems than the one which I tested the code with?
  • Any process (or user) stealing the window focus while the test is running will break the test.

Hence, next time: Look Ma, no SendKeys!


Pasting my own dogfood (08 Apr 2006)

Simplicity often breeds success. The Windows clipboard is undoubtedly an example for this. Even though fairly limited and brittle, it is probably the most popular mechanism of data exchange in a computer user's daily life - every time I copy and paste some text from here to there, I'm using the clipboard.

(This is a tempting opportunity to gripe about clipboard inheritance. In my time as a programmer, I have certainly found way more jaw-dropping instances of boneheaded copy-paste programming than I'd ever wish for. But then, considering all the stuff I've written and forgotten about, who knows if I'm really in the right position to cast the first stone! And today doesn't feel like soapbox day, anyway. No, today I'll try to be constructive, just for a change!)

Pretty much every application under the sun supports the clipboard, and if you want to write a new application, you'll also want to be able to export application-specific data to the clipboard in some popular formats, or to import data from Office, your favorite audio ripper software, or simply from Notepad.

Clipboard code isn't too tricky to write. However, it isn't all that obvious how you can test it effectively.

You can of course test clipboard functionality in your application by mousing around: Run your app, select some data, hit CTRL-C to paste data to the clipboard, hit CTRL-V to paste it back into your own application again, select some other data, rinse and repeat. While this roundtrip test certainly covers a lot of ground, it has two fundamental weaknesses: First, it assumes that your application understands the same clipboard input formats which it produces for output, which is often not the case. Second, this approach only verifies that your "copy" code is just as broken as your "paste" code, i.e. that they make the same assumptions about the clipboard and the format of the data stored there. If the transfer works as expected, it either means that both copy and paste code are correct, or it means that both code areas have symmetric bugs!

So to fully test clipboard functionality in an application, you better try to interact with other applications. After all, this is the whole original purpose of the clipboard. If exchanging data with other apps works, then you know that you interpret certain clipboard formats the same way other applications do, and can claim with confidence that your application is interoperable.

However, running other applications as part of clipboard unit tests poses its own challenges: For example, the remote application might be difficult to automate because it does not have an automation API. Also, to run the unit tests on any given test system, you'd have to install the remote application on that test system first. Not exactly a tempting thought if the Windows Installer file for that application fills 100 MB, or if the installation process requires you to enter license codes.

This is the kind of situation I found myself in recently. In the next few blog entries, I'll discuss a few ideas on how to tackle this problem.


Add new blog entry.



Revision: r1.40 - 29 Apr 2006 - 09:35 - ClausBrod
Blog > BlogOnSoftware
Copyright © 1999-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback