Pasting my own dogfood, part 4 (15 Apr 2006)

In the epic "dogfood" series, the hero comes to realize he needs systematic tests for clipboard code. On his quest, he briefly gives in to the sweet song of the Scripting Sirens, but escapes and makes it to the safer shores of the Win32 API - only to realize that fate has even more trials in store for him.

When we left our hero the last time, he had just figured out that not all Win32 handles are made alike. Data for most clipboard formats is held in a global memory buffer (allocated via GlobalAlloc); GetClipboardData returns a handle to the memory block, and all you need to do in order to decode the data is to interpret the handle as a memory handle and then read from that block of memory.

However, there are some formats which won't reveal their inner selves that easily, such as bitmaps. Hence, it's about time we form a circle, take each other by the hands, and meditate on clipboard formats.

Microsoft lists clipboard formats here and distinguishes the following main classes of clipboard formats:

  • Standard clipboard formats, i.e. predefined formats such as CF_BITMAP, CF_TEXT, CF_ENHMETAFILE, CF_HDROP, CF_PALETTE, CF_WAVE etc.
  • Registered clipboard formats, i.e. application-defined formats which are registered at runtime. RTF is a prominent example of such a format.
  • Private clipboard formats: another kind of application-defined format, which needs no registration and uses identifiers in the range CF_PRIVATEFIRST through CF_PRIVATELAST.
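
Microsoft's classes can be told apart by the numeric format identifier alone. Here is a minimal sketch of that idea, written in Python for brevity (the actual tools are Win32 programs, and this is not their code). The numeric values are the ones from winuser.h; classify_format is a hypothetical helper name, and I lump the CF_GDIOBJFIRST..CF_GDIOBJLAST range in with the private formats.

```python
# Predefined formats and their numeric values (from winuser.h).
STANDARD_FORMATS = {
    1: "CF_TEXT", 2: "CF_BITMAP", 3: "CF_METAFILEPICT", 4: "CF_SYLK",
    5: "CF_DIF", 6: "CF_TIFF", 7: "CF_OEMTEXT", 8: "CF_DIB",
    9: "CF_PALETTE", 10: "CF_PENDATA", 11: "CF_RIFF", 12: "CF_WAVE",
    13: "CF_UNICODETEXT", 14: "CF_ENHMETAFILE", 15: "CF_HDROP",
    16: "CF_LOCALE", 17: "CF_DIBV5",
    0x80: "CF_OWNERDISPLAY", 0x81: "CF_DSPTEXT", 0x82: "CF_DSPBITMAP",
    0x83: "CF_DSPMETAFILEPICT", 0x8E: "CF_DSPENHMETAFILE",
}

CF_PRIVATEFIRST, CF_PRIVATELAST = 0x0200, 0x02FF
CF_GDIOBJFIRST, CF_GDIOBJLAST = 0x0300, 0x03FF
# RegisterClipboardFormat returns identifiers in this range.
REGISTERED_FIRST, REGISTERED_LAST = 0xC000, 0xFFFF


def classify_format(format_id):
    """Return the class of a clipboard format identifier (hypothetical helper)."""
    if format_id in STANDARD_FORMATS:
        return "standard"
    if CF_PRIVATEFIRST <= format_id <= CF_PRIVATELAST:
        return "private"
    if CF_GDIOBJFIRST <= format_id <= CF_GDIOBJLAST:
        return "private (GDI object)"
    if REGISTERED_FIRST <= format_id <= REGISTERED_LAST:
        return "registered"
    return "unknown"
```

For example, classify_format(13) yields "standard", while an identifier such as 49161 (0xC009) falls into the registered range.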

For the purpose of code which tries to retrieve arbitrary data from the clipboard, however, it is more useful to use a different classification.

Memory-based clipboard formats

These are all the formats for which the handle returned from GetClipboardData can be interpreted as a memory handle.

The text formats (CF_TEXT and cousins) are in this class, and probably most application-defined formats ("registered formats"), although it is entirely up to the application to decide on how the data in the clipboard needs to be decoded. Other examples: CF_LOCALE, CF_WAVE, CF_TIFF.

These formats can be posted to the clipboard and read from there using code similar to what I posted last time.

Handle-based clipboard formats

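For these formats, the handle returned from GetClipboardData is not simply a block of memory holding the data; it is, or leads to, some other kind of handle. Examples of such formats: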

  • CF_ENHMETAFILE, CF_DSPENHMETAFILE
  • CF_METAFILEPICT, CF_DSPMETAFILEPICT
  • CF_BITMAP, CF_DSPBITMAP
  • CF_PALETTE

I found that I had to provide format-specific code to interpret these formats.

Metafiles

CF_ENHMETAFILE and CF_DSPENHMETAFILE data can be copied by interpreting the clipboard handle as a metafile handle (HENHMETAFILE) and using the CopyEnhMetaFile API in Win32 to directly create a file on the disk.

CF_METAFILEPICT and CF_DSPMETAFILEPICT differ slightly from this. The clipboard handle is a memory handle to a METAFILEPICT data structure. That structure has a member called hMF which is the actual metafile handle; pass this handle to CopyMetaFile, and you'll get a metafile on the disk.

Bitmaps

For CF_BITMAP and CF_DSPBITMAP, the clipboard handle really is a bitmap handle of type HBITMAP. HBITMAPs refer to device-dependent bitmap data (DDB), which first needs to be converted into device-independent format (DIB); then you can add a bitmap header and write the whole shebang to the disk in a format which can be read as *.BMP by image viewers.
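
The "add a bitmap header" step can be sketched as follows. This is an illustration in Python rather than the tool's actual code, and it assumes the DDB-to-DIB conversion has already happened and produced uncompressed 24-bit pixel rows, each padded to a multiple of four bytes. dib_to_bmp is a hypothetical helper name; the header layouts are those of BITMAPFILEHEADER and BITMAPINFOHEADER.

```python
import struct


def dib_to_bmp(width, height, pixels):
    """Wrap uncompressed 24-bit DIB pixel rows in BMP file headers.

    Assumes `pixels` already holds `height` rows, each padded to a
    multiple of 4 bytes, as a DIB requires.
    """
    stride = (width * 3 + 3) & ~3          # bytes per padded row
    image_size = stride * height
    if len(pixels) != image_size:
        raise ValueError("pixel data does not match width/height")

    # BITMAPINFOHEADER: 40 bytes, little-endian.
    info_header = struct.pack(
        "<IiiHHIIiiII",
        40,            # biSize
        width,         # biWidth
        height,        # biHeight (positive: bottom-up rows)
        1,             # biPlanes
        24,            # biBitCount
        0,             # biCompression (BI_RGB, uncompressed)
        image_size,    # biSizeImage
        0, 0,          # biXPelsPerMeter, biYPelsPerMeter
        0, 0,          # biClrUsed, biClrImportant
    )

    # BITMAPFILEHEADER: 14 bytes, starts with the "BM" signature.
    off_bits = 14 + len(info_header)
    file_header = struct.pack(
        "<2sIHHI", b"BM", off_bits + image_size, 0, 0, off_bits
    )
    return file_header + info_header + pixels
```

Writing the returned bytes to a file in binary mode should give something image viewers accept as a *.BMP; bitmaps with a color table (8 bits per pixel or fewer) would need the palette between the headers and the pixel data, which this sketch leaves out.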

Palettes

The clipboard handle must be interpreted as an HPALETTE handle. The GetPaletteEntries API can be used to retrieve the data behind such a handle. Then we can dump the palette entries to a file in any format we choose; for example, a simple integer specifying the number of entries, followed by the PALETTEENTRY structures.
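
To make that file layout concrete, here is a small sketch in Python (not the tool's code). It assumes the count is stored as a 32-bit little-endian integer, which the description above leaves open, and it represents each PALETTEENTRY as its four one-byte members peRed, peGreen, peBlue and peFlags. dump_palette and load_palette are hypothetical helper names.

```python
import struct


def dump_palette(entries):
    """Serialize (red, green, blue, flags) tuples: entry count, then entries."""
    data = struct.pack("<I", len(entries))   # assumed: 32-bit little-endian count
    for red, green, blue, flags in entries:
        data += struct.pack("<4B", red, green, blue, flags)
    return data


def load_palette(data):
    """Inverse of dump_palette: return a list of (red, green, blue, flags)."""
    (count,) = struct.unpack_from("<I", data, 0)
    return [struct.unpack_from("<4B", data, 4 + 4 * i) for i in range(count)]
```

A round trip through dump_palette and load_palette returns the original entries, which is all a unit test needs from such a dump.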

Other formats

When dragging and dropping files (or copying them in Explorer), information about these files is transmitted in CF_HDROP format. The handle returned by GetClipboardData can be interpreted as an HDROP, which can be passed to DragQueryFile to learn more about the files in the clipboard.
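
The format-specific handling described in this section boils down to a lookup table. The following Python sketch only summarizes the text above; the table, the describe helper and the fallback are illustrative assumptions, not code from the tools. In particular, treating every unlisted format as memory-based is a guess that holds for "probably most" formats, not all of them.

```python
# Format name -> (how to interpret the clipboard handle, what to do with it).
HANDLE_BASED = {
    "CF_ENHMETAFILE": ("HENHMETAFILE", "CopyEnhMetaFile"),
    "CF_DSPENHMETAFILE": ("HENHMETAFILE", "CopyEnhMetaFile"),
    "CF_METAFILEPICT": ("memory handle to METAFILEPICT", "CopyMetaFile on hMF"),
    "CF_DSPMETAFILEPICT": ("memory handle to METAFILEPICT", "CopyMetaFile on hMF"),
    "CF_BITMAP": ("HBITMAP", "convert DDB to DIB, add bitmap header"),
    "CF_DSPBITMAP": ("HBITMAP", "convert DDB to DIB, add bitmap header"),
    "CF_PALETTE": ("HPALETTE", "GetPaletteEntries"),
    "CF_HDROP": ("HDROP", "DragQueryFile"),
}

# Assumed default: a global memory block that can be locked and read.
MEMORY_BASED = ("HGLOBAL", "GlobalLock and read the memory block")


def describe(format_name):
    """Return (handle kind, decoding step) for a clipboard format name."""
    return HANDLE_BASED.get(format_name, MEMORY_BASED)
```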

(I tried to come up with test cases for formats such as CF_PENDATA and CF_DSPTEXT, but could not find any. If anyone comes across these formats, please let me know.)

Finally: New toys!

With the above findings, I was ready to extend my very simplistic original test code. The result: Two useful tools which can be used to copy data from the clipboard into files (ClipboardToFile.exe), and to copy data from files into the clipboard (FileToClipboard.exe). And I'm even sharing this code .-)

Source code and executables can be downloaded here - and here are some hints on how to use the toolset.

ClipboardToFile

ClipboardToFile does exactly what the name hints at: It enumerates the formats which are currently in the clipboard, and writes files containing the clipboard data in that format.

So to create a set of test files, simply run your favorite apps on any system and use them to copy data in various formats to the clipboard. For example:

  • Run Paint, load a file and copy it to the clipboard
  • Run ClipboardToFile to save whatever Paint added to the clipboard
  • Run Word, type some text, and copy it to the clipboard
  • Run ClipboardToFile again to get clipboard extracts in various text formats.

An example of how to run ClipboardToFile:

 ClipboardToFile c:\temp\clipboarddata

With the above command line, ClipboardToFile will produce files in the directory c:\temp\clipboarddata. Those files are named after the clipboard format from which they were produced. Typical names are "CF_TEXT", "CF_BITMAP", "CF_DIB" and so on. Repeat the process with other apps on your system until you have a library of clipboard data files which you can use for unit tests!

FileToClipboard

FileToClipboard is a command-line tool which takes any file you throw at it, reads the file's contents and copies them to the clipboard in (almost) any format you specify:

   FileToClipboard foo.wmf CF_ENHMETAFILE
   FileToClipboard foo.bmp CF_BITMAP
   FileToClipboard foo.txt CF_UNICODETEXT

So the basic idea is to prepare a couple of test files (here: foo.bmp, foo.wmf and foo.txt), dump them into your unit test directory, and use them to prepare the clipboard. Then you run your application's "Edit/Paste" functionality and verify that it works as expected. Since FileToClipboard is a command-line utility, you can automate such tests easily; also, the executable is very small and can be installed everywhere simply by copying the exe file.
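
Driving FileToClipboard from a test script could look roughly like this Python sketch. It relies only on the command-line shape shown above (file name, then format name) and assumes the executable can be found on the PATH. What exit code the tool returns on failure is not something I have specified here, so the sketch merely hands the code back to the caller; build_command and prepare_clipboard are hypothetical helper names.

```python
import subprocess


def build_command(data_file, clipboard_format, exe="FileToClipboard"):
    """Build the argument list: executable, data file, format name."""
    return [exe, data_file, clipboard_format]


def prepare_clipboard(data_file, clipboard_format, exe="FileToClipboard"):
    """Run FileToClipboard and return its exit code.

    How the exit code signals success or failure is an assumption the
    caller has to verify against the actual tool.
    """
    completed = subprocess.run(
        build_command(data_file, clipboard_format, exe), check=False
    )
    return completed.returncode
```

A test would call prepare_clipboard("foo.txt", "CF_UNICODETEXT"), trigger the application's paste command, and then compare the result with the contents of foo.txt.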

In the case of text and bitmap files, it is easy to see where you can get sample test data. However, some formats are only used for clipboard transfer and are never persisted to files. As an example, the CF_LOCALE format indicates that locale data is in the clipboard. In the case of CF_LOCALE, it would be easy to fudge a binary file: A single integer is used to encode a locale ID. So you could create such a file with a hex editor or by writing a one-liner C program or whatever, and then feed it into FileToClipboard in CF_LOCALE mode.
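
For instance, such a file could be produced with a few lines of Python instead of a hex editor. The sketch assumes the locale ID is stored as a 32-bit little-endian integer (an LCID is a DWORD); the helper names and the file name locale.bin are made up for illustration.

```python
import struct


def locale_file_bytes(lcid):
    """Encode a locale ID as a 32-bit little-endian integer."""
    return struct.pack("<I", lcid)


def write_locale_file(path, lcid):
    """Write the encoded locale ID to a binary file."""
    with open(path, "wb") as f:
        f.write(locale_file_bytes(lcid))
```

Calling write_locale_file("locale.bin", 0x0409) writes the four bytes 09 04 00 00 (the locale ID for US English), and following the syntax shown earlier, the file could then be fed to the tool as "FileToClipboard locale.bin CF_LOCALE".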

However, there are many other formats which are not quite that simple. Worse, any application out there can define its own undocumented clipboard formats at any time. Fortunately, we already have a tool which fills this gap: ClipboardToFile.

The end of the saga? Not quite!

These tools cover a wide variety of clipboard formats, including many registered formats - most of them seem to be memory-based. For the most part, I'm a happy camper now. I can automate all my tests, and move on to greener pastures.

Well, I could, except that the whole Win32 approach which I took is still fundamentally flawed, really.

When I fire up Paint, then select an image area and copy it to the clipboard, ClipboardToFile will report the following:

ClipboardToFile - (C) 2006 Claus Brod, http://www.clausbrod.de

Clipboard format 49161 successfully written to DataObject.
Clipboard format 49163 successfully written to Embed Source.
Clipboard format 49156 successfully written to Native.
ERROR: Cannot write data to OwnerLink
Clipboard format 49166 successfully written to Object Descriptor.
Clipboard format 3 successfully written to CF_METAFILEPICT.
Clipboard format 8 successfully written to CF_DIB.
Clipboard format 49171 successfully written to Ole Private Data.
Clipboard format 14 successfully written to CF_ENHMETAFILE.
Clipboard format 2 successfully written to CF_BITMAP.
Clipboard format 17 successfully written to CF_DIBV5.
ERROR: Failure to enumerate clipboard and store all formats.

Apparently, there's a funny format on the clipboard called OwnerLink which, for some reason, ClipboardToFile cannot read properly. When debugging into this case, it turns out that GetClipboardData returns a null handle for this format. Hmmm... what is this format used for? Why doesn't it contain any data? Or are there other ways to retrieve the data than GetClipboardData?

And what are these other formats such as DataObject, Embed Source, Native, Object Descriptor and Ole Private Data for?
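
One thing the log does reveal is which of these formats are registered ones: identifiers handed out by RegisterClipboardFormat lie in the range 0xC000 to 0xFFFF, and 49161 is 0xC009. A small Python sketch for picking them out of the output follows; the regular expressions are derived from the log lines shown above, and parse_log and registered_names are hypothetical helper names.

```python
import re

SUCCESS = re.compile(r"^Clipboard format (\d+) successfully written to (.+)\.$")
FAILURE = re.compile(r"^ERROR: Cannot write data to (.+)$")


def parse_log(text):
    """Return ([(format_id, name), ...], [failed_name, ...]) from tool output."""
    written, failed = [], []
    for line in text.splitlines():
        line = line.strip()
        match = SUCCESS.match(line)
        if match:
            written.append((int(match.group(1)), match.group(2)))
            continue
        match = FAILURE.match(line)
        if match:
            failed.append(match.group(1))
    return written, failed


def registered_names(written):
    """Names of formats whose identifier lies in the registered range."""
    return [name for format_id, name in written if 0xC000 <= format_id <= 0xFFFF]


SAMPLE = """Clipboard format 49161 successfully written to DataObject.
ERROR: Cannot write data to OwnerLink
Clipboard format 8 successfully written to CF_DIB."""

written, failed = parse_log(SAMPLE)
print(registered_names(written))  # ['DataObject']
print(failed)                     # ['OwnerLink']
```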

Indeed, there is much more to the clipboard than was dreamt of in my philosophy. More on this (hopefully) soon.





Attachment: clipboardtofile.zip (19.1 K, 15 Apr 2006 - 13:38, ClausBrod): ClipboardToFile, FileToClipboard

Revision: r1.5 - 16 Apr 2006 - 12:39 - ClausBrod