Archive for the ‘windows’ Category

Trouble with ACE and IPv6? Make Sure Your Config is Consistent

July 2, 2010

I just spent about five hours over the past week debugging a stack-corruption problem. The MSVC debugger was politely telling me the stack was corrupted in the area of an ACE_INET_Addr object instantiated on the stack. But all I did was create it and then return from the method, so the problem had to be localized pretty well. Also, I was seeing the problem, but none of the students in the class I was preparing this example for saw it. So it was somehow specific to my ACE build.

I stepped through the ACE_INET_Addr constructor and watched it clear the contents of the internal sockaddr to zeroes. Fine. I noted it was clearing 28 bytes and setting the address family to 23 (AF_INET6 on Windows). “IPv6. Ok,” I thought. But I knew the stack was being scribbled on outside the bounds of that ACE_INET_Addr object. I checked whether ACE somehow had a bad definition of sockaddr_in6. After rummaging around the ACE and Windows SDK headers I was pretty sure that wasn’t it. But there was definitely some confusion about the size of what needed to be cleared.

If you haven’t looked at the ACE_INET_Addr internals (and, really, why would you?), when ACE is built with IPv6 support (the ACE_HAS_IPV6 setting) the internal sockaddr is a union of sockaddr_in and sockaddr_in6 so both IPv4 and IPv6 can be supported. The debugger inside the ACE_INET_Addr constructor was showing me both the v4 and v6 members of the union. But as I stepped out of the ACE_INET_Addr constructor back to the example application, the debugger changed to showing only the IPv4 part. Hmmm… why is that? The object back in the example is suddenly smaller (since the sockaddr_in6 structure is larger than the sockaddr_in structure, the union shrinks when you leave out the sockaddr_in6). Ok, so now I know why the stack is getting smashed… I’m passing an IPv4-only ACE_INET_Addr object to a method that thinks it’s getting an IPv4-or-IPv6 object, which is larger. But why?

I checked my $ACE_ROOT/ace/config.h since that’s where ACE config settings usually are. No ACE_HAS_IPV6 setting there. Did the ACE-supplied Windows configs add it in somewhere sneakily? Nope. I checked the ACE.vcproj file ACE was built with. Ah-ha… in the compile preprocessor settings there it is – ACE_HAS_IPV6.

AAAAARRRRRGGGGGGG!!!!! Now I remember where it came from. IPv6 support is turned on/off in the MPC-generated Visual Studio projects using an MPC feature setting, ipv6=1 (this is because some parts of ACE and tests aren’t included without the ipv6 feature). When I generated the ACE projects that setting was used, but when I generated the example program’s projects it wasn’t. So the uses of ACE_INET_Addr in the example had only the IPv4 support, but were passed to an ACE build that was expecting both IPv4 and IPv6 support – a larger object.

Solution? Regenerate the example’s projects with the same MPC feature file ACE’s projects were generated with. That made all the settings consistent between ACE and my example programs. No more stack scribbling.

How to Use Schannel for SSL Sockets on Windows

December 29, 2009

Did you know that Windows offers a native SSL sockets facility? You’d figure that since IE has SSL support, and MS SQL has SSL support, and .NET has SSL support, that there would be some lower-level SSL support available in Windows. It’s there. Really. But you’ll have a hard time finding any clear, explanatory documentation on it.

I spent most of the past month adding SSL support to Apache Qpid on Windows, both client and broker components. I used a disproportionate amount of that time struggling with making sense of the Schannel API, as it is poorly documented. Some information (such as a nice overview of how to make use of it) is missing, and I’ll cover that here. Other information is flat out wrong in the MSDN docs; I’ll cover some of that in a subsequent post.

I pretty quickly located some info in MSDN with the promising title “Establishing a Secure Connection with Authentication”. I read it and really just didn’t get it. (Of course, now, in hindsight, it looks pretty clear.) Part of my trouble may have been a paradigm expectation. Both OpenSSL and NSS pretty much wrap all of the SSL operations into their own API which takes the place of the plain socket calls. Functions such as connect(), send(), recv() have their SSL-enabled counterparts in OpenSSL and NSS; adding SSL capability to an existing system ends up copying the socket-level code and replacing plain sockets calls with the equivalent SSL calls (yes, there are some other details to take care of, but model-wise, that’s pretty much how it goes).

In Schannel the plain Windows Sockets calls are still used for establishing a connection and transferring data. The SSL support is, conceptually, added as a layer between the Winsock calls and the application’s data handling. The SSL/Schannel layer acts as an intermediary between the application data and the socket, encrypting/decrypting and handling SSL negotiations as needed. The data sent/received on the socket is opaque data either handed to Windows for decrypting or given by Windows after encrypting the normal application-level data. Similarly, SSL negotiation involves passing opaque data to the security context functions in Windows and obeying what those functions say to do: send some bytes to the peer, wait for more bytes from the peer, or both. So to add SSL support to an existing TCP-based application is more like adding a shim that takes care of negotiating the SSL session and encrypting/decrypting data as it passes through the shim.

The shim approach is pretty much how I added SSL support to the C++ broker and client for Apache Qpid on Windows. Once I got my head around the new paradigm, it wasn’t too hard. Except for the errors and omissions in the encrypt/decrypt API documentation… I’ll cover that shortly.

The SSL support did not get into Qpid 0.6, unfortunately. But it will be in the development stream shortly after 0.6 is released, and will be part of the next release for sure.

Why Apache Qpid’s C++ Build is Switching to CMake

May 14, 2009

I’ve been working on converting the Apache Qpid build system from the infamous “autotools” (automake, autoconf, libtool) to CMake. The CMake package also includes facilities for testing and packaging, and I’ll cover those another time. For now I’ll deal with the configure/build steps. This article deals more with why Apache Qpid’s C++ project decided to make this move. The next article will cover more details concerning how it was done.

First let me review how the autotools work for those that aren’t familiar with them. There are 3 steps:

  1. Bootstrap the configure script. This is done by the development team (or release engineer) and involves processing some (often) hard to follow M4-based scripts into a shell script, generally named “configure”.
  2. Configure the build for the target environment. This is done by the person building the source code, whether in development or at a user site. Configuration is carried out by executing the “configure” script. The script often offers command-line options for including/excluding optional parts of the product, setting the installation root, setting needed compile options, etc. The configure script also examines the build environment for the presence or absence of features and capabilities that the product can use. This step produces a “config.h” file that the build process uses, as well as a set of Makefiles.
  3. Build, generally using the “make” utility.

The autotools work well, even if writing the configure scripts is a black art. People accustomed to downloading open source programs to build are used to unpacking the source, then running “configure” and “make”. So why would Qpid want to switch? Two reasons, primarily:

  1. Windows. The autotools don’t work natively on Windows since there’s no UNIX-like shell and no “make” utility. Getting them involves installing MinGW, and many Windows developers, sysadmins, etc. just won’t go there.
  2. Maintaining multiple build inputs is a royal pain. And at least one of them is always out of step. Keeping Visual Studio projects and autotools-based files updated is very error-prone. Even the subset of developers that have ready access to both can get it wrong.
  3. (Ok, this one is bonus) Once you’ve spent enough nights trying to debug scripting, you’ll do anything to get away from autotools.

We looked at a number of alternatives and settled on CMake. CMake’s popularity is growing (KDE recently switched, and there’s an effort to make Boost build with CMake as well). CMake works from its own specification of the product’s source inputs and outputs, similar to autoconf, but has two advantages:

  • It also performs the “configure” step in the autotools process
  • It can generate make inputs (Visual Studio projects, Makefiles, etc.) for numerous build systems

In the CMake world, the autotools “bootstrap” (step 1, above) is not needed. This is because rather than producing a neutral shell script for the configure step, CMake itself must be installed on each build system. This seems a bit onerous at first, but I think it is better for two main reasons:

  1. The configuration step in CMake lets the user view a nice graphical layout of all the options the developers offer to configure and build optional areas of the product. As the configure happens, the display is updated to show what was learned about the environment and its capabilities. Only when all the settings look correct and desired does the user generate the build files and proceed to build the software. It takes the guesswork out of knowing if you’ve specified the correct configure options, or even knowing what options you have to pick from.
  2. It will probably cause more projects to offer pre-built binary install packages, such as RPMs and Windows installers, to help users get going quicker. One of CMake’s components, CPack, helps to ease this process as well.

The impending Qpid 0.5 release is the last one based on autotools and native Visual Studio projects. The CMake changes are pretty well in place on the development trunk and we’re about ready to remove the source-controlled Visual Studio files and then the autotools stuff. Next time I’ll discuss more of what we had to do to get to this point.

Sometimes Using Less Abstraction is Better

March 24, 2009

Last week I was working on getting the Apache Qpid unit tests running on Windows. The unit tests are arranged to take advantage of the fact that the bulk of the Qpid client and broker is built as shared/dynamic libraries. The unit tests invoke capabilities directly in the shared libraries, making it easier to test. Most of the work needed to get these tests built on Windows was taken care of by the effort to build DLLs on Windows. However, there was a small but important piece remaining that posed a challenge.

Qpid is a networked system, so its tests need to be sure it correctly handles situations where the network or the network peer fails or acts in some unexpected way. The Qpid unit tests have a useful little class named SocketProxy which sits between the client and broker. SocketProxy relays network traffic in each direction but can also be told to drop pieces of traffic in one or both directions, and can be instructed to drop the socket in one or both directions. Getting this SocketProxy class to run on Windows was a challenge. SocketProxy uses the Qpid common Poller class to know when network data is available in one or both directions, then directly performs the socket recv() and send() as needed. This use of Poller, ironically, was what caused me problems. Although the Windows port includes an implementation of Poller, it doesn’t work in the same fashion as the Linux implementation.

In Qpid proper, the Poller class is designed to work in concert with the AsynchIO class; Poller detects and multiplexes events and AsynchIO performs I/O. The upper level frame handling in Qpid interacts primarily with the AsynchIO class. Below that interface there’s a bit of difference from Linux to Windows. On Linux, Poller indicates when a socket is ready, then AsynchIO performs the I/O and hands the data up to the next layer. However, the Windows port uses overlapped I/O and an I/O completion port; AsynchIO initiates I/O, Poller indicates completions (rather than I/O ready-to-start), and AsynchIO gets control to hand the resulting data to the next layer. So, the interface between the frame handling and I/O layers in Qpid is the same for all platforms, but the way that Poller and AsynchIO interact can vary between platforms as needed.

My initial plan for SocketProxy was to take it up a level, abstraction-wise. After all, abstracting away behavior is often a good way to make better use of existing, known-to-work code, and avoid complexities. So my first approach was to replace SocketProxy’s direct event-handling code and socket send/recv operations with use of the AsynchIO and Poller combination that is used in Qpid proper.

The AsynchIO-Poller arrangement’s design and place in Qpid involves some dynamic allocation and release of memory related to sockets, and a nice mechanism to do orderly cleanup of sockets regardless of which end initiates the socket close. Ironically, it is this nice cleanup arrangement which tanked its use in the SocketProxy case. Recall that SocketProxy’s usefulness is its ability to interrupt sockets in messy ways, but not be messy itself in terms of leaking handles and memory. My efforts to get AsynchIO and Poller going in SocketProxy resulted in memory leaks, sockets not getting interrupted as abruptly as needed for the test, and connections not getting closed properly. It was a mess.

The solution? Rather than go up a level of abstraction, go down. Use the least common denominator for what’s needed in a very limited use case. I used select() and fd_set. This is just what I advise customers not to do. Did I lose my mind? Sell out to time pressure? No. In this case, using less abstraction was the correct approach – I just didn’t recognize it immediately.

So what made this situation different from “normal”? Why was it a proper place to use less abstraction?

  • The use case is odd. Poller and AsynchIO are very well designed for running the I/O activities in Qpid, correctly handling all socket activity quickly and efficiently. They’re not designed to force failures, and that’s what was needed. It makes no sense to redesign foundational classes in order to make a test harness more elegant.
  • The use is out of the way. It’s a test harness, not the code that has to be maintained and relied on for efficient, correct performance in deployed environments.
  • Its needs are limited and isolated. SocketProxy handles only two sockets at a time. Performance is not an issue.

Sometimes less is more – it works in abstractions too. The key is to know when it really is best.

Lessons Learned Converting Apache Qpid to Build DLLs on Windows

March 12, 2009

During fall and winter 2008 I worked on the initial port of Apache Qpid to Windows (I blogged about this in January 2009, here). The central areas of Qpid (broker, client) are arranged as a set of shared libraries, qpidclient (the client API), qpidbroker (the guts of the broker), and qpidcommon (supporting code shared by other parts of Qpid). As I mentioned in my previous blog entry, I built the libraries as static libraries instead of shared libraries (“DLLs” in Windows) primarily due to time constraints. Changing the build to produce DLLs for Windows is one of the items I noted that should be done in the future. The primary reasons for building as DLLs are:

  • Probably reduces memory load, and enables easier maintenance of the product. These are standard drivers for using shared libraries in general.
  • Enables expanded use of Qpid in other use cases where dynamic loading is required, such as plug-in environments and dynamically assembled components.
  • Allows the full regression test suite to run on Windows more easily; many of the broker-side unit tests link the qpidbroker library, something I didn’t account for in the initial port.

So, there was clearly a need for switching to DLLs. A couple of other companies saw the value in using DLLs as well: Microsoft, which sponsored my efforts to do the DLL conversion (and some other things coming up), and WSO2, whose Danushka Menikkumbura developed many of the patches for Qpid that went into this project.

Unlike building shared libraries on UNIX/Linux (or most any other OS), it is necessary for the developer to annotate source code to tell the Windows compiler/linker which entrypoints in the library should be “exported”, or made known to external users of the DLL. Conversely, the same entrypoints must be marked “imported” for consumers of the library. The standard way to do this is to define a macro, for example, QPID_CLIENT_EXTERN, that is defined as __declspec(dllexport) when building a DLL and as __declspec(dllimport) when using that DLL. Typically this is accomplished in a header file, such as this example from the Qpid client source:

#if defined(WIN32) && !defined(QPID_DECLARE_STATIC)
#if defined(CLIENT_EXPORT)
#define QPID_CLIENT_EXTERN __declspec(dllexport)
#else
#define QPID_CLIENT_EXTERN __declspec(dllimport)
#endif
#else
#define QPID_CLIENT_EXTERN
#endif

Then the class or method, etc. that should be exported is annotated with QPID_CLIENT_EXTERN. When developing C++ code, the most direct way to make classes and their methods available is to mark the class as exported:

class QPID_CLIENT_EXTERN MyClass
{
    // all members are exported along with the class
};

This works well in many cases (this is how ACE does nearly all of its exporting, for example). However, in Qpid this technique quickly hit some problems. The reason is that if a particular class is exported, all of its ancestors have to be exported from DLLs as well (and their ancestors, and so on) as well as any non-POD types exposed in the method interfaces or used as static members of the class that’s exported. Given Qpid’s layers of inheritance and reuse, only annotating classes that users should be using was not going to fly with a simple export-the-classes scheme. Fortunately Danushka was ahead of the curve here and prepared patches to export only necessary methods. For example:

class MyClass {
public:
    QPID_CLIENT_EXTERN int myMethod();
private:
    MyOtherClass data;
};

Note that now the class is not exported as a whole, but all the methods needed to use it are. In particular, note that MyOtherClass need not be exported. The obvious trade-off here is that we avoid the issues with inheritance hierarchy needing to be exported, but each user-required method must be annotated explicitly. This is the way we primarily did the Qpid DLL work.

In summary, the lesson learned is that exporting individual class members is more tedious to implement, but provides a measure of independence and flexibility in class design and reuse. I originally looked dubiously at the technique of marking individual members, but Danushka’s ideas here ended up making the end result better and easier for users to apply in new projects. The forthcoming M5 release will contain these new improvements to Apache Qpid.