How to Use Schannel for SSL Sockets on Windows

December 29, 2009

Did you know that Windows offers a native SSL sockets facility? You’d figure that since IE, MS SQL, and .NET all have SSL support, there must be some lower-level SSL support available in Windows. It’s there. Really. But you’ll have a hard time finding any clear, explanatory documentation on it.

I spent most of the past month adding SSL support to Apache Qpid on Windows, both client and broker components. I used a disproportionate amount of that time struggling with making sense of the Schannel API, as it is poorly documented. Some information (such as a nice overview of how to make use of it) is missing, and I’ll cover that here. Other information is flat out wrong in the MSDN docs; I’ll cover some of that in a subsequent post.

I pretty quickly located some info in MSDN with the promising title “Establishing a Secure Connection with Authentication”. I read it and really just didn’t get it. (Of course, now, in hindsight, it looks pretty clear.) Part of my trouble may have been a mismatch in paradigm expectations. Both OpenSSL and NSS pretty much wrap all of the SSL operations into their own API, which takes the place of the plain socket calls. Functions such as connect(), send(), and recv() have their SSL-enabled counterparts in OpenSSL and NSS; adding SSL capability to an existing system ends up copying the socket-level code and replacing plain socket calls with the equivalent SSL calls (yes, there are some other details to take care of, but model-wise, that’s pretty much how it goes).

In Schannel the plain Windows Sockets calls are still used for establishing a connection and transferring data. The SSL support is, conceptually, added as a layer between the Winsock calls and the application’s data handling. The SSL/Schannel layer acts as an intermediary between the application data and the socket, encrypting/decrypting and handling SSL negotiations as needed. The data sent/received on the socket is opaque data either handed to Windows for decrypting or given by Windows after encrypting the normal application-level data. Similarly, SSL negotiation involves passing opaque data to the security context functions in Windows and obeying what those functions say to do: send some bytes to the peer, wait for more bytes from the peer, or both. So to add SSL support to an existing TCP-based application is more like adding a shim that takes care of negotiating the SSL session and encrypting/decrypting data as it passes through the shim.
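To make that concrete, here is a rough sketch of what the client side of an Schannel handshake loop can look like. This is not the Qpid code; it omits error handling, certificate validation, and leftover-input (SECBUFFER_EXTRA) handling, and the function name and buffer sizes are my own choices.

// Minimal client-side handshake sketch over an already-connected SOCKET.
// do_client_handshake is my name, not an SSPI function.
// Link with secur32.lib and ws2_32.lib.
#define SECURITY_WIN32
#include <winsock2.h>
#include <windows.h>
#include <security.h>
#include <schannel.h>
#include <vector>

bool do_client_handshake(SOCKET sock, const char* host,
                         CredHandle& cred, CtxtHandle& ctxt)
{
    // Acquire outbound (client) credentials for the SSL/TLS package.
    SCHANNEL_CRED scred = {};
    scred.dwVersion = SCHANNEL_CRED_VERSION;
    TimeStamp expiry;
    if (AcquireCredentialsHandleA(0, (char*)UNISP_NAME_A, SECPKG_CRED_OUTBOUND,
                                  0, &scred, 0, 0, &cred, &expiry) != SEC_E_OK)
        return false;

    DWORD req = ISC_REQ_SEQUENCE_DETECT | ISC_REQ_REPLAY_DETECT |
                ISC_REQ_CONFIDENTIALITY | ISC_REQ_ALLOCATE_MEMORY |
                ISC_REQ_STREAM;
    std::vector<char> pending;   // opaque bytes received but not yet consumed
    bool first = true;
    SECURITY_STATUS status = SEC_I_CONTINUE_NEEDED;

    while (status == SEC_I_CONTINUE_NEEDED || status == SEC_E_INCOMPLETE_MESSAGE) {
        if (!first) {
            // Wait for more opaque handshake bytes from the peer.
            char buf[4096];
            int n = recv(sock, buf, sizeof buf, 0);
            if (n <= 0)
                return false;
            pending.insert(pending.end(), buf, buf + n);
        }

        SecBuffer inBufs[2] = {
            { (unsigned long)pending.size(), SECBUFFER_TOKEN,
              pending.empty() ? 0 : &pending[0] },
            { 0, SECBUFFER_EMPTY, 0 }
        };
        SecBufferDesc inDesc = { SECBUFFER_VERSION, 2, inBufs };
        SecBuffer outBuf = { 0, SECBUFFER_TOKEN, 0 };
        SecBufferDesc outDesc = { SECBUFFER_VERSION, 1, &outBuf };
        DWORD attrs = 0;

        status = InitializeSecurityContextA(&cred, first ? 0 : &ctxt,
                                            (char*)host, req, 0, 0,
                                            first ? 0 : &inDesc, 0,
                                            &ctxt, &outDesc, &attrs, &expiry);
        first = false;

        // Whatever Schannel produced is opaque; just hand it to the peer.
        if (outBuf.cbBuffer != 0 && outBuf.pvBuffer != 0) {
            send(sock, (const char*)outBuf.pvBuffer, outBuf.cbBuffer, 0);
            FreeContextBuffer(outBuf.pvBuffer);
        }

        // If Schannel consumed the input, discard it and read fresh data on
        // the next pass; on SEC_E_INCOMPLETE_MESSAGE keep it and read more.
        if (status != SEC_E_INCOMPLETE_MESSAGE)
            pending.clear();
    }
    return status == SEC_E_OK;
}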

The shim approach is pretty much how I added SSL support to the C++ broker and client for Apache Qpid on Windows. Once I got my head around the new paradigm, it wasn’t too hard. Except for the errors and omissions in the encrypt/decrypt API documentation… I’ll cover that shortly.
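As an illustration of the data-path side of that shim, here is a hedged sketch of sending application data through an established Schannel context. Again, this is not the Qpid code; ssl_send is my own name, there is no error recovery, and real code would chunk data larger than the negotiated maximum message size.

// Illustrative only: encrypt one application buffer with an established
// Schannel context (ctxt) and write the resulting opaque bytes with plain
// Winsock send().
#define SECURITY_WIN32
#include <winsock2.h>
#include <windows.h>
#include <security.h>
#include <schannel.h>
#include <cstring>
#include <vector>

bool ssl_send(SOCKET sock, CtxtHandle& ctxt, const char* data, size_t len)
{
    // Ask Schannel how large the record header and trailer are.
    SecPkgContext_StreamSizes sizes;
    if (QueryContextAttributes(&ctxt, SECPKG_ATTR_STREAM_SIZES, &sizes) != SEC_E_OK)
        return false;
    if (len > sizes.cbMaximumMessage)
        return false;   // real code would split the data into chunks

    std::vector<char> record(sizes.cbHeader + len + sizes.cbTrailer);
    std::memcpy(&record[sizes.cbHeader], data, len);

    // EncryptMessage works in place on a four-buffer descriptor:
    // header, data, trailer, and an empty sentinel.
    SecBuffer bufs[4] = {
        { sizes.cbHeader,     SECBUFFER_STREAM_HEADER,  &record[0] },
        { (unsigned long)len, SECBUFFER_DATA,           &record[sizes.cbHeader] },
        { sizes.cbTrailer,    SECBUFFER_STREAM_TRAILER, &record[sizes.cbHeader + len] },
        { 0,                  SECBUFFER_EMPTY,          0 }
    };
    SecBufferDesc desc = { SECBUFFER_VERSION, 4, bufs };
    if (EncryptMessage(&ctxt, 0, &desc, 0) != SEC_E_OK)
        return false;

    // The buffer now holds opaque ciphertext; the socket layer neither knows
    // nor cares what is in it.
    int total = int(bufs[0].cbBuffer + bufs[1].cbBuffer + bufs[2].cbBuffer);
    return send(sock, &record[0], total, 0) == total;
}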

The SSL support did not get into Qpid 0.6, unfortunately, but it will be in the development stream shortly after 0.6 is released and will be part of the next release for sure.


Simple, Single-Thread Message Receive in Apache Qpid C++

November 16, 2009

In my September 2009 newsletter article I showed a simple example of sending a message using the C++ API to Apache Qpid. Since I intend to write a number of how-to Qpid articles, most of them will appear here in my blog. (You can still subscribe to my newsletter… there are other interesting things there. 🙂) If you’re not too familiar with AMQP terminology, it may help to review the newsletter article because it defines the basic architectural pieces in AMQP.

This article shows a simple, single-threaded way to receive a message from a queue. The previous example showed how to send a message to the amq.direct exchange with routing key “my_key”. Thus, if we have a client that creates a queue and binds it to the amq.direct exchange using the routing key “my_key”, the message will arrive on the queue.

To start, we need the header files:

#include <qpid/client/Connection.h>
#include <qpid/client/Session.h>
#include <qpid/client/Message.h>
#include <qpid/client/SubscriptionManager.h>
#include <qpid/client/LocalQueue.h>
#include <iostream>

Just as with the message-sending example, we need a connection to the broker and a session to use. We’ll assume there’s a broker on the local system and that it’s listening on TCP port 5672 (the default for Qpid). The new things in this example are:

  • Declare a queue (my_queue) that will receive the messages this program is interested in.
  • Bind the new queue to the amq.direct exchange using the routing key that the message sender used (my_key).
  • Use a SubscriptionManager and LocalQueue to manage receiving messages from the new queue. There are a number of ways to receive messages, but this one is simple and needs only the single, main thread.

int main() {
  using namespace qpid::client;

  Connection connection;
  try {
    connection.open("localhost", 5672);
    Session session = connection.newSession();

    // Declare the queue and bind it to amq.direct with the sender's key.
    session.queueDeclare(arg::queue="my_queue");
    session.exchangeBind(arg::exchange="amq.direct", arg::queue="my_queue", arg::bindingKey="my_key");

    // Subscribe a local queue and wait up to 10 seconds for one message.
    SubscriptionManager subscriptions(session);
    LocalQueue local_queue;
    subscriptions.subscribe(local_queue, std::string("my_queue"));
    Message message;
    local_queue.get(message, 10000);
    std::cout << message.getData() << std::endl;
    connection.close();
    return 0;
  }
  catch(const std::exception& error) {
    std::cout << error.what() << std::endl;
    return 1;
  }
}

It’s pretty easy. Again, this is a simple case and there are many options and alternatives. One particular thing to note in the above example is that the Message::getData() method returns a reference to a std::string. It is easy to assume that this means the message is text. This would be a bad assumption in general. A std::string can contain any set of bytes, text or not. What those bytes are is up to the application designers.
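For example, code that might receive binary payloads should treat the returned string as a byte buffer rather than printing it. This little helper is illustrative only (bodyBytes is my name, not part of the Qpid API):

// Illustrative only: copy a message body into a byte vector instead of
// assuming it is printable text.
#include <qpid/client/Message.h>
#include <vector>

std::vector<unsigned char> bodyBytes(const qpid::client::Message& message)
{
    const std::string& data = message.getData();   // may contain any bytes
    return std::vector<unsigned char>(data.begin(), data.end());
}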

My Initial Impressions of Zircomp

November 13, 2009

I checked out a webinar yesterday describing Zircomp by Zircon Computing. I’m always interested in new tools for developing distributed systems that make more effective use of networks and available systems, and that’s just what the description of Zircomp billed it as. I have no relationship with Zircon Computing or Zircomp, but ACE’s inventor and my co-author of C++ Network Programming, Dr. Douglas C. Schmidt, is Zircon’s CTO and Zircomp uses ACE, so I was immediately interested.

I have to admit, I expected this to be another CORBA-esque heavyweight conglomeration of fancy-acronymed services. I was pleasantly surprised to find no CORBA mentioned anywhere – the framework described is all layered on ACE, and my takeaway summary was that this is like RPC on steroids, then simplified so humans can understand it and distributed across an incredibly scalable set of compute nodes. I was duly impressed.

There is, as I understand it, a series of webinars planned to address the various aspects of Zircomp’s range of uses. This first webinar focused on distributing a parallel set of actions across a set of compute resources. It allows one to take the code for an action, wrap it for distribution, and replace it with a proxy that finds available servers and forwards the call. After having programmed RPC and CORBA, as well as hand-crafting RPC-ish services, the simplicity of this new framework really blew me away. A lot of thought went into making this easy to use for the average programmer while allowing power users to really push the envelope.

I’ll be saving this one away as another tool in my bag of tricks for helping customers get the most value from both their engineering budgets and their computing equipment. Congrats, Zircon!

ACE 5.7 Changes You Should Know About, But (Probably) Don’t

June 29, 2009

ACE 5.7 was released last week (see my newsletter article for info including new platforms supported at ACE 5.7). It was a happy day for ACE users, as ACE 5.7 contains many important fixes, especially in the Service Configurator framework and the ACE_Dev_Poll_Reactor class. Indeed, a recent survey of Riverace’s ACE support customers revealed that nearly 53% are planning to upgrade to ACE 5.7 immediately.

As the rollouts started happening, a few unexpected changes were noticed that users should know about. Riverace posted these notes in the ACE Knowledge Base last week, but I’m also proactively describing them here because I suspect more users may trip over these issues as ACE 5.7 adoption picks up.

The LIB Makefile Variable

Many ACE users reuse the ACE GNU Makefile scheme to take advantage of ACE’s well-tuned system for knowing how to build ACE applications. This is a popular mechanism to reuse because it automatically picks up the ACE configuration settings and compiler options, which is very important to ensure that the settings ACE was built with match the application’s ACE-related settings for a successful build.

When reusing the ACE make scheme to build libraries, the LIB makefile variable specifies the library name to build. This has worked for many years. However, during ACE 5.7 development, support was added to the GNU Make scheme to enable its use on Windows with Visual C++. Visual C++ uses the LIB environment variable as a search path for libraries at link time, which clashed with the existing use of the LIB variable. Therefore, the old LIB variable was renamed to LIB_CHECKED. This broke existing builds.

Since the Windows case requires LIB be left available for Visual C++, the name change wasn’t reverted; however, I added a patch that reverts the LIB behavior on non-Windows build systems. The patch will be available in the ACE 5.7.1 bug-fix-only beta as well as in ACE 5.7a for Riverace support customers. If you’re stuck on this problem now, and you’re a support customer, open a case to get the patch immediately.

Note that if you generate your application’s project files with MPC instead of hand-coding to the ACE make scheme, you can avoid the LIB name problem by regenerating your projects after ACE 5.7 is installed.

Symlink Default for Built Libraries Changed from Absolute to Relative

When ACE builds libraries it can “install” them by creating a symbolic link. In ACE 5.6 and earlier, the link used an absolute path to the original file. In ACE 5.7 the default behavior changed to create a link relative to the $ACE_ROOT/ace directory. This helps to enable relocating an entire tree without breaking the links, but in some cases can cause invalid links. For example, if you are building your own libraries that do not get relocated with ACE, or won’t have the same directory hierarchy, the links will not be valid.

To change to the pre-5.7 behavior of creating links with absolute pathnames, build with the make variable symlinks=absolute. You can either specify symlinks=absolute on the make command line or add it to your $ACE_ROOT/include/makeinclude/platform_macros.GNU file prior to including wrapper_macros.GNU.

Revised ACE_Dev_Poll_Reactor Fixes Multithread Issues (and more!) on Linux

June 15, 2009

When ACE 5.7 is released this week it will contain an important fix (a number of them, actually) for use cases that rely on multiple threads running the Reactor event loop concurrently on Linux. The major fix areas involved for ACE_Dev_Poll_Reactor in ACE 5.7 are:

  • Dispatching events from multiple threads concurrently
  • Properly handling changes in handle registration during callbacks
  • Change in suspend/resume behavior to be more ACE_TP_Reactor-like

At the base of these fixes was a foundational change in the way ACE_Dev_Poll_Reactor manages events returned from Linux epoll. Prior to this change, ACE would obtain all ready events from epoll and then each event loop-executing thread in turn would pick the next event from that set and dispatch it. This design was, I suppose, more or less borrowed from the ACE_Select_Reactor event demultiplexing strategy. In that case it made sense since select() is relatively expensive and avoiding repeated scans of all the watched handles is a good thing. Also, the ACE_Select_Reactor (and ACE_TP_Reactor, which inherits from it) have a mechanism to note that something in the handle registrations changed, signifying that select() must be called again. This mechanism was lacking in ACE_Dev_Poll_Reactor.

However, unlike with select(), it’s completely unnecessary to try to avoid calls to epoll_wait(). Epoll is much more scalable than select(), and letting epoll manage the event queue, passing back only one event at a time, is much simpler than the previous design and also much easier to get correct. So that was the first change: obtain one event per call to epoll_wait(), letting Linux manage the event queue and weed out events for handles that are closed, etc. The second change was to add the EPOLLONESHOT option bit to the event registration for each handle. The effect of this is that once an event for a particular handle is delivered from epoll_wait(), that handle is effectively suspended. No more events for the handle will be delivered until the handle’s event mask is re-enabled via epoll_ctl(). These two changes were used to fix and extend ACE_Dev_Poll_Reactor as follows.
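A minimal sketch of that pattern (not the ACE source itself, and with the handler dispatch elided) looks something like this:

#include <sys/epoll.h>

// One iteration of an event loop that any number of threads may run
// concurrently: take a single ready event, dispatch it, then re-arm.
void event_loop_iteration(int epfd)
{
    epoll_event ev;
    // Ask the kernel for exactly one ready event.
    int n = epoll_wait(epfd, &ev, 1, -1);
    if (n != 1)
        return;

    int fd = ev.data.fd;
    // ... dispatch the handler registered for fd here ...

    // EPOLLONESHOT disabled the registration when the event was delivered,
    // so the handle stays quiet until it is explicitly re-armed after the
    // callback (the "reactor resumes handler" case; an application-resumed
    // handler would defer this call).
    epoll_event rearm = {};
    rearm.events = EPOLLIN | EPOLLONESHOT;
    rearm.data.fd = fd;
    epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &rearm);
}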

Dispatching Events from Multiple Threads Concurrently

The main defect in the previous scheme was the possibility that events obtained from epoll_wait() could be delivered to an ACE_Event_Handler object that no longer existed. This was the primary driver for fixing ACE_Dev_Poll_Reactor. However, another less likely, but still possible, situation was that callbacks for a handler could be called out of order, triggering time-sensitive ordering problems that are very difficult to track down. Both of these situations are resolved by obtaining only one I/O event per ACE_Reactor::handle_events() iteration. A side effect of this change is that the concurrency behavior of ACE_Dev_Poll_Reactor changes from being similar to ACE_WFMO_Reactor (simultaneous callbacks to the same handler are possible) to being similar to ACE_TP_Reactor (only one I/O callback for a particular handle at a time). Since epoll’s behavior with respect to when a handle becomes available for more events differs from that of Windows’s WaitForMultipleObjects(), the old multiple-concurrent-calls-per-handle behavior couldn’t be done correctly anyway, so the new ACE_Dev_Poll_Reactor behavior leads to easier coding and programs that are much more likely to be correct when changing reactor use between platforms.

Properly handling changes in handle registration during callbacks

A difficult problem to track down sometimes arose in the previous design when a callback handler changed handle registration. In such a case, if the reactor made a subsequent callback to the original handler (for example, if the callback returned -1 and the handler needed to be removed), the callback might be made to the wrong handler – the newly registered handler instead of the originally called handler. This problem was fixed by making some changes and additions to the dispatching data structures and code, and is no longer an issue.

Change in suspend/resume behavior to be more ACE_TP_Reactor-like

An important aspect of ACE_TP_Reactor’s ability to support complicated use cases arising in systems such as TAO is that a dispatched I/O handler is suspended around the upcall. This prevents multiple events from being dispatched to the same handler simultaneously. As previously mentioned, the changes to ACE_Dev_Poll_Reactor also effectively suspend a handler around an upcall. However, a feature once only available with ACE_TP_Reactor is that an application can specify that the application, not the ACE reactor, will resume the suspended handler. This capability is important to properly supporting the nested upcall capability in TAO, for example. The revised ACE_Dev_Poll_Reactor now also has this capability. Once the epoll changes were made to effectively suspend a handler around an upcall, taking advantage of the existing suspend-resume setting in ACE_Event_Handler was pretty straightforward.
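As I understand the ACE_Event_Handler interface, an application opts into this behavior roughly as follows (a hedged sketch, not code from ACE or TAO):

#include "ace/Event_Handler.h"
#include "ace/Reactor.h"

class My_Handler : public ACE_Event_Handler
{
public:
  // Tell the reactor that the application, not the reactor, re-enables
  // this handler after a dispatched upcall.
  virtual int resume_handler (void)
  {
    return ACE_Event_Handler::ACE_APPLICATION_RESUMES_HANDLER;
  }

  virtual int handle_input (ACE_HANDLE h)
  {
    // ... do the work, possibly involving nested upcalls ...
    // When the application decides the handler may receive events again,
    // it resumes the handle itself:
    this->reactor ()->resume_handler (h);
    return 0;
  }
};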

So, if you’ve been holding off on using ACE_Dev_Poll_Reactor on Linux because it was unstable with multiple threads, or you didn’t like the concurrency behavior and the instability it could bring, I encourage you to re-evaluate this area when ACE 5.7 is released this week. And if you’ve ever wondered what good professional support services get you: I did this work for a support customer who is very happy they didn’t have to pay hourly for it. And many more people will be happy that, since I wasn’t billing for time, I could freely fix tangential issues not in the original report, such as the application-resume feature. Everyone wins: the customer’s problem is resolved and ACE’s overall product quality and functionality are improved. Enjoy!

How We Converted the Apache Qpid C++ Build to CMake

June 1, 2009

A previous post covered why the Apache Qpid C++ build switched to CMake; this post describes how it was done.

The project was generously funded by Microsoft. We started the conversion in February 2009. At this point, the builds have been running well for a while; the test executions are not quite done. So it took about three months to get the builds running on both Linux and Windows. We’re working on the testing aspects now and have not really addressed the installation steps yet. There were only two aspects of the Qpid build conversion that weren’t completely straightforward:

  1. The build processes XML versions of the AMQP specification and the Qpid Management Framework specification to generate a lot of the code. The names of the generated files are not known a priori. The generator scripts produce a list of the generated files in addition to the files themselves. This list of files obviously needs to be plugged into the appropriate places when generating the makefiles.
  2. There are a number of optional features to build into Qpid. In addition to explicitly enabling or disabling the features, the autoconf scheme checked for the requisite capabilities and enabled the features when the user didn’t specify. It built as much as it could if the user didn’t specify what to build (or not to build).

To start, one person on the team (Cliff Jansen of Interop Systems) ran the existing automake files through the KDE conversion steps to get a base set of CMakeLists.txt files and did some initial prototyping for the code generation step. The original autoconf build ran the code generator at make time if the source XML specifications were available at configure time (in a release kit, the generated sources are already there, and the specs are not in the kit). The Makefile.am file then included the generated lists of sources to produce the Makefile from which the product was built. Where to place the code generation step in the CMake scheme was a big question. We considered two options:

  • Do the code generation in the generated Makefile (or Visual Studio project). This had the advantage of being able to leverage the build system’s dependency evaluation and regenerate the code as needed. However, once generated, the Makefile (or Visual Studio project) would need to be recreated by CMake. Recall that the code generation generates a list of source files that must be in the Makefile. We couldn’t get this to be as seamless as desired.
  • Do the code generation in the CMake configuration step. This puts the dependency evaluation in the CMakeLists.txt file, and had to be coded by hand since we wouldn’t have the build system’s dependency evaluation available. However, once the code was generated, the list of generated source files was readily available for inclusion in the Makefile (and Visual Studio project) file generation and the build could proceed smoothly.

We elected the second approach for ease of use. The CMakeLists code for generating the AMQP specification-based code looks like this (note this code is covered by the Apache license):

# rubygen subdir is excluded from stable distributions
# If the main AMQP spec is present, then check if ruby and python are
# present, and if any sources have changed, forcing a re-gen of source code.
set(AMQP_SPEC_DIR ${qpidc_SOURCE_DIR}/../specs)
set(AMQP_SPEC ${AMQP_SPEC_DIR}/amqp.0-10-qpid-errata.xml)
if (EXISTS ${AMQP_SPEC})
  include(FindRuby)
  include(FindPythonInterp)
  if (NOT RUBY_EXECUTABLE)
    message(FATAL_ERROR "Can't locate ruby, needed to generate source files.")
  endif (NOT RUBY_EXECUTABLE)
  if (NOT PYTHON_EXECUTABLE)
    message(FATAL_ERROR "Can't locate python, needed to generate source files.")
  endif (NOT PYTHON_EXECUTABLE)

  set(specs ${AMQP_SPEC} ${qpidc_SOURCE_DIR}/xml/cluster.xml)
  set(regen_amqp OFF)
  set(rgen_dir ${qpidc_SOURCE_DIR}/rubygen)
  file(GLOB_RECURSE rgen_progs ${rgen_dir}/*.rb)
  # If any of the specs, or any of the sources used to generate code, change
  # then regenerate the sources.
  foreach (spec_file ${specs} ${rgen_progs})
    if (${spec_file} IS_NEWER_THAN ${CMAKE_CURRENT_SOURCE_DIR}/rubygen.cmake)
      set(regen_amqp ON)
    endif (${spec_file} IS_NEWER_THAN ${CMAKE_CURRENT_SOURCE_DIR}/rubygen.cmake)
  endforeach (spec_file ${specs})
  if (regen_amqp)
    message(STATUS "Regenerating AMQP protocol sources")
    execute_process(COMMAND ${RUBY_EXECUTABLE} -I ${rgen_dir} ${rgen_dir}/generate gen
                            ${specs} all ${CMAKE_CURRENT_SOURCE_DIR}/rubygen.cmake
                    WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR})
  else (regen_amqp)
    message(STATUS "No need to generate AMQP protocol sources")
  endif (regen_amqp)
else (EXISTS ${AMQP_SPEC})
  message(STATUS "No AMQP spec... won't generate sources")
endif (EXISTS ${AMQP_SPEC})

# Pull in the names of the generated files, i.e. ${rgen_framing_srcs}
include (rubygen.cmake)

With the code generation issue resolved, I was able to get the rest of the project building on both Linux and Windows without much trouble. The cmake@cmake.org email list was very helpful when questions came up.

The remaining area that wasn’t entirely clear to a CMake newbie was how best to handle building optional features. Where the original autoconf script tried to build as much as possible without the user specifying anything, I put in simpler CMake language to let the user select options, try the configure, and adjust settings if a feature (such as SSL libraries) was not available. This took away a convenient feature for building as much as possible without user intervention, though with CMake’s ability to very easily adjust the settings and re-run the configure step, I didn’t think this was much of a loss.

Shortly after I got the first set of CMakeLists.txt files checked into the Qpid subversion repository, other team members started iterating on the initial CMake-based build. Andrew Stitcher from Red Hat quickly zeroed in on the removed capability to build as much as possible without user intervention. He developed a creative approach to setting the CMake defaults in the cache based on some initial system checks. For example, this is the code that sets up the SSL-enabling default based on whether or not the required capability is available on the build system (note this code is covered by the Apache license):

# Optional SSL/TLS support. Requires Netscape Portable Runtime on Linux.

include(FindPkgConfig)

# According to some cmake docs this is not a reliable way to detect
# pkg-configed libraries, but it's no worse than what we did under
# autotools
pkg_check_modules(NSS nss)

set (ssl_default ${ssl_force})
if (CMAKE_SYSTEM_NAME STREQUAL Windows)
else (CMAKE_SYSTEM_NAME STREQUAL Windows)
  if (NSS_FOUND)
    set (ssl_default ON)
  endif (NSS_FOUND)
endif (CMAKE_SYSTEM_NAME STREQUAL Windows)

option(BUILD_SSL "Build with support for SSL" ${ssl_default})
if (BUILD_SSL)

  if (NOT NSS_FOUND)
    message(FATAL_ERROR "nss/nspr not found, required for ssl support")
  endif (NOT NSS_FOUND)

  foreach(f ${NSS_CFLAGS})
    set (NSS_COMPILE_FLAGS "${NSS_COMPILE_FLAGS} ${f}")
  endforeach(f)

  foreach(f ${NSS_LDFLAGS})
    set (NSS_LINK_FLAGS "${NSS_LINK_FLAGS} ${f}")
  endforeach(f)

  # ... continue to set up the sources and targets to build.
endif (BUILD_SSL)

With that, the Apache Qpid build is going strong with CMake.

During the process I developed a pattern for naming CMake variables that play a part in user configuration and, later, in the code. There are two basic prefixes for cache variables:

  • BUILD_* variables control optional features that the user can build. For example, the SSL section shown above uses BUILD_SSL. Using a common prefix, especially one that collates near the front of the alphabet, puts options that users change most often right at the top of the list, and together.
  • QPID_HAS_* variables note variances in the build system that affect code but not users – for example, whether a particular header file or system call is present.

Future efforts in this area will complete the transition of the test suite to CMake/CTest, which will have the side effect of making it much easier to script the regression tests on Windows. The last area to be addressed will be how downstream packagers make use of the new CMake/CPack system for building RPMs, Windows installers, etc. Stay tuned…

Why Apache Qpid’s C++ Build is Switching to CMake

May 14, 2009

I’ve been working on converting the Apache Qpid build system from the infamous “autotools” (automake, autoconf, libtool) to CMake. The CMake package also includes facilities for testing and packaging, and I’ll cover those another time. For now I’ll deal with the configure/build steps. This article deals more with why Apache Qpid’s C++ project decided to make this move. The next article will cover more details concerning how it was done.

First let me review how the autotools work for those who aren’t familiar with them. There are three steps:

  1. Bootstrap the configure script. This is done by the development team (or release engineer) and involves processing some (often) hard-to-follow M4-based scripts into a shell script, generally named “configure”.
  2. Configure the build for the target environment. This is done by the person building the source code, whether in development or at a user site. Configuration is carried out by executing the “configure” script. The script often offers command-line options for including/excluding optional parts of the product, setting the installation root, setting needed compile options, etc. The configure script also examines the build environment for the presence or absence of features and capabilities that the product can use. This step produces a “config.h” file that the build process uses, as well as a set of Makefiles.
  3. Build, generally using the “make” utility.

The autotools work well, even if writing the configure scripts is a black art. People accustomed to downloading open source programs to build are used to unpacking the source, then running “configure” and “make”. So why would Qpid want to switch? Two reasons, primarily:

  1. Windows. The autotools don’t work natively on Windows since there’s no UNIX-like shell and no “make” utility. Getting these involves installing MinGW, and many Windows developers, sysadmins, etc. just won’t go there.
  2. Maintaining multiple build inputs is a royal pain. And at least one of them is always out of step. Keeping Visual Studio projects and autotools-based Makefile.am files updated is very error-prone. Even the subset of developers that have ready access to both can get it wrong.
  3. (Ok, this one is a bonus) Once you’ve spent enough nights trying to debug configure.ac scripting, you’ll do anything to get away from autotools.

We looked at a number of alternatives and settled on CMake. CMake is picking up in popular usage (KDE recently switched, and there’s an effort to make Boost build with CMake as well). CMake works from its own specification of the product’s source inputs and outputs, similar to autoconf, but has two advantages:

  • It also performs the “configure” step in the autotools process
  • It can generate make inputs (Visual Studio projects, Makefiles, etc.) for numerous build systems

In the CMake world, the autotools “bootstrap” (step 1, above) is not needed. This is because rather than producing a neutral shell script for the configure step, CMake itself must be installed on each build system. This seems a bit onerous at first, but I think it is better for two main reasons:

  1. The configuration step in CMake lets the user view a nice graphical layout of all the options the developers offer to configure and build optional areas of the product. As the configure happens, the display is updated to show what was learned about the environment and its capabilities. Only when all the settings look correct and desired does the user generate the build files and proceed to build the software. It takes the guesswork out of knowing if you’ve specified the correct configure options, or even knowing what options you have to pick from.
  2. It will probably cause more projects to offer pre-built binary install packages, such as RPMs and Windows installers, to help users get going quicker. One of CMake’s components, CPack, helps to ease this process as well.

The impending Qpid 0.5 release is the last one based on autotools and native Visual Studio projects. The CMake changes are pretty well in place on the development trunk and we’re about ready to remove the source-controlled Visual Studio files and then the autotools stuff. Next time I’ll discuss more of what we had to do to get to this point.

Review of AMQP Face-to-Face Public Meeting, April 1, 2009

April 2, 2009
View from Scripps Forum at UCSD

Holding a public review for a new middleware specification somewhere as gorgeous as the Scripps Institution of Oceanography at UC San Diego takes guts. I mean, really, with views like this, the material being presented has to be really good to keep the audience’s attention.

AMQP, and its energetic originator and chief evangelist, John O’Hara of JP Morgan, did not disappoint, referring to AMQP as “Internet Protocol for Business Messaging” with a vision of an AMQP endpoint accompanying every TCP endpoint, much as is the case with HTTP today.

With the AMQP specification nearing version 1.0, it was time to bring the community up to date on progress and what has changed since version 0-10.

During the 1.0 work, Mark Blair from Credit Suisse headed a “user sig” to ensure that business requirements would be met by the AMQP specification. The sig identified a number of critical characteristics:

  • Ubiquitous, pervasive. In addition to the wire protocol, the AMQP license is extraordinarily open, ensuring it can be obtained and implemented by virtually any interested party.
  • Safety – trusted/secure operation is a priority.
  • Fidelity – predictable delivery semantics across boundaries
  • Unified – wants to be the sole messaging tool for all; has new global addressing format
  • Interoperability between different implementations
  • Manageable – users want standard management and plug-ins to other existing standard management tools

There was a lot of information at multiple levels of technical detail that you can read about at www.amqp.org (the presentations are at the bottom of the page), but I’ll briefly explain the two items that struck me most as being big improvements for version 1.0:

  1. Exchanges are gone. Pre version 1.0, an Exchange accepts messages from producer applications and routes them to message queues using prearranged message binding criteria. In real-life situations this left too much responsibility to the producing application; the arrangement has been simplified for version 1.0. Applications now interact only with queues having well-known names. A new element, Links, has been added. Links move messages between queues using contained routing and predicate logic. They can be configured using the new management facilities. The new arrangement is simpler and more flexible and will likely result in even simpler client implementations.
  2. A new Service concept – a Service is an application inside the broker. You can add new ones as needed. Interbroker operation is a service, as is the management facility. I’m sure many new, creative things can be added to AMQP implementations with this new facility.

So when will AMQP version 1.0 be available? The PMC is working to iron out the remaining details of the specification now. Then the implementing begins. After there are at least two independent implementations of the specification that successfully interoperate, the specification will be considered the final version 1.0.

I came away from the meeting very excited about AMQP’s future. Message-oriented middleware (MOM) is a fundamental piece of many systems and is the natural solution to many networked application areas, but the solution space has been fragmented and often very expensive. That’s all changing. There are hundreds of real-life AMQP deployments today. The attention AMQP has generated is leading to more interest from more varied application areas. I’m confident that the new facilities and simplifications for the next AMQP version will further its progress even more.

Sometimes Using Less Abstraction is Better

March 24, 2009

Last week I was working on getting the Apache Qpid unit tests running on Windows. The unit tests are arranged to take advantage of the fact that the bulk of the Qpid client and broker is built as shared/dynamic libraries. The unit tests invoke capabilities directly in the shared libraries, making them easier to test. Most of the work needed to get these tests built on Windows was taken care of by the effort to build DLLs on Windows. However, there was a small but important piece remaining that posed a challenge.

Being a networked system, Qpid needs tests that make sure it correctly handles situations where the network or the network peer fails or acts in some unexpected way. The Qpid unit tests have a useful little class named SocketProxy which sits between the client and broker. SocketProxy relays network traffic in each direction but can also be told to drop pieces of traffic in one or both directions, and can be instructed to drop the socket in one or both directions. Getting this SocketProxy class to run on Windows was a challenge. SocketProxy uses the Qpid common Poller class to know when network data is available in one or both directions, then directly performs the socket recv() and send() as needed. This use of Poller, ironically, was what caused me problems. Although the Windows port includes an implementation of Poller, it doesn’t work in the same fashion as the Linux implementation.

In Qpid proper, the Poller class is designed to work in concert with the AsynchIO class; Poller detects and multiplexes events and AsynchIO performs I/O. The upper-level frame handling in Qpid interacts primarily with the AsynchIO class. Below that interface there’s a bit of difference from Linux to Windows. On Linux, Poller indicates when a socket is ready, then AsynchIO performs the I/O and hands the data up to the next layer. However, the Windows port uses overlapped I/O and an I/O completion port; AsynchIO initiates I/O, Poller indicates completions (rather than I/O readiness), and AsynchIO gets control to hand the resulting data to the next layer. So the interface between the frame handling and I/O layers in Qpid is the same for all platforms, but the way that Poller and AsynchIO interact can vary between platforms as needed.

My initial plan for SocketProxy was to take it up a level, abstraction-wise. After all, abstracting away behavior is often a good way to make better use of existing, known-to-work code, and avoid complexities. So my first approach was to replace SocketProxy’s direct event-handling code and socket send/recv operations with use of the AsynchIO and Poller combination that is used in Qpid proper.

The AsynchIO-Poller arrangement’s design and place in Qpid involves some dynamic allocation and release of memory related to sockets, and a nice mechanism to do orderly cleanup of sockets regardless of which end initiates the socket close. Ironically, it is this nice cleanup arrangement that tanked its use in the SocketProxy case. Recall that SocketProxy’s usefulness is its ability to interrupt sockets in messy ways without being messy itself in terms of leaking handles and memory. My efforts to get AsynchIO and Poller going in SocketProxy resulted in memory leaks, sockets not getting interrupted as abruptly as needed for the test, and connections not getting closed properly. It was a mess.

The solution? Rather than go up a level of abstraction, go down. Use the least common denominator for what’s needed in a very limited use case. I used select() and fd_set. This is just what I advise customers not to do. Did I lose my mind? Sell out to time pressure? No. In this case, using less abstraction was the correct approach – I just didn’t recognize it immediately.
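To give a feel for the scale of the solution, a select()-based relay loop for two connected sockets can be as small as the sketch below. This is illustrative only, not the actual SocketProxy code; POSIX headers are shown, and the Winsock version differs mainly in the includes and in using closesocket().

#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>
#include <algorithm>

// Relay bytes between two connected sockets until either side closes.
void relay(int client, int server)
{
    for (;;) {
        fd_set readable;
        FD_ZERO(&readable);
        FD_SET(client, &readable);
        FD_SET(server, &readable);
        if (select(std::max(client, server) + 1, &readable, 0, 0, 0) <= 0)
            break;

        const int pairs[2][2] = { { client, server }, { server, client } };
        for (int i = 0; i < 2; ++i) {
            int from = pairs[i][0], to = pairs[i][1];
            if (!FD_ISSET(from, &readable))
                continue;
            char buf[8192];
            ssize_t n = recv(from, buf, sizeof buf, 0);
            if (n <= 0)
                return;                  // peer closed or error: stop relaying
            // A failure-injecting proxy could drop or mangle bytes here.
            send(to, buf, n, 0);
        }
    }
}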

So what made this situation different from “normal”? Why was it a proper place to use less abstraction?

  • The use case is odd. Poller and AsynchIO are very well designed for running the I/O activities in Qpid, correctly handling all socket activity quickly and efficiently. They’re not designed to force failures, and that’s what was needed. It makes no sense to redesign foundational classes in order to make a test harness more elegant.
  • The use is out of the way. It’s a test harness, not the code that has to be maintained and relied on for efficient, correct performance in deployed environments.
  • Its needs are limited and isolated. SocketProxy handles only two sockets at a time. Performance is not an issue.

Sometimes less is more – it works in abstractions too. The key is to know when it really is best.

Lessons Learned Converting Apache Qpid to Build DLLs on Windows

March 12, 2009

During fall and winter 2008 I worked on the initial port of Apache Qpid to Windows (I blogged about this in January 2009, here). The central areas of Qpid (broker, client) are arranged as a set of shared libraries, qpidclient (the client API), qpidbroker (the guts of the broker), and qpidcommon (supporting code shared by other parts of Qpid). As I mentioned in my previous blog entry, I built the libraries as static libraries instead of shared libraries (“DLLs” in Windows) primarily due to time constraints. Changing the build to produce DLLs for Windows is one of the items I noted that should be done in the future. The primary reasons for building as DLLs are:

  • Probably reduces memory load, and enables easier maintenance of the product. These are standard drivers for using shared libraries in general.
  • Enables expanded use of Qpid in other use cases where dynamic loading is required, such as plug-in environments and dynamically assembled components.
  • Allows the full regression test suite to run on Windows more easily; many of the broker-side unit tests link the qpidbroker library, something I didn’t account for in the initial port.

So, there was clearly a need for switching to DLLs. A couple of other companies saw the value in using DLLs as well: Microsoft, which sponsored my efforts to do the DLL conversion (and some other things coming up), and WSO2, whose Danushka Menikkumbura developed many of the patches for Qpid that went into this project.

Unlike when building shared libraries on UNIX/Linux (or most any other OS), on Windows the developer must annotate the source code to tell the compiler/linker which entry points in the library should be “exported”, or made known to external users of the DLL. Conversely, the same entry points must be marked “imported” for consumers of the library. The standard way to do this is to define a macro, for example QPID_CLIENT_EXTERN, that is defined as __declspec(dllexport) when building a DLL and as __declspec(dllimport) when using that DLL. Typically this is accomplished in a header file, such as this example from the Qpid client source:

#if defined(WIN32) && !defined(QPID_DECLARE_STATIC)
#if defined(CLIENT_EXPORT)
#define QPID_CLIENT_EXTERN __declspec(dllexport)
#else
#define QPID_CLIENT_EXTERN __declspec(dllimport)
#endif
#else
#define QPID_CLIENT_EXTERN
#endif

Then the class or method, etc. that should be exported is annotated with QPID_CLIENT_EXTERN. When developing C++ code, the most direct way to make classes and their methods available is to mark the class as exported:

class QPID_CLIENT_EXTERN MyClass {
public:
...
};

This works well in many cases (this is how ACE does nearly all of its exporting, for example). However, in Qpid this technique quickly hit some problems. The reason is that if a particular class is exported, all of its ancestors have to be exported from DLLs as well (and their ancestors, and so on), as do any non-POD types exposed in the method interfaces or used as static members of the exported class. Given Qpid’s layers of inheritance and reuse, annotating only the classes that users should be using was not going to fly with a simple export-the-classes scheme. Fortunately Danushka was ahead of the curve here and prepared patches to export only the necessary methods. For example:

class MyClass {
public:
    QPID_CLIENT_EXTERN MyClass();
    QPID_CLIENT_EXTERN ~MyClass();
    QPID_CLIENT_EXTERN int myMethod();
...
private:
    MyOtherClass data;
};

Note that now the class is not exported as a whole, but all the methods needed to use it are. In particular, note that MyOtherClass need not be exported. The obvious trade-off here is that we avoid the issues with inheritance hierarchy needing to be exported, but each user-required method must be annotated explicitly. This is the way we primarily did the Qpid DLL work.

In summary, the lesson learned is that exporting individual class members is more tedious to implement, but provides a measure of independence and flexibility in class design and reuse. I originally looked dubiously at the technique of marking individual members, but Danushka’s ideas here ended up making the end result better and easier for users to apply in new projects. The forthcoming M5 release will contain these new improvements to Apache Qpid.