Things I Like About Working on Apache Qpid

March 4, 2009

I’ve mentioned before that I’ve been working on the Apache Qpid project, particularly on its port to Windows, first available in the M4 release. I also work on other open source projects related to networked application programming (primarily, ACE). Since no two customers’ needs are the same, it pays to have multiple tools available in order to craft the best solution for each situation.

Although Qpid graduated as an Apache top-level project (TLP) some time ago, Apache issued a press release about it this week. As I was reading the release, I was reminded of some of the reasons I really enjoy working with the Apache Qpid team.

  • Meritocracy: the Apache way of working requires contributors to demonstrate their skills and desire to contribute over time before being admitted to the set of people who can commit to the source code repository. Prior to this step, code must be vetted by more senior members, who assist with integrating and testing as well as helping newcomers along on the path to committership.
  • Intellectual property rights handling: Lawyers get the sweats when dealing with some open source projects because of the fear of intellectual property rights issues which may arise down the road. Apache has put a very nice system in place for ensuring that contributions to the project have all intellectual property rights properly assigned so there are no issues that users need to worry about.
  • Quality: I’ve been very impressed by the experience, skill, and professionalism of the Apache Qpid development and project team. I’m proud to be a part of this team and they inspire me daily to reach higher and make Qpid the best it can be.

I’m pleased to have customers that see the value in sponsoring work on Qpid because the resulting product provides fantastic enterprise messaging functionality without the exorbitant enterprise messaging price tag. I’m currently working to improve the Qpid user experience for Windows developers as well as reworking the build system to make it easier to keep the Windows and Linux/UNIX builds in synch. Many of the Windows improvements (building the libraries as DLLs, producing an installable prebuilt kit) will be available at the next release (M5) in a month or so. The build improvements will get worked into the development stream after M5.

USB pass-thru in RHEL 5 Xen VM doesn’t work; why do I buy support?

February 23, 2009

As part of my efforts to maintain ACE+TAO on LabVIEW RT (with the Pharlap ETS kernel), I have a setup that runs the test suite on a National Instruments chassis, driven by the build system on Windows. This arrangement is easily handled by the ACE+TAO build environment, including a mechanism to reboot the NI box when things go wrong. The reboot is triggered by a USB-connected NI USB-6009 device that trips the reset signal on the NI box. It’s very slick and keeps me from having to cycle power. The hitch is that it requires a USB 2.0 connection from the Windows machine.

In the past I’d used a VMware virtual machine hosted on Linux (RHEL 4) running a Windows guest OS to host this test environment. The VMware software passed the USB device through to the Windows VM without a hitch. However, over the winter I got a new machine set up with a great deal more capacity and decided to move the LabVIEW RT test environment to the new machine which runs RHEL 5 and Xen.

And that’s when the trouble started…

First I had to search quite a bit to find out how to configure the Xen VM to pass the USB device to the guest OS. After a bit of googling and reading, I found the magic configuration lines to add. I also found another blog entry (http://www.olivetalks.com/2008/02/03/usb-forwarding-on-xen-it-just-does-not-work/) saying it wouldn’t work right. But I forged on, confident that even if it didn’t work “out of the box” I had purchased support from Red Hat and could get any help I needed.
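
For reference, the configuration in question typically lives in the guest’s definition file under /etc/xen/. A minimal sketch of the sort of lines involved – the vendor:product IDs below are placeholders (the real pair comes from lsusb output), and the exact syntax can vary with the Xen version:

    # in the HVM guest's config file -- xxxx:yyyy below are placeholders
    usb = 1
    usbdevice = "host:xxxx:yyyy"   # pass the matching host USB device to the guest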

Well, long story short, the USB device didn’t pass through correctly from Xen. On December 9, 2008 I opened a support case with Red Hat to have them do whatever was needed to make it work. After twelve (12) exchanges over 22 days, I requested escalation to someone who could do more to help than quote manual sections that were not applicable to what I needed.

After 11 more exchanges with 3 more support engineers over another 49 days I got the long-awaited answer: “It doesn’t work.”

Well, I wasn’t totally surprised since I had no success and had already seen a blog posting saying it won’t work. But I was still clinging to hope that my support contract would come through and Red Hat would make it work. Nope. Sorry. It don’t work. End of story.

So why do I buy support? Sure, I get all the updates, but I paid extra for someone to actually work on problems for me, and all I got was “It doesn’t work.” When my customers raise issues about ACE not working, they get fixes. Solutions. You know, like they paid for.

Apparently, solutions are optional for other providers.

So what happened in the end? I went back to running the Windows VM in a VMware environment, where it’s happily chugging along.

Analysis of ACE_Proactor Shortcomings on UNIX

January 22, 2009

I’ve been looking into two related issues in the ACE development stream:

  1. SSL_Asynch_Stream_Test times out on HP-UX (I recently made a bunch of fixes to the test itself so it runs as well as it can on Linux, but it still times out on HP-UX)
  2. Proactor_Test shows a stray, intermittent diagnostic on HP-UX: EINVAL returned from aio_suspend()

Although I’ve previously discussed use of ACE_Proactor on Linux (https://stevehuston.wordpress.com/2008/11/25/when-is-it-ok-to-use-ace-proactor-on-linux/), the issues on HP-UX are of a different sort. If the previously discussed aio issues inside Linux are ever resolved, the same problem I’m seeing on HP-UX may arise there as well, but Linux doesn’t currently get that far. Also, I suspect that the issues arising when these tests execute on Solaris are of the same nature, though the symptoms are a bit different.

The symptoms are that the proactor event loop either fails to detect completions, or it gets random errors that smell like the aiocb list is damaged. I believe I’ve got a decent idea of what’s going on, and it’s basically two issues:

  1. If all of the completion dispatch threads are blocked waiting for completions when new I/O is initiated, the new operation(s) are not taken into account by the threads waiting for completions. This is basically the case in the SSL_Asynch_Stream_Test timeout on HP-UX – all the completion-detecting threads are already running before any I/O is initiated, so no completions are ever detected.
  2. The completion and initiation activities modify the aiocb list used to detect completions directly, without interlocks, and without consideration of what effect the changes may have (or not) on the threads waiting for completions.

The ACE_Reactor framework uses internal notifications to handle the need to unblock waiting demultiplexing threads so they can re-examine the handle set as needed; something similar is needed for the ACE_Proactor to remedy issue #1 above. There is a notification pipe facility in the proactor code, but I need to see if it can be used in this case. I hope so…
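
To make the idea concrete, here is a minimal sketch of how a notification pipe can unblock aio_suspend() waiters – illustrative only, with my own names and no error handling; it is not ACE’s actual implementation:

    #include <aio.h>
    #include <unistd.h>
    #include <cstring>

    // Keep one outstanding aio_read on a pipe; its aiocb is included in the
    // list passed to aio_suspend(), so writing a byte to the pipe completes
    // that read and wakes a waiting thread to re-scan the aiocb list.
    class NotifyPipe {
    public:
        NotifyPipe() {
            ::pipe(fds_);
            std::memset(&cb_, 0, sizeof(cb_));
            cb_.aio_fildes = fds_[0];
            cb_.aio_buf    = &byte_;
            cb_.aio_nbytes = 1;
            ::aio_read(&cb_);                  // arm the notification
        }
        aiocb *waiter() { return &cb_; }       // add this to the aio_suspend list
        void notify() {                        // call after initiating new I/O
            char b = 1;
            ::write(fds_[1], &b, 1);
        }
        void rearm() { ::aio_read(&cb_); }     // re-issue after a wakeup
    private:
        int   fds_[2];
        aiocb cb_;
        char  byte_;
    };

A completion thread that returns from aio_suspend() checks whether the notification aiocb completed; if so, it consumes the byte, re-arms the read, and rebuilds its view of the outstanding-operations list.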

The other issue – concurrent access to the aiocb list by threads that are waiting for completions and threads that are modifying the list – is a much larger problem. It requires a more fundamental change in the innards of the POSIX Proactor implementation.

Note that there are a number of POSIX Proactor flavors inside ACE (section 8.5 in C++NPv2 describes most of them). The particular shortcomings I’ve noted here only affect the ACE_POSIX_AIOCB_Proactor and ACE_POSIX_SIG_Proactor, which is largely based on the ACE_POSIX_AIOCB_Proactor. The newest one, ACE_POSIX_CB_Proactor, is much less affected, but is not as widely available.

So, the Proactor situation on UNIX platforms is generally not too good for demanding applications. Again, Proactor on Windows is very good, and recommended for high-performance, highly scalable networked applications. On Linux, stick to ACE_Reactor using the ACE_Dev_Poll_Reactor implementation; on other systems, stick with ACE_Reactor and ACE_Select_Reactor or ACE_TP_Reactor depending on your need for multithreaded dispatching.

My Experiences Porting Apache Qpid C++ to Windows

January 9, 2009

I recently finished (modulo some capabilities that should be added) porting Apache Qpid’s C++ implementation to Microsoft Windows. Apache Qpid also sports a Java broker and client as well as Python, Ruby, C#, and .NET clients. For my customer’s project I needed the C++ implementation, which had, to date, been developed and used primarily on Linux. What I thought would be a slam-dunk 20-40 hour piece of work took about 4 months and hundreds of hours. Fortunately, my customer’s projects waiting for this work were also delayed, and my customer was very accommodating. Still, since I goofed the estimate so wildly, I only billed the customer a fraction of the hours I worked. Fortunately, I rarely goof estimates that badly. This post-project review takes a look at what went wrong and why it ended up a very good thing.

When I first looked through the Qpid code base, I got some initial impressions:

  • It’s nicely layered, which will make life easy
  • It’s hard to find one’s way around it
  • The I/O layer (at the bottom of the stack) is easily modified for what I need

The first two impressions held; the third did not. Most of the troubles and false starts had to do with the I/O layer at the bottom of the stack. Most of the rest of the code ported over with relative ease. The original authors did a very nice job isolating code that was likely to need varying implementations. Those areas generally use the Bridge pattern to offer a uniform API that’s implemented differently as needed.

The general areas I had to work on for the port are described below.

Synchronization

Qpid uses multiple threads – no big surprise for a high-performance networked system. So there’s of course a need for synchronization objects (mutexes, condition variables, etc.). The existing C++ code had nice wrapper classes and a Pthreads implementation. The options for completing the Windows implementation were:

  • Use native Windows system calls
  • ACE (relatively obvious for me)
  • Apache Portable Runtime (it’s an Apache project after all)
  • Boost (Qpid already made use of Boost in many other areas)

Windows system calls were ruled out fairly quickly because they don’t offer everything that was needed on XP (particularly, condition variables), and the interaction between the date-time parts of the existing threading/synch classes and the Windows system time was very clunky.

I was hesitant to add ACE as an outside requirement just for the Windows port. I was also sensitive to the fact that as a newbie on this particular project I could be seen as simply trying to wedge ACE in for my own sinister purposes (which is definitely not the case!). So scratch that one.

After a brief but unsuccessful attempt at APR (and being told that some previous APR use was abandoned) I settled on Boost. This was my first project using Boost and it took some getting used to, but overall was pretty smooth.
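
For a flavor of the result, here is a minimal sketch of a Boost-based synchronized queue in the pre-C++11 style of the era – illustrative only, not Qpid’s actual wrapper classes:

    #include <boost/thread/mutex.hpp>
    #include <boost/thread/condition.hpp>
    #include <queue>

    // A synchronized producer/consumer queue built on Boost.Thread.
    template <typename T>
    class SyncQueue {
    public:
        void push(const T &item) {
            boost::mutex::scoped_lock lock(mutex_);
            queue_.push(item);
            cond_.notify_one();            // wake one waiting consumer
        }
        T pop() {
            boost::mutex::scoped_lock lock(mutex_);
            while (queue_.empty())         // loop guards against spurious wakeups
                cond_.wait(lock);
            T item = queue_.front();
            queue_.pop();
            return item;
        }
    private:
        boost::mutex     mutex_;
        boost::condition cond_;
        std::queue<T>    queue_;
    };

The wait loop that guards against spurious wakeups is exactly the kind of detail any of the candidate implementations has to get right.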

Thread Management

The code that actually spawned and managed threads was easily implemented using native Windows system calls. Straight-forward and easy.
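
As a sketch of what “native” means here – assuming the conventional _beginthreadex idiom, which is the CRT-safe way to spawn a thread on Windows; this is illustrative, not Qpid’s actual Thread class:

    #include <windows.h>
    #include <process.h>

    // Thread entry point must use the __stdcall convention for _beginthreadex.
    unsigned __stdcall run(void *arg) {
        // ... thread body ...
        return 0;
    }

    int main() {
        unsigned id = 0;
        HANDLE h = reinterpret_cast<HANDLE>(
            ::_beginthreadex(0,        // default security attributes
                             0,        // default stack size
                             run,      // entry point
                             0,        // argument passed to run()
                             0,        // 0 == start running immediately
                             &id));    // receives the thread id
        ::WaitForSingleObject(h, INFINITE);   // join
        ::CloseHandle(h);
        return 0;
    }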

I/O

This is where all the action is. The existing code comments (there aren’t many, but what was there was descriptive) talked about “Asynch I/O.” This was welcome since I planned to use overlapped I/O to get high throughput; Windows’ implementation of asynchronous (they call it overlapped) I/O is very good, and it scales and performs very well. The interface to the I/O layer from the upper level in Qpid looked good for asynchronous I/O, and I got a little overconfident. In retrospect, the name of the event dispatcher class (Poller) should have tipped me off that I had some difficulty ahead.

The Linux code’s Poller implementation uses Linux epoll to get high performance and remain very scalable. The code is solid and well designed and implemented. However, it is event-driven, synchronous I/O, and that tends to show through a bit more than perhaps intended. Handles need to be registered with the Poller, for example – something that’s not done with overlapped I/O.
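
That registration step shows in the basic shape of any epoll-based event loop. A minimal sketch (simplified, and not Qpid’s actual Poller code):

    #include <sys/epoll.h>
    #include <unistd.h>

    // Handles are registered, then a thread waits for readiness events.
    void poll_loop(int fd) {
        int ep = ::epoll_create(64);               // size is only a hint
        epoll_event ev = {};
        ev.events = EPOLLIN;                       // readiness, not completion
        ev.data.fd = fd;
        ::epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev);   // the registration step

        epoll_event out[16];
        for (;;) {
            int n = ::epoll_wait(ep, out, 16, -1);
            for (int i = 0; i < n; ++i) {
                // out[i].data.fd is ready: the actual read()/accept() is
                // still performed synchronously, here in the application
            }
        }
    }

epoll reports readiness and the application still performs the I/O itself; overlapped I/O inverts this, reporting completion of transfers the OS has already performed.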

My first attempt at the Windows version of a Poller implementation was large and disruptive. Fortunately, once I offered it up for review I received a huge amount of help from the I/O layer expert on the project. He and I sat down for a morning to review the design, the code I came up with, and best ways to go forward. The people I’ve worked with on Apache Qpid are consummate professionals and I’m very thankful for their input and guidance.

My second design for the I/O layer went much better. It doesn’t interfere with the Linux code, and slides in nicely with very little code duplication. I think that after another port or two are done where more of these implementations need to be designed, it may be possible to refactor some of the I/O layer to make things a bit cleaner, but that’s very minor at this point – the code works very well and integrates without disruption.
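
For contrast with the epoll sketch above, the corresponding shape on Windows collects completions from an I/O completion port. Again a simplified sketch under my own naming, not the actual Qpid code:

    #include <winsock2.h>
    #include <windows.h>

    // Blocks until a previously initiated overlapped operation finishes,
    // then dispatches it; one such loop can run in each of a small pool
    // of threads.
    void completion_loop(HANDLE iocp) {
        for (;;) {
            DWORD bytes = 0;
            ULONG_PTR key = 0;         // identifies the socket/session
            OVERLAPPED *ov = 0;        // identifies the specific operation
            BOOL ok = ::GetQueuedCompletionStatus(iocp, &bytes, &key, &ov,
                                                  INFINITE);
            if (ov == 0) break;        // queue closed or fatal error
            // ... dispatch completion of 'ov' for session 'key';
            //     'bytes' were transferred, 'ok' is FALSE for failed I/O ...
        }
    }

    // Elsewhere, each socket is associated with the port exactly once:
    //   ::CreateIoCompletionPort((HANDLE)sock, iocp, session_key, 0);
    // and I/O is *initiated* with WSARecv()/WSASend() passing an OVERLAPPED --
    // there is no per-event registration step as with epoll.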

Lessons Learned

So what did I learn from this?

  1. It pays to spend a little extra time reading the existing code before designing extensions. Even when it looks pretty straight-forward. Even if you have to write some design documentation and run it by the original author(s).
  2. Forge good relationships with the other people on the team. This is an easy one when you all work in the same group, even in the same building. It’s more often assumed to be difficult at best when the group is spread around the world and across (sometimes competing) companies. It’s worth the effort.

So although the project took far longer than I originally estimated, the result is a good implementation that fits with the rest of the system and performs well. I could have wedged in my original bad design in far less time, but someone would have had to pick up the pieces later. The design constraints and rules that were previously unwritten are now at least partly written down (in the Windows code, anyway). If I do another port, it’ll be much smoother next time.

Where to Go From Here?

There are a few difficulties remaining for the Windows port and a few capabilities that should be added:

  • Keep project files synched with generated code. The Qpid project’s build process generates a lot of code from XML protocol specifications. This is very nice, but runs into trouble keeping the Visual Studio project files up to date as the set of generated files changes. I’ve been using the MPC tool to generate Visual Studio projects and MPC can pick up names by wildcard, but that still leaves an extra step: generate C++ code, regenerate project files. This need has caused a couple of hiccups during the Qpid M4 release process where I had to regenerate the project files. It would be nice if Visual Studio’s C++ build could deal with wildcards, or if the C++ project file format allowed inclusion of an outside file listing sources (which could be generated along with the C++ code).
  • Add SSL support. The Linux code uses OpenSSL. I’d rather not pull in another prerequisite when Windows knows how to do SSL already. At least I assume it does, and in a way that doesn’t require an HTTP connection to use. I’ve yet to figure this out…
  • Persistent storage for messages. AMQP (and Qpid) allows for persistent message store, guaranteeing message delivery in the face of failures. There’s not yet a store for Qpid on Windows, and it should be added.
  • Add the needed “declspec”s to build the Qpid libraries as DLLs; currently they’re built as static libraries (a sketch of the usual export-macro pattern follows this list).
  • Minor tweaks making Qpid integrate better with Windows, such as a message definition for storing diagnostics in the event log and being able to set the broker up as a Windows Service.
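
On the “declspec” item above: the conventional Windows pattern is an export macro that expands to dllexport while building the DLL and dllimport while using it. A minimal sketch – the macro names here are invented for illustration and may not match what Qpid ends up using:

    // qpid_export.h -- hypothetical names, for illustration only
    #if defined(_WIN32)
    #  if defined(QPID_COMMON_BUILD)      // defined while compiling the DLL
    #    define QPID_COMMON_EXTERN __declspec(dllexport)
    #  else                               // default for code using the DLL
    #    define QPID_COMMON_EXTERN __declspec(dllimport)
    #  endif
    #else
    #  define QPID_COMMON_EXTERN          // no-op on non-Windows builds
    #endif

    class QPID_COMMON_EXTERN Example {    // visible across the DLL boundary
    public:
        void method();
    };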

There’s No Substitute for Experience with Threads

January 5, 2009

When your system performance is not all you had hoped it would be, are you tempted to think that adding more threads will speed things up? When your customers complain that they upgraded to the latest multicore processor but your application doesn’t run any faster, what answers do you have for them?

Even if race conditions, synchronization bottlenecks, and atomicity are part of your normal vocabulary, the world of multithreaded programming is one where you must really understand what’s below the surface of the API to get the full picture of why some piece of code is, or is not, working. I was reminded of this – and of just how deep that understanding must go – while catching up on some reading this week.

I was introduced to threads (DECthreads, the precursor to Pthreads, for you history buffs) in the early 1990s. Neat! I can do multiple things at the same time! Fortunately, I spent a fair amount of time in my programming formative years working on an operating system for the Control Data 3600 (anyone remember the TV show “The Bionic Man”? The large console in the bionics lab was a CDC-3600). I learned the hard way that the world can change in odd ways between instructions. So I wasn’t completely fooled by the notion of magically being able to do multiple things at the same time, but threading libraries make the whole area of threads much more approachable. But with power comes responsibility – the responsibility to know what you’re doing with that power tool.

I’ve been working on multithreaded code for many years now and find multithreading a powerful tool for building high-performance networked applications. So I was eager to read the Dr. Dobb’s article “Lock-Free Code: A False Sense of Security” by Herb Sutter (his blog entry related to the article is here). His “Effective Concurrency” column is a regular favorite of mine. The article is a diagnosis of an earlier article describing a lock-free queue implementation. I had previously read the article being diagnosed and, although I only skimmed it since I had no immediate need for a single-writer, single-reader queue at the time, I didn’t catch anything really wrong with it. So I was anxious to see what I missed.

Boy, I missed a few things. Now that I see them explained, it’s like “ah, of course,” but I probably wouldn’t have thought about those issues until I was trying to figure out what was wrong at runtime. Some may say the issues are sort of esoteric and machine-specific, and I may agree, but it doesn’t matter – it’s a case of understanding your environment and tools, and another situation where experience makes all the difference between banging your head on the wall and getting the job done.

I’m thankful that I can get more understanding by reading the works of smart people who’ve trodden before me. I’m sure that knowledge will save me some time at some point when debugging some odd race condition. And that’s what it’s all about – learn, experience, save time. Thanks Herb.

There’s No Substitute for Experience with TCP/IP Sockets

December 31, 2008

The number of software development tools and aids available to us as we begin 2009 is staggering. IDEs, code generators, component and class libraries, design and modeling tools, high-level protocols, etc. were just speculation and dreams when I began working with TCP/IP in 1985. TCP and IP were not yet even approved MIL-STDs and the company I worked for had to get US Department of Defense permission to connect to the fledgling Internet. The “Web” was still 10 years away. If you wanted to use TCP/IP for much more than FTP, Telnet, or email you had to write the protocol and the code to run it yourself. The Sockets API was the highest level access we had at the time. That is a whole area of difficulty and complexity in and of itself, which C++ Network Programming addresses. But the API is more of a usage and programming efficiency issue – what I’m talking about today is the necessity of experience and understanding what’s going on between the API and the wire when working with TCP/IP, regardless of the toolkit or language or API you use.

A lot of software running on what many people consider “the net” piggy-backs on HTTP in one form or another. There are also many helpful libraries, such as .NET and ACE, to assist in writing networked applications at a number of levels. More specific problem areas also have very useful targeted solutions, such as message queuing systems and Apache Qpid. And, like most programming tasks, when everything’s ideal, it’s not too hard to get some code running pretty well. It’s when things don’t work as well as you planned that the way forward becomes murky. That’s when experience is needed. These are some examples of issues I’ve assisted people with lately:

  1. Streaming data transfer would periodically stop for 200 msec, then resume
  2. Character strings transferred would intermittently be bunched together or split apart
  3. Asynchronous I/O-based code stopped working when ported to Linux

The tendency when problems such as these come up is to find out who, or what, is to blame. In my experience, blame first lands on the most recent addition to the programming toolset – the piece trusted the least, and usually the one closest to the application being written. For ACE programs that piece is often ACE itself, which is usually why I get involved so early.

I’ve spent many years debugging applications and network protocol code. I spent way too much time trying to blame the layer below me, or the OS, or the hardware. The biggest lesson I learned is that when something goes wrong with code I wrote, it’s usually my problem and it’s usually a matter of some concept or facility I don’t understand enough to see the problem clearly or find the way to a solution. That’s why it’s so important to understand the features and functionality you are making use of – there’s no substitute for experience.

Helping my clients solve the three problems I mentioned above involved experience. Knowing where to target further diagnosis and gathering the right information made the difference between solving the problem that day and staring at code for days wondering what’s going on. Curious about what the problems were?

  1. Slow-start peculiarity on the receiver; disable Nagle’s algorithm on the receiving side.
  2. That’s the streaming nature of TCP at work. You need to mark the string boundaries and check for them on receive (see the framing sketch after this list).
  3. Linux silently converts asynchronous socket I/O operations to synchronous and executes them in order; you need to restructure the order of operations in a very restricted way, or switch paradigms on Linux.
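
On the second item, the classic fix is explicit framing, since TCP delivers a byte stream with no built-in message boundaries. A minimal length-prefix sketch using the Sockets API (error handling trimmed):

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <arpa/inet.h>      // htonl()/ntohl()
    #include <stdint.h>
    #include <string>

    // Loop because TCP may accept or deliver fewer bytes than requested.
    static bool xfer_all(int sock, char *buf, size_t n, bool sending) {
        while (n > 0) {
            ssize_t r = sending ? ::send(sock, buf, n, 0)
                                : ::recv(sock, buf, n, 0);
            if (r <= 0) return false;
            buf += r; n -= static_cast<size_t>(r);
        }
        return true;
    }

    // Send a 4-byte network-order length prefix, then the payload.
    bool send_msg(int sock, const std::string &s) {
        uint32_t len = htonl(static_cast<uint32_t>(s.size()));
        std::string out(reinterpret_cast<char *>(&len), sizeof(len));
        out += s;
        return xfer_all(sock, &out[0], out.size(), true);
    }

    // Read the length prefix, then exactly that many payload bytes.
    bool recv_msg(int sock, std::string &s) {
        uint32_t len = 0;
        if (!xfer_all(sock, reinterpret_cast<char *>(&len), sizeof(len), false))
            return false;
        s.resize(ntohl(len));
        return s.empty() || xfer_all(sock, &s[0], s.size(), false);
    }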

Although each client initially targeted blame at the same place, the real solution was in a different layer for each case. And none involved the software initially thought to be at fault.

When you are ready to begin developing a new networked application, or you’re spending yet another night staring at code and network traces, remember: there’s a good chance you need a little more clarity on something. Take a step back, assume the tools you’re using are probably correct, and begin to challenge your assumptions about how you think it’s all supposed to work. A little more understanding and experience will make it clear.

Apache Qpid graduates incubator; now a top-level project

December 11, 2008

The Apache Qpid project has been in incubation at the Apache Software Foundation for quite a while now, delivering at least three releases along the way. Recently the Apache Software Foundation board of directors voted to graduate the project from the incubator as a new top-level project (TLP) at Apache. This is a major milestone for Qpid and is based on:

  • A proven ability to manage and coordinate development and to release a product
  • Cultivation of a community of developers with sufficient diversity

I joined the Apache Qpid project this past summer, primarily to lead the port to Windows. I’ve been impressed with the development team’s professionalism, experience, and commitment to quality.

Congratulations to the Apache Qpid team on this great accomplishment!

When is it ok to use ACE Proactor on Linux?

November 25, 2008

The ACE Proactor framework (see C++NPv2 chapter 8, APG chapter 8) allows multiple I/O operations to be initiated and completed by a single thread. The idea is a good one, allowing a small number of threads to execute more I/O operations than could be done synchronously, since the OS handles the actual transfers in the background. Many Windows programmers use this paradigm with overlapped I/O very effectively.

The overlapped I/O facility is used by ACE Proactor on Windows, and when the time comes for many to port their ACE-based application to Linux (or Solaris, or HP-UX, or AIX, or…) they naturally gravitate toward carrying the Proactor model to Linux. Seems safe, since Linux offers the aio facility, so off they go.

And then it happens. I/O locks up and all progress stops. Why? Because the aio facility upon which ACE Proactor builds is very restricted for socket I/O on Linux (at least through the Linuxes I’ve worked on). The issue is that the I/O operations initiated using aio from the application are silently converted to synchronous and executed in order based on the handle used. To see why this is a problem, consider the following common asynch I/O idiom:

  1. Open socket
  2. Initiate a read (whether expecting data or desiring to sense a broken connection)
  3. Initiate a write that should immediately complete

When the aio operations are converted to synchronous, the read is executed first, in a blocking manner. The write (which really has data to send) will not execute until the read completes. Consider the situation where the peer is waiting for an initial protocol exchange before sending any data: the peer is waiting for the local end to send, but the local end’s data won’t actually be sent until the peer sends first. That will never happen, and we have deadlock.
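
Here is the shape of that trap as a minimal POSIX aio sketch – illustrative only, with error handling omitted and sock assumed to be a connected socket:

    #include <aio.h>
    #include <cstring>

    void start_io(int sock, char *rdbuf, size_t rdlen,
                  const char *wrbuf, size_t wrlen) {
        static aiocb rd, wr;              // must outlive the operations

        std::memset(&rd, 0, sizeof(rd));
        rd.aio_fildes = sock;
        rd.aio_buf    = rdbuf;
        rd.aio_nbytes = rdlen;
        ::aio_read(&rd);    // step 2: sense incoming data or a broken connection

        std::memset(&wr, 0, sizeof(wr));
        wr.aio_fildes = sock;
        wr.aio_buf    = const_cast<char *>(wrbuf);
        wr.aio_nbytes = wrlen;
        ::aio_write(&wr);   // step 3: on Linux this queues *behind* the read,
                            // which now blocks synchronously; if the peer waits
                            // for this write before sending, both sides deadlock
    }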

The only way to make a Proactor-based application work on Linux is to follow a strict lock-step protocol. A ping-pong model, if you will. Each side may only have one operation outstanding on the socket at a time. This is fairly fragile and doesn’t suit many applications well. But if you do have such an application model, you can safely use ACE Proactor on Linux.

Note that the aio facility and, thus, ACE Proactor, on HP-UX, AIX, etc. does not suffer from this “silently converts ops to synchronous” problem, so these restrictions don’t apply there.

For an example of how to program this lock-step protocol arrangement, see the ACE_wrappers/tests/Proactor_Test.cpp program – it has an option to run half-duplex.

Nice new site for help – stackoverflow.com

November 19, 2008

I recently ran across a new site where one can ask and answer questions about all things related to software development. It’s stackoverflow.com and it’s pretty nice.

The questions I’ve read so far are wide-ranging and the answers thorough and correct. There’s a ranking scale so the most correct and helpful answers bubble up to the top and are clearly marked. In addition to strictly technical questions there are also questions about how to do one’s job better, conditions at work, etc.

Since the site is wide open to all software development, it’s dominated by Windows-type things and web development. For those like me who are usually interested in networked programming, there are tags for networked, tcp/ip, etc., and there is a fair level of help there.

The site allows one to post and answer questions without paying a fee to join, so it is ad-supported. I think this is fine, as the ads are unintrusive and the wealth of advice and experience there is well worth it. I heartily recommend stackoverflow.com.

What’s “Networked Programming” all about?

November 15, 2008

When I’m asked what type of work I do, I often grasp for just the right terms to describe it. But it’s a blind spot for me, I guess. I have been writing network protocol software and networked applications for over 25 years, am considered a network programming expert, and have co-authored three books on the subject, but am not real big on buzzwords. When I mention I write software to make networks more useful, people assume it’s a web type of thing.

Actually, I do networked applications and systems involving pretty much anything except the web. When I started doing this, I actually used serial lines and modems. I worked with DECnet and ring-net, and I helped implement the TCP/IP stack (twice) back when you needed US DoD permission to connect to the Internet. Although TCP/IP (and its assorted related protocols) drives the Internet today, TCP/IP is used in many applications that don’t touch “the Net”. Medical devices, automobiles, cell phones, industrial processes… practically anything involving more than one computer that needs to talk is what I put in the category of “networked application.”

Some people think it odd that I can specialize in such an area. After all, once you get some piece of software running on one computer, it’s pretty straight-forward to talk to another, right? Aren’t there standards for that sort of thing? Well, yes there are. And the nice thing about them is that there are so many to choose from. And that’s just in the “plumbing” – once you put a network between two pieces of your system, the number of issues to be aware of and be able to work with explodes. Timing, byte orders, rogue data attacks, accidental complexities… the list goes on and on. And that’s where I come in – my job is to keep these issues from derailing projects, their schedules, and the jobs that depend on them. I love this stuff…

So the major purpose of this blog is to discuss issues related to networked programming and how to do it better. I hope you’ll join in and share your experiences too.

And if you are a buzzword-literate person and have a moment, do you have a better term for this than “networked applications”?