Wednesday, November 19, 2014

Building a Bitnami Tomcat Image using Docker

I am a long-time fan of Bitnami's prepackaged stacks. If you want to, for example, quickly stand up a new Drupal instance, Bitnami allows you to do this - using either a machine image with the stack pre-installed or a binary installer that you can run on the appropriate type of OS.

When I first learned about Docker, I thought of Bitnami and how it seemed a natural fit for them to offer Docker image versions of their stacks. It turns out that they are in the process of doing exactly that. However, at the time of this writing, they don't have these available, so I decided to build my own. What follows is a step by step recipe for taking the Bitnami Tomcat 7 installer and building a Docker image that captures the result of a successful install.


Step 0 - Create a VM and install Docker

I did this in a single step using Digital Ocean's ability to select OS / application combos - in this case Docker 1.3.1 on Ubuntu 14.04 (64 bit). To keep Bitnami's installer from complaining about memory (in Step 4) you are going to need at least a 2 GB VM. If you want to run multiple stacks side-by-side on the same VM, you are going to need at least 4 GB.
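If you would rather start from a plain Ubuntu 14.04 VM, Docker's install script is one way to get to the same starting point (check Docker's installation docs for the currently recommended method before trusting a blog post):

root@vm:~# wget -qO- https://get.docker.com/ | sh
root@vm:~# docker version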

Step 1 - Download the Bitnami Tomcat Installer onto Your VM

The easiest way to do this is use 'wget' on the VM:

root@vm:~# mkdir bitnami; cd bitnami
root@vm:~/bitnami# wget https://bitnami.com/redirect/to/45854/bitnami-tomcatstack-7.0.57-0-linux-x64-installer.run
root@vm:~/bitnami# chmod +x *.run

Note that Bitnami is always updating their downloads so, by the time you read this, the installer above may not be available. Just use the appropriate installer for your OS. Obviously you can also choose to use an earlier or later version of Tomcat.

Also note that I've saved the installer under a new directory (which we will reference in Step 3) and made it executable.

Step 2 - Download/Pull the Base Docker Image

Working with Docker is like baking sourdough bread; you need a little something to start with. I chose to use Docker's base Ubuntu image because (a) I really don't care which OS I'm running Tomcat on, and (b) I've used Bitnami's Tomcat stack on Ubuntu before and never had any problems.

root@vm:~# docker pull ubuntu

You should see a brief flurry of activity ending with:

Status: Downloaded newer image for ubuntu:latest
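If you want to confirm the pull, "docker images" will list the image along with its tag and ID:

root@vm:~# docker images ubuntu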

Step 3 - Start a Container

First I'll show you the command, then I'll explain the options:

root@vm:~# docker run --cap-add=ALL -i -p 80:80 -t -v /root/bitnami:/bitnami ubuntu /bin/bash

--cap-add=ALL: When it starts, Tomcat tries to set some capabilities (i.e. establish the privilege to do one or more "superuser-like" things). By default Docker does not allow processes within a container to do this. This option allows processes within the container to set any capability they want. This is a sloppy and dangerous thing to do. I should dig into the Tomcat code and figure out exactly which capabilities it is requesting and grant only those capabilities (see the "principle of least privilege"). An illustrative sketch of what a tightened-down version of this command might look like appears at the end of this step.

-v /root/bitnami:/bitnami: This option bind mounts "/root/bitnami" on the VM to "/bitnami" in the container. This will allow us to access the installer file from inside the container.

-p 80:80: By default the Apache web server listens on port 80. This option maps port 80 of the container to port 80 on our VM. Obviously you can map the container port to any free port on your VM (e.g. 8080 using "-p 8080:80").

-i, -t: These two options connect you to the shell running inside the container.

ubuntu: This option specifies the image to run in the container. In this case it is the default Ubuntu image that we pulled in Step 2.

/bin/bash: This option tells Docker to run a bash shell inside the container.

At this point you should find yourself at a container-level prompt like:

root@d10f70897ce3:/# 
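As noted in the discussion of "--cap-add=ALL" above, once you know exactly which capabilities Tomcat requests you could replace the blanket grant with specific ones. The capability list below is purely a guess on my part (I have not dug into the Tomcat/jsvc code yet), but the command would take this general shape:

root@vm:~# docker run --cap-add=NET_BIND_SERVICE --cap-add=SETUID --cap-add=SETGID --cap-add=SETPCAP -i -p 80:80 -t -v /root/bitnami:/bitnami ubuntu /bin/bash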


Step 4 - Run the Bitnami Installer

Next we want to run the Tomcat installer to install Apache, Tomcat, and MySQL into our container:

root@d10f70897ce3:/# /bitnami/bitnami-tomcatstack-7.0.57-0-linux-x64-installer.run --mode unattended

This command will take a couple of minutes to complete, so be patient. If all goes well you should return to the container-level prompt where you can poke around a bit to check things out. A "ps -ef" should show you the Apache, MySQL, and Tomcat processes; there should be an "/opt/tomcatstack-7.0.57-0" directory; etc. You can test whether Apache is up and accessible by browsing to "http://<your VM address>/". You should see the welcome page for the Bitnami Tomcat stack.
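For example, from the container prompt (the exact process names and paths may differ slightly between installer versions):

root@d10f70897ce3:/# ps -ef | grep -E 'httpd|mysqld|java'
root@d10f70897ce3:/# ls /opt/tomcatstack-7.0.57-0

And, from a second shell on the VM, you can fetch the welcome page without leaving the terminal:

root@vm:~# wget -qO- http://localhost/ | head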

Note that the way in which we installed Apache, MySQL, and Tomcat is extremely unsafe. For example, there is no password for the Tomcat manager application. Under this configuration it should only be a matter of minutes before someone installs something unpleasant onto Tomcat. The Bitnami installer supports a number of command-line options for setting the MySQL password, the Tomcat manager password, etc. You can play around with these to get the configuration you want. This is where Docker shines; you can quickly re-run Steps 3 and 4 to experiment with different configurations. One thing to be aware of is that Docker saves containers after you exit them so, to avoid confusion, you should probably "docker rm <container-id>" on any containers you are no longer interested in.
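The installer's built-in help lists the available options (the option names change between stack versions, so check for yourself rather than copying them from someone's blog), and stale containers can be listed and removed from the VM:

root@d10f70897ce3:/# /bitnami/bitnami-tomcatstack-7.0.57-0-linux-x64-installer.run --help

root@vm:~# docker ps -a
root@vm:~# docker rm <container-id>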

Step 5 - Snapshot the Container

Now that you have a container running a configuration of the Tomcat stack that you are happy with, it is time to snapshot that container and create a Docker image. Since we started the Apache, MySQL, and Tomcat processes from the bash shell that we launched on container startup, exiting the shell will cause these processes to terminate. I confess to being somewhat superstitious, however, so I prefer to shut down these processes in the "proper" manner:

root@d10f70897ce3:/# /opt/tomcatstack-7.0.57-0/ctlscript.sh stop

After this completes you can simply exit the bash shell to exit the container and return to your VM-level shell. At this point we can snapshot the container and create a new image using the "docker commit" command like so:

root@vm:~# docker commit -m="Some pithy comment." d10f70897ce3 mybitnami/tomcat:v1

The resulting image should be viewable through the "docker images" command.

Step 6 - Launching the Image

Launching our newly created image is simply a matter of starting a container using that image:

root@vm:~# docker run --cap-add=ALL -d -p 80:80 mybitnami/tomcat:v1 /bin/sh -c "/opt/tomcatstack-7.0.57-0/ctlscript.sh start; tail -F /opt/tomcatstack-7.0.57-0/apache-tomcat/logs/catalina-daemon.out"

This looks a little intimidating, so let's break it down. The "--cap-add=ALL" option was covered in Step 3. We still need this because Tomcat still sets the same capabilities. The "-d" option simply tells Docker to run the container in the background. We've eliminated the "-i" and "-t" options because we don't need to interact directly with the container. The "-p 80:80" option specifies the same port mapping, and we've eliminated the "-v" option because we no longer need to access any host files from the container. What makes this step look complicated is the in-line shell script at the end. What we are telling Docker to do is run the following commands in a shell:

/opt/tomcatstack-7.0.57-0/ctlscript.sh start
tail -F /opt/tomcatstack-7.0.57-0/apache-tomcat/logs/catalina-daemon.out

Docker will run a shell that executes "ctlscript.sh start", thus starting Apache, MySQL, and Tomcat. It will then run the "tail" command on the main Tomcat log file, blocking on additional writes to this file. What this means is that the shell process, which is PID 1 inside the container, will continue to run; as long as it does, Docker keeps the container up and the Apache, MySQL, and Tomcat processes stay alive with it.

There are a number of ways we can monitor our container at this point. We can view a top-like display of the processes in the container via:

root@vm:~# docker top <container ID>

We can look at the container's STDOUT and STDERR using:

root@vm:~# docker logs <container ID>
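And, since Docker 1.3, we can open a second shell inside the running container to poke around directly:

root@vm:~# docker exec -it <container ID> /bin/bash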

Step 7 - Stopping the Container

To stop the container running our Tomcat stack we can send the SIGTERM signal to the root process of the container (our shell running "tail") via:

root@vm:~# docker stop <container ID>

This should cause all of the server processes to shut down cleanly. As I mentioned, I'm a bit superstitious about these things so I would prefer a mechanism that invoked "ctlscript.sh stop" before exiting the container. I've spent enough time investigating to determine that this is a subject for another post.
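One obvious candidate is a small wrapper script, baked into the image and used in place of the in-line shell script from Step 6, that traps SIGTERM and runs "ctlscript.sh stop" before exiting. A rough, untested sketch (the script name and location are my own invention):

#!/bin/bash
# Hypothetical /opt/tomcatstack-7.0.57-0/run.sh
# Start Apache, MySQL, and Tomcat.
/opt/tomcatstack-7.0.57-0/ctlscript.sh start
# When "docker stop" sends SIGTERM to this shell, stop the stack cleanly.
trap '/opt/tomcatstack-7.0.57-0/ctlscript.sh stop; exit 0' TERM
# Follow the Tomcat log in the background and wait on it; "wait" is
# interruptible by signals, so the trap above gets a chance to fire.
tail -F /opt/tomcatstack-7.0.57-0/apache-tomcat/logs/catalina-daemon.out &
wait $!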

Some Questions


Why Not Use an Existing Tomcat Image?
If you are familiar with Docker you are probably aware that there are plenty of existing images that run Tomcat. Why not simply use one of these? Firstly, none of these images (that I am aware of) include an integrated Apache or, more importantly, MySQL. Secondly, I am working with an application that I built using the Bitnami stack and I'm comfortable dinking with this stack. It is less work for me to build an image of my existing system than it is to switch to a new system.

Why Not Use "docker build"?
Steps 3-5 could have been replaced using the "docker build" command and a Dockerfile. However, at the time of this writing, the containers used during the "docker build" command do not allow their processes to request capabilities. A

RUN bitnami-tomcatstack-7.0.56-0-linux-x64-installer.run

command will fail with the following error:

set_caps: failed to set capabilities
check that your kernel supports capabilities
set_caps(CAPS) failed for user 'tomcat'

Service exit with a return value of 4

when Tomcat tries to run for the first time. This issue is being tracked by Docker here: https://github.com/docker/docker/issues/1916.
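For reference, the Dockerfile version of Steps 3-5 would look something like the following (illustrative only; as described above, the RUN step currently dies when the installer tries to start Tomcat):

FROM ubuntu
ADD bitnami-tomcatstack-7.0.57-0-linux-x64-installer.run /bitnami/
RUN chmod +x /bitnami/bitnami-tomcatstack-7.0.57-0-linux-x64-installer.run && \
    /bitnami/bitnami-tomcatstack-7.0.57-0-linux-x64-installer.run --mode unattended
EXPOSE 80
CMD ["/bin/sh", "-c", "/opt/tomcatstack-7.0.57-0/ctlscript.sh start; tail -F /opt/tomcatstack-7.0.57-0/apache-tomcat/logs/catalina-daemon.out"]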

Why Use Docker At All?
At the beginning of this post I pointed out that Bitnami stacks exist in machine image form for most popular systems. I can go to AWS and, in less time and with less effort, create a new VM that is functionally equivalent to the Docker container that I have created here. Some points:
  • My Bitnami Tomcat stack Docker image is just a building block. Next I'm going to install a webapp on Tomcat, a database on MySQL, etc. Then I'm going to snapshot that. Again, I could do the same with AWS, but I can't run an AMI anywhere besides AWS. I can take my Docker images and run them on anything with a compatible kernel.
  • When saved as a TAR file (see the "docker save" example after this list) my Docker image is approximately 800 MB. Most VM images are far larger than this. Lighter is faster.
  • Bitnami does a great job with integration but nothing is ever quite exactly the way you want it. The dink-->test-->dink-some-more cycle in Steps 3 and 4 is much faster using containers on an individual VM than using multiple VMs.
  • If, for whatever reason, I wanted to run multiple instances/versions of my stack it would probably be much cheaper to run them side-by-side in separate containers on the same (larger) VM than it would be to run them each in their own (smaller) VMs. This cost difference is even greater if I decide that I need to make my stacks available at a static IP address and/or given DNS name.
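(The TAR file mentioned in the second bullet comes from "docker save"; the file name below is just an example.)

root@vm:~# docker save -o tomcatstack.tar mybitnami/tomcat:v1
root@vm:~# ls -lh tomcatstack.tar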

Wednesday, May 9, 2012

The “Let’s Impersonate Eric Holder” Game

In the course of numerous arguments about the need for stricter voter ID laws, I’ve had a number of people refer me to the story of the man who obtained Eric Holder’s primary ballot from a Washington D.C. polling station (google it if you aren’t already familiar with the story). The people that refer me to this story usually seem to feel that it is some sort of trump card – as in “There, we’ve proven that voter fraud could occur, therefore we need voter ID laws.”

This is the same sort of shallow thinking that led to the TSA and our ridiculous airport security procedures. The question isn’t “can one person steal another person's ballot?”, it is “can enough ballots be stolen to change the outcome of an election?” I’ve been thinking about this problem and come up with a little mental game people can play to run through the possible scenarios.

The Goal

The goal of  “Let’s Impersonate Eric Holder” is to fraudulently cast 1,000 or more additional votes for a congressional candidate. That’s 1,000 more votes than the candidate would have received had you not participated in the game. Note that I’m setting the bar extremely low here. Yes, the Franken/Coleman race was decided by less than a thousand votes, but that was a very rare case. The majority of congressional races are rarely closer than 2 or 3 percentage points. Given that the average size of a congressional district is 700,000 people and assuming a voter turnout of around 40% – you’d need 2,800 votes to effect a single percentage point of change in a congressional race.

Starting Pieces

To start with you get:
  1. A list of all the registered voters in a district including their names, addresses, and party affiliation (if any). This will sometimes be referred to as “the target list”; the people on this list will sometimes be referred to as “targets”.
  2. A list of all the polling places in the district broken down by streets and/or neighborhoods.

Rules

The following is a list of some common sense constraints on the activities in the game:
  1. You can walk into any polling station and vote as anyone on the list provided that person lives in the neighborhood(s) serviced by that polling station and provided that the likely sex of that person’s name matches your apparent sex. For example, someone who looks male cannot vote as a person named “Kathy” though he could vote as someone named “Kelsey”.*
  2. The assignment of neighborhoods to polling stations is NOT one-to-one. That is, a single polling place may service several neighborhoods.
  3. Attempting to vote more than once at the same polling station may result in detection and apprehension (see rule 6). The chance of detection is modified by a number of personal factors. If you are 6’4” with prominent moles etc. (like myself), it is likely that attempting to vote even twice at the same polling station will result in detection. If you are of medium height and build, with nondescript features, etc., you may be able to vote several times at the same polling station.
  4. Attempting to vote as a person who has already voted may result in detection and apprehension (see rule 6).
  5. Although the list contains the party affiliation of the voters, it does not contain any information about their voting intentions. You can assume that voters intend to vote for their party’s candidate, but you cannot make any assumption about who people registered as “independent” intend to vote for.
  6. Voting fraud is a felony offense with mandatory jail time. If caught, it is likely that you will be charged, tried, convicted, fined, and jailed.
  7. Conspiracy to commit voting fraud is a felony offense with mandatory jail time. If caught, it is likely that you and your co-conspirators (at least those who don’t testify against you) will be charged, tried, convicted, fined, and jailed.
* For the sake of brevity we will not consider cases that involve personal knowledge of the target by a polling worker. For example, attempting to vote as the polling worker’s next-door neighbor.

Conspiracy

One of the most readily apparent aspects of this game is that it is impossible for a single player to vote 1,000 times in the same day. On top of this, one can assume that the target list is split approximately 50/50 between women and men. To have a chance of reaching the goal, the player must recruit a number of co-conspirators – some men and some women. Leaving aside the difficulty of recruiting people to participate in a (free – unless you are going to pay them) felonious activity, one has to recognize that the risk of detection increases (at the very least) linearly with the number of co-conspirators. If you don’t want to run afoul of rule 7, you must keep the size of your conspiracy down to the absolute minimum necessary to reach your goal.

The Multiplier

The key to this game is what I call “the multiplier”. The multiplier is the number of times the player and his co-conspirators can vote as someone else without getting caught. For example, if the multiplier is 20, you will need 50 people (1 player and 49 co-conspirators) to reach the goal of 1,000 extra votes (sort of – we’ll get into this later). At the end of the game, each conspirator will have their own multiplier, but we can expect that they will tend to clump around some average value. A larger multiplier means fewer co-conspirators and a smaller chance of getting caught; a smaller multiplier means more co-conspirators and a greater chance of getting caught.

There are a number of factors that influence the multiplier. One of these is the “lumpiness” of the polling stations – how many neighborhoods per polling station? A related factor is the physical distance between polling stations. Because of rule 3, the ideal situation for the player is fine-grained polling stations (ideally one per neighborhood) that are fairly close to one another. The anti-ideal is coarse-grained polling stations (many neighborhoods in one station) and/or polling stations that are distant from each other.

Another factor affecting the multiplier is time. Assuming it takes a minimum of 10 minutes to get the ballot and vote, and assuming the polling stations are open for 12 hours, it is obvious that the maximum theoretical multiplier is 72. Obviously, by rule 3, you can’t vote 72 times at the same polling place, so you must take into consideration the travel time between various polling stations. Also you have to consider the possible presence of lines and/or other delays at the polling stations. Keep in mind that any attempts to mitigate the effects of rule 3 by changing clothes and/or disguises also cuts into the multiplier by consuming time.

Timing Is Everything

Rule 4 has some interesting time-related effects on the course of play. When the player shows up at their first polling place promptly at 7:00 am (as you would assume they would if they were attempting to maximize their multiplier), we can be reasonably sure that their target has not voted yet. When the (by now weary) player shows up at the last polling place at 6:59 pm, they can be sure that, if their target voted today, they will have voted already. In between these two extremes, the chance of running afoul of rule 4 increases throughout the day.

There are two ways to address this issue. The first is to stop voting earlier in the day, perhaps at noon, or 2 pm. This, obviously, decreases your multiplier and requires you to recruit more co-conspirators if you want to reach the goal. The second is to develop an act that will get you out of the polling place when confronted with the inevitable “Mr. Smith, it shows here that you already voted” – something that convinces people that you are a genuine, disenfranchised voter, but at the same time keeps you out of the clutches of any over-helpful poll workers that may inadvertently expose you. Note that, once you “burn” a polling place by hitting rule 4, it is probably unwise to go back there again. This, in turn, reduces your multiplier.

Another time-related factor is the changing of workers at the polling stations. If you can get information on when and how these changes occur (not one of your starting pieces, sorry), you can use this information to mitigate the effects of rule 3 (though not rule 4).

Overlapping Votes

Suppose that your goal was to cast 1,000 extra votes for a certain Democratic candidate, “Mr. Johnson”. By rule 5, stealing the vote of a target who was going to vote for Mr. Johnson doesn’t count – you basically just did their voting for them. As a player you need to maximize your chances of stealing the vote of someone who wasn’t going to vote for Mr. Johnson. Obviously this means that you should target Republican voters – but it isn’t that simple. The key to maximizing your multiplier is to spread the target list evenly across you and your co-conspirators in a way that prevents you from running afoul of rule 3. Depending upon the makeup of the neighborhoods etc. there may not be enough Republican voters to target at some polling stations. Given that independents run around 30-40% of registered voters, it is more than likely that you are going to have to steal votes from the independents whose intentions, by rule 5, are not guessable. At the end of the day, what this means is that you are going to have to steal extra votes to compensate for overlapping votes. If your overlap rate is 20%, this means that you are going to have to steal 1,200 votes to accomplish your goal. This, in turn, requires you to increase your multiplier and/or recruit more co-conspirators.

End of Game

So is it possible to play this game, reach the goal of 1,000 extra votes, and not get caught? I don’t think so. Given all the factors that I’ve discussed, I can’t imagine an average multiplier any greater than 10. With a modest overlap rate of 10%, this means you would have to recruit (and possibly pay) 109 other people to participate in a felonious conspiracy to change the outcome of a congressional election by less than a single percentage point. Good luck with that.

Tuesday, May 8, 2012

It’s the Viewers Dummy

I hate to pick on Mindjet again but I wouldn’t bother if I didn’t love their program (MindManager) and really want to see other people use it, yadda, yadda. I’m the “enthusiastic but cranky customer”. Sometimes it’s important to listen to me. So here goes:

What’s with the overhead of getting a free, viewer-only version of MindManager? Seriously, didn’t Adobe (and hundreds of others) show everyone the way on this? The value of any document I create in MindManager is directly proportional to the number of people that can “easily” view that document. And (this part is important) the bar on “easily” is going down as the internet evolves. Ten years ago you could ask people to spend 20 minutes jumping through hoops to get a free version of your product but, as they say, “things have changed”.

I’m trying to share information with people that may or may not understand the value in mind maps. A lot of people are still unfamiliar with the concept. Most of the people I have introduced to mind maps have gotten really excited about them. But, if you make it too hard for them to at least see their first map, there’s no chance you are ever going to convince them.

The best thing Mindjet could do would be to implement something in the “code-on-demand style” that could view any .mmap file (doable in JavaScript? – no idea, sorry.) Short of that, they need to make it very easy to download and install free viewers on whatever platforms make sense (can you really do anything useful with a mind map in a handheld form factor?)

Finally, if you (the mythological reader) are thinking of taking me to task for using proprietary file formats – yeah, yeah. I may often claim to be right, but I seldom claim to be consistent. I like all the bling, bling in MindManager and I haven’t found a free mind mapping tool that gives me that.

Friday, March 23, 2012

Cloud Broker Overload



'That's a great deal to make one word mean,' Alice said in a thoughtful tone.

'When I make a word do a lot of work like that,' said Humpty Dumpty, 'I always pay it extra.'
- Through the Looking Glass

“Cloud brokers” are a hot topic, thanks in part to their inclusion in the NIST Cloud Computing Reference Architecture [1]. NIST’s definition derives, in part, from a 2009 Gartner report [2]. As Ben Kepes points out [3], these definitions of cloud broker are at odds with the accepted meanings of the word “broker”. Ben also makes the point that the issue is more fundamental than what names we use to call the various actors in a multi-provider scenario. The article suggests the term “service intermediary” as more descriptive of the kinds of things that companies like enStratus and RightScale actually do – where “service intermediary” is defined as an actor that does service intermediation and/or service aggregation but doesn’t do service arbitrage. Although I agree with much of Ben’s article, I think it misses the main problem with the NIST definition.


The Boat Analogy

Suppose I wanted to buy a boat. For various reasons, I decide to use a boat broker. I expect the broker to (among other things) introduce me to the parties selling boats and help me work through the process of buying the boat. The interaction pattern is three-way. The seller, the broker, and I are all aware of each other’s existence and expect different things from one another. For example, if the engine seized the day after I bought the boat, it is doubtful that I would hold the broker responsible.

Suppose that, instead of buying a boat, I simply wanted to rent one. Now, instead of seeking out a broker, I would look for a boat charterer. In contrast to my dealings with the broker and the seller, my interactions with the chartering company are two-way. The chartering company may or may not own the boat. I don’t know and, ultimately, I don’t care. All I care about is that the boat is made available for my use over a specific period of time. Any problems with the boat are the responsibility of the chartering company – regardless of who owns the boat.

The main problem with the NIST definition is that it lumps “brokers” and “charterers” together and, in so doing, masks the significant differences in the interactions and expectations of the parties involved.


It’s the Relationships

The first step to unraveling this hairball is to stop focusing on the functional aspects of what (for argument’s sake) I will simply call “the intermediary”. Whether the intermediary simply arbitrates requests amongst (nearly) identical back-end providers or synthesizes an aggregation of different providers to create a new service is not as important as whether or not the consumer does or doesn’t have a contractual relationship with these back-end providers.

Regardless of how many back-end services an intermediary uses and regardless of how imaginatively it might use them, if the consumer doesn’t have a contractual relationship with those back-end providers, their interactions with that intermediary are no different than those of any other cloud provider. While the intermediary may have more fodder for excuses (“our storage provider failed in exactly such a way as to expose a heretofore unknown bug in our billing provider”), an SLA is an SLA and, if the intermediary fails to meet their SLA, the consumer is entitled to whatever compensation is specified in the service contract.

If you squint at the NIST definition you can infer that the distinction it draws between “given services” and services that “are not fixed” is a reference to the visibility (or lack thereof) between the consumer and the back-end services. If this is the case, this distinction needs to be made explicit and unbundled from the definitions of intermediation, aggregation, and arbitrage.


Functional and Business Relationships

Most of the discussion around cloud brokers tends to focus on the functional relationships (i.e. who sends requests to whom and how are the results processed). Above, I point out the importance of the business relationships (i.e. who has contracts with whom). Obviously both sets of relationships are important. What makes multi-party cloud scenarios interesting is that the two sets of relationships are independent of one another. This can lead to a fair number of different scenarios.

Take, for example, the “punch out” scenario found in many enterprise purchase portals. The consumer (an employee) has both business and functional relationships with the intermediary (their employer). At some point there is an SSO exchange and the consumer is redirected from the intermediary to the provider (the supplier’s website). Although the consumer now has a functional relationship with the provider (in that they are sending requests and receiving responses from the supplier’s site) they do not have a business relationship with the provider (i.e. they aren’t asked for their credit card). Behind the scenes, there are both functional and business relationships between the employer and the supplier (the order information is sent back to the portal and the supplier expects to be paid by the employer).

If we confine our considerations to a cloud consumer, a single intermediary, and a single cloud provider, and then further restrict ourselves to only those cases in which the consumer has, at a minimum, a functional relationship with the intermediary and a business relationship with at least one other party – I figure there are 26 possible scenarios (you may want to check me on this). Granted, many of these combinations may not have a workable business case, but here are some discrete examples:

Jamcracker
  • consumer has business and functional relationships with intermediary (Jamcracker)
  • consumer has business and functional relationships with the cloud provider (e.g. WebEx)
  • intermediary and cloud provider have business and functional relationships
SpotCloud
  • consumer has business and functional relationships with intermediary (SpotCloud)
  • consumer has no business or functional relationship with cloud provider
  • intermediary and cloud provider have business and functional relationships
Akamai
  • consumer has functional but no business relationship with intermediary (Akamai)
  • consumer has functional and business relationships with the cloud provider
  • intermediary and cloud provider have business and functional relationships
Again, the danger with calling all these scenarios “cloud broker scenarios” is that you will mask important differences in their characteristics and behavior. This creates both confusion and misunderstanding.


The Taxonomy Challenge

Obviously we can’t simply give each of the possible multi-party scenarios a unique name; there are too many to remember. What we have is the classic problem of taxonomy. The scenarios are distinguished along a number of different axes and it is difficult to tell which axis is “the most important”.

While I don’t have a complete answer to this problem, it seems to me that it makes the most sense to do the “top level split” around the existence or non-existence of any business relationship between the consumer and the back-end provider(s). Although it pains me to admit it, the industry is coalescing around the term “cloud broker” to refer to scenarios in which there is no business relationship between the consumer and the provider (exactly the opposite of how the term is used in the real world). This leaves the term “service intermediary” to refer to those scenarios in which there is a business relationship between the consumer and the cloud provider.

When describing new things it is easy to fall into the trap of wasting time arguing about their names. Regardless of what terms people use, it would be helpful if we consistently used the same, separate names to refer to the top-level cases I outlined above. “Broker” and “intermediary” are as good as any others.


Final Digression

I suspect that the term “cloud broker”, as it is currently used, derives from an older term – “message broker”. This makes sense because “message broker” is misapplied in exactly the same way as “cloud broker”. “Message broker” is commonly used to refer to an architectural pattern in which you use an intermediary to minimize or eliminate the producer’s and consumer’s awareness of one another.


References

[1] NIST SP 500-292, “NIST Cloud Computing Reference Architecture”, http://collaborate.nist.gov/twiki-cloud-computing/pub/CloudComputing/ReferenceArchitectureTaxonomy/NIST_SP_500-292_-_090611.pdf

[2] Gartner, “Gartner Says Cloud Consumers Need Brokerages to Unlock the Potential of Cloud Services”, http://www.gartner.com/it/page.jsp?id=1064712

[3] Diversity, “NIST Decides to Redefine the English Language, Broker != Service Intermediary”, http://www.diversity.net.nz/nist-decides-to-redefine-the-english-language-broker-service-intermediary/2011/09/12/

Monday, January 23, 2012

Spec Conformance in the Age of Clouds

As a veteran of three or four (depending upon how you count them) majorly disruptive changes in computing, I’m always on the lookout for things that distinguish cloud computing from what-has-been-before. I am seeing a rather interesting change in the notion of “conformance” as it applies to the way specifications are written and negotiated.

What Does “Conform” Mean?

To be brief (and oversimplify somewhat), in the age of packaged software, a statement in a specification that “conformant implementations MUST support FeatureX” is a promise about the possible behavior of any software claiming to conform to that spec. If you buy a chunk of software that claims to conform to this specification, it must be possible for you to configure that software such that FeatureX is supported. Note that this configuration doesn’t have to be the default configuration. The vendor that sold you the software may even recommend against such a configuration. Nevertheless, that vendor can rightfully claim that their product conforms to the spec, even if some of their customers “choose” to configure their deployments in ways that are not spec conformant.

In the cloud, a statement that “conformant implementations MUST support FeatureX” is more closely a statement about the actual runtime configuration of any system claiming to conform to that spec. Because the vendor and provider roles have merged, “the vendor” cannot simply allow “the provider” to enable support for FeatureX – FeatureX has to actually be supported in the systems that are deployed and operated by that provider. There are ways the provider can skirt this, for example, by allowing/enabling FeatureX on a per-tenant basis – but, overall, it seems to me that the move to cloud computing has reduced the amount of wiggle room available to implementers.

Sausage Making

Warning to anyone laboring under the illusion that specifications are crafted by disinterested scientists whose main goal is technical quality: this next section deals with some of the political/technical maneuvering that goes into creating specifications and may be unsettling.

Let’s lay out a scenario: You are involved in a standards-development group that is collaborating on the specification of some API. It turns out that some members of this group feel that it is absolutely essential that the API MUST support FeatureX. After researching their proposal you become convinced that these people have been engaging in some activity that seriously impairs the functioning of their pre-frontal cortex. You try arguing them out of it, watering down the requirement, etc. all to no avail.

If you are representing an organization that develops and sells packaged software, this situation is not too dire if (1) FeatureX doesn’t affect too many other areas, (2) a minimal FeatureX isn’t overly complicated and difficult to implement, and (3) you are reasonably sure that none of your customers will ever want FeatureX. Simply get your developers to implement a minimal version of FeatureX, enable it as a non-default configuration option, and ship. If you are right about (3), the code for FeatureX will never be exercised outside of conformance testing. You and your organization may not want to do this, but you have some degree of flexibility.

Now suppose you are representing an organization that develops, hosts, and operates a cloud service. Even with per-tenant configuration tricks, the call to require FeatureX means that your organization not only has to develop the code to support FeatureX, it may have to deploy it and support it. This significantly raises the stakes around conformance – particularly for features that are “operationally infeasible” in your particular architecture. You can’t be flexible about a requirement to support a feature you can’t actually support.

Upshot

I see a couple of obvious effects of this difference in the context around cloud specifications. The first is that cloud specs will take longer to develop. Arguments that formerly could have been resolved with a “fine, have your FeatureX” now have to follow some (in all likelihood torturous) course that morphs FeatureX into something everyone can support and/or some parties have to reconcile themselves to the refactoring work necessary to support it. Secondly, I expect cloud specs to have fewer strange requirements that were included due to the intransigence of some parties and laziness of others. This is a good thing for interoperability and thus for humanity at large.

Caveat

Note that none of this has anything to do with the creation (or blessed lack thereof) of “optional features” – i.e. features that are described by a spec but not required to claim conformance. As near as I can tell, there is nothing about the context of cloud computing that affects the creation of such features one way or another.

Wednesday, January 5, 2011

lyrics daddy moon

If you came across this because you are searching for the artist or lyrics to the song that played on the episode of Parenthood that aired Tuesday, January 11th 2011 that has the hook line “oh daddy moon …” this post is to tell you that the artist is Tom Freund and the name of the song is “Little Room Of Mine”. You can find the song on his latest album “Fit To Screen”.

Obviously I like Tom or I wouldn’t be trying to help other people find him. If you liked “Little Room Of Mine” you’ll like his other stuff.


The Mind that Maps

Considering my appetite for cool software tools, it shouldn’t come as a surprise that I’m into mind mapping software. I’ve used MindManager for years now and, though I like the product, I can’t see shelling out $180 for an upgrade when there are so many cheaper/free alternatives. Is it too much to expect Mindjet to factor the existence of these competitive offerings into their pricing? Or is it just the case that MindManager is targeted at the enterprise and no one actually uses their own money to buy it?
