Moved Kite from Google Code to GitHub

I wrote a small library called Kite a couple of years back, based on Michael Nygard’s outstanding Release It! book. Currently it has a circuit breaker and concurrency throttle. There are several others I’d like to add. [Update: I've since added a rate-limiting throttle.] It’s something I’ll get back to once I finish the book.

So today I moved Kite from Google Code to GitHub. Check out my GitHub Kite repo.

Kite includes a tiny sample app so you can see how it works. But it’s pretty easy to describe just in a blog post. Kite allows you to harden your app in a couple of easy steps. First:

<?xml version="1.0" encoding="UTF-8"?>
<beans:beans xmlns="/schema/kite"
        /schema/kite /schema/kite/kite-1.0-a3.xsd">

    <!-- Activate Kite annotations -->
    <annotation-config />
    <!-- Message service -->
    <circuit-breaker id="messageServiceBreaker" exceptionThreshold="3" timeout="30000" />
    <throttle id="messageServiceThrottle" limit="50" />
    <!-- Expose components as JMX MBeans (optional) -->
    <context:mbean-export />

And second, here’s the message service itself:

import org.zkybase.kite.GuardedBy;

... other imports ...

public class MessageService {

    @GuardedBy({ "messageServiceThrottle", "messageServiceBreaker" })
    public Message getMotd() { ... }

    @GuardedBy({ "messageServiceThrottle", "messageServiceBreaker" })
    public List<Message> getImportantMessages() { ... }
}

After this, your service methods are protected by a circuit breaker that trips (i.e., rejects calls) after three consecutive errors, and by a concurrency throttle that throws an exception once there are 50 concurrent requests. These components protect both the backend service and its clients, since clients avoid queuing up on a service that’s experiencing performance degradation.
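
The trip-and-reset behavior is easy to sketch in plain Java. The following is a minimal illustration of the concept, not Kite’s actual implementation (the class and method names here are mine): the breaker rejects calls after a configured number of consecutive failures, then permits a trial call once the timeout elapses.

```java
import java.util.concurrent.Callable;

// Conceptual sketch only -- not Kite's implementation. Trips (rejects calls)
// after a configured number of consecutive failures, then allows a trial
// call once the timeout has elapsed ("half-open" behavior).
public class SimpleCircuitBreaker {
    private final int exceptionThreshold;
    private final long timeoutMillis;
    private int consecutiveFailures = 0;
    private long openedAt = -1;  // -1 means the breaker is closed

    public SimpleCircuitBreaker(int exceptionThreshold, long timeoutMillis) {
        this.exceptionThreshold = exceptionThreshold;
        this.timeoutMillis = timeoutMillis;
    }

    public synchronized <T> T call(Callable<T> operation) throws Exception {
        if (isOpen() && System.currentTimeMillis() - openedAt < timeoutMillis) {
            // Fail fast while the breaker is open and the timeout hasn't elapsed.
            throw new IllegalStateException("Circuit open; failing fast");
        }
        // Either closed, or open with the timeout elapsed (trial call allowed).
        try {
            T result = operation.call();
            consecutiveFailures = 0;  // success closes the breaker
            openedAt = -1;
            return result;
        } catch (Exception e) {
            consecutiveFailures++;
            if (consecutiveFailures >= exceptionThreshold) {
                openedAt = System.currentTimeMillis();  // trip the breaker
            }
            throw e;
        }
    }

    public synchronized boolean isOpen() {
        return openedAt >= 0;
    }
}
```

Kite wires the same behavior in declaratively via the `@GuardedBy` annotation, so the service code never touches the breaker state directly.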

For more information on how the circuit breaker works, see Annotation-based circuit breakers with Spring.

I’ll do a more careful writeup later, but have a look and send any questions or feedback my way.

Posted in Kite, News

Devops: what it is and why you should be doing it

This could be brogramming. It's not devops.

Devops is a big deal nowadays, and there’s a variety of ways people describe it and its benefits. Here we’ll move past the fluffy characterization involving developers and operations working together joyously—not to mention the outright wrong characterization of one superrole that does it all—and get to the heart of what devops is really about. Then we’ll go into what’s driving it, and why it’s something you should be doing.

What is devops?

Here’s my take, fluff-free:

Devops is the push to bring the development, test, infrastructure and operations domains under software management and control. It does this through the aggressive pursuit of integration, standardization and automation.

This characterization explains why developers tend to be more excited about devops than ops is. The very last thing ops wants is a bunch of software developers telling them how to manage configuration files, deploy packages or monitor servers. I’m on the development side myself, and for me devops is fun and exciting. But from the other side it has to feel something like a hostile takeover, especially in light of the history of strained relationships between development and IT.

So why does everybody keep talking about dev and ops working together? It’s because both sides have important roles to play:

  • First, there’s relevant technical expertise on both sides. We need people who know how to code against the AWS APIs, how to integrate tools using web services and message buses, how to build self-service user interfaces for app deployments and so on. But we also need people who can create reasonable server builds, patch them when there are issues, script against the F5s to support zero-downtime software deployments, etc.
  • Second, there’s also relevant domain expertise on both sides. I suspect this is a lot more obvious to developers than to ops: most developers would agree that there’s a point beyond which their understanding of infrastructure becomes hazy. But it turns out that development has its own important infrastructure, such as SCM, configuration repos, CI/build servers, Maven repos, test automation and more. This infrastructure is very much a part of the picture, and developers are the experts in this case.

So it’s true that devops involves dev and ops working together, but that’s a byproduct of what’s really going on, which is using software to run increasingly larger parts of the IT show. That makes a lot of people uncomfortable, but it’s happening all over the place, ready or not.

Now let’s see what’s driving the trend.

What’s driving devops?

There are a couple of major forces driving devops.

Cloud and virtualization. The first force is of course the emergence of services and tools that either take advantage of cloud and virtualization technologies, or else offer a way to manage the additional complexity they bring. Examples in the former category include cloud provisioning APIs like Amazon EC2, or SaaS services like New Relic and Loggly, which offer cloud-based turnkey operational capabilities. Examples in the latter category include configuration management tools like Chef and Puppet. As VMs blink in and out of existence, there has to be some way to manage everything that’s happening.

Continuous delivery. There’s a second force too: software engineering itself is maturing. We are seeing growing consensus as to software development best practices, in terms of both design and development process/methodology. While it would be overstating the case to say that software developers are automatically engineers, it is no exaggeration to say there are plenty of developers who practice development as an engineering discipline. Proper use of source control, versioning configuration, continuous integration, unit and integration testing, automating deployment—these are all engineering activities designed to make development predictable and repeatable.

This second force is driving devops in the following way. As a general rule, forward steps in software engineering lead directionally to the continuous delivery of software, which is essentially the ability to deliver high-quality software to users within hours or days of starting work on some development task, as opposed to weeks or months. (Obviously it depends on the size of the task, but the concept is that if you’re able to finish the development task in an hour, your delivery capability shouldn’t prevent you from releasing it hours later. It’s fine if your product folks want to prevent it, but it’s not fine if your capability prevents it.) Continuous delivery involves integrating a bunch of different tools into a deployment pipeline, and this pipeline—which includes everything from the developer’s IDE to monitoring and diagnostic tools in the operational environment—spans both development and operations toolsets. This drive toward continuous delivery in turn drives the need for development and operations to pull it together.

The previous discussion offers a hint as to why we might want to pursue a devops-style technology practice: it allows us to deliver more quickly to our customers. But there’s more to it than just that.

The devops value proposition

We mentioned earlier the historically strained relationship between development and ops. There are numerous reasons, such as different bodies of best practice (e.g. agile vs. ITIL) and the fact that they’re often separate organizations, which tends to create barriers. Where I work, the IT team rides motorcycles, whereas we on the development side drive Toyotas. I’m not sure why but that’s how it is.

One of the key points of conflict, however, is different attitudes toward change. In most operational environments, change trades against stability. In other words, the more stuff you push out into production, the more likely you are to break something. Ops doesn’t like it when things break, because they’re the ones who have to deal with the fallout. But developers are under pressure to push changes into production, and in fact to accelerate the rate of change.

These different perspectives toward change result in the dysfunctional relationship that often emerges between dev and ops. Development pushes out a bunch of new features without having had time to update the knowledge base, and when something breaks, ops doesn’t know what to do. Ops calls for a change moratorium, dev misses release targets and blames ops, ops blames dev right back for producing unsupportable software, and so on.

What devops and continuous delivery bring to the table is the ability to replace the change-or-stability dilemma with a world where increased rates of change actually enhance stability, because change surfaces broadly-visible issues that admit of a common and permanent fix. For example, if there’s only one deployment tool, and it’s not able to handle the delay between the time that an instance comes up and SSH becomes available, then everybody will see it, and somebody will definitely fix it. And it’s very easy to justify putting a correct fix in place (not an ugly hack) because potentially hundreds of apps are depending on it. Moreover, a proper devops practice enables speed and stability improvements beyond what are possible without it:

Integration, standardization and automation are transformational, because they create a situation in which change no longer hurts. Dozens or even hundreds of packages flow through the deployment pipeline every single day, and if there’s a problem, it’s often a common problem with a systematic and permanent fix. Dev and ops are aligned around delivering software to customers quickly and at volume, as well as around maintaining a smoothly running operational environment.

Want to learn more?

I haven’t said much about exactly how devops helps. I’ve really made some simple assertions to the effect that it does. That’s fine for an overview, but you may want to hear some details. The purpose of this blog is to explore exactly those details. So if you’re interested, check back as I’ll be posting more.

And for more information on continuous delivery in particular, it’s hard to beat Continuous Delivery, by Jez Humble and David Farley. It explains the what, why and how. It has been the playbook for the automation platform we’re building at work, which currently has 20,000 automated deployments under its belt.

Posted in Devops principles

Zkybase now supports authorized access to GitHub via Spring Social GitHub

This one’s another quick Zkybase screenshot post. Using Spring Social GitHub, it’s easy to access public user and repo information via the GitHubTemplate. But to access private information or gain write capabilities, we need to use the Spring Social connection framework. This involves adding a Spring Social web controller to the app, giving users a way to log into Zkybase itself (via Spring Security), and then finally giving them a way to connect their Zkybase account to their GitHub account via OAuth 2.


Here’s what the GitHub section of the account page looks like before you connect to GitHub:

Once you’ve done that, the display changes to the following:

And now we can perform various secure operations on our GitHub objects, like viewing repo hooks:

I plan to use this integration to support autoprovisioning continuous integration servers, among other things.

If you’re interested in the code itself, please see my blog post entitled Using Spring Social GitHub to access secured GitHub data [code].

Posted in Zkybase CMDB

More Zkybase screenshots

It’s been a little while since I’ve posted Zkybase screenshots, so here’s the work in progress. I explain how to implement this stuff (Spring Data Neo4j, Spring/GitHub integration, JavaScript InfoVis Toolkit, etc.) in chapter 11 of my book Spring in Practice (Manning).

App overview page

Applications are a central concept in Zkybase. This is how the app overview page looks so far. (It’s just a start.) The concept behind it is that if you want a 360-degree view of an app (dev view, test view, release view and ops view), you come to the app details page and then you can start looking at different views.

App repository commits page

Here’s a repo commit history for a given app. Right now it’s just integrated with GitHub, but the idea is to provide integrations with other providers too (e.g. BitBucket).

App repository watchers page

Again, an app details view. This time it’s GitHub watchers for a given app repo.

Region details page

Visualizations are one of the more exciting features that a CMDB can offer. One of the huge advantages of putting your configuration management data in a database is that you can move away from Visio documents and Gliffy diagrams, and move instead toward data modeling and automatically generating views. Here I’m using the JavaScript InfoVis Toolkit library to generate an interactive graph visualization. It is a great complement to the underlying Neo4j graph database.

Automation view

This might strike you as a strange screenshot, but in reality it’s an example of the most important view–the automation view. The main reason we want to manage configuration data in a database is that we can build web services (Zkybase supports both JSON and XML views) that simultaneously drive automation and human-consumable views of the sort that a support team would use. And the data is accurate because it’s what brings the environment into being–no more chasing your environment around trying to document it.

If you’re interested in checking it out or even getting involved in development, see the Zkybase GitHub site.

Posted in News, Screenshots, Zkybase CMDB

Zkybase/GitHub integration

A major goal for Zkybase is to tie together the activities of developers, testers, release engineers and operations. A few major benefits are:

  • Tool integration: We can single source app lists and more across development, build, test, deployment, monitoring, knowledge management and other tools.
  • Process streamlining: Outputs from the dev process feed into the test and deployment processes, which in turn feed into operational processes.
  • Communications across teams: All the teams work from the same concepts and data, facilitating communication.

One of the first things I want to tackle in Zkybase, then, is supporting some developer-centric concerns. An important one is source control. Obviously Zkybase isn’t itself a source control management system, but it makes sense to pull in useful information from the source code repos just to provide a “single pane of glass” where developers and other users can get a holistic view of what an app is all about.

Since I’m using GitHub, GitHub integration is a logical starting point for Zkybase/SCM integration. GitHub has a REST API that makes this integration easy. I wrote a blog post on Spring in Practice that explains the technical approach. Here’s the end result:

Zkybase SCM page

I suspect that such integrations will be a large part of the value that Zkybase delivers. They will help realize the process, tool and communications benefits I highlighted above.

Posted in News, Zkybase CMDB

Closed loops: the secret to collecting configuration management data

Hi all, Willie here. Happy New Year!

In my last post, How NOT to collect configuration management data, I gave a quick rundown of some losing CM data approaches that I and others have attempted in the past. Most of these approaches were variants of asking people for information, putting their answers in documents somewhere and never looking at the documents again.

This time around I’m going to describe a key breakthrough that our team finally made–one that made it a lot easier to collect and update the data in question.

That breakthrough was the concept of a closed loop and how it relates to configuration management.

[As it happens, the concept is well-known in configuration management circles, but at the time it wasn't well-known to us. So we discovered something that other people already knew. It's in that limited sense I say we made a breakthrough.]

We’re going to have to build up to the closed loop concept. Let’s start by looking at who has the best CM data in the org.

Who has the best CM data?

Different orgs are different, so it’s tough to make blanket statements about who has the best CM data. What I can do, though, is give the answer for my workplace, and hopefully the principles will make sense even if the reality is different where you work. I’ll bet that for a lot of orgs it’s quite similar.

To avoid keeping you in suspense, the answer is…

Winner: the release team. Where I work, the release team has the best CM data, where “best” means something like comprehensive, accurate and actively managed. The release team knows which apps go on which servers, which service accounts to launch Tomcat or JBoss under, which alerts to suppress during a deployment, and so on. They know these things across the entire range of apps they support. It’s all documented (in YAML files, databases, scripts, etc.) and it’s all maintained in real time.

Let’s look at some other teams though.

  • App teams have nonsystematic data. The app teams typically know the URLs for their apps, which servers their apps are on (or at least the VIPs), the interdependencies between their apps and other adjacent systems (web services, databases). But the knowledge is less systematic. It’s more like browser bookmarks, tribal knowledge and not-quite-up-to-date wiki pages. And any given developer knows his own app and maybe the last app or two he worked on, but not all apps.
  • Ops teams have to depend on busy developers for info. The ops teams have better or worse information depending on how close to the app teams they are. The team in the NOC is almost entirely at the mercy of the app teams to write and update knowledge base articles. As you might imagine, it can be a challenge to ensure that developers up against a deadline are writing solid KB articles and maintaining older ones. For the NOC it’s very important to know who the app SMEs are, given such challenges. Even this is not always readily clear as org changes occur, new apps appear, old apps get new names and so on.
  • App support teams are more expertise-driven than data-driven. The app support teams (they handle escalations out of the NOC) are generally more familiar with the apps themselves and so build up stronger knowledge about the apps, but this knowledge tends to be stronger with “problem child” apps. Also, different people tend to develop expertise with specific apps.

Why does the release team have the best CM data?

The release team has the best CM data because properly maintained CM data is fundamental to their job in a way that it’s not for other teams.

First, a quick aside. If your company isn’t regulated by SOX, you may be wondering what a release team is and why we have one. Among many other things, SOX requires a separation of duties between the people who write software and the people who deploy/support it in the production environment. The release team’s primary responsibility is to release software into production. We actually have a couple of release teams, and each of them services many apps. It wouldn’t be feasible, from a cost and utilization perspective, for each app team to have its own dedicated release engineer. The release teams release software at low-volume times, generally during the wee hours.

Back to this idea that the release team needs proper CM data more than the other teams do. Why am I saying that?

Here’s why. The software development organization is highly motivated to release software at a regular cadence. A fairly limited number of release engineers must service hundreds of applications (generally not all at once, though), so “tribal knowledge” isn’t a viable strategy when it comes to knowing what to deploy where. It must be thoroughly and accurately documented. Releases happen every week, and late at night, so it isn’t reasonable for a release engineer to call up a buddy on the app team and ask for the list of app servers. The release team needs this information at their fingertips. If they don’t have it, the software organization fails to realize the value of its development investment.

Indeed, “documented” is the wrong word here, because deployment automation drives the deployments. The CM data must be properly “operationalized”, meaning that it must be consumable by automation. No Word docs, no Excel spreadsheets, no wiki pages. More like YAML files, XML files, web service calls against a CMDB, etc.

Importantly, when the data is wrong, the deployment fails. People really care about deployments not failing, so if there are data problems, people will definitely discover and fix them.

Let’s look at the app and ops teams again.

  • App teams can make do without great CM data. The dependency of app developers on their CM data is softer. Yes, a developer needs to know which web services his app calls, but someone just explains that when he joins the project, and that’s really all there is to it. If he has a question about a transitive dependency, he might ask a teammate. If he needs to get to the app in the test environment, he probably has the URL bookmarked and the credentials recorded somewhere, but if not, he can easily ask somebody. 99% of the time, the developer can do what he needs to do without reference to specific CM data points. The developer may or may not automate against the CM data.
  • Ops/support teams need good CM data, but expertise is cheaper in the short- to medium-term. Except in cases involving very aggressive SLAs, even ops often has a softer dependency on CM data than the release team does. Since (hopefully) app outages occur much less frequently than app deployments, the return on knowledge base investments is more sporadic than that on deployment automation. If the app in question isn’t particularly important, investments in KB articles may be very limited indeed. In most cases, investing in serious support firepower (when something breaks, bring significant subject matter expertise to bear on the problem) yields a better short- to medium-term return. (Of course, in the longer term this strategy fails, because eventually there will be the very costly outage that takes the business out for several days. That’s a subject for a different day.)

Now we’re in a good place to understand closed loops and why they’re so important for configuration management data.

Closed loops and why they matter

I think of closed loops like this. There’s a “steady state” that we want to establish and maintain with respect to our CM data. We want it to be comprehensive, accurate and relevant. When the state of our CM data diverges from that desired steady state, we want feedback loops to alert us to the situation so we can address it. That’s a closed loop.

Example 1: deployment automation. The best example is the one that we already described: deployment data. Deployment data drives the deployment process, and when the data is wrong, the deployment process fails. Because the deployment process is extremely important to the organization, some level of urgency attaches to fixing wrong data. But it’s not just wrong data. If we need to deploy an app and there’s missing data in the CMDB, then sorry, there’s no deployment! Rest assured that if the deployment matters, the missing data is only a temporary issue.
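
To make that concrete, here’s a sketch of the fail-fast idea. All of the names are invented for illustration; this isn’t our actual tooling. The deployment step consumes the CM data directly and refuses to proceed when an entry is missing or incomplete, which is exactly what closes the loop.

```java
import java.util.List;
import java.util.Map;

// Illustrative only: class and field names are invented, not from a real
// deployment tool. The deployment step reads the CM data directly and
// fails fast on missing or incomplete entries, so bad data surfaces
// immediately instead of rotting quietly in a document.
public class DeploymentPlan {

    // app name -> list of target servers, as loaded from the CMDB
    private final Map<String, List<String>> appServers;

    public DeploymentPlan(Map<String, List<String>> appServers) {
        this.appServers = appServers;
    }

    public List<String> targetServers(String app) {
        List<String> servers = appServers.get(app);
        if (servers == null) {
            throw new IllegalStateException(
                "No CMDB entry for app '" + app + "'; fix the data before deploying");
        }
        if (servers.isEmpty()) {
            throw new IllegalStateException(
                "CMDB entry for app '" + app + "' lists no target servers");
        }
        return servers;
    }
}
```

Because every deployment goes through a lookup like this, a missing or empty entry stops a release that somebody cares about, and the data gets fixed the same night.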

Example 2: fine-grained access controls. Here’s another example: team membership data. We’ve already noted that for operational reasons it’s very important to know who is on which development team. This isn’t something that’s going to be in the HR system, and people have better things to do than to update team membership data. But what happens when that team membership data drives ACLs for something you care about being able to do, like deploying your app to your dev environment? Now you’re going to see better team membership data.
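
Here’s a sketch of that idea (again, the names are invented, and a real access-control system would be more involved): when deploy permission is computed from team membership data, stale data blocks a deployment somebody cares about, and the data gets fixed.

```java
import java.util.Map;
import java.util.Set;

// Illustrative only: if deploy permission is derived from team membership
// data in the CMDB, then stale membership data blocks deployments people
// care about -- which is exactly the pressure that keeps the data accurate.
public class DeployAcl {

    // app name -> usernames of that app's team members, from the CMDB
    private final Map<String, Set<String>> teamMembers;

    public DeployAcl(Map<String, Set<String>> teamMembers) {
        this.teamMembers = teamMembers;
    }

    public boolean canDeploy(String user, String app) {
        Set<String> members = teamMembers.get(app);
        return members != null && members.contains(user);
    }
}
```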

The basic concept is to find something that people really, really care about, and then make it strongly dependent on having good CM data.

Ideally the CM data drives an automated process that people care about, but that’s not strictly necessary. In my org, for instance, there’s a fairly robust but manual goal planning and goal tracking process. Every quarter the whole department goes through a goal planning process (my goals roll up into my boss’ goals and so on), and then we track progress against those goals every couple of weeks. The goal planning and tracking app requires correct information about who’s on which team, and so this serves to establish yet another closed loop on the team membership data. It also illustrates the point that you can hit the same type of data with multiple loops.

Design your CM strategy holistically

There are several areas in technology where it pays to take a holistic view of design: security, user experience and system testing come immediately to mind. In each case you consider a given technical system in its wider organizational context. (Super-duper-strong password requirements don’t help if people have to write them down on Post-Its.)

Configuration management is another place where it makes lots of sense to take a holistic approach to design. For any given type of data (there’s no one-size-fits-all answer here), try to figure out something important that depends on it, and then figure out how to tie that something to your data so that wheels just start falling off if the data is wrong, incomplete and so on. Again, data-driven automated processes are superior here, but any important process (automated or not) will help.

Fewer meetings?

Almost forgot. In the last post, I mentioned that I’d equip you to get out of some pointless meetings. The meetings in question are the ones where somebody wants to get together with you to collect CM data from you so they can post it to their Sharepoint site. Decline those–they’re just a waste of time. Insist that people be able to articulate the closed loops they will be creating to make sure that someone discovers gaps and errors in the data. I’ve been in plenty of such meetings, and in some cases they’re set up as half- or full-day meetings. I don’t do those anymore.

I’m working on an open source CMDB, called Zkybase, that can help you establish closed loop configuration management. See the Zkybase GitHub site.

Posted in Devops principles

Devops: How NOT to collect configuration management data

Hi all, Willie here. This time we’re going to step away from the keyboard and get architectural. But no ivory towers here. In my next two blog posts, I’m going to give you something that will get you out of lots of pointless meetings.

Got your attention yet? Good!

If you’re in devops, one of the things that you have to figure out is how to collect up all the information that will allow you to manage your configuration and also keep your apps up and running. This isn’t too hard when you have two apps sitting on a single server. It’s much harder when you have 400 apps deployed to thousands of servers across multiple environments and data centers.

So how do you do that? Let’s start off by looking at some things that just don’t work. In my next post I’ll share with you an approach that does work.

Some quick background

Probably owing to some psychological glitch, I’ve always been an information curator. Especially when I was in management, it was always important to me to understand exactly where my apps were deployed, which servers they were on, which services and databases they depended on, which servers those services and databases lived on, who the SMEs were for a given app, and so on.

If you work in a small environment, you’re probably thinking, “what’s the big deal?”

Well, I don’t work in a small environment, but the first time I undertook this task, I probably would have said something like, “no sweat.” (That’s the developer in me.) Anyway, for whatever reason, I decided that it was high time that somebody around the department could answer seemingly basic questions about our systems.

Attempt #1: Wiki Wheeler

The first time I made the attempt (several years ago), I was a director over a dozen or so app teams. So I created a wiki template with all the information that I wanted, and I chased after my teams to fill it out. My zeal was such that, unbeknownst to me, I acquired the nickname “Wiki Wheeler”. (One of my managers shared this heartwarming bit of information with me a couple years later.) I guess other managers liked the approach, though, because they chased after their teams too.

This approach started out decently well, since the teams involved could manage their own information, and since I was, well, Wiki Wheeler. But it didn’t last. Through different system redesigns and department reorgs, wiki spaces came and went, some spaces atrophied from neglect, and there was redundant but contradictory information everywhere. The QA team had its own list, the release guys had their list, the app teams had their list. The UX guys might have even gotten in on the act. Anyway, after a year or so it was a big mess and to this day our wiki remains a big mess.

Attempt #2: My huge Visio

The second time, my approach was to collect the information from all the app development teams. We had hundreds of developers, so to make things efficient, I sent out a department-wide e-mail telling everybody that I was putting together a huge Visio document with all of our systems, and I’d appreciate it if they could reply back with the requested information. And while I had to send out more nag e-mails than I would have liked, the end result was in fact a huge Visio diagram that had hundreds of boxes and lines going everywhere. I was very proud of this thing, and I printed out five or six copies on the department plotter and hung them on the walls.

How long do you think it was before it was out of date?

I have no idea. I seriously doubt that it was ever correct in the first place. Nobody (myself included) ever used the diagram to do actual work, and its chief utility was in making our workplace look somewhat more hardcore since there were huge color engineering plots on the walls.

There was one additional effect of this diagram, though I hesitate to call it utility. I acquired the wholly undeserved reputation for knowing what all these apps were and how they related to one another. This went on for years through no fault of my own. I mention it because it’s relevant for the next attempt.

Attempt #3: Disaster recovery

When asteroids strike...

This one wasn’t my attempt, but it’s still worth describing, since I have to imagine that we aren’t the only ones who ever tried it.

As a rapidly growing company, disaster recovery became an increasingly big deal for us several years back, and there was a mandate from up high to put a plan in place. This involved collecting up all the information that would allow us to keep the business running if our primary data center ever got cratered. The person in charge of the effort spent about two years meeting with app teams or else “experts” (me, along with others who, like me, had only the highest-level understanding of how things were connected up) and documenting it on a Sharepoint site where nobody could see it.

This didn’t work at all. Most people were too busy to meet with her, so the quality of the information was poor. Apps were misnamed, app suites were mistaken for apps, and to make a long story short, the result was not correct or maintainable.

Attempt #4: Our external consultants’ even bigger Visio

Following a major reorg, we brought in some external consultants and they came up with their own Visio. By this time I had already figured out that interviewing teams and putting together huge Visios is a losing approach, so I was surprised that a bunch of highly-paid consultants would use it. Well, they did, and their diagram was, well, “impressive”. It was also (to put it gently) neither as helpful nor as maintainable as we would have liked.

Attempt #5: HP uCMDB autodiscovery

We have the HP uCMDB tool, and so our IT Services group ran some automated discovery agents to populate it with data. It loaded the uCMDB up with tons of fine-grained detail that nobody cared about, and we couldn’t see the stuff that we actually did care about (like apps, web services, databases, etc.).

Attempt #6: Service catalog

Our IT Services group was pretty big on ITIL, so they went about putting together a service catalog. The person doing it collected service and app information into an Excel spreadsheet and posted it to a SharePoint site. But this didn’t really get into the details of configuration management. It was mostly about figuring out which business services we offered and which systems supported which services.

They ended up printing out a glossy, full-color service catalog for the business, but nobody was really asking for it, so it was more of a curiosity than anything else.

There were other attempts too (Paul led a couple that I haven’t even mentioned), but by now you get the picture.

So why did these attempts fail?

To understand why these attempts failed, it helps to look at what they had in common:

  • First, they generally involved collecting information from SMEs and then putting it in some kind of document. This could be wiki pages, Visio diagrams, Word docs on SharePoint or an Excel spreadsheet.
  • Once the information went there, nobody used it. So the information, if it was ever correct and complete in the first place, was soon out of date, and there was no mechanism in place to bring it back up to date.

In general, people saw the problem as a data collection problem rather than a data management problem. There needs to be a systematic, ongoing mechanism to discover and fix errors. None of the approaches took the management problem seriously at all–they all simply assumed that the organization would care about the data enough to keep it up to date.

In the next installment, I’ll tell you about the approach we eventually figured out. And as promised, that’s also where I’ll explain how to get rid of some of your meetings.

Interested in configuration management? I’m working on an open source CMDB called Zkybase.

Posted in Devops principles | Leave a comment

Domain Modeling with Spring Data Neo4j

Hi all, Willie here. Last time I told you that I’m building the Zkybase CMDB using Neo4j and Spring Data Neo4j, and I was excited to get a lot of positive feedback about that. I showed a little code but not that much. In this post I’ll show you how I’m building out the person configuration item (CI) in Zkybase using Spring Data Neo4j.

Person CI requirements

We’re going to start really simply here by building a person CI. It’s useful to have people in the CMDB for various reasons: they allow you to define fine-grained access controls (e.g., Jim can deploy such-and-such apps to the development environment; Eric can deploy whatever he wants wherever he wants; etc.); they allow you to define groups who will receive notifications for critical events and incidents; etc.

Our person CI will have a username, first and last names, some phone numbers, an e-mail address, a manager, direct reports and finally projects he or she works on. We need to be able to display people in a list view, display a given person in a details view, allow users to create, edit and delete people and so on. Here for example is what the list view will look like, at least for now:

Person list view

And here’s how our details view will look:

Person details

The relationship between a person and a project has an associated role. This relationship is also the basis for the list of collaborators: two people are collaborators if there’s at least one project of which they’re both members.
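Before we express this in the graph, the collaborator rule itself is easy to pin down in plain Java. Here’s a quick sketch (plain collections, no Neo4j; the names are hypothetical) of the semantics we’ll later encode as a Cypher query:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class CollaboratorRule {

    // Two people are collaborators if there's at least one project
    // of which they're both members.
    static boolean areCollaborators(Set<String> projectsA, Set<String> projectsB) {
        Set<String> shared = new HashSet<String>(projectsA);
        shared.retainAll(projectsB); // set intersection
        return !shared.isEmpty();
    }

    public static void main(String[] args) {
        Set<String> amy = new HashSet<String>(Arrays.asList("zkybase", "deathburrito"));
        Set<String> bob = new HashSet<String>(Arrays.asList("zkybase"));
        Set<String> carol = new HashSet<String>(Arrays.asList("website"));

        System.out.println(areCollaborators(amy, bob));   // true: both on zkybase
        System.out.println(areCollaborators(amy, carol)); // false: no shared project
    }
}
```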

Our simple requirements should be enough to show what it feels like to write Spring Data Neo4j code.

Create the Person and ProjectMembership entities

First we’ll create the Person. I’ve suppressed the validation and JAXB annotations since they’re irrelevant for our current purposes:

package org.zkybase.model;

import java.util.Set;

import org.neo4j.graphdb.Direction;
import org.springframework.data.neo4j.annotation.GraphId;
import org.springframework.data.neo4j.annotation.Indexed;
import org.springframework.data.neo4j.annotation.NodeEntity;
import org.springframework.data.neo4j.annotation.RelatedTo;
import org.springframework.data.neo4j.annotation.RelatedToVia;
import org.springframework.data.neo4j.support.index.IndexType;
import org.zkybase.model.relationship.ProjectMembership;

@NodeEntity
public class Person implements Comparable<Person> {
    @GraphId private Long id;

    @Indexed(indexType = IndexType.FULLTEXT, indexName = "searchByUsername")
    private String username;

    private String firstName, lastName, title, workPhone, mobilePhone, email;

    @RelatedTo(type = "REPORTS_TO")
    private Person manager;

    @RelatedTo(type = "REPORTS_TO", direction = Direction.INCOMING)
    private Set<Person> directReports;

    @RelatedToVia(type = "MEMBER_OF")
    private Set<ProjectMembership> memberships;

    public Long getId() { return id; }

    public void setId(Long id) { this.id = id; }

    public String getUsername() { return username; }

    public void setUsername(String username) { this.username = username; }

    ... other accessor methods ...

    public Person getManager() { return manager; }

    public void setManager(Person manager) { this.manager = manager; }

    public Set<Person> getDirectReports() { return directReports; }

    public void setDirectReports(Set<Person> directReports) {
        this.directReports = directReports;
    }

    public Iterable<ProjectMembership> getMemberships() { return memberships; }

    public ProjectMembership memberOf(Project project, String role) {
        ProjectMembership membership = new ProjectMembership(this, project, role);
        return membership;
    }

    ... equals(), hashCode(), compareTo() ...
}
There are lots of annotations we’re using to put a structure in place. Let’s start with nodes and their properties. Then we’ll look at simple relationships between nodes. Then we’ll look at so-called relationship entities, which are basically fancy relationships. First, here’s an abstract representation of our domain model:

Abstract domain model

Now let’s look at some details.

Nodes and their properties. When we have a node-backed entity, first we annotate it with the @NodeEntity annotation. Most of the simple node properties (i.e., properties that aren’t relationships to other nodes) come along for the ride. Notice that I didn’t have to annotate firstName, lastName, email, and so forth. Spring Data Neo4j will handle the mapping there automatically.

There are a couple of exceptions though. The first one is that I put @GraphId on my id property. This tells Spring Data Neo4j that this is an identifier that we can use for lookups. The other one is the @Indexed annotation, which (surprise) creates an index for the property in question. This is useful when you want an alternative to ID-based lookup.

Now we’ll look at relationships. Speaking broadly, there are simple relationships and more advanced relationships. We’ll start with the simple ones.

Simple relationships. At a low level, Neo4j is a graph database, so we can talk about the graph in graph theoretical terms like nodes, edges, directed edges, DAGs and all that. But here we’re using graphs for domain modeling, so we interpret low-level graph concepts in terms of higher-level domain modeling concepts. The language that Spring Data Neo4j uses is “node entity” for nodes, and “relationships” for edges.

Our Person CI has a simple relationship, called REPORTS_TO, that relates people so we can model reporting hierarchies. Person has two fields for this relationship: manager and directReports. These are opposite sides of the same relationship. We use @RelatedTo(type = "REPORTS_TO") to annotate these fields. The annotation has a direction element as well, whose default value is Direction.OUTGOING, which means that “this” node is the edge tail. That’s why we specify direction = Direction.INCOMING explicitly for the directReports field.
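To make the direction semantics concrete: there’s only one REPORTS_TO edge per pair, stored once, and the manager and directReports fields are just the two ways of reading it. Here’s a minimal plain-Java sketch of that idea (no Neo4j, hypothetical names), with the edge stored tail-to-head as report → manager:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ReportsToSketch {

    // One directed edge per report: tail (report) -> head (manager).
    static Map<String, String> reportsTo = new HashMap<String, String>();

    // OUTGOING direction: follow the edge from "this" node.
    static String managerOf(String person) {
        return reportsTo.get(person);
    }

    // INCOMING direction: find the edges pointing at "this" node.
    static Set<String> directReportsOf(String manager) {
        Set<String> reports = new TreeSet<String>();
        for (Map.Entry<String, String> e : reportsTo.entrySet()) {
            if (e.getValue().equals(manager)) {
                reports.add(e.getKey());
            }
        }
        return reports;
    }

    public static void main(String[] args) {
        reportsTo.put("amy", "carol");
        reportsTo.put("bob", "carol");
        System.out.println(managerOf("amy"));         // carol
        System.out.println(directReportsOf("carol")); // [amy, bob]
    }
}
```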

What’s this look like in the database? Neoclipse reveals all. Here are some example reporting relationships (click the image for a larger view):

(Small aside: there’s a @Fetch annotation–we’ll see it in a moment–that tells Spring Data Neo4j to eager load a related entity. For some reason I’m not having to use it for the manager and direct reports relationships, and I’m not sure why. If somebody knows, I’d appreciate the explanation.)

Relationship entities. Besides the REPORTS_TO relationship between people, we care about the MEMBER_OF relationship between people and projects. This one’s more interesting than the REPORTS_TO relationship because MEMBER_OF has an associated property–role–that’s analogous to adding a column to a link table in an RDBMS, as I mentioned in my reply to Brig in the last post. The Person.memberOf() method provides a convenient way to assign a person to a project using a special ProjectMembership “relationship entity”. Here’s the code:

package org.zkybase.model.relationship;

import org.springframework.data.neo4j.annotation.EndNode;
import org.springframework.data.neo4j.annotation.Fetch;
import org.springframework.data.neo4j.annotation.GraphId;
import org.springframework.data.neo4j.annotation.RelationshipEntity;
import org.springframework.data.neo4j.annotation.StartNode;
import org.zkybase.model.Person;
import org.zkybase.model.Project;

@RelationshipEntity(type = "MEMBER_OF")
public class ProjectMembership {
    @GraphId private Long id;
    @Fetch @StartNode private Person person;
    @Fetch @EndNode private Project project;
    private String role;

    public ProjectMembership() { }

    public ProjectMembership(Person person, Project project, String role) {
        this.person = person;
        this.project = project;
        this.role = role;
    }

    public Person getPerson() { return person; }

    public void setPerson(Person person) { this.person = person; }

    public Project getProject() { return project; }

    public void setProject(Project project) { this.project = project; }

    public String getRole() { return role; }

    public void setRole(String role) { this.role = role; }

    ... equals(), hashCode(), toString() ...
}

ProjectMembership, like Person, is an entity, but it’s a relationship entity. We use @RelationshipEntity(type = "MEMBER_OF") to mark this as a relationship entity, and as with the Person, we use @GraphId for the id property. The @StartNode and @EndNode annotations indicate the edge tail and head, respectively. @Fetch tells Spring Data Neo4j to load the nodes eagerly. By default, Spring Data Neo4j doesn’t eagerly load relationships, since doing so risks loading the entire graph into memory.

Create the PersonRepository

Here’s our PersonRepository interface:

package org.zkybase.repository;

import java.util.Set;

import org.springframework.data.neo4j.annotation.Query;
import org.springframework.data.neo4j.repository.GraphRepository;
import org.zkybase.model.Person;
import org.zkybase.model.Project;

public interface PersonRepository extends GraphRepository<Person> {

    Person findByUsername(String username);

    @Query("start project=node({0}) match project<--person return person")
    Set<Person> findByProject(Project project);

    @Query(
        "start person=node({0}) " +
        "match person-[:MEMBER_OF]->project<-[:MEMBER_OF]-collaborator " +
        "return collaborator")
    Set<Person> findCollaborators(Person person);
}

I noted in the last post that all we need to do is extend the GraphRepository interface; Spring Data generates the implementation automatically.

Spring Data repositories

For findByUsername(), Spring Data can figure out what the intended query is there. For the other two queries, we use @Query and the Cypher query language to specify the desired result set. The {0} in the queries refers to the finder method parameter. In the findCollaborators() query, we use [:MEMBER_OF] to indicate which relationship we want to follow. These return Sets instead of Iterables to eliminate duplicates.

Create the web controller

We won’t cover the entire controller here, but we’ll cover some representative methods. Assume that we’ve injected a PersonRepository into the controller.

Creating a person. To create a person, we can use the following:

@RequestMapping(value = "", method = RequestMethod.POST)
public String createPerson(Model model, @ModelAttribute Person person) {
    personRepo.save(person);
    return "redirect:/people?a=created";
}

Once again, we’re ignoring validation. All we have to do is call the save() method on the repository. That’s how updates work too.

Finding all people. Next, here’s how we can get all people:

@RequestMapping(value = "", method = RequestMethod.GET)
public String getPersonList(Model model) {
    Iterable<Person> personIt = personRepo.findAll();
    List<Person> people =
        new ArrayList<Person>(IteratorUtil.asCollection(personIt));
    model.addAttribute("people", people);
    return "personList";
}

We have to do some work to get the Iterable that PersonRepository.findAll() returns into the format we want. IteratorUtil, which comes with Neo4j (org.neo4j.helpers.collection.IteratorUtil), helps here.
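Under the hood, IteratorUtil.asCollection() is just doing the obvious iterate-and-collect. If you’d rather not reach for the Neo4j helper, a hand-rolled equivalent (a stdlib-only sketch, not the actual Neo4j implementation) is only a few lines:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class Iterables {

    // Copy any Iterable into a List -- the same kind of massaging
    // that IteratorUtil.asCollection() does for the findAll() result.
    static <T> List<T> toList(Iterable<T> iterable) {
        List<T> list = new ArrayList<T>();
        for (T item : iterable) {
            list.add(item);
        }
        return list;
    }

    public static void main(String[] args) {
        Iterable<String> it = new TreeSet<String>(Arrays.asList("b", "a"));
        System.out.println(toList(it)); // [a, b]
    }
}
```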

Finding a single person. Here we want to display the personal details we built out above. As with findAll(), we have to do some of the massaging ourselves:

@RequestMapping(value = "/{username}", method = RequestMethod.GET)
public String getPersonDetails(@PathVariable String username, Model model) {
    Person person = personRepo.findByUsername(username);
    List<ProjectMembership> memberships = new ArrayList<ProjectMembership>(
        IteratorUtil.asCollection(person.getMemberships()));
    List<Person> directReports =
        new ArrayList<Person>(person.getDirectReports());
    List<Person> collaborators =
        new ArrayList<Person>(personRepo.findCollaborators(person));

    model.addAttribute("person", person);
    model.addAttribute("memberships", memberships);
    model.addAttribute("directReports", directReports);
    model.addAttribute("collaborators", collaborators);

    return "personDetails";
}

If you want to see the JSPs, check out the Zkybase GitHub site.

Configure the app

Finally, here’s my beans-service.xml file:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns:context="http://www.springframework.org/schema/context"
        xmlns:neo4j="http://www.springframework.org/schema/data/neo4j"
        xmlns:tx="http://www.springframework.org/schema/tx"
        xsi:schemaLocation="
            http://www.springframework.org/schema/beans
            http://www.springframework.org/schema/beans/spring-beans.xsd
            http://www.springframework.org/schema/context
            http://www.springframework.org/schema/context/spring-context.xsd
            http://www.springframework.org/schema/data/neo4j
            http://www.springframework.org/schema/data/neo4j/spring-neo4j.xsd
            http://www.springframework.org/schema/tx
            http://www.springframework.org/schema/tx/spring-tx.xsd">

    <context:property-placeholder location="classpath:/spring/" />
    <context:annotation-config />
    <context:component-scan base-package="org.zkybase.service" />

    <tx:annotation-driven mode="proxy" />

    <neo4j:config storeDirectory="${graphDb.dir}" />
    <neo4j:repositories base-package="org.zkybase.repository" />

</beans>

Spring Data Neo4j has a basic POJO-based mapping model and an advanced AspectJ-based mapping model. In this blog post we’ve been using the basic POJO-based approach, so we don’t need to include AspectJ-related configuration like <context:spring-configured />.

There you have it–a Person CI backed by Neo4j. Happy coding!

To see the code in more detail, or to get involved in Zkybase development, please see the Zkybase GitHub site.

Posted in Zkybase CMDB | Tagged , | Leave a comment

Why I’m pretty excited about using Neo4j for a CMDB backend

Zkybase is my first open source configuration management database (CMDB) effort, but it’s not the first time I’ve built a CMDB. At work a bunch of us built–and continue to build–a proprietary, internal system CMDB called deathBURRITO as part of our deployment automation effort. We built deathBURRITO using Java, Spring, Hibernate and MySQL. deathBURRITO even has a robotic donkey (really) whose purpose we haven’t quite yet identified.

So far deathBURRITO has worked out well for us. Some of its features–namely, those that directly support deployment automation–have proven more useful than others. But the general consensus seems to be that deathBURRITO addresses an important configuration management (CM) gap, where previously we were “managing” CM data on a department wiki, in spreadsheets, in XML files and in Visio diagrams. While there’s more work to do, what we’ve done so far has been reasonably right-headed, and we’ve been able to evolve it as our needs have evolved.

That’s not to say that there’s nothing I would change. I think there’s an opportunity to do something better on the backend. That was indeed the impetus for Zkybase.

Revisiting the backend with Spring Data

Because CM involves a lot of different kinds of entities (e.g., regions, data centers, environments, farms, instance types, machine images, instances, applications, middleware, packages, deployments, EC2 security groups, key pairs–the list goes on and on), it was helpful to build out a bit of framework to handle various cross-cutting concerns more or less automatically, such as standard views (list view, details view, form views), web controllers (CRUD ops, standard queries), DAOs (again, CRUD ops and queries), generic domain object capabilities, security and more. And I think it’s fair to say that this helped, though perhaps not quite as much as I would have liked (at least not yet). The fact remains that anytime we want to add a new entity to the system, we still have to do a fair amount of work making each layer of the framework do what it’s supposed to do.

My experience has been that the data persistence layer is the one that’s most challenging to change. Besides the actual schema changes, we have to write data migration scripts, we have to make corresponding changes to our integration test data scripts, we have to make sure Hibernate’s eager- and lazy-loading are doing the right things, sometimes we have to change the domain object APIs and associated Hibernate queries, etc. Certainly doable, but there’s generally a good deal of planning, discussion and testing involved.

So I started looking for some ways to simplify the backend.

Recently I started goofing around with Spring Data, and in particular, Spring Data JPA and Spring Data MongoDB. For those who haven’t used Spring Data, it offers some nice features that I wanted to try out:

  • One of the features is that it’s able to generate DAO implementations automagically using Java’s dynamic proxy mechanism. Spring Data calls these DAOs “repositories”, which is fine with me. Basically what you do as a developer is you write an interface that extends a type-specific Repository interface, such as a JpaRepository or a MongoRepository. You don’t have to declare CRUD operations because the Repository interface already declares the ones you’d typically want (e.g., save(), findAll(), findById(), delete(), deleteAll(), exists(), count(), etc.).
  • And if you have custom queries, just declare them on the interface using some method naming conventions and Spring Data can figure out how to generate an implementation for you.
  • In some cases, Spring Data handles mapping for you. In the case of Spring Data JPA you still have to make sure you have the right JPA annotations in place, and that’s not too terribly bad. But for MongoDB, since the MongoDB backend is just BSON (binary JSON), the mapping from objects to MongoDB is straightforward and so Spring Data handles that very well, without requiring a bunch of annotations.

“Game changer” might be too strong for the features I’ve just described, but man, are they useful. Besides making it easier and faster to implement repos, Spring Data allows me to get rid of a lot of boilerplate code (lots of DAO implementations) and even some Spring Data-like DAO framework code that I usually write. (You know, define a generic DAO interface, a generic abstract DAO class, etc.) My days of building DAOs directly against Hibernate are probably over.

But that’s not the only backend change I’m looking at.

Revisiting the backend, part 2: Neo4j + Spring Data Neo4j

Example of a graph in Neo4j

Since Spring Data tends to gravitate toward the NoSQL stores, I finally got around to reading up on Neo4j and some other stuff that I probably ought to have read up on a long time ago. Better late than never I guess. It struck me that Neo4j could be a very interesting way to implement a CM backend, and that Spring Data Neo4j could help me keep the DAO layer thin. Here’s the thinking:

  • There are many entities and relationships. There are lots and lots of entities and relationships in a CMDB. There are different ways of grouping things, like grouping which assets form a stack, which stacks to deploy to which instances, which instances are in which farms, how to define shards and how to define failover farms, grouping features into apps, endpoints into services, etc. We have to associate SLAs and OLAs with app features and service endpoints, we have to define dependencies between components, and so on. There’s a ton of stuff. It would eliminate a major chunk of work if we could get away with defining the schema one time (say, in the app itself) instead of in both the app and the database.
  • We need schema agility to experiment with different CMDB approaches. Along similar lines, there are some strong forces that push for schema agility. Most fundamentally, the right approach to designing and implementing a CMDB isn’t necessarily a solved problem. Some tools involve running network discovery agents that collect up millions of configuration items (CIs) from the environment. Other tools focus more on collecting the right data rather than collecting everything. Some tools have more of a traditional enterprise data center customer in mind, where you’re capturing everything from bare metal to the apps to the ITIL services based on the apps. Other tools are more aligned with cloud infrastructures, starting from virtualized infrastructure and working up, leaving everything under that virtualization layer for the cloud provider to worry about. Some tools treat CIs as supporting configuration management but not application performance management (APM); other tools try to single-source their CM and APM data. Some organizations want to centralize CIs in a single database and other organizations pursue a more federated model. With such a panoply of approaches, it’s no surprise that schema changes occur.
  • We need schema agility to accommodate continuing innovations in infrastructure. Another schema agility driver is the fact that the way we do infrastructure is rapidly evolving. Infrastructure is steadily moving to the cloud, virtualization is commonplace and new automation and monitoring tools are appearing all the time. The constant stream of innovation demands the ability to adjust quickly, and schema flexibility is a big part of that.
  • We need schema flexibility to accommodate the needs of different organizations. Besides schema agility, a CMDB needs to offer a certain level of flexibility to support the needs of different customers. One org may have everything in the cloud, where another one is just getting its feet wet. One org may have two environments whereas another has seven. Different orgs have different roles, test processes, deployment pipelines and so forth. Any CMDB that hopes to support more than just a single customer needs to support some level of flexibility.
  • But we still need structure. One of the more powerful benefits of driving deployment automation from a CMDB is that you can control drift in the target environments by controlling the data in your CMDB. And defining a schema around that is a great way to do so. If, for example, you expect every instance in a given farm to have the same OS, then your schema should attach the OS to the farm itself, not to the instance. (The instances inherit the OS from the farm.) Or maybe it’s not just the OS–you want a certain instance type (in terms of virtual hardware profile), certain machine image, certain middleware stack, certain security configuration, etc. Great: bundle those together in a single server build object and then associate that with the farm. This constrains the instances in the farm to have the same server build. (See this post for a more detailed discussion.) While Neo4j is schema-free, Spring Data Neo4j gives us a way to impose schema from the app layer.
  • A schemaless backend makes zero-downtime deployments easier. If you have any apps with hardcore availability requirements, then it goes without saying that you have to have significant automation in place to support that, both on the deployment side and on the incident response side. Since the CMDB is the single source of truth driving that automation, it follows that the CMDB itself must have hardcore availability requirements. Having a schemaless database ought to make it easier to perform zero-downtime deployments of the CMDB itself.
  • We want to support intuitive querying. Once you bring a bunch of CIs together in a CMDB, there are all kinds of queries that pop up. This could be anything from using the dependency structure to diagnose an incident, assess the impact of a change, or sequence project deliverables (e.g. make system components highly available depending on where they sit in the system dependency graph); using org structures to determine support escalation paths and horizontal communications channels; SLA reporting by group, manager, application category; and so forth. Intuitive query languages for graph databases–and in particular, the Cypher query language for Neo4j–appear to be especially well-suited for the broad diversity of queries we expect to see.
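The server build idea from the “we still need structure” bullet above is worth making concrete. Here’s a minimal plain-Java sketch (hypothetical class names, no persistence): because an instance carries no OS field of its own, the only way to answer “what OS?” is through the farm’s build, so instances in a farm can’t drift apart by construction:

```java
public class FarmSketch {

    // A server build bundles OS, image, middleware, etc. Just OS here.
    static class ServerBuild {
        final String os;
        ServerBuild(String os) { this.os = os; }
    }

    // The build attaches to the farm, not to individual instances.
    static class Farm {
        final ServerBuild build;
        Farm(ServerBuild build) { this.build = build; }
    }

    static class Instance {
        final Farm farm;
        Instance(Farm farm) { this.farm = farm; }

        // No per-instance OS field: the OS is inherited from the farm,
        // so two instances in the same farm can't disagree.
        String os() { return farm.build.os; }
    }

    public static void main(String[] args) {
        Farm farm = new Farm(new ServerBuild("Ubuntu 10.04"));
        Instance a = new Instance(farm);
        Instance b = new Instance(farm);
        System.out.println(a.os().equals(b.os())); // true, by construction
    }
}
```

In Zkybase this structure would live in the app-layer schema (the annotated domain model), since Neo4j itself is schema-free.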

Remember how I mentioned that Spring Data generates repository implementations automatically based on interfaces? Here’s what that looks like with Spring Data Neo4j:

package org.skydingo.zkybase.repository;

import org.skydingo.zkybase.model.Person;
import org.skydingo.zkybase.model.Project;
import org.springframework.data.neo4j.annotation.Query;
import org.springframework.data.neo4j.repository.GraphRepository;

public interface ProjectRepository extends GraphRepository<Project> {
    Project findProjectByKey(String key);
    Project findProjectByName(String name);

    @Query("start person=node({0}) match person-->project return project")
    Iterable<Project> findProjectsByPerson(Person person);
}

Let me repeat that I don’t have to write the repository implementation myself. GraphRepository comes with various CRUD operations. Methods like findProjectByKey() and findProjectByName() obey a naming convention that allows Spring Data to produce the backing query automatically. And in the findProjectsByPerson() case, I provided a query using Neo4j’s Cypher query language (it uses ASCII art to define queries–how ridiculously cool is that?).

Exploring the ideas above with Zkybase

Zkybase dashboard

The point of Zkybase is to see whether we can build a better mousetrap based on the ideas above. I’m using Neo4j and Spring Data Neo4j to build it out. I haven’t decided yet whether Zkybase will focus on the CMDB piece or whether it’s a frontend to configuration management more generally (delegating on the backend to something like Chef or Puppet, say), but the CMDB will certainly be in there. That will include a representation of the as-is (current) configuration as well as representations for desired configurations as might be defined during a deployment planning activity.

So far I’m finding it a lot easier to work with the graph database than with a relational database, and I’m finding Spring Data Neo4j to be a big help in terms of repository building and defining app-level schemas. The code is a lot smaller than it was when I did this with a relational database. But it’s still early days, so the jury is out.

Watch this space for further developments.

Posted in Zkybase CMDB | Tagged , , , , | Leave a comment

Zkybase screenshots

One of the things we’re working on is an open source configuration management system called Skybase. It’s a new project, and we’re still working out a lot of the high-level vision for this thing. I’ll post more information about it later, but for now I just wanted to post some screenshots as a way for people to see how it looks. We’re using Twitter Bootstrap for the CSS (very nice toolkit–love it), and Spring/Neo4j (along with Spring Data Neo4j) on the backend. So far the Neo4j graph database is working out great for configuration management data.

Anyway here are the screenshots. (Click on any of the screenshots for a larger view.)

First, the dashboard:

Here’s the project details page:

And here’s the page that allows you to create a new project:


Posted in News, Screenshots, Zkybase CMDB | Leave a comment