Saturday, July 31, 2010

Split Hudson jobs

For a couple of weeks now I've been searching for a good post to tell me how to break my build in Hudson into a couple of build jobs without problems. After not finding anything good enough, I decided to try my luck and write one myself.

There are some posts about how to build a Hudson pipeline and separate your build into "build" and "test" steps.
Don't get me wrong, those posts provide very good information; I was just looking for something more practical.

So, I will try to show here how to configure a sample Java project that builds with Ant and separate it into two projects: build and test.

This post is divided into two parts:
  • The way I figured out how to do this - refer to this part as "The boring part"
  • The practical part of the result - refer to this part as "The fun part"

The boring part

The project structure was very simple:


In the build.xml file I had this list of main targets:
  • compile
  • dist
  • test
The initial Hudson job ran all the targets above, archived the jar file and published the test results.
Now the question was how to split it into two jobs.

The first thing I wanted was not to break the project structure in the SCM repository. I liked this structure as it was, and thinking about real projects, I wouldn't want to break the project structure for all developers just because I want to split the job in Hudson.
I tried to think about what I could play with in order to build another project that would only run the tests for a specific build. I got all sorts of questions in my head:
  1. Should I check out from the SCM in the new test job?
  2. Should the test job depend on the last successful build?
  3. Should I somehow share the workspace between the two jobs?
  4. Should I tag the project in the first job and check out the tagged version in the second job?
  5. Should I stop using the word "Should" as the first word in all my questions?
And I answered myself:
    1. If I'm using the same project structure then no, I shouldn't check out the project from the SCM again. The reason is that someone could have checked in something while the first job was running, and I might end up running tests from a newer revision against an artifact built from an older revision. This can break the build and send people investigating for no reason.
    2. The answer to this question was identical to the answer to the previous question. No, for the same reasons.
    3. Sharing a workspace seems like a good idea at first (not that I had any idea how I was going to do that), but then I realized that a workspace is something temporary. When a new build comes along, the workspace gets deleted.
    4. That seems like a good idea at first, but then I thought that it would be a bit weird. The first job should do something, at least compile the code. If I'm going to check out the same project from the repository, I still need to get the built artifact from the previous job, and then where exactly do I put it in the test job? Or should I compile and build the source code again in the test project? I didn't really like the idea, even though I knew it could work.
    5. Oh my god YES please!!!
    After answering all those questions, I started thinking that maybe I should change the project structure.
    I figured I wanted to split the project into two separate projects that now look like this:


    The Ant file of the first project (myproject) includes the following targets:
    • compile
    • dist

    The Ant file of the second project (myproject-test) includes the following targets:
    • compile (the tests)
    • test
    I figured that in real projects I might even keep some basic unit tests within the original project, as long as they take only a few seconds. Anything that takes too long is just not a unit test; it's more of an integration/functional test, and it can live in a separate project dedicated to functional/integration tests.
    So the jobs will be:
    1. Build: Compile + Unit tests (Should be as short as we can)
    2. Test: More tests (That can take longer to run)

    I was happy, but not satisfied; the job was not done yet.
    Now I had two jobs. I knew how the build job worked, but I still wasn't sure how to run the test job.
    I figured I needed a way to pass the jar file from the build job to the test job.

    The suggestion I found was to use the last successful build. I didn't like the idea because another commit can come in the meantime, so you wouldn't really "build per commit", and I didn't want to disappoint Hudson. I mean, look at him...

    I installed the "Hudson Parameterized Trigger plugin" into my Hudson and made the test job a parameterized job. The parameter that this job gets is the build number of the build job.

    Using the Parameterized Trigger plugin, I triggered a test job from the build job and passed it the build number of the specific build.
    The test job checks out the test project from the SCM repository and gets the artifact from the build job. It can get the specific artifact because it has the build number of the build job.
    After the test job has the artifact (using wget to get it), it can compile and run the tests.
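To make the fetch step concrete, here is a minimal Java sketch of how a test job could locate and download the artifact of one specific build. It relies on the usual Hudson URL layout (/job/<job name>/<build number>/artifact/<path>). The host, the artifact path dist/myproject.jar and the class and method names are assumptions for illustration; the setup in this post uses wget, not Java.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Hudson or of the original project.
class ArtifactFetcher {

    /** Builds the URL of an archived artifact for one specific build of a job. */
    static String artifactUrl(String hudsonBaseUrl, String jobName,
                              String buildNumber, String relativePath) {
        // Drop a trailing slash so the result never contains "//job".
        String base = hudsonBaseUrl.endsWith("/")
                ? hudsonBaseUrl.substring(0, hudsonBaseUrl.length() - 1)
                : hudsonBaseUrl;
        return base + "/job/" + jobName + "/" + buildNumber + "/artifact/" + relativePath;
    }

    /** Downloads the artifact to the target file, replacing any older copy. */
    static void download(String url, Path target) throws IOException {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

For example, ArtifactFetcher.artifactUrl("http://localhost:8080", "myproject-build", "42", "dist/myproject.jar") yields http://localhost:8080/job/myproject-build/42/artifact/dist/myproject.jar. Because the build number is part of the URL, a later commit cannot change which jar the tests run against.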

    As for the developers, they need to check out two projects from the SCM repository, and the projects need to depend on each other somehow. If you're using Ant, there are a lot of nice ways to tweak properties so that one Ant file works for both the developers and Hudson.

    So I got it! I had two jobs: build and test and I was able to build per commit pretty easily!
    I figured if I post it here, it might help me with the ladies (chicks love Hudson geeks...)

    The fun part

    myproject build.xml file:

    Steps for configuring the myproject-build job in Hudson:
    • Create a new freestyle job.
    • Add a repository URL.

    • Add a build step to invoke Ant (targets clean and dist)

    • Check "Archive the artifacts" and enter: **/dist/*.jar

    • Check "Record fingerprints of files to track usage".
    • Check "Fingerprint all archived artifacts"

    • Save the project.
    Now build the job once to see that everything is working.
    After building it once you should see something similar to this:

    myproject-test files will be:


    Notice that the wget task is not directly related to the build.xml. I added it because I'm running this example on Windows, where I can't run the wget command from bash.
    Also note that the default value for the build number is "lastSuccessfulBuild". This is for cases when you want to run your test job without running your build job.
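The fallback to "lastSuccessfulBuild" can be sketched in Java as follows. This is only an illustration of the idea: the class and method names are hypothetical, and in the setup described here the default lives in the Ant file and in the Hudson job parameter, not in Java code.

```java
// Hypothetical helper that mirrors the default described above.
class BuildNumberParameter {

    /** Hudson permalink used when no specific build number is supplied. */
    static final String DEFAULT_BUILD = "lastSuccessfulBuild";

    /** Returns the given build number, or the default when it is missing or blank. */
    static String resolve(String value) {
        if (value == null || value.trim().isEmpty()) {
            return DEFAULT_BUILD;
        }
        return value.trim();
    }
}
```

When the build job triggers the test job, the value is a concrete number and is used as is; when the test job is started by hand without a value, it falls back to the last successful build.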

    Notice I added the Thread.sleep in order for the tests to take a minute.

    Steps for configuring the myproject-test job:
    • Create a new freestyle project in Hudson.
    • Check "This build is parameterized".
    • Click Add Parameter - String Parameter.
    • Name it BUILD_JOB_BUILD_NUMBER and give it the default value: "lastSuccessfulBuild".

    • Add a repository URL.

    • Add a build step to invoke ant.
    • Add targets - clean get test.
    • Click on Advanced...
    • In the properties area, add the property that build.xml uses for the build number and set its value to ${BUILD_JOB_BUILD_NUMBER}

    • Check "Publish JUnit test result report".
    • Insert the value "**/reports/junit/*.xml".

    • Click on "Record fingerprints of files to track usage".
    • Insert the value "myproject-test/myproject.jar".

    • Save the project.
    Now we need to build it once to see that everything is working.
    When we run it, it will ask us for the parameter. We can use the default value or change it to 1 to see that it works both ways.
    It will take a minute for the tests to run (because we have the Thread.sleep). Hopefully, when you set up your own project, you will use a smaller test group at first.
    The result should be something like this:

    Now all that is left is to configure the build job to run the test job after it is finished.
    Just a reminder: you will need the Parameterized Trigger Plugin for this.

    Steps for configuring the build job to run the test job:
    • Configure the myproject-build job again.
    • Check "Aggregate downstream test results".

    • Check "Trigger parameterized build on other projects".
    • Insert "myproject-test" in the "Projects to build" field.
    • Click on "Add Parameters" and choose "Predefined parameters".
    • Insert "BUILD_JOB_BUILD_NUMBER=${BUILD_NUMBER}" in the field.

    • Save the project.
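What the predefined parameter line does can be modelled in a few lines of Java: the text is read as KEY=value pairs, and ${BUILD_NUMBER} is replaced with the number of the build that is doing the triggering. This is a simplified model for illustration only, not the plugin's actual code; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;

// Simplified, hypothetical model of expanding predefined parameters.
class PredefinedParameters {

    /** Parses KEY=value lines and replaces ${NAME} with values from env. */
    static Properties expand(String text, Map<String, String> env) {
        Properties raw = new Properties();
        try {
            raw.load(new StringReader(text));
        } catch (IOException e) {
            // Reading from an in-memory string is not expected to fail.
            throw new UncheckedIOException(e);
        }
        Properties expanded = new Properties();
        for (String key : raw.stringPropertyNames()) {
            String value = raw.getProperty(key);
            for (Map.Entry<String, String> variable : env.entrySet()) {
                value = value.replace("${" + variable.getKey() + "}", variable.getValue());
            }
            expanded.setProperty(key, value);
        }
        return expanded;
    }
}
```

With the line from the step above and BUILD_NUMBER set to 42 in env, the test job would receive BUILD_JOB_BUILD_NUMBER with the value 42, which is exactly the build whose jar it should download.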
    Run the build job in order to check that it's working. The results should look something like this:

    That's it! You're done.

    Should the build number be identical for the build and test jobs?
    No, we used parameters to pass the build numbers so they are completely decoupled.

    Why am I recording fingerprints on both projects?
    In order to keep the connection between jobs, Hudson uses the fingerprint system. If you don't use it in both jobs, you will not be able to aggregate test results.

    What will happen if I get another commit while the test job is running?
    In that case you will have another build run and another test job will be scheduled. Notice that the new test job will not run until the old test job is finished.

    There is a plugin for cloning workspaces, can't I just use it to separate the jobs?
    Of course you can! You can do whatever works. I have to admit that a friend told me about the plug-in just before I published this post and I didn't really get the chance to investigate it. The plug-in is called Clone Workspace SCM Plugin.

    What can I do next?
    You can continue configuring more jobs to run more and more tests. You can run some of your tests in parallel and make your build even faster.