Thought Deprived Musings: Mercurial

Showing posts with label Mercurial. Show all posts

Friday, November 15, 2013

Using DeltaWalker with Mercurial for diffs

I bought a bundle of apps with a deal, and got DeltaWalker as part of the bundle. It's a reasonably decent diff tool, with the ability to do folder merging as well (rsync anybody ;) ). In terms of interface it's not bad, nice layout, colours, etc. I haven't used it much because most of my diff work is handled by my JetBrains tools as I'm developing.

For my current project I'm doing a reasonably complicated merge between two feature branches to bring them into alignment. I'm not a fan of Feature Branches as they impede Continuous Integration (FeatureToggles are a better approach), and the fact that my merge is complicated is living proof as to why they should be avoided. However due to particular circumstances a Feature Branch was unavoidable.

With part of the codebase there's not a 1-1 mapping of files. For example a class that was an inner class in Branch A is now in it's own Java file in Branch B. So it turned out to be easier to see what changed in Branch A in a file and port the concepts/semantics of the changes across to Branch B instead of just relying on a syntactical/textual merge. This worked really well because even though the implementation had diverged quite significantly, the interfaces were reasonably unchanged so my Branch A Tests changes merged to Branch B very easily, which would help inform me if I stuffed up the merge of a file. Another win for TDD.

Mercurial's diff commands were powerful enough to show me the changes for a file between the last merge point from Branch A to Branch B to the head of Branch A. However since I like my graphical diff tools, I wanted to try out DeltaWalker to see how good it was.

DELTAWALKER DOCUMENTATION SUCKS - It doesn't actually tell you HOW to integrate with your SCM tools. This is the WORST form of marketing and almost made me toss the tool. Fortunately I was able to figure it out.

When you select Hg in the DeltaWalker and point it to your Hgrc file it updates the file for you. Would be nice if it explains that it does that! I found out about it because I version control my Hg config (with Mercurial of course), and running a diff on the file, for another update I made caused me to see the changes.

[extdiff]
cmd.dw = /Applications/DeltaWalker.app/Contents/MacOS/hg

[merge-tools]
dw.executable = /Applications/DeltaWalker.app/Contents/MacOS/hg
dw.args = $local $other $base -merged=$output

So one can then use Hg's extdiff functionality to load up the diffs.

$ hg dw -r $lastmerge:$branch_head file

It actually works well, but they guys behind the product need to update their docs!!

Finding the last merge point between branches in Mercurial

When merging between my development and release branches, I like to know what the last merge point was between the two so that I can do tasks like compile Release Notes or see if some code needs to be merged (based on tags).

I developed a little Hg alias that finds the last merge node between two branches.

lastmerge = !hg log -r "children(ancestor($1, $2)) and merge() and ::$2"

$ hg lastmerge BRANCH_1 BRANCH_2

In the example above, BRANCH_1 and BRANCH_2 are bookmarks. The alias makes use of Hg Revsets to find the latest merge node from BRANCH_1 into BRANCH_2

Wednesday, June 22, 2011

Server configuration with Mercurial

I've been playing a lot lately with Mercurial, and in my opinion it's the best SCM around. I also have been administering some of my servers (actually Rackspace VMs) and ran into the age old problem that sys admins have had; keeping the server config synchronised.

The problem in a nutshell is that you install a set of applications (through yum or apt-get or whatever) and configure them. However you run into the problem of config propagation, versioning/history, rolling back to a known configuration, etc. I've seen a few sys admins roll their own solution, usually involving rsync and a lot of logging.

My solution was to use Hg to do all the heavy lifting, with a wrapper bash script that I knocked up very quickly that invokes Hg to version configuration files. It maintains knowledge about where the file came from by copying the file to be relative to the repository directory. For example if a user was to edit /etc/hosts the file will reside in the repository at $REPOS_HOME/etc/hosts

How it works is that you run

$ editconf <file>

the script does the following.

Resolves the absolute path of a file (using a realpath bash script that a friend knocked up)

Checks if the file exists in the repository

If the file doesn't exist the file is copied and added to the repository

Drops through to the users editor (defaults to vim)

Copies the file to the repository when the user exits the editor

Attempts to commit the file

If the user is created for the first time (ommitting the initial check), the script is smart enough to add it first. Mistakes are fixed by leaving an empty commit message (most SCMs wont commit of course), and reverting the file.

Branches can be made, merged, and state can be pushed around various servers with very little effort.

I mentioned this approach to some people and they thought it was a nifty idea and asked me to share my scripts. They can be found on BitBucket under the nesupport[1] SysTools project. They're licensed under the MPL, and since they were knocked up in a hurry patches/feedback are always welcome. Further instructions are found in the scripts themselves (if further action is required).

Further work could include the automated sharing of repository state (cron job) and synchronising what's in the repo with what's on the filesystem.

[1] The code was developed for a project I'm working on with somebody else who agreed to open source our sys admin scripts, of which editconf is a part.

Monday, May 30, 2011

Integration is the mess that noone wants to clean up

The title for this post came as a rather tongue in cheek comment by the sys admin at my workplace. However I think this is a rather accurate description of a personal project that I've been working on for the past six months. This project was designed as an educational project into different technologies that I've wanted to play with; so a lot of the decisions were guided by that. Across the way I've thought about different issues, found different tips and tricks and thought I'd share them.

The goal of this project is an integration piece between a website (through RSS) and social media sites like Facebook, MySpace, and Twitter (with all the links fed back to the original site). The goal was to not have to replicate data across the multiple sites; as well as give me an excuse to play with new toys.

The runtime environment is Google App Engine as this app would run infrequently and GAE is free. I also have wanted to write a GAE app for a while, so this seemed like a good idea. The language is Java since that's my primary development language and I don't know Python.

The build tool is Ant as I despise Maven. I use Maven at work, and I find it to be inflexible, bloated and just plain annoying. It does have some good ideas however and given Ant's import abilities, one can write generic build tasks that implement good build practices without all the pain. I did consider other build tools like Gradle but I found personally that after thinking about what I wanted to do for a while I was able code the build declaratively with Ant being smart enough to handle all the heavy lifting. I'm considering Open Sourcing my Ant build library that I've accumulated when I've cleaned it up a little bit.

Maven also provides dependency management however I personally find Ivy's dependency management to be more mature, cleaner, simplier and easier to configure/use than Maven.

Discussions about build tools can often get people a bit hot under the collar and I found The Java Posse podcast on the subject to be very educational (as a hint they're not very Maven friendly).

My IDE of choice is Eclipse (Helios), with the relevant Google plugins. One other reason that I dislike Maven is that the M2 integration sucks (I've heard it's a lot better with Indigo however I've yet to try). However Ant was not immune from problems either in that the runtime I am using is 1.8.1, and the Ant build editor doesn't recognise syntax from that version so Eclipse tells me my build.xml has errors in it. Runs fine however.

For my SCM I'm using Mercurial as it is the best VCS around, beating Git by a gazillion miles (yes I have actually used Git in a sophisticated manner and Hg is so much easier and more intuitive).

The app itself is broken down into 3 parts (with Spring handling all the configuration). The first handles datastore/RPC services, with a GWT frontend, Spring MVC providing the CRUD RPC endpoints, and Objectify handling persistence. The second part is implementing the manual workflow of OAuth2 with Spring MVC rendering the views (very simple JSPs) and accepting callbacks. The third was the part that actually made posts to the social media sites and kept everything in sync.

I spent a lot of time trying to bash JPA to work on GAE, however the DataNucleus implementation is quirky at best. It does certain tasks (like the assigning of PK values) differently from all other JPA implementations and the JDO junk that gets stuck in you .class file can mess with other annotations or class behaviour. I spent one Saturday ripping out JPA and replacing it with Objectify (guided by tests for you TDD purists out there), and I've little trouble ever since.

A few big design principles that I've been trying to hammer into myself is good old DRY and others from the SOLID acronym like SRP; guided by tests. Working with a framework like Spring is that if you don't follow these ideas then you can get yourself into trouble quickly enough to realise something's amiss. For example even when using Objectify one still has to deal with transaction management. It's the same bit of code that runs for all DAOs so where do you put it? One should of course favour composition over inheritance. I went with an AOP (JDK proxy) approach where a transaction manager provides advice around the DAO methods, injecting an "entity manager" which the DAOs then used for DS operations. Very elegant but not as easy if one doesn't code to interfaces (the L of SOLID as I understand it).

However GAE itself doen't help with the testing side of these principles, and it's an area where Google could really improve their tooling, especially if they want more than Micky Mouse projects running on GAE. It's nigh impossible to preload the local DS file with test data for integration/acceptance tests (where the tests run in one JVM and the dev_server serving your app runs in another). I hacked a solution together but it broke when the SDK was updated. The only way therefore is to use an interface which you build into your app. Since most of my entities were being served in a XML representation over HTTP to be consumed by a GWT front end it wasn't too much of a pain to drive my tests through the browser using Selenium. However for a more sophisticated project it's a nail in the coffin for testability.

Learning Spring MVC was a joy, with the ability to render a view of the model (through JSP) or return data in a HTTP response body (ala REST/RPC) with minimal effort. As far as MVC frameworks go it's the best I've worked with to date. However there is some room for improvement with the way that Content-Types (MIME) are handled in an AJAX world, but I've detailed that problem before

Running code on GAE of course other than native servlets is annoying, which some readers may already know because of the way GAE attempts to serialize everything. I had to rework my atchitecture a bit to try and get around that, but I still get errors in the logs due to Spring classes not being Serializable.

I've probably left a few details out here and there, and questions/comments are welcome (unless you want to troll then bugger off). Constructive criticisms of technology choices are always interesting and I like to chig wag about that; although please don't start a flame war.

Overall I found this project to be satisfying and educational and I've come out the other side a better engineer/developer.