What do you think about cherry picking?

Hans Fugal hans at fugal.net
Thu Nov 15 11:47:15 MST 2007

I've been thinking about distributed revision control systems (DRCSs)
again, and I want to start a discussion about cherry picking.

Here's the scenario. Consider two branches of a source tree, let's call
them stable and unstable for the purposes of this discussion, but they
could be e.g. Andrew Morton's tree and Linus' tree, or whatever. A
developer on unstable notices a bug and fixes it. Now, or later, we want
to "cherry pick" (or "backport") that bug fix onto stable, without
bringing along all the destabilizing changes on the unstable branch.

This can be done manually, by creating a patch for the bug fix and
applying it. The argument for explicit cherry picking support in a DRCS
is that it can (supposedly) figure out which ancestor patches the bug
fix depends on and bring them in as well. This is what darcs does. Of
course, this works at the patch level, not at the semantic level.
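To make the "bring in the ancestors" idea concrete, here is a minimal
sketch in Python. It is a toy model only: the patch names and the
DEPENDS_ON table are made up for illustration, and this is not how darcs
actually stores or computes dependencies. It just walks a dependency
table and lists everything a cherry pick would have to pull.

```python
# Toy model: each patch lists the patches it textually depends on.
# Patch names and this table are hypothetical, not real darcs data.
DEPENDS_ON = {
    "refactor-parser": [],
    "add-logging": [],
    "new-tokenizer": ["refactor-parser"],
    "fix-off-by-one": ["new-tokenizer"],
}


def patches_to_pull(wanted, depends_on):
    """Return the wanted patch plus all its ancestors, ancestors first."""
    ordered = []
    seen = set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in depends_on[name]:
            visit(dep)
        # Append only after all dependencies, so they are applied first.
        ordered.append(name)

    visit(wanted)
    return ordered


print(patches_to_pull("fix-off-by-one", DEPENDS_ON))
# ['refactor-parser', 'new-tokenizer', 'fix-off-by-one']
```

Note that the unrelated "add-logging" patch is left behind, while the
bug fix drags in two patches nobody asked for.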

Does your favorite DRCS handle cherry picking? How well does it handle
it? Have you ever used it? Have you actually found it useful?

I have on occasion used darcs' cherry picking support, but mostly for
simple changes that had no dependencies in the first place (e.g. config
file changes). My hypothesis is that there is no real practical use case
where the ability to cherry pick is really important. The convenience of
being able to browse/pick the patch(es) you want is important, but that
could easily be implemented on top of almost any DRCS (unless it's so
braindead as to not let you produce diffs between arbitrary changesets).

Once you have the patch(es) you want, they will either apply to your
working directory or they won't (the patch layer). Then, they will
either work or they will need patching up (the semantic layer). If your
DRCS grabs ancestors to make the patch layer work, you may end up
grabbing more than you want, and have to manually back things out. If
your DRCS doesn't grab ancestors, you may have to bring in more patches
yourself, and/or hand-merge, but you won't have to deal with backing out
over-eager patches. In either case it works well if you have committed
small and specific changesets, and doesn't work so well if you haven't.
In both cases you need to be alert and careful, because the software
can't do all your thinking for you. 
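The patch layer can be illustrated with a small Python sketch. The
assumptions are loud ones: a "patch" here is just a one-line replacement
(a hypothetical simplification, far cruder than a real diff), and the
file contents are invented. The point is only that a bug fix written
against unstable may not apply to stable without its ancestor.

```python
class PatchDoesNotApply(Exception):
    """Raised when the line a patch expects is not in the file."""


def apply_patch(lines, patch):
    """Apply a one-line replacement patch given as (old_line, new_line)."""
    old, new = patch
    if old not in lines:
        raise PatchDoesNotApply(old)
    index = lines.index(old)
    return lines[:index] + [new] + lines[index + 1:]


# Hypothetical file contents on the stable branch.
stable = ["total = sum(items)", "print(total)"]

# On unstable, an earlier change rewrote the line the bug fix later edits.
rewrite = ("total = sum(items)", "total = sum(i.price for i in items)")
bugfix = ("total = sum(i.price for i in items)",
          "total = sum(i.price for i in items if i.price)")

# Cherry picking the bug fix alone fails at the patch layer.
try:
    apply_patch(stable, bugfix)
    applied_alone = True
except PatchDoesNotApply:
    applied_alone = False

# Bringing in the ancestor first makes the bug fix apply.
with_ancestor = apply_patch(apply_patch(stable, rewrite), bugfix)

print(applied_alone)  # False
print(with_ancestor)
```

Getting the patches to apply says nothing about the semantic layer: if
items on stable are plain numbers with no price attribute, the result
applies cleanly and is still broken.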

That's my latest thought pattern, but I am far from settled in my
opinion, and I would like to solicit your discussion to help enlarge my
mind. In my case, I'm trying to decide whether to abandon darcs for
mercurial (hg) or hold out hope that one day darcs will get over the
exponential merge problem. Right now I sit on the fence. I use hg
sometimes and darcs other times. I'm quite happy with both tools, and
both have their pros and cons. The thing that makes me most uneasy about
darcs (aside from the occasional exponential merge problem) is that it's
inconvenient to do "point in time" checkouts, branches, etc. The thing
that makes me uneasy about hg (or indeed most other DRCSs, to my
understanding of them) is the lack of cherry picking. But thinking back
on it, I haven't ever truly used it in darcs either.

Another thing about darcs is that it's hard to really diverge on a
branch and still interact with the main trunk. It doesn't (currently)
have a way to say "no, thanks, I don't want this patch EVAR so quit
asking me every time I pull". In practice this means you gain nothing
over branching in any other RCS (where merging is all or nothing). So
finding the patch to cherry pick is either a manual affair anyway
(though darcs does make it really nice to select patches by description
or date or whatever), or the change is so shallow that the whole
dependency thing is moot.
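For what it's worth, the "don't ask me again" behaviour could be
sketched roughly like this in Python. This is purely hypothetical: darcs
has no such feature as far as this post is concerned, and the function
and patch names below are invented for illustration.

```python
def patches_to_offer(remote, local, declined):
    """Remote patches that are neither applied locally nor declined for good."""
    return [p for p in remote if p not in local and p not in declined]


# Hypothetical patch names on the trunk, in the order they were recorded.
trunk = ["fix-typo", "switch-to-xml-config", "fix-crash-on-empty-input"]
mine = ["fix-typo"]
never = {"switch-to-xml-config"}  # the "not EVAR" list

print(patches_to_offer(trunk, mine, never))
# ['fix-crash-on-empty-input']
```

The sketch ignores dependencies on purpose. A real tool would still have
to decide what to do with a later patch that depends on a declined one,
which is exactly where diverging for real gets hard.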

Phew. :-) What are your thoughts?

Hans Fugal ; http://hans.fugal.net
There's nothing remarkable about it. All one has to do is hit the 
right keys at the right time and the instrument plays itself.
    -- Johann Sebastian Bach

More information about the PLUG mailing list