Questions about distributed repositories

Richard Esplin richjunk1 at byu.net
Thu Mar 10 21:16:07 MST 2005


Warning: long email follows.

I am tired of the limitations of CVS, and I want to switch to a distributed
version control system. The recent thread on this list was very informative,
but it left me with some questions that I hope PLUG can answer. Up until now,
I have only used CVS extensively, so that is my basis of comparison. My
current short list of VCSes is darcs, Tom Lord's arch, and Aegis. Responses
that deal with these systems specifically would be appreciated.

First, has anyone looked at Aegis? It is a filesystem-centric VCS that has
been around for over ten years. It sounds very feature-complete and mature.
Why doesn't it get the buzz that arch and darcs draw?

My next questions have to do with flaws in my understanding of distributed 
repositories. Please help clarify my thinking:

I understand that a distributed revision control system does not enforce the 
concept of a master repository, but that it may be convenient for a 
development team to have a single place to sync all revisions. Let's call 
this place the master repository. It differs from developer repositories only
in that the production branch is stored there.

Suppose that I check out a copy of the master repository on my local machine 
and make 10 commits to it. These commits do not affect the code in the master
repository. I can commit a log message with each local check-in, and I can 
roll back my local copy to any of those ten versions any time I want. At some 
point I can merge my local repository with the main code base so that other 
developers on my team can use it. When other developers on my team sync with 
the main repository, they will then have my changes. This much I understand.

My question will make more sense if I define some variables:
  The master repository is M
  The initial state of the master repository is M[A]
  My local repository is L
  Initially the state of my local repository is L[A]
  Initially M[A]=L[A]
  I do various changes to make L[B], L[C], . . . L[K]
  I merge L[K] with M[A] to get M[L]=L[L].

My understanding is that at this point, I can roll L back to any revision
L[B] . . . L[L]. But, as I understand it, I can only roll M back to M[A]. M
has no knowledge of revisions [B] through [K]. Is that correct?
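
To make the question concrete, here is a toy Python sketch of the scenario
above. It is only a model of my question, not a description of how darcs,
arch, or Aegis behave. The names Repo, Changeset, merge_keeping_history, and
merge_squashed are made up for illustration, and the two merge functions are
simply the two possibilities I can imagine.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Changeset:
    """One committed change plus its log message (hypothetical toy type)."""
    name: str
    log: str


@dataclass
class Repo:
    """A repository modeled as nothing more than an ordered list of changesets."""
    history: list = field(default_factory=list)

    def commit(self, name, log):
        self.history.append(Changeset(name, log))

    def clone(self):
        # The copy gets its own list, so later commits stay local.
        return Repo(list(self.history))

    def states(self):
        """Names of the revisions this repository could be rolled back to."""
        return [cs.name for cs in self.history]


def merge_keeping_history(target, source):
    """Possibility 1: every changeset missing from target is copied over."""
    known = set(target.history)
    for cs in source.history:
        if cs not in known:
            target.history.append(cs)


def merge_squashed(target, source, name):
    """Possibility 2: the missing changesets collapse into a single new one."""
    known = set(target.history)
    missing = [cs for cs in source.history if cs not in known]
    if missing:
        log = f"squash of {len(missing)} changesets"
        target.history.append(Changeset(name, log))


def scenario():
    """Build M[A], clone it to L, and make local commits [B] through [K]."""
    master = Repo()
    master.commit("A", "initial state")
    local = master.clone()
    for name in "BCDEFGHIJK":
        local.commit(name, f"local change {name}")
    return master, local


m1, l1 = scenario()
merge_keeping_history(m1, l1)
print(m1.states())  # ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K']

m2, l2 = scenario()
merge_squashed(m2, l2, "L")
print(m2.states())  # ['A', 'L']
```

Under the first possibility, M could be rolled back to any of [B] through [K]
and would carry every local log message (in this sketch the merged state [L]
has the same content as [K]). Under the second, M could only be rolled back
to [A]. Which of these matches the real systems is what I am asking.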

If another developer syncs with M[L] to create D[L], can the developer roll D 
back to any previous revision?

With CVS, I would occasionally look at a file in the repository to browse all
log messages and all changes ever made to that file. Knowing the complete
history of a file can really help to fix bugs, especially with regard to
business logic. Is this possible with a distributed repository system? Does
M, or anything synced with M after revision [L] (such as D), know about the
log messages and changesets stored in L during revisions [B] through [L]? Can
M or D revert to any of revisions [B] through [K]? What happens to local log
messages after a branch merge?

The thing that most draws me to a distributed repository system is the ability 
to commit often with more fine-grained log messages without breaking the main 
repository; I see this as being a great tool for documenting the logic of the 
software. If those fine-grained log messages and changesets are lost when I 
merge with the main repository, then much of the usefulness of a distributed 
repository system would be lost. Can it be used in the way I am thinking?

I appreciate any explanations.

Richard Esplin


