Have lunch and learn about content management

Richard Esplin richard-lists at esplins.org
Fri Sep 16 14:38:51 MDT 2011

I have no hands-on experience with Plone. Before responding to your question, I spent a few minutes brushing up on the features. I was surprised that Plone has a lot more document management features than I was previously aware of. Alfresco and Plone have a lot of overlap in web content management scenarios.

Though Plone has features like WebDAV support, workflow, and metadata, it appears to be optimized for web content management. In my quick search it appears that:

* It has limited facilities for inspection or transformation of non-text-based content (full-text search, metadata extraction, preview).
* It stores everything in a single large binary database file, with BLOB capabilities for extracting some unstructured binaries.
* Its workflow capabilities appear to be tailored to web publishing scenarios (no event model).

That doesn't mean that Plone isn't a strong platform with a solid use case. I merely highlight those as examples of where Alfresco's focus is different from Plone's. Though Alfresco can often be used in the same situations as Plone, Alfresco's focus is on being a scalable content repository for back-office use cases (intranets, business workflow, embedded in applications).

Alfresco is most widely used in scenarios involving office documents, images, video, and audio. Useful features include:

* Automatic extraction of common metadata (EXIF, PDF), plus easy hooks to insert custom metadata extractors.
* Automatic transformation of common formats (MS Office to PDF), plus easy hooks to insert custom transformers.
* Out-of-the-box full-text search and preview of common office, image, video, and audio formats.
* All content is stored on the filesystem, so there are no limits to the size of content. Performance is comparable for lots of small files or lots of really big files.
* A configurable event system that fires when content is uploaded, modified, or removed.
* BPMN2 compliant workflow engine.
* Easy onramps for unstructured content like CIFS, WebDAV, FTP, SharePoint Protocol, IMAP, SMTP, NFS, etc.
* Compliant with the CMIS standard for REST and SOAP integrations.
* A system for easily writing your own REST services using JavaScript and XML.
* Able to publish content to external endpoints via email, REST, and other transfer services. Out-of-the-box endpoints include Twitter, YouTube, Flickr, and WordPress.
* Scales to the tens of millions of documents.
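
To make the event and extractor bullets above a bit more concrete, here is a toy Python sketch of the general idea: a store that runs pluggable metadata extractors on incoming content and fires events when content is uploaded, modified, or removed. This is not Alfresco's API. Every name in it (ToyRepository, on, add_extractor, size_extractor) is made up for illustration, and a real repository does far more than this.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Listener = Callable[[str, str], None]
Extractor = Callable[[str, bytes], Dict[str, str]]


class ToyRepository:
    """Hypothetical in-memory content store; NOT Alfresco's API."""

    def __init__(self) -> None:
        self._content: Dict[str, bytes] = {}
        self.metadata: Dict[str, Dict[str, str]] = {}
        self._listeners: Dict[str, List[Listener]] = defaultdict(list)
        self._extractors: List[Extractor] = []

    def on(self, event: str, listener: Listener) -> None:
        """Register a listener for 'uploaded', 'modified', or 'removed'."""
        self._listeners[event].append(listener)

    def add_extractor(self, extractor: Extractor) -> None:
        """Plug in a custom metadata extractor."""
        self._extractors.append(extractor)

    def _fire(self, event: str, name: str) -> None:
        for listener in self._listeners[event]:
            listener(event, name)

    def put(self, name: str, data: bytes) -> None:
        """Store content, run every extractor, then fire the matching event."""
        event = "modified" if name in self._content else "uploaded"
        self._content[name] = data
        meta: Dict[str, str] = {}
        for extractor in self._extractors:
            meta.update(extractor(name, data))
        self.metadata[name] = meta
        self._fire(event, name)

    def remove(self, name: str) -> None:
        """Delete content and its metadata, then fire 'removed'."""
        del self._content[name]
        self.metadata.pop(name, None)
        self._fire("removed", name)


def size_extractor(name: str, data: bytes) -> Dict[str, str]:
    """Example extractor (hypothetical): record the content size in bytes."""
    return {"size_bytes": str(len(data))}


repo = ToyRepository()
log: List[tuple] = []
repo.add_extractor(size_extractor)
for ev in ("uploaded", "modified", "removed"):
    repo.on(ev, lambda event, name: log.append((event, name)))

repo.put("report.pdf", b"%PDF-1.4")     # first write fires "uploaded"
repo.put("report.pdf", b"%PDF-1.4 v2")  # second write fires "modified"
repo.remove("report.pdf")               # fires "removed"
```

The point of the sketch is only that business logic (notifications, workflow kickoff, indexing) hangs off the events and extractors instead of being baked into whatever application does the upload.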

I have recently helped clients deploy Alfresco in scenarios like:

* Serving large amounts of video or Flash games through a custom web portal,
* High performance scanning and ingestion,
* Processing loan applications,
* Storing business documents for HR, Marketing, Legal, and IT,
* Authoring and distributing product catalogs, datasheets, and marketing information to a team of resellers,
* Records archives for SOX compliance,
* Storing the multi-media assets used by an online game.

What I have been calling unstructured content management, lots of people call a content repository. It is a generic term applied to a specific class of technologies. Explaining it is like introducing someone to the idea of a database or web portal. These solutions do lots of things in lots of ways, but they share a set of strengths and weaknesses that make them good at a certain type of problem. As engineers, most of the time when we hear about these types of needs, we solve them using a database plus custom code, or a filesystem plus custom code. A robust document store saves you from building these solutions yourself. Like other building blocks, content repositories often don't do much out of the box; they need to be configured and integrated to be useful.

Alfresco's chief competitors are solutions like Documentum, FileNet, and Oracle Content Management. SharePoint is increasingly playing in this space, but I think it still has too many limitations. To complicate matters, Alfresco has been adding a lot of features that make it competitive with SharePoint as a collaboration portal.

Here is a good article which probably describes this better than I did:


P.S. I didn't find any quick numbers to estimate the size of the Plone community (number of downloads, commercial customers, installs), so I can't say much about that.

On Friday September 16 2011 13:36:52 Shane Hathaway <shane at hathawaymix.org> wrote:
> On 09/16/2011 11:19 AM, Richard Esplin wrote:
> > I should have been more precise and said open source content
> > management system for unstructured content.
> >
> > Drupal, Wordpress, and Plone are web content management systems that
> > focus on presentation management.
> >
> > Alfresco is a general content management system that can handle web
> > content management (structured content, usually XML) and any other
> > type of content like documents and images (unstructured content).
> >
> > Alfresco is regularly deployed as a back-end to Drupal and Plone to
> > add a robust authoring platform to the web presentation system. It is
> > also used in non-web scenarios for workflow, content transformations,
> > search, and more.
> I'd like to understand, but I'm struggling because I would describe 
> Plone with the very same words you used to describe Alfresco.  In fact, 
> these days people often use a separate process to present Plone on the 
> web (see Diazo, XDV, or Deliverance), so that Plone can focus on 
> authoring (with all its workflow, collaboration features, and 
> modularity) instead of presentation management.
> If I download and install Alfresco to figure out what I'm missing, what 
> should I pay attention to?  I want to be impressed.
> Shane
