Data Deduplication File Systems

Tod Hansmann plug.org at todandlorna.com
Mon Feb 25 08:18:26 MST 2013


On 2/24/2013 11:45 PM, DANIEL DAVID EGLI wrote:
> *Question for you guys. Someone recently told me about a new set of Linux
> file systems that include data-deduplication. They sent me an old article
> from Linux Journal before they stopped printing it and went web only. The
> article was very informative, except in one aspect. It only dealt with one
> file system, called lessfs, which (at least the way they showed it) could
> not be mounted via fstab. You had to run a separate binary to mount it. Has
> anyone heard of any other file systems that implement data deduplication?
> I'd be curious to check it out. I think it could be especially handy on my
> MythTV box because when you get all these commercials over and over the
> system would only need to store the entire video sequence for the
> commercial once. That's handy because as an example I was watching
> Underworld on spike Saturday night and must have seen the same commercials
> for places like Burger King over and over. If I'm going to record shows I
> like the idea of not having to save that space. Yes, on a multi-terabyte
> disk drive space is probably not going to be an issue. But I'd at least
> like to try it out anyway, IF I can find a file system that includes
> data-deduplication and can be mounted at boot via fstab/rc.sysinit.*
>
>
The videos will not be the same data.  The same commercial encoded
within two different TV shows will almost certainly not come out as
the same bytes, so it won't be de-duplicated.  If you run the same
source file through the same encoder twice you might get the same
bytes, but barring that unusual scenario you won't get close without
specialized video-matching tools (which will be CPU intensive, and
will probably require more storage for metadata than you'd save on
commercials).
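
To make that concrete, here is a rough sketch of why byte-level
de-duplication is so fragile.  It assumes the file system hashes
fixed-size blocks, which is one common approach and not necessarily
what any particular file system does.  The 4096-byte block size, the
function names, and the random bytes standing in for video data are
all made up for illustration.

```python
import hashlib
import random

BLOCK_SIZE = 4096  # assumed block size; real file systems vary


def block_hashes(data, block_size=BLOCK_SIZE):
    """Return the set of SHA-256 digests of the fixed-size blocks in data."""
    return {
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    }


def shared_blocks(a, b, block_size=BLOCK_SIZE):
    """Count distinct block contents that appear in both a and b."""
    return len(block_hashes(a, block_size) & block_hashes(b, block_size))


def fake_bytes(rng, n):
    """Pseudo-random bytes standing in for encoded video (illustrative only)."""
    return bytes(rng.getrandbits(8) for _ in range(n))


rng = random.Random(0)
commercial = fake_bytes(rng, 16 * BLOCK_SIZE)

# Same commercial bytes, landing on a block boundary in both recordings.
rec_aligned_1 = fake_bytes(rng, 4 * BLOCK_SIZE) + commercial
rec_aligned_2 = fake_bytes(rng, 8 * BLOCK_SIZE) + commercial

# Same commercial bytes, but shifted 100 bytes off the block boundary.
rec_shifted = fake_bytes(rng, 4 * BLOCK_SIZE + 100) + commercial

# Stand-in for a re-encode: one byte changed in every block of the commercial.
altered = bytearray(commercial)
for offset in range(0, len(altered), BLOCK_SIZE):
    altered[offset] ^= 0xFF
rec_altered = fake_bytes(rng, 4 * BLOCK_SIZE) + bytes(altered)

print("aligned copies share:", shared_blocks(rec_aligned_1, rec_aligned_2))
print("shifted copy shares: ", shared_blocks(rec_aligned_1, rec_shifted))
print("altered copy shares: ", shared_blocks(rec_aligned_1, rec_altered))
```

Only the block-aligned, bit-identical copy shares anything.  A real
re-encode changes far more than one byte per block, so it is an even
worse case than the stand-in above.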

-Tod Hansmann

