Hard Disk IDs in Linux

Dan Egli ddavidegli at gmail.com
Tue Mar 19 00:43:04 MDT 2013


Ok, bad hypothetical. Here's a more real-world situation: my home server.
The one I have is about to be replaced. It's getting flaky, and given its age
and capacity (2TB, mirrored) I think it's more cost effective to replace it
and transfer the data to a new server. The new one would contain four 3TB
disks in RAID10, probably with Ext4. I could use LVM instead, but I find
stability in the event of a disk failure more important than the extra space
LVM would give me (discounting LVM mirrors, of course). The server acts as
the "hard disk" for almost all other computers in the house. Barring itself
and the media server, every computer in the house will boot via PXE using
that server as its PXE/NFS host. It will store a combination of things, few
of which change often. Personal documents will be the main dynamic content;
the "static" content will be programs, movies, and MP3s that have been
legally purchased online and downloaded to the server.
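
For what it's worth, the host side of that doesn't take much; a rough sketch,
assuming dnsmasq handles DHCP/TFTP (paths and subnet are placeholders, and
there are plenty of other ways to serve PXE):

    # Export the clients' root filesystems to the house LAN, then reload
    # the NFS export table (path and subnet are placeholders).
    echo '/srv/nfsroot 192.168.1.0/24(rw,no_root_squash,async)' >> /etc/exports
    exportfs -ra

    # dnsmasq's built-in TFTP support is one common way to serve PXE;
    # pxelinux.0 comes from the syslinux package.
    printf '%s\n' enable-tftp tftp-root=/srv/tftp \
        dhcp-boot=pxelinux.0 >> /etc/dnsmasq.conf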

> Furthermore, if the data is that important to you, you will need a
> backup system, which at minimum is at least another set of disks, that
> receive a full set of data periodically, and are stored off-line.

Don't remind me! The part I dislike is having to duplicate the disks. For
now the plan is to buy two more 3TB disks in USB 3 enclosures and use them
as the backup medium, probably using mdadm to make a RAID0 of them first.
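
Something along these lines is what I mean (device names are placeholders,
and mdadm --create wipes whatever is on the drives):

    # Stripe the two external 3TB disks into one ~6TB backup volume.
    mdadm --create /dev/md/backup --level=0 --raid-devices=2 /dev/sdx /dev/sdy
    mkfs.ext4 -L backup /dev/md/backup
    mount /dev/md/backup /mnt/backup

Of course the RAID0 pair has no redundancy of its own; if either external
drive dies, that backup copy is gone.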

> Secondly you'll need a very large power supply and a SATA expansion card
> if you're really planning to stuff that many disks in a box. There's a
> reason why people often buy a SAN array box with its own (redundant)
> power supplies.

If I were going to stuff a ton of disks in there, yes I would, and I will
wind up doing that for the work box. But my home server won't need that. I
can put a 1200 watt power supply in the case and run the whole system that
way. I could probably get away with 1000 watts, but at that point there's so
little price difference that it's not worth the risk. Better to be
overpowered than under.

> I'm unsure as to whether you are now talking about your own personal
> server which will go into a house, or your work project.

It was a hypothetical idea for a home box. But the box I mention above is
far more realistic for my home server. Its specs could change between now
and when I actually build it, but I think it's fairly good as is.

> For a home system, I think most people are served well by just two disks
> in RAID-1 configuration, plus a set of backup disks. Very large sets of
> data like photos, movies, and maybe MythTV recordings, don't need RAID
> at all. They don't change often, so a really good backup is much more
> important than RAID.

To a point, yes. But with RAID10 the server can stay up even with two disks
failing (as long as the two failures aren't in the same mirror pair). Then I
just need to replace the disks and can leave the machine up until I do. If I
just stretched the data across multiple disks using LVM, the whole setup
would fail as soon as one drive died. And while it's possible to place
different types of files in different directories, thus bypassing that
problem, it still causes trouble: while the drive is unavailable, any
attempt to write to that location will either fail or get written to the
root drive, which just creates extra work when trying to recover onto a new
drive. I'll stick with the peace of mind and uptime that RAID10 provides.
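
For the record, mdadm makes that array straightforward to build (device
names here are placeholders, and the mdadm.conf path varies by distro):

    # Four 3TB drives -> ~6TB usable RAID10. With the default near-2
    # layout every block is mirrored on exactly one other disk, so the
    # array survives any single failure, and a double failure as long as
    # the two dead disks aren't mirror partners.
    mdadm --create /dev/md0 --level=10 --raid-devices=4 \
        /dev/sda /dev/sdb /dev/sdc /dev/sdd
    mkfs.ext4 /dev/md0
    mdadm --detail --scan >> /etc/mdadm.conf   # record it for assembly at boot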

> There are companies that make cases for disks. Poor-man's arrays. The
> cases have lots of rails and a big power supply. Most of them then just
> have e-SATA ports on the back that you can connect to a PC's eSATA
> adapter cards (which you will need since PC's usually have 4 or less
> SATA ports on the motherboard). Also if you use an eSATA adapter, then
> the drives are hot-swappable (if not in use!).

Which motherboards are you looking at? The last few I've looked at have
usually had at least six SATA ports (if you include the two 6Gb/s ports),
and often eight thanks to an extra pair of 6Gb/s ports on a separate
controller chip. The eSATA idea does have merit, though, and I thank you
for that.

> I did enjoy ZFS features a lot and hope that Linux's home-grown BtrFS
> will get stable and mature soon, since BtrFS will pretty much match ZFS
> for features when it's done some day. Snapshots are the number one
> feature of ZFS and BtrFS!

[ more of ZFS deleted to save space ]

That's interesting. I haven't personally looked at ZFS except as a way to
get deduplication. If there's a way to make deduplication work on other file
systems (and not with a script that replaces files with links, but actual
deduplication of individual data blocks) then I'll look into that for my
system and for the work box. Do you recall where you saw the "opendedup"
project? I'd like to look into that!

> Again, the file system you end up choosing is going to depend entirely
> on exactly what he's using it for.

As I said above, legally downloaded movies (and DVD ISOs that I ripped from
disks I own), software (including games, of course), and MP3s will make up
the bulk of the storage. The MythTV box will be a separate machine with its
own HDD, and I'll probably stick with RAID1 on that one. Say, two 1TB HDDs
(I can use one of the two from my old home server, the one that's not
flaking out) and buy a second to mirror it onto.
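
If I go that route, one common way to turn an existing data disk plus one
new blank disk into a mirror is to build the array degraded and copy into
it; a sketch, with placeholder device names and mount points:

    # 1. Create a degraded RAID1 on the new, blank disk.
    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdnew missing
    mkfs.ext4 /dev/md1

    # 2. Copy the existing data onto the array.
    mount /dev/md1 /mnt/new
    cp -a /mnt/old/. /mnt/new/

    # 3. Add the old disk; mdadm resyncs it into the mirror
    #    (overwriting its previous contents).
    mdadm --add /dev/md1 /dev/sdold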

Thanks for the help/suggestions on this! You've given me some real help and
good ammunition to use against the big man when he insists things be done a
certain way. :)

--- Dan Egli


On Sat, Mar 16, 2013 at 8:39 PM, Michael Torrie <torriem at gmail.com> wrote:

> On 03/16/2013 03:04 AM, Dan Egli wrote:
> >> For a home server I recommend RAID1 or RAID10 over RAID6.
> >
> > Really? I guess between RAID6 and RAID10 it's not much different, but
> what
> > about someone who has say six or eight disks in the server? I'm curious
> why
> > you'd still recommend RAID10? Hypothetically speaking, let's assume I
> > wanted to have a server big enough to hold 1 year of downloaded data from
> > the net, downloading at approx 5Mbps (with TCP overhead, that comes to
> > approx 1 MB every 2 seconds) 24/7/365. That's nearly 16 TB. A RAID6 could
> > handle that with 6 drives, 4TB each. A raid10 would need 8 drives. I
> admit
> > each is possible to throw into a full tower case, but why spend the extra
> > money on two more drives, making the two raid10s? I am genuinely curious.
>
> First of all, what do you need all that space for?  The kind of data you
> store really dictates what kind of setup you need.
>
> Disks are cheap and if the data is that important to you, then buying
> twice as many disks as your capacity needs is really not that big of a
> deal.  The cost of 6 disks vs 8 disks is negligible, compared to the
> peace of mind that the RAID-10 can bring you.
>
> Furthermore, if the data is that important to you, you will need a
> backup system, which at minimum is at least another set of disks, that
> receive a full set of data periodically, and are stored off-line.
>
> Secondly you'll need a very large power supply and a SATA expansion card
> if you're really planning to stuff that many disks in a box.  There's a
> reason why people often buy a SAN array box with its own (redundant)
> power supplies.
>
> I'm unsure as to whether you are now talking about your own personal
> server which will go into a house, or your work project.
>
> For a home system, I think most people are served well by just two disks
> in RAID-1 configuration, plus a set of backup disks.  Very large sets of
> data like photos, movies, and maybe MythTV recordings, don't need RAID
> at all.  They don't change often, so a really good backup is much more
> important than RAID.
>
>
> > Well, that's not really an issue because I finally realized I could break
> > my boss down by using some basic math. I showed him using basic
> > multiplication how long it would take to fill the 120TB array he wanted
> > (more than eight years to reach 25% capacity) and he FINALLY agreed that
> we
> > could do it much cheaper and easier by building a full tower PC and
> filling
> > it with Hard Disk Drives. So we're going to order the parts soon. Thank
> > goodness for that. I'm still not sure which chassis he wanted. I think he
> > was thinking of going to a company like Aberdeen or someone. I have
> > insufficient experience to state whether or not that was a good idea, but
> > thankfully it's a moot point now. I imagine we can fit about 10 disks in
> a
> > large case (I have to do some research on cases to find the one that will
> > let us hold as many hard disks as we can), and make a raid out of them.
>
> There are companies that make cases for disks.  Poor-man's arrays.  The
> cases have lots of rails and a big power supply.  Most of them then just
> have e-SATA ports on the back that you can connect to a PC's eSATA
> adapter cards (which you will need since PC's usually have 4 or less
> SATA ports on the motherboard).  Also if you use an eSATA adapter, then
> the drives are hot-swappable (if not in use!).
>
> Here're a couple of ideas:
>
> http://www.istarusa.com/istarusa/product_speclist.php?series=Tower&sub=JBOD%20CASE
>
> http://www.granitedigital.com/SATAproseries8x.aspx
>
> For inter-box connections eSATA connectors are better than SATA because
> the connector has a clip to keep it plugged in, whereas most SATA cables
> are just held in with friction.
>
> >> it for years on Solaris without issue). But I'm not sure of the status
> >> of the zfs-on-linux project.
> >
> > So what would you use? Be aware that he's REALLY keen on using a file
> > system that includes journaling and data-deduplication. I don't know how
> > easy it's going to be to change his mind. It took near a week of
> arguments
> > before I got him to abandon the rackmount server idea. I'm well aware of
> > many of the advantages of file systems like Ext4 and JFS. But try
> > convincing my boss on that. He's one of those people who hears about some
> > new idea, likes it, and wants it implemented, despite not knowing how it
> > works internally or what would be involved in the implementation.
>
> I did enjoy ZFS features a lot and hope that Linux's home-grown BtrFS
> will get stable and mature soon, since BtrFS will pretty much match ZFS
> for features when it's done some day.  Snapshots are the number one
> feature of ZFS and BtrFS!  For a file server serving thousands of users,
> having very cheap snapshots allowing users to see their own files over the
> last 7 days was really slick.  I used to snapshot every night for the
> last week, then every month for the last year.   Because of the COW
> nature of ZFS, these snapshots only cost in size the difference between
> the snapshot and the current version of the file.
>
> Anyway ZFS is not a journaling filesystem (neither is BtrFS).  They
> simply don't need journals.  They are copy-on-write file systems, which
> means they are always consistent and after a failure, all you can lose
> are uncommitted blocks.
>
> Deduplication is something you can do at a much higher level.  For
> example, a script could find identical files and replace them with hard
> links.  There was an experimental project I saw once called, "opendedup"
> that was a FUSE filesystem that you could run on any underlying
> filesystem and do block-level deduplication somewhere without having to
> have a special physical file system.
>
> Again, the file system you end up choosing is going to depend entirely
> on exactly what he's using it for.  For example, in a home server, if my
> main storage needs were for MythTV, I would eschew any form of RAID and
> keep my disks formatted as individual volumes to Ext4 because MythTV
> treats all its storage as a big pool so there's no need to have one big
> file system across all the devices.
>
> From what I can see so far, you really have 3 native linux choices:
> Ext4, XFS, and BtrFS.  Of those, XFS has been used on huge arrays for
> many years.  BtrFS might be stable enough for your use.  Ext4 is very
> stable and perfectly capable of being used on multi-terabyte volumes.
> None of them have built in deduplication.
>
> /*
> PLUG: http://plug.org, #utah on irc.freenode.net
> Unsubscribe: http://plug.org/mailman/options/plug
> Don't fear the penguin.
> */
>

