Monday, March 19, 2012

Defunct Drives Issue

We've been struggling with a problem for a while now. If anyone has had a
similar issue, I'd appreciate it if you could share it here, as it may lead
me to a solution. The cluster houses SQL and IIS (bad, I know, but it
shouldn't cause the problems we see).
We have the following Cluster hardware:
2 IBM x345 Servers
1 IBM ServeRAID 4MX (RAID 5)
Basically, whenever we have both machines connected to the cluster, at some
point (sometimes days, sometimes weeks) a failure will occur where the
clustered drives (Data and Quorum) become defunct. Bringing them back
online, restarting, etc. works fine, but it takes a while and always
carries risk.
I've been working with IBM for months now to try to troubleshoot this but
nothing has helped to make this a "highly available" environment.
|||From your description it looks like this is a SCSI cluster. Can you give a
complete description of the SCSI device, as well as the physical (RAID) and
logical (LUN) disk layouts for the cluster? I have an idea where your
problem might be, but I need more information to be sure.
Geoff N. Hiten
Microsoft SQL Server MVP
Senior Database Administrator
|||Thanks Geoff,
Here's what I think you are looking for:
1) Both servers are equipped with 2x18 GB mirrored (Array A), connected to
the internal channel of the IBM ServeRAID 4MX. These are the logical C: and
D: drives. External Channel 1 of the RAID controller connects to the shared
SCSI storage.
2) Array B = 2x18 GB mirrored = Q: drive (Quorum), slots 13-14. (Physical
device is a shared IBM SCSI storage array.)
3) Array C = 5x18 GB RAID 5 = S: drive (Shared), slots 0-4. (Physical device
is a shared IBM SCSI storage array.)
Summarized:
LUNs = Q: and S: [storage array, Arrays B & C]
C: and D: = internal to each server [Array A]
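The layout described above can also be written down as data, which makes it easy to see which arrays hang off which controller channel. This is only a sketch of one reading of the post: the class, the field names, and the channel labels are made up for illustration, and it assumes each node has a single ServeRAID controller serving both its internal and external channels.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Array:
    name: str      # array label from the post (A, B, C)
    raid: str      # "RAID 1" for the mirrors, "RAID 5" for Array C
    disks: int     # number of 18 GB disks in the array
    channel: str   # hypothetical label: "internal" or "external-1"
    shared: bool   # True if the array lives in the shared SCSI enclosure
    drives: tuple  # drive letters presented to Windows

# Arrays as seen through ONE node's controller (assumed layout).
NODE_ARRAYS = [
    Array("A", "RAID 1", 2, "internal", False, ("C:", "D:")),
    Array("B", "RAID 1", 2, "external-1", True, ("Q:",)),
    Array("C", "RAID 5", 5, "external-1", True, ("S:",)),
]

def controller_is_mixed(arrays):
    """Return True if one controller serves both local and shared arrays."""
    return {a.shared for a in arrays} == {True, False}

def shared_drive_letters(arrays):
    """Return the drive letters that live on the shared enclosure."""
    return sorted(d for a in arrays if a.shared for d in a.drives)

if __name__ == "__main__":
    print("mixed controller:", controller_is_mixed(NODE_ARRAYS))
    print("clustered drives:", shared_drive_letters(NODE_ARRAYS))
```

Under these assumptions the same controller carries the local boot mirror and the clustered Q: and S: arrays.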
|||Looks like there is a problem with sharing a controller between the
clustered resources and the local disk resources. Make the vendor show you
where this is a certified cluster solution. I don't think shared controllers
are supported.
Any way you slice it, you will get very poor performance from a SCSI storage
array in a clustered environment using RAID 5 containers. Clustering
requires that the controllers operate in direct-write mode (no write cache),
so RAID 5 is extremely slow.
Geoff N. Hiten
Microsoft SQL Server MVP
Senior Database Administrator
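The RAID 5 point above can be made concrete with rough arithmetic. With no write cache, a small random write on RAID 5 typically costs four disk operations (read old data, read old parity, write new data, write new parity), against two on a mirror. The sketch below uses those commonly quoted penalty factors; the per-disk IOPS figure is a placeholder chosen for illustration, not a measurement from this cluster.

```python
# Commonly quoted back-end I/Os per small random write, with no write cache.
WRITE_PENALTY = {"RAID 1": 2, "RAID 5": 4}

def effective_write_iops(disks, per_disk_iops, raid_level):
    """Rough upper bound on random-write IOPS for an array."""
    return disks * per_disk_iops / WRITE_PENALTY[raid_level]

PER_DISK_IOPS = 100  # hypothetical placeholder, not a measured value

if __name__ == "__main__":
    raw = 5 * PER_DISK_IOPS  # raw random I/O rate of the five S: disks
    raid5 = effective_write_iops(5, PER_DISK_IOPS, "RAID 5")
    print(f"raw: {raw}, RAID 5 random writes: {raid5}")
```

With these placeholder numbers, the five-disk RAID 5 array sustains roughly a quarter of its raw random I/O rate on writes.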
|||This looks identical in concept to the HP prepackaged cluster setup that I
am using in a very similar fashion (internal mirrors are C: only). I have
not had any problems like that in 7 months of running.
