Wednesday, March 28, 2012

I/O issues after SP4 upgrade

Hi,
We have upgraded 3 of our SQL Servers to SP4 in the past 2 weeks. We
monitor these servers with Veritas, and we have noticed that all 3 servers
have shown a marked increase (around 4x) in Disk Sec/Transfer, Disk
Sec/Write, and Disk Queue Length.
For example, on one server, Disk Writes and Disk Transfers were taking on
average 7 ms to complete (with SP3a). Post-SP4, they are taking
approximately 30 ms on average. The queue length has jumped from around 3
to about 15. [I know the avg disk queue length was a bit high to start
with - we are working with our SAN vendor to get this figured out]
Queries are still executing in about the same time, CPU utilization is about
the same, and we see no other negative effects. But we're wondering why the
I/O is so much worse than when we were running SP3a. We changed nothing
besides moving to SP4. And we did install the post-SP4 AWE patch as well.
Has anyone run into this before or know of any articles that touch on
possible causes?
--
Regards,
Jake Marx
www.longhead.com
[please keep replies in the newsgroup - email address unmonitored]
|||Have the query plans changed? Is the increased I/O due to more table scans?
Did you run sp_updatestats?
--
Andrew J. Kelly SQL MVP
|||Hi Andrew,
Andrew J. Kelly wrote:
> Have the query plans changed? Is the increased I/O due to more table
> scans? Did you run sp_updatestats?
Thanks for the suggestions. I did not run sp_updatestats - I just did it on
one of the less-busy servers to see if it makes a difference. The query
plans are not different from what I have seen so far, but I will keep
looking. Table scans have come down a bit since upgrading to SP4.
The strange thing (to me) is that overall I/O activity is about the same;
reads/writes per sec is pretty constant. It's just the duration of each I/O
operation and the disk queue length that went up. And this happened only on
the data drive - the trans log and local (tempdb) drives' queue lengths
actually decreased.
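A rough way to sanity-check these figures is Little's Law, which says the average number of requests in flight is roughly throughput multiplied by time per request. The sketch below is only an illustration. It assumes the perfmon averages quoted in this thread (7 ms with a queue of about 3 before SP4, 30 ms with a queue of about 15 after) describe a steady workload. `implied_transfers_per_sec` is a made-up helper name, not a perfmon or SQL Server API.

```python
def implied_transfers_per_sec(avg_queue_length: float, avg_sec_per_transfer: float) -> float:
    """Little's Law rearranged: throughput = requests in flight / time per request."""
    if avg_sec_per_transfer <= 0:
        raise ValueError("avg_sec_per_transfer must be positive")
    return avg_queue_length / avg_sec_per_transfer


# Averages taken from the thread; they are rounded, so treat the results as rough.
before = implied_transfers_per_sec(avg_queue_length=3, avg_sec_per_transfer=0.007)
after = implied_transfers_per_sec(avg_queue_length=15, avg_sec_per_transfer=0.030)

print(f"implied transfers/sec before SP4: {before:.0f}")
print(f"implied transfers/sec after SP4:  {after:.0f}")
```

If both results land in the same ballpark, the longer queue is largely what slower individual I/Os would produce on their own at an unchanged request rate, which fits the observation that reads/writes per sec stayed constant.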
--
Regards,
Jake Marx
www.longhead.com
[please keep replies in the newsgroup - email address unmonitored]
|||There should be very few table scans if they are optimized properly
regardless of the SP. I haven't seen anything yet that indicates SP4 itself
caused increased I/O times but anything is possible. Any chance you were
monitoring fn_virtualfilestats() before and after? If so do you notice
increased bytes read or written or just increased IOStallms?
--
Andrew J. Kelly SQL MVP
|||Hi Andrew,
Andrew J. Kelly wrote:
> There should be very few table scans if they are optimized properly
> regardless of the SP. I haven't seen anything yet that indicates SP4
> itself caused increased I/O times but anything is possible. Any
> chance you were monitoring fn_virtualfilestats() before and after? If so
> do you notice increased bytes read or written or just increased
> IOStallms?
Most of the table/index scans are on very small tables (some of which are
heaps). There aren't any table scans on larger tables, and there are very
few index scans on larger tables (we're working on removing those as we
can). Full scans/sec seem to hover around 3. We have several hundred
concurrent users on this db.
No, I haven't used fn_virtualfilestats() before. Just ran it, and it seems
useful - thanks for the tip. IoStallMS / (NumberReads + NumberWrites) is
less than 6 for all files, so that seems OK based on the documentation I've
seen.
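The ratio above can be written out as a small calculation. This is a minimal sketch, assuming the three counters have already been copied out of one row of fn_virtualfilestats() output; the `FileStats` class and the sample numbers are hypothetical and not taken from any of the servers discussed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStats:
    """One file's counters, named after the columns mentioned in the thread."""
    number_reads: int
    number_writes: int
    io_stall_ms: int


def avg_stall_ms_per_io(stats: FileStats) -> float:
    """IoStallMS / (NumberReads + NumberWrites), guarding against an idle file."""
    total_ios = stats.number_reads + stats.number_writes
    if total_ios == 0:
        return 0.0
    return stats.io_stall_ms / total_ios


# Hypothetical counters for a single data file.
sample = FileStats(number_reads=1_200_000, number_writes=800_000, io_stall_ms=11_000_000)
print(f"average stall per I/O: {avg_stall_ms_per_io(sample):.1f} ms")
```

Because these counters accumulate from the time the instance started, a single reading gives a long-run average; a recent change in latency may be diluted by the older, faster I/Os already counted.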
--
Regards,
Jake Marx
www.longhead.com
[please keep replies in the newsgroup - email address unmonitored]
|||The filestats will let you know if you are actually reading / writing more
data or simply getting increased access times with the same data.
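One way to make that comparison is to take two readings some time apart and look at the differences. The sketch below assumes the counters from fn_virtualfilestats() have been captured twice for the same file; the `Snapshot` class, the `interval_summary` helper, and all the numbers are hypothetical, chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Cumulative counters for one file at one point in time."""
    number_reads: int
    number_writes: int
    bytes_read: int
    bytes_written: int
    io_stall_ms: int


def interval_summary(earlier: Snapshot, later: Snapshot) -> dict:
    """Subtract two cumulative readings to describe only the interval between them."""
    ios = (later.number_reads - earlier.number_reads) + (
        later.number_writes - earlier.number_writes
    )
    bytes_moved = (later.bytes_read - earlier.bytes_read) + (
        later.bytes_written - earlier.bytes_written
    )
    stall_ms = later.io_stall_ms - earlier.io_stall_ms
    return {
        "ios": ios,
        "bytes_moved": bytes_moved,
        "bytes_per_io": bytes_moved / ios if ios else 0.0,
        "stall_ms_per_io": stall_ms / ios if ios else 0.0,
    }


# Hypothetical readings taken at the start and end of a monitoring window.
first = Snapshot(1_000_000, 500_000, 8_192_000_000, 4_096_000_000, 9_000_000)
second = Snapshot(1_060_000, 540_000, 8_683_520_000, 4_423_680_000, 12_000_000)

summary = interval_summary(first, second)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Comparing summaries from a window before a change and a window after it separates the two cases: if `bytes_moved` and `ios` are similar but `stall_ms_per_io` has grown, the same data is simply taking longer to access.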
--
Andrew J. Kelly SQL MVP
|||Hi Andrew,
Andrew J. Kelly wrote:
> The filestats will let you know if you are actually reading / writing
> more data or simply getting increased access times with the same data.
Thanks. According to perfmon before and after SP4 installation, the I/O
load is about the same now as it was then (approx the same # of writes/sec
and reads/sec) - it's just the sec/write and sec/transfer (along with the
disk queue length) that are higher.
I'm not sure what else to look at. I know that SP4 reports I/O
latching/locking differently than SP3a, but that shouldn't affect perfmon OS
stats, right? I'd imagine those are collected directly by the disk
subsystem and that SQL would have nothing to do with that collection...
Thanks again for your help. I'm going through all the query plans now to
make sure nothing has changed.
--
Regards,
Jake Marx
www.longhead.com
[please keep replies in the newsgroup - email address unmonitored]
|||The sp should not affect perfmon stats for I/O. Are you sure there wasn't
anything else done around that same timeframe that may have affected this?
Any changes to the SAN or the OS? I wish I could offer something better
than that but this is not something I have heard happening to others upon
adding SP4 (or any service pack for that matter).
--
Andrew J. Kelly SQL MVP
|||Hi Andrew,
Andrew J. Kelly wrote:
> The sp should not affect perfmon stats for I/O. Are you sure there
> wasn't anything else done around that same timeframe that may have
> affected this? Any changes to the SAN or the OS? I wish I could
> offer something better than that but this is not something I have
> heard happening to others upon adding SP4 (or any service pack for
> that matter).
I know - that's why I find this so strange. I would think that maybe we had
done something else at the same time, but we have done the SP4 upgrade on
three separate servers at three separate times, and we see the increase in
perfmon I/O stats on each server beginning with the time we upgraded each
one. We are using an EMC Celerra over iSCSI for the data and log files on
all three servers, so maybe SP4 changed something in that respect? I
wouldn't think so. (?)
Thanks again for all of your help!
--
Regards,
Jake
