Wednesday, March 28, 2012

I/O error 1450, Error 823 during heavy loads

Hi,

We're experiencing the following error regularly:
"I/O error 1450(Insufficient system resources exist to complete the requested service.)"

We are running SQL2000 with 2GB memory, Windows 2000 AS, 4 CPUs. We are using a SAN storage connected by 2 Fibre cards. The databases range from 10GB to 400GB in size for decision support applications.

Although it seems clear that the disk subsystem is causing this error, our hosting party is blaming the application layer for this behaviour.

From the SQL server log:
I/O error 1450(Insufficient system resources exist to complete the requested service.) detected during read at offset 0x00000350860000 in file 'O:\DATA\SA_Data.MDF'..
Error: 823, Severity: 24, State: 2
I/O error 1450(Insufficient system resources exist to complete the requested service.) detected during read at offset 0x00000350862000 in file 'O:\DATA\SA_Data.MDF'..
Error: 823, Severity: 24, State: 2
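
A side note on reading these two entries: SQL Server stores data in 8 KB pages, so each byte offset in the log can be converted to a page number within the file. The sketch below does that arithmetic for the two offsets quoted above; the function name is made up for illustration, and it only shows that both failed reads hit adjacent pages, not why they failed.

```python
PAGE_SIZE = 8192  # SQL Server data pages are 8 KB


def offset_to_page(offset: int) -> int:
    """Return the page number inside the data file for a byte offset."""
    return offset // PAGE_SIZE


# The two offsets reported in the SQL Server log for O:\DATA\SA_Data.MDF
offsets = [0x00000350860000, 0x00000350862000]
pages = [offset_to_page(o) for o in offsets]
print(pages)  # [1737776, 1737777]
```

The two failed reads land on consecutive pages, which is what one would expect from a single larger read request failing rather than two unrelated bad spots.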

From the event log:
[..]
dmio: Harddisk37 read error at block 23192007: status 0xc000009a
dmio: Harddisk35 read error at block 23192135: status 0xc000009a
dmio: Harddisk36 read error at block 23192127: status 0xc000009a
dmio: Harddisk36 read error at block 23192263: status 0xc000009a
[etc]
dmio: Disk Harddisk31 block 23193791 (mountpoint O:): Uncorrectable read error
[..]
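
To get an overview of entries like these, they can be tallied per disk and per status code. The following is a minimal sketch that assumes the event log text has been exported to a string; the regular expression and helper name are illustrative only. The one fixed fact it relies on is that NTSTATUS 0xC000009A is STATUS_INSUFFICIENT_RESOURCES, which corresponds to the Win32 error 1450 that SQL Server reports.

```python
import re
from collections import Counter

# Sample lines copied from the event log excerpt above
LOG = """\
dmio: Harddisk37 read error at block 23192007: status 0xc000009a
dmio: Harddisk35 read error at block 23192135: status 0xc000009a
dmio: Harddisk36 read error at block 23192127: status 0xc000009a
dmio: Harddisk36 read error at block 23192263: status 0xc000009a
"""

# NTSTATUS 0xC000009A is STATUS_INSUFFICIENT_RESOURCES
KNOWN_STATUS = {0xC000009A: "STATUS_INSUFFICIENT_RESOURCES"}

PATTERN = re.compile(
    r"dmio: (Harddisk\d+) read error at block (\d+): status (0x[0-9a-fA-F]+)"
)


def summarize(log_text):
    """Count read errors per disk and per (decoded) status code."""
    per_disk = Counter()
    statuses = Counter()
    for match in PATTERN.finditer(log_text):
        disk, _block, status = match.groups()
        per_disk[disk] += 1
        # Fall back to the raw hex string for codes not in the table
        statuses[KNOWN_STATUS.get(int(status, 16), status)] += 1
    return per_disk, statuses


per_disk, statuses = summarize(LOG)
print(dict(per_disk))
print(dict(statuses))
```

In this excerpt the errors are spread over several disks and all carry the same resource-exhaustion status, which fits a host-side resource shortage better than a single failing disk, although the excerpt is too short to prove that.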

Does anyone have any suggestions??

Desmond|||Check this KB:
http://support.microsoft.com/default.aspx?scid=kb;EN-US;274310|||Thanks, but this KB article applies to MS SQL Server 7; we're on SQL Server 2000, SP3.|||Sorry, I skipped that part.|||Where should we look first: at the application, or at the hardware infrastructure?

We have 'dynamic disks' configured on Windows and use 60+ LUNs.
They are partly configured as stripe sets at the OS level (software RAID) and partly grouped into SQL Server filegroups, to maximize performance on the SAN storage.

It is a decision support data warehouse system, with the largest database at 400GB.

Any suggestion is appreciated!|||You should run all the hardware diagnostics you can on each piece of hardware between the CPU and the disk (fibre cards, controller, disks, and even the fibre cable if the others all come up clean). Error 823 is a very low-level error and cannot be caused by an errant stored procedure or query. Given even the limited number of errors shown here, I am sure Microsoft would be willing to join a conference call with you and the hosting company to review the errors and get them to run the hardware diagnostics.|||Right now, the most likely root cause is a low level of free System Page Table Entries (SystemPTEs). This should, however, show up in the event log like this:
Stop 0x0000003F NO_MORE_SYSTEM_PTES
Stop 0x000000D8 DRIVER_USED_EXCESSIVE_PTES
which it does not.

Looking at:
http://support.microsoft.com/default.aspx?scid=kb;en-us;828339
it is probably caused by a lower-level OS-related problem, possibly to do with
http://support.microsoft.com/default.aspx?scid=kb;en-us;555068

Please also refer to
http://support.microsoft.com/default.aspx?scid=kb;EN-US;247904
which describes which resource (System Page Table Entries, SystemPTEs) falls below a certain threshold.
We are now monitoring our SystemPTE levels while the box is under heavy load.

I'll keep you posted.
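
For the monitoring mentioned above, the relevant Performance Monitor counter is "Free System Page Table Entries" under the Memory object. The sketch below shows one way to flag low readings once samples have been collected. The sample values, the threshold, and the function name are all made up for illustration; they are not taken from this system or from the KB articles.

```python
from typing import Iterable, List, Tuple

# (timestamp, value of Memory\Free System Page Table Entries)
Sample = Tuple[str, int]


def low_pte_samples(samples: Iterable[Sample], threshold: int) -> List[Sample]:
    """Return the samples whose free SystemPTE count is below the threshold."""
    return [(ts, free) for ts, free in samples if free < threshold]


# Hypothetical readings taken during a heavy load; not real measurements
samples = [("10:00", 42000), ("10:05", 9000), ("10:10", 4500)]

# The threshold is an assumption; pick one appropriate for the system
print(low_pte_samples(samples, threshold=10000))
# [('10:05', 9000), ('10:10', 4500)]
```

Lining up the timestamps of low readings with the 823 errors in the SQL Server log would show whether the two actually coincide.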
