hard drive - WD1000FYPS harddrive is marked 0 mb in 3ware (and no SMART)

hard-drive raid smart

01
2013-08

osgx

After a reboot, my 1 TB SATA WD1000FYPS (previously it was marked "Drive error") is shown as 0 MB in the 3ware web GUI.

Complete message:

Available Drives (Controller ID 0)
Port 1  WDC WD1000FYPS-01ZKB0   0.00 MB NOT SUPPORTED   [Remove Drive]

SMART now gives me only the Device Model and ATA version 1 (not 7 or 8, as it should be for SATA).

What does it mean?

Just before the reboot, when it was marked only with "Device Error", the SMART output was:

Device Model:     WDC WD1000FYPS-01ZKB0
Serial Number:    WD-WCASJ1130***
Firmware Version: 02.01B01
User Capacity:    1,000,204,886,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Sun Mar  7 18:47:35 2010 MSK
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

SMART overall-health self-assessment test result: PASSED

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   188   186   021    Pre-fail  Always       -       7591
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       229
  5 Reallocated_Sector_Ct   0x0033   199   199   140    Pre-fail  Always       -       3
  7 Seek_Error_Rate         0x000e   193   193   000    Old_age   Always       -       125
  9 Power_On_Hours          0x0032   078   078   000    Old_age   Always       -       16615
 10 Spin_Retry_Count        0x0012   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0012   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       77
192 Power-Off_Retract_Count 0x0032   198   198   000    Old_age   Always       -       1564
193 Load_Cycle_Count        0x0032   146   146   000    Old_age   Always       -       164824
194 Temperature_Celsius     0x0022   117   100   000    Old_age   Always       -       35
196 Reallocated_Event_Count 0x0032   199   199   000    Old_age   Always       -       1
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

What can be wrong with it? Can it be restored?

The new SMART output is:

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD1000FYPS-01ZKB0
Serial Number:    [No Information Found]
Firmware Version: [No Information Found]
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   1
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Mon Mar  8 00:29:44 2010 MSK
SMART is only available in ATA Version 3 Revision 3 or greater.
We will try to proceed in spite of this.
SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 82-83 don't show if SMART supported.
                  Checking for SMART support by trying SMART ENABLE command.
Command failed, ata.status=(0x00), ata.command=(0x51), ata.flags=(0x01)
Error SMART Enable failed: Input/output error
                  SMART ENABLE failed - this establishes that this device lacks SMART functionality.
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

P.P.S. There was rapid growth of "192 Power-Off_Retract_Count" before the drive died. The drive was used in a RAID with several drives from the same factory packaging box (close IDs). The drives were mounted identically. "Rapid" means almost linear growth from 300 to 1700 in 6-7 hours. The maximum temperature was 41°C (thanks to munin's SMART monitoring).

UPDATE

On the bottom of the hard drive's PCB I found contact pads with unusual colors. Most of the unsoldered pads are yellow, but some are blue and some are somewhere between orange and red. The maximum temperature for the drive was 42-43°C. The two drives that sat next to the dead one look normal: all their unsoldered pads are yellow.

The hard drive was used for 2 years in a RAID under a rather heavy load.

Answers

Alexander Burke

The drive has failed. RMA it back to WD.


Related Question

smart - hard drive pending sector count

hard-drive smart sectors

Matthew

For some reason, my count of pending sectors waiting to be remapped is unbelievably high (2163 currently). I've seen it go up by 20 in one week, but no sectors have been remapped. Dell's computer diagnostics utility reported no problems, smartctl -H returned PASSED, and I have yet to notice any problems with the hard drive.

So do I need to worry about such a high pending count?

Here are the results of smartctl -A:

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0007   252   252   025    Pre-fail  Always       -       2062
  4 Start_Stop_Count        0x0032   097   097   000    Old_age   Always       -       36147
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       3261
 12 Power_Cycle_Count       0x0032   098   098   000    Old_age   Always       -       2087
191 G-Sense_Error_Rate      0x0032   002   002   000    Old_age   Always       -       999999
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       47
194 Temperature_Celsius     0x0022   127   094   000    Old_age   Always       -       37 (Lifetime Min/Max 13/48)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       191990
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2163
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       19080
199 UDMA_CRC_Error_Count    0x0036   252   252   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x000a   252   252   000    Old_age   Always       -       0

Edit:

The disk is about 1 1/2 years old. Pending Sector Count was about 2000 when I started keeping an eye on it 2 weeks ago. I have never noticed any problems with the disk. If it makes any difference, I have a Dell M1530 dual boot Vista-Ubuntu. The hard drive is a Samsung HM160HI.

Edit:

Apparently half the problem was that I didn't (still somewhat don't) know how to interpret the data.

Thanks to everyone who gave me feedback.

Related Answers

harrymc

Your Current Pending Sector Count (2163) is higher than the Reallocation Sector Count (252).
This means that failing sectors can no longer be replaced by the disk firmware.
The disk is failing. Make sure you have backups, and get a replacement.

raven

If the drive is under warranty, send it back for replacement. On a stable drive, that number should be 0, just like the Reallocated Sectors Count.

nik

From your data dump, the SMART Attribute value shows 100.
Therefore, this is not a problem flagged by SMART either.

Update: That 100 is the normalized attribute value -- it just indicates the health status, not the count.
The worst value has been 100 too -- so it never went lower.

For example, look at ID# 194, the temperature:
the raw value is 37, the attribute value is 127, and the worst went into the 90s.
Nothing to worry about there either -- it is just an example of how to interpret attributes.
Again, the attribute value does not suggest your drive is running at 127°C.
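
To make the distinction between the normalized VALUE column and the RAW_VALUE column concrete, here is a minimal Python sketch that parses a few rows copied from the smartctl -A output in the question. The function name parse_attributes and the "at or below threshold" check are illustrative assumptions, not part of smartmontools; the sketch only assumes the ten-column layout shown above.

```python
# Rows copied from the smartctl -A output in the question.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0022   127   094   000    Old_age   Always       -       37 (Lifetime Min/Max 13/48)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2163
"""

def parse_attributes(text):
    """Parse smartctl -A table rows into {attribute_name: fields}.

    Hypothetical helper for illustration; assumes the 10-column layout above.
    """
    attrs = {}
    for line in text.splitlines():
        parts = line.split(None, 9)  # the raw column may itself contain spaces
        if len(parts) < 10 or not parts[0].isdigit():
            continue  # skip the header and any blank lines
        attrs[parts[1]] = {
            "id": int(parts[0]),
            "value": int(parts[3]),   # normalized health value
            "worst": int(parts[4]),   # lowest normalized value seen so far
            "thresh": int(parts[5]),  # failure threshold for the normalized value
            "raw": int(parts[9].split()[0]),  # vendor-specific raw number
        }
    return attrs

attrs = parse_attributes(SAMPLE)
for name, a in attrs.items():
    status = "AT/BELOW THRESHOLD" if a["value"] <= a["thresh"] else "above threshold"
    print(f"{name}: normalized={a['value']} (thresh {a['thresh']}, {status}), raw={a['raw']}")
```

Read this way, the 252 next to Reallocated_Sector_Ct is a normalized value whose raw count is 0, while Current_Pending_Sector has a normalized value of 100 and a raw count of 2163.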

A couple of points from Wikipedia:

The inability to read some sectors is not always an indication that a drive is about to fail. One way that unreadable sectors may be created, even when the drive is functioning within specification, is through a sudden power failure while the drive is writing. In order to prevent this problem, modern hard drives will always finish writing at least the current sector immediately after the power fails (typically using rotational energy from the disk). Also, even if the physical disk is damaged at one location, such that a certain sector is unreadable, the disk may be able to use spare space to replace the bad area, so that the sector can be overwritten.

Number of "unstable" sectors (waiting to be remapped, because of read errors). If an unstable sector is subsequently written or read successfully, this value is decreased and the sector is not remapped. Read errors on a sector will not remap the sector (since it might be readable later); instead, the drive firmware remembers that the sector needs to be remapped, and remaps it the next time it's written.

Further to the downvote and comment:

A raw count at Current Pending Sectors usually implies sectors that are more or less written off by the drive. This could be for various reasons that do not always imply an impending disk failure.
If the raw count keeps increasing at regular intervals (days/weeks) it would then suggest a likely full disk failure. For example, do you recall (or have stored data) from an earlier check that shows this count to be lower or zero?
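
One way to check whether the raw count keeps increasing is to log it at intervals and compare readings. The following is a rough Python sketch; the function name growth_per_day and the dates are hypothetical, and the two counts are only the approximate figures mentioned in the question (about 2000 two weeks earlier, 2163 now).

```python
from datetime import date

def growth_per_day(readings):
    """Average daily change in a raw SMART count.

    readings: list of (date, raw_count) tuples, oldest first.
    Hypothetical helper for illustration only.
    """
    (first_day, first_count) = readings[0]
    (last_day, last_count) = readings[-1]
    days = (last_day - first_day).days
    if days <= 0:
        raise ValueError("need readings taken on at least two different days")
    return (last_count - first_count) / days

# Dates are made up; counts are the approximate values from the question.
readings = [
    (date(2010, 1, 1), 2000),
    (date(2010, 1, 15), 2163),
]
rate = growth_per_day(readings)
print(f"Current_Pending_Sector grew by about {rate:.1f} sectors per day")
```

A rate that stays above zero over several checks would point toward the steadily worsening case described above, whereas a count that stays flat or drops after the sectors are rewritten would not.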
