Potential hard-drive failure

2014-07
  • AaronDS

  • Answers
  • lserni

    The disk looks good to me. "Pre-fail" attributes are those whose normalized value, if it drops to or below the threshold, may indicate imminent disk failure. "Old age" attributes are the ones that track normal wear and tear.

    So, a reallocated event count with a value/worst of 200/200 and a threshold of 000 ought to mean "No reallocated events", i.e., "No errors".
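
    The threshold rule above can be sketched in Python. This is a minimal illustration, not smartctl's own code: the `SmartAttr` tuple and the `parse_attr_line` and `assess` helpers are names made up for this example, and treating a threshold of 000 as "can never trip" is an assumption about how such attributes should be read.

```python
from typing import NamedTuple


class SmartAttr(NamedTuple):
    attr_id: int
    name: str
    value: int        # current normalized value
    worst: int        # lowest normalized value ever recorded
    thresh: int       # vendor threshold
    attr_type: str    # "Pre-fail" or "Old_age"
    when_failed: str
    raw: str


def parse_attr_line(line: str) -> SmartAttr:
    """Parse one row of the attribute table printed by `smartctl -A`."""
    # Columns: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    # The raw column may contain spaces, so split at most 9 times.
    parts = line.split(None, 9)
    return SmartAttr(
        attr_id=int(parts[0]),
        name=parts[1],
        value=int(parts[3]),
        worst=int(parts[4]),
        thresh=int(parts[5]),
        attr_type=parts[6],
        when_failed=parts[8],
        raw=parts[9].strip(),
    )


def assess(attr: SmartAttr) -> str:
    """Compare normalized values against the threshold."""
    if attr.thresh == 0:
        return "ok"  # assumption: a zero threshold never trips
    if attr.value <= attr.thresh:
        return "FAILING_NOW"
    if attr.worst <= attr.thresh:
        return "FAILED_IN_PAST"
    return "ok"


sample_rows = [
    "  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       1",
    " 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       3",
]
for row in sample_rows:
    attr = parse_attr_line(row)
    print(attr.name, attr.attr_type, assess(attr))
```

    Note that the comparison uses the normalized VALUE and WORST columns, not the raw value in the last column, which is vendor-specific.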

    This is what I read on my home unit:

      5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       1
     10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       3
     12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       105
    184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
    187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
    188 Command_Timeout         0x0032   100   096   000    Old_age   Always       -       4295098559
    197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
    198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
    199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
    

    As you can see, I had a sluggish command time out at some point, but the other parameters are (touch wood!) those of a healthy disk.
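
    The huge raw value for Command_Timeout is probably not a single count. A commonly reported convention, which I am assuming here rather than quoting from any vendor specification, is that some drives pack three 16-bit counters into the 48-bit raw field. The helper name `split_raw48` is made up for this sketch, and what each of the three words means is not something the output above tells us.

```python
from typing import Tuple


def split_raw48(raw: int) -> Tuple[int, int, int]:
    """Split a 48-bit SMART raw value into three 16-bit words (high, middle, low)."""
    if not 0 <= raw < (1 << 48):
        raise ValueError("raw value does not fit in 48 bits")
    return (raw >> 32) & 0xFFFF, (raw >> 16) & 0xFFFF, raw & 0xFFFF


# Raw Command_Timeout value from the table above
print(split_raw48(4295098559))  # (1, 2, 191)
```

    Read this way, the number is three small counters rather than four billion timeouts, which fits a normalized value that is still at 100.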

    Anyway, check the Windows Event Viewer (eventvwr). If the hard disk has problems, even ones that SMART does not report, you ought to see entries in the event log relating to disk errors, or maybe filesystem errors. If you see nothing of the kind, you of course still have a problem (the system didn't slow down by itself!), but it is not a disk problem.
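
    If you prefer the command line to clicking through Event Viewer, the same check can be sketched with `wevtutil`, which ships with Windows. The provider names below (`disk`, `Ntfs`, `atapi`) are assumptions; which ones actually log on your machine depends on the controller and driver. `build_query` and `show_recent_disk_events` are hypothetical helper names for this example.

```python
import subprocess
import sys
from typing import List, Sequence

# Assumed provider names; adjust to what your system actually logs under.
DISK_PROVIDERS = ("disk", "Ntfs", "atapi")


def build_query(providers: Sequence[str] = DISK_PROVIDERS, max_events: int = 20) -> List[str]:
    """Build a wevtutil command that lists recent critical/error/warning
    events from the given providers in the System log, newest first."""
    names = " or ".join(f"@Name='{p}'" for p in providers)
    # Level 1 = critical, 2 = error, 3 = warning
    xpath = f"*[System[Provider[{names}] and (Level=1 or Level=2 or Level=3)]]"
    return ["wevtutil", "qe", "System", f"/q:{xpath}",
            f"/c:{max_events}", "/rd:true", "/f:text"]


def show_recent_disk_events() -> None:
    """Run the query and print whatever wevtutil returns (Windows only)."""
    if sys.platform != "win32":
        raise RuntimeError("wevtutil is only available on Windows")
    result = subprocess.run(build_query(), capture_output=True, text=True)
    print(result.stdout or result.stderr)
```

    Calling `show_recent_disk_events()` on the affected machine prints the matching entries; an empty result suggests the disk and filesystem drivers have not been complaining.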

    For example, I once experienced similar symptoms (only much worse). The hard disk was working... then sometimes it would log "hardware disk errors" that SMART wasn't seeing at all. Windows signaled "Delayed write failed, data might have been lost" (and now that I come to think of it: did you see that message popping up? If the hard disk is going, you should have). I removed the disk and connected it to another computer to run some tests. Everything worked. Sheesh. So I put the disk back. But this time it kept working perfectly. Only then did I remember that reconnecting the disk had felt much harder than disconnecting it -- just as if the connector had already been partially loose. It might be worth checking.

    Otherwise, you might be interested in a tool such as Auslogics' BoostSpeed. It's not perfect (it falls for the .wid btrez.dll 'error' - but it's fixable and there's a workaround) and it leans a bit toward scaremongering when it reports any registry anomaly as a sign of impending doom, but it does its job, and IMHO it's worth the money.

    Just to be sure, you can download a bootable antivirus ISO (Kaspersky has a free version, and there are others), boot from that, and make sure you're not being slowed down by some unwanted "guest".

    But before doing anything else, back up all your valuable data to an external device. That way, whatever is happening, your data ought to be safe.


  • Related Question

    Hard drive misbehaving, SMART and manufacturer diagnostics pass. Should I replace it?
  • TuxRug

    My laptop is an Asus G50Vt-X5, which came with a Seagate Momentus HDD. After owning this computer for a while, I added a second HDD from an older Toshiba laptop that had a broken screen. Both hard drives have been working perfectly for a long time, but I've recently had data corruption issues on my system drive (the Seagate).

    After a while, my computer began locking up for a couple minutes about once a week, during which time, the system drive audibly struggles. It sounds like the head is seeking repeatedly but failing to lock on the correct track. Just last night, it did it again several times during boot, taking more than 30 seconds to get from the BIOS finishing POST to the Windows Boot loader appearing. After logging into Windows 7, the drive started thrashing again after loading the desktop, and after about one minute of this, I got a STOP error indicating that a system process had terminated unexpectedly. The memory dump failed, as the system drive was still thrashing. When it rebooted, my BIOS froze on drive detection.

    I unplugged the laptop and removed the battery, waited two minutes, then plugged it back in and it booted up normally (except for the "Windows did not shut down properly" message).

    I shut down and rebooted into Repair mode to run CHKDSK (I can no longer schedule it on boot, as autochk.exe keeps getting corrupted). It found minor inconsistencies and corrected them.

    I then returned to Windows and used Smartmontools to get the following information about my system drive:

    smartctl 5.39.1 2010-01-28 r3054 [i686-pc-mingw32-win7(64)] (sf-win32-5.39.1-1)
    Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
    
    === START OF INFORMATION SECTION ===
    Model Family:     Seagate Momentus 7200.3 series
    Device Model:     ST9320421AS
    Serial Number:    5TJ0KWKP
    Firmware Version: SD14
    User Capacity:    320,072,933,376 bytes
    Device is:        In smartctl database [for details use: -P show]
    ATA Version is:   8
    ATA Standard is:  ATA-8-ACS revision 4
    Local Time is:    Thu Sep 30 19:03:01 2010 MDT
    SMART support is: Available - device has SMART capability.
                      Enabled status cached by OS, trying SMART RETURN STATUS cmd.
    SMART support is: Enabled
    
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    See vendor-specific Attribute list for marginal Attributes.
    
    General SMART Values:
    Offline data collection status:  (0x00) Offline data collection activity
                        was never started.
                        Auto Offline Data Collection: Disabled.
    Self-test execution status:      (   0) The previous self-test routine completed
                        without error or no self-test has ever 
                        been run.
    Total time to complete Offline 
    data collection:         (   0) seconds.
    Offline data collection
    capabilities:            (0x73) SMART execute Offline immediate.
                        Auto Offline data collection on/off support.
                        Suspend Offline collection upon new
                        command.
                        No Offline surface scan supported.
                        Self-test supported.
                        Conveyance Self-test supported.
                        Selective Self-test supported.
    SMART capabilities:            (0x0003) Saves SMART data before entering
                        power-saving mode.
                        Supports SMART auto save timer.
    Error logging capability:        (0x01) Error logging supported.
                        General Purpose Logging supported.
    Short self-test routine 
    recommended polling time:    (   1) minutes.
    Extended self-test routine
    recommended polling time:    (  93) minutes.
    Conveyance self-test routine
    recommended polling time:    (   2) minutes.
    SCT capabilities:          (0x103b) SCT Status supported.
                        SCT Feature Control supported.
                        SCT Data Table supported.
    
    SMART Attributes Data Structure revision number: 10
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x000f   114   099   006    Pre-fail  Always       -       59028298
      3 Spin_Up_Time            0x0003   097   096   000    Pre-fail  Always       -       0
      4 Start_Stop_Count        0x0032   099   099   020    Old_age   Always       -       1080
      5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
      7 Seek_Error_Rate         0x000f   083   060   030    Pre-fail  Always       -       205986876
      9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       6512
     10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       32
     12 Power_Cycle_Count       0x0032   099   037   020    Old_age   Always       -       1056
    184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
    187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
    188 Command_Timeout         0x0032   100   096   000    Old_age   Always       -       4295032928
    189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
    190 Airflow_Temperature_Cel 0x0022   063   044   045    Old_age   Always   In_the_past 37 (0 14 38 28)
    191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
    192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       10
    193 Load_Cycle_Count        0x0032   039   039   000    Old_age   Always       -       123290
    194 Temperature_Celsius     0x0022   037   056   000    Old_age   Always       -       37 (0 20 0 0)
    195 Hardware_ECC_Recovered  0x001a   056   044   000    Old_age   Always       -       59028298
    197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
    198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
    199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
    254 Free_Fall_Sensor        0x0032   100   100   000    Old_age   Always       -       0
    
    General Purpose Logging (GPL) feature set supported
    General Purpose Log Directory Version 1
    SMART           Log Directory Version 1 [multi-sector log support]
    GP/S  Log at address 0x00 has    1 sectors [Log Directory]
    GP/S  Log at address 0x01 has    1 sectors [Summary SMART error log]
    GP/S  Log at address 0x02 has    5 sectors [Comprehensive SMART error log]
    GP/S  Log at address 0x03 has    5 sectors [Ext. Comprehensive SMART error log]
    GP/S  Log at address 0x06 has    1 sectors [SMART self-test log]
    GP/S  Log at address 0x07 has    1 sectors [Extended self-test log]
    GP/S  Log at address 0x09 has    1 sectors [Selective self-test log]
    GP/S  Log at address 0x10 has    1 sectors [NCQ Command Error]
    GP/S  Log at address 0x11 has    1 sectors [SATA Phy Event Counters]
    GP/S  Log at address 0x21 has    1 sectors [Write stream error log]
    GP/S  Log at address 0x22 has    1 sectors [Read stream error log]
    GP/S  Log at address 0x80 has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x81 has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x82 has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x83 has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x84 has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x85 has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x86 has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x87 has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x88 has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x89 has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x8a has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x8b has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x8c has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x8d has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x8e has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x8f has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x90 has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x91 has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x92 has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x93 has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x94 has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x95 has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x96 has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x97 has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x98 has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x99 has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x9a has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x9b has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x9c has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x9d has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x9e has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0x9f has   16 sectors [Host vendor specific log]
    GP/S  Log at address 0xa1 has   20 sectors [Device vendor specific log]
    GP    Log at address 0xa2 has 2248 sectors [Device vendor specific log]
    GP/S  Log at address 0xa8 has   20 sectors [Device vendor specific log]
    GP/S  Log at address 0xa9 has    1 sectors [Device vendor specific log]
    GP    Log at address 0xb0 has 2819 sectors [Device vendor specific log]
    GP    Log at address 0xbe has 65535 sectors [Device vendor specific log]
    GP    Log at address 0xbf has 65535 sectors [Device vendor specific log]
    GP/S  Log at address 0xe0 has    1 sectors [SCT Command/Status]
    GP/S  Log at address 0xe1 has    1 sectors [SCT Data Transfer]
    
    SMART Extended Comprehensive Error Log Version: 1 (5 sectors)
    No Errors Logged
    
    SMART Extended Self-test Log Version: 1 (1 sectors)
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Extended offline    Completed without error       00%      3868         -
    
    SMART Selective self-test log data structure revision number 1
     SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
        1        0        0  Not_testing
        2        0        0  Not_testing
        3        0        0  Not_testing
        4        0        0  Not_testing
        5        0        0  Not_testing
    Selective self-test flags (0x0):
      After scanning selected spans, do NOT read-scan remainder of disk.
    If Selective self-test is pending on power-up, resume after 0 minute delay.
    
    SCT Status Version:                  3
    SCT Version (vendor specific):       522 (0x020a)
    SCT Support Level:                   1
    Device State:                        Active (0)
    Current Temperature:                    37 Celsius
    Power Cycle Min/Max Temperature:     28/38 Celsius
    Lifetime    Min/Max Temperature:     20/56 Celsius
    Under/Over Temperature Limit Count:   0/21
    SCT Temperature History Version:     2
    Temperature Sampling Period:         1 minute
    Temperature Logging Interval:        1 minute
    Min/Max recommended Temperature:      0/ 0 Celsius
    Min/Max Temperature Limit:            0/ 0 Celsius
    Temperature History Size (Index):    128 (41)
    
    Index    Estimated Time   Temperature Celsius
      42    2010-09-30 16:56    38  *******************
     ...    ..(  9 skipped).    ..  *******************
      52    2010-09-30 17:06    38  *******************
      53    2010-09-30 17:07    37  ******************
      54    2010-09-30 17:08    37  ******************
      55    2010-09-30 17:09    37  ******************
      56    2010-09-30 17:10     ?  -
      57    2010-09-30 17:11    30  ***********
      58    2010-09-30 17:12    30  ***********
      59    2010-09-30 17:13    31  ************
      60    2010-09-30 17:14    32  *************
      61    2010-09-30 17:15    33  **************
      62    2010-09-30 17:16    34  ***************
      63    2010-09-30 17:17    35  ****************
      64    2010-09-30 17:18    35  ****************
      65    2010-09-30 17:19    36  *****************
      66    2010-09-30 17:20    37  ******************
      67    2010-09-30 17:21    37  ******************
      68    2010-09-30 17:22    38  *******************
      69    2010-09-30 17:23    38  *******************
      70    2010-09-30 17:24    39  ********************
      71    2010-09-30 17:25    39  ********************
      72    2010-09-30 17:26    40  *********************
      73    2010-09-30 17:27    40  *********************
      74    2010-09-30 17:28    40  *********************
      75    2010-09-30 17:29    41  **********************
      76    2010-09-30 17:30    41  **********************
      77    2010-09-30 17:31    42  ***********************
      78    2010-09-30 17:32    42  ***********************
      79    2010-09-30 17:33    43  ************************
      80    2010-09-30 17:34    43  ************************
      81    2010-09-30 17:35    44  *************************
     ...    ..(  2 skipped).    ..  *************************
      84    2010-09-30 17:38    44  *************************
      85    2010-09-30 17:39    45  **************************
     ...    ..( 18 skipped).    ..  **************************
     104    2010-09-30 17:58    45  **************************
     105    2010-09-30 17:59    46  ***************************
     ...    ..(  8 skipped).    ..  ***************************
     114    2010-09-30 18:08    46  ***************************
     115    2010-09-30 18:09    47  ****************************
     ...    ..(  4 skipped).    ..  ****************************
     120    2010-09-30 18:14    47  ****************************
     121    2010-09-30 18:15    46  ***************************
     ...    ..(  6 skipped).    ..  ***************************
       0    2010-09-30 18:22    46  ***************************
       1    2010-09-30 18:23    47  ****************************
     ...    ..(  5 skipped).    ..  ****************************
       7    2010-09-30 18:29    47  ****************************
       8    2010-09-30 18:30     ?  -
       9    2010-09-30 18:31    32  *************
      10    2010-09-30 18:32    33  **************
      11    2010-09-30 18:33    34  ***************
      12    2010-09-30 18:34     ?  -
      13    2010-09-30 18:35    37  ******************
      14    2010-09-30 18:36     ?  -
      15    2010-09-30 18:37    38  *******************
      16    2010-09-30 18:38    38  *******************
      17    2010-09-30 18:39    39  ********************
      18    2010-09-30 18:40    39  ********************
      19    2010-09-30 18:41    40  *********************
      20    2010-09-30 18:42     ?  -
      21    2010-09-30 18:43    28  *********
      22    2010-09-30 18:44    29  **********
      23    2010-09-30 18:45    30  ***********
      24    2010-09-30 18:46    31  ************
      25    2010-09-30 18:47    32  *************
      26    2010-09-30 18:48    32  *************
      27    2010-09-30 18:49    32  *************
      28    2010-09-30 18:50    33  **************
      29    2010-09-30 18:51    34  ***************
      30    2010-09-30 18:52    35  ****************
      31    2010-09-30 18:53    36  *****************
      32    2010-09-30 18:54    37  ******************
      33    2010-09-30 18:55    38  *******************
     ...    ..(  6 skipped).    ..  *******************
      40    2010-09-30 19:02    38  *******************
      41    2010-09-30 19:03    37  ******************
    
    SATA Phy Event Counters (GP Log 0x11)
    ID      Size     Value  Description
    0x000a  2            2  Device-to-host register FISes sent due to a COMRESET
    0x0001  2            0  Command failed due to ICRC error
    0x0003  2            0  R_ERR response for device-to-host data FIS
    0x0004  2            0  R_ERR response for host-to-device data FIS
    0x0006  2            0  R_ERR response for device-to-host non-data FIS
    0x0007  2            0  R_ERR response for host-to-device non-data FIS
    

    I also ran Seagate SeaTools for Windows in both S.M.A.R.T. Check and Long Drive Self Test modes, both passing.

    I have only noticed the drive audibly thrashing about three times. Should I hurry out and buy a new system drive ASAP?


  • Related Answers
  • Sarge

    The short answer is YES! Do it NOW!

    Sorry, couldn't help myself. From what I am reading, your drive is starting to fail and there is no going back. Get a backup of all your data and replace the system drive.

  • knitti

    Generally, for COTS hardware, no test (hard disk tests, memtests, etc.) can guarantee 100% that a component is not broken, and that includes the manufacturer's own tests. The only thing you can be reasonably sure about (not counting bugs) is that when a test says it's broken, it probably is broken.

    So if a test returns green and you have eliminated every other reasonable possibility, you should still assume the component is failing and check it against a known (or probably) good one.