Failure of a disk in a RAID1 (and II)

We left off with me going out to buy another 500 GB SATA disk. I bought it this morning and, to give you an idea, we're talking about 59.95€ for a Seagate.

I shut the computer down, took the cover off and removed the failed disk; I'll decide later what to do with it. I'm not too worried about where its contents end up: it's thoroughly encrypted :^)

I put the computer back together and, using gparted, created the partition table and reproduced the partition layout of /dev/sda (the healthy disk). Then I simply did the steps that follow (note: I'm not sure I did exactly the right thing, but I'm writing it down here as a record of the steps taken).
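
The partition layout can also be copied from the command line instead of being recreated by hand in gparted. A minimal sketch, assuming an MBR (msdos) partition table, with /dev/sda as the healthy disk and /dev/sdb as the blank replacement; `clone_ptable` and `DRY_RUN` are names made up for this example, and by default the helper only prints the command it would run:

```shell
#!/bin/sh
# Hypothetical helper: copy the partition table from a healthy disk to a new one.
# Assumes an MBR/msdos table. With DRY_RUN unset or set to 1 it only prints
# the command; set DRY_RUN=0 (as root) to actually write to the new disk.
clone_ptable() {
    src=$1   # healthy disk, e.g. /dev/sda
    dst=$2   # new, empty disk, e.g. /dev/sdb
    if [ "$src" = "$dst" ]; then
        echo "source and destination must differ" >&2
        return 1
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "sfdisk -d $src | sfdisk $dst"
    else
        # sfdisk -d dumps the table in a format that sfdisk reads back on stdin
        sfdisk -d "$src" | sfdisk "$dst"
    fi
}
```

Calling `clone_ptable /dev/sda /dev/sdb` prints the pipeline so it can be double-checked first; getting source and destination the wrong way round would overwrite the good disk's table, which is why the sketch defaults to a dry run.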

I added the partition /dev/sdb1 to the first RAID1 array (/dev/md0):

root@padova:/home/xavi# mdadm --manage /dev/md0 --add /dev/sdb1

and I added the partition /dev/sdb2 to the second RAID1 array (/dev/md1):

root@padova:/home/xavi# mdadm --manage /dev/md1 --add /dev/sdb2

I checked the state of the first RAID array:

root@padova:/home/xavi# mdadm --detail /dev/md0
/dev/md0:
Version : 1.0
Creation Time : Thu Mar 29 18:38:44 2012
Raid Level : raid1
Array Size : 511988 (500.07 MiB 524.28 MB)
Used Dev Size : 511988 (500.07 MiB 524.28 MB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent

Update Time : Wed Nov 28 21:08:49 2012
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

Name : chryse64:0
UUID : 4ca42d11:b43b6253:faf87c59:3facb905
Events : 118

Number   Major   Minor   RaidDevice State
   0       8        1        0      active sync   /dev/sda1
   2       8       17        1      active sync   /dev/sdb1

And likewise for the second RAID1 array:

root@padova:/home/xavi# mdadm --detail /dev/md1
/dev/md1:
Version : 1.2
Creation Time : Sat Jun 23 20:14:00 2012
Raid Level : raid1
Array Size : 487870328 (465.27 GiB 499.58 GB)
Used Dev Size : 487870328 (465.27 GiB 499.58 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent

Update Time : Wed Nov 28 21:15:07 2012
State : clean, degraded, recovering
Active Devices : 1
Working Devices : 2
Failed Devices : 0
Spare Devices : 1

Rebuild Status : 25% complete

Name : padova:1 (local to host padova)
UUID : 3019c9db:4a9e8d15:579ba070:aaa08e78
Events : 15521

Number   Major   Minor   RaidDevice State
   0       8        2        0      active sync   /dev/sda2
   2       8       18        1      spare rebuilding   /dev/sdb2

So we can see that /dev/md0, which is considerably smaller, is already done and in sync, while /dev/md1 (considerably larger) is still synchronizing.
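
To keep an eye on the rebuild without reading the whole report every time, the two relevant lines can be filtered out. A rough sketch: `md_summary` is a hypothetical helper name, and it only assumes that the `State :` and `Rebuild Status :` lines look like the ones in the mdadm reports in this post:

```shell
#!/bin/sh
# Hypothetical helper: reduce `mdadm --detail` output (read from stdin) to a
# one-line summary with the array state and, if present, the rebuild progress.
md_summary() {
    awk -F' : ' '
        $1 ~ /^ *State$/          { state = $2 }
        $1 ~ /^ *Rebuild Status$/ { rebuild = $2 }
        END {
            if (rebuild != "") print state " (" rebuild ")"
            else               print state
        }'
}

# Usage (needs root): mdadm --detail /dev/md1 | md_summary
```

A plain `cat /proc/mdstat` gives a similar at-a-glance view of all arrays, including a progress bar while one is resyncing.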

Let's also check the state of the new disk:

root@padova:/home/xavi# smartctl --all /dev/sdb
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-4-amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model: ST500DM002-1BD142
Serial Number: Z3T72Q65
LU WWN Device Id: 5 000c50 04ec2262b
Firmware Version: KC45
User Capacity: 500,106,780,160 bytes [500 GB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Wed Nov 28 21:17:07 2012 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 600) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 82) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x303f) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   006    Pre-fail  Always       -       44008
  3 Spin_Up_Time            0x0003   100   100   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       6
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   100   253   030    Pre-fail  Always       -       5231
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       0
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       11
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   068   049   045    Old_age   Always       -       32 (Min/Max 21/32)
194 Temperature_Celsius     0x0022   032   051   000    Old_age   Always       -       32 (0 17 0 0)
195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       44008
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       116260469735424
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       201517604
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       20197

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

And even though it's brand new, several attributes show up labelled "Pre-fail" or "Old_age", as if the disk were about to fail or were simply old (I wonder whether it's this particular disk or whether Seagate really is second-rate ??). Meh, in any case, who knows?
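
One way to read that table: the TYPE column (Pre-fail / Old_age) says what kind of attribute each row is, not that the disk is failing or old. What signals trouble is a normalized VALUE that has dropped to THRESH or below, or a WHEN_FAILED column showing something other than `-`. A small sketch under that reading; `smart_failing` is a made-up helper, and it assumes the ten-column attribute layout that smartctl prints here:

```shell
#!/bin/sh
# Hypothetical helper: read `smartctl -A` output on stdin and print only the
# attributes that look bad: WHEN_FAILED is not "-", or the normalized VALUE
# is at or below a non-zero THRESH. Prints nothing if every row looks fine.
smart_failing() {
    awk '
        # attribute rows start with a numeric ID and have a flag like 0x000f
        $1 ~ /^[0-9]+$/ && $3 ~ /^0x/ {
            value = $4 + 0; thresh = $6 + 0
            if ($9 != "-" || (thresh > 0 && value <= thresh))
                print $2, "value=" $4, "thresh=" $6, "when_failed=" $9
        }'
}

# Usage (needs root): smartctl -A /dev/sdb | smart_failing
```

Run against the table in this post it should print nothing: every row has `-` under WHEN_FAILED and a VALUE well above its THRESH.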

Final conclusion:

A failure that for most people means serious grief was, in my case, no more than an anecdote, at an extra cost of roughly 60€: the price of a second disk of the same size to mirror (RAID1) our data.

The exact commands aren't the ones shown in the blogs I consulted, but with a good deal of care you can manage (by the way, I take no responsibility for any mistakes you make following this little script, which is probably full of errors). Before doing any operation on your disks, I very strongly recommend, please: MAKE THE HOLY BACKUP ONCE AND FOR ALL. BACKUP O MORTE. I have spoken.
