AnswerBun.com

RAID1 / CentOS - I can not remove a faulty disk and connect the good disk

This server (has RAID 1 – SoftRAID with CentOS) has been delayed by deleting a cache of 290,000 files last week, and yesterday when performing manual backup of the machine via tar.gz, it failed and when restarting it did not return. Now I want to leave / dev / sdb (where it has no faults) in the air, and turn off / dev / sda, and wait for the provider to change the disk (/ dev / sda).

In recovery mode it shows that / dev / sda has a problem (according to image):

enter image description here

Note: Running smartctl -a on each ssd (https://pastebin.com/a7ssE88T) also verifies that / dev / sda is defective, and / dev / sdb is normal.

In recovery mode, I try to run:

root @ rescue: ~ # cat / proc / mdstat
Personalities: [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
unused devices: <none>

Even running fdisk -l, I have this return:

enter image description here

enter image description here

This is my recovery environment:

root @ rescue: ~ # df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 24G 0 24G 0% / dev
198.27.85.63:/home/pub/rescue.v8 846G 352G 451G 44% / nfs
tmpfs 24G 3.2M 24G 1% / rw
aufs 24G 3.2M 24G 1% /
198.27.85.63:/home/pub/pro-power 846G 352G 451G 44% / power
198.27.85.63:/home/pub/commonnfs 846G 352G 451G 44% / common
tmpfs 24G 0 24G 0% / dev / shm
tmpfs 24G 10M 24G 1% / run
tmpfs 5.0M 0 5.0M 0% / run / lock
tmpfs 24G 0 24G 0% / sys / fs / cgroup
tmpfs 24G 16K 24G 1% / tmp

If I try to mount the 2 disks:

root @ rescue: ~ # mount / dev / sdb / mnt /
NTFS signature is missing.
Failed to mount '/ dev / sdb': Invalid argument
The device '/ dev / sdb' does not seem to have a valid NTFS.
Maybe the wrong device is used? Or the whole disk instead of a
partition (e.g. / dev / sda, not / dev / sda1)? Or the other way around?
root @ rescue: ~ # mount / dev / sda / mnt /
Error reading bootsector: Input / output error
Failed to mount '/ dev / sda': Input / output error
NTFS is either inconsistent, or there is a hardware fault, or it's a
SoftRAID / FakeRAID hardware. In the first case run chkdsk / f on Windows
then reboot into Windows twice. The usage of the / f parameter is very
important If the device is a SoftRAID / FakeRAID then first activate
it and mount a different device under the / dev / mapper / directory, (e.g.
/ dev / mapper / nvidia_eahaabcc1). Please see the 'dmraid' documentation
for more details.

Super User Asked by Thiago Silva on November 26, 2021

1 Answers

One Answer

As no one helped me in this case, here is a summary of the precautions I have passed below:

  • After the incident, I understood that it is necessary to have HardRAID (if one the main disk of the RAID1 raid fails), the other takes over, and you only have to request the replacement to the data center;
  • I no longer use local backup, now only off-site and direct to Google Drive via shell_script + api with google;
  • Projects involving SEO sites are in the Cloud with Ceph, as this one at Hetzner has 3 physical servers (if one burns, the other goes up the Cloud directly from the storage);
  • Do not depend on OVH support for these issues, as it took 3 days to accept that the disc had burned.

Answered by Thiago Silva on November 26, 2021

Add your own answers!

Related Questions

Couldn’t start project on Android: Error running adb:

3  Asked on January 3, 2022 by andy4ward

   

PC shuts off when attempting to wake up

1  Asked on January 3, 2022 by situbole

   

USB 3.1 cable transferring 40MB/s

1  Asked on January 3, 2022 by user1201656

     

How can I export public keys in PEM format with GnuPG?

4  Asked on January 1, 2022 by inaimathi

   

Run Python scripts without explicitly invoking `python`

5  Asked on January 1, 2022 by user2018084

   

Ask a Question

Get help from others!

© 2022 AnswerBun.com. All rights reserved.