[Disk Manager] Help: the file system has kept switching to read-only mode over the past few days, making it unusable
牛牪犇
deepin
2024-11-01 10:03
Author

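Help: over the past few days my file system has frequently switched to read-only mode. After some searching on Baidu, it looks like a disk problem may be the cause. Could anyone help me analyze this? I can provide any logs that are needed.

Details:

1. After I installed a client's VPN, files could no longer be saved that afternoon. On checking, I found that the file system had switched to read-only mode.

2. I ran sudo umount /dev/sda1, sudo fsck -f /dev/sda1, and sudo mount -o remount,rw /dev/sda1 /data in that order. After a reboot everything was back to normal.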
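Before unmounting and running fsck as in step 2, it can help to confirm which mounts have actually gone read-only. A minimal Python sketch of that check follows. It looks for the ro option in text in /proc/mounts format. The sample text is hypothetical (the ext4 type and the second device are assumptions, not taken from this machine), and readonly_mounts is just an illustrative name.

```python
def readonly_mounts(mounts_text):
    """Return (device, mount_point) pairs whose mount options include 'ro'."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, _fstype, options = fields[:4]
        # Compare whole options so that "errors=remount-ro" is not a false hit.
        if "ro" in options.split(","):
            found.append((device, mount_point))
    return found


# Hypothetical sample in /proc/mounts format (assumed, not from the poster's PC).
SAMPLE = """\
/dev/sda1 /data ext4 ro,relatime 0 0
/dev/sdb1 / ext4 rw,relatime,errors=remount-ro 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
"""

print(readonly_mounts(SAMPLE))
```

On a live system the same function could be fed the contents of /proc/mounts instead of the sample string.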

3. But the problem has kept coming back over the past few days, so I checked the disk with sudo smartctl -a /dev/sda1. The output is below:

smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.15.77-amd64-desktop] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model: TOSHIBA MQ04ABF100
Serial Number: 20K8P9UDT
LU WWN Device Id: 5 000039 9d2506724
Firmware Version: JU0A0E
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Form Factor: 2.5 inches
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-3 T13/2161-D revision 5
SATA Version is: SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Fri Nov 1 09:16:43 2024 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x85) Offline data collection activity
was aborted by an interrupting command from host.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 120) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 170) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 050 Pre-fail Always - 0
2 Throughput_Performance 0x0005 100 100 050 Pre-fail Offline - 0
3 Spin_Up_Time 0x0027 100 100 001 Pre-fail Always - 1303
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 565
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 3
7 Seek_Error_Rate 0x000b 100 100 050 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 100 100 050 Pre-fail Offline - 0
9 Power_On_Hours 0x0032 020 020 000 Old_age Always - 32301
10 Spin_Retry_Count 0x0033 111 100 030 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 177
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 514
192 Power-Off_Retract_Count 0x0032 096 096 000 Old_age Always - 2341
193 Load_Cycle_Count 0x0032 079 079 000 Old_age Always - 216065
194 Temperature_Celsius 0x0022 100 100 000 Old_age Always - 46 (Min/Max 8/59)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 3
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
206 Flying_Height 0x0020 100 100 000 Old_age Offline - 16817
220 Disk_Shift 0x0002 100 100 000 Old_age Always - 0
222 Loaded_Hours 0x0032 024 024 000 Old_age Always - 30715
223 Load_Retry_Count 0x0032 100 100 000 Old_age Always - 0
224 Load_Friction 0x0022 100 100 000 Old_age Always - 0
226 Load-in_Time 0x0026 100 100 000 Old_age Always - 260
240 Head_Flying_Hours 0x0001 100 100 001 Pre-fail Offline - 0

SMART Error Log Version: 1
ATA Error Count: 672 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 672 occurred at disk power-on lifetime: 32301 hours (1345 days + 21 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
40 41 88 20 b0 fe 40 Error: UNC at LBA = 0x00feb020 = 16691232

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
60 08 88 20 b0 fe 40 00 33d+00:02:52.620 READ FPDMA QUEUED
ef 10 02 00 00 00 a0 00 33d+00:02:52.620 SET FEATURES [Enable SATA feature]
27 00 00 00 00 00 e0 00 33d+00:02:52.619 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 00 33d+00:02:52.619 IDENTIFY DEVICE
ef 03 45 00 00 00 a0 00 33d+00:02:52.619 SET FEATURES [Set transfer mode]

Error 671 occurred at disk power-on lifetime: 32301 hours (1345 days + 21 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
40 41 b0 20 b0 fe 40 Error: UNC at LBA = 0x00feb020 = 16691232

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
60 08 b0 20 b0 fe 40 00 33d+00:02:52.112 READ FPDMA QUEUED
ef 10 02 00 00 00 a0 00 33d+00:02:51.710 SET FEATURES [Enable SATA feature]
27 00 00 00 00 00 e0 00 33d+00:02:51.709 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 00 33d+00:02:51.709 IDENTIFY DEVICE
ef 03 45 00 00 00 a0 00 33d+00:02:51.709 SET FEATURES [Set transfer mode]

Error 670 occurred at disk power-on lifetime: 32301 hours (1345 days + 21 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
40 41 a8 20 b0 fe 40 Error: UNC at LBA = 0x00feb020 = 16691232

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
60 08 a8 20 b0 fe 40 00 33d+00:02:51.265 READ FPDMA QUEUED
ef 10 02 00 00 00 a0 00 33d+00:02:51.265 SET FEATURES [Enable SATA feature]
27 00 00 00 00 00 e0 00 33d+00:02:51.265 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 00 33d+00:02:51.264 IDENTIFY DEVICE
ef 03 45 00 00 00 a0 00 33d+00:02:51.264 SET FEATURES [Set transfer mode]

Error 669 occurred at disk power-on lifetime: 32301 hours (1345 days + 21 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
40 41 18 20 b0 fe 40 Error: UNC at LBA = 0x00feb020 = 16691232

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
60 08 18 20 b0 fe 40 00 33d+00:02:50.776 READ FPDMA QUEUED
ef 10 02 00 00 00 a0 00 33d+00:02:48.315 SET FEATURES [Enable SATA feature]
27 00 00 00 00 00 e0 00 33d+00:02:48.314 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 00 33d+00:02:48.313 IDENTIFY DEVICE
ef 03 45 00 00 00 a0 00 33d+00:02:48.313 SET FEATURES [Set transfer mode]

Error 668 occurred at disk power-on lifetime: 32301 hours (1345 days + 21 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
40 41 a0 20 b0 fe 40 Error: UNC at LBA = 0x00feb020 = 16691232

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_

4. The journalctl log contains this entry:
10月 31 17:06:35 NiuZW-PC daemon/dock[6520]: dock_manager_xevent.go:185: x.Error: 3 (BadWindow), sequence: 29456, resource id: 149847513, major code: 2 (ChangeWindowAttributes), minor code: 0

5. When mounting, I also saw the following:
NiuZW@NiuZW-PC:~$ dmesg | tail
res 41/40:08:20:b0:fe/00:00:72:00:00/40 Emask 0x409 (media error)
[171769.112952] ata2.00: status: { DRDY ERR }
[171769.112953] ata2.00: error: { UNC }
[171769.115379] ata2.00: configured for UDMA/100
[171769.115398] sd 1:0:0:0: [sda] tag#6 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[171769.115401] sd 1:0:0:0: [sda] tag#6 Sense Key : Medium Error [current]
[171769.115404] sd 1:0:0:0: [sda] tag#6 Add. Sense: Unrecovered read error - auto reallocate failed
[171769.115407] sd 1:0:0:0: [sda] tag#6 CDB: Read(10) 28 00 72 fe b0 20 00 00 08 00
[171769.115408] blk_update_request: I/O error, dev sda, sector 1929293856 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[171769.115426] ata2: EH complete
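To see where on the disk the failed read sits, the sector number in the blk_update_request line can be converted into a byte offset and a physical-sector index. The following is a rough Python sketch, assuming the 512-byte logical and 4096-byte physical sector sizes reported by smartctl above; locate_io_error and the regular expression are illustrative, not part of any tool.

```python
import re

LOGICAL_SECTOR_BYTES = 512    # "512 bytes logical" in the smartctl output
PHYSICAL_SECTOR_BYTES = 4096  # "4096 bytes physical" in the smartctl output

# Illustrative pattern for the kernel's block-layer I/O error message.
IO_ERROR = re.compile(r"I/O error, dev (\w+), sector (\d+)")


def locate_io_error(line):
    """Describe the failed sector in a kernel I/O error line, or return None."""
    match = IO_ERROR.search(line)
    if match is None:
        return None
    device, sector = match.group(1), int(match.group(2))
    per_physical = PHYSICAL_SECTOR_BYTES // LOGICAL_SECTOR_BYTES  # 8
    return {
        "device": device,
        "sector": sector,
        "byte_offset": sector * LOGICAL_SECTOR_BYTES,
        "physical_sector": sector // per_physical,
        "aligned": sector % per_physical == 0,
    }


line = ("[171769.115408] blk_update_request: I/O error, dev sda, "
        "sector 1929293856 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0")
print(locate_io_error(line))
```

The offset is counted from the start of the whole disk (sda), not from the start of the sda1 partition, so mapping it to a file would also need the partition's start sector, which is not shown in this thread.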

All Replies
HualetWang
deepin
2024-11-01 11:26
#1

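Based on the SMART results you provided, we can look at a few key SMART attributes to judge the health of the disk:

  1. Raw_Read_Error_Rate: the rate of errors when reading from the disk. The normalized value is 100 against a threshold of 50 and the raw value is 0, so this attribute itself reports no read-error problem.
  2. Reallocated_Sector_Ct: the number of reallocated sectors, that is, bad sectors that have been marked unusable and replaced with spares. The normalized value is 100, the threshold is 10, and RAW_VALUE is 3, meaning 3 sectors have been reallocated. This can be a sign that the disk is developing bad sectors, but it is still within the threshold.
  3. Current_Pending_Sector: the number of problematic sectors that have not yet been reallocated. The normalized value is 100, the threshold is 0, and RAW_VALUE is 0, meaning no sectors are currently pending.
  4. Offline_Uncorrectable: the number of sectors with uncorrectable errors. The normalized value is 100, the threshold is 0, and RAW_VALUE is 0, meaning this attribute currently records no uncorrectable errors.
  5. SMART overall-health self-assessment test result: PASSED. This means no attribute has fallen to its failure threshold. It is a good sign, but on its own it does not rule out bad sectors.

Taken together: 3 sectors have been reallocated, but the overall health check passed and the attributes show no pending or offline-uncorrectable sectors. Judged from the attributes alone, the disk has some minor problems that have not yet reached the failure thresholds. However, since you say the problem keeps recurring, back up your important data as soon as possible and consider replacing the disk to avoid losing data later. You can also try a disk repair tool to check for and repair errors on the disk. If the problem persists, you may need a professional data recovery service or a new disk.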
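The attribute checks above can be sketched in a few lines of Python. This is only an illustration: the rows are copied from the smartctl output in the first post, while parse_attribute, review, and the WATCHED set are hypothetical names chosen for this example. It assumes the usual smartctl convention that an attribute counts as failed once its normalized VALUE is at or below a non-zero THRESH.

```python
ROWS = """\
1 Raw_Read_Error_Rate 0x000b 100 100 050 Pre-fail Always - 0
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 3
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 3
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 0
"""

# Attributes whose raw counts relate to bad sectors (an illustrative choice).
WATCHED = {
    "Reallocated_Sector_Ct",
    "Reallocated_Event_Count",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
}


def parse_attribute(line):
    """Parse one whitespace-separated row of the smartctl attribute table."""
    fields = line.split()
    return {
        "name": fields[1],
        "value": int(fields[3]),
        "thresh": int(fields[5]),
        "raw": int(fields[9]),  # first token of RAW_VALUE only
    }


def review(rows_text):
    """Return (attributes at or below threshold, watched attributes with raw > 0)."""
    at_threshold, nonzero_raw = [], []
    for line in rows_text.splitlines():
        if not line.strip():
            continue
        attr = parse_attribute(line)
        if attr["thresh"] > 0 and attr["value"] <= attr["thresh"]:
            at_threshold.append(attr["name"])
        if attr["name"] in WATCHED and attr["raw"] > 0:
            nonzero_raw.append((attr["name"], attr["raw"]))
    return at_threshold, nonzero_raw


print(review(ROWS))
```

For these rows nothing is at its threshold, which matches the PASSED result, while the two reallocation counters are the only watched attributes with a non-zero raw value.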

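The log you provided shows a series of error records for a failed disk read. Specifically:

  1. [171769.115398] sd 1:0:0:0: [sda] tag#6 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s: the read command failed, but neither the host nor the driver reported an error of its own, which suggests the error comes from the disk itself or from lower-level hardware.
  2. [171769.115401] sd 1:0:0:0: [sda] tag#6 Sense Key : Medium Error [current]: the operation hit a medium error, meaning there is a problem with the disk medium (the physical part that stores the data).
  3. [171769.115404] sd 1:0:0:0: [sda] tag#6 Add. Sense: Unrecovered read error - auto reallocate failed: the read hit an unrecovered error and automatic sector reallocation failed. This usually means the disk has a bad sector and the attempt to repair it or remap it to a spare sector did not succeed.
  4. [171769.115407] sd 1:0:0:0: [sda] tag#6 CDB: Read(10) 28 00 72 fe b0 20 00 00 08 00: this is the command descriptor block (CDB) of the read attempt. It carries the parameters of the read, such as the address of the sector to read.

Taken together, your disk very likely has one or more bad sectors that are causing reads to fail. You may need a professional data recovery service to try to recover the data, or you may need to replace the disk. In the meantime, back up the important data on the disk as soon as possible to prevent further loss. You can also try a disk repair tool to check for and repair errors, but since automatic reallocation failed, that may not fix the underlying problem.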
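As a worked example of item 4, the Read(10) CDB can be decoded by hand. In the standard READ(10) layout, byte 0 is the opcode 0x28, bytes 2 to 5 are the big-endian logical block address, and bytes 7 to 8 are the transfer length in blocks. The Python sketch below is only illustrative, and decode_read10 is a made-up helper name.

```python
def decode_read10(cdb_hex):
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length
    return lba, blocks


lba, blocks = decode_read10("28 00 72 fe b0 20 00 00 08 00")
print(hex(lba), lba, blocks)

# The low 24 bits, for comparison with the LBA shown in the SMART error log.
print(hex(lba & 0xFFFFFF), lba & 0xFFFFFF)
```

The decoded address is 1929293856, the same sector that the kernel reports in its I/O error line, and the request covers 8 blocks. Its low 24 bits are 0xfeb020 = 16691232, which matches the LBA in the SMART error log, so both logs appear to point at the same spot on the disk (presumably the SMART summary log records only the low-order address bytes).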

It looks like the disk has bad sectors. Consider trying a replacement drive 😂

牛牪犇
deepin
2024-11-01 15:26
#2
HualetWang

Fair enough. The main issue is that this is a company computer, and a repair request takes several weeks 😭

raspbian
deepin
2024-11-01 19:44
#3

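Legend has it that once an SSD reaches the end of its life, it frequently switches to read-only.

But I have a Samsung 750 EVO that I bought for 1200 back when SSDs were a brand-new concept, and it still hasn't worn out.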

TangJacobs
deepin
2024-11-01 20:55
#4

The Windows 10 system installed in my vbox virtual machine has also gone read-only and won't open.

TangJacobs
deepin
2024-11-01 20:56
#5
TangJacobs

It uses the btrfs file system.

小小怪冲啊!
deepin
2024-11-02 01:43
#6

Time to replace it.

132******94
deepin
2024-11-02 19:13
#7

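You mentioned that the problem started after you installed the VPN. Is this VPN a standalone application, or a network configuration built into the system?

If it was installed as software, have you tried uninstalling it?

Your problem reminds me of something I ran into on Windows a long time ago, back on XP. I had mounted a disk partition to a directory instead of a drive letter. Then, after installing PPS or PPTV (I can't remember which), that mount would stop working and the partition would automatically be given a drive letter. The internet wasn't so well developed back then and I couldn't find a fix. Since I still wanted to use that software to watch videos, I just left it alone.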
