I posted a little while ago about my CPUs locking up, and I wasn't sure what was causing it. In that thread we suspected one of my drives was failing, but I replaced the one that had issues and now all my SMART data says everything is okay.
I'm still running into the same issue, where the console prints Kernel: watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kworker:0:1:277397]. I'll also see reports such as:
kernel:[929339.922950] Uhhuh. NMI received for unknown reason 30 on CPU 1.
kernel:[929339.922953] Do you have a strange power saving mode enabled?
kernel:[929339.922954] Dazed and confused, but trying to continue
But after a little while everything comes back.
I am currently running two VMs, one running Windows 10 and the other running Debian. The host is also Debian, and when the VMs freeze the host is still accessible and working.
Any idea where I can start investigating to see why they both keep freezing?
edit:
From further investigation, I see some logs on the host that I believe appear at the same time the VMs go down. I have an LSI 9266 MegaRAID SAS card, and it looks like commands are stalling there as well.
```
[1035702.593076] megaraid_sas 0000:82:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
[1035702.593099] megaraid_sas 0000:82:00.0: [ 0]waiting for 1 commands to complete for scsi0
[1035707.716655] megaraid_sas 0000:82:00.0: [ 5]waiting for 1 commands to complete for scsi0
[1035712.836661] megaraid_sas 0000:82:00.0: [10]waiting for 1 commands to complete for scsi0
[1035717.956660] megaraid_sas 0000:82:00.0: [15]waiting for 1 commands to complete for scsi0
[1035723.076660] megaraid_sas 0000:82:00.0: [20]waiting for 1 commands to complete for scsi0
[1035728.196660] megaraid_sas 0000:82:00.0: [25]waiting for 1 commands to complete for scsi0
[1035733.316634] megaraid_sas 0000:82:00.0: [30]waiting for 1 commands to complete for scsi0
[1035738.436661] megaraid_sas 0000:82:00.0: [35]waiting for 1 commands to complete for scsi0
[1035743.556629] megaraid_sas 0000:82:00.0: [40]waiting for 1 commands to complete for scsi0
```
Does anyone know of a good way to investigate which drive is slowing down the rest of the RAID array?
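As a starting point for lining these stalls up against the VM freezes, here is a rough Python sketch that pulls the "waiting for N commands" lines out of saved kernel log text and groups them into stall windows. The function name `find_stalls`, the `gap_seconds` threshold, and the regular expression are my own assumptions based only on the log format shown above; other megaraid_sas driver versions may word the message differently. It does not identify a drive by itself, it only gives timestamps to compare against other logs.

```python
import re

# Assumed line format, taken from the excerpt in this post:
# [1035702.593099] megaraid_sas 0000:82:00.0: [ 0]waiting for 1 commands to complete for scsi0
WAIT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+megaraid_sas\s+(?P<pci>\S+):\s+"
    r"\[\s*\d+\]waiting for (?P<count>\d+) commands to complete for (?P<host>scsi\d+)"
)


def find_stalls(log_text, gap_seconds=10.0):
    """Group consecutive 'waiting for N commands' lines into stall windows.

    Lines for the same controller and SCSI host that are at most
    gap_seconds apart are treated as one stall (hypothetical threshold).
    """
    stalls = []
    current = None
    for line in log_text.splitlines():
        m = WAIT_RE.search(line)
        if m is None:
            continue
        ts = float(m.group("ts"))
        pending = int(m.group("count"))
        key = (m.group("pci"), m.group("host"))
        if current and current["key"] == key and ts - current["end"] <= gap_seconds:
            # Same stall is still going: extend the window.
            current["end"] = ts
            current["max_pending"] = max(current["max_pending"], pending)
        else:
            # First matching line, or a new stall after a quiet period.
            current = {"key": key, "start": ts, "end": ts, "max_pending": pending}
            stalls.append(current)
    return stalls


# Three lines copied from the log above, used as sample input.
SAMPLE = """\
[1035702.593099] megaraid_sas 0000:82:00.0: [ 0]waiting for 1 commands to complete for scsi0
[1035707.716655] megaraid_sas 0000:82:00.0: [ 5]waiting for 1 commands to complete for scsi0
[1035712.836661] megaraid_sas 0000:82:00.0: [10]waiting for 1 commands to complete for scsi0
"""

for stall in find_stalls(SAMPLE):
    pci, host = stall["key"]
    duration = stall["end"] - stall["start"]
    print(f"{pci} {host}: stalled from {stall['start']:.3f} "
          f"for {duration:.1f}s, up to {stall['max_pending']} pending")
```

Fed the full excerpt above, the matching lines run from 1035702.593099 to 1035743.556629, so they form one window of roughly 41 seconds. Those timestamps could then be compared with the soft lockup messages and with per-drive error counters read through the controller, for example with smartctl's `-d megaraid,N` device type.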