commit 5b4ded3f35d5b2c077e4035b66af2d04668d7ee5 Author: Don Brace Date: Thu Jul 11 14:47:04 2024 -0500 scsi: smartpqi: Update driver version to 2.1.28-025 Update driver version to 2.1.28-025. Reviewed-by: Mike Tran Reviewed-by: Gerry Morong Reviewed-by: Scott Benesh Reviewed-by: Scott Teel Signed-off-by: Don Brace Link: https://lore.kernel.org/r/20240711194704.982400-6-don.brace@microchip.com Signed-off-by: Martin K. Petersen commit 57abab70a5e0179b5fff7c4a6dfdbcbcc97ca910 Author: Kevin Barnett Date: Thu Jul 11 14:47:03 2024 -0500 scsi: smartpqi: Improve handling of multipath failover Improve multipath failovers by mapping firmware errors into I/O errors. In some rare instances, firmware does not return the proper error code for I/O errors caused by a multipath path failure. Map I/O errors returned by firmware into errors that help the multipath layer to detect the failure of a path. Reviewed-by: Gerry Morong Reviewed-by: Scott Benesh Reviewed-by: Scott Teel Reviewed-by: Mike McGowen Signed-off-by: Kevin Barnett Signed-off-by: Don Brace Link: https://lore.kernel.org/r/20240711194704.982400-5-don.brace@microchip.com Signed-off-by: Martin K. Petersen commit f1393d52e6cda9c20f12643cbecf1e1dc357e0e2 Author: Gilbert Wu Date: Thu Jul 11 14:47:02 2024 -0500 scsi: smartpqi: revert propagate-the-multipath-failure-to-SML-quickly Correct a rare multipath failure issue by reverting commit 94a68c814328 ("scsi: smartpqi: Quickly propagate path failures to SCSI midlayer") [1]. Reason for revert: The patch propagated the path failure to SML quickly when one of the path fails during IO and AIO path gets disabled for a multipath device. But it created a new issue: when creating a volume on an encryption-enabled controller, the firmware reports the AIO path is disabled, which cause the driver to report a path failure to SML for a multipath device. There will be a new fix to handle "Illegal request" and "Invalid field in parameter list" on RAID path when the AIO path is disabled on a multipath device. [1] https://lore.kernel.org/all/164375209313.440833.9992416628621839233.stgit@brunhilda.pdev.net/ Fixes: 94a68c814328 ("scsi: smartpqi: Quickly propagate path failures to SCSI midlayer") Reviewed-by: Scott Benesh Reviewed-by: Scott Teel Reviewed-by: Mike McGowen Signed-off-by: Gilbert Wu Signed-off-by: Don Brace Link: https://lore.kernel.org/r/20240711194704.982400-4-don.brace@microchip.com Signed-off-by: Martin K. Petersen commit bb0f5445b27f4a8f8359cc0f36a59a397b4b5e0c Author: Kevin Barnett Date: Thu Jul 11 14:47:01 2024 -0500 scsi: smartpqi: Improve accuracy/performance of raid-bypass-counter The original implementation of this counter used an atomic variable. However, this implementation negatively impacted performance in some configurations. Switch to using per_cpu variables. Reviewed-by: Scott Benesh Reviewed-by: Scott Teel Reviewed-by: Mike McGowen Co-developed-by: Mahesh Rajashekhara Signed-off-by: Mahesh Rajashekhara Signed-off-by: Kevin Barnett Signed-off-by: Don Brace Link: https://lore.kernel.org/r/20240711194704.982400-3-don.brace@microchip.com Signed-off-by: Martin K. Petersen commit 0e21e73384d324f75ea16f3d622cfc433fa6209b Author: David Strahan Date: Thu Jul 11 14:47:00 2024 -0500 scsi: smartpqi: Add new controller PCI IDs All PCI ID entries in hex. Add new inagile PCI IDs: VID / DID / SVID / SDID ---- ---- ---- ---- SMART-HBA 8242-24i 9005 / 028f / 1ff9 / 0045 RAID 8236-16i 9005 / 028f / 1ff9 / 0046 RAID 8240-24i 9005 / 028f / 1ff9 / 0047 SMART-HBA 8238-16i 9005 / 028f / 1ff9 / 0048 PM8222-SHBA 9005 / 028f / 1ff9 / 004a RAID PM8204-2GB 9005 / 028f / 1ff9 / 004b RAID PM8204-4GB 9005 / 028f / 1ff9 / 004c PM8222-HBA 9005 / 028f / 1ff9 / 004f MT0804M6R 9005 / 028f / 1ff9 / 0051 MT0801M6E 9005 / 028f / 1ff9 / 0052 MT0808M6R 9005 / 028f / 1ff9 / 0053 MT0800M6H 9005 / 028f / 1ff9 / 0054 RS0800M5H24i 9005 / 028f / 1ff9 / 006b RS0800M5E8i 9005 / 028f / 1ff9 / 006c RS0800M5H8i 9005 / 028f / 1ff9 / 006d RS0804M5R16i 9005 / 028f / 1ff9 / 006f RS0800M5E24i 9005 / 028f / 1ff9 / 0070 RS0800M5H16i 9005 / 028f / 1ff9 / 0071 RS0800M5E16i 9005 / 028f / 1ff9 / 0072 RT0800M7E 9005 / 028f / 1ff9 / 0086 RT0800M7H 9005 / 028f / 1ff9 / 0087 RT0804M7R 9005 / 028f / 1ff9 / 0088 RT0808M7R 9005 / 028f / 1ff9 / 0089 RT1608M6R16i 9005 / 028f / 1ff9 / 00a1 Add new h3c pci_id: VID / DID / SVID / SDID ---- ---- ---- ---- UN RAID P4408-Mr-2 9005 / 028f / 193d / 1110 Add new powerleader pci ids: VID / DID / SVID / SDID ---- ---- ---- ---- PL SmartROC PM8204 9005 / 028f / 1f3a / 0104 Reviewed-by: Scott Benesh Reviewed-by: Scott Teel Reviewed-by: Mike McGowen Signed-off-by: David Strahan Signed-off-by: Don Brace Link: https://lore.kernel.org/r/20240711194704.982400-2-don.brace@microchip.com Signed-off-by: Martin K. Petersen commit f874d7210d882cb1c58a8e3da66f61cdc63cd4b4 Author: Li Feng Date: Thu Jul 18 16:07:22 2024 +0800 scsi: sd: Keep the discard mode stable There is a scenario where a large number of discard commands are issued when the iscsi initiator connects to the target and then performs a session rescan operation. There is a time window, most of the commands are in UNMAP mode, and some discard commands become WRITE SAME with UNMAP. The discard mode has been negotiated during the SCSI probe. If the mode is temporarily changed from UNMAP to WRITE SAME with UNMAP, an I/O ERROR may occur because the target may not implement WRITE SAME with UNMAP. Keep the discard mode stable to fix this issue. Signed-off-by: Li Feng Link: https://lore.kernel.org/r/20240718080751.313102-2-fengli@smartx.com Reviewed-by: Christoph Hellwig Reviewed-by: Martin K. Petersen Signed-off-by: Martin K. Petersen commit 5b247f03779d3601dcb080446bd7d680149a2aee Author: Justin Tee Date: Fri Jul 26 16:15:12 2024 -0700 scsi: lpfc: Copyright updates for 14.4.0.4 patches Update copyrights to 2024 for files modified in the 14.4.0.4 patch set. Signed-off-by: Justin Tee Link: https://lore.kernel.org/r/20240726231512.92867-9-justintee8345@gmail.com Signed-off-by: Martin K. Petersen commit 62b52495e6a12b37ba3f790f815e717737643bf7 Author: Justin Tee Date: Fri Jul 26 16:15:11 2024 -0700 scsi: lpfc: Update lpfc version to 14.4.0.4 Update lpfc version to 14.4.0.4 Signed-off-by: Justin Tee Link: https://lore.kernel.org/r/20240726231512.92867-8-justintee8345@gmail.com Signed-off-by: Martin K. Petersen commit 1f0f7679ad8942f810b0f19ee9cf098c3502d66a Author: Justin Tee Date: Fri Jul 26 16:15:10 2024 -0700 scsi: lpfc: Update PRLO handling in direct attached topology A kref imbalance occurs when handling an unsolicited PRLO in direct attached topology. Rework PRLO rcv handling when in MAPPED state. Save the state that we were handling a PRLO by setting nlp_last_elscmd to ELS_CMD_PRLO. Then in the lpfc_cmpl_els_logo_acc() completion routine, manually restart discovery. By issuing the PLOGI, which nlp_gets, before nlp_put at the end of the lpfc_cmpl_els_logo_acc() routine, we are saving us from a final nlp_put. And, we are still allowing the unreg_rpi to happen. Signed-off-by: Justin Tee Link: https://lore.kernel.org/r/20240726231512.92867-7-justintee8345@gmail.com Signed-off-by: Martin K. Petersen commit b5c18c9dd138733c16893613345af44deadcf05e Author: Justin Tee Date: Fri Jul 26 16:15:09 2024 -0700 scsi: lpfc: Fix unsolicited FLOGI kref imbalance when in direct attached topology In direct attached topology, certain target vendors that are quick to issue FLOGI followed by a cable pull for more than dev_loss_tmo may result in a kref imbalance for the remote port ndlp object. Add an nlp_get when the defer_flogi_acc flag is set. This is expected to balance the nlp_put in the defer_flogi_acc clause in the lpfc_issue_els_flogi() routine. Because we need to retain the ndlp ptr, reorganize all of the defer_flogi_acc information into one lpfc_defer_flogi_acc struct. Signed-off-by: Justin Tee Link: https://lore.kernel.org/r/20240726231512.92867-6-justintee8345@gmail.com Signed-off-by: Martin K. Petersen commit 3976beb1b410441bab9c3726e2ba76cc7a4c0b2d Author: Justin Tee Date: Fri Jul 26 16:15:08 2024 -0700 scsi: lpfc: Fix unintentional double clearing of vmid_flag The vport->vmid_flag is unintentionally cleared twice after an issue_lip via the lpfc_reinit_vmid routine(). The first call to lpfc_reinit_vmid() is in lpfc_cmpl_els_flogi(). Then lpfc_cmpl_els_flogi_fabric() calls lpfc_register_new_vport(), which calls lpfc_cmpl_reg_new_vport() when the mbox command completes and calls lpfc_reinit_vmid() a second time. Fix by moving the vmid_flag clear outside of the lpfc_reinit_vmid() routine so that vmid_flag is only cleared once upon FLOGI completion. Signed-off-by: Justin Tee Link: https://lore.kernel.org/r/20240726231512.92867-5-justintee8345@gmail.com Signed-off-by: Martin K. Petersen commit 2be1d4f11944cd6283cb97268b3e17c4424945ca Author: Justin Tee Date: Fri Jul 26 16:15:07 2024 -0700 scsi: lpfc: Validate hdwq pointers before dereferencing in reset/errata paths When the HBA is undergoing a reset or is handling an errata event, NULL ptr dereference crashes may occur in routines such as lpfc_sli_flush_io_rings(), lpfc_dev_loss_tmo_callbk(), or lpfc_abort_handler(). Add NULL ptr checks before dereferencing hdwq pointers that may have been freed due to operations colliding with a reset or errata event handler. Signed-off-by: Justin Tee Link: https://lore.kernel.org/r/20240726231512.92867-4-justintee8345@gmail.com Signed-off-by: Martin K. Petersen commit f1bfe32073964372c1dfa572480eb43d44e34909 Author: Justin Tee Date: Fri Jul 26 16:15:06 2024 -0700 scsi: lpfc: Remove redundant vport assignment when building an abort request The lpfc_sli_issue_abort_iotag() routine has a redundant assignment of abtsiocbp->vport = vport; The duplicate lines are from a previous refactoring, and this patch removes the accidental redundancy. Signed-off-by: Justin Tee Link: https://lore.kernel.org/r/20240726231512.92867-3-justintee8345@gmail.com Signed-off-by: Martin K. Petersen commit 5b8963c53de11dcef2b6241452988427db5773ec Author: Justin Tee Date: Fri Jul 26 16:15:05 2024 -0700 scsi: lpfc: Change diagnostic log flag during receipt of unknown ELS cmds During diagnostics, it has been determined that the 0115 log message for receipt of unknown ELS cmds does not benefit from trace buffer dumps. The trace buffer dump floods the console with unnecessary information, and the singular LOG_ELS flag has proven more beneficial in debugging efforts when dealing with unknown ELS cmds. Signed-off-by: Justin Tee Link: https://lore.kernel.org/r/20240726231512.92867-2-justintee8345@gmail.com Signed-off-by: Martin K. Petersen commit 88e6804fb323d1dbe477f002a89755f8b034e890 Author: Bao D. Nguyen Date: Tue Jul 23 20:49:32 2024 -0700 scsi: ufs: core: Support Updating UIC Command Timeout The default UIC command timeout still remains 500ms. Allow platform drivers to override the UIC command timeout if desired. In a real product, the 500ms timeout value is probably good enough. However, during the product development where there are a lot of logging and debug messages being printed to the UART console, interrupt starvations happen occasionally because the UART may print long debug messages from different modules in the system. While printing, the UART may have interrupts disabled for more than 500ms, causing UIC command timeout. The UIC command timeout would trigger more printing from the UFS driver, and eventually a watchdog timeout may occur unnecessarily. Add support for overriding the UIC command timeout value with the newly created uic_cmd_timeout kernel module parameter. Default value is 500ms. Supported values range from 500ms to 2 seconds. Signed-off-by: Bao D. Nguyen Link: https://lore.kernel.org/r/e4e1c87f3f867f270a3d4b5d57a00139ff0e9741.1721792309.git.quic_nguyenb@quicinc.com Suggested-by: Bart Van Assche Reviewed-by: Bart Van Assche Reviewed-by: Manivannan Sadhasivam Reviewed-by: Peter Wang Signed-off-by: Martin K. Petersen commit fdb1db6ea7f66cad970b19b5cd341b8386350bca Author: Kees Cook Date: Thu Jul 11 14:57:38 2024 -0700 scsi: aacraid: struct {user,}sgmap{,64,raw}: Replace 1-element arrays with flexible arrays Replace the deprecated[1] use of 1-element arrays in struct sgmap, struct sgmap64, struct sgmapraw, struct user_sgmap, and struct user_sgmap64 with modern flexible arrays. Additionally remove struct user_sgmapraw as it is unused. The resulting binary output differences from this change are limited only to stack space consumption of the smaller "srbu" variable in aac_issue_safw_bmic_identify() and aac_get_safw_ciss_luns(), as well as the smaller associated pair of memcpy()s in aac_send_safw_bmic_cmd(). Artificially growing the size of srbu back to its prior size removes all binary differences[2]. As an aside, after studying the aacraid driver code I wonder how aac_send_wellness_command() ever works. It is reporting a size 4 bytes too small for what it has constructed in memory in the DMA region: sgentry64 is size 12, whereas sgentry is size 8. Perhaps the hardware doesn't care. (Regardless, it is unchanged by this patch.) Link: https://github.com/KSPP/linux/issues/79 [1] Link: https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=dev/v6.10-rc2/1-element&id=45e6226bcbc5e982541754eca7ac29f403e82f5e [2] Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/20240711215739.208776-2-kees@kernel.org Signed-off-by: Martin K. Petersen commit 6e5860b0ad4934baee8c7a202c02033b2631bb44 Author: Kees Cook Date: Thu Jul 11 14:57:37 2024 -0700 scsi: aacraid: Rearrange order of struct aac_srb_unit struct aac_srb_unit contains struct aac_srb, which contains struct sgmap, which ends in a (currently) "fake" (1-element) flexible array. Converting this to a flexible array is needed so that runtime bounds checking won't think the array is fixed size (i.e. under CONFIG_FORTIFY_SOURCE=y and/or CONFIG_UBSAN_BOUNDS=y), as other parts of aacraid use struct sgmap as a flexible array. It is not legal to have a flexible array in the middle of a structure, so it either needs to be split up or rearranged so that it is at the end of the structure. Luckily, struct aac_srb_unit, which is exclusively consumed/updated by aac_send_safw_bmic_cmd(), does not depend on member ordering. The values set in the on-stack struct aac_srb_unit instance "srbu" by the only two callers, aac_issue_safw_bmic_identify() and aac_get_safw_ciss_luns(), do not contain anything in srbu.srb.sgmap.sg, and they both implicitly initialize srbu.srb.sgmap.count to 0 during memset(). For example: memset(&srbu, 0, sizeof(struct aac_srb_unit)); srbcmd = &srbu.srb; srbcmd->flags = cpu_to_le32(SRB_DataIn); srbcmd->cdb[0] = CISS_REPORT_PHYSICAL_LUNS; srbcmd->cdb[1] = 2; /* extended reporting */ srbcmd->cdb[8] = (u8)(datasize >> 8); srbcmd->cdb[9] = (u8)(datasize); rcode = aac_send_safw_bmic_cmd(dev, &srbu, phys_luns, datasize); During aac_send_safw_bmic_cmd(), a separate srb is mapped into DMA, and has srbu.srb copied into it: srb = fib_data(fibptr); memcpy(srb, &srbu->srb, sizeof(struct aac_srb)); Only then is srb.sgmap.count written and srb->sg populated: srb->count = cpu_to_le32(xfer_len); sg64 = (struct sgmap64 *)&srb->sg; sg64->count = cpu_to_le32(1); sg64->sg[0].addr[1] = cpu_to_le32(upper_32_bits(addr)); sg64->sg[0].addr[0] = cpu_to_le32(lower_32_bits(addr)); sg64->sg[0].count = cpu_to_le32(xfer_len); But this is happening in the DMA memory, not in srbu.srb. An attempt to copy the changes back to srbu does happen: /* * Copy the updated data for other dumping or other usage if * needed */ memcpy(&srbu->srb, srb, sizeof(struct aac_srb)); But this was never correct: the sg64 (3 u32s) overlap of srb.sg (2 u32s) always meant that srbu.srb would have held truncated information and any attempt to walk srbu.srb.sg.sg based on the value of srbu.srb.sg.count would result in attempting to parse past the end of srbu.srb.sg.sg[0] into srbu.srb_reply. After getting a reply from hardware, the reply is copied into srbu.srb_reply: srb_reply = (struct aac_srb_reply *)fib_data(fibptr); memcpy(&srbu->srb_reply, srb_reply, sizeof(struct aac_srb_reply)); This has always been fixed-size, so there's no issue here. It is worth noting that the two callers _never check_ srbu contents -- neither srbu.srb nor srbu.srb_reply is examined. (They depend on the mapped xfer_buf instead.) Therefore, the ordering of members in struct aac_srb_unit does not matter, and the flexible array member can moved to the end. (Additionally, the two memcpy()s that update srbu could be entirely removed as they are never consumed, but I left that as-is.) Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/20240711215739.208776-1-kees@kernel.org Signed-off-by: Martin K. Petersen commit f296cc1d7f5aeccd0d87ec167e0aea05acb8a022 Author: Kees Cook Date: Thu Jul 11 10:28:20 2024 -0700 scsi: message: fusion: struct _CONFIG_PAGE_IOC_4: Replace 1-element array with flexible array Replace the deprecated[1] use of a 1-element array in struct _CONFIG_PAGE_IOC_4 with a modern flexible array. Additionally add __counted_by annotation since SEP is only ever accessed after updating ACtiveSEP: lsi/mpi_cnfg.h: IOC_4_SEP SEP[] __counted_by(ActiveSEP); /* 08h */ mptsas.c: ii = IOCPage4Ptr->ActiveSEP++; mptsas.c: IOCPage4Ptr->SEP[ii].SEPTargetID = id; mptsas.c: IOCPage4Ptr->SEP[ii].SEPBus = channel; No binary differences are present after this conversion. Link: https://github.com/KSPP/linux/issues/79 [1] Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/20240711172821.123936-6-kees@kernel.org Reviewed-by: Gustavo A. R. Silva Signed-off-by: Martin K. Petersen commit 70631322dbab5b48f49f142dd9aef768c0fb077c Author: Kees Cook Date: Thu Jul 11 10:28:19 2024 -0700 scsi: message: fusion: struct _CONFIG_PAGE_IOC_3: Replace 1-element array with flexible array Replace the deprecated[1] use of a 1-element array in struct _CONFIG_PAGE_IOC_3 with a modern flexible array. Additionally add __counted_by annotation since PhysDisk is only ever accessed via a loops bounded by NumPhysDisks: lsi/mpi_cnfg.h: IOC_3_PHYS_DISK PhysDisk[] __counted_by(NumPhysDisks); /* 08h */ mptscsih.c: for (i = 0; i < ioc->raid_data.pIocPg3->NumPhysDisks; i++) { mptscsih.c: if ((id == ioc->raid_data.pIocPg3->PhysDisk[i].PhysDiskID) && mptscsih.c: (channel == ioc->raid_data.pIocPg3->PhysDisk[i].PhysDiskBus)) { mptscsih.c: for (i = 0; i < ioc->raid_data.pIocPg3->NumPhysDisks; i++) { mptscsih.c: ioc->raid_data.pIocPg3->PhysDisk[i].PhysDiskNum); mptscsih.c: ioc->raid_data.pIocPg3->PhysDisk[i].PhysDiskNum, mptscsih.c: for (i = 0; i < ioc->raid_data.pIocPg3->NumPhysDisks; i++) { mptscsih.c: if ((id == ioc->raid_data.pIocPg3->PhysDisk[i].PhysDiskID) && mptscsih.c: (channel == ioc->raid_data.pIocPg3->PhysDisk[i].PhysDiskBus)) { mptscsih.c: rc = ioc->raid_data.pIocPg3->PhysDisk[i].PhysDiskNum; mptscsih.c: for (i = 0; i < ioc->raid_data.pIocPg3->NumPhysDisks; i++) { mptscsih.c: ioc->raid_data.pIocPg3->PhysDisk[i].PhysDiskNum); mptscsih.c: ioc->raid_data.pIocPg3->PhysDisk[i].PhysDiskNum, No binary differences are present after this conversion. Link: https://github.com/KSPP/linux/issues/79 [1] Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/20240711172821.123936-5-kees@kernel.org Reviewed-by: Gustavo A. R. Silva Signed-off-by: Martin K. Petersen commit de80fe29ab53dc9efde5c773fadebd3e547ad785 Author: Kees Cook Date: Thu Jul 11 10:28:18 2024 -0700 scsi: message: fusion: struct _CONFIG_PAGE_IOC_2: Replace 1-element array with flexible array Replace the deprecated[1] use of a 1-element array in struct _CONFIG_PAGE_IOC_2 with a modern flexible array. Additionally add __counted_by annotation since RaidVolume is only ever accessed from loops controlled by NumActiveVolumes: lsi/mpi_cnfg.h: CONFIG_PAGE_IOC_2_RAID_VOL RaidVolume[] __counted_by(NumActiveVolumes); /* 0Ch */ mptbase.c: for (i = 0; i < pIoc2->NumActiveVolumes ; i++) mptbase.c: pIoc2->RaidVolume[i].VolumeBus, mptbase.c: pIoc2->RaidVolume[i].VolumeID); mptsas.c: for (i = 0; i < ioc->raid_data.pIocPg2->NumActiveVolumes; i++) { mptsas.c: RaidVolume[i].VolumeID) { mptsas.c: RaidVolume[i].VolumeBus; mptsas.c: for (i = 0; i < ioc->raid_data.pIocPg2->NumActiveVolumes; i++) { mptsas.c: ioc->raid_data.pIocPg2->RaidVolume[i].VolumeID, 0); mptsas.c: ioc->raid_data.pIocPg2->RaidVolume[i].VolumeID); mptsas.c: ioc->raid_data.pIocPg2->RaidVolume[i].VolumeID, 0); mptsas.c: for (i = 0; i < ioc->raid_data.pIocPg2->NumActiveVolumes; i++) { mptsas.c: if (ioc->raid_data.pIocPg2->RaidVolume[i].VolumeID == mptsas.c: for (i = 0; i < ioc->raid_data.pIocPg2->NumActiveVolumes; i++) mptsas.c: if (ioc->raid_data.pIocPg2->RaidVolume[i].VolumeID == id) mptspi.c: for (i=0; i < ioc->raid_data.pIocPg2->NumActiveVolumes; i++) { mptspi.c: if (ioc->raid_data.pIocPg2->RaidVolume[i].VolumeID == id) { No binary differences are present after this conversion. Link: https://github.com/KSPP/linux/issues/79 [1] Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/20240711172821.123936-4-kees@kernel.org Reviewed-by: Gustavo A. R. Silva Signed-off-by: Martin K. Petersen commit dc8932fbf6a9cad4cf6dd312d115a26d90facb7e Author: Kees Cook Date: Thu Jul 11 10:28:17 2024 -0700 scsi: message: fusion: struct _CONFIG_PAGE_RAID_PHYS_DISK_1: Replace 1-element array with flexible array Replace the deprecated[1] use of a 1-element array in struct _CONFIG_PAGE_RAID_PHYS_DISK_1 with a modern flexible array. Additionally add __counted_by annotation since Path is only ever accessed via a loops bounded by NumPhysDiskPaths: lsi/mpi_cnfg.h: RAID_PHYS_DISK1_PATH Path[] __counted_by(NumPhysDiskPaths);/* 0Ch */ mptbase.c: phys_disk->NumPhysDiskPaths = buffer->NumPhysDiskPaths; mptbase.c: for (i = 0; i < phys_disk->NumPhysDiskPaths; i++) { mptbase.c: phys_disk->Path[i].PhysDiskID = buffer->Path[i].PhysDiskID; mptbase.c: phys_disk->Path[i].PhysDiskBus = buffer->Path[i].PhysDiskBus; No binary differences are present after this conversion. Link: https://github.com/KSPP/linux/issues/79 [1] Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/20240711172821.123936-3-kees@kernel.org Reviewed-by: Gustavo A. R. Silva Signed-off-by: Martin K. Petersen commit 14c1f88c7f625dc12fb4a6f18154bcd6f7cd021f Author: Kees Cook Date: Thu Jul 11 10:28:16 2024 -0700 scsi: message: fusion: struct _CONFIG_PAGE_SAS_IO_UNIT_0: Replace 1-element array with flexible array Replace the deprecated[1] use of a 1-element array in struct _CONFIG_PAGE_SAS_IO_UNIT_0 with a modern flexible array. Additionally add __counted_by annotation since PhyData is only ever accessed via a loops bounded by NumPhys: lsi/mpi_cnfg.h: MPI_SAS_IO_UNIT0_PHY_DATA PhyData[] __counted_by(NumPhys); /* 10h */ mptsas.c: port_info->num_phys = buffer->NumPhys; mptsas.c: for (i = 0; i < port_info->num_phys; i++) { mptsas.c: mptsas_print_phy_data(ioc, &buffer->PhyData[i]); mptsas.c: port_info->phy_info[i].phy_id = i; mptsas.c: port_info->phy_info[i].port_id = mptsas.c: buffer->PhyData[i].Port; mptsas.c: port_info->phy_info[i].negotiated_link_rate = mptsas.c: buffer->PhyData[i].NegotiatedLinkRate; No binary differences are present after this conversion. Link: https://github.com/KSPP/linux/issues/79 [1] Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/20240711172821.123936-2-kees@kernel.org Reviewed-by: Gustavo A. R. Silva Signed-off-by: Martin K. Petersen commit 8e76c9c9dd117c95c0ae248a217197228000912c Author: Kees Cook Date: Thu Jul 11 10:28:15 2024 -0700 scsi: message: fusion: struct _RAID_VOL0_SETTINGS: Replace 1-element array with flexible array Replace the deprecated[1] use of a 1-element array in struct _RAID_VOL0_SETTINGS with a modern flexible array. Additionally add __counted_by annotation since PhysDisk is only ever accessed via a loops bounded by NumPhysDisks: lsi/mpi_cnfg.h: RAID_VOL0_PHYS_DISK PhysDisk[] __counted_by(NumPhysDisks); /* 28h */ mptbase.c: for (i = 0; i < buffer->NumPhysDisks; i++) { mptbase.c: buffer->PhysDisk[i].PhysDiskNum, &phys_disk) != 0) mptsas.c: for (i = 0; i < buffer->NumPhysDisks; i++) { mptsas.c: buffer->PhysDisk[i].PhysDiskNum, &phys_disk) != 0) mptsas.c: for (i = 0; i < buffer->NumPhysDisks; i++) { mptsas.c: buffer->PhysDisk[i].PhysDiskNum, &phys_disk) != 0) No binary differences are present after this conversion. Link: https://github.com/KSPP/linux/issues/79 [1] Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/20240711172821.123936-1-kees@kernel.org Reviewed-by: Gustavo A. R. Silva Signed-off-by: Martin K. Petersen commit c72e13cf820bc79fffd5ffaaa0aaab11269027d1 Author: Kees Cook Date: Thu Jul 11 11:07:06 2024 -0700 scsi: ipr: Replace 1-element arrays with flexible arrays Replace the deprecated[1] use of a 1-element arrays in struct ipr_hostrcb_fabric_desc and struct ipr_hostrcb64_fabric_desc with modern flexible arrays. No binary differences are present after this conversion. Link: https://github.com/KSPP/linux/issues/79 [1] Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/20240711180702.work.536-kees@kernel.org Reviewed-by: Gustavo A. R. Silva Signed-off-by: Martin K. Petersen commit 2e35b43bc9a82fde4e7aebe5d8331e1158374d5c Author: Kees Cook Date: Thu Jul 11 10:50:55 2024 -0700 scsi: aacraid: struct aac_ciss_phys_luns_resp: Replace 1-element array with flexible array Replace the deprecated[1] use of a 1-element array in struct aac_ciss_phys_luns_resp with a modern flexible array. No binary differences are present after this conversion. Link: https://github.com/KSPP/linux/issues/79 [1] Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/20240711175055.work.928-kees@kernel.org Reviewed-by: Gustavo A. R. Silva Reviewed-by: Kees Cook Signed-off-by: Martin K. Petersen commit 575b9be63684600e7e02517f0b647e5cb759120c Author: Kees Cook Date: Thu Jul 11 10:48:23 2024 -0700 scsi: aacraid: union aac_init: Replace 1-element array with flexible array Replace the deprecated[1] use of a 1-element array in union aac_init with a modern flexible array. Additionally add __counted_by annotation since rrq is only ever accessed after rr_queue_count has been set (with the same value used to control the loop): init->r8.rr_queue_count = cpu_to_le32(dev->max_msix); ... for (i = 0; i < dev->max_msix; i++) { addr = (u64)dev->host_rrq_pa + dev->vector_cap * i * sizeof(u32); init->r8.rrq[i].host_addr_high = cpu_to_le32( upper_32_bits(addr)); No binary differences are present after this conversion. Link: https://github.com/KSPP/linux/issues/79 [1] Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/20240711174815.work.689-kees@kernel.org Reviewed-by: Gustavo A. R. Silva Signed-off-by: Martin K. Petersen commit 29b4a49750776925ea48ef440da6aa75828913db Author: Kees Cook Date: Thu Jul 11 08:58:42 2024 -0700 scsi: megaraid_sas: struct MR_HOST_DEVICE_LIST: Replace 1-element array with flexible array Replace the deprecated[1] use of a 1-element array in struct MR_HOST_DEVICE_LIST with a modern flexible array. One binary difference appears in megasas_host_device_list_query(): struct MR_HOST_DEVICE_LIST *ci; ... ci = instance->host_device_list_buf; ... memset(ci, 0, sizeof(*ci)); The memset() clears only the non-flexible array fields. Looking at the rest of the function, this appears to be fine: firmware is using this region to communicate with the kernel, so it likely never made sense to clear the first MR_HOST_DEVICE_LIST_ENTRY. Link: https://github.com/KSPP/linux/issues/79 [1] Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/20240711155841.work.839-kees@kernel.org Reviewed-by: Gustavo A. R. Silva Signed-off-by: Martin K. Petersen commit ed8ab02c85b3494ec81f35f12e01f6678af09f6a Author: Kees Cook Date: Thu Jul 11 08:58:23 2024 -0700 scsi: megaraid_sas: struct MR_LD_VF_MAP: Replace 1-element arrays with flexible arrays Replace the deprecated[1] use of a 1-element array in struct MR_LD_VF_MAP with a modern flexible array. No binary differences are present after this conversion. Link: https://github.com/KSPP/linux/issues/79 [1] Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/20240711155823.work.778-kees@kernel.org Reviewed-by: Gustavo A. R. Silva Signed-off-by: Martin K. Petersen commit a62193abae75f3c3f68bbc7d3343751644140556 Author: Kees Cook Date: Thu Jul 11 08:56:36 2024 -0700 scsi: mpi3mr: struct mpi3_sas_io_unit_page1: Replace 1-element array with flexible array Replace the deprecated[1] use of a 1-element array in struct mpi3_sas_io_unit_page1 with a modern flexible array. No binary differences are present after this conversion. Link: https://github.com/KSPP/linux/issues/79 [1] Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/20240711155637.3757036-4-kees@kernel.org Reviewed-by: Gustavo A. R. Silva Signed-off-by: Martin K. Petersen commit 41bb96296f9d3a92f3b52bcf502d6c0e86fa84a0 Author: Kees Cook Date: Thu Jul 11 08:56:35 2024 -0700 scsi: mpi3mr: struct mpi3_sas_io_unit_page0: Replace 1-element array with flexible array Replace the deprecated[1] use of a 1-element array in struct mpi3_sas_io_unit_page0 with a modern flexible array. No binary differences are present after this conversion. Link: https://github.com/KSPP/linux/issues/79 [1] Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/20240711155637.3757036-3-kees@kernel.org Reviewed-by: Gustavo A. R. Silva Signed-off-by: Martin K. Petersen commit 0e11f97bfddc608fd31d4fb7b8feaf74755d6372 Author: Kees Cook Date: Thu Jul 11 08:56:34 2024 -0700 scsi: mpi3mr: struct mpi3_event_data_pcie_topology_change_list: Replace 1-element array with flexible array Replace the deprecated[1] use of a 1-element array in struct mpi3_event_data_pcie_topology_change_list with a modern flexible array. Additionally add __counted_by annotation since port_entry is only ever accessed in loops controlled by num_entries. For example: for (i = 0; i < event_data->num_entries; i++) { handle = le16_to_cpu(event_data->port_entry[i].attached_dev_handle); No binary differences are present after this conversion. Link: https://github.com/KSPP/linux/issues/79 [1] Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/20240711155637.3757036-2-kees@kernel.org Reviewed-by: Gustavo A. R. Silva Signed-off-by: Martin K. Petersen commit ac5b7505de7013a4c572aa80d5b43db287fa0161 Author: Kees Cook Date: Thu Jul 11 08:56:33 2024 -0700 scsi: mpi3mr: struct mpi3_event_data_sas_topology_change_list: Replace 1-element array with flexible array Replace the deprecated[1] use of a 1-element array in struct mpi3_event_data_sas_topology_change_list with a modern flexible array. Additionally add __counted_by annotation since phy_entry is only ever accessed in loops controlled by num_entries. For example: for (i = 0; i < event_data->num_entries; i++) { ... handle = le16_to_cpu(event_data->phy_entry[i].attached_dev_handle); No binary differences are present after this conversion. Link: https://github.com/KSPP/linux/issues/79 [1] Signed-off-by: Kees Cook Link: https://lore.kernel.org/r/20240711155637.3757036-1-kees@kernel.org Reviewed-by: Gustavo A. R. Silva Signed-off-by: Martin K. Petersen commit ffed586b8c4f1fdb772ee350e229863f145defb5 Author: Shin'ichiro Kawasaki Date: Thu Aug 1 14:42:34 2024 +0900 scsi: sd: Move sd_read_cpr() out of the q->limits_lock region Commit 804e498e0496 ("sd: convert to the atomic queue limits API") introduced pairs of function calls to queue_limits_start_update() and queue_limits_commit_update(). These two functions lock and unlock q->limits_lock. In sd_revalidate_disk(), sd_read_cpr() is called after queue_limits_start_update() call and before queue_limits_commit_update() call. sd_read_cpr() locks q->sysfs_dir_lock and &q->sysfs_lock. Then new lock dependencies were created between q->limits_lock, q->sysfs_dir_lock and q->sysfs_lock, as follows: sd_revalidate_disk queue_limits_start_update mutex_lock(&q->limits_lock) sd_read_cpr disk_set_independent_access_ranges mutex_lock(&q->sysfs_dir_lock) mutex_lock(&q->sysfs_lock) mutex_unlock(&q->sysfs_lock) mutex_unlock(&q->sysfs_dir_lock) queue_limits_commit_update mutex_unlock(&q->limits_lock) However, the three locks already had reversed dependencies in other places. Then the new dependencies triggered the lockdep WARN "possible circular locking dependency detected" [1]. This WARN was observed by running the blktests test case srp/002. To avoid the WARN, move the sd_read_cpr() call in sd_revalidate_disk() after the queue_limits_commit_update() call. In other words, move the sd_read_cpr() call out of the q->limits_lock region. [1] https://lore.kernel.org/linux-scsi/vlmv53ni3ltwxplig5qnw4xsl2h6ccxijfbqzekx76vxoim5a5@dekv7q3es3tx/ Fixes: 804e498e0496 ("sd: convert to the atomic queue limits API") Signed-off-by: Shin'ichiro Kawasaki Link: https://lore.kernel.org/r/20240801054234.540532-1-shinichiro.kawasaki@wdc.com Tested-by: Luca Coelho Reviewed-by: Damien Le Moal Reviewed-by: Christoph Hellwig Reviewed-by: Bart Van Assche Signed-off-by: Martin K. Petersen commit ab9fd06cb8f0db0854291833fc40c789e43a361f Author: Vamshi Gajjela Date: Wed Jul 24 19:21:26 2024 +0530 scsi: ufs: core: Fix hba->last_dme_cmd_tstamp timestamp updating logic The ufshcd_add_delay_before_dme_cmd() always introduces a delay of MIN_DELAY_BEFORE_DME_CMDS_US between DME commands even when it's not required. The delay is added when the UFS host controller supplies the quirk UFSHCD_QUIRK_DELAY_BEFORE_DME_CMDS. Fix the logic to update hba->last_dme_cmd_tstamp to ensure subsequent DME commands have the correct delay in the range of 0 to MIN_DELAY_BEFORE_DME_CMDS_US. Update the timestamp at the end of the function to ensure it captures the latest time after any necessary delay has been applied. Signed-off-by: Vamshi Gajjela Link: https://lore.kernel.org/r/20240724135126.1786126-1-vamshigajjela@google.com Fixes: cad2e03d8607 ("ufs: add support to allow non standard behaviours (quirks)") Cc: stable@vger.kernel.org Reviewed-by: Bart Van Assche Signed-off-by: Martin K. Petersen commit 47398f49dab8326bb652fa2d7a51ae5ec78775b5 Author: Eric Biggers Date: Sun Jul 21 11:38:40 2024 -0700 scsi: ufs: exynos: Don't resume FMP when crypto support is disabled If exynos_ufs_fmp_init() did not enable FMP support, then exynos_ufs_fmp_resume() should not execute the FMP-related SMC calls. Fixes: c96499fcb403 ("scsi: ufs: exynos: Add support for Flash Memory Protector (FMP)") Signed-off-by: Eric Biggers Link: https://lore.kernel.org/r/20240721183840.209284-1-ebiggers@kernel.org Reviewed-by: Bart Van Assche Signed-off-by: Martin K. Petersen commit 82dbb57ac8d06dfe8227ba9ab11a49de2b475ae5 Author: Damien Le Moal Date: Fri Jul 19 16:39:12 2024 +0900 scsi: mpt3sas: Avoid IOMMU page faults on REPORT ZONES Some firmware versions of the 9600 series SAS HBA byte-swap the REPORT ZONES command reply buffer from ATA-ZAC devices by directly accessing the buffer in the host memory. This does not respect the default command DMA direction and causes IOMMU page faults on architectures with an IOMMU enforcing write-only mappings for DMA_FROM_DEVICE DMA driection (e.g. AMD hosts). scsi 18:0:0:0: Direct-Access-ZBC ATA WDC WSH722020AL W870 PQ: 0 ANSI: 6 scsi 18:0:0:0: SATA: handle(0x0027), sas_addr(0x300062b2083e7c40), phy(0), device_name(0x5000cca29dc35e11) scsi 18:0:0:0: enclosure logical id (0x300062b208097c40), slot(0) scsi 18:0:0:0: enclosure level(0x0000), connector name( C0.0) scsi 18:0:0:0: atapi(n), ncq(y), asyn_notify(n), smart(y), fua(y), sw_preserve(y) scsi 18:0:0:0: qdepth(32), tagged(1), scsi_level(7), cmd_que(1) sd 18:0:0:0: Attached scsi generic sg2 type 20 sd 18:0:0:0: [sdc] Host-managed zoned block device mpt3sas 0000:41:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0021 address=0xfff9b200 flags=0x0050] mpt3sas 0000:41:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0021 address=0xfff9b300 flags=0x0050] mpt3sas_cm0: mpt3sas_ctl_pre_reset_handler: Releasing the trace buffer due to adapter reset. mpt3sas_cm0 fault info from func: mpt3sas_base_make_ioc_ready mpt3sas_cm0: fault_state(0x2666)! mpt3sas_cm0: sending diag reset !! mpt3sas_cm0: diag reset: SUCCESS sd 18:0:0:0: [sdc] REPORT ZONES start lba 0 failed sd 18:0:0:0: [sdc] REPORT ZONES: Result: hostbyte=DID_RESET driverbyte=DRIVER_OK sd 18:0:0:0: [sdc] 0 4096-byte logical blocks: (0 B/0 B) Avoid such issue by always mapping the buffer of REPORT ZONES commands using DMA_BIDIRECTIONAL (read+write IOMMU mapping). This is done by introducing the helper function _base_scsi_dma_map() and using this helper in _base_build_sg_scmd() and _base_build_sg_scmd_ieee() instead of calling directly scsi_dma_map(). Fixes: 471ef9d4e498 ("mpt3sas: Build MPI SGL LIST on GEN2 HBAs and IEEE SGL LIST on GEN3 HBAs") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal Link: https://lore.kernel.org/r/20240719073913.179559-3-dlemoal@kernel.org Reviewed-by: Christoph Hellwig Reviewed-by: Johannes Thumshirn Signed-off-by: Martin K. Petersen commit 1abc900ddda8ad2ef739fedf498d415655b6c3b8 Author: Damien Le Moal Date: Fri Jul 19 16:39:11 2024 +0900 scsi: mpi3mr: Avoid IOMMU page faults on REPORT ZONES Some firmware versions of the 9600 series SAS HBA byte-swap the REPORT ZONES command reply buffer from ATA-ZAC devices by directly accessing the buffer in the host memory. This does not respect the default command DMA direction and causes IOMMU page faults on architectures with an IOMMU enforcing write-only mappings for DMA_FROM_DEVICE DMA direction (e.g. AMD hosts), leading to the device capacity to be dropped to 0: scsi 18:0:58:0: Direct-Access-ZBC ATA WDC WSH722626AL W930 PQ: 0 ANSI: 7 scsi 18:0:58:0: Power-on or device reset occurred sd 18:0:58:0: Attached scsi generic sg9 type 20 sd 18:0:58:0: [sdj] Host-managed zoned block device mpi3mr 0000:c1:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0001 address=0xfec0c400 flags=0x0050] mpi3mr 0000:c1:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0001 address=0xfec0c500 flags=0x0050] sd 18:0:58:0: [sdj] REPORT ZONES start lba 0 failed sd 18:0:58:0: [sdj] REPORT ZONES: Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK sd 18:0:58:0: [sdj] 0 4096-byte logical blocks: (0 B/0 B) sd 18:0:58:0: [sdj] Write Protect is off sd 18:0:58:0: [sdj] Mode Sense: 6b 00 10 08 sd 18:0:58:0: [sdj] Write cache: enabled, read cache: enabled, supports DPO and FUA sd 18:0:58:0: [sdj] Attached SCSI disk Avoid this issue by always mapping the buffer of REPORT ZONES commands using DMA_BIDIRECTIONAL, that is, using a read-write IOMMU mapping. Suggested-by: Christoph Hellwig Fixes: 023ab2a9b4ed ("scsi: mpi3mr: Add support for queue command processing") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal Link: https://lore.kernel.org/r/20240719073913.179559-2-dlemoal@kernel.org Reviewed-by: Christoph Hellwig Reviewed-by: Johannes Thumshirn Signed-off-by: Martin K. Petersen commit ac6efb12ca64156f4a94e964acdb96ee7d59630d Author: Manivannan Sadhasivam Date: Thu Jul 18 22:36:59 2024 +0530 scsi: ufs: core: Do not set link to OFF state while waking up from hibernation UFS link is just put into hibern8 state during the 'freeze' process of the hibernation. Afterwards, the system may get powered down. But that doesn't matter during wakeup. Because during wakeup from hibernation, UFS link is again put into hibern8 state by the restore kernel and then the control is handed over to the to image kernel. So in both the places, UFS link is never turned OFF. But ufshcd_system_restore() just assumes that the link will be in OFF state and sets the link state accordingly. And this breaks hibernation wakeup: [ 2445.371335] phy phy-1d87000.phy.3: phy_power_on was called before phy_init [ 2445.427883] ufshcd-qcom 1d84000.ufshc: Controller enable failed [ 2445.427890] ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5 [ 2445.427906] ufs_device_wlun 0:0:0:49488: ufshcd_wl_resume failed: -5 [ 2445.427918] ufs_device_wlun 0:0:0:49488: PM: dpm_run_callback(): scsi_bus_restore returns -5 [ 2445.427973] ufs_device_wlun 0:0:0:49488: PM: failed to restore async: error -5 So fix the issue by removing the code that sets the link to OFF state. Cc: Anjana Hari Cc: stable@vger.kernel.org # 6.3 Fixes: 88441a8d355d ("scsi: ufs: core: Add hibernation callbacks") Signed-off-by: Manivannan Sadhasivam Link: https://lore.kernel.org/r/20240718170659.201647-1-manivannan.sadhasivam@linaro.org Reviewed-by: Bart Van Assche Signed-off-by: Martin K. Petersen commit da3e19ef0b3de0aa4b25595bdc214c02a04f19b8 Author: Johan Hovold Date: Tue Jul 16 18:11:01 2024 +0200 scsi: Revert "scsi: sd: Do not repeat the starting disk message" This reverts commit 7a6bbc2829d4ab592c7e440a6f6f5deb3cd95db4. The offending commit tried to suppress a double "Starting disk" message for some drivers, but instead started spamming the log with bogus messages every five seconds: [ 311.798956] sd 0:0:0:0: [sda] Starting disk [ 316.919103] sd 0:0:0:0: [sda] Starting disk [ 322.040775] sd 0:0:0:0: [sda] Starting disk [ 327.161140] sd 0:0:0:0: [sda] Starting disk [ 332.281352] sd 0:0:0:0: [sda] Starting disk [ 337.401878] sd 0:0:0:0: [sda] Starting disk [ 342.521527] sd 0:0:0:0: [sda] Starting disk [ 345.850401] sd 0:0:0:0: [sda] Starting disk [ 350.967132] sd 0:0:0:0: [sda] Starting disk [ 356.090454] sd 0:0:0:0: [sda] Starting disk ... on machines that do not actually stop the disk on runtime suspend (e.g. the Qualcomm sc8280xp CRD with UFS). Let's just revert for now to address the regression. Fixes: 7a6bbc2829d4 ("scsi: sd: Do not repeat the starting disk message") Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20240716161101.30692-1-johan+linaro@kernel.org Reviewed-by: Bart Van Assche Reviewed-by: Damien Le Moal Signed-off-by: Martin K. Petersen commit 3911af778f208e5f49d43ce739332b91e26bc48e Author: Peter Wang Date: Mon Jul 15 14:38:31 2024 +0800 scsi: ufs: core: Fix deadlock during RTC update There is a deadlock when runtime suspend waits for the flush of RTC work, and the RTC work calls ufshcd_rpm_get_sync() to wait for runtime resume. Here is deadlock backtrace: kworker/0:1 D 4892.876354 10 10971 4859 0x4208060 0x8 10 0 120 670730152367 ptr f0ffff80c2e40000 0 1 0x00000001 0x000000ff 0x000000ff 0x000000ff __switch_to+0x1a8/0x2d4 __schedule+0x684/0xa98 schedule+0x48/0xc8 schedule_timeout+0x48/0x170 do_wait_for_common+0x108/0x1b0 wait_for_completion+0x44/0x60 __flush_work+0x39c/0x424 __cancel_work_sync+0xd8/0x208 cancel_delayed_work_sync+0x14/0x28 __ufshcd_wl_suspend+0x19c/0x480 ufshcd_wl_runtime_suspend+0x3c/0x1d4 scsi_runtime_suspend+0x78/0xc8 __rpm_callback+0x94/0x3e0 rpm_suspend+0x2d4/0x65c __pm_runtime_suspend+0x80/0x114 scsi_runtime_idle+0x38/0x6c rpm_idle+0x264/0x338 __pm_runtime_idle+0x80/0x110 ufshcd_rtc_work+0x128/0x1e4 process_one_work+0x26c/0x650 worker_thread+0x260/0x3d8 kthread+0x110/0x134 ret_from_fork+0x10/0x20 Skip updating RTC if RPM state is not RPM_ACTIVE. Fixes: 6bf999e0eb41 ("scsi: ufs: core: Add UFS RTC support") Cc: stable@vger.kernel.org # 6.9.x Signed-off-by: Peter Wang Link: https://lore.kernel.org/r/20240715063831.29792-1-peter.wang@mediatek.com Reviewed-by: Bean Huo Reviewed-by: Bart Van Assche Signed-off-by: Martin K. Petersen commit 022587d8aec3da1d1698ddae9fb8cfe35f3ad49c Author: Peter Wang Date: Fri Jul 12 17:45:06 2024 +0800 scsi: ufs: core: Bypass quick recovery if force reset is needed If force_reset is true, bypass quick recovery. This will shorten error recovery time. Signed-off-by: Peter Wang Link: https://lore.kernel.org/r/20240712094506.11284-1-peter.wang@mediatek.com Reviewed-by: Bean Huo Reviewed-by: Bart Van Assche Signed-off-by: Martin K. Petersen commit 0c60eb0cc320fffbb8b10329d276af14f6f5e6bf Author: Kyoungrul Kim Date: Wed Jul 10 08:25:20 2024 +0900 scsi: ufs: core: Check LSDBS cap when !mcq If the user sets use_mcq_mode to 0, the host will try to activate the LSDB mode unconditionally even when the LSDBS of device HCI cap is 1. This makes commands time out and causes device probing to fail. To prevent that problem, check the LSDBS cap when MCQ is not supported. Signed-off-by: Kyoungrul Kim Link: https://lore.kernel.org/r/20240709232520epcms2p8ebdb5c4fccc30a6221390566589bf122@epcms2p8 Reviewed-by: Bart Van Assche Signed-off-by: Martin K. Petersen