diff --git a/news/README.md b/news/README.md index 1890e7f620824e95042e43a4dbdc956370f5ac0b..df87ec92658a8eee770ac07f2eb8b6946e0349ff 100644 --- a/news/README.md +++ b/news/README.md @@ -5,6 +5,963 @@ * [2022 年](2022.md) * [2023 年 - 上半年](2023-1st-half.md) +## 20231119: Issue 69 + +### Kernel Updates + +#### RISC-V Architecture Support + +**[v4: Change the sg2042 timer layout to fit aclint format](http://lore.kernel.org/linux-riscv/IA1PR20MB4953C82499C5D81D2C6A020BBBB6A@IA1PR20MB4953.namprd20.prod.outlook.com/)** + +> As the sg2042 uses different address for timer and mswi of its clint +> device, it should follow the aclint format. For the previous patchs, +> it only use only one address for both mtime and mtimer, this is can +> not be parsed by OpenSBI. To resolve this, separate these two registers +> in the dtb. +> + +**[v4: RISC-V SBI debug console extension support](http://lore.kernel.org/linux-riscv/20231118033859.726692-1-apatel@ventanamicro.com/)** + +> The SBI v2.0 specification is now frozen. The SBI v2.0 specification defines +> SBI debug console (DBCN) extension which replaces the legacy SBI v0.1 +> functions sbi_console_putchar() and sbi_console_getchar(). +> (Refer v2.0-rc5 at https://github.com/riscv-non-isa/riscv-sbi-doc/releases) +> +> This series adds support for SBI debug console (DBCN) extension in KVM RISC-V +> and Linux RISC-V. +> + +**[v11: riscv: Add fine-tuned checksum functions](http://lore.kernel.org/linux-riscv/20231117-optimize_checksum-v11-0-7d9d954fe361@rivosinc.com/)** + +> Each architecture generally implements fine-tuned checksum functions to +> leverage the instruction set. This patch adds the main checksum +> functions that are used in networking. +> +> This patch takes heavy use of the Zbb extension using alternatives +> patching.
+> + +**[v1: riscv: Resolve module loading issues](http://lore.kernel.org/linux-riscv/20231117-module_fixup-v1-0-62bb777f6825@rivosinc.com/)** + +> Previous commits caused compilation of module linking tests to +> fail on rv32 toolchains with uleb128 support. The first patch resolves +> that issue. The second patch resolves the type issues pointed out by +> sparse. +> + +**[v1: riscv: compat_vdso: align VDSOAS build log](http://lore.kernel.org/linux-riscv/20231117125843.1058553-1-masahiroy@kernel.org/)** + +> Add one more space after "VDSOAS" for better alignment in the build log. +> + +**[v1: riscv: compat_vdso: install compat_vdso.so.dbg to /lib/modules/*/vdso/](http://lore.kernel.org/linux-riscv/20231117125807.1058477-1-masahiroy@kernel.org/)** + +> 'make vdso_install' installs debug vdso files to /lib/modules/*/vdso/. +> +> Only for the compat vdso on riscv, the installation destination differs; +> compat_vdso.so.dbg is installed to /lib/module/*/compat_vdso/. +> + +**[v2: Solve iommu probe races around iommu_fwspec](http://lore.kernel.org/linux-riscv/0-v2-36a0088ecaa7+22c6e-iommu_fwspec_jgg@nvidia.com/)** + +> The iommu subsystem uses dev->iommu to store bits of information about the +> attached iommu driver. This has been co-opted by the ACPI/OF code to also +> be a place to pass around the iommu_fwspec before a driver is probed. +> + +**[v11: Refactoring Microchip PCIe driver and add StarFive PCIe](http://lore.kernel.org/linux-riscv/20231115114912.71448-1-minda.chen@starfivetech.com/)** + +> This patchset final purpose is add PCIe driver for StarFive JH7110 SoC. +> JH7110 using PLDA XpressRICH PCIe IP. Microchip PolarFire Using the +> same IP and have commit their codes, which are mixed with PLDA +> controller codes and Microchip platform codes. 
+> + +**[v6: RISC-V: Add MMC support for TH1520 boards](http://lore.kernel.org/linux-riscv/20231114-th1520-mmc-v6-0-3273c661a571@baylibre.com/)** + +> This series adds support for the MMC controller in the T-Head TH1520 +> SoC, and it enables the eMMC and microSD slot on both the BeagleV +> Ahead and the Sipeed LicheePi 4A. +> +> I tested on top of v6.6 with riscv defconfig. I was able to boot the +> Ahead [1] and LPi4a [2] from eMMC. This patch series also exists as a +> git branch [3]. +> + +**[v1: kexec_file: print out debugging message if required](http://lore.kernel.org/linux-riscv/20231114153253.241262-1-bhe@redhat.com/)** + +> Currently, specifying '-d' will print a lot of debugging information +> about kexec/kdump loading with kexec_load interface. +> +> However, kexec_file_load prints nothing even though '-d' is specified. +> It's very inconvenient to debug or analyze the kexec/kdump loading when +> something wrong happened with kexec/kdump itself or develper want to +> check the kexec/kdump loading. +> + +**[v4: RESEND: riscv: errata: thead: use riscv_nonstd_cache_ops for CMO](http://lore.kernel.org/linux-riscv/20231114143338.2406-1-jszhang@kernel.org/)** + +> Previously, we use alternative mechanism to dynamically patch +> the CMO operations for THEAD C906/C910 during boot for performance +> reason. But as pointed out by Arnd, "there is already a significant +> cost in accessing the invalidated cache lines afterwards, which is +> likely going to be much higher than the cost of an indirect branch". +> And indeed, there's no performance difference with GMAC and EMMC per +> my test on Sipeed Lichee Pi 4A board. +> + +**[v4: riscv: report more ISA extensions through hwprobe](http://lore.kernel.org/linux-riscv/20231114141256.126749-1-cleger@rivosinc.com/)** + +> In order to be able to gather more information about the supported ISA +> extensions from userspace using the hwprobe syscall, add more ISA +> extensions report. 
+> Some of these extensions are actually shorthands for other "sub" +> extensions. This series includes a patch from Conor/Evan that adds a way +> to specify such "bundled" extensions. When exposing these bundled +> extensions to userspace through hwprobe, only the "sub" extensions are +> exposed. +> + +**[v1: kexec_file: Load kernel at top of system RAM if required](http://lore.kernel.org/linux-riscv/20231114091658.228030-1-bhe@redhat.com/)** + +> Kexec_load interface has been doing top down searching and loading +> kernel/initrd/purgtory etc to prepare for kexec reboot. In that way, +> the benefits are that it avoids to consume and fragment limited low +> memory which satisfy DMA buffer allocation and big chunk of continuous +> memory during system init; and avoids to stir with BIOS/FW reserved +> or occupied areas, or corner case handling/work around/quirk occupied +> areas when doing system init. By the way, the top-down searching and +> loading of kexec-ed kernel is done in user space utility code. +> + +**[v1: riscv: sophgo: add clock support for sg2042](http://lore.kernel.org/linux-riscv/cover.1699879741.git.unicorn_wang@outlook.com/)** + +> This series adds clock controller support for sophgo sg2042. +> + +**[v1: serial: sifive: Declare PM operations as static](http://lore.kernel.org/linux-riscv/20231113023122.1185407-1-samuel.holland@sifive.com/)** + +> They are only used within this file, so they should have static linkage. +> + +**[v1: riscv: Optimize hweight API with Zbb extension](http://lore.kernel.org/linux-riscv/20231112095244.4015351-1-xiao.w.wang@intel.com/)** + +> The Hamming Weight of a number is the total number of bits set in it, so +> the cpop/cpopw instruction from Zbb extension can be used to accelerate +> hweight() API. 
+> + +**[v1: riscv: Avoid code duplication with generic bitops implementation](http://lore.kernel.org/linux-riscv/20231112094421.4014931-1-xiao.w.wang@intel.com/)** + +> There's code duplication between the fallback implementation for bitops +> __ffs/__fls/ffs/fls API and the generic C implementation in +> include/asm-generic/bitops/. To avoid this duplication, this patch renames +> the generic C implementation by adding a "generic_" prefix to them, then we +> can use these generic APIs as fallback. +> + +**[v2: rv64ilp32: Running ILP32 on RV64 ISA](http://lore.kernel.org/linux-riscv/20231112061514.2306187-1-guoren@kernel.org/)** + +> This patch series adds s64ilp32 & u64ilp32 support to riscv. The term +> s64ilp32 means smode-xlen=64 and -mabi=ilp32 (ints, longs, and pointers +> are all 32-bit) and u64ilp32 means umode-xlen=64 and -mabi=ilp32, i.e., +> running 32-bit Linux kernel on 64-bit supervisor mode or running 32-bit +> Linux applications on 32-bit user mode. There have been many 64ilp32 +> abis existing, such as mips-n32 [1], arm-aarch64ilp32 [2], and x86-x32 +> [3], but they are all about userspace. Thus, this should be the first +> time running a 32-bit Linux kernel with the 64ilp32 ABI at supervisor +> mode (If not, correct me). +> + +#### Process Scheduling + +**[v1: sched/core: put the cookie to uaddr when create cookie](http://lore.kernel.org/lkml/20231117132148.17844-1-CruzZhao@linux.alibaba.com/)** + +> For the control process, it's necessary to get the cookie of the +> task when create a cookie with command PR_SCHED_CORE_CREATE. In +> current design, we have to use command PR_SCHED_CORE_GET after we +> create a cookie, with one more syscall. +> + +**[v1: sched/idle: Add a few cpuidle VS timers comments](http://lore.kernel.org/lkml/20231114193840.4041-1-frederic@kernel.org/)** + +> Those are the scheduler specific bits extracted from a previous series +> (v1: timers/cpuidle: Fixes and cleanups).
+> + +#### Memory Management + +**[v4: mm/gup: Introduce pin_user_pages_fd() for pinning shmem/hugetlbfs file pages (v4)](http://lore.kernel.org/linux-mm/20231118063233.733523-1-vivek.kasireddy@intel.com/)** + +> The first two patches were previously reviewed but not yet merged. +> These ones need to be merged first as the fourth patch depends on +> the changes introduced in them and they also fix bugs seen in +> very specific scenarios (running Qemu with hugetlb=on, blob=true +> and rebooting guest VM). +> + +**[v2: netfs, afs, cifs: Delegate high-level I/O to netfslib](http://lore.kernel.org/linux-mm/20231117211544.1740466-1-dhowells@redhat.com/)** + +> I have been working on my netfslib helpers to the point that I can run +> xfstests on AFS to completion (both with write-back buffering and, with a +> small patch, write-through buffering in the pagecache). I can also run a +> certain amount of xfstests on CIFS, though that requires some more +> debugging. However, this seems like a good time to post a preview of the +> patches. +> + +**[v3: mm: memcg: subtree stats flushing and thresholds](http://lore.kernel.org/linux-mm/20231116022411.2250072-1-yosryahmed@google.com/)** + +> This series attempts to address shortages in today's approach for memcg +> stats flushing, namely occasionally stale or expensive stat reads. The +> series does so by changing the threshold that we use to decide whether +> to trigger a flush to be per memcg instead of global (patch 3), and then +> changing flushing to be per memcg (i.e. subtree flushes) instead of +> global (patch 5). +> + +**[v1: mm/gup: Unify hugetlb, part 2](http://lore.kernel.org/linux-mm/20231116012908.392077-1-peterx@redhat.com/)** + +> This patchset is in RFC stage. It's mostly because it is only yet tested on +> x86_64 in a VM. Not even compile tested on PPC or any other archs, it +> means at least the hugepd patch (patch 11) is mostly untested, or even not +> compile tested.
Before doing that, I'd like to collect any information +> from high level. +> + +**[v1: kmsan: Enable on s390](http://lore.kernel.org/linux-mm/20231115203401.2495875-1-iii@linux.ibm.com/)** + +> This series provides the minimal support for Kernel Memory Sanitizer on +> s390. Kernel Memory Sanitizer is clang-only instrumentation for finding +> accesses to uninitialized memory. The clang support for s390 has already +> been merged [1]. +> + +**[v5: zswap: memcontrol: implement zswap writeback disabling](http://lore.kernel.org/linux-mm/20231115172344.4155593-1-nphamcs@gmail.com/)** + +> During our experiment with zswap, we sometimes observe swap IOs due to +> occasional zswap store failures and writebacks-to-swap. These swapping +> IOs prevent many users who cannot tolerate swapping from adopting zswap +> to save memory and improve performance where possible. +> + +**[v2: Transparent Contiguous PTEs for User Mappings](http://lore.kernel.org/linux-mm/20231115163018.1303287-1-ryan.roberts@arm.com/)** + +> This is v2 of a series to opportunistically and transparently use contpte +> mappings (set the contiguous bit in ptes) for user memory when those mappings +> meet the requirements. It is part of a wider effort to improve performance by +> allocating and mapping variable-sized blocks of memory (folios). One aim is for +> the 4K kernel to approach the performance of the 16K kernel, but without +> breaking compatibility and without the associated increase in memory. Another +> aim is to benefit the 16K and 64K kernels by enabling 2M THP, since this is the +> contpte size for those kernels. We have good performance data that demonstrates +> both aims are being met (see below). +> + +**[v7: Small-sized THP for anonymous memory](http://lore.kernel.org/linux-mm/20231115132734.931023-1-ryan.roberts@arm.com/)** + +> This is v7 of a series to implement small-sized THP for anonymous memory +> (previously called "large anonymous folios"). 
The objective of this is to +> improve performance by allocating larger chunks of memory during anonymous page +> faults: +> +> Since SW (the kernel) is dealing with larger chunks of memory than base +> pages, there are efficiency savings to be had; fewer page faults, batched PTE +> and RMAP manipulation, reduced lru list, etc. +> + +**[v9: mm: vmscan: try to reclaim swapcache pages if no swap space](http://lore.kernel.org/linux-mm/20231115050123.982876-1-liushixin2@huawei.com/)** + +> When spaces of swap devices are exhausted, only file pages can be +> reclaimed. But there are still some swapcache pages in anon lru list. +> This can lead to a premature out-of-memory. +> + +**[v1: implement "memmap on memory" feature on s390](http://lore.kernel.org/linux-mm/20231114180238.1522782-1-sumanthk@linux.ibm.com/)** + +> The patch series implements "memmap on memory" feature on s390 and +> provides the necessary fixes for it. +> +> Patch 1 addresses the locking order in memory hotplug operations, +> ensuring that the mem_hotplug_lock is held during critical operations +> like mhp_init_memmap_on_memory() and mhp_deinit_memmap_on_memory() +> + +**[v1: mm: More ptep_get() conversion](http://lore.kernel.org/linux-mm/20231114154945.490401-1-ryan.roberts@arm.com/)** + +> Commit c33c794828f2 ("mm: ptep_get() conversion") converted all +> (non-arch) call sites to use ptep_get() instead of doing a direct +> dereference of the pte. Full rationale can be found in that commit's +> log. +> + +**[v1: cxl: Add support for CXL feature commands, CXL device patrol scrub control and DDR5 ECS control features](http://lore.kernel.org/linux-mm/20231114125648.1146-1-shiju.jose@huawei.com/)** + +> 1. Add support for CXL feature commands(CXL spec 3.0 section 8.2.9.6). +> 2. Add CXL device scrub driver supporting patrol scrub control feature +> (CXL spec 3.1 section 8.2.9.9.11.1) and DDR5 ECS feature(CXL spec 3.1 +> section 8.2.9.9.11.2). +> 3. 
Add scrub attributes for DDR5 ECS control to the memory scrub driver. +> + +**[v1: ksm: delay the check of splitting compound pages](http://lore.kernel.org/linux-mm/202311142036302357580@zte.com.cn/)** + +> When trying to merge two pages, it may fail because the two pages +> belongs to the same compound page and split_huge_page fails due to +> the incorrect reference to the page. To solve the problem, the commit +> 77da2ba0648a4 ("mm/ksm: fix interaction with THP") tries to split the +> compound page after try_to_merge_two_pages() fails and put_page in +> that case. However it is too early to calculate of the variable 'split' which +> indicates whether the two pages belongs to the same compound page. +> + +**[v1: mm/gup: Introduce pin_user_pages_fd() for pinning shmem/hugetlbfs file pages (v3)](http://lore.kernel.org/linux-mm/20231114070044.464451-1-vivek.kasireddy@intel.com/)** + +> For drivers that would like to longterm-pin the pages associated +> with a file, the pin_user_pages_fd() API provides an option to +> not only pin the pages via FOLL_PIN but also to check and migrate +> them if they reside in movable zone or CMA block. This API +> currently works with files that belong to either shmem or hugetlbfs. +> Files belonging to other filesystems are rejected for now. +> + +**[v2: mm/page_owner: record and dump free_pid and free_tgid](http://lore.kernel.org/linux-mm/20231114034202.73098-1-v-songbaohua@oppo.com/)** + +> While investigating some complex memory allocation and free bugs +> especially in multi-processes and multi-threads cases, from time +> to time, I feel the free stack isn't sufficient as a page can be +> freed by processes or threads other than the one allocating it. +> And other processes and threads which free the page often have +> the exactly same free stack with the one allocating the page. 
We +> can't know who free the page only through the free stack though +> the current page_owner does tell us the pid and tgid of the one +> allocating the page. This makes the bug investigation often hard. +> + +**[v3: PATCH: arm64: mm: swap: save and restore mte tags for large folios](http://lore.kernel.org/linux-mm/20231114014313.67232-1-v-songbaohua@oppo.com/)** + +> This patch makes MTE tags saving and restoring support large folios, +> then we don't need to split them into base pages for swapping out +> on ARM64 SoCs with MTE. +> +> arch_prepare_to_swap() should take folio rather than page as parameter +> because we support THP swap-out as a whole. +> + +#### File Systems + +**[v1: Support fanotify FAN_REPORT_FID on all filesystems](http://lore.kernel.org/linux-fsdevel/20231118183018.2069899-1-amir73il@gmail.com/)** + +> In the vfs fanotify update for v6.7-rc1 [1], we considerably increased +> the amount of filesystems that can setup inode marks with FAN_REPORT_FID: +> - NFS export is no longer required for setting up inode marks +> - All the simple fs gained a non-zero fsid +> + +**[v1: fs: Rename mapping private members](http://lore.kernel.org/linux-fsdevel/20231117215823.2821906-1-willy@infradead.org/)** + +> It is hard to find where mapping->private_lock, mapping->private_list and +> mapping->private_data are used, due to private_XXX being a relatively +> common name for variables and structure members in the kernel. To fit +> with other members of struct address_space, rename them all to have an +> i_ prefix. Tested with an allmodconfig build. +> + +**[v5: Landlock: IOCTL support](http://lore.kernel.org/linux-fsdevel/20231117154920.1706371-1-gnoack@google.com/)** + +> Introduce the LANDLOCK_ACCESS_FS_IOCTL right, which restricts the use +> of ioctl(2) on file descriptors. +> +> We attach IOCTL access rights to opened file descriptors, as we +> already do for LANDLOCK_ACCESS_FS_TRUNCATE.
+> +> If LANDLOCK_ACCESS_FS_IOCTL is handled (restricted in the ruleset), +> the LANDLOCK_ACCESS_FS_IOCTL access right governs the use of all IOCTL +> commands. +> + +**[v6: Add support for Vendor Defined Error Types in Einj Module](http://lore.kernel.org/linux-fsdevel/20231116224725.3695952-1-avadhut.naik@amd.com/)** + +> This patchset adds support for Vendor Defined Error types in the einj +> module by exporting a binary blob file in module's debugfs directory. +> Userspace tools can write OEM Defined Structures into the blob file as +> part of injecting Vendor defined errors. Similarly, the very tools can +> also read from the blob file for information, if any, provided by the +> firmware after error injection. +> + +**[v1: fs: fuse: dax: set fc->dax to NULL in fuse_dax_conn_free()](http://lore.kernel.org/linux-fsdevel/20231116075726.28634-1-hbh25y@gmail.com/)** + +> fuse_dax_conn_free() will be called when fuse_fill_super_common() fails +> after fuse_dax_conn_alloc(). Then deactivate_locked_super() in +> virtio_fs_get_tree() will call virtio_kill_sb() to release the discarded +> superblock. This will call fuse_dax_conn_free() again in fuse_conn_put(), +> resulting in a possible double free. +> + +**[v1: squashfs: squashfs_read_data need to check if the length is 0](http://lore.kernel.org/linux-fsdevel/20231116031352.40853-1-lizhi.xu@windriver.com/)** + +> when the length passed in is 0, the subsequent process should be exited. +> +> Reported-by: syzbot+32d3767580a1ea339a81@syzkaller.appspotmail.com +> + +**[v1: autofs: add: new_inode check in autofs_fill_super()](http://lore.kernel.org/linux-fsdevel/20231116000746.7359-1-raven@themaw.net/)** + +> Add missing NULL check of root_inode in autofs_fill_super(). +> +> While we are at it simplify the logic by taking advantage of the VFS +> cleanup procedures and get rid of the goto error handling, as suggested +> by Al Viro. 
+> + +**[v2: Introduce sysfs API for resend pending requests](http://lore.kernel.org/linux-fsdevel/20231115094930.296218-1-winters.zc@antgroup.com/)** + +> After the FUSE daemon crashes, the fuse mount point becomes inaccessible. +> In some production environments, a watchdog daemon is used to preserve +> the FUSE connection's file descriptor (fd). When the FUSE daemon crashes, +> a new FUSE daemon is started and takes over the fd from the watchdog +> daemon, allowing it to continue providing services. +> + +**[v4: Pass data lifetime information to SCSI disk devices](http://lore.kernel.org/linux-fsdevel/20231114214132.1486867-1-bvanassche@acm.org/)** + +> UFS vendors need the data lifetime information to achieve good performance. +> Providing data lifetime information to UFS devices can result in up to 40% +> lower write amplification. Hence this patch series that adds support in F2FS +> and also in the block layer for data lifetime information. The SCSI disk (sd) +> driver is modified such that it passes write hint information to SCSI devices +> via the GROUP NUMBER field. +> + +**[v1: Tidy up file permission hooks](http://lore.kernel.org/linux-fsdevel/20231114153321.1716028-1-amir73il@gmail.com/)** + +> I realize you won't have time to review this week, but wanted to get +> this series out for review for a wider audience soon. +> +> During my work on fanotify "pre content" events [1], Jan and I noticed +> some inconsistencies in the call sites of security_file_permission() +> hooks inside rw_verify_area() and remap_verify_area(). +> + +**[v1: fs: anon-inode: Prepend blank line separator to anon_inode_create_getfile() reasons list](http://lore.kernel.org/linux-fsdevel/20231114091243.32789-1-bagasdotme@gmail.com/)** + +> Stephen Rothwell reported htmldocs warning when merging kvm tree: +> +> Documentation/filesystems/api-summary:74: fs/anon_inodes.c:167: ERROR: Unexpected indentation. 
+> Documentation/filesystems/api-summary:74: fs/anon_inodes.c:168: WARNING: Block quote ends without a blank line; unexpected unindent. +> + +#### Network Devices + +**[v2: net-next: octeontx2: Multicast/mirror offload changes](http://lore.kernel.org/netdev/20231118180157.3593084-1-sumang@marvell.com/)** + +> This patchset includes changes to support TC multicast/mirror offload. +> + +**[v1: biops: add atomig find_bit() operations](http://lore.kernel.org/netdev/20231118155105.25678-1-yury.norov@gmail.com/)** + +> Add helpers around test_and_{set,clear}_bit() that allow to search for +> clear or set bits and flip them atomically. +> + +**[v1: net-next: MT7530 DSA subdriver improvements](http://lore.kernel.org/netdev/20231118123205.266819-1-arinc.unal@arinc9.com/)** + +> Hello! +> +> This patch series simplifies the MT7530 DSA subdriver and improves the +> logic of the support for MT7530, MT7531, and the switch on the MT7988 SoC. +> + +**[v1: genetlink: Prevent memory leak when krealloc fail](http://lore.kernel.org/netdev/20231118113357.1999-1-kamil.duljas@gmail.com/)** + +> genl_allocate_reserve_groups() allocs new memory in while loop +> but if krealloc fail, the memory allocated by kzalloc is not freed. +> It seems allocated memory is unnecessary when the function +> returns -ENOMEM +> + +**[v5: add qca8084 ethernet phy driver](http://lore.kernel.org/netdev/20231118062754.2453-1-quic_luoj@quicinc.com/)** + +> QCA8084 is four-port PHY with maximum link capability 2.5G, +> which supports the interface mode qusgmii and sgmii mode, +> there are two PCSs available to connected with ethernet port.
+> + +**[v3: net-next: selftests/dpll: DPLL subsystem integration tests](http://lore.kernel.org/netdev/20231117190505.7819-1-michal.michalik@intel.com/)** + +> The recently merged common DPLL interface discussed on a mailing list[1] +> is introducing new, complex subsystem which requires proper integration +> testing - this patchset adds such a framework, as well as the initial +> test cases. Framework does not require neither any special hardware nor +> any special system architecture. +> + +**[v3: iwl-net: ice: Block PF reinit if attached to bond](http://lore.kernel.org/netdev/20231117164427.912563-1-sachin.bahadur@intel.com/)** + +> PF interface part of LAG should not allow driver reinit via devlink. The +> Bond config will be lost due to driver reinit. ice_devlink_reload_down is +> called before PF driver reinit. If PF is attached to bond, +> ice_devlink_reload_down returns error. +> + +**[v1: net-next: net: ethernet: mtk_wed: add support for devices with more than 4GB of dram](http://lore.kernel.org/netdev/1c7efdf5d384ea7af3c0209723e40b2ee0f956bf.1700239272.git.lorenzo@kernel.org/)** + +> Introduce WED offloading support for boards with more than 4GB of +> memory. +> + +**[v1: net-next: net: ethernet: mtk_wed: rely on __dev_alloc_page in mtk_wed_tx_buffer_alloc](http://lore.kernel.org/netdev/a7c859060069205e383a4917205cb265f41083f5.1700239075.git.lorenzo@kernel.org/)** + +> Simplify the code and use __dev_alloc_page() instead of __dev_alloc_pages() +> with order 0 in mtk_wed_tx_buffer_alloc routine +> + +**[v2: net-next: Introduce PHY listing and link_topology tracking](http://lore.kernel.org/netdev/20231117162323.626979-1-maxime.chevallier@bootlin.com/)** + +> As part of the ongoing effort to better describe the ethernet link +> topology, this series introduces the first step by allowing to maintain +> a list of all the ethernet PHYs that are connected to a given netdevice. 
+> + +**[v1: net-next: net: ctnetlink: support filtering by zone](http://lore.kernel.org/netdev/ZVeGFP2x-Wx6duYs@SIT-SDELAP4051.int.lidl.net/)** + +> conntrack zones are heavily used by tools like openvswitch to run +> multiple virtual "routers" on a single machine. In this context each +> conntrack zone matches to a single router, thereby preventing +> overlapping IPs from becoming issues. +> In these systems it is common to operate on all conntrack entries of a +> given zone, e.g. to delete them when a router is deleted. Previously this +> required these tools to dump the full conntrack table and filter out the +> relevant entries in userspace potentially causing performance issues. +> + +**[v2: net: wireguard: use DEV_STATS_INC()](http://lore.kernel.org/netdev/20231117141733.3344158-1-edumazet@google.com/)** + +> wg_xmit() can be called concurrently, KCSAN reported [1] +> some device stats updates can be lost. +> +> Use DEV_STATS_INC() for this unlikely case. +> + +**[v1: net-next: net: phylink: use for_each_set_bit()](http://lore.kernel.org/netdev/E1r3yPo-00CnKQ-JG@rmk-PC.armlinux.org.uk/)** + +> Use for_each_set_bit() rather than open coding the for() test_bit() +> loop. +> + +**[v1: nfp: flower: Added pointer check and continue.](http://lore.kernel.org/netdev/20231117125701.58927-1-arefev@swemel.ru/)** + +> Return value of a function 'kmalloc_array' is dereferenced at +> lag_conf.c without checking for null, but it is usually +> checked for this function. +> +> Found by Linux Verification Center (linuxtesting.org) with SVACE. +> + +**[v2: net-next: net: eth: am65-cpsw: add ethtool MAC stats](http://lore.kernel.org/netdev/20231117121755.104547-1-rogerq@kernel.org/)** + +> Gets 'ethtool -S eth0 --groups eth-mac' command to work. +> +> Also set default TX channels to maximum available and does +> cleanup in am65_cpsw_nuss_common_open() error path. 
+> + +**[v3: net-next: net/smc: avoid atomic_set and smp_wmb in the tx path when possible](http://lore.kernel.org/netdev/20231117111657.16266-1-lirongqing@baidu.com/)** + +> There is rare possibility that conn->tx_pushing is not 1, since +> tx_pushing is just checked with 1, so move the setting tx_pushing +> to 1 after atomic_dec_and_test() return false, to avoid atomic_set +> and smp_wmb in tx path +> + +**[v1: net-next: octeon_ep: support Octeon CN10K devices](http://lore.kernel.org/netdev/20231117103817.2468176-1-srasheed@marvell.com/)** + +> Add PCI Endpoint NIC support for Octeon CN10K devices. +> CN10K devices are part of Octeon 10 family products with +> similar PCI NIC characteristics. These include: +> - CN10KA +> - CNF10KA +> - CNF10KB +> - CN10KB +> +> Update supported device list in Documentation +> + +**[v4: Add support for Qualcomm ECPRI clock controller](http://lore.kernel.org/netdev/20231117095558.3313877-1-quic_imrashai@quicinc.com/)** + +> The ECPRI clock controller support for QDU1000 and QRU1000. The clock +> controller has a special branch which requires an additional memory to +> be enabled/disabled before the branch ops. +> + +**[v5: can: xilinx_can: Add ECC feature support](http://lore.kernel.org/netdev/1700213336-652-1-git-send-email-srinivas.goud@amd.com/)** + +> Add ECC feature support to Tx and Rx FIFOs for Xilinx CAN Controller. +> Part of this feature configuration and counter registers added in +> Xilinx AXI CAN Controller for 1bit/2bit ECC errors count and reset. +> Also driver reports 1bit/2bit ECC errors for FIFOs based on ECC error +> interrupts. +> + +**[v1: net: netsec: replace cpu_relax() with timeout handling for register checks](http://lore.kernel.org/netdev/20231117081002.60107-1-ryosuke.saito@linaro.org/)** + +> The cpu_relax() loops have the potential to hang if the specified +> register bits are not met on condition. The patch replaces it with +> usleep_range() and netsec_wait_while_busy() which includes timeout +> logic. 
+> + +**[v1: net/mlx5: DR, Use swap() instead of open coding it](http://lore.kernel.org/netdev/20231117071947.112856-1-jiapeng.chong@linux.alibaba.com/)** + +> Swap is a function interface that provides exchange function. To avoid +> code duplication, we can use swap function. +> +> ./drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c:1254:50-51: WARNING opportunity for swap(). +> +> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7580 +> + +**[v1: net-next: nfp: add flow-steering support](http://lore.kernel.org/netdev/20231117071114.10667-1-louis.peens@corigine.com/)** + +> This short series adds flow steering support for the nfp driver. +> The first patch adds the part to communicate with ethtool but +> stubs out the HW offload parts. The second patch implements the +> HW communication and offloads flow steering. +> + +**[v1: mctp-i2c: increase the MCTP_I2C_TX_WORK_LEN to 500](http://lore.kernel.org/netdev/20231117070457.1970786-1-jinliangw@google.com/)** + +> The original value (100) is not sufficient for our use case. +> For example, we have 4 NVMe-mi devices on the same i2c bus. +> When sending namespace create Admin command concurrently, they +> will send 4x4KB data to device concurrently, which may be +> split into 4x(4KB/64B)=256 packets. +> + +**[v2: net: net/smc: avoid data corruption caused by decline](http://lore.kernel.org/netdev/1700197181-83136-1-git-send-email-alibuda@linux.alibaba.com/)** + +> We found a data corruption issue during testing of SMC-R on Redis +> applications. +> +> The benchmark has a low probability of reporting a strange error as +> shown below. +> + +#### Security Hardening + +**[v1: kernfs: Convert from strlcpy() to strscpy()](http://lore.kernel.org/linux-hardening/20231116191718.work.246-kees@kernel.org/)** + +> One of the last users of strlcpy() is kernfs, which has some complex +> calling hierarchies that needed to be carefully examined.
This series +> refactors the strlcpy() calls into strscpy() calls, and bubbles up all +> changes in return value checking for callers. +> + +**[v1: samples: Replace strlcpy() with strscpy()](http://lore.kernel.org/linux-hardening/20231116191510.work.550-kees@kernel.org/)** + +> strlcpy() reads the entire source buffer first. This read may exceed +> the destination size limit. This is both inefficient and can lead +> to linear read overflows if a source string is not NUL-terminated[1]. +> Additionally, it returns the size of the source string, not the +> resulting size of the destination string. In an effort to remove strlcpy() +> completely[2], replace strlcpy() here with strscpy(). +> + +**[v1: usb: gadget: f_midi: Replace strlcpy() with strscpy()](http://lore.kernel.org/linux-hardening/20231116191452.work.902-kees@kernel.org/)** + +> strlcpy() reads the entire source buffer first. This read may exceed +> the destination size limit. This is both inefficient and can lead +> to linear read overflows if a source string is not NUL-terminated[1]. +> Additionally, it returns the size of the source string, not the +> resulting size of the destination string. In an effort to remove strlcpy() +> completely[2], replace strlcpy() here with strscpy(). +> + +**[v1: scsi: zfcp: Replace strlcpy() with strscpy()](http://lore.kernel.org/linux-hardening/20231116191435.work.581-kees@kernel.org/)** + +> strlcpy() reads the entire source buffer first. This read may exceed +> the destination size limit. This is both inefficient and can lead +> to linear read overflows if a source string is not NUL-terminated[1]. +> Additionally, it returns the size of the source string, not the +> resulting size of the destination string. In an effort to remove strlcpy() +> completely[2], replace strlcpy() here with strscpy(). 
+> + +**[v1: dma-buf: Replace strlcpy() with strscpy()](http://lore.kernel.org/linux-hardening/20231116191409.work.634-kees@kernel.org/)** + +> strlcpy() reads the entire source buffer first. This read may exceed +> the destination size limit. This is both inefficient and can lead +> to linear read overflows if a source string is not NUL-terminated[1]. +> Additionally, it returns the size of the source string, not the +> resulting size of the destination string. In an effort to remove strlcpy() +> completely[2], replace strlcpy() here with strscpy(). +> + +**[v1: parisc: Replace strlcpy() with strscpy()](http://lore.kernel.org/linux-hardening/20231116191336.work.986-kees@kernel.org/)** + +> strlcpy() reads the entire source buffer first. This read may exceed +> the destination size limit. This is both inefficient and can lead +> to linear read overflows if a source string is not NUL-terminated[1]. +> Additionally, it returns the size of the source string, not the +> resulting size of the destination string. In an effort to remove strlcpy() +> completely[2], replace strlcpy() here with strscpy(). +> + +**[v1: next: xen: privcmd: Replace zero-length array with flex-array member and use __counted_by](http://lore.kernel.org/linux-hardening/ZVZlg3tPMPCRdteh@work/)** + +> Fake flexible arrays (zero-length and one-element arrays) are deprecated, +> and should be replaced by flexible-array members. So, replace +> zero-length array with a flexible-array member in `struct +> privcmd_kernel_ioreq`. +> + +**[v1: next: nouveau/gsp: replace zero-length array with flex-array member and use __counted_by](http://lore.kernel.org/linux-hardening/ZVZbX7C5suLMiBf+@work/)** + +> Fake flexible arrays (zero-length and one-element arrays) are deprecated, +> and should be replaced by flexible-array members. So, replace +> zero-length array with a flexible-array member in `struct +> PACKED_REGISTRY_TABLE`. 
+> + +**[v1: hwmon: Explicitly initialize nct6775_sio_names indexes](http://lore.kernel.org/linux-hardening/20231116140144.work.027-kees@kernel.org/)** + +> Changing the "kinds" enum start value to be 1-indexed instead of +> 0-indexed caused look-ups in nct6775_sio_namesp[] to be misaligned or +> off the end. Coverity reported: +> +> *** CID 1571052: Memory - illegal accesses (OVERRUN) +> drivers/hwmon/nct6775-platform.c:1075 in nct6775_find() +> vvv CID 1571052: Memory - illegal accesses (OVERRUN) +> vvv Overrunning array "nct6775_sio_names" of 13 8-byte elements at element index 13 (byte offset 111) using index "sio_data->kind" (which evaluates to 13). +> + +**[v1: next: Makefile: Enable -Wstringop-overflow globally](http://lore.kernel.org/linux-hardening/ZVWMCZ%2Fjb4nX3yHn@work/)** + +> It seems that we have finished addressing all the remaining +> issues regarding compiler option -Wstringop-overflow. So, we +> are now in good shape to enable this compiler option globally. +> + +**[v1: SUNRPC: Replace strlcpy() with strscpy()](http://lore.kernel.org/linux-hardening/20231114175407.work.410-kees@kernel.org/)** + +> strlcpy() reads the entire source buffer first. This read may exceed +> the destination size limit. This is both inefficient and can lead +> to linear read overflows if a source string is not NUL-terminated[1]. +> Additionally, it returns the size of the source string, not the +> resulting size of the destination string. In an effort to remove strlcpy() +> completely[2], replace strlcpy() here with strscpy(). +> + +**[v2: Hypervisor-Enforced Kernel Integrity](http://lore.kernel.org/linux-hardening/20231113022326.24388-1-mic@digikod.net/)** + +> This patch series is a proof-of-concept that implements new KVM features +> (guest memory attributes, MBEC support, CR pinning) and defines a new +> API to protect guest VMs. 
You can find related resources, including the +> related commits here: https://github.com/heki-linux +> We'll talk about this work and the related LVBS project at LPC: +> * https://lpc.events/event/17/contributions/1486/ +> * https://lpc.events/event/17/contributions/1515/ +> + +#### Asynchronous IO + +**[v3: io_uring: Statistics of the true utilization of sq threads.](http://lore.kernel.org/io-uring/20231115121839.12556-1-xiaobing.li@samsung.com/)** + +#### Rust For Linux + +**[v1: MODVERSIONS + RUST Redux](http://lore.kernel.org/rust-for-linux/20231115185858.2110875-1-mmaurer@google.com/)** + +> Support both MODVERSIONS and RUST by making symbol version information extensible. This works by having a separate section per field, and allowing iteration to work differently per field. The old module information remains available to allow existing kmod tools to continue to work on new modules if they are only looking for C information. +> + +#### BPF + +**[v3: bpf-next: BPF verifier log improvements](http://lore.kernel.org/bpf/20231118034623.3320920-1-andrii@kernel.org/)** + +> This patch set moves a big chunk of verifier log related code from gigantic +> verifier.c file into more focused kernel/bpf/log.c. This is not essential to +> the rest of functionality in this patch set, so I can undo it, but it felt +> like it's good to start chipping away from 20K+ verifier.c whenever we can. +> + +**[v2: bpf: verify callbacks as if they are called unknown number of times](http://lore.kernel.org/bpf/20231118013355.7943-1-eddyz87@gmail.com/)** + +> This series updates verifier logic for callback functions handling. +> Current master simulates callback body execution exactly once, +> which leads to verifier not detecting unsafe programs like below.
+> + +**[v1: bpf-next: bpf: rename BPF_F_TEST_SANITY_STRICT to BPF_F_TEST_REG_INVARIANTS](http://lore.kernel.org/bpf/20231117171404.225508-1-andrii@kernel.org/)** + +> Rename verifier internal flag BPF_F_TEST_SANITY_STRICT to more neutral +> BPF_F_TEST_REG_INVARIANTS. This is a follow up to [0]. +> +> A few selftests and veristat need to be adjusted in the same patch as +> well. +> +> [0] https://patchwork.kernel.org/project/netdevbpf/patch/20231112010609.848406-5-andrii@kernel.org/ +> + +**[v1: bpf-next: bpftool: Add end-to-end testing](http://lore.kernel.org/bpf/20231116194236.1345035-1-chantr4@gmail.com/)** + +> This series introduce a more ergonomic end-to-end testing for `bpftool`. +> +> While there is already some `bpftool` tests, they are so far shallow +> tests, validating features (test_bpftool.py), or validating expectations +> by mean of grepping for expected output in payload +> (test_bpftool_metadata.sh), which is hard to extend. +> + +**[v8: net-next: Introducing P4TC](http://lore.kernel.org/bpf/20231116145948.203001-1-jhs@mojatatu.com/)** + +> We are seeking community feedback on P4TC patches. +> +> We have reduced the number of commits in this patchset including leaving out +> all the testcases and secondary patches in order to ease review. +> + +**[v7: bpf-next: XDP metadata via kfuncs for ice + VLAN hint](http://lore.kernel.org/bpf/20231115175301.534113-1-larysa.zaremba@intel.com/)** + +> This series introduces XDP hints via kfuncs [0] to the ice driver. +> +> Series brings the following existing hints to the ice driver: +> - HW timestamp +> - RX hash with type +> +> Series also introduces VLAN tag with protocol XDP hint, it now be accessed by +> XDP and userspace (AF_XDP) programs. They can also be checked with xdp_metadata +> test and xdp_hw_metadata program. 
+> + +**[v1: net-next:pull request: igc: Add support for physical + free-running timers](http://lore.kernel.org/bpf/20231114183640.1303163-1-anthony.l.nguyen@intel.com/)** + +> Vinicius Costa Gomes says: +> +> The objective is to allow having functionality that depends on the +> physical timer (taprio and ETF offloads, for example) and vclocks +> operating together. +> + +**[v1: bpf: kernel/bpf/task_iter.c: don't abuse next_thread()](http://lore.kernel.org/bpf/20231114163211.GA874@redhat.com/)** + +> Compile tested. +> +> Every lockless usage of next_thread() was wrong, bpf/task_iter.c is +> the last user and is no exception. +> + +**[v1: bpf-next: skmsg: Add the data length in skmsg to SIOCINQ ioctl and rx_queue](http://lore.kernel.org/bpf/1699962120-3390-1-git-send-email-yangpc@wangsu.com/)** + +> When using skmsg redirect, the msg is queued in psock->ingress_msg, +> and the application calling SIOCINQ ioctl will return a readable +> length of 0, and we cannot track the data length of ingress_msg from +> the ss tool. +> + +**[v1: bpf-next: bpf: Relax tracing prog recursive attach rules](http://lore.kernel.org/bpf/20231114084118.11095-1-9erthalion6@gmail.com/)** + +> Currently, it's not allowed to attach an fentry/fexit prog to another +> one of the same type. At the same time it's not uncommon to see a +> tracing program with lots of logic in use, and the attachment limitation +> prevents usage of fentry/fexit for performance analysis (e.g. with +> "bpftool prog profile" command) in this case. An example could be +> falcosecurity libs project that uses tp_btf tracing programs. +> + +**[v1: bpf-next: bpf: Use GFP_KERNEL in bpf_event_entry_gen()](http://lore.kernel.org/bpf/20231113141207.1459002-1-houtao@huaweicloud.com/)** + +> The simple patchset aims to replace GFP_ATOMIC in bpf_event_entry_gen(). 
+> These two patches in the patchset were preparatory patches in "Fix the +> release of inner map" patchset [1] and are not needed for v2, so re-post +> it to bpf-next tree. +> + +**[v1: bpf: Get the program type by resolve_prog_type() directly](http://lore.kernel.org/bpf/20231113031812.3639430-1-xwlpt@126.com/)** + +> The bpf program type can be get by resolve_prog_type() directly. +> + +**[v4: bpf-next: Add kind layout, CRCs to BTF](http://lore.kernel.org/bpf/20231112124834.388735-1-alan.maguire@oracle.com/)** + +> Update struct btf_header to add a new "kind_layout" section containing +> a description of how to parse the BTF kinds known about at BTF +> encoding time. This provides the opportunity for tools that might +> not know all of these kinds - as is the case when older tools run +> on more newly-generated BTF - to still parse the BTF provided, +> even if it cannot all be used. +> + +**[v1: -mm: security, bpf: Fine-grained control over memory policy adjustments with lsm bpf](http://lore.kernel.org/bpf/20231112073424.4216-1-laoar.shao@gmail.com/)** + +> In our containerized environment, we've identified unexpected OOM events +> where the OOM-killer terminates tasks despite having ample free memory. +> This anomaly is traced back to tasks within a container using mbind(2) to +> bind memory to a specific NUMA node. When the allocated memory on this node +> is exhausted, the OOM-killer, prioritizing tasks based on oom_score, +> indiscriminately kills tasks. This becomes more critical with guaranteed +> tasks (oom_score_adj: -998) aggravating the issue. +> + +### Related Technology Updates + +#### Qemu + +**[v2: target/riscv: don't verify ISA compatibility for zicntr and zihpm](http://lore.kernel.org/qemu-devel/20231114123913.536194-1-chigot@adacore.com/)** + +> The extensions zicntr and zihpm were officially added in the privilege +> instruction set specification 1.12.
However, QEMU has been implemented +> them long before it and thus they are forced to be on during the cpu +> initialization to ensure compatibility (see riscv_cpu_init). +> riscv_cpu_disable_priv_spec_isa_exts was not updated when the above +> behavior was introduced, resulting in these extensions to be disabled +> after all. +> + +**[v3: Support RISC-V IOPMP](http://lore.kernel.org/qemu-devel/20231114094705.109146-1-ethan84@andestech.com/)** + +> This series implements IOPMP specification v1.0.0-draft4 rapid-k model. +> +> When IOPMP is enabled, a DMA device ATCDMAC300 is added to RISC-V virt +> platform. This DMA devce is connected to the IOPMP and has the functionalities +> required by IOPMP, including: +> - Support specify source-id (SID) +> - Support asynchronous I/O to handle stall transcations +> + +**[v1: RISC-V: Add TSO extensions (Ztso/Ssdtso)](http://lore.kernel.org/qemu-devel/20231113095605.1131443-1-christoph.muellner@vrull.eu/)** + +> This series picks up an earlier v2 Ztso patch from Palmer +> and adds a second to support Ssdtso. +> +> Palmer's v2 Ztso patch can be found here: +> https://patchwork.kernel.org/project/qemu-devel/patch/20220917072635.11616-1-palmer@rivosinc.com/ +> This patch did not apply cleanly but the necessary changes were trivial. +> There was a request to extend the commit message, which is part of the +> posted patch of this series. As this patch was reviewed a year ago, +> I believe it could be merged. +> + +#### U-Boot + +**[v3: doc: falcon: riscv: Falcon Mode boot on RISC-V](http://lore.kernel.org/u-boot/20231116130136.2796918-1-randolph@andestech.com/)** + +**[v1: risc-v: add ACPI support on QEMU](http://lore.kernel.org/u-boot/20231115142355.123038-1-heinrich.schuchardt@canonical.com/)** + +> QEMU 8.1.2 can create ACPI tables for the RISC-V architecture. +> Allow passing them through to the operating system. +> Provide a new defconfig that enables this. 
+> + +**[v1: Import "string" I/O functions from Linux](http://lore.kernel.org/u-boot/20231114110257.2488-1-ivprusov@salutedevices.com/)** + +> This series imports generic versions of ioread_rep/iowrite_rep and +> reads/writes from Linux. Some cleanup is done to make sure that all +> platforms have proper defines for implemented functions and there are no +> redefinitions. +> + + ## 20231112: Issue 68 ### Kernel Updates