diff --git a/news/README.md b/news/README.md
index 8862cd8731d1f3c48fc4a9be5a85db57896863b2..eecdb1738b7178ce5a29c49bb3a91fa8e3b2958f 100644
--- a/news/README.md
+++ b/news/README.md
@@ -1,5 +1,706 @@
 # RISC-V Linux Kernel and Related Technology News
+## 20221225: Issue 26
+
+### Kernel News
+
+#### RISC-V Architecture Support
+
+* [v5: Add OPTPROBES feature on RISCV](http://lore.kernel.org/linux-riscv/20221224114315.850130-1-chenguokai17@mails.ucas.ac.cn/)
+
+    Add jump optimization support for RISC-V.
+
+    Replaces ebreak instructions used by normal kprobes with an
+    auipc+jalr instruction pair, with the aim of suppressing the probe-hit overhead.
+
+* [v5: Allow calls in alternatives](http://lore.kernel.org/linux-riscv/20221223221332.4127602-1-heiko@sntech.de/)
+
+    For v2 I got into some sort of cleanup spree for the general instruction
+    parsing that already existed. A number of places do their own
+    instruction parsing and I tried consolidating some of them.
+
+* [v1: Add timer driver for StarFive JH7110 RISC-V SoC](http://lore.kernel.org/linux-riscv/20221223094801.181315-1-xingyu.wu@starfivetech.com/)
+
+    This patch series adds a timer driver for the StarFive JH7110
+    RISC-V SoC. The first patch adds documentation to describe the device
+    tree bindings. The subsequent patch adds the timer driver with
+    JH7110 SoC support. The last patch adds the timer device node to the JH7110 dts.
+
+* [v1: riscv: add base extensions to enum riscv_isa_ext_id](http://lore.kernel.org/linux-riscv/20221222224104.3449841-1-vineetg@rivosinc.com/)
+
+    This allows for using the enum in general to refer to any extension.
+
+* [v1: Introduce __xchg, non-atomic xchg](http://lore.kernel.org/linux-riscv/20221222114635.1251934-1-andrzej.hajda@intel.com/)
+
+    I hope there will be a place for such a tiny helper in the kernel.
+    A quick cocci analysis shows there are probably a few thousand places
+    where it could be useful.
+    I am not sure who is a good person to review/ack such patches,
+    so I've used my intuition to construct the to/cc lists; sorry for mistakes.
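The idea behind the `__xchg` entry above, a non-atomic exchange, can be pictured with a small user-space sketch. The function name `xchg_nonatomic_int` is hypothetical and fixed to `int` to keep the example portable; the macro proposed in the series is not quoted here, so this only illustrates the concept.

```c
/*
 * Hypothetical illustration of a non-atomic exchange: store a new value
 * and hand back the previous one. No atomic instruction or memory
 * barrier is involved, so the caller must already own the object
 * (for example by holding a lock).
 */
static int xchg_nonatomic_int(int *ptr, int new_val)
{
    int old = *ptr;   /* remember the current value */

    *ptr = new_val;   /* overwrite it */
    return old;       /* give the previous value back */
}
```

A caller would typically use it to take a value and reset the slot in one expression, for example `pending = xchg_nonatomic_int(&state, 0);`.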
+
+* [v2: vdso: Improve cmd_vdso_check to check all dynamic relocations](http://lore.kernel.org/linux-riscv/20221221235147.45lkqmosndritfpe@google.com/)
+
+    The actual intention is that no dynamic relocation exists. However, some
+    GNU ld ports produce unneeded R_*_NONE. (If a port fails to determine
+    the exact .rel[a].dyn size, the trailing zeros become R_*_NONE
+    relocations. E.g. ld's powerpc port recently fixed
+    https://sourceware.org/bugzilla/show_bug.cgi?id=29540) R_*_NONE are
+    generally no-op in the dynamic loaders. So just ignore them.
+
+* [v1: dt-bindings: riscv: add SBI PMU event mappings](http://lore.kernel.org/linux-riscv/20221221141548.274408-1-conor@kernel.org/)
+
+    The SBI PMU extension requires a firmware to be aware of the event to
+    counter/mhpmevent mappings supported by the hardware. OpenSBI may use
+    DeviceTree to describe the PMU mappings. This binding is currently
+    described in markdown in OpenSBI (since v1.0 in Dec 2021) & used by QEMU since v7.2.0.
+
+* [v13: Microchip Soft IP corePWM driver](http://lore.kernel.org/linux-riscv/20221221112912.147210-1-conor@kernel.org/)
+
+    I was wrong about the behaviour of the sync-update bit:
+    It /does not/ get cleared at the start of a new period. I did actually
+    modify the driver to do a read_poll_timeout() on that bit which seemed
+    to work [0], but it turns out that that bit holds its value until the
+    IP block is reset. I'm really not sure how it worked when I tested the other week...
+
+* [v1: hwrng: starfive - Add driver for TRNG module](http://lore.kernel.org/linux-riscv/20221221090819.1259443-1-jiajie.ho@starfivetech.com/)
+
+    This patch series adds kernel support for the StarFive hardware random
+    number generator. The first 2 patches add binding documentation and
+    the device driver for this module. Patch 3 adds the devicetree entry for the VisionFive v2 SoC.
+
+* [v1: About adding new Z extensions in ISA realization](http://lore.kernel.org/linux-riscv/20221220120236.219804-1-ruinland.tsai@sifive.com/)
+
+    Currently we have plenty of ratified extensions left un-added, and some of them are already being used frequently, such as the B extension family, namely zba, zbb, zbc, and zbs.
+
+    Being unsure whether I have done it the right way, I wrote this RFC for adding those extensions.
+
+* [v1: RFC PATCH RESEND bpf-next: Support bpf trampoline for RV64](http://lore.kernel.org/linux-riscv/20221220021319.1655871-1-pulehui@huaweicloud.com/)
+
+    BPF trampoline is the critical infrastructure of the bpf subsystem, acting as a mediator between kernel functions and BPF programs. Numerous important features, such as using ebpf programs for zero overhead kernel introspection, rely on this key component. We can't wait to support bpf trampoline on RV64. The implementation of bpf trampoline closely follows x86 and arm64 for future development.
+
+* [v3: Basic device tree support for StarFive JH7110 RISC-V SoC](http://lore.kernel.org/linux-riscv/20221220011247.35560-1-hal.feng@starfivetech.com/)
+
+    This patch series adds basic device tree support for StarFive JH7110 SoC.
+    This patch series depends on series [1] and [2]. You can simply get or
+    review the patches at the link [3].
+
+* [v3: Basic pinctrl support for StarFive JH7110 RISC-V SoC](http://lore.kernel.org/linux-riscv/20221220005529.34744-1-hal.feng@starfivetech.com/)
+
+    This patch series adds basic pinctrl support for StarFive JH7110 SoC.
+    You can simply get or review the patches at the link [1].
+
+* [v2: Add watchdog driver for StarFive JH7110 RISC-V SoC](http://lore.kernel.org/linux-riscv/20221219094233.179153-1-xingyu.wu@starfivetech.com/)
+
+    This patch series adds a watchdog driver for the StarFive JH7110
+    RISC-V SoC. The first patch adds documentation to describe the device
+    tree bindings. The subsequent patch adds the watchdog driver with
+    JH7110 SoC support. The last patch adds the watchdog device node to the JH7110 dts.
+
+#### Memory Management
+
+* [v1: mm/thp: check and bail out if page in deferred queue already](http://lore.kernel.org/linux-mm/20221223135207.2275317-1-fengwei.yin@intel.com/)
+
+    Kernel build regression with LLVM was reported here:
+    https://lore.kernel.org/all/Y1GCYXGtEVZbcv%2F5@dev-arch.thelio-3990X/
+    with commit f35b5d7d676e ("mm: align larger anonymous mappings on THP
+    boundaries"). And the commit f35b5d7d676e was reverted.
+
+* [v1: virtio_balloon: high order allocation](http://lore.kernel.org/linux-mm/20221223093527.12424-1-the.latticeheart@gmail.com/)
+
+    At present, the VirtIO balloon device driver allocates pages
+    one by one using alloc_page(), and frees them using put_page().
+
+    The effect of this change has been confirmed by benchmarks that measure the elapsed time of inflation and deflation.
+
+* [v1: mm/MADV_COLLAPSE: don't expand collapse when vm_end is past requested end](http://lore.kernel.org/linux-mm/20221223003953.2795313-1-zokeefe@google.com/)
+
+    MADV_COLLAPSE acts on one hugepage-aligned/sized region at a time, until
+    it has collapsed all eligible memory contained within the bounds
+    supplied by the user.
+
+    At the top of each hugepage iteration we (re)lock mmap_lock and
+    (re)validate the VMA for eligibility and update variables that might
+    have changed while mmap_lock was dropped.
+
+* [v1: mm/shmem: restore SHMEM_HUGE_DENY precedence over MADV_COLLAPSE](http://lore.kernel.org/linux-mm/20221223003833.2793963-1-zokeefe@google.com/)
+
+    SHMEM_HUGE_DENY is for emergency use by the admin, to disable allocation
+    of shmem huge pages if, for example, a dangerous bug is found in their
+    usage: see "deny" in Documentation/mm/transhuge.rst. An app using
+    madvise(,,MADV_COLLAPSE) should not be allowed to override it: restore
+    its precedence over shmem_huge_force.
+
+* [v3: mm-unstable: mm: multi-gen LRU: memcg LRU](http://lore.kernel.org/linux-mm/20221222041905.2431096-1-yuzhao@google.com/)
+
+    A memcg LRU is a per-node LRU of memcgs. It is also an LRU of LRUs,
+    since each node and memcg combination has an LRU of folios (see
+    mem_cgroup_lruvec()).
+
+    Its goal is to improve the scalability of global reclaim, which is
+    critical to system-wide memory overcommit in data centers. Note that
+    memcg reclaim is currently out of scope.
+
+* [v1: mm: Move FOLL_* defs to mm_types.h](http://lore.kernel.org/linux-mm/2161258.1671657894@warthog.procyon.org.uk/)
+
+    Is it too late to ask you to add this to the current merge window? It just
+    moves the FOLL_* flags between headers, flipping the order of the banner
+    comment and the defs.
+
+* [v2: mm: new primitive kvmemdup()](http://lore.kernel.org/linux-mm/20221221144245.27164-1-sunhao.th@gmail.com/)
+
+    Similar to kmemdup(), but supports large amounts of bytes with kvmalloc()
+    and does *not* guarantee that the result will be physically contiguous.
+    Use only in cases where kvmalloc() is needed and free it with kvfree().
+    Also adapt policy_unpack.c in case someone bisects into this.
+
+* [v1: ipc/mqueue: introduce msg cache](http://lore.kernel.org/linux-mm/20221220184813.1908318-1-roman.gushchin@linux.dev/)
+
+    Sven Luther reported a regression in the posix message queues
+    performance caused by switching to the per-object tracking of
+    slab objects introduced by patch series ending with the commit 10befea91b61 ("mm: memcg/slab: use a single set of kmem_caches for all allocations").
+
+* [v1: mm: kmem: optimize obj_cgroup pointer retrieval](http://lore.kernel.org/linux-mm/20221220182745.1903540-1-roman.gushchin@linux.dev/)
+
+    This patchset improves the performance of get_obj_cgroup_from_current(), which
+    is used to get an objcg pointer on the kernel memory allocation fast path.
+
+* [v1: mm: implement granular soft-dirty vma support](http://lore.kernel.org/linux-mm/20221220162606.1595355-1-usama.anjum@collabora.com/)
+
+    The VM_SOFTDIRTY is used to mark a whole VMA soft-dirty. Sometimes
+    soft-dirty and non-soft-dirty VMAs are merged making the non-soft-dirty
+    region soft-dirty. This creates problems as the whole VMA region comes
+    out to be soft-dirty while in reality there is no soft-dirty page.
+
+* [v3: Introduce Copy-On-Write to Page Table](http://lore.kernel.org/linux-mm/20221220072743.3039060-1-shiyn.lin@gmail.com/)
+
+    RFC v2 -> v3
+    - Change the sysctl with PID to prctl(PR_SET_COW_PTE).
+    - Account all the COW PTE mapped pages in fork() instead of deferring it to
+      page fault (break COW PTE).
+    - If there is an unshareable mapped page (maybe pinned or private
+      device), recover all the entries that are already handled by COW PTE
+      fork, then copy to the new one.
+    - Remove COW_PTE_OWNER_EXCLUSIVE flag and handle the only case of GUP,
+      follow_pfn_pte().
+    - Remove the PTE ownership since we don't need it.
+
+* [v4: kasan: allow sampling page_alloc allocations for HW_TAGS](http://lore.kernel.org/linux-mm/129da0614123bb85ed4dd61ae30842b2dd7c903f.1671471846.git.andreyknvl@google.com/)
+
+    As Hardware Tag-Based KASAN is intended to be used in production, its
+    performance impact is crucial. As page_alloc allocations tend to be big,
+    tagging and checking all such allocations can introduce a significant slowdown.
+
+* [v1: mm-stable: mm/nommu: don't use VM_MAYSHARE for MAP_PRIVATE mappings](http://lore.kernel.org/linux-mm/20221219163013.259423-1-david@redhat.com/)
+
+    Trying to reduce the confusion around VM_SHARED and VM_MAYSHARE first
+    requires !CONFIG_MMU to stop using VM_MAYSHARE for MAP_PRIVATE mappings.
+    CONFIG_MMU only sets VM_MAYSHARE for MAP_SHARED mappings.
+
+* [v1: Introduce cmpxchg128() -- aka. the demise of cmpxchg_double()](http://lore.kernel.org/linux-mm/20221219153525.632521981@infradead.org/)
+
+    Since Linus hated on cmpxchg_double(), a few patches to get rid of it, as proposed here:
+
+    https://lkml.kernel.org/r/Y2U3WdU61FvYlpUh@hirez.programming.kicks-ass.net
+
+    based on tip/master because Linus' tree is moving a wee bit fast at the moment.
+
+    0day robot is all green for building, very limited testing on arm64/s390
+    for obvious raisins -- I tried to get the asm right, but please, double
+    check.
+
+* [v8: -next:arm64: add machine check safe support](http://lore.kernel.org/linux-mm/20221219120008.3818828-1-tongtiangen@huawei.com/)
+
+    With the increase of memory capacity and density, the probability of
+    memory error increases. The increasing size and density of server RAM
+    in the data center and cloud have shown increased uncorrectable memory errors.
+
+    Currently, the kernel has a mechanism to recover from hardware memory errors. This patchset provides a new recovery mechanism.
+
+* [v3: move PG_slab flag to page_type](http://lore.kernel.org/linux-mm/20221218101901.373450-1-42.hyeyoo@gmail.com/)
+
+    RFC v2:
+    https://lore.kernel.org/linux-mm/20221106140355.294845-1-42.hyeyoo@gmail.com/
+
+    This patch series moves PG_slab page flag to page_type,
+    freeing one bit in page->flags and introduces %pGt format
+    that prints human-readable page_type like %pGp for printing page flags.
+
+#### File Systems
+
+* [v2: fsverity: support for non-4K pages](http://lore.kernel.org/linux-fsdevel/20221223203638.41293-1-ebiggers@kernel.org/)
+
+    [This patchset applies to mainline + some fsverity cleanups I sent out
+    recently. You can get everything from tag "fsverity-non4k-v2" of
+    https://git.kernel.org/pub/scm/fs/fscrypt/fscrypt.git ]
+
+    Currently, filesystems (ext4, f2fs, and btrfs) only support fsverity
+    when the Merkle tree block size, filesystem block size, and page size
+    are all the same. In practice that means 4K, since increasing the page
+    size, e.g. to 16K, forces the Merkle tree block size and filesystem
+    block size to be increased accordingly. That can be impractical; for
+    one, users want the same file signatures to work on all systems.
+
+* [v1: eventfd: use a generic helper instead of an open coded wait_event](http://lore.kernel.org/linux-fsdevel/tencent_B38979DE0FF3B9B3EA887A37487B123BBD05@qq.com/)
+
+    Use wait_event_interruptible_locked_irq() in the eventfd_{write,read} to
+    avoid the longer, open coded equivalent.
+
+* [v1: mm: Move FOLL_* defs to mm_types.h](http://lore.kernel.org/linux-fsdevel/2161258.1671657894@warthog.procyon.org.uk/)
+
+    Is it too late to ask you to add this to the current merge window? It just
+    moves the FOLL_* flags between headers, flipping the order of the banner
+    comment and the defs.
+
+* [v4: fs/ufs: Replace kmap() with kmap_local_page](http://lore.kernel.org/linux-fsdevel/20221221172802.18743-1-fmdefrancesco@gmail.com/)
+
+    kmap() is being deprecated in favor of kmap_local_page().
+
+    There are two main problems with kmap(): (1) It comes with an overhead as
+    the mapping space is restricted and protected by a global lock for
+    synchronization and (2) it also requires global TLB invalidation when the
+    kmap’s pool wraps and it might block when the mapping space is fully
+    utilized until a slot becomes available.
+
+* [v4: epoll: use refcount to reduce ep_mutex contention](http://lore.kernel.org/linux-fsdevel/9d8ad7995e51ad3aecdfe6f7f9e72231b8c9d3b5.1671569682.git.pabeni@redhat.com/)
+
+    The application is multi-threaded, creates a new epoll entry for
+    each incoming connection, and does not delete it before the
+    connection shutdown - that is, before the connection's fd close().
+
+    Many different threads compete frequently for the epmutex lock,
+    affecting the overall performance.
+
+* [v1: fs: add macro when api not used](http://lore.kernel.org/linux-fsdevel/20221220072858.32439-1-lingfuyi@126.com/)
+
+    When CONFIG_ELF_CORE is not defined, dump_emit_page(), which is only used in
+    dump_user_range(), is left without a caller and causes an error like this:
+
+    fs/coredump.c:841:12: error: ‘dump_emit_page’ defined but not used
+    [-Werror=unused-function]
+    static int dump_emit_page(struct coredump_params *cprm, struct page *page)
+
+* [v1: pnode: terminate for peers](http://lore.kernel.org/linux-fsdevel/20221219175230.716541-1-brauner@kernel.org/)
+
+    The propagate_mnt() function handles mount propagation when creating mounts
+    and propagates the source mount tree headed by @source_mnt to the destination
+    propagation mount tree @dest_mnt. Unfortunately it contains a bug where it
+    fails to terminate correctly and causes a NULL dereference.
+
+* [v3: nsfs: add compat ioctl handler](http://lore.kernel.org/linux-fsdevel/20221214-nsfs-ioctl-compat-v3-1-b7f0eb7ccdd0@weissschuh.net/)
+
+    As all parameters and return values of the ioctls have the same
+    representation on both 32bit and 64bit we can reuse the normal ioctl
+    handler for the compat handler via compat_ptr_ioctl().
+
+* [v4: Turn iomap_page_ops into iomap_folio_ops](http://lore.kernel.org/linux-fsdevel/20221218221054.3946886-1-agruenba@redhat.com/)
+
+    Here's an updated version that changes iomap_folio_prepare() to return
+    an ERR_PTR() instead of NULL when the folio cannot be obtained as
+    suggested by Matthew Wilcox.
+
+* [v1: Improve 9p performance for read operations](http://lore.kernel.org/linux-fsdevel/20221217185210.1431478-1-evanhensbergen@icloud.com/)
+
+    This patch series adds a number of features to improve read/write
+    performance in the 9p filesystem. Mostly it is focused on fixing
+    readahead caching to help utilize the recently increased MSIZE
+    limits, but there are also some fixes for writeback caches in the
+    presence of readahead and/or mmap operations.
+
+* [v3: fs/ufs: replace kmap() with kmap_local_page](http://lore.kernel.org/linux-fsdevel/20221217184749.968-1-fmdefrancesco@gmail.com/)
+
+    kmap() is being deprecated in favor of kmap_local_page().
+
+    There are two main problems with kmap(): (1) It comes with an overhead as
+    the mapping space is restricted and protected by a global lock for
+    synchronization and (2) it also requires global TLB invalidation when the
+    kmap’s pool wraps and it might block when the mapping space is fully
+    utilized until a slot becomes available.
+
+* [v1: blk: optimization for classic polling](http://lore.kernel.org/linux-fsdevel/3578876466-3733-1-git-send-email-nj.shetty@samsung.com/)
+
+    This removes the dependency on interrupts to wake up the task. Set task
+    state as TASK_RUNNING, if need_resched() returns true,
+    while polling for IO completion.
+    Earlier, the polling task used to sleep, relying on an interrupt to wake it up.
+    This made some IO take very long when interrupt-coalescing is enabled in NVMe.
+
+#### Network Devices
+
+* [v1: net: hns3: refine the handling for VF heartbeat](http://lore.kernel.org/netdev/20221224103203.5652-1-lanhao@huawei.com/)
+
+    Currently, the PF checks whether the VF is alive via the KEEP_ALIVE
+    mailbox from the VF. The VF keeps sending the mailbox every 2
+    seconds. Once the PF has lost the mailbox for more than 8
+    seconds, it regards the VF as abnormal and stops
+    notifying the VF of state changes, including link state,
+    VF MAC, and reset, even if it receives the KEEP_ALIVE
+    mailbox again. This is unreasonable.
+
+* [v1: net-wan: Add check for NULL for utdm in ucc_hdlc_probe](http://lore.kernel.org/netdev/20221223143225.23153-1-eesina@astralinux.ru/)
+
+    If uhdlc_priv_tsa != 1 then utdm is not initialized.
+    And if ret != NULL then goto undo_uhdlc_init, where utdm is dereferenced.
+    Same if dev == NULL.
+
+    Found by Linux Verification Center (linuxtesting.org) with SVACE.
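The hazard described in the `ucc_hdlc_probe` entry above, a shared error label dereferencing a pointer that was only set up on one branch, can be sketched in plain C. All names here (`struct tdm`, `tdm_release`, `probe`) are hypothetical and do not reproduce the driver code; the sketch only shows the pattern of initialising the pointer and checking it for NULL on the cleanup path.

```c
#include <stdlib.h>

struct tdm { int *buf; };

/* Cleanup helper that is safe to call when `t` was never set up. */
static void tdm_release(struct tdm *t)
{
    if (!t)              /* the NULL check the error path needs */
        return;
    free(t->buf);
    free(t);
}

/*
 * Hypothetical probe: `t` is only allocated when `use_tdm` is set, yet
 * every failure jumps to the same cleanup label.
 */
static int probe(int use_tdm, int fail_late)
{
    struct tdm *t = NULL;   /* initialised so the cleanup can test it */
    int ret = 0;

    if (use_tdm) {
        t = calloc(1, sizeof(*t));
        if (!t)
            return -1;
    }
    if (fail_late) {
        ret = -1;
        goto undo;
    }
    /* A real probe would hand `t` to the device; the sketch just releases it. */
undo:
    tdm_release(t);
    return ret;
}
```

Calling `probe(0, 1)` exercises the case from the entry: the optional object was never created, the late failure still reaches the cleanup label, and the NULL check keeps the release harmless.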
+
+* [v1: net-next: icmp: Add counters for rate limits](http://lore.kernel.org/netdev/12d652c903f1d67434b683606cf3f5f0f9df861a.1671801634.git.jamie.bainbridge@gmail.com/)
+
+    There are multiple ICMP rate limiting mechanisms:
+
+    * Global limits: net.ipv4.icmp_msgs_burst/icmp_msgs_per_sec
+    * v4 per-host limits: net.ipv4.icmp_ratelimit/ratemask
+    * v6 per-host limits: net.ipv6.icmp_ratelimit/ratemask
+
+    However, when ICMP output is limited, there is no way to tell
+    which limit has been hit or even if the limits are responsible
+    for the lack of ICMP output.
+
+* [v3: Introduce ICSSG based ethernet Driver](http://lore.kernel.org/netdev/20221223110930.1337536-1-danishanwar@ti.com/)
+
+    The Programmable Real-time Unit and Industrial Communication Subsystem
+    Gigabit (PRU_ICSSG) is a low-latency microcontroller subsystem in the TI
+    SoCs. This subsystem is provided for use cases like the implementation of
+    custom peripheral interfaces, offloading of tasks from the other
+    processor cores of the SoC, etc.
+
+* [v1: net-next: wlcore: use strscpy() to instead of strncpy()](http://lore.kernel.org/netdev/202212231100187262916@zte.com.cn/)
+
+    The implementation of strscpy() is more robust and safer.
+    That's now the recommended way to copy NUL-terminated strings.
+
+* [v1: softirq: uncontroversial change](http://lore.kernel.org/netdev/20221222221244.1290833-1-kuba@kernel.org/)
+
+    Catching up on LWN I ran across the article about softirq
+    changes, and then I noticed fresh patches in Peter's tree.
+    So probably wise for me to throw these out there.
+
+    My (can I say Meta's?) problem is the opposite to what the RT
+    sensitive people complain about. In the current scheme once
+    ksoftirqd is woken no network processing happens until it runs.
+
+* [v1: next: sysctl: expose all net/core sysctls inside netns](http://lore.kernel.org/netdev/20221222191005.71787-1-maheshb@google.com/)
+
+    All were not visible to the non-priv users inside netns. However,
+    with 4ecb90090c84 ("sysctl: allow override of /proc/sys/net with
+    CAP_NET_ADMIN"), these vars are protected from getting modified.
+    A proc with capable(CAP_NET_ADMIN) can change the values so
+    not having them visible inside netns is just causing nuisance to
+    processes that check certain values (e.g. net.core.somaxconn) and
+    see different behavior in root-netns vs. other-netns.
+
+* [v7: vmxnet3: Add XDP support.](http://lore.kernel.org/netdev/20221222154648.21497-1-u9012063@gmail.com/)
+
+    The patch adds native-mode XDP support: XDP DROP, PASS, TX, and REDIRECT.
+
+    The receive side of XDP is implemented for case A and B, by invoking the
+    bpf program at vmxnet3_rq_rx_complete and handling its returned action.
+    The new vmxnet3_run_xdp function handles the difference of using dataring
+    or ring0, and decides the next journey of the packet afterward.
+
+* [v1: net: dsa: mv88e6xxx: depend on PTP conditionally](http://lore.kernel.org/netdev/20221222143405.1304900-1-foss@jsl.io/)
+
+    PTP hardware timestamping related objects are not linked when PTP
+    support for MV88E6xxx (NET_DSA_MV88E6XXX_PTP) is disabled, therefore
+    NET_DSA_MV88E6XXX should not depend on PTP_1588_CLOCK_OPTIONAL
+    regardless of NET_DSA_MV88E6XXX_PTP.
+
+* [v3: net: qlcnic: prevent ->dcb use-after-free on qlcnic_dcb_enable() failure](http://lore.kernel.org/netdev/20221222115228.1766265-1-d-tatianin@yandex-team.ru/)
+
+    adapter->dcb would get silently freed inside qlcnic_dcb_enable() in
+    case qlcnic_dcb_attach() would return an error, which always happens
+    under OOM conditions. This would lead to use-after-free because both
+    of the existing callers invoke qlcnic_dcb_get_info() on the obtained
+    pointer, which is potentially freed at that point.
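The use-after-free described in the qlcnic entry above follows a common shape: a helper frees an object on failure while its owner keeps the stale pointer. A minimal sketch of one way to avoid that is shown here, with hypothetical names (`struct adapter`, `adapter_dcb_enable`, `adapter_dcb_get_info`) that are not taken from the driver; whether the actual patch resolves it this way is not stated in the entry.

```c
#include <stdlib.h>

struct dcb { int enabled; };
struct adapter { struct dcb *dcb; };

/* Hypothetical attach step; `fail` simulates an out-of-memory error. */
static int dcb_attach(struct dcb *d, int fail)
{
    if (fail)
        return -1;
    d->enabled = 1;
    return 0;
}

/*
 * On failure the object is freed AND the owner's pointer is cleared,
 * so later code sees NULL instead of a dangling pointer.
 */
static int adapter_dcb_enable(struct adapter *a, int fail)
{
    int err = dcb_attach(a->dcb, fail);

    if (err) {
        free(a->dcb);
        a->dcb = NULL;
    }
    return err;
}

/* Callers must tolerate a NULL dcb before querying it. */
static int adapter_dcb_get_info(const struct adapter *a)
{
    return a->dcb ? a->dcb->enabled : 0;
}
```

The design point is that the free and the pointer reset happen in the same place, and the failure is reported to the caller instead of being swallowed.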
+
+* [v1: net: fec: Refactor: rename `adapter` to `fep`](http://lore.kernel.org/netdev/20221222094951.11234-1-csokas.bence@prolan.hu/)
+
+    Commit 01b825f reverted a style fix, which renamed
+    `struct fec_enet_private *adapter` to `fep` to match
+    the rest of the driver. This commit factors out that style fix.
+
+* [v1: virtio-net: don't busy poll for cvq command](http://lore.kernel.org/netdev/20221222060427.21626-1-jasowang@redhat.com/)
+
+    The code used to busy poll for cvq command which turns out to have
+    several side effects:
+
+    1) infinite poll for buggy devices
+    2) bad interaction with scheduler
+
+    So this series tries to use sleep + timeout instead of busy polling.
+
+    Please review.
+
+* [v2: ethtool-next: add netlink support for rss get](http://lore.kernel.org/netdev/20221222001343.1220090-1-sudheer.mogilappagari@intel.com/)
+
+    These patches add netlink based handler to fetch RSS information using "ethtool -x
+
+* [v1: virtio_net: send notification coalescing command only if value changed](http://lore.kernel.org/netdev/20221221120618.652074-1-alvaro.karsz@solid-run.com/)
+
+    Don't send a VIRTIO_NET_CTRL_NOTF_COAL_TX_SET or
+    VIRTIO_NET_CTRL_NOTF_COAL_RX_SET command if the coalescing parameters
+    haven't changed.
+
+* [v3: usbnet: optimize usbnet_bh() to reduce CPU load](http://lore.kernel.org/netdev/20221221075924.1141346-1-lsahn@ooseel.net/)
+
+    The current source pushes skb into dev->done queue by calling
+    skb_queue_tail() and then pops it by calling skb_dequeue() to branch to
+    rx_cleanup state for freeing urb/skb in usbnet_bh(). It takes extra CPU
+    load, 2.21% (skb_queue_tail) as follows.
+
+* [v1: net/ncsi: Add NC-SI 1.2 Get MC MAC Address command](http://lore.kernel.org/netdev/20221221052246.519674-1-peter@pjd.dev/)
+
+    NC-SI 1.2 isn't officially released yet, but the DMTF takes way too long
+    to finalize stuff, and there's hardware out there that actually supports
+    this command (Just the Broadcom 200G NIC afaik).
+
+* [v5: bpf-next: xdp: hints via kfuncs](http://lore.kernel.org/netdev/20221220222043.3348718-1-sdf@google.com/)
+
+    Please see the first patch in the series for the overall design and use-cases.
+
+* [v1: net: openvswitch: release vport resources on failure](http://lore.kernel.org/netdev/20221220212717.526780-1-aconole@redhat.com/)
+
+    A recent commit introducing upcall packet accounting failed to properly
+    release the vport object when the per-cpu stats struct couldn't be
+    allocated. This can cause dangling pointers to dp objects long after
+    they've been released.
+
+* [v1: treewide: Convert del_timer*() to timer_shutdown*()](http://lore.kernel.org/netdev/20221220134519.3dd1318b@gandalf.local.home/)
+
+    Due to several bugs caused by timers being re-armed after they are
+    shut down and just before they are freed, a new state of timers was added called "shutdown". After a timer is set to this state, it can no longer be re-armed.
+
+#### Security Hardening
+
+* [v1: bpf: Always use maximal size for copy_array()](http://lore.kernel.org/linux-hardening/20221223182836.never.866-kees@kernel.org/)
+
+    Instead of counting on prior allocations to have sized allocations to
+    the next kmalloc bucket size, always perform a krealloc that is at least
+    ksize(dst) in size (which is a no-op), so the size can be correctly
+    tracked by all the various allocation size trackers (KASAN,
+    __alloc_size, etc).
+
+* [v1: next: i915/gvt: Replace one-element array with flexible-array member](http://lore.kernel.org/linux-hardening/Y6Eu2604cqtryP4g@mail.google.com/)
+
+    One-element arrays are deprecated, and we are replacing them with
+    flexible array members instead. So, replace one-element array with
+    flexible-array member in struct gvt_firmware_header and refactor the
+    rest of the code accordingly.
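The conversion named in the i915/gvt entry above, from a one-element array to a flexible-array member, can be illustrated with a small user-space sketch. The structs and the helper `fw_header_alloc` are hypothetical and do not mirror the layout of `struct gvt_firmware_header`; they only show why the size arithmetic becomes simpler.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Deprecated style: a one-element array stands in for a variable-length tail. */
struct fw_header_old {
    unsigned int len;
    unsigned char data[1];
};

/* Preferred style: a C99 flexible-array member. */
struct fw_header {
    unsigned int len;
    unsigned char data[];
};

/*
 * Allocate a header followed by `len` payload bytes. With a
 * flexible-array member, sizeof(*h) covers only the fixed part, so no
 * "minus one element" correction is needed in the size computation.
 */
static struct fw_header *fw_header_alloc(const unsigned char *payload,
                                         unsigned int len)
{
    struct fw_header *h = malloc(sizeof(*h) + (size_t)len * sizeof(h->data[0]));

    if (!h)
        return NULL;
    h->len = len;
    if (len)
        memcpy(h->data, payload, len);
    return h;
}
```

With the old layout the same allocation would have to subtract the placeholder element by hand, which is the kind of refactoring "the rest of the code" refers to.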
+
+* [v1: Add compiler support for Control Flow Integrity](http://lore.kernel.org/linux-hardening/20221219055431.22596-1-ashimida.1990@gmail.com/)
+
+    This series of patches is mainly used to support the control flow
+    integrity protection of the linux kernel [1], which is similar to
+    -fsanitize=kcfi in clang 16.0 [2,3].
+
+    I hope that this feature will also support user-mode CFI in the
+    future (at least for developers who can recompile the runtime),
+    so I use -fsanitize=cfi as a compilation option here.
+
+#### Asynchronous I/O
+
+* [v1: io_uring: wake up optimisations](http://lore.kernel.org/io-uring/81104db1a04efbfcec90f5819081b4299542671a.1671559005.git.asml.silence@gmail.com/)
+
+    NOT FOR INCLUSION, needs some ring poll workarounds
+
+    Flushing completions is done either from the submit syscall or by the
+    task_work, both are in the context of the submitter task, and when it
+    goes for single threaded rings like implied by ->task_complete, there
+    won't be any waiters on ->cq_wait but the master task.
+
+* [v1: MAINTAINERS: io_uring: Add include/trace/events/io_uring.h](http://lore.kernel.org/io-uring/20221219164521.2481728-1-ammar.faizi@intel.com/)
+
+    This header file was introduced in commit c826bd7a743f ("io_uring: add
+    set of tracing events"). It didn't get added to the io_uring
+    maintainers section. Add this header file to the io_uring maintainers section.
+
+* [v2: io_uring/net: ensure compat import handlers clear free_iov](http://lore.kernel.org/io-uring/1fcaa6f3-6dc7-0685-1cb3-3b1179409609@kernel.dk/)
+
+    If we're not allocating the vectors because the count is below
+    UIO_FASTIOV, we still do need to properly clear ->free_iov to prevent
+    an erroneous free of on-stack data.
+
+#### Rust For Linux
+
+* [v1: bpf: scripts: Exclude Rust CUs with pahole](http://lore.kernel.org/rust-for-linux/20221220203901.1333304-1-yakoyoku@gmail.com/)
+
+    Version 1.24 of pahole has the capability to exclude compilation units
+    (CUs) of specific languages. Rust, as of writing, is not currently
+    supported by pahole and if it's used with a build that has BTF debugging
+    enabled it results in malformed kernel and module binaries (see
+    Rust-for-Linux/linux#735). So it's better for pahole to exclude Rust
+    CUs until support for it arrives.
+
+#### BPF
+
+* [v1: libbpf: Added the description of some API functions](http://lore.kernel.org/bpf/20221224112058.12038-1-liuxin350@huawei.com/)
+
+    Currently, many API functions are not described in the document.
+    I have tried to add the API description of the following four API
+    functions:
+    libbpf_set_print
+    bpf_object__open
+    bpf_object__load
+    bpf_object__close
+
+* [v2: bpf: restore the ebpf program ID for BPF_AUDIT_UNLOAD and PERF_BPF_EVENT_PROG_UNLOAD](http://lore.kernel.org/bpf/20221223185531.222689-1-paul@paul-moore.com/)
+
+    When changing the ebpf program put() routines to support being called
+    from within IRQ context the program ID was reset to zero prior to
+    calling the perf event and audit UNLOAD record generators, which
+    resulted in problems as the ebpf program ID was bogus (always zero).
+
+    I also modified the bpf_audit_prog() logic used to associate the
+    AUDIT_BPF record with other associated records, e.g. @ctx != NULL.
+    Instead of keying off the operation, it now keys off the execution
+    context, e.g. '!in_irq() && !irqs_disabled()', which is much more
+    appropriate and should help better connect the UNLOAD operations with
+    the associated audit state (other audit records).
+
+* [v1: bpf-next: BPF verifier state equivalence checks improvements](http://lore.kernel.org/bpf/20221223054921.958283-1-andrii@kernel.org/)
+
+    Patches #2-#7 refactor regsafe() function which compares two register states
+    across old and current states. regsafe() is a critical piece of logic, so to
+    make it easier to review and validate refactorings and logic fixes and
+    improvements, each patch makes a small change, explaining why the change is
+    correct and makes sense. Please see individual patches for details.
+
+* [v1: bpf-next: selftests/bpf: Add host-tools to gitignore](http://lore.kernel.org/bpf/20221222213958.2302320-1-sdf@google.com/)
+
+    Shows up when cross-compiling:
+
+    HOST_SCRATCH_DIR := $(OUTPUT)/host-tools
+
+    vs
+
+    SCRATCH_DIR := $(OUTPUT)/tools
+    HOST_SCRATCH_DIR := $(SCRATCH_DIR)
+
+* [v1: bpf: restore the ebpf audit UNLOAD id field](http://lore.kernel.org/bpf/20221222001343.489117-1-paul@paul-moore.com/)
+
+    I also modified the bpf_audit_prog() logic used to associate the
+    AUDIT_BPF record with other associated records, e.g. @ctx != NULL.
+    Instead of keying off the operation, it now keys off the execution
+    context, e.g. '!in_irq() && !irqs_disabled()', which is much more
+    appropriate and should help better connect the UNLOAD operations with
+    the associated audit state (other audit records).
+
+* [v4: bpf: selftests/bpf: Test bpf_skb_adjust_room on CHECKSUM_PARTIAL](http://lore.kernel.org/bpf/20221221185653.1589961-1-martin.lau@linux.dev/)
+
+    When the bpf_skb_adjust_room() shrinks the skb such that
+    its csum_start is invalid, the skb->ip_summed should
+    be reset from CHECKSUM_PARTIAL to CHECKSUM_NONE.
+
+    This patch adds a test to ensure the skb->ip_summed changed
+    from CHECKSUM_PARTIAL to CHECKSUM_NONE after bpf_skb_adjust_room().
+
+* [v1: bpf-next: libbpf: start v1.2 development cycle](http://lore.kernel.org/bpf/20221221180049.853365-1-andrii@kernel.org/)
+
+    Bump current version for new development cycle to v1.2.
+
+* [v1: bpf-next: selftests/bpf: move struct definitions out of function params](http://lore.kernel.org/bpf/20221221055856.2786043-1-james.hilliard1@gmail.com/)
+
+    Anonymous structs can't be declared inside function parameter
+    definitions in current C standards, however clang doesn't detect this
+    condition currently while GCC does.
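The fix direction named in the selftests entry above, moving struct definitions out of function parameter lists, can be sketched as follows. The type `struct pair` and the function `sum_pair` are hypothetical and unrelated to the selftests code; the explanation in the comment reflects general C scoping rules, not wording from the patch.

```c
/*
 * Problematic form (shown only as a comment): a struct defined inside
 * the parameter list is visible only within that declaration, so
 * callers cannot name a compatible type.
 *
 *     int sum_pair(struct { int a; int b; } *p);
 */

/* Portable form: define the struct first, then refer to it by name. */
struct pair {
    int a;
    int b;
};

static int sum_pair(const struct pair *p)
{
    return p->a + p->b;
}
```

Because `struct pair` now has file scope, both the function and its callers refer to the same type, and compilers agree on the declaration.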
+
+* [v3: bpf-next: bpf: Reduce smap->elem_size](http://lore.kernel.org/bpf/20221221013036.3427431-1-martin.lau@linux.dev/)
+
+    'struct bpf_local_storage_elem' has an unused 56-byte padding at the
+    end due to the struct's cache-line alignment requirement. This padding
+    space is overlapped by storage value contents, so if we use sizeof()
+    to calculate the total size, we overinflate it by 56 bytes. Use
+    offsetof() instead to calculate more exact memory use.
+
+* [v1: bpf-next: bpf: Allow access to perf sample data (v2)](http://lore.kernel.org/bpf/20221220220144.4016213-1-namhyung@kernel.org/)
+
+    I'm working on perf event sample filtering using BPF. To do that, BPF needs
+    to access perf sample data and return 0 or 1 to drop or keep the samples.
+
+* [v2: virtio_net: support multi buffer xdp](http://lore.kernel.org/bpf/20221220141449.115918-1-hengqi@linux.alibaba.com/)
+
+    Currently, virtio net only supports xdp for single-buffer packets
+    or linearized multi-buffer packets. This patchset supports xdp for
+    multi-buffer packets, so a larger MTU can be used if xdp sets
+    xdp.frags. This does not affect single-buffer handling.
+
+* [v1: perf lock contention: Add more filter options (v1)](http://lore.kernel.org/bpf/20221219201732.460111-1-namhyung@kernel.org/)
+
+    This patchset adds a couple of filters to the perf lock contention command.
+
+    The -Y/--type-filter option filters by lock type, such as spinlock or mutex.
+
+* [v1: bpf-next: xdp: introduce xdp-feature support](http://lore.kernel.org/bpf/cover.1671462950.git.lorenzo@kernel.org/)
+
+    Introduce the capability to export the XDP features supported by the NIC.
+    Introduce an XDP compliance test tool (xdp_features) to check that the
+    features exported by the NIC match the real features supported by the driver.
+    Allow XDP_REDIRECT of non-linear XDP frames into a devmap.
+    Export XDP features for each XDP-capable driver.
+
+* [v3: Replace invocations of prandom_u32() with get_random_u32()](http://lore.kernel.org/bpf/cover.1671277662.git.david.keisarschm@mail.huji.ac.il/)
+
+    This third series adds some changes to the commit messages,
+    and also replaces get_random_u32 with get_random_u32_below
+    in cases where a modulo operation is applied to the result.
+
+    The security improvements for prandom_u32 done in commits c51f8f88d705
+    from October 2020 and d4150779e60f from May 2022 didn't handle the cases
+    when prandom_bytes_state() and prandom_u32_state() are used.
+
+* [v3: bpftool: improve error handing for missing .BTF section](http://lore.kernel.org/bpf/20221217223509.88254-1-changbin.du@gmail.com/)
+
+    Display an error message for a missing ".BTF" section and clean up the
+    empty vmlinux.h file.
+
+### Related Technology Updates
+
+#### Qemu
+
+* [v2: riscv-to-apply queue](http://lore.kernel.org/qemu-devel/20221221224022.425831-1-alistair.francis@opensource.wdc.com/)
+
+    The following changes since commit 222059a0fccf4af3be776fe35a5ea2d6a68f9a0b:
+
+    Merge tag 'pull-ppc-20221221' of https://gitlab.com/danielhb/qemu into staging (2022-12-21 18:08:09 +0000)
+
+    are available in the Git repository at:
+
+    https://github.com/alistair23/qemu.git tags/pull-riscv-to-apply-20221222-1
+
+    for you to fetch changes up to 71a9bc59728a054036f3db7dd82dab8f8bd2baf9:
+
+    hw/intc: sifive_plic: Fix the pending register range check (2022-12-22 08:36:30 +1000)
+
+#### Buildroot
+
+* [v1: configs/lichee_rv_dock: new defconfig](http://lore.kernel.org/buildroot/20221218145518.21A4286933@busybox.osuosl.org/)
+
+    commit: https://git.buildroot.net/buildroot/commit/?id=f9ad317507cf8564200b88dda2463406d0171122
+    branch: https://git.buildroot.net/buildroot/commit/?id=refs/heads/master
+
+    Lichee RV Dock is a RISC-V Linux development kit with high integration,
+    small size, and an affordable price, designed for open-source developers.
+
+#### U-Boot
+
+* [v1: riscv: bypass malloc when spl fit boots from ram](http://lore.kernel.org/u-boot/20221222072152.2624-1-rick@andestech.com/)
+
+    When a FIT image boots from RAM, the payload is
+    prepared at the address SPL_LOAD_FIT_ADDRESS.
+    In the generic SPL FIT flow, it will malloc another
+    memory region and copy the whole FIT image to this
+    malloc'ed address. But this is unnecessary when booting from RAM.
+
+* [v1: riscv: ae350: Enable CCTL_SUEN](http://lore.kernel.org/u-boot/20221221025942.28496-1-rick@andestech.com/)
+
+    CCTL operations are available to Supervisor/User-mode
+    software under the control of the mcache_ctl.CCTL_SUEN
+    control bit. Enable it to support Supervisor (and User)
+    CCTL operations.
+
+* [v1: riscv: ae350: Support openSBI 1.0+ which enable FW_PIC](http://lore.kernel.org/u-boot/20221221022843.26092-1-rick@andestech.com/)
+
+    Change the OpenSBI load address from 0x1000000 to 0x0, so that it starts running at 0x0 directly without relocation.
+
 ## 20221218: Issue 25
 
 ### Kernel Updates