diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst index 65c54b27a60b8d..00ccc4da003be7 100644 --- a/Documentation/dev-tools/index.rst +++ b/Documentation/dev-tools/index.rst @@ -32,6 +32,7 @@ Documentation/process/debugging/index.rst kfence kselftest kunit/index + kfuzztest ktap checkuapi gpio-sloppy-logic-analyzer diff --git a/Documentation/dev-tools/kfuzztest.rst b/Documentation/dev-tools/kfuzztest.rst new file mode 100644 index 00000000000000..0c74732ecf21cb --- /dev/null +++ b/Documentation/dev-tools/kfuzztest.rst @@ -0,0 +1,385 @@ +.. SPDX-License-Identifier: GPL-2.0 +.. Copyright 2025 Google LLC + +========================================= +Kernel Fuzz Testing Framework (KFuzzTest) +========================================= + +Overview +======== + +The Kernel Fuzz Testing Framework (KFuzzTest) is a framework designed to expose +internal kernel functions to a userspace fuzzing engine. + +It is intended for testing stateless or low-state functions that are difficult +to reach from the system call interface, such as routines involved in file +format parsing or complex data transformations. This provides a method for +in-situ fuzzing of kernel code without requiring that it be built as a separate +userspace library or that its dependencies be stubbed out. + +The framework consists of four main components: + +1. An API, based on the ``FUZZ_TEST`` macro, for defining test targets + directly in the kernel tree. +2. A binary serialization format for passing complex, pointer-rich data + structures from userspace to the kernel. +3. A ``debugfs`` interface through which a userspace fuzzer submits + serialized test inputs. +4. Metadata embedded in dedicated ELF sections of the ``vmlinux`` binary to + allow for the discovery of available fuzz targets by external tooling. + +.. warning:: + KFuzzTest is a debugging and testing tool. 
It exposes internal kernel + functions to userspace with minimal sanitization and is designed for + use in controlled test environments only. It must **NEVER** be enabled + in production kernels. + +Supported Architectures +======================= + +KFuzzTest is designed for generic architecture support, but has only been +explicitly tested on x86_64. + +Usage +===== + +To enable KFuzzTest, configure the kernel with:: + + CONFIG_KFUZZTEST=y + +which depends on ``CONFIG_DEBUG_FS`` for receiving userspace inputs, and +``CONFIG_DEBUG_KERNEL`` as an additional guardrail for preventing KFuzzTest +from finding its way into a production build accidentally. + +The KFuzzTest sample fuzz targets can be built in with +``CONFIG_SAMPLE_KFUZZTEST``. + +KFuzzTest currently only supports targets that are built into the kernel, as the +core module discovers fuzz targets from a dedicated ELF section during +startup. Furthermore, constraints and annotations emit metadata +that can be scanned from a ``vmlinux`` binary by a userspace fuzzing engine. + +Declaring a KFuzzTest target +---------------------------- + +A fuzz target should be defined in a .c file. The recommended place to define +this is under the subsystem's ``/tests`` directory in a ``_kfuzz.c`` +file, following the convention used by KUnit. The only strict requirement is +that the function being fuzzed is visible to the fuzz target. + +Defining a fuzz target involves three main parts: defining an input structure, +writing the test body using the ``FUZZ_TEST`` macro, and optionally adding +metadata for the fuzzer. + +The following example illustrates how to create a fuzz target for a function +``int process_data(const char *data, size_t len)``. + +.. code-block:: c + + /* + * 1. Define a struct to model the inputs for the function under test. + * Each field corresponds to an argument needed by the function. + */ + struct process_data_inputs { + const char *data; + size_t len; + }; + + /* + * 2. 
Define the fuzz target using the FUZZ_TEST macro. + * The first parameter is a unique name for the target. + * The second parameter is the input struct defined above. + */ + FUZZ_TEST(test_process_data, struct process_data_inputs) + { + /* + * Within this body, the 'arg' variable is a pointer to a + * fully initialized 'struct process_data_inputs'. + */ + + /* + * 3. (Optional) Add constraints to define preconditions. + * This check ensures 'arg->data' is not NULL. If the condition + * is not met, the test exits early. This also creates metadata + * to inform the fuzzer. + */ + KFUZZTEST_EXPECT_NOT_NULL(process_data_inputs, data); + + /* + * 4. (Optional) Add annotations to provide semantic hints to the + * fuzzer. This annotation informs the fuzzer that the 'len' field is + * the length of the buffer pointed to by 'data'. Annotations do not + * add any runtime checks. + */ + KFUZZTEST_ANNOTATE_LEN(process_data_inputs, len, data); + + /* + * 5. Call the kernel function with the provided inputs. + * Memory errors like out-of-bounds accesses on 'arg->data' will + * be detected by KASAN or other memory error detection tools. + */ + process_data(arg->data, arg->len); + } + +KFuzzTest provides two families of macros to improve the quality of fuzzing: + +- ``KFUZZTEST_EXPECT_*``: These macros define constraints, which are + preconditions that must be true for the test to proceed. They are enforced + with a runtime check in the kernel. If a check fails, the current test run is + aborted. This metadata helps the userspace fuzzer avoid generating invalid + inputs. + +- ``KFUZZTEST_ANNOTATE_*``: These macros define annotations, which are purely + semantic hints for the fuzzer. They do not add any runtime checks and exist + only to help the fuzzer generate more intelligent and structurally correct + inputs. For example, KFUZZTEST_ANNOTATE_LEN links a size field to a pointer + field, which is a common pattern in C APIs. 
+ +Metadata +-------- + +Macros ``FUZZ_TEST``, ``KFUZZTEST_EXPECT_*`` and ``KFUZZTEST_ANNOTATE_*`` embed +metadata into several sections within the main ``.data`` section of the final +``vmlinux`` binary: ``.kfuzztest_target``, ``.kfuzztest_constraint`` and +``.kfuzztest_annotation`` respectively. + +This serves two purposes: + +1. The core module uses the ``.kfuzztest_target`` section at boot to discover + every ``FUZZ_TEST`` instance and create its ``debugfs`` directory and + ``input`` file. +2. Userspace fuzzers can read this metadata from the ``vmlinux`` binary to + discover targets and learn about their rules and structure in order to + generate correct and effective inputs. + +The metadata in the ``.kfuzztest_*`` sections consists of arrays of fixed-size C +structs (e.g., ``struct kfuzztest_target``). Fields within these structs that +are pointers, such as ``name`` or ``arg_type_name``, contain addresses that +point to other locations in the ``vmlinux`` binary. A userspace tool that +parses the ELF file must resolve these pointers to read the data that they +reference. For example, to get a target's name, a tool must: + +1. Read the ``struct kfuzztest_target`` from the ``.kfuzztest_target`` section. +2. Read the address in the ``.name`` field. +3. Use that address to locate and read the null-terminated string from its + position elsewhere in the binary (e.g., ``.rodata``). + +Tooling Dependencies +-------------------- + +For userspace tools to parse the ``vmlinux`` binary and make use of emitted +KFuzzTest metadata, the kernel must be compiled with DWARF debug information. +This is required for tools to understand the layout of C structs, resolve type +information, and correctly interpret constraints and annotations. + +When using KFuzzTest with automated fuzzing tools, either +``CONFIG_DEBUG_INFO_DWARF4`` or ``CONFIG_DEBUG_INFO_DWARF5`` should be enabled. 
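The pointer-resolution steps listed in the Metadata section can be sketched in userspace C. This is a minimal illustration, not part of the patch: ``struct section_info``, ``vaddr_to_file_offset()`` and ``read_string_at()`` are hypothetical names, and the sketch assumes the tool has already loaded the ``vmlinux`` file into memory, extracted each section's virtual address, file offset and size from the ELF section headers, and that no relocation processing is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified view of one ELF section header. */
struct section_info {
	uint64_t addr;   /* virtual address the section is linked at */
	uint64_t offset; /* offset of the section contents in the file */
	uint64_t size;   /* size of the section in bytes */
};

/*
 * Translate a virtual address stored in a metadata struct (e.g. the
 * 'name' field) into an offset within the vmlinux file. Returns 0 on
 * success and -1 if no section contains the address.
 */
static int vaddr_to_file_offset(const struct section_info *secs, size_t nsecs,
				uint64_t vaddr, uint64_t *file_off)
{
	size_t i;

	assert(file_off != NULL);
	for (i = 0; i < nsecs; i++) {
		if (vaddr >= secs[i].addr && vaddr - secs[i].addr < secs[i].size) {
			*file_off = secs[i].offset + (vaddr - secs[i].addr);
			return 0;
		}
	}
	return -1;
}

/*
 * Return a pointer to the NUL-terminated string stored at 'vaddr', or NULL
 * if the address cannot be resolved or the string runs past the image.
 */
static const char *read_string_at(const uint8_t *image, size_t image_size,
				  const struct section_info *secs, size_t nsecs,
				  uint64_t vaddr)
{
	uint64_t off;

	if (vaddr_to_file_offset(secs, nsecs, vaddr, &off) || off >= image_size)
		return NULL;
	if (!memchr(image + off, '\0', image_size - off))
		return NULL;
	return (const char *)(image + off);
}
```

A real tool would fill ``struct section_info`` from the ELF section header table and would use DWARF information, as noted under Tooling Dependencies, to find the offset of each pointer field inside the metadata structs.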
+ +Input Format +============ + +KFuzzTest targets receive their inputs from userspace via a write to a dedicated +debugfs file ``/sys/kernel/debug/kfuzztest//input``. + +The data written to this file must be a single binary blob that follows a +specific serialization format. This format is designed to allow complex, +pointer-rich C structures to be represented in a flat buffer, requiring only a +single kernel allocation and copy from userspace. + +An input is first prefixed by an 8-byte header containing a magic value in the +first four bytes, defined as ``KFUZZTEST_HEADER_MAGIC`` in +```, and a version number in the subsequent four +bytes. + +Version 0 +--------- + +In version 0 (i.e., when the version number in the 8-byte header is equal to 0), +the input format consists of three main parts laid out sequentially: a region +array, a relocation table, and the payload.:: + + +----------------+---------------------+-----------+----------------+ + | region array | relocation table | padding | payload | + +----------------+---------------------+-----------+----------------+ + +Region Array +^^^^^^^^^^^^ + +This component is a header that describes how the raw data in the Payload is +partitioned into logical memory regions. It consists of a count of regions +followed by an array of ``struct reloc_region``, where each entry defines a +single region with its size and offset from the start of the payload. + +.. code-block:: c + + struct reloc_region { + uint32_t offset; + uint32_t size; + }; + + struct reloc_region_array { + uint32_t num_regions; + struct reloc_region regions[]; + }; + +By convention, region 0 represents the top-level input struct that is passed +as the arg variable to the ``FUZZ_TEST`` body. Subsequent regions typically +represent data buffers or structs pointed to by fields within that struct. +Region array entries must be ordered by ascending offset, and must not overlap +with one another. 
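The ordering rule for the region array can be checked by an encoder before it submits an input. The helper below is a hedged sketch: ``regions_well_formed()`` is a hypothetical name and is not the kernel's parser. It only verifies that entries are sorted by ascending offset and do not overlap; it does not check alignment or inter-region padding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the region descriptor documented above. */
struct reloc_region {
	uint32_t offset;
	uint32_t size;
};

/*
 * Hypothetical userspace helper: returns true if the regions are ordered
 * by ascending offset and no region overlaps its predecessor.
 */
static bool regions_well_formed(const struct reloc_region *regions,
				uint32_t num_regions)
{
	uint64_t prev_end = 0; /* 64-bit so offset + size cannot wrap */
	uint32_t i;

	assert(regions != NULL || num_regions == 0);
	for (i = 0; i < num_regions; i++) {
		uint64_t start = regions[i].offset;

		/* A region starting before the previous one ends is either
		 * out of order or overlapping. */
		if (start < prev_end)
			return false;
		prev_end = start + regions[i].size;
	}
	return true;
}
```

Tracking only the end of the previous region is sufficient because the array is required to be sorted: any out-of-order entry necessarily starts before the end of an earlier one, unless that earlier region is empty.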
+ +Relocation Table +^^^^^^^^^^^^^^^^ + +The relocation table contains the instructions for the kernel to "hydrate" the +payload by patching pointer fields. It contains an array of +``struct reloc_entry`` items. Each entry acts as a linking instruction, +specifying: + +- The location of a pointer that needs to be patched (identified by a region + ID and an offset within that region). + +- The target region that the pointer should point to (identified by the + target's region ID) or ``KFUZZTEST_REGIONID_NULL`` if the pointer is ``NULL``. + +This table also specifies the amount of padding between its end and the start +of the payload, which should be at least 8 bytes. + +.. code-block:: c + + struct reloc_entry { + uint32_t region_id; + uint32_t region_offset; + uint32_t value; + }; + + struct reloc_table { + uint32_t num_entries; + uint32_t padding_size; + struct reloc_entry entries[]; + }; + +Payload +^^^^^^^ + +The payload contains the raw binary data for all regions, concatenated together +according to their specified offsets. + +- Region specific alignment: The data for each individual region must start at + an offset that is aligned to its own C type's requirements. For example, a + ``uint64_t`` must begin on an 8-byte boundary. + +- Minimum alignment: The offset of each region, as well as the beginning of the + payload, must also be a multiple of the overall minimum alignment value. This + value is determined by the greater of ``ARCH_KMALLOC_MINALIGN`` and + ``KASAN_GRANULE_SIZE`` (which is represented by ``KFUZZTEST_POISON_SIZE`` in + ``/include/linux/kfuzztest.h``). This minimum alignment ensures that all + function inputs respect C calling conventions. + +- Padding: The space between the end of one region's data and the beginning of + the next must be sufficient for padding. The padding must also be at least + the same minimum alignment value mentioned above. 
This is crucial for KASAN + builds, as it allows KFuzzTest to poison this unused space, enabling precise + detection of out-of-bounds memory accesses between adjacent buffers. + +The minimum alignment value is architecture-dependent and is exposed to +userspace via the read-only file +``/sys/kernel/debug/kfuzztest/_config/minalign``. The framework relies on +userspace tooling to construct the payload correctly, adhering to all three of +these rules for every region. + +KFuzzTest Bridge Tool +===================== + +The ``kfuzztest-bridge`` program is a userspace utility that encodes a random +byte stream into the structured binary format expected by a KFuzzTest harness. +It allows users to describe the target's input structure textually, making it +easy to perform smoke tests or connect harnesses to blob-based fuzzing engines. + +This tool is intended to be simple, both in usage and implementation. Its +structure and DSL are sufficient for simpler use-cases. For more advanced +coverage-guided fuzzing it is recommended to use +`syzkaller ` which implements deeper +support for KFuzzTest targets. + +Usage +----- + +The tool can be built with ``make tools/testing/kfuzztest-bridge``. In the case +of libc incompatibilities, the tool will have to be linked statically or built +on the target system. + +Example: + +.. code-block:: sh + + ./tools/testing/kfuzztest-bridge \ + "foo { u32 ptr[bar] }; bar { ptr[data] len[data, u64]}; data { arr[u8, 42] };" \ + "my-fuzz-target" /dev/urandom + +The command takes three arguments: + +1. A string describing the input structure (see `Textual Format`_ sub-section). +2. The name of the target test, which corresponds to its directory in + ``/sys/kernel/debug/kfuzztest/``. +3. A path to a file providing a stream of random data, such as + ``/dev/urandom``. + +The structure string in the example corresponds to the following C data +structures: + +.. 
code-block:: c + + struct foo { + u32 a; + struct bar *b; + }; + + struct bar { + struct data *d; + u64 data_len; /* Equals 42. */ + }; + + struct data { + char arr[42]; + }; + +Textual Format +-------------- + +The textual format is a human-readable representation of the region-based binary +format used by KFuzzTest. It is described by the following grammar: + +.. code-block:: text + + schema ::= region ( ";" region )* [";"] + region ::= identifier "{" type ( " " type )* "}" + type ::= primitive | pointer | array | length | string + primitive ::= "u8" | "u16" | "u32" | "u64" + pointer ::= "ptr" "[" identifier "]" + array ::= "arr" "[" primitive "," integer "]" + length ::= "len" "[" identifier "," primitive "]" + string ::= "str" "[" integer "]" + identifier ::= [a-zA-Z_][a-zA-Z0-9_]* + integer ::= [0-9]+ + +Pointers must reference a named region. + +To fuzz a raw buffer, the buffer must be defined in its own region, as shown +below: + +.. code-block:: c + + struct my_struct { + char *buf; + size_t buflen; + }; + +This would correspond to the following textual description: + +.. code-block:: text + + my_struct { ptr[buf] len[buf, u64] }; buf { arr[u8, n] }; + +Here, ``n`` is some integer value defining the size of the byte array inside of +the ``buf`` region. 
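To connect the textual description of ``my_struct`` with the version 0 binary format, the following userspace sketch serializes one input by hand. It is an illustration under stated assumptions, not the bridge tool's implementation: ``encode_my_struct()`` and its helpers are hypothetical, the target is assumed to be 64-bit with native byte order, the region array and relocation table are assumed to be packed back to back after the 8-byte header, the payload start is assumed to be aligned relative to the start of the input, and ``MAX_INPUT_SIZE`` assumes 4 KiB pages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KFUZZTEST_HEADER_MAGIC 0xBFACE
#define KFUZZTEST_V0 0
#define MAX_INPUT_SIZE (16 * 4096) /* assumption: PAGE_SIZE is 4 KiB */

struct reloc_region { uint32_t offset; uint32_t size; };
struct reloc_entry { uint32_t region_id; uint32_t region_offset; uint32_t value; };

static uint32_t align_up(uint32_t x, uint32_t a)
{
	return (x + a - 1) / a * a;
}

static void put_u32(uint8_t *out, size_t *pos, uint32_t v)
{
	memcpy(out + *pos, &v, sizeof(v));
	*pos += sizeof(v);
}

/*
 * Encode "my_struct { ptr[buf] len[buf, u64] }; buf { arr[u8, n] };".
 * Returns the number of bytes written, or 0 if the input does not fit.
 */
static size_t encode_my_struct(uint8_t *out, size_t out_size,
			       const uint8_t *data, uint32_t n, uint32_t minalign)
{
	const uint32_t struct_size = 16;           /* 8-byte pointer + u64 length */
	const struct reloc_entry reloc = { 0, 0, 1 }; /* region 0, offset 0 -> region 1 */
	struct reloc_region regions[2];
	uint32_t hdr_size, payload_start, payload_size;
	uint64_t buflen = n;
	size_t pos = 0;
	uint32_t i;

	assert(minalign != 0 && (minalign & (minalign - 1)) == 0);
	if (n >= MAX_INPUT_SIZE)
		return 0;

	/* Region 0 is the struct; region 1 is the buffer, after padding. */
	regions[0].offset = 0;
	regions[0].size = struct_size;
	regions[1].offset = align_up(struct_size + minalign, minalign);
	regions[1].size = n;
	/* Assumption: the last region is also followed by padding. */
	payload_size = align_up(regions[1].offset + n + minalign, minalign);

	hdr_size = 8 + 4 + sizeof(regions) + 8 + sizeof(reloc);
	payload_start = align_up(hdr_size + 8, minalign); /* at least 8 bytes of padding */
	if ((size_t)payload_start + payload_size > out_size ||
	    payload_start + payload_size >= MAX_INPUT_SIZE)
		return 0;

	memset(out, 0, payload_start + payload_size);
	put_u32(out, &pos, KFUZZTEST_HEADER_MAGIC);
	put_u32(out, &pos, KFUZZTEST_V0);
	put_u32(out, &pos, 2);                         /* num_regions */
	for (i = 0; i < 2; i++) {
		put_u32(out, &pos, regions[i].offset);
		put_u32(out, &pos, regions[i].size);
	}
	put_u32(out, &pos, 1);                         /* num_entries */
	put_u32(out, &pos, payload_start - hdr_size);  /* padding_size */
	put_u32(out, &pos, reloc.region_id);
	put_u32(out, &pos, reloc.region_offset);
	put_u32(out, &pos, reloc.value);
	assert(pos == hdr_size);

	/* The pointer field stays zero here; the kernel patches it. */
	memcpy(out + payload_start + 8, &buflen, sizeof(buflen));
	memcpy(out + payload_start + regions[1].offset, data, n);
	return (size_t)payload_start + payload_size;
}
```

The resulting buffer would then be written in a single ``write()`` to the target's ``input`` file, with ``minalign`` taken from the ``_config/minalign`` file described above rather than hard-coded.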
diff --git a/MAINTAINERS b/MAINTAINERS index 6dcfbd11efef87..14972e3e9d6a96 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -13641,6 +13641,14 @@ F: include/linux/kfifo.h F: lib/kfifo.c F: samples/kfifo/ +KFUZZTEST +M: Ethan Graham +R: Alexander Potapenko +F: include/linux/kfuzztest.h +F: lib/kfuzztest/ +F: Documentation/dev-tools/kfuzztest.rst +F: tools/kfuzztest-bridge/ + KGDB / KDB /debug_core M: Jason Wessel M: Daniel Thompson diff --git a/crypto/asymmetric_keys/Makefile b/crypto/asymmetric_keys/Makefile index bc65d3b98dcbfa..77b825aee6b24f 100644 --- a/crypto/asymmetric_keys/Makefile +++ b/crypto/asymmetric_keys/Makefile @@ -67,6 +67,8 @@ obj-$(CONFIG_PKCS7_TEST_KEY) += pkcs7_test_key.o pkcs7_test_key-y := \ pkcs7_key_type.o +obj-y += tests/ + # # Signed PE binary-wrapped key handling # diff --git a/crypto/asymmetric_keys/tests/Makefile b/crypto/asymmetric_keys/tests/Makefile new file mode 100644 index 00000000000000..023d6a65fb8913 --- /dev/null +++ b/crypto/asymmetric_keys/tests/Makefile @@ -0,0 +1,4 @@ +pkcs7-kfuzz-y := $(and $(CONFIG_KFUZZTEST),$(CONFIG_PKCS7_MESSAGE_PARSER)) +rsa-helper-kfuzz-y := $(and $(CONFIG_KFUZZTEST),$(CONFIG_CRYPTO_RSA)) +obj-$(pkcs7-kfuzz-y) += pkcs7_kfuzz.o +obj-$(rsa-helper-kfuzz-y) += rsa_helper_kfuzz.o diff --git a/crypto/asymmetric_keys/tests/pkcs7_kfuzz.c b/crypto/asymmetric_keys/tests/pkcs7_kfuzz.c new file mode 100644 index 00000000000000..c801f6b59de252 --- /dev/null +++ b/crypto/asymmetric_keys/tests/pkcs7_kfuzz.c @@ -0,0 +1,26 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * PKCS#7 parser KFuzzTest target + * + * Copyright 2025 Google LLC + */ +#include +#include + +struct pkcs7_parse_message_arg { + const void *data; + size_t datalen; +}; + +FUZZ_TEST(test_pkcs7_parse_message, struct pkcs7_parse_message_arg) +{ + struct pkcs7_message *msg; + + KFUZZTEST_EXPECT_NOT_NULL(pkcs7_parse_message_arg, data); + KFUZZTEST_ANNOTATE_ARRAY(pkcs7_parse_message_arg, data); + KFUZZTEST_ANNOTATE_LEN(pkcs7_parse_message_arg, 
datalen, data); + + msg = pkcs7_parse_message(arg->data, arg->datalen); + if (msg && !IS_ERR(msg)) + pkcs7_free_message(msg); +} diff --git a/crypto/asymmetric_keys/tests/rsa_helper_kfuzz.c b/crypto/asymmetric_keys/tests/rsa_helper_kfuzz.c new file mode 100644 index 00000000000000..bd29ed5e8c8253 --- /dev/null +++ b/crypto/asymmetric_keys/tests/rsa_helper_kfuzz.c @@ -0,0 +1,38 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * RSA key extract helper KFuzzTest targets + * + * Copyright 2025 Google LLC + */ +#include +#include + +struct rsa_parse_pub_key_arg { + const void *key; + size_t key_len; +}; + +FUZZ_TEST(test_rsa_parse_pub_key, struct rsa_parse_pub_key_arg) +{ + KFUZZTEST_EXPECT_NOT_NULL(rsa_parse_pub_key_arg, key); + KFUZZTEST_ANNOTATE_ARRAY(rsa_parse_pub_key_arg, key); + KFUZZTEST_ANNOTATE_LEN(rsa_parse_pub_key_arg, key_len, key); + + struct rsa_key out; + rsa_parse_pub_key(&out, arg->key, arg->key_len); +} + +struct rsa_parse_priv_key_arg { + const void *key; + size_t key_len; +}; + +FUZZ_TEST(test_rsa_parse_priv_key, struct rsa_parse_priv_key_arg) +{ + KFUZZTEST_EXPECT_NOT_NULL(rsa_parse_priv_key_arg, key); + KFUZZTEST_ANNOTATE_ARRAY(rsa_parse_priv_key_arg, key); + KFUZZTEST_ANNOTATE_LEN(rsa_parse_priv_key_arg, key_len, key); + + struct rsa_key out; + rsa_parse_priv_key(&out, arg->key, arg->key_len); +} diff --git a/drivers/auxdisplay/charlcd.c b/drivers/auxdisplay/charlcd.c index 09020bb8ad15fa..e079b5a9c93c81 100644 --- a/drivers/auxdisplay/charlcd.c +++ b/drivers/auxdisplay/charlcd.c @@ -682,3 +682,11 @@ EXPORT_SYMBOL_GPL(charlcd_unregister); MODULE_DESCRIPTION("Character LCD core support"); MODULE_LICENSE("GPL"); + +/* + * When CONFIG_KFUZZTEST is enabled, we include this _kfuzz.c file to ensure + * that KFuzzTest targets are built. 
+ */ +#ifdef CONFIG_KFUZZTEST +#include "tests/charlcd_kfuzz.c" +#endif /* CONFIG_KFUZZTEST */ diff --git a/drivers/auxdisplay/tests/charlcd_kfuzz.c b/drivers/auxdisplay/tests/charlcd_kfuzz.c new file mode 100644 index 00000000000000..28ce7069c65c3b --- /dev/null +++ b/drivers/auxdisplay/tests/charlcd_kfuzz.c @@ -0,0 +1,20 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * charlcd KFuzzTest target + * + * Copyright 2025 Google LLC + */ +#include + +struct parse_xy_arg { + const char *s; +}; + +FUZZ_TEST(test_parse_xy, struct parse_xy_arg) +{ + unsigned long x, y; + + KFUZZTEST_EXPECT_NOT_NULL(parse_xy_arg, s); + KFUZZTEST_ANNOTATE_STRING(parse_xy_arg, s); + parse_xy(arg->s, &x, &y); +} diff --git a/fs/binfmt_script.c b/fs/binfmt_script.c index 637daf6e4d4520..c09f224d6d7ecf 100644 --- a/fs/binfmt_script.c +++ b/fs/binfmt_script.c @@ -157,3 +157,11 @@ core_initcall(init_script_binfmt); module_exit(exit_script_binfmt); MODULE_DESCRIPTION("Kernel support for scripts starting with #!"); MODULE_LICENSE("GPL"); + +/* + * When CONFIG_KFUZZTEST is enabled, we include this _kfuzz.c file to ensure + * that KFuzzTest targets are built. + */ +#ifdef CONFIG_KFUZZTEST +#include "tests/binfmt_script_kfuzz.c" +#endif /* CONFIG_KFUZZTEST */ diff --git a/fs/tests/binfmt_script_kfuzz.c b/fs/tests/binfmt_script_kfuzz.c new file mode 100644 index 00000000000000..26397a465270bf --- /dev/null +++ b/fs/tests/binfmt_script_kfuzz.c @@ -0,0 +1,58 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * binfmt_script loader KFuzzTest target + * + * Copyright 2025 Google LLC + */ +#include +#include +#include +#include + +struct load_script_arg { + char buf[BINPRM_BUF_SIZE]; +}; + +FUZZ_TEST(test_load_script, struct load_script_arg) +{ + struct linux_binprm bprm = {}; + char *arg_page; + + arg_page = (char *)get_zeroed_page(GFP_KERNEL); + if (!arg_page) + return; + + memcpy(bprm.buf, arg->buf, sizeof(bprm.buf)); + /* + * `load_script` calls remove_arg_zero, which expects argc != 0. 
A + * static value of 1 is sufficient for fuzzing. + */ + bprm.argc = 1; + bprm.p = (unsigned long)arg_page + PAGE_SIZE; + bprm.filename = kstrdup("fuzz_script", GFP_KERNEL); + if (!bprm.filename) + goto cleanup; + bprm.interp = kstrdup(bprm.filename, GFP_KERNEL); + if (!bprm.interp) + goto cleanup; + + bprm.mm = mm_alloc(); + if (!bprm.mm) + goto cleanup; + + /* + * Call the target function. We expect it to fail and return an error + * (e.g., at open_exec), which is fine. The goal is to survive the + * initial parsing logic without crashing. + */ + load_script(&bprm); + +cleanup: + if (bprm.mm) + mmput(bprm.mm); + if (bprm.interp) + kfree(bprm.interp); + if (bprm.filename) + kfree(bprm.filename); + free_page((unsigned long)arg_page); +} diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index ae2d2359b79e9e..9afe569d013b6b 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -373,7 +373,8 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG) TRACE_PRINTKS() \ BPF_RAW_TP() \ TRACEPOINT_STR() \ - KUNIT_TABLE() + KUNIT_TABLE() \ + KFUZZTEST_TABLE() /* * Data section helpers @@ -966,6 +967,25 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG) BOUNDED_SECTION_POST_LABEL(.kunit_init_test_suites, \ __kunit_init_suites, _start, _end) +#ifdef CONFIG_KFUZZTEST +#define KFUZZTEST_TABLE() \ + . = ALIGN(PAGE_SIZE); \ + __kfuzztest_targets_start = .; \ + KEEP(*(.kfuzztest_target)); \ + __kfuzztest_targets_end = .; \ + . = ALIGN(PAGE_SIZE); \ + __kfuzztest_constraints_start = .; \ + KEEP(*(.kfuzztest_constraint)); \ + __kfuzztest_constraints_end = .; \ + . = ALIGN(PAGE_SIZE); \ + __kfuzztest_annotations_start = .; \ + KEEP(*(.kfuzztest_annotation)); \ + __kfuzztest_annotations_end = .; + +#else /* CONFIG_KFUZZTEST */ +#define KFUZZTEST_TABLE() +#endif /* CONFIG_KFUZZTEST */ + #ifdef CONFIG_BLK_DEV_INITRD #define INIT_RAM_FS \ . 
= ALIGN(4); \ diff --git a/include/linux/kasan.h b/include/linux/kasan.h index 890011071f2b14..cd6cdf732378c2 100644 --- a/include/linux/kasan.h +++ b/include/linux/kasan.h @@ -102,6 +102,16 @@ static inline bool kasan_has_integrated_init(void) } #ifdef CONFIG_KASAN + +/** + * kasan_poison_range - poison the memory range [@addr, @addr + @size) + * + * The exact behavior is subject to alignment with KASAN_GRANULE_SIZE, defined + * in : if @addr is unaligned, the initial partial granule + * at the beginning of the range is only poisoned if CONFIG_KASAN_GENERIC=y. + */ +int kasan_poison_range(const void *addr, size_t size); + void __kasan_unpoison_range(const void *addr, size_t size); static __always_inline void kasan_unpoison_range(const void *addr, size_t size) { @@ -402,6 +412,7 @@ static __always_inline bool kasan_check_byte(const void *addr) #else /* CONFIG_KASAN */ +static inline int kasan_poison_range(const void *addr, size_t size) { return 0; } static inline void kasan_unpoison_range(const void *address, size_t size) {} static inline void kasan_poison_pages(struct page *page, unsigned int order, bool init) {} diff --git a/include/linux/kfuzztest.h b/include/linux/kfuzztest.h new file mode 100644 index 00000000000000..2620e48bb62018 --- /dev/null +++ b/include/linux/kfuzztest.h @@ -0,0 +1,497 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * The Kernel Fuzz Testing Framework (KFuzzTest) API for defining fuzz targets + * for internal kernel functions. + * + * For more information please see Documentation/dev-tools/kfuzztest.rst. + * + * Copyright 2025 Google LLC + */ +#ifndef KFUZZTEST_H +#define KFUZZTEST_H + +#include +#include +#include + +#define KFUZZTEST_HEADER_MAGIC (0xBFACE) +#define KFUZZTEST_V0 (0) + +/** + * @brief The KFuzzTest Input Serialization Format + * + * KFuzzTest receives its input from userspace as a single binary blob. 
This + * format allows for the serialization of complex, pointer-rich C structures + * into a flat buffer that can be safely passed into the kernel. This format + * requires only a single copy from userspace into a kernel buffer, and no + * further kernel allocations. Pointers are patched internally using a "region" + * system where each region corresponds to some pointed-to data. + * + * Regions should be padded to respect alignment constraints of their underlying + * types, and should be followed by at least 8 bytes of padding. These padded + * regions are poisoned by KFuzzTest to ensure that KASAN catches OOB accesses. + * + * The format consists of a header and three main components: + * 1. An 8-byte header: Contains KFUZZTEST_HEADER_MAGIC in the first 4 bytes, + * and the version number in the subsequent 4 bytes. This ensures backwards + * compatibility in the event of future format changes. + * 2. A reloc_region_array: Defines the memory layout of the target structure + * by partitioning the payload into logical regions. Each logical region + * should contain the byte representation of the type that it represents, + * including any necessary padding. The region descriptors should be + * ordered by offset ascending. + * 3. A reloc_table: Provides "linking" instructions that tell the kernel how + * to patch pointer fields to point to the correct regions. By design, + * the first region (index 0) is passed as input into a FUZZ_TEST. + * 4. A Payload: The raw binary data for the target structure and its associated + * buffers. This should be aligned to the maximum alignment of all + * regions to satisfy alignment requirements of the input types, but this + * isn't checked by the parser. 
+ * + * For a detailed specification of the binary layout see the full documentation + * at: Documentation/dev-tools/kfuzztest.rst + */ + +/** + * struct reloc_region - single contiguous memory region in the payload + * + * @offset: The byte offset of this region from the start of the payload, which + * should be aligned to the alignment requirements of the region's + * underlying type. + * @size: The size of this region in bytes. + */ +struct reloc_region { + uint32_t offset; + uint32_t size; +}; + +/** + * struct reloc_region_array - array of regions in an input + * + * @num_regions: The total number of regions defined. + * @regions: A flexible array of `num_regions` region descriptors. + */ +struct reloc_region_array { + uint32_t num_regions; + struct reloc_region regions[]; +}; + +/** + * struct reloc_entry - a single pointer to be patched in an input + * + * @region_id: The index of the region in the `reloc_region_array` that + * contains the pointer. + * @region_offset: The start offset of the pointer inside of the region. + * @value: contains the index of the pointee region, or KFUZZTEST_REGIONID_NULL + * if the pointer is NULL. + */ +struct reloc_entry { + uint32_t region_id; + uint32_t region_offset; + uint32_t value; +}; + +/** + * struct reloc_table - array of relocations required by an input + * + * @num_entries: the number of pointer relocations. + * @padding_size: the number of padded bytes between the last relocation in + * entries, and the start of the payload data. This should be at least + * 8 bytes, as it is used for poisoning. + * @entries: array of relocations. + */ +struct reloc_table { + uint32_t num_entries; + uint32_t padding_size; + struct reloc_entry entries[]; +}; + +/** + * kfuzztest_parse_and_relocate - validate and relocate a KFuzzTest input + * + * @input: A buffer containing the serialized input for a fuzz target. + * @input_size: the size in bytes of the @input buffer. 
+ * @arg_ret: return pointer for the test case's input structure. + */ +int kfuzztest_parse_and_relocate(void *input, size_t input_size, void **arg_ret); + +/* + * Dump some information on the parsed headers and payload. Can be useful for + * debugging inputs when writing an encoder for the KFuzzTest input format. + */ +__attribute__((unused)) static inline void kfuzztest_debug_header(struct reloc_region_array *regions, + struct reloc_table *rt, void *payload_start, + void *payload_end) +{ + uint32_t i; + + pr_info("regions: { num_regions = %u } @ %px", regions->num_regions, regions); + for (i = 0; i < regions->num_regions; i++) { + pr_info(" region_%u: { start: 0x%x, size: 0x%x }", i, regions->regions[i].offset, + regions->regions[i].size); + } + + pr_info("reloc_table: { num_entries = %u, padding = %u } @ offset 0x%tx", rt->num_entries, rt->padding_size, + (char *)rt - (char *)regions); + for (i = 0; i < rt->num_entries; i++) { + pr_info(" reloc_%u: { src: %u, offset: 0x%x, dst: %u }", i, rt->entries[i].region_id, + rt->entries[i].region_offset, rt->entries[i].value); + } + + /* Both bounds are pointer differences (ptrdiff_t), so both use %tx. */ + pr_info("payload: [0x%tx, 0x%tx)", (char *)payload_start - (char *)regions, + (char *)payload_end - (char *)regions); +} + +struct kfuzztest_target { + const char *name; + const char *arg_type_name; + ssize_t (*write_input_cb)(struct file *filp, const char __user *buf, size_t len, loff_t *off); +} __aligned(32); + +#define KFUZZTEST_MAX_INPUT_SIZE (PAGE_SIZE * 16) + +/* Increments a global counter after a successful invocation. */ +void record_invocation(void); + +/** + * FUZZ_TEST - defines a KFuzzTest target + * + * @test_name: The unique identifier for the fuzz test, which is used to name + * the debugfs entry, e.g., /sys/kernel/debug/kfuzztest/@test_name. + * @test_arg_type: The struct type that defines the inputs for the test. This + * must be the full struct type (e.g., "struct my_inputs"), not a typedef. 
+ * + * Context: + * This macro is the primary entry point for the KFuzzTest framework. It + * generates all the necessary boilerplate for a fuzz test, including: + * - A static `struct kfuzztest_target` instance that is placed in a + * dedicated ELF section for discovery by userspace tools. + * - A `debugfs` write callback that handles receiving serialized data from + * a fuzzer, parsing it, and "hydrating" it into a valid C struct. + * - A function stub where the developer places the test logic. + * + * User-Provided Logic: + * The developer must provide the body of the fuzz test logic within the curly + * braces following the macro invocation. Within this scope, the framework + * provides the `arg` variable, which is a pointer of type `@test_arg_type *` + * to the fully hydrated input structure. All pointer fields within this struct + * have been relocated and are valid kernel pointers. This is the primary + * variable to use for accessing fuzzing inputs. + * + * Example Usage: + * + * // 1. The kernel function we want to fuzz. + * int process_data(const char *data, size_t len); + * + * // 2. Define a struct to hold all inputs for the function. + * struct process_data_inputs { + * const char *data; + * size_t len; + * }; + * + * // 3. Define the fuzz test using the FUZZ_TEST macro. + * FUZZ_TEST(process_data_fuzzer, struct process_data_inputs) + * { + * int ret; + * // Use KFUZZTEST_EXPECT_* to enforce preconditions. + * // The test will exit early if data is NULL. + * KFUZZTEST_EXPECT_NOT_NULL(process_data_inputs, data); + * + * // Use KFUZZTEST_ANNOTATE_* to provide hints to the fuzzer. + * // This links the 'len' field to the 'data' buffer. + * KFUZZTEST_ANNOTATE_LEN(process_data_inputs, len, data); + * + * // Call the function under test using the 'arg' variable. OOB memory + * // accesses will be caught by KASAN, but the user can also choose to + * // validate the return value and log any failures. 
+ * ret = process_data(arg->data, arg->len); + * } + */ +#define FUZZ_TEST(test_name, test_arg_type) \ + static ssize_t kfuzztest_write_cb_##test_name(struct file *filp, const char __user *buf, size_t len, \ + loff_t *off); \ + static void kfuzztest_logic_##test_name(test_arg_type *arg); \ + static const struct kfuzztest_target __fuzz_test__##test_name __section(".kfuzztest_target") __used = { \ + .name = #test_name, \ + .arg_type_name = #test_arg_type, \ + .write_input_cb = kfuzztest_write_cb_##test_name, \ + }; \ + static ssize_t kfuzztest_write_cb_##test_name(struct file *filp, const char __user *buf, size_t len, \ + loff_t *off) \ + { \ + test_arg_type *arg; \ + void *buffer; \ + int ret; \ + \ + /* + * Taint the kernel on the first fuzzing invocation. The debugfs + * interface provides a high-risk entry point for userspace to + * call kernel functions with untrusted input. + */ \ + if (!test_taint(TAINT_TEST)) \ + add_taint(TAINT_TEST, LOCKDEP_STILL_OK); \ + if (len >= KFUZZTEST_MAX_INPUT_SIZE) { \ + pr_warn(#test_name ": user input of size %zu is too large", len); \ + return -EINVAL; \ + } \ + buffer = kmalloc(len, GFP_KERNEL); \ + if (!buffer) \ + return -ENOMEM; \ + ret = simple_write_to_buffer(buffer, len, off, buf, len); \ + if (ret != len) { \ + ret = -EFAULT; \ + goto out; \ + } \ + ret = kfuzztest_parse_and_relocate(buffer, len, (void **)&arg); \ + if (ret < 0) \ + goto out; \ + kfuzztest_logic_##test_name(arg); \ + record_invocation(); \ + ret = len; \ +out: \ + kfree(buffer); \ + return ret; \ + } \ + static void kfuzztest_logic_##test_name(test_arg_type *arg) + +enum kfuzztest_constraint_type { + EXPECT_EQ, + EXPECT_NE, + EXPECT_LT, + EXPECT_LE, + EXPECT_GT, + EXPECT_GE, + EXPECT_IN_RANGE, +}; + +/** + * struct kfuzztest_constraint - a metadata record for a domain constraint + * + * Domain constraints are rules about the input data that must be satisfied for + * a fuzz test to proceed. 
While they are enforced in the kernel with a runtime + * check, they are primarily intended as a discoverable contract for userspace + * fuzzers. + * + * Instances of this struct are generated by the KFUZZTEST_EXPECT_* macros + * and placed into the read-only ".kfuzztest_constraint" ELF section of the + * vmlinux binary. A fuzzer can parse this section to learn about the + * constraints and generate valid inputs more intelligently. + * + * For an example of how these constraints are used within a fuzz test, see the + * documentation for the FUZZ_TEST() macro. + * + * @input_type: The name of the input struct type, including the leading + * "struct " prefix. + * @field_name: The name of the field within the struct that this constraint + * applies to. + * @value1: The primary value used in the comparison (e.g., the upper + * bound for EXPECT_LE). + * @value2: The secondary value, used only for multi-value comparisons + * (e.g., the upper bound for EXPECT_IN_RANGE). + * @type: The type of the constraint. + */ +struct kfuzztest_constraint { + const char *input_type; + const char *field_name; + uintptr_t value1; + uintptr_t value2; + enum kfuzztest_constraint_type type; +} __aligned(64); + +#define __KFUZZTEST_DEFINE_CONSTRAINT(arg_type, field, val1, val2, tpe, predicate) \ + do { \ + static struct kfuzztest_constraint __constraint_##arg_type##_##field \ + __section(".kfuzztest_constraint") __used = { \ + .input_type = "struct " #arg_type, \ + .field_name = #field, \ + .value1 = (uintptr_t)val1, \ + .value2 = (uintptr_t)val2, \ + .type = tpe, \ + }; \ + if (!(predicate)) \ + return; \ + } while (0) + +/** + * KFUZZTEST_EXPECT_EQ - constrain a field to be equal to a value + * + * @arg_type: name of the input structure, without the leading "struct ".
+ * @field: some field that is comparable. + * @val: a value of the same type as @arg_type.@field. + */ +#define KFUZZTEST_EXPECT_EQ(arg_type, field, val) \ + __KFUZZTEST_DEFINE_CONSTRAINT(arg_type, field, val, 0x0, EXPECT_EQ, arg->field == val) + +/** + * KFUZZTEST_EXPECT_NE - constrain a field to be not equal to a value + * + * @arg_type: name of the input structure, without the leading "struct ". + * @field: some field that is comparable. + * @val: a value of the same type as @arg_type.@field. + */ +#define KFUZZTEST_EXPECT_NE(arg_type, field, val) \ + __KFUZZTEST_DEFINE_CONSTRAINT(arg_type, field, val, 0x0, EXPECT_NE, arg->field != val) + +/** + * KFUZZTEST_EXPECT_LT - constrain a field to be less than a value + * + * @arg_type: name of the input structure, without the leading "struct ". + * @field: some field that is comparable. + * @val: a value of the same type as @arg_type.@field. + */ +#define KFUZZTEST_EXPECT_LT(arg_type, field, val) \ + __KFUZZTEST_DEFINE_CONSTRAINT(arg_type, field, val, 0x0, EXPECT_LT, arg->field < val) + +/** + * KFUZZTEST_EXPECT_LE - constrain a field to be less than or equal to a value + * + * @arg_type: name of the input structure, without the leading "struct ". + * @field: some field that is comparable. + * @val: a value of the same type as @arg_type.@field. + */ +#define KFUZZTEST_EXPECT_LE(arg_type, field, val) \ + __KFUZZTEST_DEFINE_CONSTRAINT(arg_type, field, val, 0x0, EXPECT_LE, arg->field <= val) + +/** + * KFUZZTEST_EXPECT_GT - constrain a field to be greater than a value + * + * @arg_type: name of the input structure, without the leading "struct ". + * @field: some field that is comparable. + * @val: a value of the same type as @arg_type.@field.
+ */ +#define KFUZZTEST_EXPECT_GT(arg_type, field, val) \ + __KFUZZTEST_DEFINE_CONSTRAINT(arg_type, field, val, 0x0, EXPECT_GT, arg->field > val) + +/** + * KFUZZTEST_EXPECT_GE - constrain a field to be greater than or equal to a value + * + * @arg_type: name of the input structure, without the leading "struct ". + * @field: some field that is comparable. + * @val: a value of the same type as @arg_type.@field. + */ +#define KFUZZTEST_EXPECT_GE(arg_type, field, val) \ + __KFUZZTEST_DEFINE_CONSTRAINT(arg_type, field, val, 0x0, EXPECT_GE, arg->field >= val) + +/** + * KFUZZTEST_EXPECT_NOT_NULL - constrain a pointer field to be non-NULL + * + * @arg_type: name of the input structure, without the leading "struct ". + * @field: a pointer field. + */ +#define KFUZZTEST_EXPECT_NOT_NULL(arg_type, field) KFUZZTEST_EXPECT_NE(arg_type, field, NULL) + +/** + * KFUZZTEST_EXPECT_IN_RANGE - constrain a field to be within a range + * + * @arg_type: name of the input structure, without the leading "struct ". + * @field: some field that is comparable. + * @lower_bound: a lower bound of the same type as @arg_type.@field. + * @upper_bound: an upper bound of the same type as @arg_type.@field. + */ +#define KFUZZTEST_EXPECT_IN_RANGE(arg_type, field, lower_bound, upper_bound) \ + __KFUZZTEST_DEFINE_CONSTRAINT(arg_type, field, lower_bound, upper_bound, \ + EXPECT_IN_RANGE, arg->field >= lower_bound && arg->field <= upper_bound) + +/** + * Annotations express attributes about structure fields that can't be easily + * or safely verified at runtime. They are intended as hints to the fuzzing + * engine to help it generate more semantically correct and effective inputs. + * Unlike constraints, annotations do not add any runtime checks and do not + * cause a test to exit early. + * + * For example, a `char *` field could be a raw byte buffer or a C-style + * null-terminated string. 
A fuzzer that is aware of this distinction can avoid + * creating inputs that would cause trivial, uninteresting crashes from reading + * past the end of a non-null-terminated buffer. + */ +enum kfuzztest_annotation_attribute { + ATTRIBUTE_LEN, + ATTRIBUTE_STRING, + ATTRIBUTE_ARRAY, +}; + +/** + * struct kfuzztest_annotation - a metadata record for a fuzzer hint + * + * This struct captures a single hint about a field in the input structure. + * Instances are generated by the KFUZZTEST_ANNOTATE_* macros and are placed + * into the read-only ".kfuzztest_annotation" ELF section of the vmlinux binary. + * + * A userspace fuzzer can parse this section to understand the semantic + * relationships between fields (e.g., which field is a length for which + * buffer) and the expected format of the data (e.g., a null-terminated + * string). This allows the fuzzer to be much more intelligent during input + * generation and mutation. + * + * For an example of how annotations are used within a fuzz test, see the + * documentation for the FUZZ_TEST() macro. + * + * @input_type: The name of the input struct type. + * @field_name: The name of the field being annotated (e.g., the data + * buffer field). + * @linked_field_name: For annotations that link two fields (like + * ATTRIBUTE_LEN), this is the name of the related field (e.g., the + * length field). For others, this may be unused. + * @attrib: The type of the annotation hint. 
+ */ +struct kfuzztest_annotation { + const char *input_type; + const char *field_name; + const char *linked_field_name; + enum kfuzztest_annotation_attribute attrib; +} __aligned(32); + +#define __KFUZZTEST_ANNOTATE(arg_type, field, linked_field, attribute) \ + static struct kfuzztest_annotation __annotation_##arg_type##_##field __section(".kfuzztest_annotation") \ + __used = { \ + .input_type = "struct " #arg_type, \ + .field_name = #field, \ + .linked_field_name = #linked_field, \ + .attrib = attribute, \ + } + +/** + * KFUZZTEST_ANNOTATE_STRING - annotate a char* field as a C string + * + * We define a C string as a sequence of non-zero characters followed by exactly + * one null terminator. + * + * @arg_type: name of the input structure, without the leading "struct ". + * @field: the name of the field to annotate. + */ +#define KFUZZTEST_ANNOTATE_STRING(arg_type, field) __KFUZZTEST_ANNOTATE(arg_type, field, NULL, ATTRIBUTE_STRING) + +/** + * KFUZZTEST_ANNOTATE_ARRAY - annotate a pointer as an array + * + * We define an array as a contiguous memory region containing zero or more + * elements of the same type. + * + * @arg_type: name of the input structure, without the leading "struct ". + * @field: the name of the field to annotate. + */ +#define KFUZZTEST_ANNOTATE_ARRAY(arg_type, field) __KFUZZTEST_ANNOTATE(arg_type, field, NULL, ATTRIBUTE_ARRAY) + +/** + * KFUZZTEST_ANNOTATE_LEN - annotate a field as the length of another + * + * This expresses the relationship `arg_type.field == len(linked_field)`, where + * `linked_field` is an array. + * + * @arg_type: name of the input structure, without the leading "struct ". + * @field: the name of the field to annotate. + * @linked_field: the name of an array field with length @field. 
+ */ +#define KFUZZTEST_ANNOTATE_LEN(arg_type, field, linked_field) \ + __KFUZZTEST_ANNOTATE(arg_type, field, linked_field, ATTRIBUTE_LEN) + +#define KFUZZTEST_REGIONID_NULL U32_MAX + +/** + * The end of the input should be padded by at least this number of bytes as + * it is poisoned to detect out of bounds accesses at the end of the last + * region. + */ +#define KFUZZTEST_POISON_SIZE 0x8 + +#endif /* KFUZZTEST_H */ diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index dc0e0c6ed075e9..49a1748b9f241c 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1947,6 +1947,7 @@ endmenu menu "Kernel Testing and Coverage" source "lib/kunit/Kconfig" +source "lib/kfuzztest/Kconfig" config NOTIFIER_ERROR_INJECTION tristate "Notifier error injection" diff --git a/lib/Makefile b/lib/Makefile index 392ff808c9b902..02789bf8849927 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -325,6 +325,8 @@ obj-$(CONFIG_GENERIC_LIB_CMPDI2) += cmpdi2.o obj-$(CONFIG_GENERIC_LIB_UCMPDI2) += ucmpdi2.o obj-$(CONFIG_OBJAGG) += objagg.o +obj-$(CONFIG_KFUZZTEST) += kfuzztest/ + # pldmfw library obj-$(CONFIG_PLDMFW) += pldmfw/ diff --git a/lib/kfuzztest/Kconfig b/lib/kfuzztest/Kconfig new file mode 100644 index 00000000000000..f9fb5abf8d27db --- /dev/null +++ b/lib/kfuzztest/Kconfig @@ -0,0 +1,20 @@ +# SPDX-License-Identifier: GPL-2.0-only + +config KFUZZTEST + bool "KFuzzTest - enable support for internal fuzz targets" + depends on DEBUG_FS && DEBUG_KERNEL + help + Enables support for the kernel fuzz testing framework (KFuzzTest), an + interface for exposing internal kernel functions to a userspace fuzzing + engine. KFuzzTest targets are exposed via a debugfs interface that + accepts serialized userspace inputs, and is designed to make it easier + to fuzz deeply nested kernel code that is hard to reach from the system + call boundary. Using a simple macro-based API, developers can add a new + fuzz target with minimal boilerplate code. 
+ + It is strongly recommended to also enable CONFIG_KASAN for byte-accurate + out-of-bounds detection, as KFuzzTest was designed with this in mind. It + is also recommended to enable CONFIG_KCOV for coverage guided fuzzing. + + WARNING: This exposes internal kernel functions directly to userspace + and must NEVER be enabled in production builds. diff --git a/lib/kfuzztest/Makefile b/lib/kfuzztest/Makefile new file mode 100644 index 00000000000000..142d16007eea98 --- /dev/null +++ b/lib/kfuzztest/Makefile @@ -0,0 +1,4 @@ +# SPDX-License-Identifier: GPL-2.0 + +obj-$(CONFIG_KFUZZTEST) += kfuzztest.o +kfuzztest-objs := main.o parse.o diff --git a/lib/kfuzztest/main.c b/lib/kfuzztest/main.c new file mode 100644 index 00000000000000..c36a7a0b760288 --- /dev/null +++ b/lib/kfuzztest/main.c @@ -0,0 +1,242 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * KFuzzTest core module initialization and debugfs interface. + * + * Copyright 2025 Google LLC + */ +#include +#include +#include +#include +#include +#include +#include + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Ethan Graham "); +MODULE_DESCRIPTION("Kernel Fuzz Testing Framework (KFuzzTest)"); + +/* + * Enforce a fixed struct size to ensure a consistent stride when iterating over + * the array of these structs in the dedicated ELF section. + */ +static_assert(sizeof(struct kfuzztest_target) == 32, "struct kfuzztest_target should have size 32"); +static_assert(sizeof(struct kfuzztest_constraint) == 64, "struct kfuzztest_constraint should have size 64"); +static_assert(sizeof(struct kfuzztest_annotation) == 32, "struct kfuzztest_annotation should have size 32"); + +extern const struct kfuzztest_target __kfuzztest_targets_start[]; +extern const struct kfuzztest_target __kfuzztest_targets_end[]; + +/** + * struct kfuzztest_state - global state for the KFuzzTest module + * + * @kfuzztest_dir: The root debugfs directory, /sys/kernel/debug/kfuzztest/. + * @num_invocations: total number of target invocations. 
+ * @num_targets: number of registered targets. + * @target_fops: array of file operations for each registered target. + * @minalign_fops: file operations for the /_config/minalign file. + * @num_invocations_fops: file operations for the /_config/num_invocations file. + */ +struct kfuzztest_state { + struct dentry *kfuzztest_dir; + atomic_t num_invocations; + size_t num_targets; + + struct file_operations *target_fops; + struct file_operations minalign_fops; + struct file_operations num_invocations_fops; +}; + +static struct kfuzztest_state state; + +void record_invocation(void) +{ + atomic_inc(&state.num_invocations); +} + +static void cleanup_kfuzztest_state(struct kfuzztest_state *st) +{ + debugfs_remove_recursive(st->kfuzztest_dir); + st->num_targets = 0; + st->num_invocations = (atomic_t)ATOMIC_INIT(0); + kfree(st->target_fops); + st->target_fops = NULL; +} + +static const umode_t KFUZZTEST_INPUT_PERMS = 0222; +static const umode_t KFUZZTEST_MINALIGN_PERMS = 0444; + +static ssize_t read_cb_integer(struct file *filp, char __user *buf, size_t count, loff_t *f_pos, size_t value) +{ + char buffer[64]; + int len; + + len = scnprintf(buffer, sizeof(buffer), "%zu\n", value); + return simple_read_from_buffer(buf, count, f_pos, buffer, len); +} + +/* + * Callback for /sys/kernel/debug/kfuzztest/_config/minalign. Minalign + * corresponds to the minimum alignment that regions in a KFuzzTest input must + * satisfy. This callback returns that value in string format. + */ +static ssize_t minalign_read_cb(struct file *filp, char __user *buf, size_t count, loff_t *f_pos) +{ + int minalign = MAX(KFUZZTEST_POISON_SIZE, ARCH_KMALLOC_MINALIGN); + return read_cb_integer(filp, buf, count, f_pos, minalign); +} + +/* + * Callback for /sys/kernel/debug/kfuzztest/_config/num_invocations, which + * returns the value in string format. 
+ */ +static ssize_t num_invocations_read_cb(struct file *filp, char __user *buf, size_t count, loff_t *f_pos) +{ + return read_cb_integer(filp, buf, count, f_pos, atomic_read(&state.num_invocations)); +} + +static int create_read_only_file(struct dentry *parent, const char *name, struct file_operations *fops) +{ + struct dentry *file; + int err = 0; + + file = debugfs_create_file(name, KFUZZTEST_MINALIGN_PERMS, parent, NULL, fops); + if (!file) + err = -ENOMEM; + else if (IS_ERR(file)) + err = PTR_ERR(file); + return err; +} + +static int initialize_config_dir(struct kfuzztest_state *st) +{ + struct dentry *dir; + int err = 0; + + dir = debugfs_create_dir("_config", st->kfuzztest_dir); + if (!dir) + err = -ENOMEM; + else if (IS_ERR(dir)) + err = PTR_ERR(dir); + if (err) { + pr_info("kfuzztest: failed to create /_config dir"); + goto out; + } + + st->minalign_fops = (struct file_operations){ + .owner = THIS_MODULE, + .read = minalign_read_cb, + }; + err = create_read_only_file(dir, "minalign", &st->minalign_fops); + if (err) { + pr_info("kfuzztest: failed to create /_config/minalign"); + goto out; + } + + st->num_invocations_fops = (struct file_operations){ + .owner = THIS_MODULE, + .read = num_invocations_read_cb, + }; + err = create_read_only_file(dir, "num_invocations", &st->num_invocations_fops); + if (err) + pr_info("kfuzztest: failed to create /_config/num_invocations"); +out: + return err; +} + +static int initialize_target_dir(struct kfuzztest_state *st, const struct kfuzztest_target *targ, + struct file_operations *fops) +{ + struct dentry *dir, *input; + int err = 0; + + dir = debugfs_create_dir(targ->name, st->kfuzztest_dir); + if (!dir) + err = -ENOMEM; + else if (IS_ERR(dir)) + err = PTR_ERR(dir); + if (err) { + pr_info("kfuzztest: failed to create /kfuzztest/%s dir", targ->name); + goto out; + } + + input = debugfs_create_file("input", KFUZZTEST_INPUT_PERMS, dir, NULL, fops); + if (!input) + err = -ENOMEM; + else if (IS_ERR(input)) + err = 
PTR_ERR(input); + if (err) + pr_info("kfuzztest: failed to create /kfuzztest/%s/input", targ->name); +out: + return err; +} + +/** + * kfuzztest_init - initializes the debug filesystem for KFuzzTest + * + * Each registered target in the ".kfuzztest_target" section gets its own + * subdirectory under "/sys/kernel/debug/kfuzztest/" containing one + * write-only "input" file used for receiving inputs from userspace. + * Furthermore, a directory "/sys/kernel/debug/kfuzztest/_config" is created, + * containing two read-only files "minalign" and "num_invocations", that return + * the minimum required region alignment and number of target invocations + * respectively. + * + * Return: 0 on success, or a negative error code on failure. + */ +static int __init kfuzztest_init(void) +{ + const struct kfuzztest_target *targ; + int err = 0; + int i = 0; + + state.num_targets = __kfuzztest_targets_end - __kfuzztest_targets_start; + state.target_fops = kzalloc(sizeof(struct file_operations) * state.num_targets, GFP_KERNEL); + if (!state.target_fops) + return -ENOMEM; + + /* + * Create the main "kfuzztest" directory in /sys/kernel/debug. On + * failure, go through cleanup_failure so that target_fops is freed + * rather than leaked. + */ + state.kfuzztest_dir = debugfs_create_dir("kfuzztest", NULL); + if (!state.kfuzztest_dir) { + pr_warn("kfuzztest: could not create 'kfuzztest' debugfs directory"); + err = -ENOMEM; + goto cleanup_failure; + } + if (IS_ERR(state.kfuzztest_dir)) { + pr_warn("kfuzztest: could not create 'kfuzztest' debugfs directory"); + err = PTR_ERR(state.kfuzztest_dir); + state.kfuzztest_dir = NULL; + goto cleanup_failure; + } + + err = initialize_config_dir(&state); + if (err) + goto cleanup_failure; + + for (targ = __kfuzztest_targets_start; targ < __kfuzztest_targets_end; targ++, i++) { + state.target_fops[i] = (struct file_operations){ + .owner = THIS_MODULE, + .write = targ->write_input_cb, + }; + err = initialize_target_dir(&state, targ, &state.target_fops[i]); + /* Bail out if a single target fails to initialize. This avoids + * partial setup, and a failure here likely indicates an issue + * with debugfs.
*/ + if (err) + goto cleanup_failure; + pr_info("kfuzztest: registered target %s", targ->name); + } + return 0; + +cleanup_failure: + cleanup_kfuzztest_state(&state); + return err; +} + +static void __exit kfuzztest_exit(void) +{ + pr_info("kfuzztest: exiting"); + cleanup_kfuzztest_state(&state); +} + +module_init(kfuzztest_init); +module_exit(kfuzztest_exit); diff --git a/lib/kfuzztest/parse.c b/lib/kfuzztest/parse.c new file mode 100644 index 00000000000000..5aaeca6a7fdedf --- /dev/null +++ b/lib/kfuzztest/parse.c @@ -0,0 +1,204 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * KFuzzTest input parsing and validation. + * + * Copyright 2025 Google LLC + */ +#include +#include + +static int kfuzztest_relocate_v0(struct reloc_region_array *regions, struct reloc_table *rt, + unsigned char *payload_start, unsigned char *payload_end) +{ + unsigned char *poison_start, *poison_end; + struct reloc_region reg, src, dst; + uintptr_t *ptr_location; + struct reloc_entry re; + size_t i; + int ret; + + /* Patch pointers. */ + for (i = 0; i < rt->num_entries; i++) { + re = rt->entries[i]; + src = regions->regions[re.region_id]; + ptr_location = (uintptr_t *)(payload_start + src.offset + re.region_offset); + if (re.value == KFUZZTEST_REGIONID_NULL) + *ptr_location = (uintptr_t)NULL; + else if (re.value < regions->num_regions) { + dst = regions->regions[re.value]; + *ptr_location = (uintptr_t)(payload_start + dst.offset); + } else { + return -EINVAL; + } + } + + /* Poison the padding between regions. 
*/ + for (i = 0; i < regions->num_regions; i++) { + reg = regions->regions[i]; + + /* Points to the beginning of the inter-region padding */ + poison_start = payload_start + reg.offset + reg.size; + if (i < regions->num_regions - 1) + poison_end = payload_start + regions->regions[i + 1].offset; + else + poison_end = payload_end; + + if (poison_end > payload_end) + return -EINVAL; + + ret = kasan_poison_range(poison_start, poison_end - poison_start); + if (ret) + return ret; + } + + /* Poison the padded area preceding the payload. */ + return kasan_poison_range(payload_start - rt->padding_size, rt->padding_size); +} + +static bool kfuzztest_input_is_valid(struct reloc_region_array *regions, struct reloc_table *rt, + unsigned char *payload_start, unsigned char *payload_end) +{ + size_t payload_size = payload_end - payload_start; + struct reloc_region reg, next_reg; + size_t usable_payload_size; + uint32_t region_end_offset; + struct reloc_entry reloc; + uint32_t i; + + if (payload_start > payload_end) + return false; + if (payload_size < KFUZZTEST_POISON_SIZE) + return false; + if ((uintptr_t)payload_end % KFUZZTEST_POISON_SIZE) + return false; + usable_payload_size = payload_size - KFUZZTEST_POISON_SIZE; + + for (i = 0; i < regions->num_regions; i++) { + reg = regions->regions[i]; + if (check_add_overflow(reg.offset, reg.size, &region_end_offset)) + return false; + if ((size_t)region_end_offset > usable_payload_size) + return false; + + if (i < regions->num_regions - 1) { + next_reg = regions->regions[i + 1]; + if (reg.offset > next_reg.offset) + return false; + /* Enforce the minimum poisonable gap between + * consecutive regions.
*/ + if (reg.offset + reg.size + KFUZZTEST_POISON_SIZE > next_reg.offset) + return false; + } + } + + if (rt->padding_size < KFUZZTEST_POISON_SIZE) { + pr_info("validation failed because rt->padding_size = %u", rt->padding_size); + return false; + } + + for (i = 0; i < rt->num_entries; i++) { + reloc = rt->entries[i]; + if (reloc.region_id >= regions->num_regions) + return false; + if (reloc.value != KFUZZTEST_REGIONID_NULL && reloc.value >= regions->num_regions) + return false; + + reg = regions->regions[reloc.region_id]; + if (reloc.region_offset % (sizeof(uintptr_t)) || reloc.region_offset + sizeof(uintptr_t) > reg.size) + return false; + } + + return true; +} + +static int kfuzztest_parse_input_v0(unsigned char *input, size_t input_size, struct reloc_region_array **ret_regions, + struct reloc_table **ret_reloc_table, unsigned char **ret_payload_start, + unsigned char **ret_payload_end) +{ + size_t reloc_entries_size, reloc_regions_size; + unsigned char *payload_end, *payload_start; + size_t reloc_table_size, regions_size; + struct reloc_region_array *regions; + struct reloc_table *rt; + size_t curr_offset = 0; + + if (input_size < sizeof(struct reloc_region_array) + sizeof(struct reloc_table)) + return -EINVAL; + + regions = (struct reloc_region_array *)input; + if (check_mul_overflow(regions->num_regions, sizeof(struct reloc_region), &reloc_regions_size)) + return -EINVAL; + if (check_add_overflow(sizeof(*regions), reloc_regions_size, &regions_size)) + return -EINVAL; + + curr_offset = regions_size; + if (curr_offset > input_size) + return -EINVAL; + if (input_size - curr_offset < sizeof(struct reloc_table)) + return -EINVAL; + + rt = (struct reloc_table *)(input + curr_offset); + + if (check_mul_overflow((size_t)rt->num_entries, sizeof(struct reloc_entry), &reloc_entries_size)) + return -EINVAL; + if (check_add_overflow(sizeof(*rt), reloc_entries_size, &reloc_table_size)) + return -EINVAL; + if (check_add_overflow(reloc_table_size, rt->padding_size, 
&reloc_table_size)) + return -EINVAL; + + if (check_add_overflow(curr_offset, reloc_table_size, &curr_offset)) + return -EINVAL; + if (curr_offset > input_size) + return -EINVAL; + + payload_start = input + curr_offset; + payload_end = input + input_size; + + if (!kfuzztest_input_is_valid(regions, rt, payload_start, payload_end)) + return -EINVAL; + + *ret_regions = regions; + *ret_reloc_table = rt; + *ret_payload_start = payload_start; + *ret_payload_end = payload_end; + return 0; +} + +static int kfuzztest_parse_and_relocate_v0(unsigned char *input, size_t input_size, void **arg_ret) +{ + unsigned char *payload_start, *payload_end; + struct reloc_region_array *regions; + struct reloc_table *reloc_table; + int ret; + + ret = kfuzztest_parse_input_v0(input, input_size, &regions, &reloc_table, &payload_start, &payload_end); + if (ret < 0) + return ret; + + ret = kfuzztest_relocate_v0(regions, reloc_table, payload_start, payload_end); + if (ret < 0) + return ret; + *arg_ret = (void *)payload_start; + return 0; +} + +int kfuzztest_parse_and_relocate(void *input, size_t input_size, void **arg_ret) +{ + size_t header_size = 2 * sizeof(u32); + u32 version, magic; + + if (input_size < sizeof(u32) + sizeof(u32)) + return -EINVAL; + + magic = *(u32 *)input; + if (magic != KFUZZTEST_HEADER_MAGIC) + return -EINVAL; + + version = *(u32 *)(input + sizeof(u32)); + switch (version) { + case KFUZZTEST_V0: + return kfuzztest_parse_and_relocate_v0(input + header_size, input_size - header_size, arg_ret); + } + + return -EINVAL; +} diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c index d2c70cd2afb1de..7faed02264f2ba 100644 --- a/mm/kasan/shadow.c +++ b/mm/kasan/shadow.c @@ -147,6 +147,40 @@ void kasan_poison(const void *addr, size_t size, u8 value, bool init) } EXPORT_SYMBOL_GPL(kasan_poison); +int kasan_poison_range(const void *addr, size_t size) +{ + uintptr_t start_addr = (uintptr_t)addr; + uintptr_t head_granule_start; + uintptr_t poison_body_start; + uintptr_t poison_body_end; + 
size_t head_prefix_size; + uintptr_t end_addr; + + if ((start_addr + size) % KASAN_GRANULE_SIZE) + return -EINVAL; + + end_addr = ALIGN_DOWN(start_addr + size, KASAN_GRANULE_SIZE); + if (start_addr >= end_addr) + return -EINVAL; + + head_granule_start = ALIGN_DOWN(start_addr, KASAN_GRANULE_SIZE); + head_prefix_size = start_addr - head_granule_start; + + if (IS_ENABLED(CONFIG_KASAN_GENERIC) && head_prefix_size > 0) + kasan_poison_last_granule((void *)head_granule_start, + head_prefix_size); + + poison_body_start = ALIGN(start_addr, KASAN_GRANULE_SIZE); + poison_body_end = ALIGN_DOWN(end_addr, KASAN_GRANULE_SIZE); + + if (poison_body_start < poison_body_end) + kasan_poison((void *)poison_body_start, + poison_body_end - poison_body_start, + KASAN_SLAB_REDZONE, false); + return 0; +} +EXPORT_SYMBOL(kasan_poison_range); + #ifdef CONFIG_KASAN_GENERIC void kasan_poison_last_granule(const void *addr, size_t size) { diff --git a/samples/Kconfig b/samples/Kconfig index 6e072a5f1ed86d..5209dd9d7a5cf8 100644 --- a/samples/Kconfig +++ b/samples/Kconfig @@ -320,6 +320,13 @@ config SAMPLE_HUNG_TASK Reading these files with multiple processes triggers hung task detection by holding locks for a long time (256 seconds). +config SAMPLE_KFUZZTEST + bool "Build KFuzzTest sample targets" + depends on KFUZZTEST + help + Build KFuzzTest sample targets that serve as selftests for input + deserialization and inter-region redzone poisoning logic. 
+ source "samples/rust/Kconfig" source "samples/damon/Kconfig" diff --git a/samples/Makefile b/samples/Makefile index 07641e177bd8bb..3a0e7f744f445d 100644 --- a/samples/Makefile +++ b/samples/Makefile @@ -44,4 +44,5 @@ obj-$(CONFIG_SAMPLE_DAMON_WSSE) += damon/ obj-$(CONFIG_SAMPLE_DAMON_PRCL) += damon/ obj-$(CONFIG_SAMPLE_DAMON_MTIER) += damon/ obj-$(CONFIG_SAMPLE_HUNG_TASK) += hung_task/ +obj-$(CONFIG_SAMPLE_KFUZZTEST) += kfuzztest/ obj-$(CONFIG_SAMPLE_TSM_MR) += tsm-mr/ diff --git a/samples/kfuzztest/Makefile b/samples/kfuzztest/Makefile new file mode 100644 index 00000000000000..4f8709876c9e2d --- /dev/null +++ b/samples/kfuzztest/Makefile @@ -0,0 +1,3 @@ +# SPDX-License-Identifier: GPL-2.0-only + +obj-$(CONFIG_SAMPLE_KFUZZTEST) += overflow_on_nested_buffer.o underflow_on_buffer.o diff --git a/samples/kfuzztest/overflow_on_nested_buffer.c b/samples/kfuzztest/overflow_on_nested_buffer.c new file mode 100644 index 00000000000000..2f1c3ff9f750aa --- /dev/null +++ b/samples/kfuzztest/overflow_on_nested_buffer.c @@ -0,0 +1,71 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * This file contains a KFuzzTest example target that ensures that a buffer + * overflow on a nested region triggers a KASAN OOB access report. + * + * Copyright 2025 Google LLC + */ + +/** + * DOC: test_overflow_on_nested_buffer + * + * This test uses a struct with two distinct dynamically allocated buffers. + * It checks that KFuzzTest's memory layout correctly poisons the memory + * regions and that KASAN can detect an overflow when reading one byte past the + * end of the first buffer (`a`). + * + * It can be invoked with kfuzztest-bridge using the following command: + * + * ./kfuzztest-bridge \ + * "nested_buffers { ptr[a] len[a, u64] ptr[b] len[b, u64] }; \ + * a { arr[u8, 64] }; b { arr[u8, 64] };" \ + * "test_overflow_on_nested_buffer" /dev/urandom + * + * The first argument describes the C struct `nested_buffers` and specifies that + * both `a` and `b` are pointers to arrays of 64 bytes. 
+ */ +#include + +static void overflow_on_nested_buffer(const char *a, size_t a_len, const char *b, size_t b_len) +{ + size_t i; + pr_info("a = [%px, %px)", a, a + a_len); + pr_info("b = [%px, %px)", b, b + b_len); + + /* Ensure that all bytes in arg->b are accessible. */ + for (i = 0; i < b_len; i++) + READ_ONCE(b[i]); + /* + * Check that all bytes in arg->a are accessible, and provoke an OOB on + * the first byte to the right of the buffer which will trigger a KASAN + * report. + */ + for (i = 0; i <= a_len; i++) + READ_ONCE(a[i]); +} + +struct nested_buffers { + const char *a; + size_t a_len; + const char *b; + size_t b_len; +}; + +/** + * The KFuzzTest input format specifies that struct nested_buffers should + * be expanded as: + * + * | a | b | pad[8] | *a | pad[8] | *b | + * + * where the padded regions are poisoned. We expect to trigger a KASAN report by + * reading one byte past the end of the `a` buffer. + */ +FUZZ_TEST(test_overflow_on_nested_buffer, struct nested_buffers) +{ + KFUZZTEST_EXPECT_NOT_NULL(nested_buffers, a); + KFUZZTEST_EXPECT_NOT_NULL(nested_buffers, b); + KFUZZTEST_ANNOTATE_LEN(nested_buffers, a_len, a); + KFUZZTEST_ANNOTATE_LEN(nested_buffers, b_len, b); + + overflow_on_nested_buffer(arg->a, arg->a_len, arg->b, arg->b_len); +} diff --git a/samples/kfuzztest/underflow_on_buffer.c b/samples/kfuzztest/underflow_on_buffer.c new file mode 100644 index 00000000000000..02704a1bfebb46 --- /dev/null +++ b/samples/kfuzztest/underflow_on_buffer.c @@ -0,0 +1,59 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * This file contains a KFuzzTest example target that ensures that a buffer + * underflow on a region triggers a KASAN OOB access report. + * + * Copyright 2025 Google LLC + */ + +/** + * DOC: test_underflow_on_buffer + * + * This test ensures that the region between the metadata struct and the + * dynamically allocated buffer is poisoned. It provokes a one-byte underflow + * on the buffer, which should be caught by KASAN.
+ * + * It can be invoked with kfuzztest-bridge using the following command: + * + * ./kfuzztest-bridge \ + * "some_buffer { ptr[buf] len[buf, u64]}; buf { arr[u8, 128] };" \ + * "test_underflow_on_buffer" /dev/urandom + * + * The first argument describes the C struct `some_buffer` and specifies that + * `buf` is a pointer to an array of 128 bytes. The second argument is the test + * name, and the third is a seed file. + */ +#include + +static void underflow_on_buffer(char *buf, size_t buflen) +{ + size_t i; + + pr_info("buf = [%px, %px)", buf, buf + buflen); + + /* First ensure that all bytes in buf are accessible. */ + for (i = 0; i < buflen; i++) + READ_ONCE(buf[i]); + /* + * Provoke a buffer underflow by reading the byte immediately preceding + * buf, triggering a KASAN report. + */ + READ_ONCE(*((char *)buf - 1)); +} + +struct some_buffer { + char *buf; + size_t buflen; +}; + +/** + * Tests that the region between struct some_buffer and the expanded *buf field + * is correctly poisoned by accessing the first byte before *buf.
+ */ +FUZZ_TEST(test_underflow_on_buffer, struct some_buffer) +{ + KFUZZTEST_EXPECT_NOT_NULL(some_buffer, buf); + KFUZZTEST_ANNOTATE_LEN(some_buffer, buflen, buf); + + underflow_on_buffer(arg->buf, arg->buflen); +} diff --git a/tools/Makefile b/tools/Makefile index c31cbbd12c456a..dfb0cd19aeb939 100644 --- a/tools/Makefile +++ b/tools/Makefile @@ -21,6 +21,7 @@ help: @echo ' hv - tools used when in Hyper-V clients' @echo ' iio - IIO tools' @echo ' intel-speed-select - Intel Speed Select tool' + @echo ' kfuzztest-bridge - KFuzzTest userspace utility' @echo ' kvm_stat - top-like utility for displaying kvm statistics' @echo ' leds - LEDs tools' @echo ' nolibc - nolibc headers testing and installation' @@ -98,6 +99,9 @@ sched_ext: FORCE selftests: FORCE $(call descend,testing/$@) +kfuzztest-bridge: FORCE + $(call descend,testing/kfuzztest-bridge) + thermal: FORCE $(call descend,lib/$@) @@ -126,7 +130,8 @@ all: acpi counter cpupower gpio hv firewire \ perf selftests bootconfig spi turbostat usb \ virtio mm bpf x86_energy_perf_policy \ tmon freefall iio objtool kvm_stat wmi \ - debugging tracing thermal thermometer thermal-engine ynl + debugging tracing thermal thermometer thermal-engine ynl \ + kfuzztest-bridge acpi_install: $(call descend,power/$(@:_install=),install) @@ -140,6 +145,9 @@ counter_install firewire_install gpio_install hv_install iio_install perf_instal selftests_install: $(call descend,testing/$(@:_install=),install) +kfuzztest-bridge_install: + $(call descend,testing/kfuzztest-bridge,install) + thermal_install: $(call descend,lib/$(@:_install=),install) @@ -170,7 +178,8 @@ install: acpi_install counter_install cpupower_install gpio_install \ virtio_install mm_install bpf_install x86_energy_perf_policy_install \ tmon_install freefall_install objtool_install kvm_stat_install \ wmi_install debugging_install intel-speed-select_install \ - tracing_install thermometer_install thermal-engine_install ynl_install + tracing_install thermometer_install 
thermal-engine_install ynl_install \ + kfuzztest-bridge_install acpi_clean: $(call descend,power/acpi,clean) @@ -200,6 +209,9 @@ sched_ext_clean: selftests_clean: $(call descend,testing/$(@:_clean=),clean) +kfuzztest-bridge_clean: + $(call descend,testing/kfuzztest-bridge,clean) + thermal_clean: $(call descend,lib/thermal,clean) @@ -230,6 +242,6 @@ clean: acpi_clean counter_clean cpupower_clean hv_clean firewire_clean \ freefall_clean build_clean libbpf_clean libsubcmd_clean \ gpio_clean objtool_clean leds_clean wmi_clean firmware_clean debugging_clean \ intel-speed-select_clean tracing_clean thermal_clean thermometer_clean thermal-engine_clean \ - sched_ext_clean ynl_clean + sched_ext_clean ynl_clean kfuzztest-bridge_clean .PHONY: FORCE diff --git a/tools/testing/kfuzztest-bridge/.gitignore b/tools/testing/kfuzztest-bridge/.gitignore new file mode 100644 index 00000000000000..4aa9fb0d44e292 --- /dev/null +++ b/tools/testing/kfuzztest-bridge/.gitignore @@ -0,0 +1,2 @@ +# SPDX-License-Identifier: GPL-2.0-only +kfuzztest-bridge diff --git a/tools/testing/kfuzztest-bridge/Build b/tools/testing/kfuzztest-bridge/Build new file mode 100644 index 00000000000000..d07341a226d636 --- /dev/null +++ b/tools/testing/kfuzztest-bridge/Build @@ -0,0 +1,6 @@ +kfuzztest-bridge-y += bridge.o +kfuzztest-bridge-y += byte_buffer.o +kfuzztest-bridge-y += encoder.o +kfuzztest-bridge-y += input_lexer.o +kfuzztest-bridge-y += input_parser.o +kfuzztest-bridge-y += rand_stream.o diff --git a/tools/testing/kfuzztest-bridge/Makefile b/tools/testing/kfuzztest-bridge/Makefile new file mode 100644 index 00000000000000..6e110bdeaee51c --- /dev/null +++ b/tools/testing/kfuzztest-bridge/Makefile @@ -0,0 +1,49 @@ +# SPDX-License-Identifier: GPL-2.0 +# Makefile for KFuzzTest-Bridge +include ../../scripts/Makefile.include + +bindir ?= /usr/bin + +ifeq ($(srctree),) +srctree := $(patsubst %/,%,$(dir $(CURDIR))) +srctree := $(patsubst %/,%,$(dir $(srctree))) +srctree := $(patsubst %/,%,$(dir $(srctree))) 
+endif + +MAKEFLAGS += -r + +override CFLAGS += -O2 -g +override CFLAGS += -Wall -Wextra +override CFLAGS += -D_GNU_SOURCE +override CFLAGS += -I$(OUTPUT)include -I$(srctree)/tools/include + +ALL_TARGETS := kfuzztest-bridge +ALL_PROGRAMS := $(patsubst %,$(OUTPUT)%,$(ALL_TARGETS)) + +KFUZZTEST_BRIDGE_IN := $(OUTPUT)kfuzztest-bridge-in.o +KFUZZTEST_BRIDGE := $(OUTPUT)kfuzztest-bridge + +all: $(ALL_PROGRAMS) + +export srctree OUTPUT CC LD CFLAGS +include $(srctree)/tools/build/Makefile.include + +$(KFUZZTEST_BRIDGE_IN): FORCE + $(Q)$(MAKE) $(build)=kfuzztest-bridge + +$(KFUZZTEST_BRIDGE): $(KFUZZTEST_BRIDGE_IN) + $(QUIET_LINK)$(CC) $(CFLAGS) $< -o $@ $(LDFLAGS) + +clean: + rm -f $(ALL_PROGRAMS) + find $(or $(OUTPUT),.) -name '*.o' -delete -o -name '\.*.d' -delete -o -name '\.*.o.cmd' -delete + +install: $(ALL_PROGRAMS) + install -d -m 755 $(DESTDIR)$(bindir); \ + for program in $(ALL_PROGRAMS); do \ + install $$program $(DESTDIR)$(bindir); \ + done + +FORCE: + +.PHONY: all install clean FORCE prepare diff --git a/tools/testing/kfuzztest-bridge/bridge.c b/tools/testing/kfuzztest-bridge/bridge.c new file mode 100644 index 00000000000000..aec0eb4e9ff73c --- /dev/null +++ b/tools/testing/kfuzztest-bridge/bridge.c @@ -0,0 +1,115 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * KFuzzTest tool for sending inputs into a KFuzzTest harness + * + * Copyright 2025 Google LLC + */ +#include +#include +#include +#include +#include + +#include "byte_buffer.h" +#include "encoder.h" +#include "input_lexer.h" +#include "input_parser.h" +#include "rand_stream.h" + +static int invoke_kfuzztest_target(const char *target_name, const char *data, ssize_t data_size) +{ + ssize_t bytes_written; + char *buf = NULL; + int ret; + int fd; + + if (asprintf(&buf, "/sys/kernel/debug/kfuzztest/%s/input", target_name) < 0) + return -ENOMEM; + + fd = openat(AT_FDCWD, buf, O_WRONLY, 0); + if (fd < 0) { + ret = -errno; + goto out_free; + } + + /* + * A KFuzzTest target's debugfs handler expects the entire 
input to be + * written in a single contiguous blob. Treat partial writes as errors. + */ + bytes_written = write(fd, data, data_size); + if (bytes_written != data_size) { + ret = (bytes_written < 0) ? -errno : -EIO; + goto out_close; + } + ret = 0; + +out_close: + if (close(fd) != 0 && ret == 0) + ret = -errno; +out_free: + free(buf); + return ret; +} + +static int invoke_one(const char *input_fmt, const char *fuzz_target, const char *input_filepath) +{ + struct ast_node *ast_prog; + struct byte_buffer *bb; + struct rand_stream *rs; + struct token **tokens; + size_t num_tokens; + size_t num_bytes; + int err; + + err = tokenize(input_fmt, &tokens, &num_tokens); + if (err) { + fprintf(stderr, "tokenization failed: %s\n", strerror(-err)); + return err; + } + + err = parse(tokens, num_tokens, &ast_prog); + if (err) { + fprintf(stderr, "parsing failed: %s\n", strerror(-err)); + goto cleanup_tokens; + } + + rs = new_rand_stream(input_filepath, 1024); + if (!rs) { + err = -ENOMEM; + goto cleanup_ast; + } + + err = encode(ast_prog, rs, &num_bytes, &bb); + if (err == STREAM_EOF) { + fprintf(stderr, "encoding failed: reached EOF in %s\n", input_filepath); + err = -EINVAL; + goto cleanup_rs; + } else if (err) { + fprintf(stderr, "encoding failed: %s\n", strerror(-err)); + goto cleanup_rs; + } + + err = invoke_kfuzztest_target(fuzz_target, bb->buffer, (ssize_t)num_bytes); + if (err) + fprintf(stderr, "invocation failed: %s\n", strerror(-err)); + + destroy_byte_buffer(bb); +cleanup_rs: + destroy_rand_stream(rs); +cleanup_ast: + destroy_ast_node(ast_prog); +cleanup_tokens: + destroy_tokens(tokens, num_tokens); + return err; +} + +int main(int argc, char *argv[]) +{ + if (argc != 4) { + printf("Usage: %s \n", argv[0]); + printf("For more detailed information see Documentation/dev-tools/kfuzztest.rst\n"); + return 1; + } + + return invoke_one(argv[1], argv[2], argv[3]); +} diff --git a/tools/testing/kfuzztest-bridge/byte_buffer.c b/tools/testing/kfuzztest-bridge/byte_buffer.c new 
file mode 100644 index 00000000000000..1974dbf3862e8c --- /dev/null +++ b/tools/testing/kfuzztest-bridge/byte_buffer.c @@ -0,0 +1,85 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * A simple byte buffer implementation for encoding binary data + * + * Copyright 2025 Google LLC + */ +#include +#include +#include + +#include "byte_buffer.h" + +struct byte_buffer *new_byte_buffer(size_t initial_size) +{ + struct byte_buffer *ret; + size_t alloc_size = initial_size >= 8 ? initial_size : 8; + + ret = malloc(sizeof(*ret)); + if (!ret) + return NULL; + + ret->alloc_size = alloc_size; + ret->buffer = malloc(alloc_size); + if (!ret->buffer) { + free(ret); + return NULL; + } + ret->num_bytes = 0; + return ret; +} + +void destroy_byte_buffer(struct byte_buffer *buf) +{ + free(buf->buffer); + free(buf); +} + +int append_bytes(struct byte_buffer *buf, const char *bytes, size_t num_bytes) +{ + size_t req_size; + size_t new_size; + char *new_ptr; + + req_size = buf->num_bytes + num_bytes; + new_size = buf->alloc_size; + + while (req_size > new_size) + new_size *= 2; + if (new_size != buf->alloc_size) { + new_ptr = realloc(buf->buffer, new_size); + if (!new_ptr) + return -ENOMEM; + buf->buffer = new_ptr; + buf->alloc_size = new_size; + } + memcpy(buf->buffer + buf->num_bytes, bytes, num_bytes); + buf->num_bytes += num_bytes; + return 0; +} + +int append_byte(struct byte_buffer *buf, char c) +{ + return append_bytes(buf, &c, 1); +} + +int encode_le(struct byte_buffer *buf, uint64_t value, size_t byte_width) +{ + size_t i; + int ret; + + for (i = 0; i < byte_width; ++i) + if ((ret = append_byte(buf, (uint8_t)((value >> (i * 8)) & 0xFF)))) + return ret; + return 0; +} + +int pad(struct byte_buffer *buf, size_t num_padding) +{ + int ret; + size_t i; + for (i = 0; i < num_padding; i++) + if ((ret = append_byte(buf, 0))) + return ret; + return 0; +} diff --git a/tools/testing/kfuzztest-bridge/byte_buffer.h b/tools/testing/kfuzztest-bridge/byte_buffer.h new file mode 100644 index 
00000000000000..6a31bfb5e78f43 --- /dev/null +++ b/tools/testing/kfuzztest-bridge/byte_buffer.h @@ -0,0 +1,31 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * A simple byte buffer implementation for encoding binary data + * + * Copyright 2025 Google LLC + */ +#ifndef KFUZZTEST_BRIDGE_BYTE_BUFFER_H +#define KFUZZTEST_BRIDGE_BYTE_BUFFER_H + +#include +#include + +struct byte_buffer { + char *buffer; + size_t num_bytes; + size_t alloc_size; +}; + +struct byte_buffer *new_byte_buffer(size_t initial_size); + +void destroy_byte_buffer(struct byte_buffer *buf); + +int append_bytes(struct byte_buffer *buf, const char *bytes, size_t num_bytes); + +int append_byte(struct byte_buffer *buf, char c); + +int encode_le(struct byte_buffer *buf, uint64_t value, size_t byte_width); + +int pad(struct byte_buffer *buf, size_t num_padding); + +#endif /* KFUZZTEST_BRIDGE_BYTE_BUFFER_H */ diff --git a/tools/testing/kfuzztest-bridge/encoder.c b/tools/testing/kfuzztest-bridge/encoder.c new file mode 100644 index 00000000000000..11ff5bd589d3f2 --- /dev/null +++ b/tools/testing/kfuzztest-bridge/encoder.c @@ -0,0 +1,390 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Encoder for KFuzzTest binary input format + * + * Copyright 2025 Google LLC + */ +#include +#include +#include +#include +#include + +#include "byte_buffer.h" +#include "input_parser.h" +#include "rand_stream.h" + +#define KFUZZTEST_MAGIC 0xBFACE +#define KFUZZTEST_PROTO_VERSION 0 + +/* + * The KFuzzTest binary input format requires at least 8 bytes of padding + * at the head and tail of every region. 
+ */ +#define KFUZZTEST_POISON_SIZE 8 + +#define BUFSIZE_SMALL 32 +#define BUFSIZE_LARGE 128 + +struct region_info { + const char *name; + uint32_t offset; + uint32_t size; +}; + +struct reloc_info { + uint32_t src_reg; + uint32_t offset; + uint32_t dst_reg; +}; + +struct encoder_ctx { + struct byte_buffer *payload; + struct rand_stream *rand; + + struct region_info *regions; + size_t num_regions; + + struct reloc_info *relocations; + size_t num_relocations; + + size_t minalign; + size_t reg_offset; + int curr_reg; +}; + +static void cleanup_ctx(struct encoder_ctx *ctx) +{ + if (ctx->regions) + free(ctx->regions); + if (ctx->relocations) + free(ctx->relocations); + if (ctx->payload) + destroy_byte_buffer(ctx->payload); +} + +static int read_minalign(struct encoder_ctx *ctx) +{ + const char *minalign_file = "/sys/kernel/debug/kfuzztest/_config/minalign"; + char buffer[64 + 1] = { 0 }; + int ret = 0; + + FILE *f = fopen(minalign_file, "r"); + if (!f) + return -ENOENT; + + fread(&buffer, 1, sizeof(buffer) - 1, f); + if (ferror(f)) + return ferror(f); + + /* + * atoi returns 0 on error. Since we expect a strictly positive + * minalign value on all architectures, any non-positive value + * represents an error. 
+ */ + ret = atoi(buffer); + if (ret <= 0) { + fclose(f); + return -EINVAL; + } + ctx->minalign = ret; + fclose(f); + return 0; +} + +static int pad_payload(struct encoder_ctx *ctx, size_t amount) +{ + int ret; + + if ((ret = pad(ctx->payload, amount))) + return ret; + ctx->reg_offset += amount; + return 0; +} + +static int align_payload(struct encoder_ctx *ctx, size_t alignment) +{ + size_t pad_amount = ROUND_UP_TO_MULTIPLE(ctx->payload->num_bytes, alignment) - ctx->payload->num_bytes; + return pad_payload(ctx, pad_amount); +} + +static int lookup_reg(struct encoder_ctx *ctx, const char *name) +{ + size_t i; + + for (i = 0; i < ctx->num_regions; i++) { + if (strcmp(ctx->regions[i].name, name) == 0) + return i; + } + return -ENOENT; +} + +static int add_reloc(struct encoder_ctx *ctx, struct reloc_info reloc) +{ + void *new_ptr = realloc(ctx->relocations, (ctx->num_relocations + 1) * sizeof(struct reloc_info)); + if (!new_ptr) + return -ENOMEM; + + ctx->relocations = new_ptr; + ctx->relocations[ctx->num_relocations] = reloc; + ctx->num_relocations++; + return 0; +} + +static int build_region_map(struct encoder_ctx *ctx, struct ast_node *top_level) +{ + struct ast_program *prog; + struct ast_node *reg; + size_t i; + + if (top_level->type != NODE_PROGRAM) + return -EINVAL; + + prog = &top_level->data.program; + ctx->regions = malloc(prog->num_members * sizeof(struct region_info)); + if (!ctx->regions) + return -ENOMEM; + + ctx->num_regions = prog->num_members; + for (i = 0; i < ctx->num_regions; i++) { + reg = prog->members[i]; + /* Offset is determined after the second pass. */ + ctx->regions[i] = (struct region_info){ + .name = reg->data.region.name, + .size = node_size(reg), + }; + } + return 0; +} +/** + * Encodes a value node as little-endian. A value node is one that has no + * children, and can therefore be directly written into the payload. 
+ */ +static int encode_value_le(struct encoder_ctx *ctx, struct ast_node *node) +{ + size_t array_size; + char rand_char; + size_t length; + size_t i; + int reg; + int ret; + + switch (node->type) { + case NODE_ARRAY: + array_size = node->data.array.num_elems * node->data.array.elem_size; + for (i = 0; i < array_size; i++) { + if ((ret = next_byte(ctx->rand, &rand_char))) + return ret; + if ((ret = append_byte(ctx->payload, rand_char))) + return ret; + } + ctx->reg_offset += array_size; + if (node->data.array.null_terminated) { + if ((ret = pad_payload(ctx, 1))) + return ret; + /* pad_payload() has already advanced reg_offset. */ + } + break; + case NODE_LENGTH: + reg = lookup_reg(ctx, node->data.length.length_of); + if (reg < 0) + return reg; + length = ctx->regions[reg].size; + if ((ret = encode_le(ctx->payload, length, node->data.length.byte_width))) + return ret; + ctx->reg_offset += node->data.length.byte_width; + break; + case NODE_PRIMITIVE: + for (i = 0; i < node->data.primitive.byte_width; i++) { + if ((ret = next_byte(ctx->rand, &rand_char))) + return ret; + if ((ret = append_byte(ctx->payload, rand_char))) + return ret; + } + ctx->reg_offset += node->data.primitive.byte_width; + break; + case NODE_POINTER: + reg = lookup_reg(ctx, node->data.pointer.points_to); + if (reg < 0) + return reg; + if ((ret = add_reloc(ctx, (struct reloc_info){ .src_reg = ctx->curr_reg, + .offset = ctx->reg_offset, + .dst_reg = reg }))) + return ret; + /* Placeholder pointer value, as pointers are patched by KFuzzTest anyway.
*/ + if ((ret = encode_le(ctx->payload, UINTPTR_MAX, sizeof(uintptr_t)))) + return ret; + ctx->reg_offset += sizeof(uintptr_t); + break; + case NODE_PROGRAM: + case NODE_REGION: + default: + return -EINVAL; + } + return 0; +} + +static int encode_region(struct encoder_ctx *ctx, struct ast_region *reg) +{ + struct ast_node *child; + size_t i; + int ret; + + ctx->reg_offset = 0; + for (i = 0; i < reg->num_members; i++) { + child = reg->members[i]; + if ((ret = align_payload(ctx, node_alignment(child)))) + return ret; + if ((ret = encode_value_le(ctx, child))) + return ret; + } + return 0; +} + +static int encode_payload(struct encoder_ctx *ctx, struct ast_node *top_level) +{ + struct ast_node *reg; + size_t i; + int ret; + + for (i = 0; i < ctx->num_regions; i++) { + reg = top_level->data.program.members[i]; + if ((ret = align_payload(ctx, MAX(ctx->minalign, node_alignment(reg))))) + return ret; + + ctx->curr_reg = i; + ctx->regions[i].offset = ctx->payload->num_bytes; + if ((ret = encode_region(ctx, &reg->data.region))) + return ret; + if ((ret = pad_payload(ctx, KFUZZTEST_POISON_SIZE))) + return ret; + } + return align_payload(ctx, ctx->minalign); +} + +static int encode_region_array(struct encoder_ctx *ctx, struct byte_buffer **ret) +{ + struct byte_buffer *reg_array; + struct region_info info; + int retcode; + size_t i; + + reg_array = new_byte_buffer(BUFSIZE_SMALL); + if (!reg_array) + return -ENOMEM; + + if ((retcode = encode_le(reg_array, ctx->num_regions, sizeof(uint32_t)))) + goto fail; + + for (i = 0; i < ctx->num_regions; i++) { + info = ctx->regions[i]; + if ((retcode = encode_le(reg_array, info.offset, sizeof(uint32_t)))) + goto fail; + if ((retcode = encode_le(reg_array, info.size, sizeof(uint32_t)))) + goto fail; + } + *ret = reg_array; + return 0; + +fail: + destroy_byte_buffer(reg_array); + return retcode; +} + +static int encode_reloc_table(struct encoder_ctx *ctx, size_t padding_amount, struct byte_buffer **ret) +{ + struct byte_buffer *reloc_table; +
struct reloc_info info; + int retcode; + size_t i; + + reloc_table = new_byte_buffer(BUFSIZE_SMALL); + if (!reloc_table) + return -ENOMEM; + + if ((retcode = encode_le(reloc_table, ctx->num_relocations, sizeof(uint32_t))) || + (retcode = encode_le(reloc_table, padding_amount, sizeof(uint32_t)))) + goto fail; + + for (i = 0; i < ctx->num_relocations; i++) { + info = ctx->relocations[i]; + if ((retcode = encode_le(reloc_table, info.src_reg, sizeof(uint32_t))) || + (retcode = encode_le(reloc_table, info.offset, sizeof(uint32_t))) || + (retcode = encode_le(reloc_table, info.dst_reg, sizeof(uint32_t)))) + goto fail; + } + pad(reloc_table, padding_amount); + *ret = reloc_table; + return 0; + +fail: + destroy_byte_buffer(reloc_table); + return retcode; +} + +static size_t reloc_table_size(struct encoder_ctx *ctx) +{ + return 2 * sizeof(uint32_t) + 3 * ctx->num_relocations * sizeof(uint32_t); +} + +int encode(struct ast_node *top_level, struct rand_stream *r, size_t *num_bytes, struct byte_buffer **ret) +{ + struct byte_buffer *region_array = NULL; + struct byte_buffer *final_buffer = NULL; + struct byte_buffer *reloc_table = NULL; + size_t header_size; + int alignment; + int retcode; + + struct encoder_ctx ctx = { 0 }; + if ((retcode = read_minalign(&ctx))) + return retcode; + + if ((retcode = build_region_map(&ctx, top_level))) + goto fail; + + ctx.rand = r; + ctx.payload = new_byte_buffer(BUFSIZE_SMALL); + if (!ctx.payload) { + retcode = -ENOMEM; + goto fail; + } + if ((retcode = encode_payload(&ctx, top_level))) + goto fail; + + if ((retcode = encode_region_array(&ctx, &region_array))) + goto fail; + + header_size = sizeof(uint64_t) + region_array->num_bytes + reloc_table_size(&ctx); + alignment = node_alignment(top_level); + if ((retcode = encode_reloc_table( + &ctx, ROUND_UP_TO_MULTIPLE(header_size + KFUZZTEST_POISON_SIZE, alignment) - header_size, + &reloc_table))) + goto fail; + + final_buffer = new_byte_buffer(BUFSIZE_LARGE); + if (!final_buffer) { + retcode =
-ENOMEM; + goto fail; + } + + if ((retcode = encode_le(final_buffer, KFUZZTEST_MAGIC, sizeof(uint32_t))) || + (retcode = encode_le(final_buffer, KFUZZTEST_PROTO_VERSION, sizeof(uint32_t))) || + (retcode = append_bytes(final_buffer, region_array->buffer, region_array->num_bytes)) || + (retcode = append_bytes(final_buffer, reloc_table->buffer, reloc_table->num_bytes)) || + (retcode = append_bytes(final_buffer, ctx.payload->buffer, ctx.payload->num_bytes))) { + destroy_byte_buffer(final_buffer); + goto fail; + } + + *num_bytes = final_buffer->num_bytes; + *ret = final_buffer; + +fail: + if (region_array) + destroy_byte_buffer(region_array); + if (reloc_table) + destroy_byte_buffer(reloc_table); + cleanup_ctx(&ctx); + return retcode; +} diff --git a/tools/testing/kfuzztest-bridge/encoder.h b/tools/testing/kfuzztest-bridge/encoder.h new file mode 100644 index 00000000000000..73f8c4b7893cb8 --- /dev/null +++ b/tools/testing/kfuzztest-bridge/encoder.h @@ -0,0 +1,16 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Encoder for KFuzzTest binary input format + * + * Copyright 2025 Google LLC + */ +#ifndef KFUZZTEST_BRIDGE_ENCODER_H +#define KFUZZTEST_BRIDGE_ENCODER_H + +#include "input_parser.h" +#include "rand_stream.h" +#include "byte_buffer.h" + +int encode(struct ast_node *top_level, struct rand_stream *r, size_t *num_bytes, struct byte_buffer **ret); + +#endif /* KFUZZTEST_BRIDGE_ENCODER_H */ diff --git a/tools/testing/kfuzztest-bridge/input_lexer.c b/tools/testing/kfuzztest-bridge/input_lexer.c new file mode 100644 index 00000000000000..d0a3e352a2654a --- /dev/null +++ b/tools/testing/kfuzztest-bridge/input_lexer.c @@ -0,0 +1,256 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Lexer for KFuzzTest textual input format + * + * Copyright 2025 Google LLC + */ +#include +#include +#include +#include +#include +#include + +#include "input_lexer.h" + +struct keyword_map { + const char *keyword; + enum token_type type; +}; + +static struct keyword_map keywords[] = { + { "ptr",
TOKEN_KEYWORD_PTR }, { "arr", TOKEN_KEYWORD_ARR }, + { "len", TOKEN_KEYWORD_LEN }, { "str", TOKEN_KEYWORD_STR }, + { "u8", TOKEN_KEYWORD_U8 }, { "u16", TOKEN_KEYWORD_U16 }, + { "u32", TOKEN_KEYWORD_U32 }, { "u64", TOKEN_KEYWORD_U64 }, +}; + +static struct token *make_token(enum token_type type, size_t position) +{ + struct token *ret = calloc(1, sizeof(*ret)); + ret->position = position; + ret->type = type; + return ret; +} + +void destroy_tokens(struct token **tokens, size_t num_tokens) +{ + size_t i; + + if (!tokens) + return; + + for (i = 0; i < num_tokens; i++) + if (tokens[i]) + free(tokens[i]); + free(tokens); +} + +struct lexer { + const char *start; + const char *current; + size_t position; +}; + +static char advance(struct lexer *l) +{ + l->current++; + l->position++; + return l->current[-1]; +} + +static void retreat(struct lexer *l) +{ + l->position--; + l->current--; +} + +static char peek(struct lexer *l) +{ + return *l->current; +} + +static bool is_digit(char c) +{ + return c >= '0' && c <= '9'; +} + +static bool is_alpha(char c) +{ + return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'); +} + +static bool is_whitespace(char c) +{ + switch (c) { + case ' ': + case '\r': + case '\t': + case '\n': + return true; + default: + return false; + } +} + +static void skip_whitespace(struct lexer *l) +{ + while (is_whitespace(peek(l))) + advance(l); +} + +static struct token *number(struct lexer *l) +{ + struct token *tok; + uint64_t value; + while (is_digit(peek(l))) + advance(l); + value = strtoull(l->start, NULL, 10); + tok = make_token(TOKEN_INTEGER, l->position); + tok->data.integer = value; + return tok; +} + +static enum token_type check_keyword(struct lexer *l, const char *keyword, + enum token_type type) +{ + size_t len = strlen(keyword); + + if (((size_t)(l->current - l->start) == len) && + strncmp(l->start, keyword, len) == 0) + return type; + return TOKEN_IDENTIFIER; +} + +static struct token *identifier(struct lexer *l) +{ + enum token_type type 
= TOKEN_IDENTIFIER; + struct token *tok; + size_t i; + + while (is_digit(peek(l)) || is_alpha(peek(l)) || peek(l) == '_') + advance(l); + + for (i = 0; i < ARRAY_SIZE(keywords); i++) { + if (check_keyword(l, keywords[i].keyword, keywords[i].type) != + TOKEN_IDENTIFIER) { + type = keywords[i].type; + break; + } + } + + tok = make_token(type, l->position); + if (!tok) + return NULL; + if (type == TOKEN_IDENTIFIER) { + tok->data.identifier.start = l->start; + tok->data.identifier.length = l->current - l->start; + } + return tok; +} + +static struct token *scan_token(struct lexer *l) +{ + char c; + skip_whitespace(l); + + l->start = l->current; + c = peek(l); + + if (c == '\0') + return make_token(TOKEN_EOF, l->position); + + advance(l); + switch (c) { + case '{': + return make_token(TOKEN_LBRACE, l->position); + case '}': + return make_token(TOKEN_RBRACE, l->position); + case '[': + return make_token(TOKEN_LBRACKET, l->position); + case ']': + return make_token(TOKEN_RBRACKET, l->position); + case ',': + return make_token(TOKEN_COMMA, l->position); + case ';': + return make_token(TOKEN_SEMICOLON, l->position); + default: + retreat(l); + if (is_digit(c)) + return number(l); + if (is_alpha(c) || c == '_') + return identifier(l); + return make_token(TOKEN_ERROR, l->position); + } +} + +int primitive_byte_width(enum token_type type) +{ + switch (type) { + case TOKEN_KEYWORD_U8: + return 1; + case TOKEN_KEYWORD_U16: + return 2; + case TOKEN_KEYWORD_U32: + return 4; + case TOKEN_KEYWORD_U64: + return 8; + default: + return 0; + } +} + +int tokenize(const char *input, struct token ***tokens, size_t *num_tokens) +{ + struct lexer l = { .start = input, .current = input }; + struct token **ret_tokens; + size_t token_arr_size; + size_t token_count; + struct token *tok; + void *tmp; + int err; + + token_arr_size = 128; + ret_tokens = calloc(token_arr_size, sizeof(struct token *)); + if (!ret_tokens) + return -ENOMEM; + + token_count = 0; + do { + tok = scan_token(&l); + if (!tok) 
{ + err = -ENOMEM; + goto failure; + } + + if (token_count == token_arr_size) { + token_arr_size *= 2; + tmp = realloc(ret_tokens, token_arr_size * sizeof(struct token *)); + if (!tmp) { + err = -ENOMEM; + goto failure; + } + ret_tokens = tmp; + } + + ret_tokens[token_count] = tok; + token_count++; + if (tok->type == TOKEN_ERROR) { + err = -EINVAL; + goto failure; + } + } while (tok->type != TOKEN_EOF); + + *tokens = ret_tokens; + *num_tokens = token_count; + return 0; + +failure: + destroy_tokens(ret_tokens, token_count); + return err; +} + +bool is_primitive(struct token *tok) +{ + return tok->type >= TOKEN_KEYWORD_U8 && tok->type <= TOKEN_KEYWORD_U64; +} diff --git a/tools/testing/kfuzztest-bridge/input_lexer.h b/tools/testing/kfuzztest-bridge/input_lexer.h new file mode 100644 index 00000000000000..40814493c24de8 --- /dev/null +++ b/tools/testing/kfuzztest-bridge/input_lexer.h @@ -0,0 +1,58 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Lexer for KFuzzTest textual input format + * + * Copyright 2025 Google LLC + */ +#ifndef KFUZZTEST_BRIDGE_INPUT_LEXER_H +#define KFUZZTEST_BRIDGE_INPUT_LEXER_H + +#include +#include +#include + +#define ARRAY_SIZE(x) (sizeof(x) / sizeof(x[0])) + +enum token_type { + TOKEN_LBRACE, + TOKEN_RBRACE, + TOKEN_LBRACKET, + TOKEN_RBRACKET, + TOKEN_COMMA, + TOKEN_SEMICOLON, + + TOKEN_KEYWORD_PTR, + TOKEN_KEYWORD_ARR, + TOKEN_KEYWORD_LEN, + TOKEN_KEYWORD_STR, + TOKEN_KEYWORD_U8, + TOKEN_KEYWORD_U16, + TOKEN_KEYWORD_U32, + TOKEN_KEYWORD_U64, + + TOKEN_IDENTIFIER, + TOKEN_INTEGER, + + TOKEN_EOF, + TOKEN_ERROR, +}; + +struct token { + enum token_type type; + union { + uint64_t integer; + struct { + const char *start; + size_t length; + } identifier; + } data; + int position; +}; + +int tokenize(const char *input, struct token ***tokens, size_t *num_tokens); +void destroy_tokens(struct token **tokens, size_t num_tokens); + +bool is_primitive(struct token *tok); +int primitive_byte_width(enum token_type type); + +#endif /* KFUZZTEST_BRIDGE_INPUT_LEXER_H */ diff --git
a/tools/testing/kfuzztest-bridge/input_parser.c b/tools/testing/kfuzztest-bridge/input_parser.c new file mode 100644 index 00000000000000..feaa59de49d7f4 --- /dev/null +++ b/tools/testing/kfuzztest-bridge/input_parser.c @@ -0,0 +1,425 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Parser for the KFuzzTest textual input format + * + * This file implements a parser for a simple DSL used to describe C-like data + * structures. This format allows the kfuzztest-bridge tool to encode a random + * byte stream into the structured binary format expected by a KFuzzTest + * harness. + * + * The format consists of semicolon-separated "regions," which are analogous to + * C structs. For example: + * + * "my_struct { ptr[buf] len[buf, u64] }; buf { arr[u8, 42] };" + * + * This describes a `my_struct` region that contains a pointer to a `buf` region + * and its corresponding length encoded over 8 bytes, where `buf` itself + * contains a 42-byte array. + * + * The full grammar is documented in Documentation/dev-tools/kfuzztest.rst. 
+ * + * Copyright 2025 Google LLC + */ +#include +#include +#include + +#include "input_lexer.h" +#include "input_parser.h" + +static struct token *peek(struct parser *p) +{ + return p->tokens[p->curr_token]; +} + +static struct token *advance(struct parser *p) +{ + struct token *tok; + if (p->curr_token >= p->token_count) + return NULL; + tok = peek(p); + p->curr_token++; + return tok; +} + +static struct token *consume(struct parser *p, enum token_type type, const char *err_msg) +{ + if (peek(p)->type != type) { + printf("parser failure at position %d: %s\n", peek(p)->position, err_msg); + return NULL; + } + return advance(p); +} + +static bool match(struct parser *p, enum token_type t) +{ + struct token *tok = peek(p); + return tok->type == t; +} + +static int parse_primitive(struct parser *p, struct ast_node **node_ret) +{ + struct ast_node *ret; + struct token *tok; + int byte_width; + + tok = advance(p); + byte_width = primitive_byte_width(tok->type); + if (!byte_width) + return -EINVAL; + + ret = malloc(sizeof(*ret)); + if (!ret) + return -ENOMEM; + + ret->type = NODE_PRIMITIVE; + ret->data.primitive.byte_width = byte_width; + *node_ret = ret; + return 0; +} + +static int parse_ptr(struct parser *p, struct ast_node **node_ret) +{ + const char *points_to; + struct ast_node *ret; + struct token *tok; + if (!consume(p, TOKEN_KEYWORD_PTR, "expected 'ptr'")) + return -EINVAL; + if (!consume(p, TOKEN_LBRACKET, "expected '['")) + return -EINVAL; + + tok = consume(p, TOKEN_IDENTIFIER, "expected identifier"); + if (!tok) + return -EINVAL; + + if (!consume(p, TOKEN_RBRACKET, "expected ']'")) + return -EINVAL; + + ret = malloc(sizeof(*ret)); + ret->type = NODE_POINTER; + + points_to = strndup(tok->data.identifier.start, tok->data.identifier.length); + if (!points_to) { + free(ret); + return -ENOMEM; + } + + ret->data.pointer.points_to = points_to; + *node_ret = ret; + return 0; +} + +static int parse_arr(struct parser *p, struct ast_node **node_ret) +{ + struct token
*type, *num_elems; + struct ast_node *ret; + + if (!consume(p, TOKEN_KEYWORD_ARR, "expected 'arr'") || !consume(p, TOKEN_LBRACKET, "expected '['")) + return -EINVAL; + + type = advance(p); + if (!is_primitive(type)) + return -EINVAL; + + if (!consume(p, TOKEN_COMMA, "expected ','")) + return -EINVAL; + + num_elems = consume(p, TOKEN_INTEGER, "expected integer"); + if (!num_elems) + return -EINVAL; + + if (!consume(p, TOKEN_RBRACKET, "expected ']'")) + return -EINVAL; + + ret = malloc(sizeof(*ret)); + if (!ret) + return -ENOMEM; + + ret->type = NODE_ARRAY; + ret->data.array.num_elems = num_elems->data.integer; + ret->data.array.elem_size = primitive_byte_width(type->type); + ret->data.array.null_terminated = false; + *node_ret = ret; + return 0; +} + +static int parse_str(struct parser *p, struct ast_node **node_ret) +{ + struct ast_node *ret; + struct token *len; + + if (!consume(p, TOKEN_KEYWORD_STR, "expected 'str'") || !consume(p, TOKEN_LBRACKET, "expected '['")) + return -EINVAL; + + len = consume(p, TOKEN_INTEGER, "expected integer"); + if (!len) + return -EINVAL; + + if (!consume(p, TOKEN_RBRACKET, "expected ']'")) + return -EINVAL; + + ret = malloc(sizeof(*ret)); + if (!ret) + return -ENOMEM; + + /* A string is a null-terminated byte array.
*/ + ret->type = NODE_ARRAY; + ret->data.array.num_elems = len->data.integer; + ret->data.array.elem_size = sizeof(char); + ret->data.array.null_terminated = true; + *node_ret = ret; + return 0; +} + +static int parse_len(struct parser *p, struct ast_node **node_ret) +{ + struct token *type, *len; + const char *length_of; + struct ast_node *ret; + + if (!consume(p, TOKEN_KEYWORD_LEN, "expected 'len'") || !consume(p, TOKEN_LBRACKET, "expected '['")) + return -EINVAL; + + len = advance(p); + if (len->type != TOKEN_IDENTIFIER) + return -EINVAL; + + if (!consume(p, TOKEN_COMMA, "expected ','")) + return -EINVAL; + + type = advance(p); + if (!is_primitive(type)) + return -EINVAL; + + if (!consume(p, TOKEN_RBRACKET, "expected ']'")) + return -EINVAL; + + ret = malloc(sizeof(*ret)); + if (!ret) + return -ENOMEM; + + length_of = strndup(len->data.identifier.start, len->data.identifier.length); + if (!length_of) { + free(ret); + return -ENOMEM; + } + + ret->type = NODE_LENGTH; + ret->data.length.length_of = length_of; + ret->data.length.byte_width = primitive_byte_width(type->type); + + *node_ret = ret; + return 0; +} + +static int parse_type(struct parser *p, struct ast_node **node_ret) +{ + if (is_primitive(peek(p))) + return parse_primitive(p, node_ret); + + if (peek(p)->type == TOKEN_KEYWORD_PTR) + return parse_ptr(p, node_ret); + + if (peek(p)->type == TOKEN_KEYWORD_ARR) + return parse_arr(p, node_ret); + + if (peek(p)->type == TOKEN_KEYWORD_STR) + return parse_str(p, node_ret); + + if (peek(p)->type == TOKEN_KEYWORD_LEN) + return parse_len(p, node_ret); + + return -EINVAL; +} + +static int parse_region(struct parser *p, struct ast_node **node_ret) +{ + struct token *tok, *identifier; + struct ast_region *region; + struct ast_node *node; + struct ast_node *ret; + void *new_ptr; + int err; + + identifier = consume(p, TOKEN_IDENTIFIER, "expected identifier"); + if (!identifier) + return -EINVAL; + + ret = malloc(sizeof(*ret)); + if (!ret) + return -ENOMEM; + + tok = 
consume(p, TOKEN_LBRACE, "expected '{'"); + if (!tok) { + err = -EINVAL; + goto fail_early; + } + + region = &ret->data.region; + region->name = strndup(identifier->data.identifier.start, identifier->data.identifier.length); + if (!region->name) { + err = -ENOMEM; + goto fail_early; + } + + /* Initialize the node fully so that destroy_ast_node() is safe on the fail path. */ + ret->type = NODE_REGION; + region->num_members = 0; + region->members = NULL; + while (!match(p, TOKEN_RBRACE)) { + err = parse_type(p, &node); + if (err) + goto fail; + new_ptr = realloc(region->members, (region->num_members + 1) * sizeof(struct ast_node *)); + if (!new_ptr) { + err = -ENOMEM; + goto fail; + } + region->num_members++; + region->members = new_ptr; + region->members[region->num_members - 1] = node; + } + + if (!consume(p, TOKEN_RBRACE, "expected '}'") || !consume(p, TOKEN_SEMICOLON, "expected ';'")) { + err = -EINVAL; + goto fail; + } + + *node_ret = ret; + return 0; + +fail: + destroy_ast_node(ret); + return err; + +fail_early: + free(ret); + return err; +} + +static int parse_program(struct parser *p, struct ast_node **node_ret) +{ + struct ast_program *prog; + struct ast_node *reg; + struct ast_node *ret; + void *new_ptr; + int err; + + ret = malloc(sizeof(*ret)); + if (!ret) + return -ENOMEM; + ret->type = NODE_PROGRAM; + + prog = &ret->data.program; + prog->num_members = 0; + prog->members = NULL; + while (!match(p, TOKEN_EOF)) { + err = parse_region(p, &reg); + if (err) + goto fail; + + new_ptr = realloc(prog->members, (prog->num_members + 1) * sizeof(struct ast_node *)); + if (!new_ptr) { + err = -ENOMEM; + goto fail; + } + prog->members = new_ptr; + prog->members[prog->num_members++] = reg; + } + + *node_ret = ret; + return 0; + +fail: + destroy_ast_node(ret); + return err; +} + +size_t node_alignment(struct ast_node *node) +{ + size_t max_alignment = 1; + size_t i; + + switch (node->type) { + case NODE_PROGRAM: + for (i = 0; i < node->data.program.num_members; i++) + max_alignment = MAX(max_alignment, node_alignment(node->data.program.members[i])); + return max_alignment; + case 
NODE_REGION: + for (i = 0; i < node->data.region.num_members; i++) + max_alignment = MAX(max_alignment, node_alignment(node->data.region.members[i])); + return max_alignment; + case NODE_ARRAY: + return node->data.array.elem_size; + case NODE_LENGTH: + return node->data.length.byte_width; + case NODE_PRIMITIVE: + /* Primitives are aligned to their size. */ + return node->data.primitive.byte_width; + case NODE_POINTER: + return sizeof(uintptr_t); + } + + /* Anything should be at least 1-byte-aligned. */ + return 1; +} + +size_t node_size(struct ast_node *node) +{ + size_t total = 0; + size_t i; + + switch (node->type) { + case NODE_PROGRAM: + for (i = 0; i < node->data.program.num_members; i++) + total += node_size(node->data.program.members[i]); + return total; + case NODE_REGION: + for (i = 0; i < node->data.region.num_members; i++) { + /* Account for padding within region. */ + total = ROUND_UP_TO_MULTIPLE(total, node_alignment(node->data.region.members[i])); + total += node_size(node->data.region.members[i]); + } + return total; + case NODE_ARRAY: + return node->data.array.elem_size * node->data.array.num_elems + + (node->data.array.null_terminated ? 
1 : 0); + case NODE_LENGTH: + return node->data.length.byte_width; + case NODE_PRIMITIVE: + return node->data.primitive.byte_width; + case NODE_POINTER: + return sizeof(uintptr_t); + } + return 0; +} + +int parse(struct token **tokens, size_t token_count, struct ast_node **node_ret) +{ + struct parser p = { .tokens = tokens, .token_count = token_count, .curr_token = 0 }; + return parse_program(&p, node_ret); +} + +void destroy_ast_node(struct ast_node *node) +{ + size_t i; + + switch (node->type) { + case NODE_PROGRAM: + for (i = 0; i < node->data.program.num_members; i++) + destroy_ast_node(node->data.program.members[i]); + break; + case NODE_REGION: + for (i = 0; i < node->data.region.num_members; i++) + destroy_ast_node(node->data.region.members[i]); + free((void *)node->data.region.name); + break; + case NODE_LENGTH: + free((void *)node->data.length.length_of); + break; + case NODE_POINTER: + free((void *)node->data.pointer.points_to); + break; + default: + break; + } + free(node); +} diff --git a/tools/testing/kfuzztest-bridge/input_parser.h b/tools/testing/kfuzztest-bridge/input_parser.h new file mode 100644 index 00000000000000..5f444b40f672a6 --- /dev/null +++ b/tools/testing/kfuzztest-bridge/input_parser.h @@ -0,0 +1,82 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Parser for KFuzzTest textual input format + * + * Copyright 2025 Google LLC + */ +#ifndef KFUZZTEST_BRIDGE_INPUT_PARSER_H +#define KFUZZTEST_BRIDGE_INPUT_PARSER_H + +#include + +/* Rounds x up to the nearest multiple of n. */ +#define ROUND_UP_TO_MULTIPLE(x, n) (((n) == 0) ? (0) : (((x) + (n) - 1) / (n)) * (n)) + +#define MAX(a, b) ((a) > (b) ? (a) : (b)) + +enum ast_node_type { + NODE_PROGRAM, + NODE_REGION, + NODE_ARRAY, + NODE_LENGTH, + NODE_PRIMITIVE, + NODE_POINTER, +}; + +struct ast_node; /* Forward declaration. 
*/ + +struct ast_program { + struct ast_node **members; + size_t num_members; +}; + +struct ast_region { + const char *name; + struct ast_node **members; + size_t num_members; +}; + +struct ast_array { + int elem_size; + int null_terminated; /* True iff the array should always end with 0. */ + size_t num_elems; +}; + +struct ast_length { + size_t byte_width; + const char *length_of; +}; + +struct ast_primitive { + size_t byte_width; +}; + +struct ast_pointer { + const char *points_to; +}; + +struct ast_node { + enum ast_node_type type; + union { + struct ast_program program; + struct ast_region region; + struct ast_array array; + struct ast_length length; + struct ast_primitive primitive; + struct ast_pointer pointer; + } data; +}; + +struct parser { + struct token **tokens; + size_t token_count; + size_t curr_token; +}; + +int parse(struct token **tokens, size_t token_count, struct ast_node **node_ret); +void destroy_ast_node(struct ast_node *node); + +size_t node_size(struct ast_node *node); +size_t node_alignment(struct ast_node *node); + +#endif /* KFUZZTEST_BRIDGE_INPUT_PARSER_H */ diff --git a/tools/testing/kfuzztest-bridge/rand_stream.c b/tools/testing/kfuzztest-bridge/rand_stream.c new file mode 100644 index 00000000000000..bca6b3de5aadc4 --- /dev/null +++ b/tools/testing/kfuzztest-bridge/rand_stream.c @@ -0,0 +1,77 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Implements a cached file-reader for iterating over a byte stream of + * pseudo-random data + * + * Copyright 2025 Google LLC + */ +#include "rand_stream.h" + +static int refill(struct rand_stream *rs) +{ + rs->valid_bytes = fread(rs->buffer, sizeof(char), rs->buffer_size, rs->source); + rs->buffer_pos = 0; + if (rs->valid_bytes != rs->buffer_size && ferror(rs->source)) + return ferror(rs->source); + return 0; +} + +struct rand_stream *new_rand_stream(const char *path_to_file, size_t cache_size) +{ + struct rand_stream *rs; + + rs = malloc(sizeof(*rs)); + if (!rs) + return NULL; + + rs->valid_bytes 
= 0; + rs->source = fopen(path_to_file, "rb"); + if (!rs->source) { + free(rs); + return NULL; + } + + if (fseek(rs->source, 0, SEEK_END)) { + fclose(rs->source); + free(rs); + return NULL; + } + rs->source_size = ftell(rs->source); + + if (fseek(rs->source, 0, SEEK_SET)) { + fclose(rs->source); + free(rs); + return NULL; + } + + rs->buffer = malloc(cache_size); + if (!rs->buffer) { + fclose(rs->source); + free(rs); + return NULL; + } + rs->buffer_size = cache_size; + return rs; +} + +void destroy_rand_stream(struct rand_stream *rs) +{ + fclose(rs->source); + free(rs->buffer); + free(rs); +} + +int next_byte(struct rand_stream *rs, char *ret) +{ + int res; + + if (rs->buffer_pos >= rs->valid_bytes) { + res = refill(rs); + if (res) + return res; + if (rs->valid_bytes == 0) + return STREAM_EOF; + } + *ret = rs->buffer[rs->buffer_pos++]; + return 0; +} diff --git a/tools/testing/kfuzztest-bridge/rand_stream.h b/tools/testing/kfuzztest-bridge/rand_stream.h new file mode 100644 index 00000000000000..acb3271d30caa1 --- /dev/null +++ b/tools/testing/kfuzztest-bridge/rand_stream.h @@ -0,0 +1,57 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Implements a cached file-reader for iterating over a byte stream of + * pseudo-random data + * + * Copyright 2025 Google LLC + */ +#ifndef KFUZZTEST_BRIDGE_RAND_STREAM_H +#define KFUZZTEST_BRIDGE_RAND_STREAM_H + +#include +#include + +#define STREAM_EOF 1 + +/** + * struct rand_stream - a buffered bytestream reader + * + * Reads and returns bytes from a file, using buffered pre-fetching to amortize + * the cost of reads. + */ +struct rand_stream { + FILE *source; + size_t source_size; + char *buffer; + size_t buffer_size; + size_t buffer_pos; + size_t valid_bytes; +}; + +/** + * new_rand_stream - return a new struct rand_stream + * + * @path_to_file: source of the output byte stream. + * @cache_size: size of the read-ahead cache in bytes. 
+ */ +struct rand_stream *new_rand_stream(const char *path_to_file, size_t cache_size); + +/** + * destroy_rand_stream - clean up a rand stream's resources + * + * @rs: a struct rand_stream + */ +void destroy_rand_stream(struct rand_stream *rs); + +/** + * next_byte - return the next byte from a struct rand_stream + * + * @rs: an initialized struct rand_stream. + * @ret: return pointer. + * + * @return 0 on success, STREAM_EOF once the source is exhausted, or a nonzero error value if reading from the source fails. + * + */ +int next_byte(struct rand_stream *rs, char *ret); + +#endif /* KFUZZTEST_BRIDGE_RAND_STREAM_H */
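Review note: the padding rule that node_size() applies to NODE_REGION members in input_parser.c can be checked in isolation. The sketch below is not part of the patch. It assumes every member is a primitive aligned to its own byte width, and the names round_up() and region_size() are hypothetical helpers introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Same rounding rule as ROUND_UP_TO_MULTIPLE() in input_parser.h. */
static size_t round_up(size_t x, size_t n)
{
	return n == 0 ? 0 : ((x + n - 1) / n) * n;
}

/*
 * Hypothetical helper: size of a region whose members are primitives of the
 * given byte widths, each aligned to its own width. Like the NODE_REGION
 * case of node_size(), it pads before each member and adds no trailing
 * padding after the last one.
 */
static size_t region_size(const size_t *widths, size_t count)
{
	size_t total = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		assert(widths[i] > 0); /* zero-width members are not meaningful */
		total = round_up(total, widths[i]); /* padding before member */
		total += widths[i];
	}
	return total;
}
```

For member widths 1, 4, 2 and 8, region_size() returns 24: the members land at offsets 0, 4, 8 and 16, and nothing is added after the last one.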