latin1 encoding difference when using &'static str and String #125

@Dreistein

Hi,
First of all, thank you for creating and maintaining this library!

While debugging a strange bug in our library today, I found that the output differs depending on whether the input is a &'static str or a String.
Consider this minimal example:

#[test]
fn test1() {
    let str = "nopqrstuvwxyz{|}~¡¢£¤¥¦§¨©";

    // encode
    let mut buf = vec![0; str.len()];
    encoding_rs::mem::convert_utf8_to_latin1_lossy(str.as_bytes(), &mut buf);
    println!("encoded buffer: {buf:?}");

    // decode
    let mut out = " ".repeat(buf.len() * 2);
    encoding_rs::mem::convert_latin1_to_str(&buf, &mut out);
    println!("decoded string: '{}'", out);

    assert_eq!(str, out.trim_end_matches(|c| c == '\0' || c == ' '));
}

This test passes:

running 1 test
test string::test::test1 ... ok

successes:

---- string::test::test1 stdout ----
encoded buffer: [110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 161, 162, 163, 164, 165, 166, 167, 168, 169, 0, 0, 0, 0, 0, 0, 0, 0, 0]
decoded string: 'nopqrstuvwxyz{|}~¡¢£¤¥¦§¨©          '


successes:
    string::test::test1
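
The nine trailing zeros in the encoded buffer follow from how the buffer is sized: `str.len()` counts UTF-8 bytes (35 here, since each of the nine non-ASCII characters takes two bytes), while the Latin-1 output needs only one byte per character (26). A minimal std-only sketch of that arithmetic; the helper names `latin1_len` and `unused_tail` are hypothetical and not part of encoding_rs:

```rust
/// Number of bytes `s` would occupy as Latin-1, or `None` if it contains
/// a code point above U+00FF (which Latin-1 cannot represent).
/// Hypothetical helper for illustration only.
fn latin1_len(s: &str) -> Option<usize> {
    if s.chars().all(|c| (c as u32) <= 0xFF) {
        // One Latin-1 byte per character.
        Some(s.chars().count())
    } else {
        None
    }
}

/// Bytes left over in a destination buffer sized with `s.len()`
/// (the UTF-8 byte length) once the Latin-1 output has been written.
fn unused_tail(s: &str) -> Option<usize> {
    latin1_len(s).map(|n| s.len() - n)
}

#[test]
fn buffer_is_larger_than_output() {
    let s = "nopqrstuvwxyz{|}~¡¢£¤¥¦§¨©";
    assert_eq!(s.len(), 35); // 17 ASCII bytes + 9 * 2 bytes
    assert_eq!(latin1_len(s), Some(26)); // 17 + 9 characters
    assert_eq!(unused_tail(s), Some(9)); // the trailing part of `buf`
}
```

Those 9 leftover bytes are the part of `buf` that the test hands to the decoder along with the real output.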

But when I change the static string reference to a heap-allocated String:

#[test]
fn test2() {
    let str = String::from("nopqrstuvwxyz{|}~¡¢£¤¥¦§¨©");

    // encode
    let mut buf = vec![0; str.len()];
    encoding_rs::mem::convert_utf8_to_latin1_lossy(str.as_bytes(), &mut buf);
    println!("encoded buffer: {buf:?}");

    // decode
    let mut out = " ".repeat(buf.len() * 2);
    encoding_rs::mem::convert_latin1_to_str(&buf, &mut out);
    println!("decoded string: '{}'", out);

    assert_eq!(str, out.trim_end_matches(|c| c == '\0' || c == ' '));
}

The test fails:

running 1 test
test string::test::test2 ... FAILED

successes:

successes:

failures:

---- string::test::test2 stdout ----
encoded buffer: [110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 161, 162, 163, 164, 165, 166, 167, 168, 169, 165, 194, 166, 194, 167, 194, 0, 0, 0]
decoded string: 'nopqrstuvwxyz{|}~¡¢£¤¥¦§¨©¥Â¦Â§Â    '

thread 'string::test::test2' panicked at src/string.rs:303:5:
assertion `left == right` failed
  left: "nopqrstuvwxyz{|}~¡¢£¤¥¦§¨©"
 right: "nopqrstuvwxyz{|}~¡¢£¤¥¦§¨©¥Â¦Â§Â"
stack backtrace:
   0: __rustc::rust_begin_unwind
             at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/panicking.rs:697:5
   1: core::panicking::panic_fmt
             at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/core/src/panicking.rs:75:14
   2: core::panicking::assert_failed_inner
             at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/core/src/panicking.rs:448:17
   3: core::panicking::assert_failed
             at .rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panicking.rs:403:5
   4: string::test::test2
             at ./src/string.rs:303:5
   5: string::test::test2::{{closure}}
             at ./src/string.rs:290:11
   6: core::ops::function::FnOnce::call_once
             at .rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:253:5
   7: core::ops::function::FnOnce::call_once
             at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/core/src/ops/function.rs:253:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.


failures:
    string::test::test2

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 14 filtered out; finished in 0.06s

Note that the test failure itself is entirely my fault. Nonetheless:
a) it is completely unexpected to me that the function writes to the rest of the buffer, beyond the converted output;
b) it is strange that the function behaves differently depending on where the input is stored in memory. At least that is what I think is happening.

BR
