Conversation

@hydrogen18
Contributor

There is at least one issue here with lock usage leading to a panic. The code would call .Destroyed(), then take the lock and assume the coffer was not destroyed, even though it can be destroyed between the check and the lock acquisition. This caused a panic in the test I added. The code has been refactored to take the lock first, check the unexported .destroyed(), and then make decisions based on that.

@awnumar
Owner

awnumar commented Jul 11, 2025

Tests have a data race now


==================
WARNING: DATA RACE
Write at 0x00c0000500c0 by goroutine 25:
  github.com/awnumar/memguard/core.TestCofferConcurrent()
      /home/runner/work/memguard/memguard/core/coffer_test.go:193 +0x179
  testing.tRunner()
      /opt/hostedtoolcache/go/1.24.5/x64/src/testing/testing.go:1792 +0x225
  testing.(*T).Run.gowrap1()
      /opt/hostedtoolcache/go/1.24.5/x64/src/testing/testing.go:1851 +0x44

Previous read at 0x00c0000500c0 by goroutine 272:
  github.com/awnumar/memguard/core.TestCofferConcurrent.func4()
      /home/runner/work/memguard/memguard/core/coffer_test.go:202 +0x19e
  github.com/awnumar/memguard/core.TestCofferConcurrent.gowrap1()
      /home/runner/work/memguard/memguard/core/coffer_test.go:213 +0x6e

Goroutine 25 (running) created at:
  testing.(*T).Run()
      /opt/hostedtoolcache/go/1.24.5/x64/src/testing/testing.go:1851 +0x8f2
  testing.runTests.func1()
      /opt/hostedtoolcache/go/1.24.5/x64/src/testing/testing.go:2279 +0x85
  testing.tRunner()
      /opt/hostedtoolcache/go/1.24.5/x64/src/testing/testing.go:1792 +0x225
  testing.runTests()
      /opt/hostedtoolcache/go/1.24.5/x64/src/testing/testing.go:2277 +0x96c
  testing.(*M).Run()
      /opt/hostedtoolcache/go/1.24.5/x64/src/testing/testing.go:2142 +0xeea
  main.main()
      _testmain.go:97 +0x164

Goroutine 272 (running) created at:
  github.com/awnumar/memguard/core.TestCofferConcurrent()
      /home/runner/work/memguard/memguard/core/coffer_test.go:197 +0x431
  testing.tRunner()
      /opt/hostedtoolcache/go/1.24.5/x64/src/testing/testing.go:1792 +0x225
  testing.(*T).Run.gowrap1()
      /opt/hostedtoolcache/go/1.24.5/x64/src/testing/testing.go:1851 +0x44
==================
panic: <memcall> could not acquire lock on 0x7f4e0320c000, limit reached? [Err: cannot allocate memory]

goroutine 921 [running]:
github.com/awnumar/memguard/core.Panic(...)
	/home/runner/work/memguard/memguard/core/exit.go:86
github.com/awnumar/memguard/core.NewBuffer(0x20)
	/home/runner/work/memguard/memguard/core/buffer.go:73 +0xc7a
github.com/awnumar/memguard/core.(*Coffer).View(0xc000578800)
	/home/runner/work/memguard/memguard/core/coffer.go:83 +0x1a7
github.com/awnumar/memguard/core.TestCofferConcurrent.func3(0xc000578800)
	/home/runner/work/memguard/memguard/core/coffer_test.go:188 +0x27
github.com/awnumar/memguard/core.TestCofferConcurrent.func4({0x69dd00, 0xc0000ac150}, 0xc0000126b0, 0xc000578800)
	/home/runner/work/memguard/memguard/core/coffer_test.go:202 +0x1b7
created by github.com/awnumar/memguard/core.TestCofferConcurrent in goroutine 25
	/home/runner/work/memguard/memguard/core/coffer_test.go:197 +0x432
FAIL	github.com/awnumar/memguard/core	11.726s

@hydrogen18 hydrogen18 requested a review from awnumar July 12, 2025 13:56
@awnumar
Owner

awnumar commented Jul 12, 2025

The higher concurrency caught a data race in the Windows test; we don't want to work around that by just lowering the number of threads.

https://github.com/awnumar/memguard/actions/runs/16229178136/job/45830415615#step:6:346

@hydrogen18
Contributor Author

  1. memcall.Lock seems to return an error on Windows according to the trace
  2. The Purge() function tries to do something at that point, although I don't know what. It's not thread-safe.
  3. Most of this code uses locks. Very few of them are used in a manner that makes sense; there is literally a comment saying // TODO: this does not protect 'data' field
  4. I don't have a Windows computer to run that test on
  5. You can set the test concurrency to whatever you want on any platform via the environment variable I added
  6. I added .IsDestroyed(), which likely eliminates the trace you linked. This does not fix the original problem: the Buffer struct is simply not thread-safe, nor is it used in a thread-safe manner

@awnumar
Owner

awnumar commented Jul 13, 2025

memcall.Lock seems to return an error on windows according to the trace

Looks like the CI runner ran out of mlock capacity and so panicked, which then triggered the global cleanup routines. Having global shared mutable variables to store pointers to all allocated memory was a big API mistake in hindsight.

Most of this code uses locks. Very little of them are used in a manner that makes sense, there is literally a comment saying // TODO: this does not protect 'data' field

This is out of necessity, as we cannot control what callers do with their returned memory once it's given to them.

Thanks for your time on this. Please revert the change that adds IsDestroyed, or at least make the method lowercase, as I'm not sure we'll keep that method if we solve the race a different way. Once that's done, I'll merge and iterate separately from this PR.

@awnumar awnumar merged commit 3152cda into awnumar:master Jul 14, 2025
3 checks passed