Skip to content

Conversation

@dprittie
Copy link

I am using the nrpe_exporter (https://github.com/RobustPerception/nrpe_exporter) to make about 4500 nrpe client requests every 70s. The nrpe_exporter relies on https://github.com/aperum/nrpe which is a fork of this repo. A few times a day at irregular intervals the nrpe_exporter crashes with the following error:

panic: runtime error: index out of range

goroutine 3888620 [running]:
math/rand.(*rngSource).Int63(0xc420083500, 0x5d3ea3c976019300)
        /home/dprittie/.cache/pants/bin/go/linux/x86_64/1.8.3/go/go/src/math/rand/rng.go:231 +0x8c
math/rand.(*Rand).Int63(0xc420018c30, 0x5d3ea3c976019300)
        /home/dprittie/.cache/pants/bin/go/linux/x86_64/1.8.3/go/go/src/math/rand/rand.go:81 +0x33
math/rand.(*Rand).Uint32(0xc420018c30, 0xc420765ae4)
        /home/dprittie/.cache/pants/bin/go/linux/x86_64/1.8.3/go/go/src/math/rand/rand.go:84 +0x2b
nrpe.randomizeBuffer(0xc4208e4480, 0x40c, 0x40c)
        /opt/teamcity-agent-01/work/96ee0984d4e5ff87/.pants.d/compile/go/src.go.src.nrpe.nrpe/src/nrpe/nrpe.go:110 +0x52
nrpe.buildPacket(0xc400000001, 0xc420765bb0, 0x8, 0x20, 0x8)
        /opt/teamcity-agent-01/work/96ee0984d4e5ff87/.pants.d/compile/go/src.go.src.nrpe.nrpe/src/nrpe/nrpe.go:202 +0x208
nrpe.Run(0xcce260, 0xc4202ac0f0, 0xc4202cce54, 0x8, 0x0, 0x0, 0x0, 0x1, 0x0, 0x0, ...)
        /opt/teamcity-agent-01/work/96ee0984d4e5ff87/.pants.d/compile/go/src.go.src.nrpe.nrpe/src/nrpe/nrpe.go:282 +0x10f
main.collectCommandMetrics(0xc4202cce54, 0x8, 0x0, 0x0, 0x0, 0xcce140, 0xc420b300c8, 0xcc53a0, 0xc420123620, 0xc4208d7670, ...)
        /opt/teamcity-agent-01/work/96ee0984d4e5ff87/.pants.d/compile/go/src.go.src.nrpe_exporter.nrpe_exporter/src/nrpe_exporter/nrpe_exporter.go:49 +0x120
main.(*Collector).Collect(0xc420596b90, 0xc4202cd080)
        /opt/teamcity-agent-01/work/96ee0984d4e5ff87/.pants.d/compile/go/src.go.src.nrpe_exporter.nrpe_exporter/src/nrpe_exporter/nrpe_exporter.go:72 +0x357
github.com/prometheus/client_golang/prometheus.(*Registry).Gather.func2(0xc4201bc5e0, 0xc4202cd080, 0xcc89a0, 0xc420596b90)
        /opt/teamcity-agent-01/work/96ee0984d4e5ff87/.pants.d/compile/go/3rdparty.go.github.com.prometheus.client_golang.prometheus/src/github.com/prometheus/client_golang/prometheus/registry.go:433 +0x61
created by github.com/prometheus/client_golang/prometheus.(*Registry).Gather
        /opt/teamcity-agent-01/work/96ee0984d4e5ff87/.pants.d/compile/go/3rdparty.go.github.com.prometheus.client_golang.prometheus/src/github.com/prometheus/client_golang/prometheus/registry.go:434 +0x2ec

I believe we see this error because rand.NewSource is not safe for concurrent use, which I found documented at https://pkg.go.dev/math/rand. This PR wraps a mutex around calls to Seed and Uint32. I have not included a test because I cannot think of a way to include a deterministic unit test for this type of error. I did temporarily include a test that I was able to reproduce the issue with on my system, but results would vary across different hardware and OS package versions.

Test I was using:

func TestRandSource(t *testing.T) {
        const c = 1000000
        var wg sync.WaitGroup
        wg.Add(c)

        for i := 0; i < c; i++ {
                go func(i int) {
                        defer wg.Done()
                        randSource.Seed(0)
                        randSource.Uint32()
                }(i)
        }
        wg.Wait()
}

Output of test:

panic: runtime error: index out of range [-1]

goroutine 74210 [running]:
math/rand.(*rngSource).Uint64(...)
        /usr/lib/golang/src/math/rand/rng.go:249
math/rand.(*rngSource).Int63(0xc0000d5500, 0x0)
        /usr/lib/golang/src/math/rand/rng.go:234 +0x98
math/rand.(*Rand).Int63(...)
        /usr/lib/golang/src/math/rand/rand.go:85
math/rand.(*Rand).Uint32(...)
        /usr/lib/golang/src/math/rand/rand.go:88
_/home/dprittie/git/github/nrpe-envimate.TestRandSource.func1(0xc0000b02d0, 0x12178)
        /home/dprittie/git/github/nrpe-envimate/nrpe_test.go:230 +0x7e
created by _/home/dprittie/git/github/nrpe-envimate.TestRandSource
        /home/dprittie/git/github/nrpe-envimate/nrpe_test.go:227 +0x7d
exit status 2
FAIL    _/home/dprittie/git/github/nrpe-envimate 0.178s

@dprittie
Copy link
Author

@spirius - any chance you can take a look at this? I can see this project has had no activity for quite a few years but you are still active on github so I'm hoping we can fix this here. If not I'll open one on https://github.com/aperum/nrpe, though I suspect I will run into the same issue. Maybe I'll end up creating another fork and getting that into https://github.com/RobustPerception/nrpe_exporter.

@nisabek
Copy link
Member

nisabek commented Aug 26, 2021

Hey @dprittie!

Thanks for your contribution. I personally won't be maintaining the repo, so I suggest you open the PR against https://github.com/spirius/nrpe and I will write in the README that moving forward https://github.com/spirius/nrpe is the main fork. That is if @spirius is ready to continue maintaining this?

Cheers,
Nune

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants