
Conversation

@foxfirecodes

(I believe this closes #151)

I was seeing panics when trying to print stats while load-testing a relatively inefficient, but not THAT inefficient, site.

I did a bit of digging and realized the histogram's upper bound implied an upper limit of 3.6 s (3600 ms), which is obviously trivial to exceed when hitting any kind of relatively expensive API.

It looks like the histogram initialization was just copy-pasted from the example in the docs:

[screenshot: histogram initialization copied from the docs example]

However, in drill's case it's actually measuring in microseconds, not milliseconds (and then dividing back down in all the stats), as evidenced by the fact that the request duration (already in ms) is multiplied by 1000 before being appended to the histogram:

[screenshot: request duration multiplied by 1000 before being recorded]

As such, I made two changes:

  1. Introduce a max timeout, so we can reasonably set an upper bound that should be impossible to ever exceed.
  2. Convert the max timeout to microseconds and use that as the histogram's upper bound.

If you're against the MAX_TIMEOUT_SECONDS constant that I introduced, I'm OK with dropping it, but in that case we either continue to risk panics or we have to find a smarter way to derive the histogram's upper bound. We could base it on the actual timeout option, but the config is initialized inside benchmark::execute and not returned back out, so it would require a more structural change to achieve that, which I wanted to avoid in this PR.

Do let me know if there's a preferable solution to the issue, but hopefully this is at least a quick and dirty fix so that 3.6 s+ requests don't cause drill to panic :3
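To make the unit mismatch concrete, here is a minimal stdlib-only Rust sketch of the arithmetic described above. It assumes the docs example's upper bound of one hour in milliseconds (60 * 60 * 1000) and the 120-second default timeout mentioned later in this thread; the constant names are illustrative, not drill's actual identifiers:

```rust
// Upper bound copied from the docs example: one hour, *in milliseconds*.
const DOCS_UPPER_BOUND: u64 = 60 * 60 * 1000; // 3_600_000

// Hypothetical constant mirroring the MAX_TIMEOUT_SECONDS this PR introduces.
const MAX_TIMEOUT_SECONDS: u64 = 120;

fn main() {
    // drill records durations in microseconds (duration_ms * 1000),
    // so the copied bound really caps recorded requests at 3.6 s:
    let implied_cap_ms = DOCS_UPPER_BOUND / 1000;
    assert_eq!(implied_cap_ms, 3600); // 3600 ms = 3.6 s

    // The fix: express the bound in the unit actually recorded (microseconds).
    let new_upper_bound_us = MAX_TIMEOUT_SECONDS * 1_000_000;
    assert_eq!(new_upper_bound_us, 120_000_000); // 120 s in µs

    println!("old cap: {implied_cap_ms} ms; new cap: {new_upper_bound_us} us");
}
```

Any request slower than 3.6 s recorded against the old bound falls outside the histogram's trackable range, which is what surfaced as the panic when printing stats.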

@yyaroshevich

Hello @fcsonline! Sorry to bother you by mentioning you directly here. Is there any chance you could have a look at this pull request, which fixes a quite annoying panic after the benchmark completes? Thanks a lot in advance, Yury.

@Jayx239

Jayx239 commented Jul 23, 2025

So it looks like this just sets the timeout to 120 seconds by default. But I am explicitly attempting to set the timeout to more than 10 seconds in my test, and it doesn't work (without this change), so I'm wondering if the --timeout flag is just not being respected? Either way, testing with this change fixed the issue for my use case.


Development

Successfully merging this pull request may close these issues.

Panic when some requests time out
