wip: check cost of hashing in multi column aggregate [IGNORE] #19346
Conversation
run benchmark aggregate_query_sql

show benchmark queue

🤖 Hi @rluvaton, you asked to view the benchmark queue (#19346 (comment)).
run benchmark aggregate_query_sql

show benchmark queue

🤖 Hi @Dandandan, you asked to view the benchmark queue (#19346 (comment)).
@alamb any idea why it's not working?

My script runner bails out on error, and there was something wrong with one of the jobs. I need to make it more resilient to errors and improve the error reporting.
show benchmark queue

🤖 Hi @alamb, you asked to view the benchmark queue (#19346 (comment)).
show benchmark queue

🤖 Hi @rluvaton, you asked to view the benchmark queue (#19346 (comment)).
I just checked the runner. Whatever this benchmark is, it is taking a long time to complete. The most recent one says it is going to take 3379.1s (almost an hour!) to run a single benchmark.
This PR is probably the cause, I think.
Is it possible to cancel benchmark runs?
And is there a way to see the current job status, so I know for next time?
And are the run logs from main or from this branch?

I just did it manually. I don't have any automated way to do it yet.

Not that I know of yet. That would also be a great feature 🤔
I think the change probably makes it super slow by creating a lot of hash "collisions" (by only looking at the first column).
I wanted to see, for the "group by wide u64 and string" case, how much skipping the string hashing would save us, since it's irrelevant there: in the wide u64 case all values are already unique, so you don't need to hash the string as well. So yes, I was aware that I would create hash collisions for the rest of the benchmarks (I didn't know it would be that slow, though). What surprised me is that the group-by-wide benchmark was the one to take a long time, which I argued shouldn't happen, as it's the same as hashing by a single column.
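The collision effect described above can be sketched with a minimal example. This is not DataFusion's actual grouping code, just an illustration using the standard library's `DefaultHasher`: a hypothetical `hash_first_column_only` helper hashes a two-column group key by its first column only, which is harmless when the first column is unique but makes all groups that differ only in the second column land in the same bucket.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical helper: hash a (u64, &str) group key using only the
// first column, mirroring the experiment in this PR.
fn hash_first_column_only(a: u64, _b: &str) -> u64 {
    let mut h = DefaultHasher::new();
    a.hash(&mut h);
    h.finish()
}

fn main() {
    // "Wide u64" case: the first column is unique, so skipping the
    // string column loses nothing; distinct groups still get distinct hashes.
    assert_ne!(
        hash_first_column_only(1, "x"),
        hash_first_column_only(2, "x")
    );

    // But groups that differ only in the second column all collide,
    // forcing the aggregate hash table into expensive full-key comparisons.
    assert_eq!(
        hash_first_column_only(1, "x"),
        hash_first_column_only(1, "y")
    );

    println!("groups (1, \"x\") and (1, \"y\") hash identically");
}
```

With many distinct string values per u64 value, every probe against such a bucket degenerates into a linear scan of colliding keys, which would explain the benchmark slowdown.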
We can check whether this is useful after:
Which issue does this PR close?
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?