Counting in Postgresql
I have a table numbers (n, ..)
that I want to get the count of.
I can either select count(*)
or select count(1)
, and they might be different in runtime.
If they are different, one is faster and one is slower, but which one?
On my Mac Pro, 9 years old:
$ psql ..
(
=# select count(1) from numbers; select count(*) from numbers; select count(1) from numbers;
x3
=# select count(*) from numbers; select count(1) from numbers; select count(*) from numbers;
x3
) x2
And we get (in ms)
run | count(1) | count(*) | count(1) | count(*) |
---|---|---|---|---|
1 | 10.793 | 5.071 | 6.009 | |
2 | 6.290 | 5.457 | 5.461 | |
3 | 6.324 | 5.523 | 5.242 | |
4 | 6.027 | 5.644 | 4.850 | |
5 | 5.561 | 5.252 | 4.786 | |
6 | 5.861 | 5.612 | 5.145 | |
7 | 6.358 | 5.436 | 5.122 | |
8 | 6.311 | 5.091 | 5.058 | |
9 | 6.404 | 5.180 | 5.502 | |
10 | 5.567 | 5.800 | 4.810 | |
11 | 5.534 | 5.219 | 5.587 | |
12 | 5.861 | 5.802 | 5.494 |
So
count | low | mean | high |
---|---|---|---|
count(*) | 4.786 | 5.380 | 6.027 |
count(1) | 5.091 | 6.013 | 10.793 |
This is a pretty rough benchmark; we can do much better. I want to run it on larger sizes, so I really need to automate/scrape the runtime.
It looks like count(*) is slightly faster on a 49,999-row count, no conditions.
Going deeper
I wrote some benchmark code, largely based upon this great jOOQ blog post, at the bottom of this post.
I didn’t parameterize the queries, but it’s very much an improvement over doing any of it by hand.
Let’s repeat with a v_repeat of 5 and decrease the outermost 30 iterations down to 3:
=# select bench_ab(5000); select ab, min(runtime), avg(runtime), max(runtime), count(runtime) from runtimes group by ab;
ab | min | avg | max | count |
---|---|---|---|---|
count(*) | 00:00:01.333068 | 00:00:01.348974 | 00:00:01.373866 | 9 |
count(1) | 00:00:01.327441 | 00:00:01.350864 | 00:00:01.373268 | 9 |
Dividing the averages,
count(*) gives 1.350864 / 1.348974 = 1.001x latency speedup on 49,999-row table counts (no conditions)
More rows
Instead of 50,000 rows, let’s do 5,000,000.
delete from numbers;
insert into numbers (n, r2, r3, r5, r7)
select n, n%2, n%3, n%5, n%7 from generate_series(3, 1e7, 2) as numbers(n);
select count(*) from numbers;
I had to reduce the innermost repeat down from 5,000 down to 5 to not wait all night:
=# select bench_ab(5); select ab, min(runtime), avg(runtime), max(runtime), count(runtime) from runtimes group by ab;
ab | min | avg | max | count |
---|---|---|---|---|
count(*) | 00:00:03.631091 | 00:00:04.401283 | 00:00:10.299831 | 9 |
count(1) | 00:00:03.726291 | 00:00:03.818642 | 00:00:04.149109 | 9 |
Dividing the averages,
count(1) gives 4.401283 / 3.8186421 = 1.153x latency speedup on 4,999,999-now table counts (no conditions)
Once with conditions
Let’s just get the numbers that are fizzbuzz, divisible by both 3 and 5:
select count(*)
from numbers
where r3 = 0 and r5 = 0
ab | min | avg | max | count |
---|---|---|---|---|
count(*) | 00:00:06.954637 | 00:00:07.10634 | 00:00:07.211031 | 9 |
count(1) | 00:00:06.960223 | 00:00:07.089915 | 00:00:07.154958 | 9 |
Dividing the averages,
count(1) gives 7.10634 / 7.089915 = 1.002x latency speedup on 333,333-row table counts (with conditions)
Pretty close.
These results looked a little odd, maybe, so I ran it again the next day, and it flipped back to count(*) generally faster; 4,500 runs in total:
ab | min | avg | max | count |
---|---|---|---|---|
count(*) | 00:02:44.51731 | 00:02:46.415199 | 00:02:49.43029 | 9 |
count(1) | 00:02:47.562048 | 00:02:50.376417 | 00:02:54.533021 | 9 |
Giving a 170.376417 / 166.415199 = 1.024x latency speedup. I wonder why the benchmark was different yesterday; I might run it once more just to be sure.
Benchmark code
-- delete from numbers;
-- insert into numbers (n, r2, r3, r5, r7)
-- select n, n%2, n%3, n%5, n%7 from generate_series(3, 1e5, 2) as numbers(n);
-- select count(*) from numbers;
--drop function bench_ab
create table if not exists runtimes (ab text, runtime interval);
create or replace function bench_ab(v_repeat int)
returns text
language plpgsql
as $$
declare
v_ts timestamp;
v_ts0 timestamp;
v_ts2 timestamp;
jmax constant int := 3;
imax constant int := 3;
count int;
begin
delete from runtimes;
for j in 1..jmax loop
for i in 1..imax loop
v_ts0 := clock_timestamp();
v_ts := clock_timestamp();
for i in 1..v_repeat loop
perform count(1)
from numbers;
end loop;
v_ts2 := clock_timestamp();
insert into runtimes("ab", runtime) values ('count(1)', v_ts2 - v_ts);
v_ts = v_ts2;
end loop;
raise info '%',j;
for i in 1..imax loop
v_ts := clock_timestamp();
for i in 1..v_repeat loop
perform count(*)
from numbers;
end loop;
v_ts2 := clock_timestamp();
insert into runtimes("ab", runtime) values ('count(*)', v_ts2 - v_ts);
v_ts = v_ts2;
end loop;
end loop;
return (select to_char(jmax*imax*v_repeat, '999,999,999')||' runs in total');
end
$$