Conversation

@brandur brandur (Contributor) commented May 16, 2025

This one follows up #870 to add an optimization for job completion: we separate out the most common case of setting jobs to `completed` with no metadata updates required, update all of those in a simplified batch query, then do the rest of the completions afterwards.

In any non-degenerate queue, most completions will be setting success states, so this should help with real-world use, and it also significantly improves SQLite's benchmarking numbers.
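
As a rough sketch of the shape of this change (the names below are illustrative, not River's actual internals), the completer splits each batch, pushes the common case through one grouped update, and falls back to per-job handling for everything else:

    import "context"

    // completionParams describes one job's completion; illustrative only.
    type completionParams struct {
        ID              int64
        State           string         // e.g. "completed", "retryable", "discarded"
        MetadataUpdates map[string]any // non-empty when metadata must be merged in
    }

    // completeMany handles the common case (completed, no metadata) with one
    // simplified batch update, then finishes the remainder on the existing
    // general-purpose path.
    func completeMany(
        ctx context.Context,
        params []completionParams,
        setManyCompleted func(ctx context.Context, ids []int64) error, // simplified batch query
        completeOne func(ctx context.Context, p completionParams) error, // everything else
    ) error {
        var (
            simpleIDs []int64
            remaining []completionParams
        )
        for _, p := range params {
            if p.State == "completed" && len(p.MetadataUpdates) == 0 {
                simpleIDs = append(simpleIDs, p.ID)
            } else {
                remaining = append(remaining, p)
            }
        }
        if len(simpleIDs) > 0 {
            if err := setManyCompleted(ctx, simpleIDs); err != nil {
                return err
            }
        }
        for _, p := range remaining {
            if err := completeOne(ctx, p); err != nil {
                return err
            }
        }
        return nil
    }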

Here's a new benchmark run where throughput is ~4x what it was doing
before and roughly on par with Postgres:

    $ go run ./cmd/river bench --database-url "sqlite://:memory:" --num-total-jobs 1_000_000
    bench: jobs worked [          0 ], inserted [    1000000 ], job/sec [        0.0 ] [0s]
    bench: jobs worked [      88218 ], inserted [          0 ], job/sec [    44109.0 ] [2s]
    bench: jobs worked [      91217 ], inserted [          0 ], job/sec [    45608.5 ] [2s]
    bench: jobs worked [      88858 ], inserted [          0 ], job/sec [    44429.0 ] [2s]
    bench: jobs worked [      77219 ], inserted [          0 ], job/sec [    38609.5 ] [2s]
    bench: jobs worked [      82045 ], inserted [          0 ], job/sec [    41022.5 ] [2s]
    bench: jobs worked [      84052 ], inserted [          0 ], job/sec [    42026.0 ] [2s]
    bench: jobs worked [      72028 ], inserted [          0 ], job/sec [    36014.0 ] [2s]
    bench: jobs worked [      90047 ], inserted [          0 ], job/sec [    45023.5 ] [2s]
    bench: jobs worked [      88875 ], inserted [          0 ], job/sec [    44437.5 ] [2s]
    bench: jobs worked [      89240 ], inserted [          0 ], job/sec [    44620.0 ] [2s]
    bench: jobs worked [      88842 ], inserted [          0 ], job/sec [    44421.0 ] [2s]
    bench: jobs worked [      59359 ], inserted [          0 ], job/sec [    29679.5 ] [2s]
    bench: total jobs worked [    1000000 ], total jobs inserted [    1000000 ], overall job/sec [    42822.8 ], running 23.35203575s

Here's a normal non-memory file-based database:

    $ go run ./cmd/river bench --database-url "sqlite://./sqlite/bench.sqlite3" --num-total-jobs 1_000_000
    bench: jobs worked [          0 ], inserted [    1000000 ], job/sec [        0.0 ] [0s]
    bench: jobs worked [      83657 ], inserted [          0 ], job/sec [    41828.5 ] [2s]
    bench: jobs worked [      76648 ], inserted [          0 ], job/sec [    38324.0 ] [2s]
    bench: jobs worked [      88036 ], inserted [          0 ], job/sec [    44018.0 ] [2s]
    bench: jobs worked [      75473 ], inserted [          0 ], job/sec [    37736.5 ] [2s]
    bench: jobs worked [      82604 ], inserted [          0 ], job/sec [    41302.0 ] [2s]
    bench: jobs worked [      84048 ], inserted [          0 ], job/sec [    42024.0 ] [2s]
    bench: jobs worked [      85508 ], inserted [          0 ], job/sec [    42754.0 ] [2s]
    bench: jobs worked [      90580 ], inserted [          0 ], job/sec [    45290.0 ] [2s]
    bench: jobs worked [      83568 ], inserted [          0 ], job/sec [    41784.0 ] [2s]
    bench: jobs worked [      86062 ], inserted [          0 ], job/sec [    43031.0 ] [2s]
    bench: jobs worked [      88508 ], inserted [          0 ], job/sec [    44254.0 ] [2s]
    bench: jobs worked [      75308 ], inserted [          0 ], job/sec [    37654.0 ] [2s]
    bench: total jobs worked [    1000000 ], total jobs inserted [    1000000 ], overall job/sec [    42331.9 ], running 23.622860125s

The improved benchmarks only apply in fixed job burndown mode (with the `--num-total-jobs` option) because inserting jobs is still pretty slow, since it's still done one by one.

Once again, I'm pretty sure I'll be able to land some SQLite fixes that'll make batch operations possible using `json_each`, and then we should be able to make all normal operations batch-wise. That'll take some time though, so in the meantime we can get this optimization out for the initial SQLite release.
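
As a rough illustration of what `json_each` enables (this isn't River's actual query; the table and column names are assumptions), a whole set of job IDs can be passed as a single JSON array parameter and expanded so that one UPDATE covers them all:

    import (
        "context"
        "database/sql"
        "encoding/json"
        "time"
    )

    // batchComplete marks many jobs completed in one statement by passing the
    // IDs as a JSON array and expanding them with SQLite's json_each. Assumes a
    // *sql.DB opened against SQLite; "river_job" and its columns are illustrative.
    func batchComplete(ctx context.Context, db *sql.DB, ids []int64, finalizedAt time.Time) error {
        encoded, err := json.Marshal(ids)
        if err != nil {
            return err
        }
        _, err = db.ExecContext(ctx, `
            UPDATE river_job
            SET state = 'completed', finalized_at = ?2
            WHERE id IN (SELECT value FROM json_each(?1))`,
            string(encoded), finalizedAt.Format(time.RFC3339Nano))
        return err
    }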

@brandur brandur force-pushed the brandur-optimize-complete-happy-path branch from 2bf276b to 2c01702 on May 16, 2025 03:44

    jobAfter := jobsAfter[0]
    require.Equal(t, rivertype.JobStateCompleted, jobAfter.State)
    - require.WithinDuration(t, now, *jobAfter.FinalizedAt, time.Microsecond)
    + require.WithinDuration(t, now, *jobAfter.FinalizedAt, bundle.driver.TimePrecision())

@brandur brandur (Contributor Author) commented:

Honestly ... not 100% sure how this was passing before. Should be constrained to DB-specific precision.

@brandur brandur force-pushed the brandur-optimize-complete-happy-path branch from 2c01702 to dda5dda on May 16, 2025 03:47
@brandur brandur marked this pull request as draft May 16, 2025 04:43
@brandur brandur (Contributor Author) commented May 16, 2025

Ah damn, I forgot that the completer tries to set specific finalized_ats per completed job. Hard to do given current sqlc limitations.
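
For reference, here's a rough sketch of what setting a distinct `finalized_at` per job in one statement could look like (raw SQL rather than sqlc, and the table/column names are assumptions), pairing each job ID with its timestamp via `json_each`:

    import (
        "context"
        "database/sql"
        "encoding/json"
    )

    // jobCompletion pairs a job ID with its specific finalized_at; illustrative only.
    type jobCompletion struct {
        ID          int64  `json:"id"`
        FinalizedAt string `json:"finalized_at"` // e.g. an RFC 3339 timestamp string
    }

    // batchCompleteWithTimes sets a per-job finalized_at in a single statement
    // by joining against a JSON array of {id, finalized_at} pairs. Requires
    // SQLite 3.33+ for UPDATE ... FROM and 3.38+ for the ->> operator.
    func batchCompleteWithTimes(ctx context.Context, db *sql.DB, completions []jobCompletion) error {
        encoded, err := json.Marshal(completions)
        if err != nil {
            return err
        }
        _, err = db.ExecContext(ctx, `
            UPDATE river_job
            SET state = 'completed', finalized_at = pairs.value ->> 'finalized_at'
            FROM json_each(?1) AS pairs
            WHERE river_job.id = pairs.value ->> 'id'`,
            string(encoded))
        return err
    }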

@bgentry bgentry (Contributor) commented May 23, 2025

Dang. Making sure you don't actually want me to review this or anything since you've hit a blocker.

@brandur brandur (Contributor Author) commented May 24, 2025

Nah, no review required for now. Thx.
