Discussion:
Benchmarking SQL drivers
Stephen R. van den Berg
2014-11-20 14:36:39 UTC
Permalink
Well, now that the dust has settled, it was/is time to compare the results
with other drivers:

I ran a benchmark over a table with 1163 entries, 10 columns.
Where a (very straightforward) SELECT pulls 166 of those entries out,
returning two columns. The intent is to benchmark the driver, not the
quality of the SQL engine.
I open a single connection to the database.
I run 10 queries in rapid succession (big_query), fetch
all results using fetch_row().
And run those 10 queries in a loop for 394 times.

What I measure is throughput to/from the database, the queries probably
are IO-bound (they fit in the database cache) by the network(latencies). The
database is on a different machine on the same ethernet.

11.3s pgsql Pike 8.1 (new driver, multithreaded)
17.3s MySQL Pike 8.1
18.4s pgsql Pike 7.8 (old driver, single threaded)
32.7s postgres Pike 8.1 (C-libpq)

MySQL daemon runs 5.5.39.
PostgreSQL server runs 9.3.4.

Running the tests multiple times shows a variation of the timings
of less than 1%. The database server was idle most of the time.

Running multiple queries at the same time over the same connection will
likely reveal that you can run (up to a certain amount) multiple queries
simultaneously without consuming extra wallclock time.
However since the other three drivers do not support that, a comparison
benchmark will be pointless.
--
Stephen.
Peter Bortas
2014-11-20 14:57:07 UTC
Permalink
Well done!
--
Peter Bortas
Post by Stephen R. van den Berg
Well, now that the dust has settled, it was/is time to compare the results
I ran a benchmark over a table with 1163 entries, 10 columns.
Where a (very straightforward) SELECT pulls 166 of those entries out,
returning two columns. The intent is to benchmark the driver, not the
quality of the SQL engine.
I open a single connection to the database.
I run 10 queries in rapid succession (big_query), fetch
all results using fetch_row().
And run those 10 queries in a loop for 394 times.
What I measure is throughput to/from the database, the queries probably
are IO-bound (they fit in the database cache) by the network(latencies). The
database is on a different machine on the same ethernet.
11.3s pgsql Pike 8.1 (new driver, multithreaded)
17.3s MySQL Pike 8.1
18.4s pgsql Pike 7.8 (old driver, single threaded)
32.7s postgres Pike 8.1 (C-libpq)
MySQL daemon runs 5.5.39.
PostgreSQL server runs 9.3.4.
Running the tests multiple times shows a variation of the timings
of less than 1%. The database server was idle most of the time.
Running multiple queries at the same time over the same connection will
likely reveal that you can run (up to a certain amount) multiple queries
simultaneously without consuming extra wallclock time.
However since the other three drivers do not support that, a comparison
benchmark will be pointless.
--
Stephen.
Stephen R. van den Berg
2014-11-20 14:57:42 UTC
Permalink
Post by Stephen R. van den Berg
What I measure is throughput to/from the database, the queries probably
11.3s pgsql Pike 8.1 (new driver, multithreaded)
17.3s MySQL Pike 8.1
18.4s pgsql Pike 7.8 (old driver, single threaded)
32.7s postgres Pike 8.1 (C-libpq)
Perhaps it is obvious, but the displayed values are seconds to finish, i.e.
the lower number wins.
--
Stephen.
Stephen R. van den Berg
2014-11-20 16:04:35 UTC
Permalink
The driver turns out to have some peculiar properties under load:

This is 3940 queries running in batches of 1:
19.4s pgsql Pike 8.1 (new driver, multithreaded)
16.5s MySQL Pike 8.1
15.6s pgsql Pike 7.8 (old driver, single threaded)
21.3s postgres Pike 8.1 (C-libpq)

This is 3940 queries running in batches of 2:
23.5s pgsql Pike 8.1 (new driver, multithreaded)
19.4s MySQL Pike 8.1
21.4s pgsql Pike 7.8 (old driver, single threaded)
34.8s postgres Pike 8.1 (C-libpq)

This is 3940 queries running in batches of 5:
15.6s pgsql Pike 8.1 (new driver, multithreaded)
18.6s MySQL Pike 8.1
18.6s pgsql Pike 7.8 (old driver, single threaded)
31.1s postgres Pike 8.1 (C-libpq)

This is 3940 queries running in batches of 10 (the initial benchmark):
11.3s pgsql Pike 8.1 (new driver, multithreaded)
17.3s MySQL Pike 8.1
18.4s pgsql Pike 7.8 (old driver, single threaded)
32.7s postgres Pike 8.1 (C-libpq)

This is 3940 queries running in batches of 100:
6.1s pgsql Pike 8.1 (new driver, multithreaded)
15.1s MySQL Pike 8.1
16.8s pgsql Pike 7.8 (old driver, single threaded)
29.2s postgres Pike 8.1 (C-libpq)

Which shows that:
- The still is room for improvement in batches <5, where I'm beaten
by my old singlethreaded driver. That should not have been possible,
so I'm going to compare network dumps there.
- Amazingly the new driver gets spectacularly faster at larger batch sizes.
I'd guess that due to larger batch sizes, more network traffic
is being coalesced to utilise full packets.
--
Stephen.
Stephen R. van den Berg
2014-11-20 22:51:28 UTC
Permalink
New measurements after 2 fixes.
The PostgreSQL database had a slightly lower background load this time,
so I redid the two pgsql benchmarks every time.

The still remaining edge of the old pgsql driver over the new one
when using batches of one can be due to two things:
a. The old C-helper function for decoding the arguments of a row was somewhat
more efficient than the current Pike only version.
b. The extra Mutexes needed to get Buffer output and input properly
working with the nonblocking socket might add some latency.

However, it might not be worth the extra trouble because:
- I still beat the MySQL driver in the batches-of-1 case (ever so slightly).
- The difference with the old pgsql driver is perhaps sufficiently small.

This is 3940 queries running in batches of 1:
16.2s pgsql Pike 8.1 (new driver, multithreaded)
16.5s MySQL Pike 8.1
15.2s pgsql Pike 7.8 (old driver, single threaded)
21.3s postgres Pike 8.1 (C-libpq)

This is 3940 queries running in batches of 2:
20.3s pgsql Pike 8.1 (new driver, multithreaded)
19.4s MySQL Pike 8.1
20.3s pgsql Pike 7.8 (old driver, single threaded)
34.8s postgres Pike 8.1 (C-libpq)

This is 3940 queries running in batches of 5:
14.7s pgsql Pike 8.1 (new driver, multithreaded)
18.6s MySQL Pike 8.1
18.0s pgsql Pike 7.8 (old driver, single threaded)
31.1s postgres Pike 8.1 (C-libpq)

This is 3940 queries running in batches of 10 (the initial benchmark):
10.9s pgsql Pike 8.1 (new driver, multithreaded)
17.3s MySQL Pike 8.1
17.6s pgsql Pike 7.8 (old driver, single threaded)
32.7s postgres Pike 8.1 (C-libpq)

This is 3940 queries running in batches of 100:
6.0s pgsql Pike 8.1 (new driver, multithreaded)
15.1s MySQL Pike 8.1
16.3s pgsql Pike 7.8 (old driver, single threaded)
29.2s postgres Pike 8.1 (C-libpq)
--
Stephen.
Stephen R. van den Berg
2014-11-22 02:07:06 UTC
Permalink
Post by Stephen R. van den Berg
16.2s pgsql Pike 8.1 (new driver, multithreaded)
16.5s MySQL Pike 8.1
15.2s pgsql Pike 7.8 (old driver, single threaded)
21.3s postgres Pike 8.1 (C-libpq)
We now get (batches of 1):
15.5s pgsql Pike 8.1 (new driver, multithreaded)
16.5s MySQL Pike 8.1
15.2s pgsql Pike 7.8 (old driver, single threaded)
Post by Stephen R. van den Berg
20.3s pgsql Pike 8.1 (new driver, multithreaded)
19.4s MySQL Pike 8.1
20.3s pgsql Pike 7.8 (old driver, single threaded)
34.8s postgres Pike 8.1 (C-libpq)
We now get (batches of 2):
20.0s pgsql Pike 8.1 (new driver, multithreaded)
19.4s MySQL Pike 8.1
20.3s pgsql Pike 7.8 (old driver, single threaded)

The higher batchcount tests yielded no significant difference to the
previous benchark.

Getting even higher performance means that I would have to abandon the
Pike-only driver solution, and I would have to start using a cmod
helper module again.
However, since the new driver is only 2% slower than the old driver,
it is not worth the trouble to try and weed that out.
--
Stephen.
Chris Angelico
2014-11-20 15:17:20 UTC
Permalink
Post by Stephen R. van den Berg
11.3s pgsql Pike 8.1 (new driver, multithreaded)
17.3s MySQL Pike 8.1
18.4s pgsql Pike 7.8 (old driver, single threaded)
32.7s postgres Pike 8.1 (C-libpq)
Impressive! A three-to-one improvement over the C library is doing well!

ChrisA
Per Hedbor () @ Pike (-) importmöte för mailinglistan
2014-11-24 10:15:01 UTC
Permalink
Post by Stephen R. van den Berg
Getting even higher performance means that I would have to abandon the
Pike-only driver solution, and I would have to start using a cmod
helper module again.
Nice results. I am considering writing a pike mysql driver, thus
getting rid of yet another 'must install' package when building pike.


By the way, with regards to c71051037ee, "Workaround c?-> bug":

It is not really a bug. The issue you are having is that the final
apply () operator is not conditional (so the fix is not complete, you
will still get errors if the connection is gone, as an example).

I have an idea about this:

After a ?-> in an expression (tree), set a flag to indicate that any
APPLY:s should return 0 if they are called on 0.

It would be fairly trivial, and make the ?-> operator do what people
expect it to, allowing you to type things like fd?->query_fd().
Stephen R. van den Berg
2014-11-24 10:51:29 UTC
Permalink
Post by Per Hedbor () @ Pike (-) importmöte för mailinglistan
Post by Stephen R. van den Berg
Getting even higher performance means that I would have to abandon the
Pike-only driver solution, and I would have to start using a cmod
helper module again.
Nice results. I am considering writing a pike mysql driver, thus
getting rid of yet another 'must install' package when building pike.
Is the MySQL network protocol sufficiently stable?
But, yes, pgsql shows that it is doable and it is quite likely that
one can match or even improve on the C-lib version.

In case of pgsql, there are two main reasons why it can beat the
native C-drivers:
- It is able to utilise network protocol features which the native C-driver
doesn't (fully) exercise (binary/smart batching/multiple queryportals).
- It is able to use the (idle) time between network packets to do some
(pre/post)processing, thus lowering the overall latency.
I'd guess that the MySQL network protocol offers similar opportunities.

Incidentally, I checked to see what the performance difference is
between storing undecoded row-blobs (in pgsql) vs. carving out the columns
and creating the result array in Pike, it turns out to be some 15% max
speedup (for the simple two-column test-query), but even if that decoding
is to be done in C, it still is going to consume CPU, so in effect adding
back a dedicated C-helper is not really effective (and would, yet again,
introduce a potential source of bugs, and is not as easy to maintain).
Post by Per Hedbor () @ Pike (-) importmöte för mailinglistan
It is not really a bug. The issue you are having is that the final
apply () operator is not conditional (so the fix is not complete, you
will still get errors if the connection is gone, as an example).
Well, technically it may not be a bug, but currently it violates the
"principle of least surprise".
Post by Per Hedbor () @ Pike (-) importmöte för mailinglistan
After a ?-> in an expression (tree), set a flag to indicate that any
APPLY:s should return 0 if they are called on 0.
It would be fairly trivial, and make the ?-> operator do what people
expect it to, allowing you to type things like fd?->query_fd().
Sounds good. But I'll leave the expression tree fiddling to you :-).
--
Stephen.
Henrik Grubbström
2014-12-02 14:56:39 UTC
Permalink
Post by Per Hedbor () @ Pike (-) importmöte för mailinglistan
Post by Stephen R. van den Berg
Getting even higher performance means that I would have to abandon the
Pike-only driver solution, and I would have to start using a cmod
helper module again.
Nice results. I am considering writing a pike mysql driver, thus
getting rid of yet another 'must install' package when building pike.
Please do. I took a look at it a few (~3?) years ago, and it didn't seem
like it would be complicated to do.
--
Henrik Grubbström ***@roxen.com
Roxen Internet Software AB
Per Hedbor () @ Pike (-) importmöte för mailinglistan
2014-11-24 11:10:02 UTC
Permalink
Post by Stephen R. van den Berg
Is the MySQL network protocol sufficiently stable?
Well, yes, except for new features it has not really changed since
about mysql 2.13 (about forever).

2.13: size field changed to 4 bytes from 2 (for query/reply packets)
5.0: stored procedures
5.1: multi-query stored procedures (with multiple replies)

Anyway, it is probably safe to only target 2.13+.

Note: I am not saying I am going to do this right now. :)
--
Per Hedbor
Stephen R. van den Berg
2014-11-24 11:20:11 UTC
Permalink
Post by Per Hedbor () @ Pike (-) importmöte för mailinglistan
Post by Stephen R. van den Berg
Is the MySQL network protocol sufficiently stable?
5.0: stored procedures
5.1: multi-query stored procedures (with multiple replies)
I think PostgreSQL supports some similar direct call to stored procedures.
I decided not to bother with that, and simply make people call
those using "SELECT storeproc();" instead. It has identicaly performance.
God only knows why people add those constructs to network protocols.
Post by Per Hedbor () @ Pike (-) importmöte för mailinglistan
Note: I am not saying I am going to do this right now. :)
:-). No problem. As you might have noticed, I'm trying to abstain from
MySQL in all possible ways, so I'm not holding my breath for it.
--
Stephen.
Henrik Grubbström
2014-12-02 15:04:56 UTC
Permalink
Post by Per Hedbor () @ Pike (-) importmöte för mailinglistan
Post by Stephen R. van den Berg
Is the MySQL network protocol sufficiently stable?
Well, yes, except for new features it has not really changed since
about mysql 2.13 (about forever).
2.13: size field changed to 4 bytes from 2 (for query/reply packets)
Somewhere around 4.0 the login protocol changed to use a different
password validation method.
Post by Per Hedbor () @ Pike (-) importmöte för mailinglistan
5.0: stored procedures
5.1: multi-query stored procedures (with multiple replies)
Anyway, it is probably safe to only target 2.13+.
True.
Post by Per Hedbor () @ Pike (-) importmöte för mailinglistan
Note: I am not saying I am going to do this right now. :)
:-)
--
Henrik Grubbström ***@roxen.com
Roxen Internet Software AB
Loading...