speed up queries with ssd

Sasha Pachev sasha at asksasha.com
Mon Oct 1 13:01:48 MDT 2012


When MySQL performance improves significantly from improvements in
disk I/O, it is almost always a bad sign. Consider this: if upgrading
your car improved your travel time from New York to Los Angeles by
20%, is that good or bad? On the surface it looks great - we are 20%
faster. But if we stop to think, we will realize that we do have a
problem: we did not use the fastest mode of travel. If a faster car
improved your travel time by 20%, it means that you went most of that
distance by car instead of by plane.

So if improving the I/O system gives you better MySQL performance, it
means you were I/O bound. In some corner cases being I/O bound cannot
be avoided. But in most cases, being I/O bound is a result of poor
application design, bad schema, bad queries, and/or (I would list it
as a separate item) lack of creativity. If you got whipped with forty
stripes save one every time your application did something that
caused disk I/O, I bet you would quickly figure out a way to minimize
disk access. Just visualize that, and I can guarantee you will find a
way to avoid writing to disk - it is not that hard when you put your
mind to it.

Some things you can do:

- compact your data, e.g.:
  - it takes only 16 bytes to store an MD5 sum, and you will not
believe how many people use 32 by storing its hex string
representation instead of the binary image
  - do you really have to store that column as UTF-8?
  - compact that bloated serialized object: it takes only 4 bytes to
store a 32-bit int, so do not store it as a string; if your values
are male and female, you can represent them with one bit - no need to
store the strings "male" and "female"; floats make really bulky
strings; and do you really have to store all of that meta
information?
  - check your column types
  - move the rarely accessed data into another table and keep the hot
data in a small table
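To illustrate the MD5 point above, here is a minimal sketch (not from
the original post) of the size difference between the hex string and
the binary image:

```python
import hashlib

digest = hashlib.md5(b"some row content")

hex_form = digest.hexdigest()   # 32-character string
binary_form = digest.digest()   # raw 16-byte image

# Storing the binary image (e.g. in a BINARY(16) column)
# halves the space compared to the hex string.
print(len(hex_form))     # 32
print(len(binary_form))  # 16
```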

- batch updates in the application and flush them all at once every
second or so. For critical updates that have to make it to the
database before you can proceed, just wait in the app until they have
made it. The user will not mind waiting a second, and he will be much
happier than waiting 30 when your database begins to suffer from
frequent small updates
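The batching idea above can be sketched like this - a simplified
illustration with a hypothetical flush callback, not code from the
post:

```python
import time

class UpdateBatcher:
    """Accumulate updates and flush them together at most once per interval."""

    def __init__(self, flush, interval=1.0):
        self.flush = flush        # callback that writes the whole batch at once
        self.interval = interval  # seconds between flushes
        self.pending = []
        self.last_flush = time.monotonic()

    def add(self, update):
        self.pending.append(update)
        # Hit the database once per interval instead of once per update
        if time.monotonic() - self.last_flush >= self.interval:
            self.flush_now()

    def flush_now(self):
        if self.pending:
            self.flush(self.pending)
            self.pending = []
        self.last_flush = time.monotonic()

# usage sketch: interval=0.0 flushes on every add, for demonstration
batches = []
b = UpdateBatcher(flush=batches.append, interval=0.0)
b.add("UPDATE t SET hits = hits + 1 WHERE id = 1")
print(batches)
```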

- optimize your queries. Enable the slow query log
(--slow-query-log), set long_query_time to 0.1 (or lower), and
question everything that lands in the log. If you see that the number
of examined rows is much higher than the number of matched rows,
there is potential for optimization. Some common themes:

     - expecting too much from the optimizer - you cannot just throw
a query at the optimizer and hope for the best. Be aware of the table
structure and data size. Run EXPLAIN at least in your mind (better
yet, in actuality on the real production data) before you make a
query live. Make sure you understand what the optimizer is doing. Do
not write queries so complex that you do not understand how in the
world they are going to give you the data you want. 10 simple queries
that you understand are better than one you do not understand that
does the same job.
     - bad paging queries: ORDER BY .. LIMIT 1500,100 will look at
the first 1500 records before it gives you the rows you want. Use a
key instead and remember its value from the previous page.
     - overuse of sub-selects without understanding how they are
optimized
     - the compulsion to do everything in one query. Remember, the
MySQL client-server protocol and thread initialization overhead are
very light. So 10 queries are no big deal. The big deal is a monster
query that scans millions of rows.
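The keyset-paging point can be demonstrated end to end - here using
SQLite as a stand-in, since the principle is identical in MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO posts (id, title) VALUES (?, ?)",
                 [(i, f"post {i}") for i in range(1, 101)])

# Offset paging: the server must walk past all the skipped rows every time
offset_page = conn.execute(
    "SELECT id FROM posts ORDER BY id LIMIT 10 OFFSET 50").fetchall()

# Keyset paging: remember the last id from the previous page and seek to it
last_seen_id = 50
keyset_page = conn.execute(
    "SELECT id FROM posts WHERE id > ? ORDER BY id LIMIT 10",
    (last_seen_id,)).fetchall()

print(offset_page == keyset_page)  # True - same rows, cheaper plan
```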

- cache what you can.
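A minimal application-side caching sketch (a generic illustration,
not specific to the post - the counter stands in for a real SELECT):

```python
from functools import lru_cache

db_hits = {"count": 0}

@lru_cache(maxsize=1024)
def get_user_name(user_id):
    # Imagine a real SELECT here; the cache answers repeat lookups
    db_hits["count"] += 1
    return f"user-{user_id}"

get_user_name(1)
get_user_name(1)   # served from the cache, never reaches the "database"
get_user_name(2)
print(db_hits["count"])  # 2
```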

- streamline the application. Set long_query_time to 0 for 15 minutes
and just watch what the individual connections are doing. See whether
the clients are fetching data they should already have, or data you
know they are not going to use.

- can you live with innodb_flush_log_at_trx_commit = 2? This survives
mysqld coredumps and kill -9 no worse than a setting of 1. The only
drawback is a kernel crash or power loss - you will experience, in
effect, a consistent rollback to the point in time of the last log
sync, less than 1 second earlier. If you would rather lose 1 second
worth of transactions in a really bad crash than lose system
availability altogether because InnoDB is syncing itself to death,
keep that setting at 2.
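Put together, the server-side settings mentioned above would look
something like this in my.cnf (values are the ones suggested in this
post; adjust to taste):

```ini
[mysqld]
# log everything slower than 100 ms for query review
slow_query_log = 1
long_query_time = 0.1

# sync the InnoDB log to disk once per second instead of per commit
innodb_flush_log_at_trx_commit = 2
```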


-- 
Sasha Pachev

Fast Running Blog.
http://fastrunningblog.com
Run. Blog. Improve. Repeat.


More information about the PLUG mailing list