DOC HOME SITE MAP MAN PAGES GNU INFO SEARCH PRINT BOOK
 

(mysql.info) tips

Info Catalog (mysql.info) delete-speed (mysql.info) query-speed
 
 7.2.19 Other Optimization Tips
 ------------------------------
 
 This section lists a number of miscellaneous tips for improving query
 processing speed:
 
    * Use persistent connections to the database to avoid connection
      overhead. If you cannot use persistent connections and you are
      initiating many new connections to the database, you may want to
      change the value of the `thread_cache_size' variable. See 
      server-parameters.
 
    * Always check whether all your queries really use the indexes that
      you have created in the tables. In MySQL, you can do this with the
      `EXPLAIN' statement. See  explain.
 
    * Try to avoid complex `SELECT' queries on `MyISAM' tables that are
      updated frequently, to avoid problems with table locking that occur
      due to contention between readers and writers.
 
    * With `MyISAM' tables that have no deleted rows in the middle, you
      can insert rows at the end at the same time that another query is
      reading from the table. If it is important to be able to do this,
      you should consider using the table in ways that avoid deleting
      rows. Another possibility is to run `OPTIMIZE TABLE' to defragment
      the table after you have deleted a lot of rows from it. See 
      myisam-storage-engine.
 
    * To fix any compression issues that may have occurred with
      `ARCHIVE' tables, you can use `OPTIMIZE TABLE'. See 
      archive-storage-engine.
 
    * Use `ALTER TABLE ... ORDER BY EXPR1, EXPR2, ...' if you usually
      retrieve rows in `EXPR1, EXPR2, ...' order. By using this option
      after extensive changes to the table, you may be able to get
      higher performance.
 
    * In some cases, it may make sense to introduce a column that is
      `hashed' based on information from other columns. If this column
      is short and reasonably unique, it may be much faster than a
      `wide' index on many columns. In MySQL, it is very easy to use
      this extra column:
 
           SELECT * FROM TBL_NAME
             WHERE HASH_COL=MD5(CONCAT(COL1,COL2))
             AND COL1='CONSTANT' AND COL2='CONSTANT';
 
    * For `MyISAM' tables that change frequently, you should try to
      avoid all variable-length columns (`VARCHAR', `BLOB', and `TEXT').
      The table uses dynamic row format if it includes even a single
      variable-length column. See  storage-engines.
 
    * It is normally not useful to split a table into different tables
      just because the rows become large. In accessing a row, the
      biggest performance hit is the disk seek needed to find the first
      byte of the row. After finding the data, most modern disks can
      read the entire row fast enough for most applications. The only
      cases where splitting up a table makes an appreciable difference
      is if it is a `MyISAM' table using dynamic row format that you can
      change to a fixed row size, or if you very often need to scan the
      table but do not need most of the columns. See 
      storage-engines.
 
    * If you often need to calculate results such as counts based on
      information from a lot of rows, it may be preferable to introduce
      a new table and update the counter in real time.  An update of the
      following form is very fast:
 
           UPDATE TBL_NAME SET COUNT_COL=COUNT_COL+1 WHERE KEY_COL=CONSTANT;
 
      This is very important when you use MySQL storage engines such as
      `MyISAM' that has only table-level locking (multiple readers with
      single writers). This also gives better performance with most
      database systems, because the row locking manager in this case has
      less to do.
 
    * If you need to collect statistics from large log tables, use
      summary tables instead of scanning the entire log table.
      Maintaining the summaries should be much faster than trying to
      calculate statistics `live.' Regenerating new summary tables from
      the logs when things change (depending on business decisions) is
      faster than changing the running application.
 
    * If possible, you should classify reports as `live' or as
      `statistical,' where data needed for statistical reports is
      created only from summary tables that are generated periodically
      from the live data.
 
    * Take advantage of the fact that columns have default values.
      Insert values explicitly only when the value to be inserted
      differs from the default. This reduces the parsing that MySQL must
      do and improves the insert speed.
 
    * In some cases, it is convenient to pack and store data into a
      `BLOB' column. In this case, you must provide code in your
      application to pack and unpack information, but this may save a
      lot of accesses at some stage. This is practical when you have
      data that does not conform well to a rows-and-columns table
      structure.
 
    * Normally, you should try to keep all data non-redundant (observing
      what is referred to in database theory as third normal form).
      However, there may be situations in which it can be advantageous to
      duplicate information or create summary tables to gain more speed.
 
    * Stored routines or UDFs (user-defined functions) may be a good way
      to gain performance for some tasks. See  stored-procedures,
      and  adding-functions, for more information.
 
    * You can always gain something by caching queries or answers in
      your application and then performing many inserts or updates
      together. If your database system supports table locks (as do
      MySQL and Oracle), this should help to ensure that the index cache
      is only flushed once after all updates.  You can also take
      advantage of MySQL's query cache to achieve similar results; see
       query-cache.
 
    * Use `INSERT DELAYED' when you do not need to know when your data
      is written. This reduces the overall insertion impact because many
      rows can be written with a single disk write.
 
    * Use `INSERT LOW_PRIORITY' when you want to give `SELECT'
      statements higher priority than your inserts.
 
    * Use `SELECT HIGH_PRIORITY' to get retrievals that jump the queue.
      That is, the `SELECT' is executed even if there is another client
      waiting to do a write.
 
    * Use multiple-row `INSERT' statements to store many rows with one
      SQL statement. Many SQL servers support this, including MySQL.
 
    * Use `LOAD DATA INFILE' to load large amounts of data. This is
      faster than using `INSERT' statements.
 
    * Use `AUTO_INCREMENT' columns to generate unique values.
 
    * Use `OPTIMIZE TABLE' once in a while to avoid fragmentation with
      dynamic-format `MyISAM' tables. See  myisam-table-formats.
 
    * Use `MEMORY' (`HEAP') tables when possible to get more speed. See
       memory-storage-engine.  `MEMORY' tables are useful for
      non-critical data that is accessed often, such as information
      about the last displayed banner for users who don't have cookies
      enabled in their Web browser. User sessions are another
      alternative available in many Web application environments for
      handling volatile state data.
 
    * With Web servers, images and other binary assets should normally
      be stored as files. That is, store only a reference to the file
      rather than the file itself in the database.  Most Web servers are
      better at caching files than database contents, so using files is
      generally faster.
 
    * Columns with identical information in different tables should be
      declared to have identical data types so that joins based on the
      corresponding columns will be faster.
 
    * Try to keep column names simple. For example, in a table named
      `customer', use a column name of `name' instead of
      `customer_name'. To make your names portable to other SQL servers,
      you should keep them shorter than 18 characters.
 
    * If you need really high speed, you should take a look at the
      low-level interfaces for data storage that the different SQL
      servers support. For example, by accessing the MySQL `MyISAM'
      storage engine directly, you could get a speed increase of two to
      five times compared to using the SQL interface. To be able to do
      this, the data must be on the same server as the application, and
      usually it should only be accessed by one process (because
      external file locking is really slow). One could eliminate these
      problems by introducing low-level `MyISAM' commands in the MySQL
      server (this could be one easy way to get more performance if
      needed). By carefully designing the database interface, it should
      be quite easy to support this type of optimization.
 
    * If you are using numerical data, it is faster in many cases to
      access information from a database (using a live connection) than
      to access a text file. Information in the database is likely to be
      stored in a more compact format than in the text file, so
      accessing it involves fewer disk accesses. You also save code in
      your application because you need not parse your text files to
      find line and column boundaries.
 
    * Replication can provide a performance benefit for some operations.
      You can distribute client retrievals among replication servers to
      split up the load. To avoid slowing down the master while making
      backups, you can make backups using a slave server. See 
      replication.
 
    * Declaring a `MyISAM' table with the `DELAY_KEY_WRITE=1' table
      option makes index updates faster because they are not flushed to
      disk until the table is closed. The downside is that if something
      kills the server while such a table is open, you should ensure
      that the table is okay by running the server with the
      -myisam-recover option, or by running `myisamchk' before
      restarting the server.  (However, even in this case, you should
      not lose anything by using `DELAY_KEY_WRITE', because the key
      information can always be generated from the data rows.)
 
Info Catalog (mysql.info) delete-speed (mysql.info) query-speed
automatically generated byinfo2html