DOC HOME SITE MAP MAN PAGES GNU INFO SEARCH PRINT BOOK
 

(mysql.info) myisam-index-statistics

Info Catalog (mysql.info) myisam-key-cache (mysql.info) optimizing-database-structure (mysql.info) table-cache
 
 7.4.7 `MyISAM' Index Statistics Collection
 ------------------------------------------
 
 Storage engines collect statistics about tables for use by the
 optimizer. Table statistics are based on value groups, where a value
 group is a set of rows with the same key prefix value. For optimizer
 purposes, an important statistic is the average value group size.
 
 MySQL uses the average value group size in the following ways:
 
    * To estimate how may rows must be read for each `ref' access
 
    * To estimate how many row a partial join will produce; that is, the
      number of rows that an operation of this form will produce:
 
           (...) JOIN TBL_NAME ON TBL_NAME.KEY = EXPR
 
 As the average value group size for an index increases, the index is
 less useful for those two purposes because the average number of rows
 per lookup increases: For the index to be good for optimization
 purposes, it is best that each index value target a small number of
 rows in the table. When a given index value yields a large number of
 rows, the index is less useful and MySQL is less likely to use it.
 
 The average value group size is related to table cardinality, which is
 the number of value groups. The `SHOW INDEX' statement displays a
 cardinality value based on N/S, where N is the number of rows in the
 table and S is the average value group size. That ratio yields an
 approximate number of value groups in the table.
 
 For a join based on the `<=>' comparison operator, `NULL' is not
 treated differently from any other value: `NULL <=> NULL', just as `N
 <=> N' for any other N.
 
 However, for a join based on the `=' operator, `NULL' is different from
 non-`NULL' values: `EXPR1 = EXPR2' is not true when EXPR1 or EXPR2 (or
 both) are `NULL'. This affects `ref' accesses for comparisons of the
 form `TBL_NAME.KEY = EXPR': MySQL will not access the table if the
 current value of EXPR is `NULL', because the comparison cannot be true.
 
 For `=' comparisons, it does not matter how many `NULL' values are in
 the table. For optimization purposes, the relevant value is the average
 size of the non-`NULL' values groups. However, MySQL does not currently
 allow that average size to be collected or used.
 
 For `MyISAM' tables, you have some control over collection of table
 statistics by means of the `myisam_stats_method' system variable. This
 variable has two possible values, which differ as follows:
 
    * When `myisam_stats_method' is `nulls_equal', all `NULL' values are
      treated as identical (that is, they all form a single value group).
 
      If the `NULL' value group size is much higher than the average
      non-`NULL' value group size, this method skews the average value
      group size upward. This makes index appear to the optimizer to be
      less useful than it really is for joins that look for non-`NULL'
      values. Consequently, the `nulls_equal' method may cause the
      optimizer not to use the index for `ref' accesses when it should.
 
    * When `myisam_stats_method' is `nulls_unequal', `NULL' values are
      not considered the same. Instead, each `NULL' value forms a
      separate value group of size 1.
 
      If you have many `NULL' values, this method skews the average
      value group size downward. If the average non-`NULL' value group
      size is large, counting `NULL' values each as a group of size 1
      causes the optimizer to overestimate the value of the index for
      joins that look for non-`NULL' values. Consequently, the
      `nulls_unequal' method may cause the optimizer to use this index
      for `ref' lookups when other methods may be better.
 
 If you tend to use many joins that use `<=>' rather than `=', `NULL'
 values are not special in comparisons and one `NULL' is equal to
 another. In this case, `nulls_equal' is the appropriate statistics
 method.
 
 The `myisam_stats_method' system variable has global and session
 values. Setting the global value affects `MyISAM' statistics collection
 for all `MyISAM' tables. Setting the session value affects statistics
 collection only for the current client connection. This means that you
 can force a table's statistics to be regenerated with a given method
 without affecting other clients by setting the session value of
 `myisam_stats_method'.
 
 To regenerate table statistics, you can use any of the following
 methods:
 
    * Set `myisam_stats_method', and then issue a `CHECK TABLE' statement
 
    * Execute `myisamchk --stats_method=METHOD_NAME --analyze'
 
    * Change the table to cause its statistics to go out of date (for
      example, insert a row and then delete it), and then set
      `myisam_stats_method' and issue an `ANALYZE TABLE' statement
 
 Some caveats regarding the use of `myisam_stats_method':
 
 You can force table statistics to be collected explicitly, as just
 described. However, MySQL may also collect statistics automatically.
 For example, if during the course of executing statements for a table,
 some of those statements modify the table, MySQL may collect
 statistics. (This may occur for bulk inserts or deletes, or some `ALTER
 TABLE' statements, for example.) If this happens, the statistics are
 collected using whatever value `myisam_stats_method' has at the time.
 Thus, if you collect statistics using one method, but
 `myisam_stats_method' is set to the other method when a table's
 statistics are collected automatically later, the other method will be
 used.
 
 There is no way to tell which method was used to generate statistics
 for a given `MyISAM' table.
 
 `myisam_stats_method' applies only to `MyISAM' tables. Other storage
 engines have only one method for collecting table statistics. Usually
 it is closer to the `nulls_equal' method.
 
Info Catalog (mysql.info) myisam-key-cache (mysql.info) optimizing-database-structure (mysql.info) table-cache
automatically generated byinfo2html