(mysql.info) myisam-index-statistics
Info Catalog
(mysql.info) myisam-key-cache
(mysql.info) optimizing-database-structure
(mysql.info) table-cache
7.4.7 `MyISAM' Index Statistics Collection
------------------------------------------
Storage engines collect statistics about tables for use by the
optimizer. Table statistics are based on value groups, where a value
group is a set of rows with the same key prefix value. For optimizer
purposes, an important statistic is the average value group size.
MySQL uses the average value group size in the following ways:
* To estimate how may rows must be read for each `ref' access
* To estimate how many row a partial join will produce; that is, the
number of rows that an operation of this form will produce:
(...) JOIN TBL_NAME ON TBL_NAME.KEY = EXPR
As the average value group size for an index increases, the index is
less useful for those two purposes because the average number of rows
per lookup increases: For the index to be good for optimization
purposes, it is best that each index value target a small number of
rows in the table. When a given index value yields a large number of
rows, the index is less useful and MySQL is less likely to use it.
The average value group size is related to table cardinality, which is
the number of value groups. The `SHOW INDEX' statement displays a
cardinality value based on N/S, where N is the number of rows in the
table and S is the average value group size. That ratio yields an
approximate number of value groups in the table.
For a join based on the `<=>' comparison operator, `NULL' is not
treated differently from any other value: `NULL <=> NULL', just as `N
<=> N' for any other N.
However, for a join based on the `=' operator, `NULL' is different from
non-`NULL' values: `EXPR1 = EXPR2' is not true when EXPR1 or EXPR2 (or
both) are `NULL'. This affects `ref' accesses for comparisons of the
form `TBL_NAME.KEY = EXPR': MySQL will not access the table if the
current value of EXPR is `NULL', because the comparison cannot be true.
For `=' comparisons, it does not matter how many `NULL' values are in
the table. For optimization purposes, the relevant value is the average
size of the non-`NULL' values groups. However, MySQL does not currently
allow that average size to be collected or used.
For `MyISAM' tables, you have some control over collection of table
statistics by means of the `myisam_stats_method' system variable. This
variable has two possible values, which differ as follows:
* When `myisam_stats_method' is `nulls_equal', all `NULL' values are
treated as identical (that is, they all form a single value group).
If the `NULL' value group size is much higher than the average
non-`NULL' value group size, this method skews the average value
group size upward. This makes index appear to the optimizer to be
less useful than it really is for joins that look for non-`NULL'
values. Consequently, the `nulls_equal' method may cause the
optimizer not to use the index for `ref' accesses when it should.
* When `myisam_stats_method' is `nulls_unequal', `NULL' values are
not considered the same. Instead, each `NULL' value forms a
separate value group of size 1.
If you have many `NULL' values, this method skews the average
value group size downward. If the average non-`NULL' value group
size is large, counting `NULL' values each as a group of size 1
causes the optimizer to overestimate the value of the index for
joins that look for non-`NULL' values. Consequently, the
`nulls_unequal' method may cause the optimizer to use this index
for `ref' lookups when other methods may be better.
If you tend to use many joins that use `<=>' rather than `=', `NULL'
values are not special in comparisons and one `NULL' is equal to
another. In this case, `nulls_equal' is the appropriate statistics
method.
The `myisam_stats_method' system variable has global and session
values. Setting the global value affects `MyISAM' statistics collection
for all `MyISAM' tables. Setting the session value affects statistics
collection only for the current client connection. This means that you
can force a table's statistics to be regenerated with a given method
without affecting other clients by setting the session value of
`myisam_stats_method'.
To regenerate table statistics, you can use any of the following
methods:
* Set `myisam_stats_method', and then issue a `CHECK TABLE' statement
* Execute `myisamchk --stats_method=METHOD_NAME --analyze'
* Change the table to cause its statistics to go out of date (for
example, insert a row and then delete it), and then set
`myisam_stats_method' and issue an `ANALYZE TABLE' statement
Some caveats regarding the use of `myisam_stats_method':
You can force table statistics to be collected explicitly, as just
described. However, MySQL may also collect statistics automatically.
For example, if during the course of executing statements for a table,
some of those statements modify the table, MySQL may collect
statistics. (This may occur for bulk inserts or deletes, or some `ALTER
TABLE' statements, for example.) If this happens, the statistics are
collected using whatever value `myisam_stats_method' has at the time.
Thus, if you collect statistics using one method, but
`myisam_stats_method' is set to the other method when a table's
statistics are collected automatically later, the other method will be
used.
There is no way to tell which method was used to generate statistics
for a given `MyISAM' table.
`myisam_stats_method' applies only to `MyISAM' tables. Other storage
engines have only one method for collecting table statistics. Usually
it is closer to the `nulls_equal' method.
Info Catalog
(mysql.info) myisam-key-cache
(mysql.info) optimizing-database-structure
(mysql.info) table-cache
automatically generated byinfo2html