DOC HOME SITE MAP MAN PAGES GNU INFO SEARCH PRINT BOOK
 

(mysql.info) subquery-restrictions

Info Catalog (mysql.info) cursor-restrictions (mysql.info) restrictions (mysql.info) view-restrictions
 
 I.3 Restrictions on Subqueries
 ==============================
 
    * Known bug to be fixed later: If you compare a `NULL' value to a
      subquery using `ALL', `ANY', or `SOME', and the subquery returns
      an empty result, the comparison might evaluate to the non-standard
      result of `NULL' rather than to `TRUE' or `FALSE'.
 
    * A subquery's outer statement can be any one of: `SELECT', `INSERT',
      `UPDATE', `DELETE', `SET', or `DO'.
 
    * Subquery optimization for `IN' is not as effective as for the `='
      operator or for `IN(VALUE_LIST)' constructs.
 
      A typical case for poor `IN' subquery performance is when the
      subquery returns a small number of rows but the outer query
      returns a large number of rows to be compared to the subquery
      result.
 
      The problem is that, for a statement that uses an `IN' subquery,
      the optimizer rewrites it as a correlated subquery. Consider the
      following statement that uses an uncorrelated subquery:
 
           SELECT ... FROM t1 WHERE t1.a IN (SELECT b FROM t2);
 
      The optimizer rewrites the statement to a correlated subquery:
 
           SELECT ... FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.b = t1.a);
 
      If the inner and outer queries return M and N rows, respectively,
      the execution time becomes on the order of `O(M×N)', rather than
      `O(M+N)' as it would be for an uncorrelated subquery.
 
      An implication is that an `IN' subquery can be much slower than a
      query written using an `IN(VALUE_LIST)' construct that lists the
      same values that the subquery would return.
 
    * In general, you cannot modify a table and select from the same
      table in a subquery. For example, this limitation applies to
      statements of the following forms:
 
           DELETE FROM t WHERE ... (SELECT ... FROM t ...);
           UPDATE t ... WHERE col = (SELECT ... FROM t ...);
           {INSERT|REPLACE} INTO t (SELECT ... FROM t ...);
 
      Exception: The preceding prohibition does not apply if you are
      using a subquery for the modified table in the `FROM' clause.
      Example:
 
           UPDATE t ... WHERE col = (SELECT (SELECT ... FROM t...) AS _t ...);
 
      Here the prohibition does not apply because a subquery in the
      `FROM' clause is materialized as a temporary table, so the
      relevant rows in `t' have already been selected by the time the
      update to `t' takes place.
 
    * Row comparison operations are only partially supported:
 
         * For `EXPR IN (SUBQUERY)', EXPR can be an N-tuple (specified
           via row constructor syntax) and the subquery can return rows
           of N-tuples.
 
         * For `EXPR OP {ALL|ANY|SOME} (SUBQUERY)', EXPR must be a
           scalar value and the subquery must be a column subquery; it
           cannot return multiple-column rows.
 
      In other words, for a subquery that returns rows of N-tuples, this
      is supported:
 
           (VAL_1, ..., VAL_N) IN (SUBQUERY)
 
      But this is not supported:
 
           (VAL_1, ..., VAL_N) OP {ALL|ANY|SOME} (SUBQUERY)
 
      The reason for supporting row comparisons for `IN' but not for the
      others is that `IN' is implemented by rewriting it as a sequence
      of `=' comparisons and `AND' operations. This approach cannot be
      used for `ALL', `ANY', or `SOME'.
 
    * Row constructors are not well optimized. The following two
      expressions are equivalent, but only the second can be optimized:
 
           (col1, col2, ...) = (val1, val2, ...)
           col1 = val1 AND col2 = val2 AND ...
 
    * Subqueries in the `FROM' clause cannot be correlated subqueries.
      They are materialized (executed to produce a result set) before
      evaluating the outer query, so they cannot be evaluated per row of
      the outer query.
 
    * The optimizer is more mature for joins than for subqueries, so in
      many cases a statement that uses a subquery can be executed more
      efficiently if you rewrite it as a join.
 
      An exception occurs for the case where an `IN' subquery can be
      rewritten as a `SELECT DISTINCT' join. Example:
 
           SELECT col FROM t1 WHERE id_col IN (SELECT id_col2 FROM t2 WHERE CONDITION);
 
      That statement can be rewritten as follows:
 
           SELECT DISTINCT col FROM t1, t2 WHERE t1.id_col = t2.id_col AND CONDITION;
 
      But in this case, the join requires an extra `DISTINCT' operation
      and is not more efficient than the subquery.
 
    * Possible future optimization: MySQL does not rewrite the join
      order for subquery evaluation. In some cases, a subquery could be
      executed more efficiently if MySQL rewrote it as a join.  This
      would give the optimizer a chance to choose between more execution
      plans. For example, it could decide whether to read one table or
      the other first.
 
      Example:
 
           SELECT a FROM outer_table AS ot
           WHERE a IN (SELECT a FROM inner_table AS it WHERE ot.b = it.b);
 
      For that query, MySQL always scans `outer_table' first and then
      executes the subquery on `inner_table' for each row. If
      `outer_table' has a lot of rows and `inner_table' has few rows,
      the query probably will not be as fast as it could be.
 
      The preceding query could be rewritten like this:
 
           SELECT a FROM outer_table AS ot, inner_table AS it
           WHERE ot.a = it.a AND ot.b = it.b;
 
      In this case, we can scan the small table (`inner_table') and look
      up rows in `outer_table', which will be fast if there is an index
      on `(ot.a,ot.b)'.
 
    * Possible future optimization: A correlated subquery is evaluated
      for each row of the outer query. A better approach is that if the
      outer row values do not change from the previous row, do not
      evaluate the subquery again. Instead, use its previous result.
 
    * Possible future optimization: A subquery in the `FROM' clause is
      evaluated by materializing the result into a temporary table, and
      this table does not use indexes. This does not allow the use of
      indexes in comparison with other tables in the query, although
      that might be useful.
 
    * Possible future optimization: If a subquery in the `FROM' clause
      resembles a view to which the merge algorithm can be applied,
      rewrite the query and apply the merge algorithm so that indexes
      can be used. The following statement contains such a subquery:
 
           SELECT * FROM (SELECT * FROM t1 WHERE t1.t1_col) AS _t1, t2 WHERE t2.t2_col;
 
      The statement can be rewritten as a join like this:
 
           SELECT * FROM t1, t2 WHERE t1.t1_col AND t2.t2_col;
 
      This type of rewriting would provide two benefits:
 
         * It avoids the use of a temporary table for which no indexes
           can be used. In the rewritten query, the optimizer can use
           indexes on `t1'.
 
         * It gives the optimizer more freedom to choose between
           different execution plans. For example, rewriting the query
           as a join allows the optimizer to use `t1' or `t2' first.
 
    * Possible future optimization: For `IN', `= ANY', `<> ANY', `=
      ALL', and `<> ALL' with non-correlated subqueries, use an
      in-memory hash for a result result or a temporary table with an
      index for larger results. Example:
 
           SELECT a FROM big_table AS bt
           WHERE non_key_field IN (SELECT non_key_field FROM TABLE WHERE CONDITION)
 
      In this case, we could create a temporary table:
 
           CREATE TABLE t (key (non_key_field))
           (SELECT non_key_field FROM TABLE WHERE CONDITION)
 
      Then, for each row in `big_table', do a key lookup in `t' based on
      `bt.non_key_field'.
 
Info Catalog (mysql.info) cursor-restrictions (mysql.info) restrictions (mysql.info) view-restrictions
automatically generated byinfo2html