DOC HOME SITE MAP MAN PAGES GNU INFO SEARCH PRINT BOOK
 

analyze(5)




ANALYZE()                 SQL Commands                  ANALYZE()


NAME

     ANALYZE - collect statistics about a database


SYNOPSIS

     ANALYZE [ VERBOSE ] [ table [ (column [, ...] ) ] ]


DESCRIPTION

     ANALYZE collects statistics about the contents of tables  in
     the  database,  and  stores  the results in the system table
     pg_statistic. Subsequently, the  query  planner  uses  these
     statistics  to  help  determine the most efficient execution
     plans for queries.

     With no parameter,  ANALYZE  examines  every  table  in  the
     current  database.  With  a parameter, ANALYZE examines only
     that table. It is further possible to give a list of  column
     names,  in  which case only the statistics for those columns
     are collected.


PARAMETERS

     VERBOSE
          Enables display of progress messages.

     table
          The name  (possibly  schema-qualified)  of  a  specific
          table to analyze. Defaults to all tables in the current
          database.

     column
          The name of a specific column to analyze.  Defaults  to
          all columns.


OUTPUTS

     When VERBOSE is specified, ANALYZE emits  progress  messages
     to  indicate which table is currently being processed. Vari-
     ous statistics about the tables are printed as well.


NOTES

     It is a good idea to run ANALYZE periodically, or just after
     making  major  changes  in the contents of a table. Accurate
     statistics  will  help  the  planner  to  choose  the   most
     appropriate  query  plan,  and  thereby improve the speed of
     query  processing.  A  common  strategy  is  to  run  VACUUM
     [vacuum(5)]  and  ANALYZE once a day during a low-usage time
     of day.

     Unlike VACUUM FULL, ANALYZE requires only a read lock on the
     target  table, so it can run in parallel with other activity
     on the table.


SQL - Language StatementLast change: 2008-01-03 1


ANALYZE()                 SQL Commands                  ANALYZE()

     The statistics collected by ANALYZE usually include  a  list
     of  some of the most common values in each column and a his-
     togram showing the approximate  data  distribution  in  each
     column. One or both of these may be omitted if ANALYZE deems
     them uninteresting (for example,  in  a  unique-key  column,
     there  are no common values) or if the column data type does
     not support the appropriate operators. There is more  infor-
     mation about the statistics in in the documentation.

     For large tables, ANALYZE takes a random sample of the table
     contents,  rather than examining every row. This allows even
     very large tables to be analyzed in a small amount of  time.
     Note, however, that the statistics are only approximate, and
     will change slightly each time ANALYZE is run, even  if  the
     actual  table  contents  did  not change. This may result in
     small changes in the  planner's  estimated  costs  shown  by
     EXPLAIN. In rare situations, this non-determinism will cause
     the query optimizer to choose a different query plan between
     runs  of ANALYZE. To avoid this, raise the amount of statis-
     tics collected by ANALYZE, as described below.

     The extent of analysis can be controlled  by  adjusting  the
     default_statistics_target  configuration  variable,  or on a
     column-by-column basis by setting the per-column  statistics
     target  with ALTER TABLE ... ALTER COLUMN ... SET STATISTICS
     (see ALTER TABLE [alter_table(5)]). The  target  value  sets
     the  maximum number of entries in the most-common-value list
     and the maximum number of bins in the histogram. The default
     target  value  is 10, but this can be adjusted up or down to
     trade off accuracy of planner  estimates  against  the  time
     taken  for  ANALYZE  and  the  amount  of  space occupied in
     pg_statistic. In particular, setting the  statistics  target
     to  zero  disables collection of statistics for that column.
     It may be useful to do that for columns that are never  used
     as  part  of  the  WHERE,  GROUP  BY, or ORDER BY clauses of
     queries, since the planner will have no use  for  statistics
     on such columns.

     The  largest  statistics  target  among  the  columns  being
     analyzed  determines  the  number  of  table rows sampled to
     prepare the statistics. Increasing the target causes a  pro-
     portional  increase  in  the  time  and  space  needed to do
     ANALYZE.


COMPATIBILITY

     There is no ANALYZE statement in the SQL standard.


SQL - Language StatementLast change: 2008-01-03 2



Man(1) output converted with man2html