DOC HOME SITE MAP MAN PAGES GNU INFO SEARCH PRINT BOOK
 

(mysql.info) load-data

Info Catalog (mysql.info) insert (mysql.info) data-manipulation (mysql.info) replace
 
 13.2.5 `LOAD DATA INFILE' Syntax
 --------------------------------
 
      LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'FILE_NAME'
          [REPLACE | IGNORE]
          INTO TABLE TBL_NAME
          [FIELDS
              [TERMINATED BY 'STRING']
              [[OPTIONALLY] ENCLOSED BY 'CHAR']
              [ESCAPED BY 'CHAR']
          ]
          [LINES
              [STARTING BY 'STRING']
              [TERMINATED BY 'STRING']
          ]
          [IGNORE NUMBER LINES]
          [(COL_NAME_OR_USER_VAR,...)]
          [SET COL_NAME = EXPR,...)]
 
 The `LOAD DATA INFILE' statement reads rows from a text file into a
 table at a very high speed. The filename must be given as a literal
 string.
 
 The syntax for the `FIELDS' and `LINES' clauses also applies to the
 `SELECT ... INTO OUTFILE' statement, as described later in this
 section. (See also  select.)
 
 For more information about the efficiency of `INSERT' versus `LOAD DATA
 INFILE' and speeding up `LOAD DATA INFILE', see  insert-speed.
 
 The character set indicated by the `character_set_database' system
 variable is used to interpret the information in the file. `SET NAMES'
 and the setting of `character_set_client' do not affect interpretation
 of input.
 
 Note that it is currently not possible to load data files that use the
 `ucs2' character set.
 
 As of MySQL 5.0.19, the `character_set_filesystem' system variable
 controls the interpretation of the filename.
 
 You can also load data files by using the `mysqlimport' utility; it
 operates by sending a `LOAD DATA INFILE' statement to the server. The
 -local option causes `mysqlimport' to read data files from the client
 host. You can specify the -compress option to get better performance
 over slow networks if the client and server support the compressed
 protocol. See  mysqlimport.
 
 If you use `LOW_PRIORITY', execution of the `LOAD DATA' statement is
 delayed until no other clients are reading from the table.
 
 If you specify `CONCURRENT' with a `MyISAM' table that satisfies the
 condition for concurrent inserts (that is, it contains no free blocks
 in the middle), other threads can retrieve data from the table while
 `LOAD DATA' is executing. Using this option affects the performance of
 `LOAD DATA' a bit, even if no other thread is using the table at the
 same time.
 
 The `LOCAL' keyword, if specified, is interpreted with respect to the
 client end of the connection:
 
    * If `LOCAL' is specified, the file is read by the client program on
      the client host and sent to the server. The file can be given as a
      full pathname to specify its exact location. If given as a
      relative pathname, the name is interpreted relative to the
      directory in which the client program was started.
 
    * If `LOCAL' is not specified, the file must be located on the
      server host and is read directly by the server. The server uses
      the following rules to locate the file:
 
         * If the filename is an absolute pathname, the server uses it
           as given.
 
         * If the filename is a relative pathname with one or more
           leading components, the server searches for the file relative
           to the server's data directory.
 
         * If a filename with no leading components is given, the server
           looks for the file in the database directory of the default
           database.
 
 Note that, in the non-`LOCAL' case, these rules mean that a file named
 as `./myfile.txt' is read from the server's data directory, whereas the
 file named as `myfile.txt' is read from the database directory of the
 default database. For example, if `db1' is the default database, the
 following `LOAD DATA' statement reads the file `data.txt' from the
 database directory for `db1', even though the statement explicitly
 loads the file into a table in the `db2' database:
 
      LOAD DATA INFILE 'data.txt' INTO TABLE db2.my_table;
 
 Windows pathnames are specified using forward slashes rather than
 backslashes. If you do use backslashes, you must double them.
 
 For security reasons, when reading text files located on the server,
 the files must either reside in the database directory or be readable
 by all. Also, to use `LOAD DATA INFILE' on server files, you must have
 the `FILE' privilege. See  privileges-provided.
 
 Using `LOCAL' is a bit slower than letting the server access the files
 directly, because the contents of the file must be sent over the
 connection by the client to the server. On the other hand, you do not
 need the `FILE' privilege to load local files.
 
 `LOCAL' works only if your server and your client both have been
 enabled to allow it. For example, if `mysqld' was started with
 -local-infile=0, `LOCAL' does not work. See  load-data-local.
 
 On Unix, if you need `LOAD DATA' to read from a pipe, you can use the
 following technique (here we load the listing of the `/' directory into
 a table):
 
      mkfifo /mysql/db/x/x
      chmod 666 /mysql/db/x/x
      find / -ls > /mysql/db/x/x
      mysql -e "LOAD DATA INFILE 'x' INTO TABLE x" x
 
 The `REPLACE' and `IGNORE' keywords control handling of input rows that
 duplicate existing rows on unique key values.
 
 If you specify `REPLACE', input rows replace existing rows. In other
 words, rows that have the same value for a primary key or unique index
 as an existing row. See  replace.
 
 If you specify `IGNORE', input rows that duplicate an existing row on a
 unique key value are skipped. If you do not specify either option, the
 behavior depends on whether the `LOCAL' keyword is specified.  Without
 `LOCAL', an error occurs when a duplicate key value is found, and the
 rest of the text file is ignored. With `LOCAL', the default behavior is
 the same as if `IGNORE' is specified; this is because the server has no
 way to stop transmission of the file in the middle of the operation.
 
 If you want to ignore foreign key constraints during the load
 operation, you can issue a `SET FOREIGN_KEY_CHECKS=0' statement before
 executing `LOAD DATA'.
 
 If you use `LOAD DATA INFILE' on an empty `MyISAM' table, all
 non-unique indexes are created in a separate batch (as for `REPAIR
 TABLE'). Normally, this makes `LOAD DATA INFILE' much faster when you
 have many indexes. In some extreme cases, you can create the indexes
 even faster by turning them off with `ALTER TABLE ... DISABLE KEYS'
 before loading the file into the table and using `ALTER TABLE ...
 ENABLE KEYS' to re-create the indexes after loading the file. See 
 insert-speed.
 
 `LOAD DATA INFILE' is the complement of `SELECT ... INTO OUTFILE'. (See
  select.) To write data from a table to a file, use `SELECT ...
 INTO OUTFILE'. To read the file back into a table, use `LOAD DATA
 INFILE'. The syntax of the `FIELDS' and `LINES' clauses is the same for
 both statements. Both clauses are optional, but `FIELDS' must precede
 `LINES' if both are specified.
 
 If you specify a `FIELDS' clause, each of its subclauses (`TERMINATED
 BY', `[OPTIONALLY] ENCLOSED BY', and `ESCAPED BY') is also optional,
 except that you must specify at least one of them.
 
 If you specify no `FIELDS' clause, the defaults are the same as if you
 had written this:
 
      FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'
 
 If you specify no `LINES' clause, the defaults are the same as if you
 had written this:
 
      LINES TERMINATED BY '\n' STARTING BY ''
 
 In other words, the defaults cause `LOAD DATA INFILE' to act as follows
 when reading input:
 
    * Look for line boundaries at newlines.
 
    * Do not skip over any line prefix.
 
    * Break lines into fields at tabs.
 
    * Do not expect fields to be enclosed within any quoting characters.
 
    * Interpret occurrences of tab, newline, or ‘`\'’ preceded by
      ‘`\'’ as literal characters that are part of field values.
 
 Conversely, the defaults cause `SELECT ... INTO OUTFILE' to act as
 follows when writing output:
 
    * Write tabs between fields.
 
    * Do not enclose fields within any quoting characters.
 
    * Use ‘`\'’ to escape instances of tab, newline, or ‘`\'’
      that occur within field values.
 
    * Write newlines at the ends of lines.
 
 Backslash is the MySQL escape character within strings, so to write
 `FIELDS ESCAPED BY '\\'', you must specify two backslashes for the
 value to be interpreted as a single backslash.
 
 * If you have generated the text file on a Windows system, you
 might have to use `LINES TERMINATED BY '\r\n'' to read the file
 properly, because Windows programs typically use two characters as a
 line terminator. Some programs, such as `WordPad', might use `\r' as a
 line terminator when writing files. To read such files, use `LINES
 TERMINATED BY '\r''.
 
 If all the lines you want to read in have a common prefix that you want
 to ignore, you can use `LINES STARTING BY 'PREFIX_STRING'' to skip over
 the prefix, _and anything before it_. If a line does not include the
 prefix, the entire line is skipped.  Suppose that you issue the
 following statement:
 
      LOAD DATA INFILE '/tmp/test.txt' INTO TABLE test
        FIELDS TERMINATED BY ','  LINES STARTING BY 'xxx';
 
 If the data file looks like this:
 
      xxx"abc",1
      something xxx"def",2
      "ghi",3
 
 The resulting rows will be `("abc",1)' and `("def",2)'. The third row
 in the file will be skipped because it does not contain the prefix.
 
 The `IGNORE NUMBER LINES' option can be used to ignore lines at the
 start of the file. For example, you can use `IGNORE 1 LINES' to skip
 over an initial header line containing column names:
 
      LOAD DATA INFILE '/tmp/test.txt' INTO TABLE test IGNORE 1 LINES;
 
 When you use `SELECT ... INTO OUTFILE' in tandem with `LOAD DATA
 INFILE' to write data from a database into a file and then read the
 file back into the database later, the field- and line-handling options
 for both statements must match. Otherwise, `LOAD DATA INFILE' will not
 interpret the contents of the file properly. Suppose that you use
 `SELECT ... INTO OUTFILE' to write a file with fields delimited by
 commas:
 
      SELECT * INTO OUTFILE 'data.txt'
        FIELDS TERMINATED BY ','
        FROM table2;
 
 To read the comma-delimited file back in, the correct statement would
 be:
 
      LOAD DATA INFILE 'data.txt' INTO TABLE table2
        FIELDS TERMINATED BY ',';
 
 If instead you tried to read in the file with the statement shown
 following, it wouldn't work because it instructs `LOAD DATA INFILE' to
 look for tabs between fields:
 
      LOAD DATA INFILE 'data.txt' INTO TABLE table2
        FIELDS TERMINATED BY '\t';
 
 The likely result is that each input line would be interpreted as a
 single field.
 
 `LOAD DATA INFILE' can be used to read files obtained from external
 sources. For example, many programs can export data in comma-separate
 values (CSV) format, such that lines have fields separated by commas
 and enclosed within double quotes. If lines in such a file are
 terminated by newlines, the statement shown here illustrates the field-
 and line-handling options you would use to load the file:
 
      LOAD DATA INFILE 'data.txt' INTO TABLE TBL_NAME
        FIELDS TERMINATED BY ',' ENCLOSED BY '"'
        LINES TERMINATED BY '\n';
 
 Any of the field- or line-handling options can specify an empty string
 (`'''). If not empty, the `FIELDS [OPTIONALLY] ENCLOSED BY' and `FIELDS
 ESCAPED BY' values must be a single character. The `FIELDS TERMINATED
 BY', `LINES STARTING BY', and `LINES TERMINATED BY' values can be more
 than one character.  For example, to write lines that are terminated by
 carriage return/linefeed pairs, or to read a file containing such lines,
 specify a `LINES TERMINATED BY '\r\n'' clause.
 
 To read a file containing jokes that are separated by lines consisting
 of `%%', you can do this
 
      CREATE TABLE jokes
        (a INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
        joke TEXT NOT NULL);
      LOAD DATA INFILE '/tmp/jokes.txt' INTO TABLE jokes
        FIELDS TERMINATED BY ''
        LINES TERMINATED BY '\n%%\n' (joke);
 
 `FIELDS [OPTIONALLY] ENCLOSED BY' controls quoting of fields. For
 output (`SELECT ... INTO OUTFILE'), if you omit the word `OPTIONALLY',
 all fields are enclosed by the `ENCLOSED BY' character. An example of
 such output (using a comma as the field delimiter) is shown here:
 
      "1","a string","100.20"
      "2","a string containing a , comma","102.20"
      "3","a string containing a \" quote","102.20"
      "4","a string containing a \", quote and comma","102.20"
 
 If you specify `OPTIONALLY', the `ENCLOSED BY' character is used only
 to enclose values from columns that have a string data type (such as
 `CHAR', `BINARY', `TEXT', or `ENUM'):
 
      1,"a string",100.20
      2,"a string containing a , comma",102.20
      3,"a string containing a \" quote",102.20
      4,"a string containing a \", quote and comma",102.20
 
 Note that occurrences of the `ENCLOSED BY' character within a field
 value are escaped by prefixing them with the `ESCAPED BY' character.
 Also note that if you specify an empty `ESCAPED BY' value, it is
 possible to inadvertently generate output that cannot be read properly
 by `LOAD DATA INFILE'. For example, the preceding output just shown
 would appear as follows if the escape character is empty. Observe that
 the second field in the fourth line contains a comma following the
 quote, which (erroneously) appears to terminate the field:
 
      1,"a string",100.20
      2,"a string containing a , comma",102.20
      3,"a string containing a " quote",102.20
      4,"a string containing a ", quote and comma",102.20
 
 For input, the `ENCLOSED BY' character, if present, is stripped from
 the ends of field values. (This is true regardless of whether
 `OPTIONALLY' is specified; `OPTIONALLY' has no effect on input
 interpretation.) Occurrences of the `ENCLOSED BY' character preceded by
 the `ESCAPED BY' character are interpreted as part of the current field
 value.
 
 If the field begins with the `ENCLOSED BY' character, instances of that
 character are recognized as terminating a field value only if followed
 by the field or line `TERMINATED BY' sequence. To avoid ambiguity,
 occurrences of the `ENCLOSED BY' character within a field value can be
 doubled and are interpreted as a single instance of the character. For
 example, if `ENCLOSED BY '"'' is specified, quotes are handled as shown
 here:
 
      "The ""BIG"" boss"  -> The "BIG" boss
      The "BIG" boss      -> The "BIG" boss
      The ""BIG"" boss    -> The ""BIG"" boss
 
 `FIELDS ESCAPED BY' controls how to write or read special characters.
 If the `FIELDS ESCAPED BY' character is not empty, it is used to prefix
 the following characters on output:
 
    * The `FIELDS ESCAPED BY' character
 
    * The `FIELDS [OPTIONALLY] ENCLOSED BY' character
 
    * The first character of the `FIELDS TERMINATED BY' and `LINES
      TERMINATED BY' values
 
    * ASCII `0' (what is actually written following the escape character
      is ASCII ‘`0'’, not a zero-valued byte)
 
 If the `FIELDS ESCAPED BY' character is empty, no characters are
 escaped and `NULL' is output as `NULL', not `\N'. It is probably not a
 good idea to specify an empty escape character, particularly if field
 values in your data contain any of the characters in the list just
 given.
 
 For input, if the `FIELDS ESCAPED BY' character is not empty,
 occurrences of that character are stripped and the following character
 is taken literally as part of a field value.  The exceptions are an
 escaped ‘`0'’ or ‘`N'’ (for example, `\0' or `\N' if the escape
 character is ‘`\'’). These sequences are interpreted as ASCII NUL
 (a zero-valued byte) and `NULL'. The rules for `NULL' handling are
 described later in this section.
 
 For more information about ‘`\'’-escape syntax, see 
 literals.
 
 In certain cases, field- and line-handling options interact:
 
    * If `LINES TERMINATED BY' is an empty string and `FIELDS TERMINATED
      BY' is non-empty, lines are also terminated with `FIELDS TERMINATED
      BY'.
 
    * If the `FIELDS TERMINATED BY' and `FIELDS ENCLOSED BY' values are
      both empty (`'''), a fixed-row (non-delimited) format is used.
      With fixed-row format, no delimiters are used between fields (but
      you can still have a line terminator).  Instead, column values are
      written and read using the display widths of the columns. For
      example, if a column is declared as `INT(7)', values for the column
      are written using seven-character fields. On input, values for the
      column are obtained by reading seven characters.
 
      `LINES TERMINATED BY' is still used to separate lines. If a line
      does not contain all fields, the rest of the columns are set to
      their default values. If you do not have a line terminator, you
      should set this to `'''. In this case, the text file must contain
      all fields for each row.
 
      Fixed-row format also affects handling of `NULL' values, as
      described later. Note that fixed-size format does not work if you
      are using a multi-byte character set.
 
 Handling of `NULL' values varies according to the `FIELDS' and `LINES'
 options in use:
 
    * For the default `FIELDS' and `LINES' values, `NULL' is written as
      a field value of `\N' for output, and a field value of `\N' is read
      as `NULL' for input (assuming that the `ESCAPED BY' character is
      ‘`\'’).
 
    * If `FIELDS ENCLOSED BY' is not empty, a field containing the
      literal word `NULL' as its value is read as a `NULL' value. This
      differs from the word `NULL' enclosed within `FIELDS ENCLOSED BY'
      characters, which is read as the string `'NULL''.
 
    * If `FIELDS ESCAPED BY' is empty, `NULL' is written as the word
      `NULL'.
 
    * With fixed-row format (which is used when `FIELDS TERMINATED BY'
      and `FIELDS ENCLOSED BY' are both empty), `NULL' is written as an
      empty string. Note that this causes both `NULL' values and empty
      strings in the table to be indistinguishable when written to the
      file because both are written as empty strings. If you need to be
      able to tell the two apart when reading the file back in, you
      should not use fixed-row format.
 
 Some cases are not supported by `LOAD DATA INFILE':
 
    * Fixed-size rows (`FIELDS TERMINATED BY' and `FIELDS ENCLOSED BY'
      both empty) and `BLOB' or `TEXT' columns.
 
    * If you specify one separator that is the same as or a prefix of
      another, `LOAD DATA INFILE' cannot interpret the input properly.
      For example, the following `FIELDS' clause would cause problems:
 
           FIELDS TERMINATED BY '"' ENCLOSED BY '"'
 
    * If `FIELDS ESCAPED BY' is empty, a field value that contains an
      occurrence of `FIELDS ENCLOSED BY' or `LINES TERMINATED BY'
      followed by the `FIELDS TERMINATED BY' value causes `LOAD DATA
      INFILE' to stop reading a field or line too early.  This happens
      because `LOAD DATA INFILE' cannot properly determine where the
      field or line value ends.
 
 The following example loads all columns of the `persondata' table:
 
      LOAD DATA INFILE 'persondata.txt' INTO TABLE persondata;
 
 By default, when no column list is provided at the end of the `LOAD
 DATA INFILE' statement, input lines are expected to contain a field for
 each table column. If you want to load only some of a table's columns,
 specify a column list:
 
      LOAD DATA INFILE 'persondata.txt' INTO TABLE persondata (col1,col2,...);
 
 You must also specify a column list if the order of the fields in the
 input file differs from the order of the columns in the table.
 Otherwise, MySQL cannot tell how to match input fields with table
 columns.
 
 Before MySQL 5.0.3, the column list must contain only names of columns
 in the table being loaded, and the `SET' clause is not supported. As of
 MySQL 5.0.3, the column list can contain either column names or user
 variables. With user variables, the `SET' clause enables you to perform
 transformations on their values before assigning the result to columns.
 
 User variables in the `SET' clause can be used in several ways. The
 following example uses the first input column directly for the value of
 `t1.column1', and assigns the second input column to a user variable
 that is subjected to a division operation before being used for the
 value of `t1.column2':
 
      LOAD DATA INFILE 'file.txt'
        INTO TABLE t1
        (column1, @var1)
        SET column2 = @var1/100;
 
 The `SET' clause can be used to supply values not derived from the
 input file. The following statement sets `column3' to the current date
 and time:
 
      LOAD DATA INFILE 'file.txt'
        INTO TABLE t1
        (column1, column2)
        SET column3 = CURRENT_TIMESTAMP;
 
 You can also discard an input value by assigning it to a user variable
 and not assigning the variable to a table column:
 
      LOAD DATA INFILE 'file.txt'
        INTO TABLE t1
        (column1, @dummy, column2, @dummy, column3);
 
 Use of the column/variable list and `SET' clause is subject to the
 following restrictions:
 
    * Assignments in the `SET' clause should have only column names on
      the left hand side of assignment operators.
 
    * You can use subqueries in the right hand side of `SET'
      assignments. A subquery that returns a value to be assigned to a
      column may be a scalar subquery only. Also, you cannot use a
      subquery to select from the table that is being loaded.
 
    * Lines ignored by an `IGNORE' clause are not processed for the
      column/variable list or `SET' clause.
 
    * User variables cannot be used when loading data with fixed-row
      format because user variables do not have a display width.
 
 When processing an input line, `LOAD DATA' splits it into fields and
 uses the values according to the column/variable list and the `SET'
 clause, if they are present. Then the resulting row is inserted into the
 table. If there are `BEFORE INSERT' or `AFTER INSERT' triggers for the
 table, they are activated before or after inserting the row,
 respectively.
 
 If an input line has too many fields, the extra fields are ignored and
 the number of warnings is incremented.
 
 If an input line has too few fields, the table columns for which input
 fields are missing are set to their default values.  Default value
 assignment is described in  data-type-defaults.
 
 An empty field value is interpreted differently than if the field value
 is missing:
 
    * For string types, the column is set to the empty string.
 
    * For numeric types, the column is set to `0'.
 
    * For date and time types, the column is set to the appropriate
      `zero' value for the type. See  date-and-time-types.
 
 These are the same values that result if you assign an empty string
 explicitly to a string, numeric, or date or time type explicitly in an
 `INSERT' or `UPDATE' statement.
 
 `TIMESTAMP' columns are set to the current date and time only if there
 is a `NULL' value for the column (that is, `\N'), or if the `TIMESTAMP'
 column's default value is the current timestamp and it is omitted from
 the field list when a field list is specified.
 
 `LOAD DATA INFILE' regards all input as strings, so you cannot use
 numeric values for `ENUM' or `SET' columns the way you can with
 `INSERT' statements. All `ENUM' and `SET' values must be specified as
 strings.
 
 When the `LOAD DATA INFILE' statement finishes, it returns an
 information string in the following format:
 
      Records: 1  Deleted: 0  Skipped: 0  Warnings: 0
 
 If you are using the C API, you can get information about the statement
 by calling the `mysql_info()' function. See  mysql-info.
 
 Warnings occur under the same circumstances as when values are inserted
 via the `INSERT' statement (see  insert), except that `LOAD DATA
 INFILE' also generates warnings when there are too few or too many
 fields in the input row. The warnings are not stored anywhere; the
 number of warnings can be used only as an indication of whether
 everything went well.
 
 You can use `SHOW WARNINGS' to get a list of the first
 `max_error_count' warnings as information about what went wrong. See
  show-warnings.
 
Info Catalog (mysql.info) insert (mysql.info) data-manipulation (mysql.info) replace
automatically generated byinfo2html