Programming with awk

Field variables

The fields of the current record can be referred to by the field variables $1, $2, ..., $NF. Field variables share all of the properties of other variables: they may be used in arithmetic or string operations, and they may have values assigned to them. So, for example, you can divide the second field of the file countries by 1000 to convert the area from thousands to millions of square miles:

   { $2 /= 1000; print }

or assign a new string to a field:

   BEGIN                   { FS = OFS = "\t" }
   $4 == "North America"   { $4 = "NA" }
   $4 == "South America"   { $4 = "SA" }
                           { print }

The BEGIN action in this program sets both the input field separator FS and the output field separator OFS to a tab. Notice that the print in the fourth line of the program prints the value of $0 after it has been modified by previous assignments: assigning to a field causes $0 to be rebuilt, with the fields joined by OFS.
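
The effect of a field assignment on $0 is easier to see when OFS differs from FS. The following is a minimal sketch, not part of the program above: the input record is assumed sample data in the tab-separated layout of countries, and "|" is chosen as OFS only to make the rebuilt record visible.

```shell
# One illustrative tab-separated record; the values are assumed sample data.
record=$(printf 'Canada\t3852\t25\tNorth America')

# No field is assigned, so print writes $0 exactly as it was read (tabs kept).
untouched=$(printf '%s\n' "$record" |
    awk 'BEGIN { FS = "\t"; OFS = "|" } { print }')

# Assigning to $4 makes awk rebuild $0, joining the fields with OFS.
rebuilt=$(printf '%s\n' "$record" |
    awk 'BEGIN { FS = "\t"; OFS = "|" } { $4 = "NA"; print }')

printf '%s\n%s\n' "$untouched" "$rebuilt"
```

The first line of output still contains tabs; the second is Canada|3852|25|NA.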

Fields can also be selected by expressions. For example, $(NF-1) is the next-to-last field of the current record. The parentheses are needed: because $ binds more tightly than subtraction, $NF-1 means ($NF)-1, which is 1 less than the numeric value of the last field.
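
The difference between the two expressions can be checked on a single record. This is a small sketch using a made-up line of four numbers, not data from countries:

```shell
# Made-up input record with four numeric fields.
# $(NF-1) selects field 3; $NF-1 subtracts 1 from the value of field 4.
pair=$(echo '10 20 30 40' | awk '{ print $(NF-1), $NF-1 }')
echo "$pair"
```

This prints 30 39: the third field, followed by the last field minus 1.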

A field variable that refers to a nonexistent field, for example $(NF+1), has the empty string as its initial value. Merely referring to such a field does not create it, but assigning a value to it does. For example, the following program invoked on the file countries creates a fifth field giving the population density:

   BEGIN  { FS = OFS = "\t" }
          { $5 = 1000 * $3 / $2; print }
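
The behavior of nonexistent fields can be observed directly. The sketch below uses a made-up three-field record and sets OFS to "-" purely so that empty fields show up in the output; neither comes from the program above.

```shell
fields=$(echo 'a b c' | awk 'BEGIN { OFS = "-" } {
    before = NF                  # the record has 3 fields on input
    empty = ($(NF+1) == "")      # 1 if the nonexistent field is the empty string
    $5 = "e"                     # creates field 5, and an empty field 4 before it
    print before, empty, NF
    print                        # $0 rebuilt with OFS between all 5 fields
}')
echo "$fields"
```

The output is 3-1-5 followed by a-b-c--e. Referring to $(NF+1) leaves NF unchanged, while the assignment to $5 raises NF to 5 and fills the skipped field with the empty string.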

The number of fields can vary from record to record, but an implementation may limit the number of fields in a single record; a limit of 100 fields was usual in older versions of awk.
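
Since NF is recomputed for every record, a program can inspect it to see how the field count varies. A minimal sketch with three made-up input lines:

```shell
# Three made-up records with 1, 2 and 3 whitespace-separated fields.
counts=$(printf 'one\ntwo words\nthree short words\n' | awk '{ print NF }')
echo "$counts"
```

Each output line gives the field count of the corresponding input record: 1, 2 and 3.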