

Arrays

An array is a data structure used to store related strings and/or numbers. Arrays are made up of elements. Each element is equivalent to a single variable. The difference is you address the array with one name and each element with a subscript. Nawk supports one-dimensional and multidimensional arrays.

One-dimensional Arrays

Nawk provides one-dimensional arrays for storing data. A one-dimensional array lets you store multiple values under one variable name and select each value with a single subscript. For example, the following code assigns the string "Alabama" to the first element of the array named STATE and assigns "Wyoming" to the fiftieth element.

     STATE[1]="Alabama"
     STATE[50]="Wyoming"

To clarify the structure of an array, consider a very small post office. To keep it simple, say there is only one row of fifty mailboxes, side by side. The boxes are numbered from 1 to 50, and each box contains a piece of paper with the name of a state written on it.

The box numbers correspond to the subscripts used to address the elements of the array. The boxes themselves are like the elements, or cells, of the array. Thus, to refer to an element and retrieve the name of a state, you simply type STATE[subscript].

Arrays are declared automatically by nawk the first time you assign a value to an element of one. If no assignment has been made to an array element, the value returned is the null string ("") or a numeric 0, depending on the context in which the element is used.
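The following is a minimal sketch of automatic declaration and of the default value of an unassigned element. It assumes that the awk command on your system is a POSIX awk such as nawk; the shell function name state_defaults is hypothetical and serves only to wrap the example.

```sh
# Hypothetical wrapper; substitute nawk for awk if your system requires it.
state_defaults() {
  awk 'BEGIN {
    STATE[1] = "Alabama"        # the first assignment creates the array STATE
    STATE[50] = "Wyoming"
    print STATE[1]              # Alabama
    print STATE[50]             # Wyoming
    print "[" STATE[2] "]"      # unassigned element in a string context: []
    print STATE[3] + 0          # unassigned element in a numeric context: 0
  }'
}
state_defaults
```

Note that merely referring to STATE[2] and STATE[3] creates those elements with a null value.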

Subscripts

The unusual aspect of arrays in nawk is that the subscripts are strings; they are not truly numbers. This allows you to reference elements in an array with string values. For example,

     STATE["Alabama"]=1900200

assigns the value of 1,900,200 to the "Alabama" element of the STATE array. This is not the same as the STATE[1] element. To reference this element you simply use the "Alabama" string value.

Subscripts may be variables or constants. When you assign values to array elements, you must make sure you are consistent in using the same value to reference the correct element. For example,

     STATE["Alabama"]=10

and

     STATE[Alabama]=10

are not the same unless you have assigned the value "Alabama" to the variable Alabama. The second statement uses a variable named Alabama, which is null ("") until you assign it a value, rather than the string constant "Alabama".
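The difference between a string constant and a variable used as a subscript can be sketched as follows. This assumes a POSIX awk is available as awk; the function name subscript_kinds is hypothetical.

```sh
# Hypothetical wrapper; substitute nawk for awk if your system requires it.
subscript_kinds() {
  awk 'BEGIN {
    STATE["Alabama"] = 10         # subscript is the string constant "Alabama"
    print "[" STATE[Alabama] "]"  # variable Alabama is unset, so this reads STATE[""]
    Alabama = "Alabama"           # now the variable holds the string "Alabama"
    print STATE[Alabama]          # same element as STATE["Alabama"]: 10
  }'
}
subscript_kinds
```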

To determine whether a subscript exists in an array you can use the following expression.

     if ( "Wyoming" in STATE )

This statement checks the array STATE for a subscript named "Wyoming". It does not check whether the value "Wyoming" is stored in the array STATE.


CAUTION:  
When using numbers as subscripts you must remember ARRAY[1] references the same element as ARRAY["1"]. This is because nawk converts all subscript names to string values.
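As a small illustration of the subscript test and of the caution above, the sketch below stores "Wyoming" as a value under a numeric subscript. It assumes a POSIX awk is available as awk; the function name subscript_tests is hypothetical.

```sh
# Hypothetical wrapper; substitute nawk for awk if your system requires it.
subscript_tests() {
  awk 'BEGIN {
    STATE[50] = "Wyoming"
    # "in" looks at subscripts only, so the stored value is not found
    if ("Wyoming" in STATE) {
      print "Wyoming is a subscript"
    } else {
      print "Wyoming is not a subscript"
    }
    # the number 50 and the string "50" name the same element
    if (50 in STATE) print "50 is a subscript"
    print STATE["50"]
  }'
}
subscript_tests
```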



Multidimensional Arrays

Nawk also supports multidimensional arrays, or more precisely, a simulation of them. Since subscripts are strings, nawk concatenates the subscripts to form one unique subscript that addresses an element in a one-dimensional array. The following example illustrates how you might use a multidimensional array.

     for ( i = 1; i <= NF; i++ )
        records[NR,i] = $i

The loop steps through the fields of the current input record and assigns field i to the element of the records array with the subscripts NR and i. Nawk replaces the comma in the subscript with the character stored in the SUBSEP variable. By default this value is "\034", which is very unlikely to appear in any input text. Thus the subscript for 1,1 (the first field of the first record) is actually "1\0341".

To check for an existing element in a multidimensional array you use the following expression.

     if ( (i,j) in ARRAY )
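The simulation can be seen in a short sketch that reads two sample records, stores every field under a two-part subscript, and then tests for elements. The input lines, the function name multidim_demo, and the variable key are made up for illustration, and awk is assumed to be a POSIX awk such as nawk.

```sh
# Hypothetical wrapper; substitute nawk for awk if your system requires it.
multidim_demo() {
  printf 'a b\nc d\n' | awk '
    {
      for (i = 1; i <= NF; i++)
        records[NR, i] = $i        # stored under the subscript NR SUBSEP i
    }
    END {
      print records[2, 1]          # first field of the second record: c
      key = 1 SUBSEP 2             # build the combined subscript by hand
      print records[key]           # same element as records[1, 2]: b
      if ((2, 2) in records) print "2,2 exists"
      if (!((3, 1) in records)) print "3,1 missing"
    }'
}
multidim_demo
```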

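To loop through a multidimensional array, loop over its combined subscripts and split each one into a separate array (called SUBS here for illustration):

    for ( i in ARRAY )
      split( i, SUBS, SUBSEP )

After the split, SUBS[1] holds the first subscript and SUBS[2] the second. The target of split must not be ARRAY itself, because split clears its target array before filling it.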
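One way to use this loop is to recover the individual subscripts and total the values by row. The sketch below assumes a POSIX awk is available as awk; the names GRID, SUBS, ROWSUM, and multidim_loop are hypothetical. Because the order in which subscripts are visited is not defined, the example computes totals rather than printing elements as they are visited.

```sh
# Hypothetical wrapper; substitute nawk for awk if your system requires it.
multidim_loop() {
  awk 'BEGIN {
    GRID[1, 1] = 10; GRID[1, 2] = 20; GRID[2, 1] = 30
    for (key in GRID) {
      split(key, SUBS, SUBSEP)       # SUBS[1] is the row, SUBS[2] the column
      ROWSUM[SUBS[1]] += GRID[key]   # accumulate a total for each row
    }
    print ROWSUM[1], ROWSUM[2]       # 30 30
  }'
}
multidim_loop
```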

Looping Through Arrays

The in operator can also be used in a for loop to visit every subscript of an array. For example,

     for ( state in STATE )
       print state "   " STATE[state]

prints each subscript (the name of a state), three spaces, and the value stored in the element with that subscript. The order in which nawk visits the subscripts is not defined.

Deleting Array Elements

It may be necessary at times to delete an element from an array. For example,

     delete STATE["Alabama"]

deletes the "Alabama" element from the STATE array.
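Looping and deletion can be combined in a brief sketch. The values 10 and 20 and the function name loop_and_delete are illustrative only, and awk is assumed to be a POSIX awk such as nawk.

```sh
# Hypothetical wrapper; substitute nawk for awk if your system requires it.
loop_and_delete() {
  awk 'BEGIN {
    STATE["Alabama"] = 10
    STATE["Wyoming"] = 20
    n = 0
    for (state in STATE) n++         # visiting order is not defined, so just count
    print n                          # 2
    delete STATE["Alabama"]
    for (state in STATE)             # only one element is left
      print state "   " STATE[state]
    if (!("Alabama" in STATE)) print "Alabama deleted"
  }'
}
loop_and_delete
```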

Splitting Strings into Arrays

The split function splits a string into fields and stores the separate fields in elements of an array. For example,

     DATE="03/16/80"
     split( DATE, BIRTH, "/" )

splits the value of the variable DATE (03/16/80) into three fields using the / as the field separator. It stores "03" in BIRTH[1], "16" in BIRTH[2], and "80" in BIRTH[3].
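The date example can be run as the following sketch, which also shows that split returns the number of fields it found. It assumes a POSIX awk is available as awk; the function name split_date and the variable n are hypothetical.

```sh
# Hypothetical wrapper; substitute nawk for awk if your system requires it.
split_date() {
  awk 'BEGIN {
    DATE = "03/16/80"
    n = split(DATE, BIRTH, "/")         # split returns the number of fields
    print n                             # 3
    print BIRTH[1], BIRTH[2], BIRTH[3]  # 03 16 80
    print BIRTH[1] + 0                  # used as a number, "03" becomes 3
  }'
}
split_date
```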

Arithmetic Operators

The standard set of C arithmetic operators is supported by nawk. The following table defines each operator.


Operator   Function

+          addition
-          subtraction
*          multiplication
/          division
%          modulus
++         increment
--         decrement
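Each operator in the table can be tried with a short sketch such as the one below. It assumes a POSIX awk is available as awk; the function name arith_demo and the variable n are hypothetical. One difference from C is worth noting: awk performs division in floating point, so 7 / 2 yields 3.5 rather than 3.

```sh
# Hypothetical wrapper; substitute nawk for awk if your system requires it.
arith_demo() {
  awk 'BEGIN {
    print 7 + 2          # 9
    print 7 - 2          # 5
    print 7 * 2          # 14
    print 7 / 2          # 3.5, division is not truncated
    print 7 % 2          # 1
    n = 5
    n++                  # increment: n is now 6
    print n
    n--                  # decrement: n is back to 5
    print n
  }'
}
arith_demo
```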

Arithmetic Assignment Operators

These operators provide a more compact way to write arithmetic assignments than the conventional assignment operations. For example, to add 10 to the variable X you would normally type,

      X = X + 10

Using an assignment operator you type,

      X += 10

These two statements are equivalent. They both add 10 to the value of the variable X. The following table lists all assignment operators.


Operator   Function

+=         addition assignment; N += X is the same as N = N + X
-=         subtraction assignment
*=         multiplication assignment
/=         division assignment
%=         modulus assignment
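The assignment operators can be chained on a single variable to see the effect of each one. This is a sketch only, assuming a POSIX awk is available as awk; the function name assign_demo is hypothetical.

```sh
# Hypothetical wrapper; substitute nawk for awk if your system requires it.
assign_demo() {
  awk 'BEGIN {
    X = 10
    X += 10; print X     # 20, same as X = X + 10
    X -= 5;  print X     # 15
    X *= 2;  print X     # 30
    X /= 3;  print X     # 10
    X %= 4;  print X     # 2
  }'
}
assign_demo
```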

