AWK — A Language That Impressed Me

An Introduction and Exercises With Solutions

Nickson Joram
Feb 2, 2023

AWK is an interpreted programming language designed specifically for text processing. It is widely used for data manipulation and report generation. AWK programs can use variables, string functions, numeric functions, and logical operators without any compilation step. Its name is derived from the family names of its authors (Alfred Aho, Peter Weinberger, and Brian Kernighan).

Using the AWK utility, a programmer can write short but powerful programmes made of statements that specify text patterns to look for in each line of a document and the action to take when a match is found. AWK is mostly used for pattern scanning and processing.

AWK is a data-driven scripting language: a program is a collection of actions to be executed against streams of textual data. It can be used directly on files or as part of a pipeline to extract or convert text, for example to produce formatted reports. The language makes heavy use of the string datatype, associative arrays (i.e., arrays indexed by key strings), and regular expressions. AWK was created mainly for one-line programmes and has a narrow intended application area, but it also handles longer scripts.
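To make the pipeline and associative-array points concrete, here is a small sketch. The input words are made up for illustration, and the array name count is an arbitrary choice, not something AWK requires.

```shell
# Count how often each word appears, using an associative array keyed by the word.
# The sample input is invented for this illustration.
printf 'apple\nbanana\napple\ncherry\napple\n' |
  awk '{ count[$1]++ }                      # one array slot per distinct word
       END { for (w in count) print w, count[w] }' |
  sort    # for-in visits keys in an unspecified order, so sort the result
```

AWK sits in the middle of the pipeline here: it reads from printf and hands its result to sort.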

An AWK program’s basic structure has the following form:

pattern { action }

The pattern specifies when the action is carried out. Like the majority of UNIX tools, AWK is line-oriented: the pattern is a test applied to each line read as input, and the action is carried out if the test is true. The default pattern, used when the pattern is left empty, matches every line. The keywords “BEGIN” and “END” define two more significant patterns. As one might expect, they define what should be done before the first line is read and after the last line is read.

BEGIN { print "START" }
{ print "TODO" }
END { print "STOP" }
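To see the order in which the three rules fire, one can feed a few lines to a slightly extended version of the program above. This is only a sketch: the three input lines are invented, and the extra text in the messages is there just to make the output easier to follow.

```shell
# Three input lines; the pattern-less rule runs once per line.
printf 'a\nb\nc\n' | awk '
BEGIN { print "START" }                     # runs once, before any input is read
      { print "TODO: " $0 }                 # no pattern, so it runs for every line
END   { print "STOP after " NR " lines" }   # runs once, after the last line
'
```

START is printed first, then one TODO line per input line, and finally the STOP line, where NR holds the total number of lines read.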

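You must be familiar with AWK's internals in order to become a master AWK coder. Its workflow is a straightforward read, execute, and repeat loop: AWK reads a line of input, executes the matching actions on it, and repeats until the input is exhausted. The AWK workflow is shown in the diagram below.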

Let’s see some AWK executions

Let’s see how to print out the nth line of a data file.

Let’s create a new data text file first.

nano data.txt

After that, add the following lines:

1 This is my test data
2 This is my test data
3 This is my test data
4 This is my test data
5 This is my test data
6 This is my test data
7 This is my test data
8 This is my test data
9 This is my test data
10 This is my test data

Then, create an AWK script file for the task:

nano printnthline.awk

Add the script. Let's print the 2nd line.

NR == 2 {print}

Save it. To execute the script, run:

awk -f printnthline.awk data.txt

The output will be

2 This is my test data
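The same idea also works without a separate script file, and with the line number passed in from outside. The sketch below assumes the ten-line data.txt from above (it recreates it so the example is self-contained); the variable name n is my own choice.

```shell
# Recreate the ten-line sample file from above.
for i in 1 2 3 4 5 6 7 8 9 10; do
  echo "$i This is my test data"
done > data.txt

# n is a hypothetical variable name, set from the command line with -v.
# "exit" stops reading once the wanted line has been printed.
awk -v n=2 'NR == n { print; exit }' data.txt
```

Changing n=2 to another number selects a different line, without editing the program text.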

Then, how to print the last line?

# Remember every line that has at least one field; at END, "lines" holds the last one.
NF > 0 { lines = $0 }
END { printf("The last line with text is %s\n", lines) }

The output will be

The last line with text is 10 This is my test data
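The difference between "the last line" and "the last line with text" only shows up when a file ends with blank lines. Here is a small sketch with invented input; the variable names any and text are arbitrary.

```shell
# Invented input: two lines of text followed by a blank line.
printf 'first\nsecond\n\n' | awk '
NR > 0 { any = $0 }     # always true: ends up holding the final line, even if blank
NF > 0 { text = $0 }    # true only when the line has at least one field
END    { printf("last line: [%s]\nlast line with text: [%s]\n", any, text) }
'
```

A test on NR keeps whatever line came last, while a test on NF skips blank lines. For data.txt, which has no blank lines, both give the same result.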

Let's see how to write an AWK statement that displays only the lines of a data file that have exactly 3 fields.

NF == 3 {print $0;}

Use a new data file

5 6 7
Hello hi
10 2
H T C

The output will be

5 6 7
H T C
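To check why only those two lines match, it can help to print NF next to each line. This is just a quick sketch; the file name fields.txt is hypothetical, and the file is recreated so the example runs on its own.

```shell
# Recreate the four-line sample file; fields.txt is a hypothetical file name.
printf '5 6 7\nHello hi\n10 2\nH T C\n' > fields.txt

# Print the field count in front of every line to see what NF == 3 will match.
awk '{ print NF ": " $0 }' fields.txt
```

Only the first and last lines report a field count of 3, which is why they are the ones printed by the NF == 3 rule.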

Let's create a text file in which some lines end with the string test. Then let's write an awk statement that runs through the lines of the file, prints the lines that end with the word test (in any mix of upper and lower case, optionally followed by trailing spaces), and counts the number of lines printed.

BEGIN {nlines = 0}
/[Tt][Ee][Ss][Tt][ ]*$/ {print $0;nlines++;}
END {printf("The number of lines printed with test is %d\n",nlines);}

Let's use a new data file

1
2 this is a test
3 cat dog
4 people house
5 test test
6 TEst

The output will be

2 this is a test
5 test test
6 TEst
The number of lines printed with test is 3
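Spelling out each letter as [Tt] works, but it gets long. One alternative, sketched below, is to lower-case the line with the standard tolower function before matching. The file name words.txt and the counter name n are my own choices for this illustration.

```shell
# Recreate the six-line sample file; words.txt is a hypothetical file name.
printf '1\n2 this is a test\n3 cat dog\n4 people house\n5 test test\n6 TEst\n' > words.txt

# tolower($0) gives a lower-case copy of the line, so one plain regex covers all cases.
awk 'tolower($0) ~ /test *$/ { print; n++ }
     END { printf("The number of lines printed with test is %d\n", n) }' words.txt
```

The line itself is printed unchanged, since tolower returns a copy and does not modify $0. The counter n needs no BEGIN rule because an unset AWK variable counts as 0 in arithmetic.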

Let's write an awk statement that prints the fields of each line of a data file in reverse order.

{
    # Walk the fields from the last one (NF) down to the first.
    for (i = NF; i > 0; i--) {
        printf("%s ", $i);
    }
    printf("\n");
}

Let's use the first data file. The output is

data test my is This 1
data test my is This 2
data test my is This 3
data test my is This 4
data test my is This 5
data test my is This 6
data test my is This 7
data test my is This 8
data test my is This 9
data test my is This 10
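One detail of the loop above is that every output line ends with an extra space, because each field is printed as "%s ". If that matters, a possible variation is to build the reversed line in a variable first. This is only a sketch; the variable name line is arbitrary and the input is a single sample line.

```shell
# Reverse the fields of one sample line without leaving a trailing space.
echo "1 This is my test data" |
  awk '{
    line = $NF                                        # start with the last field
    for (i = NF - 1; i > 0; i--) line = line " " $i   # append the rest, right to left
    print line
  }'
```

The separator is added only between fields, so nothing is left over at the end of the line.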

I’ll add a few more exercises with solutions. Play with AWK and let me know what interests you most about it.

Share your thoughts.
