AWK Command in Linux with Examples
The awk command is a Linux tool and programming language that allows users to process and manipulate data and produce formatted reports. The tool supports various operations for advanced text processing and facilitates expressing complex data selections.
In this tutorial, you will learn what the awk command does and how to use it.
AWK Command Syntax
The syntax for the awk command is:
awk [options] 'selection_criteria {action}' input-file > output-file
The available options are:
Option | Description |
---|---|
-F [separator] | Used to specify a field separator. The default separator is whitespace (spaces and tabs). |
-f [filename] | Reads the awk program source from the specified file instead of from the first command-line argument. |
-v [var=value] | Used to assign a value to a variable before the program starts. |
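The three options can be tried with a short sketch like the one below. The sample data, the variable names, and the temporary script file are assumptions made for illustration only.

```shell
# Hypothetical sample data: colon-separated "name:role" records.
data='amy:manager
bob:clerk'

# -F sets the input field separator to a colon.
names=$(printf '%s\n' "$data" | awk -F ':' '{ print $1 }')
printf '%s\n' "$names"

# -v assigns a value to an awk variable before the program starts.
greeting=$(printf '%s\n' "$data" | awk -F ':' -v prefix="User" '{ print prefix, $1 }')
printf '%s\n' "$greeting"

# -f reads the awk program from a file instead of the command line.
script=$(mktemp)
printf '%s\n' '{ print $2 }' > "$script"
roles=$(printf '%s\n' "$data" | awk -F ':' -f "$script")
rm -f "$script"
printf '%s\n' "$roles"
```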
Note: You might also be interested in learning about the Linux curl command, allowing you to transfer data to and from a server after processing it with awk.
How Does the AWK Command Work?
The awk command's main purpose is to make information retrieval and text manipulation easy to perform in Linux. The command works by scanning a set of input lines in order and searching for lines that match the patterns specified by the user.
For each pattern, users can specify an action to perform on each line that matches the specified pattern. Thus, using awk, users can easily process complex log files and output a readable report.
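The pattern-action idea can be sketched as follows. The log lines and the ERROR keyword are hypothetical and stand in for a real log file.

```shell
# Hypothetical log lines; the layout and the ERROR keyword are assumptions.
log='10:01 INFO service started
10:02 ERROR disk full
10:03 INFO retrying
10:04 ERROR timeout'

# The pattern /ERROR/ selects lines; the action prints each match
# prefixed with its record number (NR).
errors=$(printf '%s\n' "$log" | awk '/ERROR/ { print "line " NR ": " $0 }')
printf '%s\n' "$errors"
```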
Note: The awk command got its name from three people who wrote the original version in 1977 — Alfred Aho, Peter Weinberger, and Brian Kernighan.
AWK Operations
awk allows users to perform various operations on an input file or text. Some of the available operations are:
- Scan a file line by line.
- Split the input line/file into fields.
- Compare the input line or fields with the specified pattern(s).
- Perform various actions on the matched lines.
- Format the output lines.
- Perform arithmetic and string operations.
- Use control flow statements and loops.
- Transform the files and data according to a specified structure.
- Generate formatted reports.
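Several of these operations can be combined in one short program. The sketch below assumes a hypothetical three-column input (item, quantity, unit price) and is only meant to show field splitting, comparison, arithmetic, and formatted output working together.

```shell
# Hypothetical sales data: item, quantity, unit price.
sales='pens 10 1.50
paper 4 3.25
ink 2 12.00'

report=$(printf '%s\n' "$sales" | awk '
    $2 >= 4 {                             # compare a field with a value
        cost = $2 * $3                    # arithmetic on fields
        total += cost
        printf "%-6s %6.2f\n", $1, cost   # formatted output
    }
    END { printf "%-6s %6.2f\n", "total", total }
')
printf '%s\n' "$report"
```

The ink line is skipped because its quantity is below four, so the report holds two item lines and a total line.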
AWK Statements
The command provides basic control flow statements (if-else, while, for, break) and also allows users to group statements using braces {}.
The if-else statement works by evaluating the condition specified in the parentheses and, if the condition is true, the statement following the if statement is executed. The else part is optional.
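A minimal sketch of an if-else check, assuming a hypothetical comma-separated list of answers in which the second and third fields are compared:

```shell
# Hypothetical answers: name, first answer, second answer.
answers='amy,yes,yes
bob,no,yes
cid,no,no'

checked=$(printf '%s\n' "$answers" | awk -F ',' '{
    if ($2 == $3) {
        print $0                  # both answers match: show the line
    } else {
        print "No duplicates"
    }
}')
printf '%s\n' "$checked"
```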
The output shows the lines in which duplicates exist and states No duplicates if there are no duplicate answers in the line.
The while statement repeatedly executes a target statement as long as the specified condition is true, just like the while statement in the C programming language. If the condition is true, the body of the loop is executed. If the condition is false, awk continues with the statement that follows the loop.
For example, the following statement instructs awk to print all input fields one per line:
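A minimal sketch of such a loop, using a hypothetical three-word input line instead of a file:

```shell
fields=$(printf '%s\n' 'alpha beta gamma' | awk '{
    i = 1
    while (i <= NF) {         # NF is the number of fields in the record
        print i ": " $i
        i++
    }
}')
printf '%s\n' "$fields"
```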
The for statement also works like that of C, allowing users to create a loop that needs to execute a specific number of times.
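As a sketch, a for loop that counts from one to ten and prints the square of each value could look like this (the message wording is an assumption):

```shell
# A BEGIN-only program runs without reading any input.
squares=$(awk 'BEGIN {
    for (i = 1; i <= 10; i++)
        print "The square of", i, "is", i * i
}')
printf '%s\n' "$squares"
```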
The statement above increases the value of i by one until it reaches ten and calculates the square of i each time.
Note: The expressions in the condition part of if, while, or for can include relational operators, such as <, <=, == (is equal to), != (not equal to), >=, and >. The expressions can also include regular expression matches with the match operators ~ and !~, and the logical operators ||, &&, and !. Use parentheses to group expressions.
The break statement immediately exits from an enclosing while or for . To begin the next iteration, use the continue statement.
The next statement instructs awk to skip to the next record and begin scanning for patterns from the top. The exit statement instructs awk to stop reading input, as if the input had ended. If an END action is present, it still runs.
Following is an example of the break statement:
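A minimal sketch, assuming an otherwise endless while loop and a hypothetical counter x:

```shell
loop=$(awk 'BEGIN {
    x = 1
    while (1) {               # would run forever without break
        print "Iteration", x
        if (x == 5)
            break
        x++
    }
}')
printf '%s\n' "$loop"
```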
The command above breaks the loop after 5 iterations.
Note: The awk tool allows users to place comments in AWK programs. Comments begin with # and end at the end of the line.
AWK Patterns
Inserting a pattern in front of an action in awk acts as a selector. The selector determines whether to perform an action or not. The following expressions can serve as patterns:
- Regular expressions.
- Arithmetic relational expressions.
- String-valued expressions.
- Arbitrary Boolean combinations of the expressions above.
The following sections explain the above-mentioned expressions and how to use them.
Note: Learn how you can search for strings or patterns with the grep command.
Regular Expression Patterns
Regular expression patterns are the simplest form of expression: a string of characters enclosed in slashes. The string can be a sequence of letters, numbers, or a combination of both.
In the following example, the program outputs all the lines starting with "A". If the specified string is a part of a larger word, it is also printed.
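A sketch of such a match, assuming a hypothetical list in which the first field is a name:

```shell
# Hypothetical records: name and role.
names='Amy manager
Bob clerk
Alan analyst
Dana Admin'

# ^A anchors the match to the start of the first field, so "Dana Admin"
# is not selected even though its second field starts with "A".
starts_with_a=$(printf '%s\n' "$names" | awk '$1 ~ /^A/ { print $0 }')
printf '%s\n' "$starts_with_a"
```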
Relational Expression Patterns
Another type of awk pattern is the relational expression pattern. Relational expression patterns use any of the following relational operators: <, <=, ==, !=, >=, and >.
Following is an example of an awk relational expression:
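A minimal sketch, assuming hypothetical two-field records in which the second field is a number:

```shell
# The relational pattern $2 > 10 selects records; the action prints the name.
over_ten=$(printf '%s\n' 'amy 12' 'bob 7' 'cid 30' | awk '$2 > 10 { print $1 }')
printf '%s\n' "$over_ten"
```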
Range Patterns
A range pattern is a pattern consisting of two patterns separated by a comma. Range patterns perform the specified action for each line between the occurrence of pattern one and pattern two.
awk '/clerk/, /manager/ ' employees.txt
The pattern above instructs awk to print every line from the first line containing the keyword "clerk" through the next line containing the keyword "manager", inclusive.
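The range behavior can be checked with a small sketch. The staff list below is hypothetical and replaces the employees.txt file.

```shell
# Hypothetical staff list; only the lines from "clerk" through "manager" print.
staff='Amy analyst
Bob clerk
Cid intern
Dana manager
Eve director'

in_range=$(printf '%s\n' "$staff" | awk '/clerk/, /manager/')
printf '%s\n' "$in_range"
```

The intern line is printed even though it matches neither keyword, because it falls inside the range.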
Special Expression Patterns
Special expression patterns include BEGIN and END which denote program initialization and end. The BEGIN pattern matches the beginning of the input, before the first record is processed. The END pattern matches the end of the input, after the last record has been processed.
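For example, you can instruct awk to display a message at the beginning and at the end of the process (the main action and the closing message shown here are illustrative):
awk 'BEGIN { print "List of debtors:" } { print } END { print "End of the list" }' debtors.txt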
Combining Patterns
The awk command allows users to combine two or more patterns using logical operators. The combined patterns can be any Boolean combination of patterns. The logical operators for combining patterns are:
- || (logical OR).
- && (logical AND).
- ! (logical NOT).
For example:
awk '$3 > 10 && $4 < 20 {print $1, $2}' employees.txt
The output prints the first and second fields of those records whose third field is greater than ten and the fourth field is less than 20.
AWK Variables
The awk command splits each input record into separate parts called fields and provides the following built-in field variables to access them:
- $0 . Used to specify the whole line.
- $1 . Specifies the first field.
- $2 . Specifies the second field.
- etc.
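A short sketch of the field variables, assuming a hypothetical four-field input line:

```shell
line='Amy manager 12 15'

# $0 is the whole record.
whole=$(printf '%s\n' "$line" | awk '{ print $0 }')
printf '%s\n' "$whole"

# $1 and $2 are the first and second fields; here they are printed in reverse order.
swapped=$(printf '%s\n' "$line" | awk '{ print $2, $1 }')
printf '%s\n' "$swapped"
```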
Other available built-in awk variables are:
- NR . Counts the number of input records (usually lines). The awk command performs the pattern/action statements once for each record in a file.
The command displays the line number in the output.
- NF . Stores the number of fields in the current input record. Because of this, $NF refers to the last field of the record.
- FS . Contains the character used to divide fields on the input line. The default separator is space, but you can use FS to reassign the separator to another character (typically in BEGIN ).
For example, you can make the /etc/passwd file (user list) more readable by changing the separator from a colon (:) to a slash (/) and printing out the field separator as well:
- RS . Stores the current record separator character. By default, each input line is one record, which makes a newline the default record separator. Changing RS is useful when records are delimited by another character, such as the commas in a line of comma-separated values.
Note: We first used the cat command to show the file's contents and then formatted the output with AWK .
- OFS . Stores the output field separator, which separates the fields when printed. The default separator is a blank space. Whenever a print statement has several arguments separated with commas, the OFS value is printed between them. The example below assumes that the first two fields of employees.txt hold a name and a job title:
awk 'BEGIN { OFS=" works as " } { print $1, $2 }' employees.txt
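The built-in variables above can be sketched together as follows. The records are hypothetical and only shaped like /etc/passwd entries.

```shell
# Hypothetical colon-separated records.
users='root:x:0:0
amy:x:1000:1000'

# NR is the record number, NF the number of fields, $NF the last field.
numbered=$(printf '%s\n' "$users" | awk -F ':' '{ print NR, NF, $NF }')
printf '%s\n' "$numbered"

# FS splits the input on colons; OFS joins the printed fields with a slash.
slashed=$(printf '%s\n' "$users" | awk 'BEGIN { FS = ":"; OFS = "/" } { print $1, $3 }')
printf '%s\n' "$slashed"

# RS = "," turns each comma-separated item into its own record.
# The input has no trailing newline, so the last record is exactly "blue".
items=$(printf 'red,green,blue' | awk 'BEGIN { RS = "," } { print NR ": " $0 }')
printf '%s\n' "$items"
```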
AWK Actions
The awk tool follows rules containing pattern-action pairs. Actions consist of statements enclosed in curly braces {} which contain expressions, control statements, compound statements, input and output statements, and deletion statements. Those statements are described in the sections above.
Create an awk script using the following syntax:
This simple command instructs awk to print the specified string once for each line of input you type. Terminate the program using Ctrl+D.
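The behavior described above can be sketched non-interactively by piping two lines into an action-only program. The message text is a placeholder, not the original example.

```shell
# An action with no pattern runs once for every input record.
# Two input lines are piped in, so the message is printed twice.
printed=$(printf 'first\nsecond\n' | awk '{ print "Hello from awk" }')
printf '%s\n' "$printed"
```

When run interactively without the pipe, the same program waits for typed input and prints the message after each line until Ctrl+D ends the input.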
How to Use the AWK Command - Examples
Apart from manipulating data and producing formatted outputs, awk has other uses as it is a scripting language and not only a text processing command. This section explains alternative use cases for awk .
- Calculations. The awk command allows you to perform arithmetic calculations. For example:
In this example, we pipe the output of the df command into awk and use the information in the report to add up the total and used space of the mounted filesystems whose names contain /dev/loop.
The produced report shows, for each /dev/loop filesystem, the sum of columns two and three of the df output.
- Filtering. The awk command allows you to filter the output by limiting the length of the lines. For example:
In this example, we ran the /etc/shells system file through awk and filtered the output to contain only the lines containing more than 8 characters.
- Monitoring. Check if a certain process is running in Linux by piping the output of the ps command into awk. For example:
The output prints a list of all the processes running on your machine with the last field matching the specified pattern.
- Counting. You can use awk to count the number of characters in a line and get the number printed in the result. For example:
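The four use cases above can be sketched as follows. To keep the results reproducible, the sketch feeds awk simulated input instead of live df, ps, and /etc/shells output. All sample values and the sshd process name are assumptions.

```shell
# Calculations. Real-world form: df | awk '/\/dev\/loop/ { print $1 "\t" $2 + $3 }'
df_sample='Filesystem 1K-blocks Used Available Use% Mounted
/dev/sda1 1000 400 600 40% /
/dev/loop0 128 128 0 100% /snap/a
/dev/loop1 64 64 0 100% /snap/b'
sums=$(printf '%s\n' "$df_sample" | awk '/\/dev\/loop/ { print $1 "\t" $2 + $3 }')
printf '%s\n' "$sums"

# Filtering. Real-world form: awk 'length($0) > 8' /etc/shells
shells='/bin/sh
/bin/bash
/usr/bin/zsh'
long_lines=$(printf '%s\n' "$shells" | awk 'length($0) > 8')
printf '%s\n' "$long_lines"

# Monitoring. Real-world form: ps -ef | awk '$NF == "sshd" { print $0 }'
procs='root 1 0 /sbin/init
root 812 1 sshd
amy 990 812 bash'
running=$(printf '%s\n' "$procs" | awk '$NF == "sshd" { print $0 }')
printf '%s\n' "$running"

# Counting: print the number of characters in each line.
counts=$(printf '%s\n' "$shells" | awk '{ print "Line", NR, "has", length($0), "characters" }')
printf '%s\n' "$counts"
```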
Note: Read also about gawk command, a text-processing and data-manipulating tool.
After reading this tutorial, you know what the awk command is and how to use it for various use cases.
Because awk is a full scripting language as well as a text-processing command, it is essential knowledge for every Linux user.
Having worked as an educator and content writer, combined with his lifelong passion for all things high-tech, Bosko strives to simplify intricate concepts and make them user-friendly. That has led him to technical writing at PhoenixNAP, where he continues his mission of spreading knowledge.