- Linux/UNIX Awk Command Tutorial with Examples
- Syntax of awk
- 1) Print all the lines from a file
- 2) Print only specific field like 2nd & 3rd
- 3) Print the lines which matches the pattern
- 4) How do we find unique values in the first column of name
- 5) How to find the sum of data entry in a particular column
- 6) How to find the total of all numbers in a column
- 7) How to find the sum of individual group records
- 8) Find the sum of all entries of specific columns and append it to the end of the file
- 9) How to find the count of entries against every column based on the first column
- 10) How to print only the first record of every group
- AWK Begin Block
- 11) How to populate each column names along with their corresponding data
- 12) How to change the field separator
- The Linux AWK Command – Linux and Unix Usage Syntax Examples
- What is the awk command?
- The Basic Syntax of the awk command
- How to create a sample file
- How to print all the contents of the file using awk
- How to print specific columns using awk
- How to print specific lines of a column
- How to print out lines with a specific pattern in awk
- How to use regular expressions in awk
- How to use comparison operators in awk
- Conclusion
Linux/UNIX Awk Command Tutorial with Examples
Awk is a scripting language for processing and analyzing text files. It is mainly used to group data by a field (column) or a set of fields, and to report that data in a useful form. It also provides BEGIN and END blocks for running code before and after the input is processed.
AWK stands for 'Aho, Weinberger, and Kernighan', the surnames of its authors.
In this tutorial, we will learn awk command with practical examples.
Syntax of awk
# awk 'pattern {action}' input-file > output-file
Let's take an input file with the following data:
$ cat awk_file
Name,Marks,Max Marks
Ram,200,1000
Shyam,500,1000
Ghyansham,1000
Abharam,800,1000
Hari,600,1000
Ram,400,1000
Now, let’s deep dive into practical examples of awk command.
1) Print all the lines from a file
With no pattern, awk applies its action to every line, so to print every line of the file created above, use the following command:
$ awk '{print;}' awk_file
Name,Marks,Max Marks
Ram,200,1000
Shyam,500,1000
Ghyansham,1000
Abharam,800,1000
Hari,600,1000
Ram,400,1000
Note: In the awk command, '{print;}' prints every field of each line.
2) Print only specific field like 2nd & 3rd
In awk, a $ (dollar) symbol followed by a field number refers to that field's value. In the example below, we print field 2 (Marks) and field 3 (Max Marks):
$ awk -F "," '{print $2, $3;}' awk_file
Marks Max Marks
200 1000
500 1000
1000
800 1000
600 1000
400 1000
In the above command, the option -F "," specifies that the comma (,) is the field separator in the file.
3) Print the lines which matches the pattern
To print the lines that contain the word "Hari" or "Ram", run:
$ awk '/Hari|Ram/' awk_file
Ram,200,1000
Hari,600,1000
Ram,400,1000
4) How do we find unique values in the first column of name
To print the unique values from the first column, run the awk command below:
$ awk -F, '{a[$1];}END{for (i in a)print i;}' awk_file
Abharam
Hari
Name
Ghyansham
Ram
Shyam
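The order in which a for (i in a) loop visits array keys is not specified by awk, so the list above may come out in a different order on another system. A minimal sketch of one way to get a stable order, assuming the same awk_file as above (recreated here so the snippet is self-contained) and piping the result through sort; the variable name unique is only for illustration:

```shell
# Recreate the sample file from this tutorial (overwrites awk_file in the current directory).
cat > awk_file <<'EOF'
Name,Marks,Max Marks
Ram,200,1000
Shyam,500,1000
Ghyansham,1000
Abharam,800,1000
Hari,600,1000
Ram,400,1000
EOF

# Store each distinct first field as an array key, then sort the keys.
# LC_ALL=C keeps the sort order independent of the local language settings.
unique=$(awk -F, '{ a[$1] } END { for (i in a) print i }' awk_file | LC_ALL=C sort)
echo "$unique"
```

The header value Name is still part of the list, just as in the command above.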
5) How to find the sum of data entry in a particular column
In awk, it is also possible to perform arithmetic on the lines that match a search.
In the example below, we search for Ram in the first field and add up the values of the 2nd field for those lines.
$ awk -F, '$1=="Ram"{x+=$2;}END{print x}' awk_file
600
6) How to find the total of all numbers in a column
In awk, we can also calculate the sum of all numbers in a column of a file. In the example below, we calculate the sum of all numbers in the 2nd column and then in the 3rd column.
$ awk -F"," '{x+=$2}END{print x}' awk_file
3500
$ awk -F"," '{x+=$3}END{print x}' awk_file
5000
7) How to find the sum of individual group records
For example, if we group the records by the value in the first column, we can sum the 2nd column for each group:
$ awk -F, '{a[$1]+=$2;}END{for(i in a)print i", "a[i];}' awk_file
Abharam, 800
Hari, 600
Name, 0
Ghyansham, 1000
Ram, 600
Shyam, 500
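The "Name, 0" line in that output comes from the header row, whose Marks text counts as 0 when added. A minimal sketch of one way to leave the header out, assuming the same awk_file (recreated here) and using the built-in NR line counter; the array name sum and the shell variable sums are made up for this example:

```shell
# Recreate the sample file from this tutorial (overwrites awk_file in the current directory).
cat > awk_file <<'EOF'
Name,Marks,Max Marks
Ram,200,1000
Shyam,500,1000
Ghyansham,1000
Abharam,800,1000
Hari,600,1000
Ram,400,1000
EOF

# NR > 1 skips the first line (the header); every other line adds
# its 2nd field to the running total kept under its 1st field.
sums=$(awk -F, 'NR > 1 { sum[$1] += $2 }
                END    { for (name in sum) print name ", " sum[name] }' awk_file | LC_ALL=C sort)
echo "$sums"
```

The final sort is only there to make the order of the groups predictable.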
8) Find the sum of all entries of specific columns and append it to the end of the file
As already discussed, awk can sum all the numbers in a column, so to append the sums of column 2 and column 3 at the end of the file's output, run:
$ awk -F"," '{x+=$2;y+=$3;print}END{print "Total,"x,y}' awk_file
Name,Marks,Max Marks
Ram,200,1000
Shyam,500,1000
Ghyansham,1000
Abharam,800,1000
Hari,600,1000
Ram,400,1000
Total,3500 5000
9) How to find the count of entries against every column based on the first column
$ awk -F, '{a[$1]++;}END{for (i in a)print i, a[i];}' awk_file
Abharam 1
Hari 1
Name 1
Ghyansham 1
Ram 2
Shyam 1
10) How to print only the first record of every group
To print only the first record of every group, run the awk command below:
$ awk -F, '!a[$1]++' awk_file
Name,Marks,Max Marks
Ram,200,1000
Shyam,500,1000
Ghyansham,1000
Abharam,800,1000
Hari,600,1000
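The short pattern '!a[$1]++' can be hard to read at first. As a rough sketch of what it does, the longer form below spells out the same logic: a line is printed only if its first field has not been counted yet, and the count is increased afterwards. The array name seen and the shell variables are chosen here for illustration only.

```shell
# Recreate the sample file from this tutorial (overwrites awk_file in the current directory).
cat > awk_file <<'EOF'
Name,Marks,Max Marks
Ram,200,1000
Shyam,500,1000
Ghyansham,1000
Abharam,800,1000
Hari,600,1000
Ram,400,1000
EOF

# Short form: the pattern is true only the first time a key is seen,
# and a true pattern with no action prints the line.
short_form=$(awk -F, '!seen[$1]++' awk_file)

# Long form: the same test and increment written as separate steps.
long_form=$(awk -F, '{ if (seen[$1] == 0) print; seen[$1]++ }' awk_file)

echo "$long_form"
```

Both commands drop the second Ram line and keep everything else in its original order.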
AWK Begin Block
Syntax for BEGIN block is
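As a minimal sketch of the general shape: a BEGIN block runs once before any input is read, the main block runs once for every input line, and an optional END block runs once after the last line. The two input lines and the printed labels below are invented for this illustration.

```shell
# BEGIN runs before the first line, the middle block runs per line,
# END runs after the last line (NR then holds the number of lines read).
out=$(printf 'alpha\nbeta\n' | awk 'BEGIN { print "start" }
                                          { print NR ": " $0 }
                                    END   { print "lines read: " NR }')
echo "$out"
```

This is why BEGIN is a common place to print a header or set variables such as the field separator.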
Let us create a datafile with below contents
11) How to populate each column names along with their corresponding data
12) How to change the field separator
As we can see space is the field separator in the datafile , in the below example we will change field separator from space to “|”
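A minimal sketch of that change, using a made-up space-separated datafile (its three rows are an assumption for this example, not the contents of the original datafile). Setting the built-in OFS variable in a BEGIN block sets the output field separator, and assigning a field to itself makes awk rebuild the line with it.

```shell
# Hypothetical datafile: the contents are invented for illustration.
cat > datafile <<'EOF'
135 9 red
343 7 blue
655 4 green
EOF

# OFS is set once before input is read; $1 = $1 forces awk to
# re-join the fields of each line using the new separator.
converted=$(awk 'BEGIN { OFS = "|" } { $1 = $1; print }' datafile)
echo "$converted"
```

Listing the fields explicitly, as in print $1, $2, $3, would give the same result for a file with three columns.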
That's all from this tutorial. I hope you found it informative.
The Linux AWK Command – Linux and Unix Usage Syntax Examples
Dionysia Lemonaki
In this beginner-friendly guide, you’ll learn the very basics of the awk command. You’ll also see some of the ways you can use it when dealing with text.
What is the awk command?
awk is a scripting language that is helpful when working in the command line. It's also a widely used command for text processing.
When using awk, you can select data (one or more pieces of individual text) based on a pattern you provide.
For example, with awk you can search for a specific word or pattern in a piece of text, or select a certain line or a certain column in a file you provide.
The Basic Syntax of the awk command
In its simplest form, the awk command is followed by a set of single quotation marks and a set of curly braces, with the name of the file you want to search through mentioned last.
It looks something like this:
awk '{action}' your_file_name.txt
When you want to search for text that has a specific pattern or you’re looking for a specific word in the text, the command would look something like this:
awk '/regex pattern/' your_file_name.txt
How to create a sample file
To create a file in the command line, you use the touch command.
For example: touch filename.txt, where filename is the name of your file.
On macOS, you can then use the open command (open filename.txt), and a text editor like TextEdit will open where you can add the contents of the file.
So, say you have a text file, information.txt , that contains data separated into different columns.
The file contents could look something like this:
fristName  lastName  age  city     ID

Thomas     Shelby    30   Rio      400
Omega      Night     45   Ontario  600
Wood       Tinker    54   Lisbon   N/A
Giorgos    Georgiou  35   London   300
Timmy      Turner    32   Berlin   N/A
In my example, there is one column for firstName , lastName , age , city , and ID .
At any time, you can view the output of the contents of your file by typing cat text_file , where text_file is the name of your file.
How to print all the contents of the file using awk
To print all the contents of a file, the action you specify inside the curly braces is print $0.
This works in exactly the same way as the cat command mentioned previously.
awk '{print $0}' information.txt
fristName  lastName  age  city     ID

Thomas     Shelby    30   Rio      400
Omega      Night     45   Ontario  600
Wood       Tinker    54   Lisbon   N/A
Giorgos    Georgiou  35   London   300
Timmy      Turner    32   Berlin   N/A
If you would like each line to have a line-number count, you would use the NR built-in variable:
awk '{print NR,$0}' information.txt
1 fristName  lastName  age  city     ID
2
3 Thomas     Shelby    30   Rio      400
4 Omega      Night     45   Ontario  600
5 Wood       Tinker    54   Lisbon   N/A
6 Giorgos    Georgiou  35   London   300
7 Timmy      Turner    32   Berlin   N/A
How to print specific columns using awk
When using awk , you can specify certain columns you want printed.
To have the first column printed, you use the command:
awk '{print $1}' information.txt
fristName

Thomas
Omega
Wood
Giorgos
Timmy
The $1 stands for the first field, in this case the first column.
To print the second column, you would use $2:
awk '{print $2}' information.txt
lastName

Shelby
Night
Tinker
Georgiou
Turner
By default, awk treats any run of spaces or tabs as the boundary between one column and the next.
To print more than one column, for example the first and fourth columns, you would do:
awk '{print $1,$4}' information.txt
fristName city

Thomas Rio
Omega Ontario
Wood Lisbon
Giorgos London
Timmy Berlin
The $1 represents the first input field (first column), and the $4 represents the fourth. You separate them with a comma, $1,$4, so the output has a space between them and is more readable.
To print the last field (the last column), you can also use $NF which represents the last field in a record:
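A minimal sketch of that, assuming a copy of the sample information.txt shown earlier (recreated here, including the empty line after the header). The NF > 0 pattern is an addition for this example: it skips empty lines, which have no fields at all.

```shell
# Recreate the sample file (overwrites information.txt in the current directory).
cat > information.txt <<'EOF'
fristName lastName age city ID

Thomas Shelby 30 Rio 400
Omega Night 45 Ontario 600
Wood Tinker 54 Lisbon N/A
Giorgos Georgiou 35 London 300
Timmy Turner 32 Berlin N/A
EOF

# NF is the number of fields on the current line, so $NF is the last field.
last_column=$(awk 'NF > 0 { print $NF }' information.txt)
echo "$last_column"
```

Because $NF follows the field count, it keeps pointing at the last column even if more columns are added to the file later.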
How to print specific lines of a column
You can also specify the line you want printed from your chosen column:
awk '{print $1}' information.txt | head -1
Let's break that command down. awk '{print $1}' information.txt prints the first column. Then the output of that command (which you saw earlier on) is piped, using the pipe symbol |, to the head command, where its -1 argument selects the first line of the column.
If you wanted two lines printed, you'd do:
awk '{print $1}' information.txt | head -2
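The same selection can also be made inside awk itself with the NR line counter, without piping to head. A minimal sketch, assuming a copy of the sample information.txt (recreated here, including the empty second line); the variable names are made up for this example, and head -n 2 is the longer spelling of head -2.

```shell
# Recreate the sample file (overwrites information.txt in the current directory).
cat > information.txt <<'EOF'
fristName lastName age city ID

Thomas Shelby 30 Rio 400
Omega Night 45 Ontario 600
Wood Tinker 54 Lisbon N/A
Giorgos Georgiou 35 London 300
Timmy Turner 32 Berlin N/A
EOF

# First two lines of column 1, once with head and once with an NR condition.
with_head=$(awk '{ print $1 }' information.txt | head -n 2)
with_nr=$(awk 'NR <= 2 { print $1 }' information.txt)

# NR == 3 picks exactly one line: the first data row in this file.
third=$(awk 'NR == 3 { print $1 }' information.txt)
echo "$third"
```

Since line 2 of the sample is empty, the first two lines of the column are the header and an empty line, and line 3 is the first name in the data.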
How to print out lines with a specific pattern in awk
You can print a line that starts with a specific letter.
awk '/^O/' information.txt
Omega      Night     45   Ontario  600
That command selects any line with text that starts with an O.
You use the caret symbol (^) first, which indicates the beginning of a line, and then the letter you want a line to start with.
You can also print a line that ends in a specific pattern:
awk '/0$/' information.txt
Thomas     Shelby    30   Rio      400
Omega      Night     45   Ontario  600
Giorgos    Georgiou  35   London   300
This prints out the lines that end in a 0. The $ symbol is used after a character to signify how a line will end.
That command could also be changed to:
awk '!/0$/' information.txt
The ! is used as a NOT, so in this case it selects the lines that DON'T end in a 0.
fristName  lastName  age  city     ID

Wood       Tinker    54   Lisbon   N/A
Timmy      Turner    32   Berlin   N/A
How to use regular expressions in awk
To print lines that match a pattern you specify, you again use the slashes, //, shown previously.
If you want to look for lines containing io, you'd do:
awk '/io/{print $0}' information.txt
Thomas     Shelby    30   Rio      400
Omega      Night     45   Ontario  600
Giorgos    Georgiou  35   London   300
This matches all entries that contain io.
Say you had an extra column – a department column:
fristName  lastName  age  city     ID   department

Thomas     Shelby    30   Rio      400  IT
Omega      Night     45   Ontario  600  Design
Wood       Tinker    54   Lisbon   N/A  IT
Giorgos    Georgiou  35   London   300  Data
Timmy      Turner    32   Berlin   N/A  Engineering
To find all the information of people working in IT, you would need to specify the string you're searching for between the slashes, //:
awk '/IT/' information.txt
Thomas     Shelby    30   Rio      400  IT
Wood       Tinker    54   Lisbon   N/A  IT
What if you wanted to see only the first and last names of the people working in IT ?
You can specify the columns like so:
awk '/IT/{print $1, $2}' information.txt
Thomas Shelby
Wood Tinker
This will only display the first and second columns of the lines where IT appears, instead of presenting all fields.
When searching for words with a specific pattern, there may be times when you'll need to use an escape character, like so:
awk '/N\/A$/' information.txt
Wood       Tinker    54   Lisbon   N/A
Timmy      Turner    32   Berlin   N/A
I wanted to find lines that end with the pattern N/A.
So, when searching between the slashes (//) as shown so far, I had to use an escape character (\) before the / in N/A. Otherwise, I would've gotten an error.
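There are also ways to match N/A without escaping the slash. A minimal sketch of two of them, assuming a copy of the five-column sample information.txt (recreated here): writing the pattern as a quoted string after the ~ match operator, or comparing the last field to the exact text. The variable names are made up for this example.

```shell
# Recreate the sample file (overwrites information.txt in the current directory).
cat > information.txt <<'EOF'
fristName lastName age city ID

Thomas Shelby 30 Rio 400
Omega Night 45 Ontario 600
Wood Tinker 54 Lisbon N/A
Giorgos Georgiou 35 London 300
Timmy Turner 32 Berlin N/A
EOF

# A pattern given as a string is not delimited by slashes,
# so the / inside N/A needs no backslash.
by_regex=$(awk '$0 ~ "N/A$"' information.txt)

# Comparing the last field to the literal text avoids a pattern entirely.
by_field=$(awk '$NF == "N/A"' information.txt)

echo "$by_field"
```

The field comparison is the stricter of the two: it only matches when the whole last field is exactly N/A.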
How to use comparison operators in awk
If, for example, you wanted to find all the information of employees that were under the age of 40, you would use the < comparison operator like so:
awk '$3 < 40 { print $0 }' information.txt
Thomas     Shelby    30   Rio      400
Giorgos    Georgiou  35   London   300
Timmy      Turner    32   Berlin   N/A
The output shows only the information of people under 40.
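Comparisons can be combined with && (and). A minimal sketch, assuming a copy of the five-column sample information.txt (recreated here); the age limit of 35 and the variable names are chosen only for illustration. The NR > 1 and NF > 0 conditions skip the header line, whose age field is text rather than a number, and the empty line.

```shell
# Recreate the sample file (overwrites information.txt in the current directory).
cat > information.txt <<'EOF'
fristName lastName age city ID

Thomas Shelby 30 Rio 400
Omega Night 45 Ontario 600
Wood Tinker 54 Lisbon N/A
Giorgos Georgiou 35 London 300
Timmy Turner 32 Berlin N/A
EOF

# People aged 35 or over: print first name and age.
older=$(awk 'NR > 1 && NF > 0 && $3 >= 35 { print $1, $3 }' information.txt)

# The same people, but only those whose ID is not N/A.
older_with_id=$(awk 'NR > 1 && NF > 0 && $3 >= 35 && $5 != "N/A" { print $1, $3 }' information.txt)

echo "$older"
echo "$older_with_id"
```

The first command prints Omega, Wood and Giorgos; the second drops Wood because the ID column for that row is N/A.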
Conclusion
And there you have it! You now know the absolute basics to start working with awk and manipulate text data.
To learn more about Linux, freeCodeCamp has a wide variety of learning materials available.
Thanks for reading and happy learning 😊