Linux sort by columns

Sorting data based on second column of a file

I have a file with 2 columns and n rows. Column 1 contains names and column 2 contains ages. I want to sort the contents of this file in ascending order by age (the second column). The result should show the youngest person's name along with their age, then the second youngest, and so on. Any suggestions for a one-liner shell or bash script?


You can use the key option of the sort command, which takes a "field number", so if you wanted the second column you would pass -k2. To compare that field as a number, add:

    -n, --numeric-sort  compare according to string numerical value

    $ cat ages.txt
    Bob 12
    Jane 48
    Mark 3
    Tashi 54
    $ sort -k2 -n ages.txt
    Mark 3
    Bob 12
    Jane 48
    Tashi 54

Also note that using -h instead of -n will sort human-readable values like 2G or 3K. Numbers written with thousands separators, e.g. 1,234.5, are only handled when the current locale defines the comma as that separator.
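To make the difference between -n and -h concrete, here is a small sketch. It assumes a sort that supports -h (GNU coreutils does), and the sample sizes are made up for illustration.

```shell
# Made-up sizes in the style printed by tools such as `du -h`.
sizes=$(printf '3K\n2G\n10M\n512\n')

# -n reads only the leading digits, so the K/M/G suffixes are ignored.
by_number=$(printf '%s\n' "$sizes" | LC_ALL=C sort -n)

# -h ranks by suffix first (none < K < M < G), then by the number.
by_size=$(printf '%s\n' "$sizes" | LC_ALL=C sort -h)

printf 'sort -n:\n%s\n\nsort -h:\n%s\n' "$by_number" "$by_size"
```

With -n the 2G line lands first because only the digit 2 is compared; with -h it lands last.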

I ran into an issue with "wrong" ordering. Pay attention to this warning in the man page: "*** WARNING *** The locale specified by the environment affects sort order. Set LC_ALL=C to get the traditional sort order that uses native byte values." (This matters for string comparison, i.e. without -n.)
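A quick way to see what the warning means is to force the C locale on a mixed-case list. This is only a sketch with made-up words; the order you get without LC_ALL=C depends on the locale configured on your machine, so it is not shown here.

```shell
# Made-up sample words with mixed case.
words=$(printf 'banana\nApple\napple\nBanana\n')

# In the C locale, bytes are compared directly, so every uppercase
# letter sorts before every lowercase letter.
c_order=$(printf '%s\n' "$words" | LC_ALL=C sort)

printf '%s\n' "$c_order"
```

Running the same pipeline without LC_ALL=C and comparing the two results shows whether your locale changes the order.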

This doesn't handle spaces in the first column, and it doesn't work if there are more columns after the second, since -k2 reads until the end of the line. Assuming it is a TSV file, a better solution is sort -t$'\t' -k2,2 -n FILE
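The tab-separated case can be sketched as follows. The file contents are hypothetical, and the tab is produced with printf rather than $'\t' so that the snippet also runs in plain sh.

```shell
# Hypothetical TSV: name (may contain spaces), age, city.
tsv=$(mktemp)
printf 'Mary Ann\t48\tOslo\nBob\t12\tLima\nJean Luc\t3\tParis\n' > "$tsv"

# -t sets the separator to a tab; -k2,2n limits the key to field 2
# and compares it numerically.
sorted=$(LC_ALL=C sort -t "$(printf '\t')" -k2,2n "$tsv")

printf '%s\n' "$sorted"
rm -f "$tsv"
```

Because the key stops at field 2, the city column never influences the order.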

Solution:

    sort -k 2 -n filename

more verbosely written as:

    sort --key 2 --numeric-sort filename

Example:

    $ cat filename
    A 12
    B 48
    C 3
    $ sort --key 2 --numeric-sort filename
    C 3
    A 12
    B 48

Explanation:

  • -k # : this argument specifies the first column that will be used to sort. (Note that a column here is defined as a whitespace-delimited field; the argument -k5 will sort starting with the fifth field in each line, not the fifth character in each line.)
  • -n : this option specifies a "numeric sort", meaning that the column should be interpreted as a number instead of text.

More:

Other common options include:

  • -r : This option reverses the sorting order. It can also be written as --reverse.
  • -i : This option ignores non-printable characters. It can also be written as --ignore-nonprinting.
  • -b : This option ignores leading blank spaces, which is handy as white space is used to determine where fields begin. It can also be written as --ignore-leading-blanks.
  • -f : This option ignores letter case ("A" == "a"). It can also be written as --ignore-case.
  • -t [new separator] : This option makes sort use a field separator other than whitespace. It can also be written as --field-separator.
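As a rough illustration of how these options combine, the sketch below sorts a small comma-separated list two ways. The names and ages are sample data, not taken from a real file.

```shell
# Sample comma-separated data: name,age.
people=$(printf 'bob,12\nJane,48\nmark,3\nTashi,54\n')

# -t, splits on commas; the key is field 2 only, numeric (n), reversed (r).
oldest_first=$(printf '%s\n' "$people" | LC_ALL=C sort -t, -k2,2nr)

# The f modifier folds case on field 1, so "bob" sorts before "Jane".
by_name=$(printf '%s\n' "$people" | LC_ALL=C sort -t, -k1,1f)

printf 'Oldest first:\n%s\n\nBy name, ignoring case:\n%s\n' "$oldest_first" "$by_name"
```

Option letters attached to a key (such as nr or f) apply to that key only, which is useful when different columns need different rules.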

There are other options, but these are the most common and helpful ones, that I use often.


How to Sort in Linux Bash by Column

The sort command available in Linux allows users to perform sorting operations on a file or on standard input. It is handy when we want the contents of a file in ascending, descending, or custom-defined order. The sort command never alters the original file by itself; it writes the result to standard output. To replace a file with its sorted contents, use the -o option, because redirecting the output back to the input file with > empties the file before sort reads it.

This article covers how to use the sort command to perform sorting operations on specific columns in a file.

Basic Usage

The sort command is simple to use and very useful in daily Linux operations. The general syntax of the command is:

    sort [OPTION]... [FILE]...

The options you pass to the command modify how the file is sorted and set the specific conditions for sorting the target file. You can omit the options to use the default sorting parameters.

By default, the sort command:

  • Sorts letters in ascending alphabetical order.
  • Places digits before letters.
  • Places lowercase letters before their uppercase counterparts in most UTF-8 locales (with LC_ALL=C, all uppercase letters come first).

For example, to sort a file without options:

Once we run the sort command against the file, we get the information sorted in alphabetical order (ascending).

NOTE: Lines that start with digits are placed before lines that start with letters.
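The default behavior can be tried out with a sketch like the one below. The input lines are invented, and LC_ALL=C is set so that the result does not depend on the local language settings.

```shell
# Invented mix of numbers and words.
mixed=$(printf '10\n9\nzebra\nApple\n2\nmango\n')

# No options: lines are compared as text, byte by byte in the C locale.
default_order=$(printf '%s\n' "$mixed" | LC_ALL=C sort)

printf '%s\n' "$default_order"
```

Digits come out ahead of letters, but 10 is placed before 2 and 9 because the comparison is textual, not numeric.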

Sort Command Options

You can use the following options in conjunction with the raw command to modify how the values are sorted.

  • -n – sorts by numerical value.
  • -h – compares human-readable numbers such as 1K, 1G.
  • -R – sorts in random order but groups identical keys.
  • -r – sorts the values in reverse (descending) order.
  • -o – saves output to a file.
  • -c – checks whether the input is already sorted; does not sort.
  • -u – shows unique values only.
  • -k – sorts the data by a specific key (useful when sorting columnar data).

Those are some popular options you can tweak to get the best-sorted result. For more options, check the manual.
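Two of the options listed above, -u and -c, are easy to overlook, so here is a brief sketch of both. The fruit list is sample data.

```shell
# Sample list with one duplicate, deliberately unsorted.
list=$(printf 'pear\napple\npear\nfig\n')

# -u sorts and keeps a single copy of each repeated line.
unique=$(printf '%s\n' "$list" | LC_ALL=C sort -u)

# -c only checks the order: it exits non-zero (and prints a
# diagnostic to stderr) when the input is not sorted.
if printf '%s\n' "$list" | LC_ALL=C sort -c 2>/dev/null; then
    status=sorted
else
    status=unsorted
fi

printf 'Unique lines:\n%s\n\nInput is %s\n' "$unique" "$status"
```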

How to Sort In Linux Bash By Numerical Values

How to Sort In Linux Bash By Reverse Order

To sort input in reverse order, we use the -r flag. For example:

The command above sorts in the default alphabetical order (numerical values first) and then reverses it, so the output is in descending order.
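A minimal sketch of reverse sorting, using three made-up lines fed through a pipe instead of a file:

```shell
# -r flips the comparison, so the alphabetically last line comes first.
reversed=$(printf 'apple\ncherry\nbanana\n' | LC_ALL=C sort -r)

printf '%s\n' "$reversed"
```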

How to Sort In Linux Bash by Column

Sort allows us to sort a file by columns by using the -k option. Let us start by creating a file with more than one column. By default, sort treats each run of blanks (spaces or tabs) as the separator between columns.

In the example file below, we have six columns.

To sort the captains file above by century, we specify -k followed by the column number:

    sort -k 5 captains.txt

Once we specify the column to sort the data, the sort command will try to sort the values in ascending order. In the example above, the command sorts the values from the earliest century to the latest.
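Column sorting can be sketched with a stand-in file. The three rows below are hypothetical placeholders with six columns and the century in column 5; they are not the tutorial's original data. The key is written as 5,5n to limit it to that one column and compare it as a number.

```shell
# Hypothetical six-column file; column 5 holds the century.
captains=$(mktemp)
cat > "$captains" <<'EOF'
Ada Stone Captain Aurora 21 Alpha
Ben Frost Captain Borealis 19 Beta
Cy Marsh Captain Comet 23 Gamma
EOF

# Sort on column 5 only, numerically, earliest century first.
by_century=$(LC_ALL=C sort -k 5,5n "$captains")

printf '%s\n' "$by_century"
rm -f "$captains"
```

Writing -k 5 without the ",5" would extend the key from column 5 to the end of the line, which also pulls column 6 into the comparison.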


To sort by the first name, set the sort column as 1:

How to Save Sort Output to a File

To save the sorted output to a file, we can use the -o option as:

The command above will sort the captains.txt file by the 5th column and save the result to the captains_century.txt file.
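Saving the result can be sketched as below. The file names match the ones in the text, but the rows are hypothetical placeholders and everything is created in a temporary directory.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cat > "$workdir/captains.txt" <<'EOF'
Ada Stone Captain Aurora 21 Alpha
Ben Frost Captain Borealis 19 Beta
Cy Marsh Captain Comet 23 Gamma
EOF

# -o writes the sorted lines to the named file instead of the terminal.
LC_ALL=C sort -k 5,5n -o "$workdir/captains_century.txt" "$workdir/captains.txt"
saved=$(cat "$workdir/captains_century.txt")

# -o may name the input file itself: sort reads all input before writing.
LC_ALL=C sort -k 5,5n -o "$workdir/captains.txt" "$workdir/captains.txt"
in_place=$(cat "$workdir/captains.txt")

printf '%s\n' "$saved"
rm -rf "$workdir"
```

The second call is the safe way to sort a file in place; a plain > redirection to the same file would truncate it first.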

Conclusion

That is the end of this tutorial on the sort command in Linux. We covered the basics of using the sort command to get the most out of your sorted data. Feel free to explore how you can use the sort command.

About the author

John Otieno

My name is John and I am a fellow geek like you. I am passionate about all things computers, from hardware and operating systems to programming. My dream is to share my knowledge with the world and help out fellow geeks. Follow my content by subscribing to the LinuxHint mailing list.


Sort by Column in Bash

Learn various examples of sorting data by columns in bash scripts and Linux command line.

Once in a while, we all encounter a situation where we have a huge text file and want to sort the data accordingly.

Especially in times when you are given data in tabular form.

Sure, if you have access to GUI, there are multiple tools that do sorting effortlessly for you but what about sorting data by columns in bash directly?

And in this guide, I will walk you through how you can sort data by column in bash.

How to sort by column in bash

Throughout this guide, I will be using the sort command to sort data by column.

To sort columns, you will have to use a -k flag followed by the number of a column you want to sort.

So the syntax to sort columns using the sort command will look like this:

sort -k [Column_number] Filename

But before going to the sorting examples, let me share how the sort command will sort the data by default:

  • Sorts data in alphabetical order
  • Numbers will always come before the alphabet
  • Lowercase letters are prioritized before the capital ones

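In short, the order looks like this: Numbers > Lowercase > Uppercase. This ordering depends on the locale; with LC_ALL=C, uppercase letters come before lowercase ones.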

And for the sake of this tutorial, I will be using a file named Students.txt which contains the basic data of 7 students as follows:

    Name Enrollment_No City
    Sagar 181240116054 Nadiad
    Milan 181240116019 Aurangabad
    Anurag 181240116018 Ahmedabad
    Priya 181240116001 Ananad
    Abhiman 181240116050 Varanasi
    Aayush 181240115019 Karnataka
    Ankush 181240116056 Bhubaneswar

Let's say I want to sort by the 2nd column of the Students.txt file, so I will be using the following:

    sort -k 2 Students.txt

[Figure: sort column in bash]

And as you can see, the 2nd column is sorted in ascending order (the default behavior).
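Since the screenshot is not reproduced here, the following sketch recreates the step with the Students.txt data from above. Skipping the header row with tail and limiting the key to 2,2 are additions of this sketch, not part of the original command.

```shell
# Recreate the sample file in a temporary location.
students=$(mktemp)
cat > "$students" <<'EOF'
Name Enrollment_No City
Sagar 181240116054 Nadiad
Milan 181240116019 Aurangabad
Anurag 181240116018 Ahmedabad
Priya 181240116001 Ananad
Abhiman 181240116050 Varanasi
Aayush 181240115019 Karnataka
Ankush 181240116056 Bhubaneswar
EOF

# Skip the header line, then sort the rest on column 2 only.
by_enrollment=$(tail -n +2 "$students" | LC_ALL=C sort -k 2,2)

printf '%s\n' "$by_enrollment"
rm -f "$students"
```

The enrollment numbers all have the same length, which is why a plain text comparison happens to give the numeric order here.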

But what if you want to change how the data is sorted?

Sort columns by numbers in bash

You might be wondering: if numbers are sorted by default, why am I even writing this?


Well, the problem is that by default the sort command compares numbers as text, one character at a time, so 10 is placed before 9.

Let me share an example here. Here, I created a sample file with random numbers in one column and used the sort command without any additional options:

[Figure: sort command will only sort numbers based on the first character]

And as you can see, the result is messed up!

To solve this issue, you will need to use the -n flag with the existing command:

sort -n -k [Column_number] Filename

[Figure: sort numbered columns in bash]
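The effect of -n can be sketched by sorting the same column twice. The labels and scores are made-up sample data.

```shell
# Made-up data: a label and a number of varying length.
scores=$(printf 'a 100\nb 9\nc 25\nd 1000\n')

# Without -n, column 2 is compared as text, character by character.
as_text=$(printf '%s\n' "$scores" | LC_ALL=C sort -k 2,2)

# With -n, column 2 is compared by numeric value.
as_numbers=$(printf '%s\n' "$scores" | LC_ALL=C sort -n -k 2,2)

printf 'As text:\n%s\n\nAs numbers:\n%s\n' "$as_text" "$as_numbers"
```

In the text comparison, 1000 is placed ahead of 25 and 9 because its first character is 1.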

Sort by columns in reverse in bash

You may encounter times when you want to sort columns in descending order.

For that purpose, you will have to utilize the -r flag:

sort -r -k [Column_number] Filename

Here, I sorted the 3rd column in reverse:

[Figure: sort by column in reverse in bash]
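A reverse sort on the 3rd column might look like the sketch below, which uses four rows taken from the Students.txt sample and passes them through a pipe rather than a file.

```shell
# Four rows from the sample data: name, enrollment number, city.
rows=$(printf 'Sagar 181240116054 Nadiad\nMilan 181240116019 Aurangabad\nAbhiman 181240116050 Varanasi\nAayush 181240115019 Karnataka\n')

# -r reverses the order; -k 3,3 restricts the key to the city column.
cities_desc=$(printf '%s\n' "$rows" | LC_ALL=C sort -r -k 3,3)

printf '%s\n' "$cities_desc"
```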

Sort command can do a lot more.

Sort command when used with multiple flags can do wonders for you, especially when you have multiple files to deal with.

For this purpose, we have a dedicated guide on how you can use the sort command, with multiple examples.

I hope this guide has solved the queries you had before.

And if you still have doubts, feel free to ask in the comments.


Linux shell sort file according to the second column?

You can use multiple -k flags to sort on more than one column. For example, to sort by family name then first name as a tie breaker:
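One way this could look is sketched below, assuming the first name is in column 1 and the family name in column 2. The names are sample data.

```shell
# Sample data: first name, family name.
names=$(printf 'Zoe Adams\nAmy Clark\nBob Adams\nCal Baker\n')

# The first -k is the primary key (family name); the second -k is
# consulted only when two lines tie on the first.
sorted_names=$(printf '%s\n' "$names" | LC_ALL=C sort -k 2,2 -k 1,1)

printf '%s\n' "$sorted_names"
```

The two Adams entries are ordered by first name, so Bob comes before Zoe.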

Relevant options from "man sort":

-k, --key=POS1[,POS2]

start a key at POS1, end it at POS2 (origin 1)

POS is F[.C][OPTS], where F is the field number and C the character position in the field. OPTS is one or more single-letter ordering options, which override global ordering options for that key. If no key is given, use the entire line as the key.

-t, --field-separator=SEP

use SEP instead of non-blank to blank transition

Consider using --field-separator=',' if a data entry operator might type in values for "First name" like "Billy Bob" or similar. Spaces can easily get into your data if you don't guard against it, but commas are relatively unlikely.

There are very likely cases of commas in those fields, like «Smith, Jr.» or «McDowell, Sr.» or «Dr. John» or «New York, NY»

Note that if the columns are visually aligned, i.e. there is a varying number of spaces between fields, you must use the -b option. This is because sort considers the key to start right where the previous field ends, so the leading blanks are part of the key, rather than at the first letter of the column. Also, you may need to prefix the command with LC_ALL=C to avoid any side effects of the locale, which can occur even with a plain ASCII file.

@calandoa Thanks for the part on -b (--ignore-leading-blanks). To clarify a bit: echo -e 'a a\na  b' | sort -k2 gives "a  b" first (the second column starts at the first non-blank to blank transition, so the key includes its leading blanks, and a blank sorts before a), but with -b it gives "a a" first as expected (a is before b).
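The -b behavior discussed in these comments can be checked with a sketch like this one. The two input lines are a minimal made-up case, and LC_ALL=C is set because some locales ignore blanks when comparing.

```shell
# Two lines whose second column is preceded by one and two spaces.
rows=$(printf 'a a\na  b\n')

# Without -b the key keeps its leading blanks: " a" versus "  b".
# A space sorts before a letter, so the "b" line comes first.
plain=$(printf '%s\n' "$rows" | LC_ALL=C sort -k 2)

# With -b the leading blanks are skipped: "a" versus "b".
trimmed=$(printf '%s\n' "$rows" | LC_ALL=C sort -b -k 2)

printf 'Without -b:\n%s\n\nWith -b:\n%s\n' "$plain" "$trimmed"
```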

