How to cut first n and last n columns?
The first part of your question is easy. As already pointed out, cut accepts omission of either the starting or the ending index of a column range, interpreting this as meaning either “from the start to column n (inclusive)” or “from column n (inclusive) to the end,” respectively:
$ printf 'this:is:a:test' | cut -d: -f-2
this:is
$ printf 'this:is:a:test' | cut -d: -f3-
a:test
It also supports combining ranges. If you want, e.g., the first 3 and the last 2 columns in a row of 7 columns:
$ printf 'foo:bar:baz:qux:quz:quux:quuz' | cut -d: -f-3,6-
foo:bar:baz:quux:quuz
However, the second part of your question can be a bit trickier depending on what kind of input you’re expecting. If by “last n columns” you mean “last n columns (regardless of their indices in the overall row)” (i.e. because you don’t necessarily know how many columns you’re going to find in advance) then sadly this is not possible to accomplish using cut alone. In order to effectively use cut to pull out “the last n columns” in each line, the total number of columns present in each line must be known beforehand, and each line must be consistent in the number of columns it contains.
If you do not know how many “columns” may be present in each line (e.g. because you’re working with input that is not strictly tabular), then you’ll have to use something like awk instead. E.g., to use awk to pull out the last 2 “columns” (awk calls them fields, the number of which can vary per line) from each line of input:
$ printf '/a\n/a/b\n/a/b/c\n/a/b/c/d\n' | awk -F/ '{ print $(NF-1) FS $NF }'
/a
a/b
b/c
c/d
How to find the last field using ‘cut’ in Linux?
In the Linux environment, it is often necessary to extract specific fields from text data in order to manipulate or process the information. One common tool used for this purpose is the ‘cut’ command, which allows you to split a line of text into separate fields based on a specified delimiter. However, in some cases, you may want to extract only the last field from a line of text, which is not directly supported by the ‘cut’ command. In this case, you can use one of the following methods to achieve this goal.
Method 1: Using awk
To find the last field of a line using awk in Linux, you can use the built-in variable NF, which represents the number of fields in the current line.
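For example, a couple of sketches (the file name and delimiters here are only illustrative):

awk '{ print $NF }' file.txt
awk -F',' '{ print $(NF-1) }' file.txt

The first prints the last whitespace-separated field of each line; the second prints the second-to-last comma-separated field.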
In summary, the NF variable in awk represents the number of fields in the current line, and you can use it to access the last field or any other field by its position relative to the last field. You can also use pattern matching or specify a different delimiter to extract the last field from lines that have a specific format.
Method 2: Using rev and cut
To find the last field in a line using cut and rev, follow these steps:
- Use rev to reverse the order of the characters in the line.
- Use cut to extract the first field from the reversed line.
- Use rev again to reverse the extracted field back to its original order.
Here is an example command that demonstrates this approach:
echo "first second third" | rev | cut -d ' ' -f 1 | rev
In this example, the echo command outputs the string "first second third". The rev command then reverses the order of the characters in the string, resulting in "driht dnoces tsrif". The cut command extracts the first field from this reversed string using a space delimiter (-d ' '), resulting in "driht". Finally, the second rev command reverses this extracted field back to its original order, resulting in "third".
Here is another example that demonstrates how to use this approach with a file:
cat file.txt | rev | cut -d ',' -f 1 | rev
In this example, the cat command reads the contents of the file "file.txt". The rev command reverses the order of the characters in each line of the file. The cut command then extracts the first comma-separated field (-d ',') from each reversed line, which corresponds to the last field of the original line. Finally, the second rev command reverses each extracted field back to its original order.
Overall, using rev and cut to find the last field in a line is a simple and effective approach that can be used in a variety of Linux command-line scenarios.
Method 3: Using Perl
To find the last field of a string using Perl, you can use the split function. The split function takes two arguments: the delimiter and the string to be split. To split a string using a delimiter, you can use the following syntax:
my @fields = split /delimiter/, $string;
To find the last field, you can take a list slice of split's return value:
my $last_field = (split /delimiter/, $string)[-1];
This splits the string using the specified delimiter and returns a list of fields. The [-1] index selects the last element of that list, which is the last field of the string.
Here is an example code that demonstrates how to find the last field of a comma-separated string using Perl:
#!/usr/bin/perl use strict; use warnings; my $string = "apple,banana,orange"; my $last_field = (split /,/, $string)[-1]; print "Last field: $last_field\n";
You can also use regular expressions to split the string and find the last field. Here is an example code that demonstrates how to find the last field of a string using a regular expression:
#!/usr/bin/perl use strict; use warnings; my $string = "apple:banana:orange"; my ($last_field) = $string =~ /([^:]+)$/; print "Last field: $last_field\n";
In this example, the regular expression ([^:]+)$ matches a run of non-colon characters at the end of the string and captures it in a group. The $ anchor matches the end of the string, and the parentheses around $last_field force list context so that the captured group is assigned to the variable.
Alternatively, you can use the reverse function to reverse the list of fields returned by split and then take the first element of the reversed list, which is the last field of the original string.
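A minimal sketch of this alternative (the sample string and delimiter are illustrative):

#!/usr/bin/perl
use strict;
use warnings;

my $string = "apple,banana,orange";
# reverse the list of fields, then take the first element
my $last_field = (reverse split /,/, $string)[0];
print "Last field: $last_field\n";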
How to use cut command to get the first and last elements of a row?
I’ve asked almost the same question already, but this time I want to retrieve the last X elements of each row of a CSV file. For example, with an input file like this one:
1;foo;bar;baz;x;y;z
2;foo;bar;baz;x;y;z
3;foo;bar;baz;x;y;z
In fact, my real target is to retrieve the first 3 and the last 2 fields of each row, so I get:
1;foo;bar;y;z
2;foo;bar;y;z
3;foo;bar;y;z
Unfortunately, I cannot use a command like cut -d \; -f 1-3,10-11 (if there are 11 elements in the row), because the CSV file does not respect the real CSV format. Indeed, some fields in the middle of the rows are encrypted, and their encrypted values may sometimes contain a ; character (and of course, they are not wrapped inside quotes). In other words, my lines may look like this:
1;foo;bar;#@$"é&^l#;baz;x;y;z 2;foo;bar;#¤=é;)o'#;baz;x;y;z 3;foo;bar;#]]'~é
and as you can see, on the second line there is an additional ; character, so I can't use a command like cut -d \; -f 1-3,7-8 here, because it would return this, which is wrong:
1;foo;bar;y;z
2;foo;bar;x;y   (-> wrong here, there is a shift)
3;foo;bar;y;z
So how can I use cut to solve my problem? Thanks. PS: I am especially in love with the cut command, so if you have a command that does what I want but that is not cut, then that's fine too 🙂 Edit: it seems important to note that the machine is quite old; uname -a gives this message:
SunOS ###### 5.10 Generic_142900-05 sun4u sparc SUNW,Sun-Fire-V240
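For what it's worth, awk can do this without caring how many fields sit in the middle. A sketch, assuming the file is named file.csv (on Solaris 10 you may need nawk or /usr/xpg4/bin/awk rather than the old /usr/bin/awk):

awk -F';' '{ print $1 FS $2 FS $3 FS $(NF-1) FS $NF }' file.csv

This prints the first 3 and last 2 ;-separated fields of each line, whatever the field count NF turns out to be, so the extra ; inside an encrypted field causes no shift.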
How to get second last field from a cut command
Got a hint from Unix cut except last two tokens and was able to figure out the answer:
cat datafile | rev | cut -d '/' -f 2 | rev
@JanKyuPeblik Please explain. What is the benefit of cat, and of having two processes with additional buffering via a pipeline, instead of just having one rev process that achieves the same result as the two?
@JanKyuPeblik Sorry, but it still is unclear. "Preserving a linear order" doesn't seem to be necessary here, especially since the answer suggests they are in fact reversing the lines first for processing and then reversing them again. cat datafile | rev has no visible benefit over rev datafile.
Awk is well suited for this:
The variable NF is a special awk variable that contains the number of fields in the current record.
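For example, to print the second-to-last /-separated field of each line (a sketch matching the question above; datafile is the input file):

awk -F'/' '{ print $(NF-1) }' datafile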
There's no need to use cut, rev, or any other tools external to bash here at all. Just read each line into an array, and pick out the piece you want:
while IFS=, read -r -a entries; do
  printf '%s\n' "${entries[${#entries[@]} - 2]}"
done
Doing this in pure bash is far faster than starting up a pipeline, at least for reasonably small inputs. For large inputs, the better tool is awk.
I wouldn't say no reason.. That's a lot of nasty syntax for a simple task and I'd take a few extra nanoseconds personally. Anyway, +1 for giving a robust bash solution.
Saying there's no reason to use an external tool when you can use bash constructs is like saying there's no reason to use a lawnmower when you can use scissors. A shell is a just an environment from which to call tools and manipulate files and processes along with some constructs to sequence all of that. Like with any other form of construction, when constructing software just use the right tool for each job.
@EdMorton That may be a nice sound bite, but it doesn't actually line up with the world as it is. bash is a fairly complete programming environment, and provides the tools necessary to do most common operations in-process. You wouldn't write Python code that calls external tools for operations Python has built in; why do so in bash?
@EdMorton ... to go a little deeper: this isn't your grandpa's Bourne shell. bash has proper arrays (of C strings), map/hash datatypes, and indirect variable references. 40 years ago, a shell might have been a tool that did nothing but set up pipelines, but now ain't then.
To paraphrase then - for inputs that'd literally take the blink of an eye to process in awk, you can do it in a slightly briefer blink of an eye using bash, but then be prepared for a severe performance hit as your data grows. So the bash solution is more cumbersome to write than awk and runs far slower than awk in situations where performance is actually something you'd care about (i.e. on large data sets). Best I can tell, then, there's no reason to write it in bash other than as an academic exercise to show people how to use the bash constructs.
A Perl solution similar to the awk solution from @iiSeymour:
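A sketch of such a one-liner (the file name is illustrative; -F'/' sets the split pattern, analogous to awk's -F, to match the question's / delimiter):

perl -F'/' -lane 'print $F[-2]' datafile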
These command-line options are used:
- -n loop around every line of the input file; do not automatically print every line
- -l removes newlines before processing, and adds them back in afterwards
- -a autosplit mode – split input lines into the @F array. Defaults to splitting on whitespace
- -e execute the perl code
The @F autosplit array starts at index [0] while awk fields start with $1
-1 is the last element
-2 is the second to last element
The most minimalist answer to this problem is to use my cuts utility:
$ cat file.txt
text,blah,blaah,foo
this,is,another,text,line
$ cuts -2 file.txt
blaah
text
cuts, which stands for "cut on steroids":
- automatically figures out the input field separators
- supports multi-char (and regexp) separators
- automatically pastes (side-by-side) multiple columns from multiple files
- supports negative offsets (from end of line)
- has good defaults to save typing + allows the user to override them
I wrote cuts after being frustrated with the many limitations of cut on Unix. It is designed to replace various cut/paste combos, slicing and dicing columns from multiple files, with multiple separator variations, while requiring minimal typing from the user.
You can get cuts (free software, Artistic Licence) from github: https://github.com/arielf/cuts/
Calling cuts without arguments will print a detailed Usage message.