- split long line on a delimiter
- 6 Answers
- Edit
- A pure Bash solution that works with ‘:’ at the end.
- Example
- Linux split command
- Description
- Syntax
- Options
- Examples
- Related commands
- Linux split Command with Examples
- Linux split Command Syntax
- Linux split Command Options
- Linux split Command Examples
- Split Files
- Use the Verbose Option
- Set Number of Lines per File
- Choose File Size
- Specify Maximum Size
- Set Number of Output Files
- Split a File at the End of a Line
- Show Only a Specified Output File
- Set Suffix Length
- Change Suffix
- Change Prefix
- Omit Files with Zero Size
- Reconnect Split Files
split long line on a delimiter
I’m trying to figure out the cut command but it seems to only work with fixed amounts of input, like "first 1000 characters" or "first 7 fields". I need to work with arbitrarily long input.
6 Answers
You can also do this in pure bash:
while IFS=: read -ra line; do printf '%s\n' "${line[@]}"; done
Note that using \n in the replacement string like that will work in GNU sed, but will fail in most other sed implementations.
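For comparison, here is a minimal sketch of two portable ways to turn each colon into a newline. The sample string is made up for illustration. The sed form puts a backslash followed by a literal newline in the replacement, which POSIX sed accepts, instead of \n.

```shell
# Hypothetical sample input; any colon-delimited line works.
line='foo:bar:baz:quux'

# tr translates every ':' into a newline and is portable.
with_tr=$(printf '%s\n' "$line" | tr ':' '\n')

# Portable sed: backslash + literal newline in the replacement, not \n.
with_sed=$(printf '%s\n' "$line" | sed 's/:/\
/g')

printf '%s\n' "$with_tr"
```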
$ line=foo:bar:baz:quux
$ words=$(IFS=:; set -- $line; printf "%s\n" "$@")
$ echo "$words"
foo
bar
baz
quux
If your grep supports -o you can do it like this:
Or with awk, setting the record separator to a colon:
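A minimal sketch of both approaches, assuming a grep that supports -o (GNU and BSD grep do) and the same made-up sample string used in the other answers:

```shell
line='foo:bar:baz:quux'

# grep -o prints each match on its own line; a match here is a run of non-colon characters.
with_grep=$(printf '%s\n' "$line" | grep -oE '[^:]+')

# awk with RS set to ":" treats each colon-separated piece as one record.
with_awk=$(printf '%s\n' "$line" | awk -v RS=: '{ print }')

printf '%s\n' "$with_grep"
```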
cut -d: --output-delimiter=$'\n' -f1-
Edit
As noted by Chris below, this leaves a trailing newline. That can be avoided if your awk supports specifying RS as a regular expression (tested with GNU awk):
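A minimal sketch of that variant, assuming an awk that accepts a regular expression for RS (GNU awk and mawk do; POSIX only guarantees a single-character RS). Treating both the colon and the newline as separators keeps the final newline out of the last record:

```shell
# RS='[:\n]' splits records on either ':' or a newline (regex RS is an extension).
count=$(printf 'foo:bar:baz:quux\n' | awk -v RS='[:\n]' '{ print }' | wc -l)

# One output line per record, with no trailing blank line.
echo "$count"
```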
don’t know why ppl hate on xargs
$ xargs --version | head -1
xargs (GNU findutils) 4.7.0
...
$ printf "foo:bar:baz:quux" | xargs -d: -n1
foo
bar
baz
quux
A pure Bash solution that works with ‘:’ at the end.
## Split string, store in array:
IFS=: read -d '' -ra arr <<< "$line:"
unset "arr[${#arr[@]} - 1]"  # pop last element
Example
line=foo:bar:
## wrong: IFS=: read -ra arr <<< "$line"
IFS=: read -d '' -ra arr <<< "$line:"
unset "arr[${#arr[@]} - 1]"  # pop last element
declare -p arr
# output: ... '([0]="foo" [1]="bar" [2]="")'
## output as records
for j in "${arr[@]}"; do echo "$j"; done
# output is "foo\nbar\n\n"
I had problems with the solutions above on some strings, but this worked for me:
echo $string | sed 's/\\n/ /g' | tr " " \\n
Site design / logo © 2023 Stack Exchange Inc; user contributions licensed under CC BY-SA . rev 2023.7.12.43529
Linux split command
On Unix-like operating systems, the split command splits a file into pieces.
This page covers the GNU/Linux version of split.
Description
split outputs fixed-size pieces of input INPUT to files named PREFIXaa, PREFIXab, ...
The default size for each split file is 1000 lines, and the default PREFIX is "x". With no INPUT, or when INPUT is a dash ("-"), split reads from standard input.
Syntax
split [OPTION]... [INPUT [PREFIX]]
Options
Option | Description |
---|---|
-a N, --suffix-length=N | Use suffixes of length N (default 2). |
-b SIZE, --bytes=SIZE | Write SIZE bytes per output file. |
-C SIZE, --line-bytes=SIZE | Write at most SIZE bytes of lines per output file. |
-d, --numeric-suffixes | Use numeric suffixes instead of alphabetic. |
-e, --elide-empty-files | Do not generate empty output files with "-n". |
--filter=COMMAND | Write to shell command COMMAND; file name is $FILE. |
-l NUMBER, --lines=NUMBER | Put NUMBER lines per output file. |
-n CHUNKS, --number=CHUNKS | Generate CHUNKS output files. (See below.) |
-u, --unbuffered | Immediately copy input to output with "-n r/...". |
--verbose | Print a verbose diagnostic before each output file is opened. |
--help | Display a help message and exit. |
--version | Output version information and exit. |
SIZE may be one of the following, or an integer optionally followed by one of the following multipliers:
suffix | multiplier |
---|---|
KB | 1000 |
K | 1024 |
MB | 1000 x 1000 |
M | 1024 x 1024 |
CHUNKS may be:
- N: split into N files based on size of input
- K/N: output Kth of N to standard output
- l/N: split into N files without splitting lines
- l/K/N: output Kth of N to standard output without splitting lines
- r/N: like "l" but use round robin distribution
- r/K/N: likewise but only output Kth of N to standard output
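As an illustration of the round robin form, here is a minimal sketch assuming GNU split. The six-line input and the rr_ prefix are made up for the example.

```shell
# Work in a scratch directory so the output files are easy to inspect.
cd "$(mktemp -d)" || exit 1

# Six input lines, dealt out to two files in turn.
seq 1 6 | split -n r/2 - rr_

# rr_aa receives lines 1, 3, 5 and rr_ab receives lines 2, 4, 6.
first=$(tr '\n' ' ' < rr_aa)
second=$(tr '\n' ' ' < rr_ab)
echo "rr_aa: $first"
echo "rr_ab: $second"
```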
Examples
split -b 22 newfile.txt new
Split the file newfile.txt into pieces of 22 bytes each, named newaa, newab, newac, and so on. A file between 45 and 66 bytes long yields three pieces, the last of which may be shorter than 22 bytes.
split -l 300 newfile.txt new
Split the file newfile.txt into files beginning with the name new, each containing 300 lines of text.
Related commands
csplit — Split files based on a defined context.
Linux split Command with Examples
The Linux split command breaks files into smaller parts and is typically used to make large, many-line text files easier to analyze. Each output file holds 1000 lines by default, but the size is configurable.
In this guide, learn how to use the Linux split command with examples.
To follow along, you need:
- Access to a terminal.
- A large text file (this tutorial uses large_text, small_text, and tiny_text files).
Linux split Command Syntax
The basic split syntax is:
split [options] [file] [prefix]
If no file is given, or the file is -, split reads from standard input. Stating the prefix is optional, and it can only follow a file name (or -). If no prefix is specified, split defaults to using x as the prefix, naming created files as follows: xaa, xab, xac, etc.
Linux split Command Options
The split command supports many options. The most common split command options are:
Option | Description |
---|---|
-a | Set suffix length. |
-b | Determines size per output file. |
-C | Determines the maximum size per output file. |
-d | Changes default suffixes to numeric values. |
-e | Omits creating empty output files. |
-l | Creates files with a specific number of output lines. |
-n | Generates a specific number of output files. |
--verbose | Displays a detailed output. |
Linux split Command Examples
The split command enables users to divide and work with large files in Linux. The command is often used in practice, and 13 common use cases are explained below.
Split Files
The basic usage of split is to divide big files into smaller 1000-line chunks. For instance, split the large_text file and verify the output with ls:
The ls command shows 13 new files, ranging from xaa to xam. Check the line count for each file using wc with the -l flag:
The target file, large_text, is 13000 lines long. The split command makes 13 files containing 1000 lines each. If the target file’s line count is not divisible by 1000, split counts 1000 lines per file except for the last one. The last file has fewer lines.
For instance, a file smaller_text in smaller_directory has 12934 lines:
Split smaller_text and use ls to confirm the outcome:
Once the target file is split, run wc -l again:
The output shows that the last file has 934 lines, as opposed to the other 12, which have 1000 lines each.
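The default behavior described above can be reproduced with a minimal sketch like the following. The file here is generated with seq as a stand-in for the tutorial's smaller_text, so only the line count matches, not the content.

```shell
cd "$(mktemp -d)" || exit 1

# Stand-in for smaller_text: 12934 numbered lines.
seq 1 12934 > smaller_text

# No options: 1000 lines per file, default prefix x.
split smaller_text

file_count=$(ls x* | wc -l)
first_lines=$(wc -l < xaa)
last_lines=$(wc -l < xam)
echo "files: $file_count, lines in xaa: $first_lines, lines in xam: $last_lines"
```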
Use the Verbose Option
By default, split prints nothing while it works. Run it with --verbose to see each output file as it is created:
split large_text --verbose
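A minimal sketch of the same command on generated data, assuming GNU split. The 13000-line file is a stand-in for large_text, and the exact wording of each report line may vary between versions.

```shell
cd "$(mktemp -d)" || exit 1

# Stand-in for the tutorial's large_text: 13000 numbered lines.
seq 1 13000 > large_text

# --verbose reports each output file as it is created, one line per file.
report=$(split --verbose large_text)
printf '%s\n' "$report" | head -n 3
```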
Set Number of Lines per File
To bypass the default 1000-line rule, use the -l flag with split. The split -l command enables users to set the number of lines per file.
For instance, run split -l2500 to create files containing 2500 lines each and check the line count with wc:
The command creates six new files. Files xaa to xae have 2500 lines each, while file xaf has 500 lines, totaling 13000.
The split -l command can also make files with fewer lines than 1000. For example, the tiny_text file has 2693 lines:
Split the text into 500-line files with:
The command creates five 500-line files and one 193-line file.
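Both -l runs can be sketched as follows. The input files are generated with seq as stand-ins with the same line counts, and the big_ and small_ prefixes are arbitrary choices that keep the two sets of output apart.

```shell
cd "$(mktemp -d)" || exit 1
seq 1 13000 > large_text   # stand-in for large_text
seq 1 2693 > tiny_text     # stand-in for tiny_text

# 2500 lines per file.
split -l 2500 large_text big_
# 500 lines per file.
split -l 500 tiny_text small_

wc -l big_* small_*
big_last=$(wc -l < big_af)
small_last=$(wc -l < small_af)
```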
Choose File Size
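Split files based on their size with split -b. The size is a number (n), optionally followed by a multiplier such as K or M: split -bn creates files of n bytes, split -bnK files of n x 1024 bytes, and split -bnM files of n x 1024 x 1024 bytes.
For instance, create 1500 KiB files from large_text with: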
split -b1500K large_text --verbose
The --verbose option shows that split -b1500K created six files. To check the file sizes, use wc -c:
The output shows that five files are 1,536,000 bytes each, and the sixth is 56,957 bytes long.
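The same size arithmetic can be checked on a smaller scale. This is a minimal sketch with a made-up 100000-byte file, not the tutorial's large_text:

```shell
cd "$(mktemp -d)" || exit 1

# 100000 bytes of filler as a small stand-in for large_text.
head -c 100000 /dev/zero > sample

# 30K means 30 * 1024 = 30720 bytes per output file.
split -b 30K sample

wc -c x*
first_size=$(wc -c < xaa)
last_size=$(wc -c < xad)
```

Three full pieces account for 92160 bytes, so the fourth file holds the remaining 7840.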
Specify Maximum Size
Use -C to set a maximum size per output file. For instance, split large_text and set the output size to 2MB with:
The wc -c command shows that split created four new files and that the first three are roughly 2 MB, while the fourth one is smaller.
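A minimal sketch of -C on generated data, assuming a small file of short numbered lines rather than the tutorial's large_text. It shows the two properties that matter: no piece exceeds the limit, and lines are never cut.

```shell
cd "$(mktemp -d)" || exit 1
seq 1 1000 > numbers   # a few kilobytes of short lines

# At most 1000 bytes per file, but never cut a line in half.
split -C 1000 numbers

# Find the largest piece to confirm the limit was respected.
largest=0
for f in x*; do
    size=$(wc -c < "$f")
    if [ "$size" -gt "$largest" ]; then
        largest=$size
    fi
done
echo "largest piece: $largest bytes"
```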
Set Number of Output Files
Use -n with split to determine the number of output files. For example, split large_text into ten parts with:
Split a File at the End of a Line
Another use of -n is splitting a file only at line boundaries, so that every output file ends with a complete line. To do this, prefix the number of files with l/. For instance, split large_text into ten files that each end with a complete line with:
The ls command shows ten newly created files. Run cat on any file to verify the file ends on a complete line:
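The difference between the plain and the line-aware forms of -n can be sketched as follows, assuming GNU split. The input is a generated stand-in for large_text, and the bytes_ and lines_ prefixes are arbitrary.

```shell
cd "$(mktemp -d)" || exit 1
seq 1 13000 > large_text   # stand-in for large_text

# Ten pieces of near-equal byte size; a cut may land in the middle of a line.
split -n 10 large_text bytes_
# Ten pieces that each end on a line boundary.
split -n l/10 large_text lines_

# Count the l/10 pieces whose final byte is a newline.
complete=0
for f in lines_*; do
    if [ "$(tail -c 1 "$f" | wc -l)" -eq 1 ]; then
        complete=$((complete + 1))
    fi
done
echo "$complete of $(ls lines_* | wc -l) pieces end with a complete line"
```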
Show Only a Specified Output File
The split command, by default, creates as many files as necessary to cover the entire source file. However, when -n is given a value of the form K/N, split divides the input into N parts but outputs only part K. In this mode it creates no output files and prints the selected part to the terminal instead.
For instance, split tiny_text into 100 parts but only display the first one with:
The command prints the first split file to the standard output without creating any new files.
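A minimal sketch of the K/N form, assuming GNU split and a made-up 16-byte file whose four quarters are easy to tell apart:

```shell
cd "$(mktemp -d)" || exit 1

# Sixteen bytes in four recognizable groups.
printf 'aaaabbbbccccdddd' > sample

# K/N form: print only the 2nd of 4 equal-sized pieces.
second=$(split -n 2/4 sample)
echo "$second"

# Only the input file is listed; split wrote nothing to disk.
ls
```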
Set Suffix Length
The split command creates files with a default suffix of two letters. Change the length by adding the -a flag to split. For instance, to make the suffix three characters long, type:
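A minimal sketch on a generated 13000-line stand-in for large_text:

```shell
cd "$(mktemp -d)" || exit 1
seq 1 13000 > large_text   # stand-in for large_text

# Three-letter suffixes: xaaa, xaab, ... instead of xaa, xab, ...
split -a 3 large_text
ls
```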
Change Suffix
Use split to create files with different suffixes. For instance, split large_text into 2500-line files with numeric suffixes:
The output shows six files with numbered suffixes created with the -d flag. The -l2500 flag splits the large_text file into six 2500-line files.
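The numeric-suffix run can be sketched as follows, assuming GNU split (the -d flag is not part of POSIX split) and a generated stand-in for large_text:

```shell
cd "$(mktemp -d)" || exit 1
seq 1 13000 > large_text   # stand-in for large_text

# -d swaps the letter suffixes for numbers: x00, x01, ...
split -l 2500 -d large_text
ls x*
```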
Change Prefix
The split command also creates output files with customizable prefixes. Pass the prefix as the last argument, after the file name:
split [options] [file] [prefix]
For instance, split large_text into ten files called part00 to part09 with:
split -d large_text part -n 10
The prefix changes from x to part and ends with numbers due to the -d flag. The -n flag splits the file into ten parts.
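A minimal sketch of the same run on a generated stand-in for large_text, with the options placed before the operands, which is the more conventional order:

```shell
cd "$(mktemp -d)" || exit 1
seq 1 13000 > large_text   # stand-in for large_text

# Prefix "part" plus numeric suffixes gives part00 ... part09.
split -d -n 10 large_text part
ls part*
```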
Omit Files with Zero Size
Some split operations produce zero-size output files, for example when -n requests more pieces than the input can fill. To prevent zero-size output files, use split with the -e flag. For instance, split the xaa file from the tiny_directory into 15 files, with a numeric suffix, ensuring zero-size files are omitted:
Check file size with wc -c:
Using the x0* and x1* as search terms ensures wc -c prints the size of all files in the directory starting with numbers.
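The effect of -e is easiest to see on an input that cannot fill every piece. This is a minimal sketch assuming GNU split, with a made-up one-line file and arbitrary all_ and kept_ prefixes; it uses the l/N form so that the single line cannot be spread over several files.

```shell
cd "$(mktemp -d)" || exit 1

# One line cannot be spread over three line-based pieces.
printf 'a single line\n' > sample

# Without -e, split still creates all three files, some of them empty.
split -n l/3 sample all_
# With -e, the empty ones are never created.
split -n l/3 -e sample kept_

all_count=$(ls all_* | wc -l)
kept_count=$(ls kept_* | wc -l)
echo "without -e: $all_count files, with -e: $kept_count files"
```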
Reconnect Split Files
While split cannot rejoin files, there is an alternative option — the Linux cat command. Used to display the content of different files, cat also reconnects divided files into a new complete document.
For instance, large_text is split into ten files:
All the output files start with x. Apply cat to any items starting with x to merge them.
However, cat prints the result to the standard output. To merge the files into a new file, use > with the new file name:
Running wc -c shows that large_text and new_large_text are the same size.
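The round trip can be sketched as follows on a generated stand-in for large_text. Comparing with cmp is a slightly stronger check than comparing sizes, since it verifies every byte.

```shell
cd "$(mktemp -d)" || exit 1
seq 1 13000 > large_text   # stand-in for large_text

split -n 10 large_text

# The shell expands x* in sorted order (xaa, xab, ...), which is the original order.
cat x* > new_large_text

# cmp is silent and exits 0 when the two files are byte-for-byte identical.
cmp large_text new_large_text && echo "files match"
```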
After reading this article, you know how to use the Linux split command to work with large documents.