How to count lines in a document? [closed]
09:16:39 AM all 2.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 94.00
09:16:40 AM all 5.00 0.00 0.00 4.00 0.00 0.00 0.00 0.00 91.00
09:16:41 AM all 0.00 0.00 4.00 0.00 0.00 0.00 0.00 0.00 96.00
09:16:42 AM all 3.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 96.00
09:16:43 AM all 0.00 0.00 1.00 0.00 1.00 0.00 0.00 0.00 98.00
09:16:44 AM all 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
09:16:45 AM all 2.00 0.00 6.00 0.00 0.00 0.00 0.00 0.00 92.00
28 Answers
This will output the number of lines in <filename>:
$ wc -l /dir/file.txt
3272485 /dir/file.txt
Or, to omit the <filename> from the result, use wc -l < <filename>:
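A minimal sketch of the redirected form, assuming a throwaway file created with mktemp (the file and its contents are illustrative only, not from the original answer):

```shell
# Create a throwaway three-line file (illustrative data only)
tmpfile=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmpfile"

# With a file argument, wc prints the count followed by the path
wc -l "$tmpfile"

# With redirection, wc reads standard input and has no name to print
count=$(wc -l < "$tmpfile")
echo "$count"

rm -f "$tmpfile"
```

Some wc implementations pad the number with leading spaces, which matters if the result is compared as a string rather than as an integer.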
You can also pipe data to wc:
$ cat /dir/file.txt | wc -l
3272485
$ curl yahoo.com --silent | wc -l
63
This is great! You might use awk to get rid of the file name appended to the line count, like so: wc -l | awk ‘
Beware that wc -l counts «newlines». If you have a file with 2 lines of text and one «newline» symbol between them, wc will output «1» instead of «2».
To filter and count only lines with pattern use:
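One way this could look, as a sketch: grep's -c option prints the number of matching lines instead of the lines themselves. The sample file and the pattern 'error' are assumptions made for illustration.

```shell
# Illustrative log data; the pattern 'error' is only an example
tmpfile=$(mktemp)
printf 'error: disk full\ninfo: ok\nERROR: timeout\n' > "$tmpfile"

# -c prints the number of matching lines instead of the lines themselves
matches=$(grep -c 'error' "$tmpfile")

# -i makes the match case-insensitive
matches_ci=$(grep -ci 'error' "$tmpfile")

echo "case-sensitive: $matches, case-insensitive: $matches_ci"
rm -f "$tmpfile"
```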
See the grep man page to take a look at the -e,-i and -x args.
Oddly sometimes the grep -c works better for me. Mainly due to wc -l annoying «feature» padding space prefix.
Additionally, when your last line does not end with an LF or CRLF, wc -l reports the wrong number of lines because it only counts line endings. So grep with a pattern like ^.*$ will actually give you the true line count.
wc -l does not count lines.
Yes, this answer may be a bit late to the party, but I haven’t found anyone document a more robust solution in the answers yet.
Contrary to popular belief, POSIX does not require files to end with a newline character at all. Yes, the definition of a POSIX 3.206 Line is as follows:
A sequence of zero or more non-<newline> characters plus a terminating <newline> character.
However, what many people are not aware of is that POSIX also defines POSIX 3.195 Incomplete Line as:
A sequence of one or more non-<newline> characters at the end of the file.
Hence, files without a trailing LF are perfectly POSIX-compliant.
If you choose not to support both EOF types, your program is not POSIX-compliant.
As an example, let’s have a look at the following file.
This is the first line.
This is the second line.
No matter the EOF, I’m sure you would agree that there are two lines. You figured that out by looking at how many lines have been started, not by looking at how many lines have been terminated. In other words, as per POSIX, these two files both have the same number of lines:
This is the first line.\n
This is the second line.\n

This is the first line.\n
This is the second line.
The man page is relatively clear about wc counting newlines, with a newline just being a 0x0a character:
NAME
       wc - print newline, word, and byte counts for each file
Hence, wc doesn’t even attempt to count what you might call a «line». Using wc to count lines can very well lead to miscounts, depending on the EOF of your input file.
POSIX-compliant solution
You can use grep to count lines just as in the example above. This solution is both more robust and precise, and it supports all the different flavors of what a line in your file could be:
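As a sketch of the difference, the two example files can be recreated with printf and counted both ways. The pattern '^' (match every line start) is one common choice; the temporary file names are assumptions made for this illustration.

```shell
# Recreate the two example files: one with and one without a trailing LF
with_lf=$(mktemp)
without_lf=$(mktemp)
printf 'This is the first line.\nThis is the second line.\n' > "$with_lf"
printf 'This is the first line.\nThis is the second line.' > "$without_lf"

# wc -l counts newline characters, so the incomplete last line is missed
wc_with=$(wc -l < "$with_lf")
wc_without=$(wc -l < "$without_lf")

# grep -c '^' counts every line that has a start, terminated or not
grep_with=$(grep -c '^' "$with_lf")
grep_without=$(grep -c '^' "$without_lf")

echo "wc:   $wc_with vs $wc_without"
echo "grep: $grep_with vs $grep_without"
rm -f "$with_lf" "$without_lf"
```

Here wc disagrees with itself across the two files, while grep reports the same count for both.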
This should be the accepted answer. Not only because it is correct but also because grep is more than twice as fast as wc .
Wow, this is a good answer. It needs to be the accepted answer because of good explanation and POSIX specs are clearly outlined.
This is the best answer because this is true, more precise than the other answers, and that darn set of spaces in the beginning of wc output doesn’t show up with grep . I’m using the number of lines in a file for mathematical processing in a program, and those spaces are a pain, especially because I can’t use cut since I don’t know how many digits are going to be in the number of lines, so I can’t always just cut out the number. This just outputs a number and nothing but a number. It should be the accepted answer 🙂
there are many ways. using wc is one.
Yes, but wc -l file gives you the number of lines AND the filename. To get just the line count you can do: wc -l < /filepath/filename.ext
I upvoted this solution because wc -l counts newline characters and not the actual lines in a file. All the other commands included in this answer will give you the right number in case you need the lines.
The tool wc is the «word counter» in UNIX and UNIX-like operating systems, but you can also use it to count lines in a file by adding the -l option.
wc -l foo will count the number of lines in foo . You can also pipe output from a program like this: ls -l | wc -l , which will tell you how many files are in the current directory (plus one).
ls -l | wc -l will actually give you the number of files in the directory +1 for the total size line. you can do ls -ld * | wc -l to get the correct number of files.
If you want the total line count of all the files in a directory, you can use find and wc:
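A minimal sketch of that combination, assuming a throwaway directory tree built with mktemp -d (directory layout and file names are illustrative only):

```shell
# Build a small illustrative directory tree
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one.txt"
mkdir "$dir/sub"
printf 'c\nd\ne\n' > "$dir/sub/two.txt"

# Per-file counts; wc adds a "total" line when given more than one file
find "$dir" -type f -exec wc -l {} +

# Grand total only: concatenate every file and count once
total=$(find "$dir" -type f -exec cat {} + | wc -l)
echo "$total"

rm -rf "$dir"
```

The second form avoids parsing the "total" line, which can appear more than once if find has to split a very long file list across several wc invocations.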
If all you want is the number of lines (and not the number of lines and the stupid file name coming back):
As previously mentioned these also work (but are inferior for other reasons):
awk 'END{print NR}' file # not on all unixes
sed -n '$=' file # (GNU sed) also not on all unixes
grep -c ".*" file # overkill and probably also slower
This answer was posted 3 years after the question was asked and it is just copying other ones. The first part is the trivial one and the second is all that ghostdog’s answer was adding. Downvoting.
No, you are wrong; ghostdog’s answer does not answer the original question. It gives you the number of lines AND the filename. To get just the line count you can do: wc -l < /filepath/filename.ext. Which is why I posted the answer. awk, sed and grep are all slightly inferior ways of doing this. The proper way is the one I listed.
Write each FILE to standard output, with line numbers added. With no FILE, or when FILE is -, read standard input.
This is the first answer I have found that works with a file that has a single line of text that does not end in a newline, which wc -l reports as 0. Thank you.
I prefer it over the accepted answer because it does not print the filename, and you don’t have to use awk to fix that. Accepted answer:
But I think the best one is GGB667’s answer:
I will probably be using that from now on. It’s slightly shorter than my way. I am putting up my old way of doing it in case anyone prefers it. The output is the same with those two methods.
the first and last method are the same. the last one is better because it doesn’t spawn an extra process
The methods above are preferred, but the «cat» command can also be helpful:
It will show you the whole content of the file with line numbers.
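A sketch of how this might be used, assuming the -n option of cat (widely available on GNU and BSD systems, though not required by POSIX) and an illustrative file whose last line has no trailing newline:

```shell
# Illustrative file; the last line deliberately lacks a trailing newline
tmpfile=$(mktemp)
printf 'one\ntwo\nthree' > "$tmpfile"

# Print every line prefixed with its number
cat -n "$tmpfile"
echo

# The number on the last line is the line count
last=$(cat -n "$tmpfile" | tail -n 1 | awk '{ print $1 }')
echo "$last"

rm -f "$tmpfile"
```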
wc -l file_name
It will give you the total number of lines in that file.
To get the last line, use tail -n 1 file_name
I saw this question while I was looking for a way to count the lines of multiple files, so if you want to count the lines of several .txt files you can do this:
it will also run on one .txt file 😉
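One possible form, sketched with a glob and cat; the directory and file names are assumptions made for illustration:

```shell
# Two illustrative .txt files in a throwaway directory
dir=$(mktemp -d)
printf '1\n2\n' > "$dir/a.txt"
printf '3\n' > "$dir/b.txt"

# Concatenate every matching file and count the combined lines
total=$(cat "$dir"/*.txt | wc -l)
echo "$total"

rm -rf "$dir"
```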
This will give you number of lines and filename in output.
wc -l 24-11-2019-04-33-01-url_creator.log
to get only number of lines in output.
wc -l 24-11-2019-04-33-01-url_creator.log | cut -d' ' -f1
No, wc -l < filename is different to wc -l filename , the first uses redirection and then there isn't any filename in the output, like shown in the answer from user85509
cat file.log | wc -l | grep -oE '[0-9]+'
To count the number of lines and store the result in a variable, use this command:
I tried wc -l to get the number of lines from the file name.
To do more filtering, for example to count the number of commented lines in the file, use grep '#' Filename.txt | wc -l
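A minimal sketch using command substitution; the variable names line_count and half are hypothetical, and the file is a throwaway created for the example:

```shell
# Illustrative four-line file
tmpfile=$(mktemp)
printf 'a\nb\nc\nd\n' > "$tmpfile"

# Command substitution captures the output of wc in a variable
line_count=$(wc -l < "$tmpfile")

# The value can then be used in shell arithmetic
half=$((line_count / 2))
echo "lines: $line_count, half: $half"

rm -f "$tmpfile"
```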
echo "No of lines in the file $FILENAME"
wc -l < "$FILENAME"
echo total number of commented lines
echo "$FILENAME"
grep '#' "$FILENAME" | wc -l
Just in case: it's also possible to do this with many files in conjunction with the find command.
find . -name '*.java' | xargs wc -l
Don't use xargs . The find command has an -exec verb that is much simpler to use. Someone already suggested its use 6 years ago, although this question does not ask anything about multiple files. stackoverflow.com/a/28016686
Returns only the number of lines
Redirection/Piping the output of the file to wc -l should suffice, like the following:
which then would provide the no. of lines only.
Or count all lines in subdirectories with a file name pattern (e.g. logfiles with timestamps in the file name):
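A sketch of one way to do this with find's -name test; the '*.log' pattern, the directory layout, and the file names are all illustrative assumptions:

```shell
# Throwaway tree with timestamped log files in subdirectories
dir=$(mktemp -d)
mkdir -p "$dir/app" "$dir/db"
printf 'x\ny\n' > "$dir/app/2019-11-24-app.log"
printf 'z\n' > "$dir/db/2019-11-25-db.log"
printf 'ignored\n' > "$dir/db/notes.txt"

# Only files whose name matches the pattern contribute to the total
log_total=$(find "$dir" -type f -name '*.log' -exec cat {} + | wc -l)
echo "$log_total"

rm -rf "$dir"
```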
This drop-in portable shell function [ℹ] works like a charm. Just add the following snippet to your .bashrc file (or the equivalent for your shell environment).
# ---------------------------------------------
# Count lines in a file
#
# @1 = path to file
#
# EXAMPLE USAGE: `count_file_lines $HISTFILE`
# ---------------------------------------------
count_file_lines() {
    # Read via redirection so wc prints only the count, not the path
    subj=$(wc -l < "$1")
    # Unquoted on purpose: word splitting drops any padding wc adds
    echo $subj
}
This should be fully compatible with all POSIX-compliant shells in addition to bash and zsh.
Awk saves time (and lines too):
If you want to make sure you are not counting empty lines, you can do:
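Both counts could be obtained along these lines; this is a sketch, with an illustrative file that mixes text and empty lines. NR and NF are standard awk variables; the counter name n is arbitrary.

```shell
# Illustrative file: three text lines and three empty ones
tmpfile=$(mktemp)
printf 'alpha\n\nbeta\n\n\ngamma\n' > "$tmpfile"

# NR holds the number of records (lines) read by the time END runs
all_lines=$(awk 'END { print NR }' "$tmpfile")

# NF is 0 for empty or whitespace-only lines, so only non-blank lines are counted
non_empty=$(awk 'NF { n++ } END { print n + 0 }' "$tmpfile")

echo "all: $all_lines, non-empty: $non_empty"
rm -f "$tmpfile"
```

The `n + 0` forces a numeric 0 instead of an empty string when no line matches.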
@Eric Ah cool, cool. I was going to suggest you post that answer, but it looks like someone else already did so. Anyways, when I posted this answer, I just discovered awk , and this was one of the many things I discovered it could do. I also just tested with a 1GB file, and awk was only 4x slower, not 16x. I created the test file using base64 /dev/urandom | head -c 1000000000 , but with smaller files (which is most likely what these answers will be used for), the speed is hardly variable
Yeah, I also get a ratio of 4 with this sort of file. So depending on the file, your mileage may vary. The point is that it's always in favor of grep .
I know this is old but still: Count filtered lines
Number of files sent
Company 1 file: foo.pdf OK
Company 1 file: foo.csv OK
Company 1 file: foo.msg OK
Company 2 file: foo.pdf OK
Company 2 file: foo.csv OK
Company 2 file: foo.msg Error
Company 3 file: foo.pdf OK
Company 3 file: foo.csv OK
Company 3 file: foo.msg Error
Company 4 file: foo.pdf OK
Company 4 file: foo.csv OK
Company 4 file: foo.msg Error
If I want to know how many files are sent OK:
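A sketch of the filtered count using the sample report above; the temporary file and the anchored pattern 'OK$' are choices made for this illustration:

```shell
# Recreate the sample report in a throwaway file
report=$(mktemp)
cat > "$report" <<'EOF'
Number of files sent
Company 1 file: foo.pdf OK
Company 1 file: foo.csv OK
Company 1 file: foo.msg OK
Company 2 file: foo.pdf OK
Company 2 file: foo.csv OK
Company 2 file: foo.msg Error
Company 3 file: foo.pdf OK
Company 3 file: foo.csv OK
Company 3 file: foo.msg Error
Company 4 file: foo.pdf OK
Company 4 file: foo.csv OK
Company 4 file: foo.msg Error
EOF

# Count only the lines that end in OK
ok_count=$(grep -c 'OK$' "$report")
echo "$ok_count"

rm -f "$report"
```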
As others said wc -l is the best solution, but for future reference you can use Perl:
$. contains line number and END block will execute at the end of script.
Does not work: dir | perl -lne 'END { print $. }' Can't find string terminator "'" anywhere before EOF at -e line 1.'
@VeikkoW Works for me. If you are on Windows, different quoting rules apply; but the OP asked about Linux / Bash.
I just made a program to do this ( with node )
npm install gimme-lines
gimme-lines verbose --exclude=node_modules,public,vendor --exclude_extensions=html
If you're on some sort of BSD-based system like macOS, I'd recommend the GNU version of wc. It doesn't trip up on certain binary files the way BSD wc does. At least it's still somewhat usable performance. On the other hand, BSD tail is slow as . zzzzzzzzzz.
As for AWK, only a minor caveat though: since it operates under the default assumption of lines, meaning \n , if your file just happens not to have a trailing newline delimiter, AWK will over count it by 1 compared to either BSD or GNU wc. Also, if you're piping in things with no newlines at all, such as echo -n , depending on whether you're measuring in the END { } section or at FNR==1 , the NR will be different.