Source lines of code linux

Counting lines of code?

If I want to count the lines of code, the trivial thing is cat *.c *.h | wc -l. But what if I have several subdirectories?

Note that some (possibly even most) modern shells (Bash v4, Zsh, probably more) provide a recursive-globbing mechanism using ** , so you could have used wc -l **/*. or something similar. Note that in Bash, at least, this option (called globstar ) is off by default. But also note that in this particular case, cloc or SLOCCount is a much better option. (Also, ack may be preferable to find for easily finding/listing source files.)

wc -l counts lines, not lines of code. 7000 blank lines will still show up in wc -l but wouldn’t count in a code metric. (comments too usually don’t count)


The easiest way is to use the tool called cloc . Use it this way:

-1 because this program doesn’t have any way to recognise lines of code in languages outside of its little, boring brain. It knows about Ada and Pascal and C and C++ and Java and JavaScript and «enterprise» type languages, but it refuses to count the SLOC by just file extension, and is thus completely useless for DSLs, or even languages it just happens to not know about.

Well, the programming language which CLOC refuses to acknowledge does indeed fulfill all my past and future demands 🙂

@cat according to the CLOC documentation it can read in a language definition file, so there is a way to get it to recognize code in languages it hasn’t defined. Plus it’s open source, so you can always extend it to make it better!
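A rough sketch of the language-definition route mentioned in the comment above, assuming cloc's --write-lang-def and --read-lang-def options. The helper name count_with_lang_defs and both of its arguments are hypothetical, and you would still need to edit the definitions file by hand to add your language:

```shell
# count_with_lang_defs DIR DEFS (hypothetical helper, not part of cloc)
#   DIR  - directory to count
#   DEFS - path of a language definition file (created if it does not exist)
count_with_lang_defs() {
    dir=$1
    defs=$2
    # Dump cloc's built-in definitions to DEFS once, as a starting point to edit.
    [ -f "$defs" ] || cloc --write-lang-def="$defs"
    # Count DIR using the (possibly edited) definitions instead of the built-ins.
    cloc --read-lang-def="$defs" "$dir"
}
```

After the first run, add an entry for your own language to the definitions file and call the helper again, for example count_with_lang_defs . my_langs.txt.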

You should probably use SLOCCount or cloc for this; they’re designed specifically for counting lines of source code in a project, regardless of directory structure etc. Running either cloc . or sloccount . will produce a report on all the source code starting from the current directory.

If you want to use find and wc, GNU wc has a nice --files0-from option:

find . -name '*.[ch]' -print0 | wc --files0-from=- -l 

+1 for sloccount. Interestingly, running sloccount /tmp/stackexchange (created again on May 17 after my most recent reboot) says that the estimated cost to develop the sh, perl, awk, etc files it found is $11,029. and that doesn’t include the one-liners that never made it into a script file.

Estimating cost based on lines of code? What about all the people employed to re-factor spaghetti into something maintainable?

@OrangeDog you could always try to account for that in the overhead; see the documentation for an explanation of the calculation (with very old salary data) and the parameters you can tweak.


@StephenKitt> still, the main issue is it’s counting backwards. When cleaning up code, you often end up with fewer lines. Sure you could try to handwave an overhead to incur on the rest of the code to account for the removed one, but I don’t see how it’s better than just guessing the whole price in the first place.

As the wc command can take multiple arguments, you can just pass all the filenames to wc using the + argument of the -exec action of GNU find :

find . -type f -name '*.[ch]' -exec wc -l {} +

Alternately, in bash , using the shell option globstar to traverse the directories recursively:
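A minimal sketch of the globstar approach, assuming Bash 4 or later. The helper name count_lines_globstar is made up for illustration, and nullglob is an extra option added here so that a directory without matches yields 0 instead of an error:

```shell
#!/usr/bin/env bash
# globstar makes ** match any depth of directories (off by default in Bash);
# nullglob makes an unmatched pattern expand to nothing instead of itself.
shopt -s globstar nullglob

# count_lines_globstar DIR (hypothetical helper): print the total number of
# lines in all *.c and *.h files under DIR. The body runs in a subshell, so
# the cd does not affect the caller.
count_lines_globstar() (
    cd "$1" || exit 1
    files=( **/*.[ch] )
    if (( ${#files[@]} == 0 )); then
        echo 0
    else
        cat -- "${files[@]}" | wc -l
    fi
)
```

At an interactive prompt the same idea is just shopt -s globstar followed by wc -l **/*.[ch], which also prints a per-file breakdown. Note that the pattern would also match a directory whose name ends in .c or .h.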

Other shells traverse recursively by default (e.g. zsh) or have an option similar to globstar (most of them, at least).

If you are in an environment where you don’t have access to cloc etc I’d suggest

find -name '*.[ch]' -type f -exec cat '{}' + | grep -c '[^[:space:]]'

Run-through: find searches recursively for all the regular files whose name ends in either .c or .h and runs cat on them. The output is piped through grep to count all the non-blank lines (the ones that contain at least one non-spacing character).

You can use find together with xargs and wc :

find . -type f \( -name '*.h' -o -name '*.c' \) | xargs wc -l

(That assumes file paths don’t contain blanks, newlines, single quote, double quote or backslash characters, though. It may also output several total lines if wc is invoked several times. The parentheses are needed so that -type f applies to both name patterns.)

Perhaps the several wc commands problem can be addressed by piping find to while read FILENAME; do . . .done structure. And inside the while loop use wc -l . The rest is summing up the total lines into a variable and displaying it.
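The loop idea from the comment above might look like the following sketch. The function name count_lines_loop is hypothetical, and the code assumes no file name contains a newline. The running total has to be printed inside the same braces as the loop, because the right-hand side of a pipe runs in a subshell and its variables are gone afterwards:

```shell
# count_lines_loop DIR (hypothetical helper): sum the line counts of all
# *.c and *.h files under DIR, one wc call per file.
# Assumption: no file name contains a newline character.
count_lines_loop() {
    find "$1" -type f -name '*.[ch]' | {
        total=0
        while IFS= read -r file; do
            # Redirecting the file makes wc print only the number.
            lines=$(wc -l < "$file")
            total=$((total + lines))
        done
        echo "$total"
    }
}
```

This starts one wc process per file, so it is slower than the batching variants on large trees, but it prints exactly one total.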

As has been pointed out in the comments, cat file | wc -l is not equivalent to wc -l file because the former prints only a number whereas the latter prints a number and the filename. Likewise cat * | wc -l will print just a number, whereas wc -l * will print a line of information for each file.

In the spirit of simplicity, let’s revisit the question actually asked:

if I want to count the lines of code, the trivial thing is

cat *.c *.h | wc -l

But what if I have several subdirectories?

Firstly, you can simplify even your trivial command to:

cat *.[ch] | wc -l

And finally, the many-subdirectory equivalent is:

find . -name '*.[ch]' -exec cat {} + | wc -l

This could perhaps be improved in many ways, such as restricting the matched files to regular files only (not directories) by adding -type f, but the given find command is the exact recursive equivalent of cat *.[ch].

find . -name '*.[ch]' -exec wc -l {} \; | awk '{ SUM += $1 } END { print "Total number of lines: " SUM }'

@Hastur: It runs wc -l for groups of files, rather like xargs does, but it handles odd-ball characters (like spaces) in file names without needing either xargs or the (non-standard) -print0 and -0 options to find and xargs respectively. It’s a minor optimization. The downside would be that each invocation of wc would output a total line count at the end when given multiple files, so the awk script would have to deal with that. So, it’s not a slam-dunk, but very often, using + in place of \; with find is a good idea.


@JonathanLeffler Thank you. I agree. My concern, however, was the length of the parameter string passed to wc. If the number of files that will be found is not known in advance, is there a risk of exceeding that limit, or does find handle it somehow?

@Hastur: find groups the files into convenient size bundles, which won’t exceed the length limit for the argument list on the platform, allowing for the environment (which comes out of the argument list length — so the length of the argument list plus the length of the environment has to be less than a maximum value). IOW, find does the job right, like xargs does the job right.
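One way to handle the caveat raised in this thread (wc printing a total line for every batch when find uses +) is to have awk skip those lines. This is only a sketch: the function name count_lines_batched is hypothetical, and LC_ALL=C is set on the assumption that a localized wc might otherwise translate the word "total":

```shell
# count_lines_batched DIR (hypothetical helper): let find batch the files
# with +, then sum the per-file counts and ignore wc's per-batch summary.
count_lines_batched() {
    LC_ALL=C find "$1" -type f -name '*.[ch]' -exec wc -l {} + |
        awk '!(NF == 2 && $2 == "total") { sum += $1 } END { print sum + 0 }'
}
```

A real file can never be mistaken for the summary line here, because every path printed by wc starts with the directory given to find and ends in .c or .h.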


Count lines of code with cloc

It can be difficult to count the lines of code that make up a program, since a raw line count of the source also includes comments and blank lines. On Linux systems, the cloc command can be used to count lines of code in one or multiple files, and even break the results down by programming language.

The cloc program is especially helpful if you need to measure and submit your progress of a coding project, view coding statistics, or calculate the total value of your code.

In this tutorial, you’ll see how to install the cloc software package on all major Linux distributions, and then use the cloc command to count the lines of code of various program files.

In this tutorial you will learn:

  • How to install cloc on major Linux distros
  • How to use the cloc command to count lines of code on Linux

Use the cloc command to count number of lines of code in Linux

Software Requirements and Linux Command Line Conventions

Category      Requirements, Conventions or Software Version Used
System        Any Linux distro
Software      cloc
Other         Privileged access to your Linux system as root or via the sudo command.
Conventions   # – requires given Linux commands to be executed with root privileges, either directly as the root user or by use of the sudo command
              $ – requires given Linux commands to be executed as a regular non-privileged user

Install cloc on major Linux distributions

cloc can be installed from your system’s package manager. Use the appropriate command below to install it.
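As a sketch, the snippet below prints (rather than runs) the usual install command for whichever common package manager it finds. It assumes the package is simply named cloc in the Debian/Ubuntu, Fedora and Arch repositories, and the function name cloc_install_hint is made up for this example:

```shell
# cloc_install_hint (hypothetical helper): print the install command that
# matches the first package manager found on this machine.
cloc_install_hint() {
    if command -v apt-get >/dev/null 2>&1; then
        echo "sudo apt install cloc"        # Debian, Ubuntu, Linux Mint
    elif command -v dnf >/dev/null 2>&1; then
        echo "sudo dnf install cloc"        # Fedora and related distros
    elif command -v pacman >/dev/null 2>&1; then
        echo "sudo pacman -S cloc"          # Arch Linux, Manjaro
    else
        echo "no known package manager found; search your distro's packages for cloc"
    fi
}

cloc_install_hint
```

Copy the printed command into a terminal to perform the installation.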

Once it’s installed, you will be able to execute the commands from the examples below.

How to use cloc on Linux

You can use the cloc command to count the lines of code of an individual file, multiple files, a directory, or even a compressed archive such as a .tar.gz or .zip file.

    Counting the lines of a Bash file:

$ cloc countdown.sh
       1 text file.
       1 unique file.
       0 files ignored.
github.com/AlDanial/cloc v 1.82  T=0.03 s (34.3 files/s, 2781.5 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Bourne Shell                     1             12              0             69
-------------------------------------------------------------------------------

$ cloc *.php
     209 text files.
     209 unique files.
       0 files ignored.
github.com/AlDanial/cloc v 1.82  T=2.93 s (71.4 files/s, 67066.1 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
PHP                            209          22423          80758          93103
-------------------------------------------------------------------------------
SUM:                           209          22423          80758          93103
-------------------------------------------------------------------------------

$ cloc --by-file my_project/
       2 text files.
       2 unique files.
       0 files ignored.
http://cloc.sourceforge.net v 1.82  T=0.01 s (149.5 files/s, 448.6 lines/s)
--------------------------------------------------------------------------------
File                                         blank        comment           code
--------------------------------------------------------------------------------
my_project/perl.pl                               1              0              2
my_project/bash.sh                               1              0              2
--------------------------------------------------------------------------------
SUM:                                             2              0              4
--------------------------------------------------------------------------------

$ cloc /usr/src/linux-headers-`uname -r`
     347 text files.
     346 unique files.
    8625 files ignored.
github.com/AlDanial/cloc v 1.82  T=4.70 s (54.3 files/s, 20714.3 lines/s)
--------------------------------------------------------------------------------
Language                      files          blank        comment           code
--------------------------------------------------------------------------------
C                                43           4888           3948          30263
Perl                             29           3323           2819          16355
C/C++ Header                     55            425            463          13876
DOS Batch                        33             63              0           3050
Bourne Shell                     50            557            913           2603
make                             17            641            569           2160
C++                               1            268             66           1581
Python                            7            285            411           1121
yacc                              2            170             52           1015
Bourne Again Shell                7            182            198            892
lex                               2            131             66            767
Glade                             1             58              0            603
NAnt script                       1            107              0            442
Assembly                          4            282           1107            360
D                                 2              0              0             99
awk                               1              9              5             67
--------------------------------------------------------------------------------
SUM:                            255          11389          10617          75254
--------------------------------------------------------------------------------

$ cloc latest.tar.gz
    2421 text files.
    2353 unique files.
      86 files ignored.
github.com/AlDanial/cloc v 1.82  T=29.91 s (78.1 files/s, 40656.4 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
JavaScript                     543          74183         101931         320805
PHP                            992          54140         158885         261259
CSS                            554          25906          19563         155284
JSON                            73              0              0          28413
Sass                           155           2790            462          12011
SVG                             15              0              0            344
HTML                             1             13              0             84
XML                              1              6              0             37
Markdown                         1              1              0              2
-------------------------------------------------------------------------------
SUM:                          2335         157039         280841         778239
-------------------------------------------------------------------------------

cloc has some extra options, which may come in handy in niche scenarios. To see them all, check out the manual page.

Closing Thoughts

In this tutorial, we saw how to install cloc on major Linux distros, and use the command to count the number of lines of code in one or more files on Linux. cloc is a simple and speedy program, able to process millions of lines of code in just a few seconds. It works on tons of different programming languages, making it useful for almost any type of developer.
