Linux grep and tail

Contents

grep and tail -f?
8 Answers
Using grep on a Continuous Stream
1. Introduction
2. Streams
3. Stream Manipulation
3.1. Redirection
3.2. Buffering
4. grep
4.1. Streams
4.2. Files
5. Continuous Streams with tail
6. Continuous Stream Processing With grep and tail
7. Buffer Control
8. Conclusion

grep and tail -f?

Is it possible to do a tail -f (or similar) on a file, and grep it at the same time? I wouldn’t mind other commands just looking for that kind of behavior.

8 Answers

Using GNU tail and GNU grep, I am able to grep a tail -f using the straightforward syntax:

tail -f /var/log/file.log | grep search_term

This solution also works with other implementations of these two utilities, not just the GNU ones.

tail -F (capital F) also follows a newly created file with the same name, for example when the log is rotated. tail -f (lowercase f) keeps following the original file and does not pick up the new one.

Add --line-buffered to grep, and that may reduce the delay for you. Very useful in some cases.

tail -f foo | grep --line-buffered bar

That’s useful when the output of grep doesn’t go to a terminal (for example, when it is redirected to a file or piped to another command). Line buffering is the default when the output goes to a terminal, so the option makes no difference there. Note that this option comes from GNU grep and is not specified by POSIX.
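To make the effect visible without a log file, here is a minimal sketch, assuming GNU grep. The produce function is a hypothetical stand-in for tail -f: it emits one matching line, pauses, then emits another. Each line that reaches the end of the pipeline is stamped with the seconds elapsed since the start.

```shell
#!/bin/sh
# Hypothetical producer standing in for "tail -f": a line, a pause, a line.
produce() {
    printf 'bar first\n'
    sleep 2
    printf 'bar second\n'
}

start=$(date +%s)

# grep writes to a pipe here, not a terminal, so --line-buffered is what
# makes it pass each match on as soon as the line is complete.
delays=$(produce | grep --line-buffered bar | while IFS= read -r line; do
    echo "$(( $(date +%s) - start )) $line"
done)

echo "$delays"
```

With the flag, the first line should be stamped before the two-second pause is over. Removing --line-buffered typically makes both lines arrive together once the producer exits.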

I believe the delays come from buffering of tail’s output. I use the stdbuf utility and invoke it as stdbuf -o0 tail -f foo | grep ... The same can be applied to grep when it is itself piped to another program, e.g. stdbuf -o0 tail -f foo | stdbuf -i0 -o0 grep bar | another_program

It will work fine; more generally, grep will wait when a program isn’t outputting, and keep reading as the output comes in, so if you do:

$ (echo foo; sleep 5; echo test; sleep 5) | grep test

Nothing will happen for 5 seconds, then grep will output the matched "test", and five seconds later it will exit when the piped process does.

I see all these people saying to use tail -f, but I do not like its limitations! My favorite method of searching a file while also watching for new lines (e.g., I commonly work with log files that receive the redirected output of processes run periodically by cron jobs) is:

 tail -Fn+0 /path/to/file|grep searchterm

This assumes GNU tail and grep. Supporting details from the tail manpage (GNU coreutils, mine is v8.22) [https://www.gnu.org/software/coreutils/manual/coreutils.html] :

 -F     same as --follow=name --retry

 -n, --lines=K
        output the last K lines, instead of the last 10; or use -n +K to output starting with the Kth.

 If the first character of K (the number of bytes or lines) is a '+', print beginning with the Kth item from the start of each file, otherwise, print the last K items in the file. K may have a multiplier suffix: b 512, kB 1000, K 1024, MB 1000*1000, M 1024*1024, GB 1000*1000*1000, G 1024*1024*1024, and so on for T, P, E, Z, Y.

 With --follow (-f), tail defaults to following the file descriptor, which means that even if a tail'ed file is renamed, tail will continue to track its end. This default behavior is not desirable when you really want to track the actual name of the file, not the file descriptor (e.g., log rotation). Use --follow=name in that case. That causes tail to track the named file in a way that accommodates renaming, removal and creation.

So, the tail portion of my command equates to tail --follow=name --retry --lines=+0, where the final argument directs it to start at the beginning, skipping zero lines.
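As a rough illustration of that behavior, the sketch below simulates a log rotation in a scratch directory. It assumes GNU tail, grep, and timeout. The file name app.log, the search term, and the six-second window are made up for the example.

```shell
#!/bin/sh
dir=$(mktemp -d)                 # scratch directory for the experiment
log="$dir/app.log"               # hypothetical log file
printf 'old match\nold other\n' > "$log"

# Read the file from its first line (-n +0) and keep following the name (-F).
# timeout stops tail after 6 seconds so the script can finish.
timeout 6 tail -Fn+0 "$log" 2>/dev/null |
    grep --line-buffered match > "$dir/out" &

sleep 1
mv "$log" "$log.1"               # simulate rotation of the log
printf 'new match\n' > "$log"    # a new file appears under the old name

wait                             # wait for the pipeline to end
cat "$dir/out"
```

The captured output should contain the match that was already in the file when tail started, followed by the match written to the replacement file. With a lowercase -f, tail would stay attached to the renamed file and never see the new one.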

Using grep on a Continuous Stream

1. Introduction

Many Linux utilities can act on both files and standard input. In addition, some can even act on a continuous stream of data directed to them.

In this tutorial, we look at the grep (Global Regular Expression Print) command and how we can use it on a continuous stream.

In particular, we define streams and discuss stream manipulation. After that, we briefly discuss how grep handles streams and files. Next, we see a continuous stream produced by a specific tool. Finally, we use grep with a continuous stream, where we also demonstrate buffering control.

We tested the code in this tutorial on Debian 11 (Bullseye) with GNU Bash 5.1.4. The basic commands are POSIX-compliant and should work in any such environment, although grep --line-buffered and stdbuf are not specified by POSIX.

2. Streams

At its most basic level, a stream is just a pipeline for data. Note the word choice of “for”, instead of “with”. This is because it takes into account the fact that data isn’t needed for a stream to exist.

Consider a keyboard. There is a stream for that input device, but it doesn’t mean we constantly slam keys to feed it with data. In practice, interested processes can subscribe to streams as if they were radio stations.

They do that in multiple ways, including signals, sockets, and pipes.

We can send signals between processes to notify them of an event, but signals usually carry only metadata, not the actual data. Sockets, on the other hand, are mainly used for networking. In fact, we can roughly equate a socket to a network pipe.

This brings us to manipulating streams in general, which is our primary interest.

3. Stream Manipulation

Whether via stream redirection or direct piping, we can modify and divert streams.

3.1. Redirection

In general, we have many means to redirect:

$ echo 'Data.' >> file.ext
$ cat file.ext
Data.
$ echo 'Data.' | cat
Data.

In both cases above, we are dealing with redirection operators. First, we use >> to echo some data to a file. Next, we output it back via cat (Concatenate). After that, we pipe the same information with | directly to cat.

Importantly, pipes almost always have buffers.

3.2. Buffering

The use of buffers often means the information doesn’t get through unless a given amount is already loaded for transfer or one end terminates. In particular, the termination can be of the process or the stream via a special character. In short, we can define a buffer as a place where data temporarily accumulates.

Some commands also use buffers directly.

4. grep

The grep tool has internal buffering. When grep reads a file directly, that internal buffer is usually the only one involved. On the other hand, grep can also work on streams, which themselves provide a second buffer layer.

4.1. Streams

Indeed, we can just pass data to grep via stream redirection:

$ echo 'Content.' | grep 'Con'
Content.

In this instance, we pipe a string directly for processing. Particularly, we redirect stdout through the pipe. Once grep is done with the string, all processes terminate along with the pipe.

However, there is an alternative way.

4.2. Files

Of course, grep can act on files directly by just using the filename:

$ echo 'Content.' > file.ext
$ grep 'Con' file.ext
Content.

But what if we wanted to monitor the file for changes? By combining with another tool, we can do just that.

5. Continuous Streams with tail

The tail command has the -f (follow) flag, which waits for file updates and adds them to the output instead of terminating directly after execution.

For example, if we start tail -f on file.ext in one terminal and append data to that file from another terminal, we expect the same data to appear in the first terminal.

Indeed, that is what we see at the other end. Let’s now add our filter to the equation.
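The two-terminal experiment described above can be approximated in a single script. This is only a sketch, assuming GNU tail and timeout: the first terminal is played by a background tail -f whose output is captured in a file, and the three-second window and scratch directory are arbitrary choices for the example.

```shell
#!/bin/sh
dir=$(mktemp -d)              # scratch directory for the experiment
file="$dir/file.ext"
: > "$file"                   # start with an empty file

# "First terminal": follow the file for 3 seconds and record what tail prints.
timeout 3 tail -f "$file" > "$dir/seen" &

# "Second terminal": append data while tail is still running.
sleep 1
echo 'Line 1' >> "$file"
echo 'Line 2' >> "$file"

wait                          # timeout stops tail; wait for it to finish
cat "$dir/seen"
```

Everything appended to file.ext while tail was running should show up in the captured output, in the same order.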

6. Continuous Stream Processing With grep and tail

This time, we pipe a continuous stream from tail to grep:

$ tail -f /file.ext | grep 'Line'

Next, we add data to the file in another terminal.

Now, depending on the exact setup, we might not see the output right away. Why? Because of buffering. Both the pipe and grep use buffers, so output may be held back until a line feed arrives or a certain number of bytes accumulates.
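A scripted version of this experiment might look like the sketch below. It assumes GNU tail, grep, and timeout, and it uses a scratch copy of file.ext instead of the path above. Because grep writes to a regular file rather than a terminal, it is free to hold matches back, so the script takes a snapshot while the pipeline is still running and another after it has ended.

```shell
#!/bin/sh
dir=$(mktemp -d)              # scratch directory for the experiment
file="$dir/file.ext"
: > "$file"

# Follow the file for 4 seconds and keep only lines containing 'Line'.
timeout 4 tail -f "$file" | grep 'Line' > "$dir/matches" &

sleep 1
printf 'Line 1\nOther data\nLine 2\n' >> "$file"

sleep 1
# Snapshot while the pipeline is still running. With a grep that buffers
# its output in blocks, this is often still empty.
snapshot=$(cat "$dir/matches")

wait                          # tail is stopped, grep sees EOF and flushes
final=$(cat "$dir/matches")

printf 'while running: [%s]\n' "$snapshot"
printf 'after exit:    [%s]\n' "$final"
```

The final result contains both matching lines regardless of buffering. Whether the snapshot already contains them depends on how the installed grep buffers its output.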

However, we can control and prevent this.

7. Buffer Control

We can use the --line-buffered flag of grep to force flushing its buffer on each line termination instead of waiting for a fixed number of bytes:

$ tail -f /file.ext | grep --line-buffered 'Line'

With this flag, every matching line appended to file.ext should produce output immediately. If we append text without a newline character, no output gets through until the line is terminated.
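The newline behavior can be checked with a small sketch, again assuming GNU tail, grep, and timeout and a scratch copy of file.ext. The script appends text without a newline, waits, then appends the newline. Each line that grep lets through is stamped with the seconds elapsed since the start.

```shell
#!/bin/sh
dir=$(mktemp -d)              # scratch directory for the experiment
file="$dir/file.ext"
: > "$file"
start=$(date +%s)

# Stamp every line that grep emits with the elapsed time in seconds.
timeout 5 tail -f "$file" | grep --line-buffered 'Line' |
while IFS= read -r line; do
    echo "$(( $(date +%s) - start )) $line"
done > "$dir/stamped" &

sleep 1
printf 'Line without newline' >> "$file"   # incomplete line: nothing to emit
sleep 2
printf '\n' >> "$file"                     # the newline completes the line

wait
cat "$dir/stamped"
```

The stamp shows that the match only comes through after the newline is written, about three seconds in, even though the text itself was appended after one second.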

Despite the modification, we may still encounter instances where the pipe itself buffers data. In these cases, there is a tool at our disposal: stdbuf.

In fact, we can enforce the same line buffering on the pipe:

$ stdbuf --output=L tail -f /file.ext | grep --line-buffered 'Line'

Using the --output option equal to L (line), we have line buffering on both sides of the pipe.

Actually, we can use stdbuf to completely remove buffering. To achieve this, we replace L with 0 (no buffering):

$ stdbuf --output=0 tail -f /file.ext | grep 'Line'

Depending on the setup, this line should produce immediate output on any modification of file.ext.
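The same stdbuf approach can help with filters that have no buffering flag of their own. The sketch below assumes GNU coreutils and uses cut as such a filter. The produce function is a hypothetical stand-in for a continuous stream, and each word that arrives is stamped with the seconds elapsed since the start.

```shell
#!/bin/sh
# Hypothetical producer standing in for a continuous stream.
produce() {
    printf 'alpha one\n'
    sleep 2
    printf 'beta two\n'
}

start=$(date +%s)

# cut has no --line-buffered flag, so stdbuf asks it to line-buffer stdout.
stamped=$(produce | stdbuf --output=L cut -d ' ' -f 1 |
while IFS= read -r word; do
    echo "$(( $(date +%s) - start )) $word"
done)

echo "$stamped"
```

With stdbuf in place, the first word should arrive before the producer's pause ends. Note that stdbuf only affects programs that rely on the standard C library's default stream buffering, so it does not help with tools that manage their own buffers.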

8. Conclusion

In this article, we saw how grep can be used with continuous streams of data. In addition, we applied buffer control via stdbuf.

To demonstrate both tools on a continuous stream, we used tail. Note that this is not the only way to produce such streams, but the methods discussed should work with any command-line tool.

In conclusion, grep works with continuous streams out of the box, but there are options to further enhance and control its functionality.
