ripgrep - Output Content Options

By now we have established that ripgrep proceeds through several steps each time it is called:

Select files to be searched
Apply the specified pattern(s) to each line in the selected input(s)
Format and return each line of output

In this chapter we will learn about some of the various options that ripgrep provides around defining what it should output and how that output should be presented to the user.

As we have stated several times, ripgrep's basic function is to traverse files and directories, locate lines that match one or more specified patterns, and send them to stdout. There is a bit more to it than that, however, since ripgrep collects a fair amount of information that, depending upon what we are trying to achieve, can be very valuable.

For example, ripgrep knows which file it is searching, so each time it identifies a matching line it can include the file path in its output. That information is very useful in some applications, but extraneous in others.

Line and Column Numbers

By default, ripgrep includes line numbers when its output is being printed to a terminal, but omits them in other cases, such as when its output is being piped to another command. This behavior can be overridden by passing the --line-number or --no-line-number option when ripgrep is invoked.

When line numbers are enabled, the column number of the first match on each line can also be included in or excluded from the output by additionally passing the --column or --no-column option.

Both line and column numbers are 1-indexed. Be aware that column numbers simply count bytes, which works fine for ASCII content but may not correspond to the character position when a line contains multi-byte Unicode characters.
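The following is a minimal sketch of these options in action. The file sample.txt and its contents are assumptions made up for illustration; the octal escapes in the printf call write the two UTF-8 bytes of an accented e, so the reported column for the word after it is one higher than a character count would suggest.

```shell
# Hypothetical sample file; the name and contents are assumptions for illustration.
tmpdir=$(mktemp -d)
printf 'apple\nbanana\ncaf\303\251 latte\n' > "$tmpdir/sample.txt"

# Force line and column numbers on, even though output is captured rather than
# printed to a terminal. "an" first occurs at byte 2 of line 2.
numbered=$(rg --line-number --column 'an' "$tmpdir/sample.txt")
echo "$numbered"

# The accented character occupies two bytes, so "latte" is reported at
# column 7 (bytes) rather than 6 (characters).
unicode=$(rg --line-number --column 'latte' "$tmpdir/sample.txt")
echo "$unicode"

# Suppress line numbers entirely.
plain=$(rg --no-line-number 'an' "$tmpdir/sample.txt")
echo "$plain"

rm -rf "$tmpdir"
```

Because only one file is searched, no filename prefix appears and each numbered line has the form line:column:text.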

Identifying Matching Files

Sometimes the default behavior prints more information than we need. For example, what if we only want to find the files that contain a match, but don't need to know which lines matched? We can suppress line-level details by passing the --files-with-matches option, in which case we only see the paths of the files containing a match. The inverse is also possible: passing the --files-without-match option instead prints only the paths of the files that contain no match.
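As a rough illustration, the sketch below runs both options over two small files. The file names and contents are assumptions invented for this example.

```shell
# Hypothetical sample files; names and contents are assumptions for illustration.
tmpdir=$(mktemp -d)
printf 'apple\nbanana\n' > "$tmpdir/fruits.txt"
printf 'carrot\npotato\n' > "$tmpdir/vegetables.txt"

# Print only the paths of files that contain at least one match.
with_match=$(cd "$tmpdir" && rg --files-with-matches 'an' fruits.txt vegetables.txt)
echo "$with_match"

# Print only the paths of files that contain no match at all.
without_match=$(cd "$tmpdir" && rg --files-without-match 'an' fruits.txt vegetables.txt)
echo "$without_match"

rm -rf "$tmpdir"
```

Only banana contains the text an, so the first command lists fruits.txt and the second lists vegetables.txt.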

Counting Lines & Matches

ripgrep provides a few options when we need a bit more information than just the matching filenames, but we still don't need to see the actual matches themselves.

For example, the --count option can be used for searches over multiple files or one or more directories. When --count is used, each file that contains a match is printed, as before, but with the addition of the number of lines that matched within that file. By default files with no matches are omitted, although these can be included as well by additionally passing the --include-zero option.

This option can also be used when searching a single file, but it behaves a bit differently. By default only the number of matching lines is printed, without the filename, and no output is generated when there are no matches. The filename can be added to the output by passing the --with-filename option, in which case the output uses the same format as when multiple files are searched. As in the multi-file case, passing the --include-zero option generates output even when the file contains no matches.

Finally, in cases where we want to know the actual number of matches, rather than only the number of lines that matched, we can pass the --count-matches option instead of the --count option. This can be helpful in situations where we are searching for a pattern that can match multiple times within a single line.
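The counting options can be compared side by side with a small sketch like the one below. The two files and their contents are assumptions chosen for illustration, and the multi-file output is piped through sort only because the order in which ripgrep reports several files is not guaranteed.

```shell
# Hypothetical sample files; names and contents are assumptions for illustration.
tmpdir=$(mktemp -d)
printf 'apple\nbanana\ngrape\n' > "$tmpdir/fruits.txt"
printf 'carrot\npotato\n' > "$tmpdir/vegetables.txt"

# Multiple files: each file with a match is listed with its matching-line count.
multi=$(cd "$tmpdir" && rg --count 'a[pn]' fruits.txt vegetables.txt)
echo "$multi"

# Also list files with zero matching lines.
with_zero=$(cd "$tmpdir" && rg --count --include-zero 'a[pn]' fruits.txt vegetables.txt | sort)
echo "$with_zero"

# Single file: only the count is printed unless --with-filename is passed.
single=$(cd "$tmpdir" && rg --count 'a[pn]' fruits.txt)
named=$(cd "$tmpdir" && rg --count --with-filename 'a[pn]' fruits.txt)
echo "$single"
echo "$named"

# Count individual matches rather than matching lines ("banana" contains two).
total=$(cd "$tmpdir" && rg --count-matches 'a[pn]' fruits.txt)
echo "$total"

rm -rf "$tmpdir"
```

Three lines match in fruits.txt, but --count-matches reports four because the pattern matches twice within banana.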

Handling Long Lines

When ripgrep prints a match, it prints the entire line, which can make the output appear disorganized when dealing with very long lines. ripgrep offers a few options for managing this situation. First, the --max-columns option can be passed, specifying the maximum length (measured in bytes) of a line that should be printed. Lines that exceed this threshold are omitted and replaced by a short notice that a long matching line was left out, which may include the number of matches that occurred within that line. This can be helpful in some situations, but is not always the best solution.

The --max-columns option behaves a bit differently when the --max-columns-preview option is also provided. Rather than replacing long lines with a notice, ripgrep truncates them so that the printed text does not exceed the length specified by --max-columns, and marks the line as cut short.
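A small sketch of the difference between the two modes follows. The file long.txt, its contents, and the 20-byte limit are assumptions picked for illustration, and the exact wording of the placeholder that ripgrep prints may differ between versions, so no expected output is shown.

```shell
# Hypothetical sample file; the name, contents, and limit are assumptions.
tmpdir=$(mktemp -d)
printf 'short apple\nthis is a much longer line that mentions an apple near the end\n' \
  > "$tmpdir/long.txt"

# Lines longer than 20 bytes are replaced by a placeholder notice.
omitted=$(rg --no-line-number --max-columns=20 'apple' "$tmpdir/long.txt")
echo "$omitted"

# With the preview option, the start of the long line is shown instead,
# followed by a marker indicating that the line was cut short.
preview=$(rg --no-line-number --max-columns=20 --max-columns-preview 'apple' "$tmpdir/long.txt")
echo "$preview"

rm -rf "$tmpdir"
```

In both runs the short first line is printed unchanged, since it fits within the limit.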

Printing Only Matched Content

As an alternative to printing each matched line, passing the --only-matching option causes ripgrep to print only the matching content, with each match printed on a separate line. This can be convenient if you want to get a complete list of all matches within each input, and none of the non-matching content.
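For instance, the sketch below recreates the five-line fruits.txt file used throughout this chapter in a temporary directory (the temporary location is an assumption for illustration) and prints only the matched text.

```shell
# Recreate the chapter's fruits.txt in a temporary directory.
tmpdir=$(mktemp -d)
printf 'apple\nbanana\nwatermelon\ngrape\nstrawberry\n' > "$tmpdir/fruits.txt"

# Print each match on its own line, without the rest of the matching line.
matches=$(rg --no-line-number --only-matching 'a[pn]' "$tmpdir/fruits.txt")
echo "$matches"
# ap
# an
# an
# ap

rm -rf "$tmpdir"
```

The two consecutive an lines both come from banana, which shows that a line with several matches contributes several output lines.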

Match Context

Sometimes we want a little bit more context about each match than we can get from the single line that contained the match. In these cases, the --context=N option instructs ripgrep to additionally print N lines above and below each match. When a little bit more control is required, the --before-context=N and --after-context=M options, alone or together, set the number of lines printed before and after each matching line to N and M, respectively.
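The sketch below illustrates the context options using the same five-line fruits.txt contents as the rest of the chapter, written to a temporary directory (an assumption for illustration). With line numbers enabled, ripgrep separates the line number from the text with a colon on matching lines and a hyphen on context lines.

```shell
# Recreate the chapter's fruits.txt in a temporary directory.
tmpdir=$(mktemp -d)
printf 'apple\nbanana\nwatermelon\ngrape\nstrawberry\n' > "$tmpdir/fruits.txt"

# One line of context on either side of the match.
around=$(rg --line-number --context=1 'melon' "$tmpdir/fruits.txt")
echo "$around"
# 2-banana
# 3:watermelon
# 4-grape

# Two lines before the match, one line after it.
uneven=$(rg --line-number --before-context=2 --after-context=1 'melon' "$tmpdir/fruits.txt")
echo "$uneven"
# 1-apple
# 2-banana
# 3:watermelon
# 4-grape

rm -rf "$tmpdir"
```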

Transforming Matched Content

Matched content can be replaced with another string using the --replace=REPL option, which replaces each match with the text in REPL. Although there may be times when replacing content with a literal string is useful, this feature becomes very powerful when the patterns contain capture groups, which allows the matched content within each capture group to be used in REPL.

Let's look at a few simple examples, starting from the fruits.txt file that we have previously used in several examples. To refresh your memory, these are the contents of the file:

ninja$: rg . fruits.txt
1:apple
2:banana
3:watermelon
4:grape
5:strawberry
ninja$:

In this example we are going to use a very simple regular expression, which matches each combination of an a followed by either a p or an n. Let's filter the file to get an idea of the contents of the un-transformed output:

ninja$: rg 'a[pn]' fruits.txt
1:apple
2:banana
4:grape
ninja$:

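Now, suppose we want to redact all text that matches this pattern by replacing it with --. We can achieve this by wrapping our regular expression in parentheses, creating a capture group, and passing -- as the replacement text. The capture group is not strictly required here, because --replace substitutes the entire match, but it prepares the ground for the next example. A command that can be used to do this is: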

ninja$: rg --replace=-- '(a[pn])' fruits.txt
1:--ple
2:b----a
4:gr--e
ninja$:

In this example, we know that the pattern matches 2 characters, so when we replace the matched text with -- we maintain the overall length of the content. Now, let's look at a second example that shows how we can use the captured text in the output. The following example modifies the pattern so that the capture group contains only the character class, but not the leading a. Next, we define our replacement text as -${1}, which tells ripgrep to replace the matched text with a - followed by whatever text matched inside the first capture group. The net result is that we replace every a character that is followed by either a p or an n with a -, but keep the p or n character:

ninja$: rg --replace='-${1}' 'a([pn])' fruits.txt
1:-pple
2:b-n-na
4:gr-pe
ninja$:

Now that we have seen some of the various ways that ripgrep allows us to define what to include in its output, let's next look at some of the tools it gives us to format that output.