By now we have established that ripgrep proceeds through several steps each time it is called:
- Select files to be searched
- Apply the specified pattern(s) to each of the specified input(s)
- Format and return each line of output
In this chapter we will learn about some of the various options that ripgrep provides around defining what it should output and how that output should be presented to the user.
As we have stated several times, ripgrep's basic function is to traverse files and directories, locating lines that match one or more specified patterns, and sending them to stdout. There is a bit more to it than that, however, since ripgrep collects a fair amount of information that, depending upon what we are trying to achieve, can be very valuable.
For example, ripgrep knows which file it is searching, so each time it identifies a matching line it can report where that line was found. This is very useful information in some applications, but extraneous in others.
Line and Column Numbers
By default, ripgrep includes line numbers when its output is being printed to the screen, but
omits line numbers in other cases, such as when its output is being piped to another command.
This behavior can be modified by passing the --line-number or --no-line-number option when
ripgrep is invoked.
When line numbers are enabled, the column number of the first match on each line can also be
included in or excluded from the output by additionally passing either the --column or the
--no-column option.
Both line and column numbers are 1-indexed. Be aware that column numbers simply count bytes, which works fine for ASCII content but may not be reliable when the lines contain non-ASCII Unicode text.
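As a minimal sketch of these options, the commands below search a small throwaway file. The file names and contents are assumptions made for illustration, and the flags are passed explicitly so that the result does not depend on whether the output goes to a terminal.

```shell
# Create throwaway sample files (names and contents are illustrative assumptions).
dir=$(mktemp -d)
printf 'red apple\ngreen pear\npineapple juice\n' > "$dir/sample.txt"

# Force line numbers on, even when the output is piped or redirected.
rg --line-number 'apple' "$dir/sample.txt"
# 1:red apple
# 3:pineapple juice

# Add the 1-based column of the first match on each line.
rg --line-number --column 'apple' "$dir/sample.txt"
# 1:5:red apple
# 3:5:pineapple juice

# Columns count bytes: the two-byte UTF-8 encoding of "é" (written here as
# octal escapes) pushes "apple" to column 7 rather than 6.
printf 'caf\303\251 apple\n' > "$dir/unicode.txt"
rg --line-number --column 'apple' "$dir/unicode.txt"
```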
Identifying Matching Files
Sometimes the default behavior prints more information than we need. For example, what if we only
want to find the files that contain a match, but don't need to know which lines matched? We can
suppress line-level details by passing the --files-with-matches option, in which case we only see
the paths to the files containing a match. The inverse is also possible, by passing the
--files-without-match option instead.
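The two options can be compared side by side with a quick sketch. The files a.txt and b.txt and their contents are hypothetical, created only for this illustration.

```shell
# Two throwaway files (names and contents are illustrative assumptions).
dir=$(mktemp -d)
printf 'apple\nbanana\n' > "$dir/a.txt"
printf 'cherry\n' > "$dir/b.txt"
cd "$dir" || exit 1

# Print only the paths of files that contain at least one match.
rg --files-with-matches 'apple' a.txt b.txt
# a.txt

# Print only the paths of files that contain no match at all.
rg --files-without-match 'apple' a.txt b.txt
# b.txt
```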
Counting Lines & Matches
ripgrep provides a few options when we need a bit more information than just the matching filenames, but we still don't need to see the actual matches themselves.
For example, the --count option can be used for searches over multiple files or one or more
directories. When --count is used each file that contains a match is printed, as before, but
with the addition of the number of lines that matched within each file. By default files with no
matches are omitted, although these can be included as well by additionally passing the
--include-zero option.
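A short sketch of the multi-file case follows. The two files and their contents are assumptions for illustration, and --sort path is added only to keep the order of the output stable between runs.

```shell
# Two throwaway files (names and contents are illustrative assumptions).
dir=$(mktemp -d)
printf 'banana\nmango\napple\n' > "$dir/a.txt"
printf 'cherry\n' > "$dir/b.txt"
cd "$dir" || exit 1

# Number of matching lines per file; files without matches are left out.
rg --count --sort path 'an' a.txt b.txt
# a.txt:2

# Also report the files that have zero matching lines.
rg --count --include-zero --sort path 'an' a.txt b.txt
# a.txt:2
# b.txt:0
```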
This option can also be used when searching over a single file, but it behaves a bit differently.
By default only the number of lines that matched is printed, without the filename, and no output
is generated when there are no matches. The filename can be added to the output by passing the
--with-filename option, in which case the output uses the same format as when multiple files are
searched. As in the multi-file case, we can also generate output when there are no matches in the
file by passing the --include-zero option.
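The single-file behavior can be sketched as follows, again with a hypothetical file created only for the example.

```shell
# One throwaway file (name and contents are illustrative assumptions).
dir=$(mktemp -d)
printf 'banana\nmango\napple\n' > "$dir/a.txt"
cd "$dir" || exit 1

rg --count 'an' a.txt                     # prints: 2
rg --count --with-filename 'an' a.txt     # prints: a.txt:2
rg --count 'kiwi' a.txt                   # prints nothing
rg --count --include-zero 'kiwi' a.txt    # prints: 0
```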
Finally, in cases where we want to know the actual number of matches, rather than only the number
of lines that matched, we can pass the --count-matches option instead of the --count option.
This can be helpful in situations where we are searching for a pattern that can match multiple
times within a single line.
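The difference shows up as soon as one line contains the pattern more than once. In this sketch the file contents are an assumption chosen so that "banana" contains two occurrences of "an".

```shell
# A throwaway file (contents are illustrative assumptions).
file=$(mktemp)
printf 'banana\nmango\napple\n' > "$file"

# Two lines contain "an" ...
rg --count 'an' "$file"            # prints: 2

# ... but "banana" contains it twice, so there are three matches in total.
rg --count-matches 'an' "$file"    # prints: 3
```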
Handling Long Lines
When ripgrep prints a match it prints the entire line, which can make the output appear
disorganized when dealing with very long lines. ripgrep offers a few options for managing this
situation. First, the --max-columns option can be passed, specifying the maximum length of a line
(measured in bytes) that should be printed. Lines that exceed this threshold are omitted and
replaced by the number of matches that occurred within that line. This can be helpful in some
situations, but is not always the best solution.
The --max-columns option behaves a bit differently when the --max-columns-preview option is also
provided: rather than replace long lines with the number of matches that occurred, the lines are
simply truncated so that they don't exceed the length specified by --max-columns.
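A rough sketch of both behaviors is shown below. The file and the limit of 20 are assumptions chosen for illustration. The exact wording of the note that ripgrep prints in place of, or after, a long line can vary with the version and the other options in use, so it is not reproduced here.

```shell
# A throwaway file with one short and one long matching line
# (name and contents are illustrative assumptions).
dir=$(mktemp -d)
{
  printf 'short apple line\n'
  printf 'apple %s\n' 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
} > "$dir/long.txt"

# The 16-byte line is printed as usual; the long line is replaced by a short note.
rg --max-columns=20 'apple' "$dir/long.txt"

# With the preview option, the start of the long line is kept and the rest is cut off.
rg --max-columns=20 --max-columns-preview 'apple' "$dir/long.txt"
```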
Printing Only Matched Content
As an alternative to printing each matched line, passing the --only-matching option causes
ripgrep to print only the matching content, with each match printed on a separate line. This can
be convenient if you want to get a complete list of all matches within each input, and none of the
non-matching content.
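For instance, the following sketch lists every match on its own line. The file contents are an assumption for illustration, and --no-line-number is passed so that the output looks the same on a terminal and in a pipe.

```shell
# A throwaway file (contents are illustrative assumptions).
file=$(mktemp)
printf 'banana\nmango\napple\n' > "$file"

# Print each match by itself: two from "banana", one from "mango", one from "apple".
rg --no-line-number --only-matching 'a[pn]' "$file"
# an
# an
# an
# ap
```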
Match Context
Sometimes we want a little bit more context about each match than we can get from the single line
that contained the match. In these cases, the --context=N option can be used to instruct
ripgrep to additionally print N lines above and below each match. When a little bit more control
is required, passing --before-context=N and/or --after-context=M allows the number of lines
before and after each matching line to be set to N and M, respectively.
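The context options can be tried out with a sketch like the one below. The five-line file is an assumption for illustration, and --no-line-number keeps the output free of line-number prefixes.

```shell
# A throwaway five-line file (contents are illustrative assumptions).
file=$(mktemp)
printf 'alpha\nbeta\ngamma\ndelta\nepsilon\n' > "$file"

# One line of context on each side of the match.
rg --no-line-number --context=1 'gamma' "$file"
# beta
# gamma
# delta

# Two lines before the match, one line after it.
rg --no-line-number --before-context=2 --after-context=1 'gamma' "$file"
# alpha
# beta
# gamma
# delta
```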
Transforming Matched Content
Matched content can be replaced with another string using the --replace=REPL option, which
replaces each match with the text in REPL. Although there may be times when replacing content
with a literal string is useful, this feature becomes very powerful when the patterns contain
capture groups, which allows the matched content within each capture group to be used in REPL.
Let's look at a few simple examples, starting from the fruits.txt file that we have previously
used in several examples. To refresh your memory, these are the contents of the file:
In this example we are going to use a very simple regular expression, which matches each
combination of an a followed by either a p or an n. Let's filter the file to get an idea of the
contents of the un-transformed output:
Now, suppose we want to redact all text that matches this pattern by replacing it with --. We can
achieve this by wrapping our regular expression in parentheses, creating a capture group, then
replacing that capture group with --. A command that can be used to do this is:
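A minimal sketch of such a command is shown here, run against a small stand-in file. The file contents are assumptions made for illustration and are not the contents of fruits.txt. The un-transformed matches are printed first for comparison.

```shell
# Stand-in input (contents are illustrative assumptions, not the chapter's fruits.txt).
file=$(mktemp)
printf 'banana\nmango\napple\n' > "$file"

# Un-transformed matching lines, for comparison.
rg --no-line-number 'a[pn]' "$file"
# banana
# mango
# apple

# Wrap the pattern in a capture group and replace every match with "--".
rg --no-line-number --replace='--' '(a[pn])' "$file"
# b----a
# m--go
# --ple
```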
In this example, we know that the pattern matches 2 characters, so when we replace the matched
text with -- we maintain the overall length of the content. Now, let's look at a second example
that shows how we can use the captured text in the output. The following example modifies the
pattern so that the capture group contains only the character class, but not the leading a. Next,
we define our replacement text as -${1}, which tells ripgrep to replace the matched text with a -
followed by whatever text matched inside the capture group. The net result is that we will
replace every a character that is followed by either a p or an n with a -, but keep the p or n
character:
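One way this could look is sketched below, again using a small stand-in file whose contents are assumptions for illustration rather than the contents of fruits.txt.

```shell
# Stand-in input (contents are illustrative assumptions, not the chapter's fruits.txt).
file=$(mktemp)
printf 'banana\nmango\napple\n' > "$file"

# Only the character class is captured; ${1} re-inserts the captured p or n.
# The single quotes stop the shell from trying to expand ${1} itself.
rg --no-line-number --replace='-${1}' 'a([pn])' "$file"
# b-n-na
# m-ngo
# -pple
```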
Now that we have seen some of the ways that ripgrep allows us to define what to include in its output, let's next look at some of the tools it gives us to format that output.