ripgrep is a recent entry in a long list of grep-like tools. It has gained popularity and has
become a de facto dependency of several popular Neovim plugins. For those unfamiliar with it, grep
is a command-line utility that has become the standard tool for matching lines in text files with
regular expressions.
ripgrep is most commonly credited with being much faster than grep and other similar tools,
which it achieves in two ways. First, ripgrep really is fast. Its GitHub repo has some benchmarks
which, although benchmarks may not always reflect actual use cases, indicate that ripgrep is often
several times faster than comparable tools, and in some cases faster by an order of magnitude.
Second, by default ripgrep reduces the number of files it has to search by respecting .gitignore
files and by automatically skipping hidden files and directories, which can significantly reduce
the workload when working with large code bases.
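The default filtering is easy to observe with the --files flag, which lists the files ripgrep would search without actually searching them. The following is a minimal sketch; the rg-demo directory and the file names in it are hypothetical and exist only for illustration:

```shell
# Hypothetical scratch directory with one visible and one hidden file.
mkdir -p rg-demo
echo 'apple' > rg-demo/visible.txt
echo 'apple' > rg-demo/.hidden.txt

# By default, the hidden file is skipped, so only visible.txt is listed.
rg --files rg-demo

# With --hidden, ripgrep also considers hidden files and directories.
rg --files --hidden rg-demo
```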
Call Signature
First, although the tool is called ripgrep, the command that is run from the command line is
simply rg.
When ripgrep is called directly from the command line, it follows the call signature:
rg [OPTIONS] PATTERN [PATH ...]
which changes slightly when it is called from within a pipeline, due to the input coming from stdin:
command | rg [OPTIONS] PATTERN
The primary difference between the two is that the former takes the content to search from the
optional PATH arguments (when none are given, ripgrep searches the current directory), while the
latter searches the output of command. We will see each of these in more detail shortly.
The PATH Argument
The PATH argument defines a file or a directory to search, where directories will be searched
recursively. Paths passed on the command line take precedence over other rules, such as globs and
.gitignore files.
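The precedence rule can be sketched with a hidden file, which ripgrep skips by default. The rg-paths directory and its file are hypothetical names used only for this illustration:

```shell
# Hypothetical scratch directory containing a single hidden file.
mkdir -p rg-paths
echo 'apple' > rg-paths/.hidden.txt

# Recursive search of the directory: the hidden file is skipped, so nothing matches.
rg apple rg-paths

# The same file passed explicitly as PATH: it is searched even though it is hidden.
rg apple rg-paths/.hidden.txt
```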
Let's see how this works, using the fruits.txt file we have used in previous examples:
apple
banana
watermelon
grape
strawberry
As a first step, let's use the pattern . to match everything, and apply that pattern to our input
file. Following our call signature:
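A sketch of that invocation, assuming fruits.txt holds the five fruit names one per line (the file is recreated here so the snippet is self-contained):

```shell
# Recreate the example file (assumed contents: one fruit per line).
printf '%s\n' apple banana watermelon grape strawberry > fruits.txt

# rg PATTERN PATH: the pattern . matches any character, so every non-empty line matches.
rg '.' fruits.txt

# Output at an interactive terminal:
# 1:apple
# 2:banana
# 3:watermelon
# 4:grape
# 5:strawberry
```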
As expected, ripgrep returned each line in our file. Now, let's see how we can use patterns to filter lines.
The PATTERN Argument
The PATTERN argument defines the regular expression to search for. ripgrep's regular expression
syntax is discussed in the regex syntax section.
To see how this works, let's search this file for a simple pattern: let's select any lines
that contain an a followed by either n or p. There are several ways we can build
this pattern, but let's use a simple character class to do so. (Don't worry if you don't yet
understand character classes; we will discuss them in detail in the next section):
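A sketch of that search, again assuming fruits.txt holds the five fruit names one per line:

```shell
# Recreate the example file (assumed contents: one fruit per line).
printf '%s\n' apple banana watermelon grape strawberry > fruits.txt

# [np] is a character class: it matches a single n or a single p,
# so a[np] matches "an" or "ap" anywhere in a line.
rg 'a[np]' fruits.txt

# Output at an interactive terminal:
# 1:apple
# 2:banana
# 4:grape
```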
Note that the output contains only the lines that matched our pattern, and omits those that did not. This demonstrates the basic function of ripgrep.
ripgrep can also search for multiple patterns at the same time. This can be achieved in a few ways,
but one of the most direct is the -e/--regexp option, which can be passed more than once so that
multiple patterns are specified in a single ripgrep invocation. A line is printed if it matches
any of the patterns.
To demonstrate, let's add a second pattern that selects only lines containing 2 or more consecutive
rs:
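A sketch of the two-pattern search, assuming the same fruits.txt contents as before. The second pattern, r{2,}, is one way of writing "two or more consecutive r characters":

```shell
# Recreate the example file (assumed contents: one fruit per line).
printf '%s\n' apple banana watermelon grape strawberry > fruits.txt

# Each -e adds a pattern; a line is printed if it matches either one.
rg -e 'a[np]' -e 'r{2,}' fruits.txt

# Output at an interactive terminal:
# 1:apple
# 2:banana
# 4:grape
# 5:strawberry
```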
At this point we are filtering out just a single line, so let's see one more example where we
achieve the same result, but a bit more directly. In the next example, let's use a pattern that
matches only lines that do not start with the letter w:
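One pattern that does this, shown here as a sketch against the assumed fruits.txt contents, anchors a negated character class to the start of the line:

```shell
# Recreate the example file (assumed contents: one fruit per line).
printf '%s\n' apple banana watermelon grape strawberry > fruits.txt

# ^ anchors the match to the start of the line; [^w] matches any character except w.
rg '^[^w]' fruits.txt

# Output at an interactive terminal:
# 1:apple
# 2:banana
# 4:grape
# 5:strawberry
```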
We have now seen how to call ripgrep, and we have seen a few simple regular expressions that allow us to select only lines of interest from the input file. In the next chapter we will learn more about the rules for constructing regular expression patterns themselves.
Pattern Files
Before we leave the topic of defining the pattern to search for, there is one more topic we would
like to discuss. While we often use ripgrep for quick, one-off searches, there are some searches
that we run periodically to perform specific tasks, and typing the same patterns on the
command line each time can be repetitive and error-prone, especially when working with more
complicated patterns. In many cases we can write a shell script, but ripgrep offers another
option that can be very useful in these situations. ripgrep provides a -f/--file option that
tells it to look for patterns in the specified file(s), allowing us to effectively name patterns
that we use often and then pass them to ripgrep by name.
Let's take a look at how this works. First, we define our pattern and save it to a file. Pattern files can contain one or more patterns, with each pattern defined on a separate line.
Be careful not to leave any blank lines in the files, however, as ripgrep interprets a blank line as "match all input", which is generally not what is intended.
The pattern file option can be passed multiple times, in which case ripgrep will search the input
for all patterns defined in all specified pattern files, and any input that matches any pattern will
be printed to the output. Finally, PATTERNFILE can also be specified as -, in which case ripgrep
will read patterns from stdin, allowing patterns to be dynamically generated, which opens up some
interesting possibilities.
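As a small illustration of reading patterns from stdin, the sketch below generates two patterns with printf and hands them to ripgrep with -f -. The patterns and the fruits.txt contents are assumptions carried over from the earlier examples:

```shell
# Recreate the example file (assumed contents: one fruit per line).
printf '%s\n' apple banana watermelon grape strawberry > fruits.txt

# Each line written to stdin is treated as a separate pattern.
# Because stdin supplies the patterns, the content to search is given as PATH.
printf '%s\n' 'a[np]' 'r{2,}' | rg -f - fruits.txt
```

With these inputs, the lines apple, banana, grape, and strawberry are printed. In practice the printf could be replaced by any command that emits one pattern per line.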
Back to our example: although pattern files are most useful when working with complicated patterns,
to keep our example simple we will select any lines that have exactly two adjacent a, b, or p
letters:
Next, let's take a look at the file we want to search:
Now, let's execute the command and check the results:
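A sketch of the full sequence, assuming a hypothetical pattern file named adjacent.txt and the fruits.txt contents from the earlier examples. The pattern [abp]{2} is one reading of the description above: it matches any two adjacent characters drawn from a, b, and p, so a longer run of those letters would also match:

```shell
# Recreate the example file (assumed contents: one fruit per line).
printf '%s\n' apple banana watermelon grape strawberry > fruits.txt

# Save the pattern to a file: one pattern per line, with no blank lines.
printf '%s\n' '[abp]{2}' > adjacent.txt

# Pass the pattern file to ripgrep with -f/--file.
rg -f adjacent.txt fruits.txt

# Output at an interactive terminal:
# 1:apple
# 2:banana
# 4:grape
```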
which, as expected, filters the input using the pattern defined in the pattern file.
Piping Input
Before we leave this chapter, let's look a bit deeper into piping output from commands into ripgrep. This time, let's use cat to concatenate two files, then pipe them to ripgrep to filter them. To start, let's dump the concatenated files to the console to see the input that will be sent to ripgrep:
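A sketch of this step, assuming fruits.txt from the earlier examples and a second, hypothetical file named vegetables.txt:

```shell
# Recreate the input files (both sets of contents are assumptions for illustration).
printf '%s\n' apple banana watermelon grape strawberry > fruits.txt
printf '%s\n' carrot celery spinach > vegetables.txt

# cat writes the files to stdout one after the other.
cat fruits.txt vegetables.txt

# Output:
# apple
# banana
# watermelon
# grape
# strawberry
# carrot
# celery
# spinach
```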
Now, let's apply a simple pattern to filter the lines. This time, let's see which lines contain an
e preceded by one of b, l, or p:
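A sketch of the filtered pipeline, using the assumed fruits.txt contents and a hypothetical vegetables.txt as the two input files:

```shell
# Recreate the input files (both sets of contents are assumptions for illustration).
printf '%s\n' apple banana watermelon grape strawberry > fruits.txt
printf '%s\n' carrot celery spinach > vegetables.txt

# No PATH is given: ripgrep searches the text arriving on stdin.
# [blp]e matches an e that is immediately preceded by b, l, or p.
cat fruits.txt vegetables.txt | rg '[blp]e'

# Output:
# apple
# grape
# strawberry
# celery
```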
Notice that unlike previous examples with rg, there are no line numbers. By default, ripgrep shows
line numbers when it searches files and prints to the console, but does not add line numbers when
its only input comes from stdin. Line numbers can still be added with the --line-number option;
they just aren't enabled by default in this case because we piped input from cat to ripgrep.
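A sketch of the same pipeline with line numbers turned back on, using the same assumed input files:

```shell
# Recreate the input files (both sets of contents are assumptions for illustration).
printf '%s\n' apple banana watermelon grape strawberry > fruits.txt
printf '%s\n' carrot celery spinach > vegetables.txt

# --line-number (short form: -n) numbers lines by their position in the piped stream.
cat fruits.txt vegetables.txt | rg --line-number '[blp]e'

# Output:
# 1:apple
# 4:grape
# 5:strawberry
# 7:celery
```

The numbers refer to positions in the concatenated stream, not in the original files, which is why celery is reported on line 7.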