linux_expr_command

Overview

I think the Linux expr command is woefully underused for matching and extraction of substrings using basic regular expressions, so herewith a quick primer.

Consider a fully-qualified package name with version number and, say, patch level number:

$ P="fubar-1.2.3-42"

Say one wanted to match or extract different components of that string. The match form of the expr command, written as expr STRING : REGEX, will, first, simply count the number of matching characters based on an anchored match; that is, a match anchored to the beginning of the string.

Let's count the length of the initial substring that matches anything but a hyphen (remember, this is a pattern match that is anchored to the beginning of the string, so it will match everything up to, but not including, the first hyphen):

$ expr $P : '[^-]*'
5
$

So that's the length of the package name “fubar”. Questionably useful, I guess, so let's extend this by counting the number of characters in the combined package name and version (and intervening hyphen):

$ expr $P : '[^-]*-[^-]*'
11
$

That is clearly the number of characters in the substring “fubar-1.2.3”. Finally, we could, of course, use that trick to match the entire string based on the pattern we know it will have, allowing a simple “.*” to suck up the rest of the string (and, yes, this is overkill, but we're going somewhere with this):

$ expr $P : '[^-]*-[^-]*-.*'
14
$
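
Before going further, it helps to know what happens when the pattern does not match at the start of the string. The following is a minimal sketch, assuming GNU expr (POSIX describes the same behaviour); the variable names len, nomatch and status are purely illustrative, and $P is quoted here only as a habit to guard against whitespace:

```shell
P="fubar-1.2.3-42"

# Anchored match succeeds: expr prints the match length, exit status 0.
len=$(expr "$P" : '[^-]*-[^-]*')

# The string starts with "f", not a digit, so the anchored match fails:
# expr prints 0 and returns exit status 1.
status=0
nomatch=$(expr "$P" : '[0-9][0-9]*') || status=$?

echo "len=$len nomatch=$nomatch status=$status"
```

On the sample string this prints len=11 nomatch=0 status=1, so the exit status can be used directly in an if test to check whether a string has the expected shape.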

“So what?”, you think. Ah, but you can also tag the field you care about, by enclosing that part of the pattern in \( and \), to have that field's value printed rather than the length of the match.

Knowing we need only supply enough pattern to match from the beginning, extract the package name:

$ expr $P : '\([^-]*\)'
fubar
$

Extract the version number by tagging that second field (making sure to start with enough regular expression to skip over the package name first):

$ expr $P : '[^-]*-\([^-]*\)'
1.2.3
$

Finally, extract the patch level by tagging the trailing substring after the second hyphen (again, supplying enough initial RE to skip over the package name and version number to get there):

$ expr $P : '[^-]*-[^-]*-\(.*\)'
42
$

Piece of cake. Note that this works for only one tagged field – if you tag more than one, you get just the first.
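
Putting the three extractions together, here is a minimal sketch of how one might capture each field in a shell variable using command substitution. The variable names name, version, patch and first_only are my own illustrative choices, and the last command simply demonstrates the one-tagged-field limitation described above:

```shell
P="fubar-1.2.3-42"

# Each call tags exactly one field and skips over whatever precedes it.
name=$(expr "$P" : '\([^-]*\)')
version=$(expr "$P" : '[^-]*-\([^-]*\)')
patch=$(expr "$P" : '[^-]*-[^-]*-\(.*\)')

# Two tagged fields in one pattern: only the first one is printed.
first_only=$(expr "$P" : '\([^-]*\)-\([^-]*\)')

echo "name=$name version=$version patch=$patch first_only=$first_only"
```

This costs one expr process per field, which is fine for a handful of strings but worth remembering inside a tight loop.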

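P.S. If the string you're manipulating has only two fields of interest, or has multiple fields of which only the first and last will ever be of interest, then you can of course use standard shell parameter expansion, which needs no external command. Note that ptn here is a shell glob pattern, not a regular expression:

  • ${var#ptn}: remove the shortest match of ptn from the beginning of the value
  • ${var##ptn}: remove the longest match of ptn from the beginning of the value
  • ${var%ptn}: remove the shortest match of ptn from the end of the value
  • ${var%%ptn}: remove the longest match of ptn from the end of the value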
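
As a rough illustration of those four forms on the same sample string, here is a minimal sketch; the variable names are illustrative only. The middle field is not reachable in a single expansion, so the sketch strips the name first and then the patch level:

```shell
P="fubar-1.2.3-42"

name=${P%%-*}        # longest "-*" removed from the end      -> fubar
patch=${P##*-}       # longest "*-" removed from the start    -> 42
rest=${P#*-}         # shortest "*-" removed from the start   -> 1.2.3-42
name_ver=${P%-*}     # shortest "-*" removed from the end     -> fubar-1.2.3

# Middle field in two steps: drop the name, then drop the patch level.
version=${rest%-*}   # -> 1.2.3

echo "$name $version $patch"
```

All of this happens inside the shell itself, so no extra process is started, which is the main reason to prefer it when the first and last fields are all you need.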
linux_expr_command.txt · Last modified: 2019/09/18 10:44 by rpjday