Extended globs in bash
I learned about extended globbing today, in bash.
I wanted to free up some disk space on my tiny MacBook, so I ran a scan with GrandPerspective. It showed an unusually large file in the git repo for Fira that I had cloned (really I just wanted the .otf files, but cloning the whole thing seemed easier). Somewhere in the Fira directory lurked a ~300 MB file. I just wanted to verify that’s where it was, nothing fancy.
So the goal is: show regular entries (README.md) and dotted/hidden entries (.git) all together.
ls -la shows hidden files, but of course it doesn’t tell you anything about the total size of a directory:
9281 ~/Dropbox/Fonts/Fira (master u=)
ls -la
total 88
drwxr-xr-x@ 17 dude staff 578 Oct 25 12:39 .
drwxr-xr-x@ 108 dude staff 3672 Dec 6 20:10 ..
-rw-r--r--@ 1 dude staff 6148 Oct 25 12:40 .DS_Store
drwxr-xr-x@ 12 dude staff 408 Feb 2 19:05 .git
-rw-r--r--@ 1 dude staff 4515 May 31 2015 LICENSE
-rw-r--r--@ 1 dude staff 169 May 31 2015 README.md
-rw-r--r--@ 1 dude staff 65 May 31 2015 bower.json
drwxr-xr-x@ 37 dude staff 1258 May 31 2015 eot
-rw-r--r--@ 1 dude staff 7379 May 31 2015 fira.css
-rw-r--r--@ 1 dude staff 8128 May 31 2015 index.html
drwxr-xr-x@ 37 dude staff 1258 May 31 2015 otf
-rw-r--r--@ 1 dude staff 618 May 31 2015 package.json
drwxr-xr-x@ 4 dude staff 136 Oct 25 12:39 source
drwxr-xr-x@ 38 dude staff 1292 May 31 2015 technical reports
drwxr-xr-x@ 37 dude staff 1258 May 31 2015 ttf
drwxr-xr-x@ 37 dude staff 1258 May 31 2015 woff
drwxr-xr-x@ 37 dude staff 1258 May 31 2015 woff2
It says .git is 408 bytes, but GrandPerspective determined that was a lie. (Of course, ls is just showing the size of the directory entry, not the total size of its contents.)
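If you want to see that difference in isolation, here is a minimal sketch. It assumes bash and a scratch directory from mktemp; the repo and blob.bin names are made up for illustration and have nothing to do with the Fira checkout.

```shell
#!/usr/bin/env bash
# Throwaway directory holding a single 1 MiB file (names are illustrative).
demo=$(mktemp -d)
mkdir "$demo/repo"
head -c 1048576 /dev/urandom > "$demo/repo/blob.bin"

# ls -ld describes the directory entry itself, not what it contains.
ls -ld "$demo/repo"

# du -sk walks the tree and totals the disk usage, reported in KiB.
total_kb=$(du -sk "$demo/repo" | awk '{print $1}')
echo "du total: ${total_kb} KiB"

rm -rf "$demo"
```

The number ls prints depends on the filesystem, but it stays small no matter how much you put inside the directory, while the du total grows with the contents.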
My go-to du -hs ./* didn’t pull up the dot-directories, which I knew included the main offender:
9282 ~/Dropbox/Fonts/Fira (master u=)
du -hs ./*
8.0K ./LICENSE
4.0K ./README.md
4.0K ./bower.json
5.2M ./eot
8.0K ./fira.css
8.0K ./index.html
11M ./otf
4.0K ./package.json
101M ./source
54M ./technical reports
13M ./ttf
6.1M ./woff
4.4M ./woff2
du does have an -a option, but it lists every file in the tree rather than adding hidden entries to a ./* listing, so I needed something else. Of course, ./* itself doesn’t match hidden entries. I can use the glob ./.*, but then I only get hidden entries. I wanted them all in one listing:
9284 ~/Dropbox/Fonts/Fira (master u=)
du -hs ./.*
513M ./.
654M ./..
8.0K ./.DS_Store
319M ./.git
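The split between the two globs is easy to reproduce without du. This is a rough sketch assuming bash with its default options; the scratch directory and the four entry names are invented for the demo.

```shell
#!/usr/bin/env bash
# Scratch directory with two visible and two hidden entries (made-up names).
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir .git otf
touch .DS_Store README.md

shopt -u dotglob            # bash's default: * skips names with a leading dot

visible=( ./* )             # only names that don't start with a dot
hidden=( ./.* )             # only names that do (may also return . and ..,
                            # depending on the bash version)

printf 'visible: %s\n' "${visible[@]}"
printf 'hidden:  %s\n' "${hidden[@]}"

cd / && rm -rf "$demo"
```

Neither array ever contains the other’s entries, which is exactly why a single plain glob can’t give one combined listing.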
extglob & dotglob
There are a couple ways to get to where I wanted:
- use the dotglob shell option, which will change the globbing convention to include matching an initial dot
- use the extglob shell option to give you better globs
I prefer the second option. Here’s the first, for completeness:
9280 ~/Dropbox/Fonts/Fira (master u=)
shopt -s dotglob
9281 ~/Dropbox/Fonts/Fira (master u=)
du -hs ./*
8.0K ./.DS_Store
319M ./.git
8.0K ./LICENSE
4.0K ./README.md
4.0K ./bower.json
5.2M ./eot
8.0K ./fira.css
8.0K ./index.html
11M ./otf
4.0K ./package.json
101M ./source
54M ./technical reports
13M ./ttf
6.1M ./woff
4.4M ./woff2
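One thing to keep in mind is that shopt -s dotglob stays on for the rest of the session. If you only want it for a single listing, a subshell keeps it contained. The sketch below assumes bash; the scratch directory and its entries are hypothetical.

```shell
#!/usr/bin/env bash
demo=$(mktemp -d)
mkdir "$demo/.git" "$demo/otf"
touch "$demo/README.md"

# Command substitution runs in a subshell, so both the cd and the
# dotglob setting disappear when it finishes.
listing=$(
  cd "$demo" || exit 1
  shopt -s dotglob
  printf '%s\n' ./*
)
echo "$listing"

# Back in the parent shell, dotglob is whatever it was before.
if shopt -q dotglob; then
  echo "parent shell: dotglob on"
else
  echo "parent shell: dotglob off"
fi

rm -rf "$demo"
```

With dotglob set, ./* picks up .git but never . or .., so the listing holds just the three entries that were created.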
This is nice since it excludes the . and .. entries automatically, but I’d rather include those by default and glob them out by hand. Here’s all there is to know about extended globs:
- they can describe regular languages, which makes them regular expressions
- they don’t look like regular expressions (the PCRE kind)
- their syntax is pretty simple (from man bash):
?(pattern-list)
Matches zero or one occurrence of the given patterns
*(pattern-list)
Matches zero or more occurrences of the given patterns
+(pattern-list)
Matches one or more occurrences of the given patterns
@(pattern-list)
Matches one of the given patterns
!(pattern-list)
Matches anything except one of the given patterns
A pattern-list is a list of one or more patterns separated by a |.
So for example, *(foo)* will match foo.md, foofoo.md, and bar.md, but +(foo)* will not match bar.md.
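You can try these operators against plain strings, with no files involved, using [[ ]]. This is only a sketch: matches and report are hypothetical helpers I made up for the demo, and string matching in [[ ]] doesn’t apply the leading-dot rule that filename globbing does.

```shell
#!/usr/bin/env bash
shopt -s extglob

# Hypothetical helper: succeeds when NAME matches the extended glob PATTERN.
# An unquoted right-hand side of == is treated as a pattern by [[ ]].
matches() {
  local name=$1 pattern=$2
  [[ $name == $pattern ]]
}

# Hypothetical helper: print whether NAME matches PATTERN.
report() {
  if matches "$1" "$2"; then
    echo "match     $2  $1"
  else
    echo "no match  $2  $1"
  fi
}

report foo.md    '*(foo)*'      # zero or more "foo", then anything
report bar.md    '*(foo)*'      # zero occurrences is fine
report bar.md    '+(foo)*'      # needs at least one "foo"
report foofoo.md '+(foo)*'
report ttf       '@(otf|ttf)'   # exactly one of the alternatives
report ..        '!(.|..)'      # anything except . and ..
report .git      '!(.|..)'
```

Passing the pattern in as a quoted string and expanding it unquoted inside [[ ]] sidesteps the question of whether extglob was already on when the line was parsed.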
The du I was looking for
So armed with shopt -s extglob, here we go:
9283 ~/Dropbox/Fonts/Fira (master u=)
du -hs ./?(.)*
513M ./.
654M ./..
8.0K ./.DS_Store
319M ./.git
8.0K ./LICENSE
4.0K ./README.md
4.0K ./bower.json
5.2M ./eot
8.0K ./fira.css
8.0K ./index.html
11M ./otf
4.0K ./package.json
101M ./source
54M ./technical reports
13M ./ttf
6.1M ./woff
4.4M ./woff2
I can exclude the . and .. entries with the extended glob ./!(.|..):
9289 ~/Dropbox/Fonts/Fira (master u=)
du -hs ./!(.|..)
8.0K ./.DS_Store
319M ./.git
8.0K ./LICENSE
4.0K ./README.md
4.0K ./bower.json
5.2M ./eot
8.0K ./fira.css
8.0K ./index.html
11M ./otf
4.0K ./package.json
101M ./source
54M ./technical reports
13M ./ttf
6.1M ./woff
4.4M ./woff2
Finally, a proper du. I don’t know how I went so long without this. It makes ls -a look kind of weird and unnecessary. I can get fancier now, and start sorting and limiting:
9290 ~/Dropbox/Fonts/Fira (master u=)
du -ms ./!(.|..) | sort -n | tail -5
11 ./otf
14 ./ttf
54 ./technical reports
101 ./source
319 ./.git
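Here is the same pipeline as a self-contained sketch you can run anywhere, assuming bash with extglob. The directory names echo the ones above, but the files and their sizes are invented, and I use -k instead of -m so the numbers are in KiB.

```shell
#!/usr/bin/env bash
shopt -s extglob

demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir .git source "technical reports"
head -c 3145728 /dev/urandom > .git/pack.bin              # about 3 MiB
head -c 2097152 /dev/urandom > source/glyphs.bin          # about 2 MiB
head -c 1048576 /dev/urandom > "technical reports/a.pdf"  # about 1 MiB
touch README.md

# du -sk prints one "size<TAB>name" line per argument; sort -n orders the
# lines by the leading number and tail keeps the largest few.
largest=$(du -sk ./!(.|..) | sort -n | tail -n 3)
echo "$largest"

cd / && rm -rf "$demo"
```

The glob expands to one argument per entry, so a name with a space in it like technical reports reaches du intact, and sort -n only looks at the number at the start of each line.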
Finis
Extended globs have been in bash since version 2.02, so any bash you’re likely to run into has them. Right now, OS X comes with 3.2.57 and you can install 4.3.42 from Homebrew, so no looking back on my MacBook. Don’t forget to turn on extglob, since it’s not enabled by default. (You can add it to your .bash_profile to use it in all new terminals.)
9280 ~
echo 'shopt -s extglob' >> ~/.bash_profile
9281 ~
shopt extglob # in a new terminal
extglob on
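Running that echo twice leaves two copies of the line in the file. If that bothers you, a guard with grep avoids it. This is just a sketch: add_once is a hypothetical helper, and a temp file stands in for ~/.bash_profile so the demo doesn’t touch the real one.

```shell
#!/usr/bin/env bash
# Stand-in for ~/.bash_profile so the sketch leaves the real file alone.
profile=$(mktemp)
line='shopt -s extglob'

# Hypothetical helper: append the line only if it isn't already there.
add_once() {
  # -q quiet, -x whole-line match, -F fixed string (no regex surprises)
  grep -qxF "$line" "$profile" || echo "$line" >> "$profile"
}

add_once
add_once    # the second call finds the line and does nothing

count=$(grep -cxF "$line" "$profile")
echo "occurrences: $count"

rm -f "$profile"
```

To use it for real, point profile at ~/.bash_profile and drop the rm at the end.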
Happy hacking! Glob on.