Consider the following file structure:

.
├── lib
│   ├── a.go
│   ├── b.go
│   ├── c.go
│   └── nested
│       ├── kk.go
│       ├── ll.go
│       └── mm.go
├── main.go
└── vendor
    └── github.com
        └── cirocosta
            └── test
                └── something.go

Naturally, if we search for any Go file under the current root, we’ll be performing our search against the vendor directory as well:

find . -name "*.go"

        ./lib/a.go
        ./lib/b.go
        ./lib/c.go
        ./lib/nested/kk.go
        ./lib/nested/mm.go
        ./lib/nested/ll.go
        ./main.go
        ./vendor/github.com/cirocosta/test/something.go    <- NOT WANTED

To get rid of that, i.e., to have find(1) exclude specific directories, we can use its -not operator, which acts like the negation operator (!) you’d find in most programming languages:

man find

     -not expression
             This is the unary NOT operator.  
             It evaluates to true if the expression 
             is false.

i.e., -not negates the expression that immediately follows it.
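
As a side note, -not is an extension that both GNU and BSD find accept; the spelling POSIX specifies is a bare !. Here is a minimal sketch to check that the two behave the same, using a throwaway directory with two made-up files (a.go and notes.txt are only there for illustration):

```shell
# Throwaway directory with one Go file and one non-Go file (hypothetical names).
tmp=$(mktemp -d)
touch "$tmp/a.go" "$tmp/notes.txt"

# -not negates the expression that follows it:
# "every regular file whose name does NOT match *.go".
with_not=$(cd "$tmp" && find . -type f -not -name "*.go")

# The POSIX spelling of the same operator.
with_bang=$(cd "$tmp" && find . -type f ! -name "*.go")

echo "$with_not"
echo "$with_bang"

rm -rf "$tmp"
```

Both commands should list only ./notes.txt.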

Given that we want to exclude paths matching a pattern, we can use the -path operator, which takes a pattern and evaluates whether the path being examined matches it:

     -path pattern
             True if the pathname being examined matches 
             pattern.  

For instance, with the previous file structure:

find . -path "./vendor/*"

        ./vendor/github.com
        ./vendor/github.com/cirocosta
        ./vendor/github.com/cirocosta/test
        ./vendor/github.com/cirocosta/test/something.go
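
One thing worth keeping in mind: -path matches the pattern against the whole path as find prints it, and * also matches /. So "./vendor/*" only covers the top-level vendor directory. Here is a small sketch, assuming a hypothetical second vendor directory nested under lib (not part of the tree above):

```shell
# Hypothetical layout: one vendor directory at the root, another nested under lib.
tmp=$(mktemp -d)
mkdir -p "$tmp/vendor/pkg" "$tmp/lib/vendor/pkg"
touch "$tmp/main.go" "$tmp/vendor/pkg/x.go" "$tmp/lib/vendor/pkg/y.go"

# Anchored at the search root: only ./vendor/... paths match.
anchored=$(cd "$tmp" && find . -path "./vendor/*" -name "*.go" | LC_ALL=C sort)

# Leading wildcard: matches a vendor directory at any depth.
anywhere=$(cd "$tmp" && find . -path "*/vendor/*" -name "*.go" | LC_ALL=C sort)

echo "anchored:"; echo "$anchored"
echo "anywhere:"; echo "$anywhere"

rm -rf "$tmp"
```

The sort is only there to make the output order predictable; find itself makes no ordering promise.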

Now, excluding that set of results from the total results:

find . -name "*.go" -not -path "./vendor/*"

         ./lib/a.go
         ./lib/b.go
         ./lib/c.go
         ./lib/nested/kk.go
         ./lib/nested/mm.go
         ./lib/nested/ll.go
         ./main.go

We get exactly the *.go files that we wanted.
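
An alternative worth knowing is -prune. With -not -path, find still walks everything under vendor and merely filters those results out; -prune tells it not to descend into the directory at all, which can matter when vendor is large. The sketch below recreates the tree from above in a temporary directory and checks that both forms list the same files:

```shell
# Recreate the example tree from the article in a throwaway directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/lib/nested" "$tmp/vendor/github.com/cirocosta/test"
touch "$tmp/main.go" \
      "$tmp/lib/a.go" "$tmp/lib/b.go" "$tmp/lib/c.go" \
      "$tmp/lib/nested/kk.go" "$tmp/lib/nested/ll.go" "$tmp/lib/nested/mm.go" \
      "$tmp/vendor/github.com/cirocosta/test/something.go"

# Filter version: vendor is traversed, its entries are rejected afterwards.
filtered=$(cd "$tmp" && find . -name "*.go" -not -path "./vendor/*" | LC_ALL=C sort)

# Prune version: when the path is ./vendor, stop descending; otherwise (-o)
# apply the -name test. The explicit -print is needed once -o is involved,
# or the pruned directory itself would be printed too.
pruned=$(cd "$tmp" && find . -path ./vendor -prune -o -name "*.go" -print | LC_ALL=C sort)

echo "$pruned"

rm -rf "$tmp"
```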

Now, given that we have the files, how do we get the directories? For that, we can use the dirname(1) utility.

To tie dirname with find, we now have three options:

  1. pipe the output from find to dirname;
  2. make use of the -exec operator from find to invoke dirname with the right parameters;
  3. pipe the output from find to xargs and then construct the right executions of dirname with it.

The first option can already be crossed out - dirname doesn’t operate on values supplied to it via stdin.
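
A quick way to convince yourself of that (the path below is only a string; dirname never touches the filesystem):

```shell
# Passed as an argument, dirname does its job.
from_arg=$(dirname ./lib/nested/kk.go)
echo "$from_arg"

# Fed through a pipe, dirname never reads stdin and exits with an error
# because it was given no operand.
if echo ./lib/nested/kk.go | dirname >/dev/null 2>&1; then
  from_stdin="ok"
else
  from_stdin="failed"
fi
echo "$from_stdin"
```

The first echo should print ./lib/nested, and the second should print failed.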

The second and third options are practically the same, with the exception that the latter involves a separate program (xargs(1)).

# find -exec version
find . \
        -name "*.go" \
        -not -path "./vendor/*" \
        -exec dirname {} \; 

        ./lib
        ./lib
        ./lib
        ./lib/nested
        ./lib/nested
        ./lib/nested
        .

# find | xargs version
find . \
        -name "*.go" \
        -not -path "./vendor/*" | \
        xargs -I {} dirname {}   

        ./lib
        ./lib
        ./lib
        ./lib/nested
        ./lib/nested
        ./lib/nested
        .

Personally, I prefer xargs because it avoids that weird \; terminator of -exec, which I never remember. Memorizing the xargs syntax is just easier for me.
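
If paths may contain whitespace or quotes, a more defensive variant of the xargs version is to have find separate names with NUL bytes (-print0) and have xargs read them that way (-0). Both flags are supported by GNU and BSD tools. Here is a sketch with a made-up directory name containing a space:

```shell
# Hypothetical tree with a space in a directory name.
tmp=$(mktemp -d)
mkdir -p "$tmp/my lib"
touch "$tmp/main.go" "$tmp/my lib/a.go"

# -print0 ends each path with a NUL byte instead of a newline; xargs -0 splits
# on NUL, so spaces and quotes in names are passed through untouched.
# -n 1 runs dirname once per path, like -I {} does in the article's version.
dirs=$(cd "$tmp" && find . -name "*.go" -print0 | xargs -0 -n 1 dirname | LC_ALL=C sort)

echo "$dirs"

rm -rf "$tmp"
```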

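Having those directories, it’s now a matter of removing the duplicates. No secret here: pipe to sort -u. (uniq alone only collapses adjacent duplicate lines, and find makes no promise about the order of its output.)

find . \
        -name "*.go" \
        -not -path "./vendor/*" | \
        xargs -I {} dirname {} | \
        sort -u

        .
        ./lib
        ./lib/nested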
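
A quick way to see how uniq and sort -u differ when duplicates are not next to each other (the input below is made up for illustration, mimicking a find that happens to list ./lib entries on both sides of another directory):

```shell
# Three lines, with the two ./lib entries separated by another directory.
input=$(printf './lib\n.\n./lib')

# uniq only drops a line when it equals the line right before it,
# so both ./lib entries survive.
with_uniq=$(printf '%s\n' "$input" | uniq)

# sort -u sorts first, so equal lines always end up adjacent and get merged.
with_sort_u=$(printf '%s\n' "$input" | LC_ALL=C sort -u)

echo "uniq:";    echo "$with_uniq"
echo "sort -u:"; echo "$with_sort_u"
```

With this input, uniq leaves all three lines in place while sort -u reduces them to two.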

Closing thoughts

How about you? Do you usually make use of find paired with other tools to script stuff around or use something else?

Also, please let me know if you spot any errors. I’m @cirowrc on Twitter.

Have a good one!
