TransWikia.com

How to find directories where certain files exist but some others don't

Unix & Linux Asked by John Zhau on October 6, 2020

Let’s say I have lots of directories for reports and I always have a report.md file in directories with reports but report.pdf only in finished reports. I want to find directories where report.md exists but without report.pdf.

reports_dir
|
+-- project_xyz
|   +-- report.md
|   +-- image.png
|
+-- bugs
|   +-- report.md
|   +-- report.pdf
|   +-- Makefile
|
+-- homework
    +-- report.pdf
    +-- test.md

If I call such a function as search "reports_dir" --include "report.md" --exclude "report.pdf", it should return /project_xyz but not /bugs or /homework.
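To try the answers on the exact layout above, the tree can be recreated with a few lines of POSIX sh. This is only a sketch: the helper name make_reports_fixture is made up for illustration, and all files are created empty.

```shell
# make_reports_fixture [ROOT]  (hypothetical helper, not part of the question)
# Recreates the example tree with empty files under ROOT (default: reports_dir).
make_reports_fixture() {
    root=${1:-reports_dir}
    mkdir -p "$root/project_xyz" "$root/bugs" "$root/homework" || return 1
    # ": > file" creates (or truncates) an empty file
    : > "$root/project_xyz/report.md"
    : > "$root/project_xyz/image.png"
    : > "$root/bugs/report.md"
    : > "$root/bugs/report.pdf"
    : > "$root/bugs/Makefile"
    : > "$root/homework/report.pdf"
    : > "$root/homework/test.md"
}
```

Calling make_reports_fixture with no argument creates reports_dir in the current directory.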

What’s the quickest way to do such a search in sh/bash? (one-liner preferred)

5 Answers

Maybe this will work?

comm -23 <(find -name "report.md" -printf '%h\n' | sort) <(find -name "report.pdf" -printf '%h\n' | sort)
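The same set-difference idea can be written without bash process substitution or the GNU-specific -printf. A rough sketch, assuming mktemp is available and no path contains a newline; the function name dirs_with_but_without is invented for this example:

```shell
# dirs_with_but_without ROOT INCLUDE EXCLUDE  (hypothetical helper)
# Prints directories under ROOT that contain INCLUDE but not EXCLUDE.
dirs_with_but_without() {
    tmp=$(mktemp -d) || return 1
    # One directory name per matching file; comm needs sorted input.
    find "$1" -name "$2" -exec dirname {} \; | sort > "$tmp/with"
    find "$1" -name "$3" -exec dirname {} \; | sort > "$tmp/without"
    # -23 suppresses columns 2 and 3, leaving lines unique to the first file.
    comm -23 "$tmp/with" "$tmp/without"
    rm -rf "$tmp"
}
```

For the example tree, dirs_with_but_without reports_dir report.md report.pdf should list only reports_dir/project_xyz.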

Answered by nezabudka on October 6, 2020

With zsh:

print -rC1 -- **/report.md(DN^e['[[ -e $REPLY:r.pdf ]]']:h)

Would print raw on 1 Column the head (dirname) of the report.md files under any level of subdirectories (**/) including Dot (hidden) ones, for which the evaluation of the [[ -e $REPLY:r.pdf ]] code does not (^) return true (where $REPLY:r is the root name of $REPLY containing the file to check).

With find implementations (such as GNU find) that allow {} embedded in -exec arguments and any Bourne-like shell (including bash), you can do:

find . -type d -exec test -e '{}/report.md' \; \
             ! -exec test -e '{}/report.pdf' \; -print

Though that means executing up to 2 test commands per directory.

With bosh and its find builtin, that can be avoided with:

find . -type d -call '
  [ -e "$1/report.md" ] && [ ! -e "$1/report.pdf" ]' {} \; -print

Where this time, -call has the shell interpret that code directly and invokes the builtin [.

POSIXly:

find . -type d -exec sh -c '
  for d do
    [ ! -e "$d/report.md" ] || [ -e "$d/report.pdf" ] || printf "%s\n" "$d"
  done' sh {} +

With those sh implementations where [ and printf are builtin (most, these days), that should be relatively efficient, since find invokes as few sh processes as possible to process the list of directories it finds.
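The POSIX form can also be wrapped to take the directory and file names as parameters, roughly matching the search interface from the question (positional arguments rather than --include/--exclude). This is a sketch; search_dirs is an illustrative name only:

```shell
# search_dirs ROOT INCLUDE EXCLUDE  (hypothetical wrapper)
search_dirs() {
    find "$1" -type d -exec sh -c '
        # The first two arguments are the file names; the rest are directories.
        include=$1 exclude=$2
        shift 2
        for d do
            if [ -e "$d/$include" ] && [ ! -e "$d/$exclude" ]; then
                printf "%s\n" "$d"
            fi
        done' sh "$2" "$3" {} +
}
```

The if form is used here instead of a && chain so that the inline script does not end with a non-zero status when the last directory simply fails to match, which find would otherwise reflect in its own exit status.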

Answered by Stéphane Chazelas on October 6, 2020

You can use a short bash one-liner to test for the existence of the PDF (-exec also acts as a filter):

find . -name '*.md' -exec bash -c '[[ -f "${1%.*}.pdf" ]] && exit 1; exit 0;' bash-line {} \; -print

If you only want the directory name, you can put a second -exec after the first one (find only runs it when the first -exec succeeds):

find . -name '*.md' -exec bash -c '[[ -f "${1%.*}.pdf" ]] && exit 1; exit 0;' bash-line {} \; -exec dirname {} \;
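The part doing the work in both commands is the parameter expansion ${1%.*}, which strips the shortest trailing .something from the path before .pdf is appended. A small sketch of just that step (md_to_pdf is a name made up for this illustration):

```shell
# md_to_pdf PATH  (hypothetical helper)
# Prints PATH with its last extension replaced by .pdf.
md_to_pdf() {
    # ${1%.*} removes the shortest suffix matching ".*", i.e. the extension
    printf '%s\n' "${1%.*}.pdf"
}

# Example: md_to_pdf ./project_xyz/report.md prints ./project_xyz/report.pdf
```

Because find only passes names matching '*.md', the path always ends in an extension, so the expansion never cuts into a directory name.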

Answered by xenoid on October 6, 2020

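You could try this, although the output could be a bit messy:

ls -R | grep report.md | grep -v report.pdf

Here ls -R lists directories recursively, grep report.md keeps the lines matching report.md, and grep -v report.pdf drops any line containing report.pdf. Note that both filters work line by line on the listing, so the output shows matching file names rather than the directories that lack report.pdf.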

Answered by honeymoly on October 6, 2020

for f in "$1"/*/"$2"; do dir=$(dirname -- "$f"); [ -f "$dir/$3" ] || printf '%s\n' "$dir"; done

Save the script as search. The first argument is the parent directory, the second is the include and the third the exclude.

sh search "reports_dir" "report.md" "report.pdf"
  • "$1"/*/"$2" expands only to the paths of the matching files (report.md in the example) one level below the parent directory.

  • dir=$(dirname -- "$f") gets the directory path of those files (assumes it doesn't end in newline characters).

  • [ -f "$dir/$3" ] tests that the file named by the 3rd argument (report.pdf) exists in that directory and that it is a regular file (not a directory, fifo, device...).

  • || printf '%s\n' "$dir" if it does not exist (or is not a regular file), print the directory name.
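One edge case the loop does not cover: when no subdirectory contains the include file, the glob is left unexpanded and the loop runs once on the literal pattern. A possible variant with a guard, written as a function; the name search_one_level and the guard itself are additions for illustration, not part of the answer's script:

```shell
# search_one_level PARENT INCLUDE EXCLUDE  (hypothetical variant of the script)
search_one_level() {
    for f in "$1"/*/"$2"; do
        # Skip the literal, unexpanded pattern when nothing matched.
        [ -e "$f" ] || continue
        dir=$(dirname -- "$f")
        # Print the directory unless EXCLUDE exists there as a regular file.
        [ -f "$dir/$3" ] || printf '%s\n' "$dir"
    done
}
```

Like the original loop, this only looks one level below the parent directory and skips hidden subdirectories, since * does not match names starting with a dot.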

Answered by Quasímodo on October 6, 2020
