Text Processing: Grep - compare two files

From WikiMLT

Source of the ar­ti­cle: Ask Ubun­tu: Com­par­ing con­tents of two files.

Get on­ly the lines that ex­ist in file1 but not in file2:

grep -Fxvf file2 file1 > diff_file

Where:

  • -F, --fixed-strings – PAT­TERNS are strings,
  • -x, --line-regexp – match on­ly whole lines,
  • -v, --invert-match – se­lect non-match­ing lines,
  • -f, --file=FILE – take PAT­TERNS from FILE.

An in­line script for two ways com­par­i­son:

FILE1="file1"; FILE2="file2"; \
cat <(echo -e "\nOnly in $FILE1") \
    <(grep -Fvxf "$FILE2" "$FILE1") \
    <(echo -e "\nOnly in $FILE2") \
    <(grep -Fvxf "$FILE1" "$FILE2")

From the com­ments:

  • The prob­lem with this so­lu­tion is that it'll go su­per slow if you've got long files (it's O(N^2) on the length of the longer file). Sort­ing first and us­ing some­thing like diff or comm will be O(N log N).