Text Processing: Grep - compare two files

From WikiMLT

Revision as of 11:16, 4 August 2022

Source of the article: Ask Ubuntu: Comparing contents of two files.

Get only the lines that exist in file1 but not in file2:

grep -Fxvf file2 file1 > diff_file

Where:

  • -F, --fixed-strings – interpret PATTERNS as fixed strings, not regular expressions,
  • -x, --line-regexp – match only whole lines,
  • -v, --invert-match – select non-matching lines,
  • -f, --file=FILE – take PATTERNS from FILE, one per line.
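
To make the effect of these options concrete, here is a small sketch on made-up data. The temporary directory, the file contents and the variable names are assumptions for illustration only. The second command drops -x to show why whole-line matching matters: the pattern "app" then matches "apple" as a substring and hides it.

```shell
# Hypothetical sample data in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$tmpdir/file1"
printf '%s\n' banana date app     > "$tmpdir/file2"

# Lines of file1 that do not appear as whole lines in file2.
only_in_file1=$(grep -Fxvf "$tmpdir/file2" "$tmpdir/file1")
printf '%s\n' "$only_in_file1"    # prints: apple, cherry (one per line)

# Without -x the patterns match substrings, so "app" also removes "apple".
without_x=$(grep -Fvf "$tmpdir/file2" "$tmpdir/file1")
printf '%s\n' "$without_x"        # prints: cherry
```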

An inline script for a two-way comparison:

FILE1="file1"; FILE2="file2"; \
cat <(echo -e "\nOnly in $FILE1") \
    <(grep -Fvxf "$FILE2" "$FILE1") \
    <(echo -e "\nOnly in $FILE2") \
    <(grep -Fvxf "$FILE1" "$FILE2")
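
For repeated use, the same comparison could be wrapped in a function. This is only a sketch: the name two_way_diff is made up here, and printf replaces echo -e and the process substitutions so that it also runs in a plain POSIX shell.

```shell
# Hypothetical helper: two-way comparison of two files with grep.
two_way_diff() {
    if [ "$#" -ne 2 ]; then
        echo "usage: two_way_diff FILE1 FILE2" >&2
        return 2
    fi
    printf '\nOnly in %s\n' "$1"
    grep -Fvxf "$2" "$1"    # lines of FILE1 that are missing from FILE2
    printf '\nOnly in %s\n' "$2"
    grep -Fvxf "$1" "$2"    # lines of FILE2 that are missing from FILE1
    # grep exits with 1 when it selects no lines; that is not an error here.
    return 0
}
```

It would be called as two_way_diff file1 file2. Note that the final return 0 also hides real grep errors such as an unreadable file.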

From the comments:

  • The problem with this solution is that it'll go super slow if you've got long files (it's O(N^2) on the length of the longer file). Sorting first and using something like diff or comm will be O(N log N).
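
The sort-first approach suggested in the comment might look like the sketch below. The function name compare_sorted and the use of temporary files are assumptions, not part of the original answer. Unlike the grep version, the output comes out in sorted order and duplicate lines are collapsed by sort -u.

```shell
# Hypothetical helper: two-way comparison using sort and comm.
compare_sorted() {
    sorted1=$(mktemp) && sorted2=$(mktemp) || return 1
    # comm requires sorted input; LC_ALL=C keeps sort and comm
    # on the same byte-wise ordering.
    LC_ALL=C sort -u "$1" > "$sorted1"
    LC_ALL=C sort -u "$2" > "$sorted2"
    printf '\nOnly in %s\n' "$1"
    LC_ALL=C comm -23 "$sorted1" "$sorted2"   # keep column 1: lines unique to the first file
    printf '\nOnly in %s\n' "$2"
    LC_ALL=C comm -13 "$sorted1" "$sorted2"   # keep column 2: lines unique to the second file
    rm -f "$sorted1" "$sorted2"
}
```

It would be called as compare_sorted file1 file2.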