Text Processing: Grep - compare two files: Difference between revisions
From WikiMLT
Revision as of 10:16, 4 August 2022
Source of the article: Ask Ubuntu: Comparing contents of two files.
Get only the lines that exist in file1 but not in file2:
grep -Fxvf file2 file1 > diff_file
Where:
- -F, --fixed-strings - PATTERNS are strings,
- -x, --line-regexp - match only whole lines,
- -v, --invert-match - select non-matching lines,
- -f, --file=FILE - take PATTERNS from FILE.
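To see how the four options combine, here is a minimal sketch on two small sample files. The file contents, the temporary directory and the variable names (tmpdir, only_in_file1) are assumptions made up for illustration and are not part of the original article.

```shell
# Create two small sample files in a temporary directory (hypothetical data).
tmpdir="$(mktemp -d)"
printf '%s\n' apple banana cherry > "$tmpdir/file1"
printf '%s\n' banana date > "$tmpdir/file2"

# Every line of file2 is used as a fixed-string, whole-line pattern (-F -x -f);
# -v then keeps only the lines of file1 that match none of those patterns.
only_in_file1="$(grep -Fxvf "$tmpdir/file2" "$tmpdir/file1")"

printf '%s\n' "$only_in_file1"
# prints:
# apple
# cherry

# Remove the sample files again.
rm -rf "$tmpdir"
```

The line "banana" appears in both files, so it is dropped. "date" exists only in file2 and is never printed, because grep prints lines from file1 only.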
An inline script for a two-way comparison:
FILE1="file1"; FILE2="file2"; \
cat <(echo -e "\nOnly in $FILE1") \
<(grep -Fvxf "$FILE2" "$FILE1") \
<(echo -e "\nOnly in $FILE2") \
<(grep -Fvxf "$FILE1" "$FILE2")
From the comments:
- The problem with this solution is that it'll go super slow if you've got long files (it's O(N^2) on the length of the longer file). Sorting first and using something like diff or comm will be O(N log N).
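The sort-then-compare approach suggested in the comment could look roughly like the sketch below. It uses comm on sorted copies of the files. The sample data, the temporary directory and the variable names are assumptions for illustration. Temporary sorted files are used instead of process substitution so that the sketch does not depend on bash.

```shell
# Hypothetical sample data in a temporary directory.
tmpdir="$(mktemp -d)"
printf '%s\n' cherry apple banana > "$tmpdir/file1"
printf '%s\n' date banana > "$tmpdir/file2"

# comm requires both inputs to be sorted in the same collation order,
# so sort and comm are run under the same locale (LC_ALL=C).
LC_ALL=C sort "$tmpdir/file1" > "$tmpdir/file1.sorted"
LC_ALL=C sort "$tmpdir/file2" > "$tmpdir/file2.sorted"

# -23 suppresses columns 2 and 3, leaving lines unique to the first file.
only_in_file1="$(LC_ALL=C comm -23 "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")"
# -13 suppresses columns 1 and 3, leaving lines unique to the second file.
only_in_file2="$(LC_ALL=C comm -13 "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")"

printf 'Only in file1:\n%s\n' "$only_in_file1"
printf 'Only in file2:\n%s\n' "$only_in_file2"

# Remove the sample files again.
rm -rf "$tmpdir"
```

Two differences from the grep version are worth keeping in mind. The output comes out in sorted order, not in the original line order of the files. Duplicates are also handled differently: comm pairs lines one to one, so a line that occurs twice in file1 and once in file2 is still reported once as unique to file1, whereas grep -Fxvf drops every occurrence of it.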