Text Processing: Grep - compare two files
From WikiMLT
Latest revision as of 10:17, 4 August 2022
Source of the article: Ask Ubuntu: Comparing contents of two files (https://askubuntu.com/a/1030419/566421).
Get only the lines that exist in file1 but not in file2:
grep -Fxvf file2 file1 > diff_file
Where:
- -F, --fixed-strings – PATTERNS are fixed strings, not regular expressions,
- -x, --line-regexp – match only whole lines,
- -v, --invert-match – select non-matching lines,
- -f, --file=FILE – take PATTERNS from FILE, one per line.
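To see the four options working together, here is a small sketch that runs the same command on two throwaway files. The file contents and the variable names (tmpdir, only_in_file1) are made up for illustration and are not part of the original answer.

```shell
# Work in a temporary directory so no real files are touched (sample data is hypothetical).
tmpdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$tmpdir/file1"
printf '%s\n' banana date > "$tmpdir/file2"

# Every line of file2 becomes a fixed-string, whole-line pattern;
# -v then keeps only the lines of file1 that match none of them.
only_in_file1=$(grep -Fxvf "$tmpdir/file2" "$tmpdir/file1")
printf '%s\n' "$only_in_file1"

rm -r "$tmpdir"
```

This prints apple and cherry. Without -x, a line in file2 such as app would also suppress apple, because fixed-string patterns otherwise match anywhere inside a line.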
An inline script for a two-way comparison:
FILE1="file1"; FILE2="file2"; \
cat <(echo -e "\nOnly in $FILE1") \
<(grep -Fvxf "$FILE2" "$FILE1") \
<(echo -e "\nOnly in $FILE2") \
<(grep -Fvxf "$FILE1" "$FILE2")
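If the comparison is needed more than once, the same logic could be wrapped in a shell function. This is only a sketch: the name two_way_diff is hypothetical, and printf replaces echo -e because its handling of \n is more predictable across shells.

```shell
# two_way_diff FIRST SECOND
# Hypothetical helper: prints the lines unique to each of the two files.
two_way_diff() {
    local first=$1 second=$2

    printf '\nOnly in %s\n' "$first"
    # grep exits with status 1 when it selects no lines; accept that,
    # but let real errors (status 2, e.g. a missing file) propagate.
    grep -Fvxf "$second" "$first" || [ "$?" -eq 1 ] || return 2

    printf '\nOnly in %s\n' "$second"
    grep -Fvxf "$first" "$second" || [ "$?" -eq 1 ] || return 2
}
```

It would be called as two_way_diff file1 file2 and produces the same two sections as the inline script.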
From the comments:
- The problem with this solution is that it'll go super slow if you've got long files (it's O(N^2) on the length of the longer file). Sorting first and using something like diff or comm will be O(N log N).
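A minimal sketch of the sort-then-comm approach the comment suggests, again on hypothetical sample files. It assumes that losing the original line order is acceptable, since comm only works on sorted input.

```shell
# Hypothetical sample data in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' cherry apple banana > "$tmpdir/file1"
printf '%s\n' date banana > "$tmpdir/file2"

# comm requires sorted input; use the same locale for sort and comm.
LC_ALL=C sort "$tmpdir/file1" > "$tmpdir/file1.sorted"
LC_ALL=C sort "$tmpdir/file2" > "$tmpdir/file2.sorted"

# -2 drops lines unique to the second file, -3 drops lines common to both,
# leaving only the lines unique to the first file.
only_in_file1_sorted=$(LC_ALL=C comm -23 "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")
printf '%s\n' "$only_in_file1_sorted"

rm -r "$tmpdir"
```

This prints apple and cherry, in sorted order. Duplicate lines are also handled differently: comm pairs lines off one by one, while the grep command drops every copy of a line that appears anywhere in the other file.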