Text Processing: Grep - compare two files

Latest revision as of 11:17, 4 August 2022

Source of the article: Ask Ubuntu: Comparing contents of two files (https://askubuntu.com/a/1030419/566421).

Get only the lines that exist in file1 but not in file2:

grep -Fxvf file2 file1 > diff_file

Where:

  • -F, --fixed-strings – interpret PATTERNS as fixed strings, not regular expressions,
  • -x, --line-regexp – match only whole lines,
  • -v, --invert-match – select non-matching lines,
  • -f, --file=FILE – take PATTERNS from FILE, one per line.
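
The effect of the flags can be checked on a small example. The sketch below uses hypothetical sample data written to a temporary directory (the file contents and the variable names are assumptions for illustration, not part of the original answer):

```shell
# Hypothetical sample data in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' apple banana 'cherry pie' abc > "$tmpdir/file1"
printf '%s\n' banana cherry a.c > "$tmpdir/file2"

# Lines of file1 that do not occur as whole lines in file2.
only_in_file1=$(grep -Fxvf "$tmpdir/file2" "$tmpdir/file1")
printf '%s\n' "$only_in_file1"

# For comparison: without -x, the pattern "cherry" also matches "cherry pie".
without_x=$(grep -Fvf "$tmpdir/file2" "$tmpdir/file1")

# For comparison: without -F, the pattern "a.c" is a regex that matches "abc".
without_F=$(grep -xvf "$tmpdir/file2" "$tmpdir/file1")
```

With this data the full command should keep apple, cherry pie and abc. Dropping -x loses cherry pie, and dropping -F loses abc, which is why the three flags are used together.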

An inline script for a two-way comparison:

FILE1="file1"; FILE2="file2"; \
cat <(echo -e "\nOnly in $FILE1") \
    <(grep -Fvxf "$FILE2" "$FILE1") \
    <(echo -e "\nOnly in $FILE2") \
    <(grep -Fvxf "$FILE1" "$FILE2")
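
For repeated use, the same two-way comparison could be wrapped in a shell function. This is only a sketch: the function name compare_files and the sample files are hypothetical, and printf replaces echo -e purely for portability.

```shell
# Hypothetical helper: print the lines that are unique to each of two files.
compare_files() {
    local a=$1 b=$2
    printf '\nOnly in %s\n' "$a"
    # grep exits with status 1 when it selects no lines; accept that case.
    grep -Fxvf "$b" "$a" || [ "$?" -eq 1 ]
    printf '\nOnly in %s\n' "$b"
    grep -Fxvf "$a" "$b" || [ "$?" -eq 1 ]
}

# Usage with hypothetical sample files.
tmpdir=$(mktemp -d)
printf '%s\n' apple banana > "$tmpdir/a"
printf '%s\n' banana date > "$tmpdir/b"
report=$(compare_files "$tmpdir/a" "$tmpdir/b")
printf '%s\n' "$report"
```

The function still fails with a non-zero status if grep reports a real error, such as a missing file.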

From the comments:

  • The problem with this solution is that it'll go super slow if you've got long files (it's O(N^2) on the length of the longer file). Sorting first and using something like diff or comm will be O(N log N).
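
The sort-then-comm approach suggested in the comment might look like the sketch below. The sample data and variable names are assumptions for illustration. LC_ALL=C is set so that sort and comm agree on the collation order.

```shell
# Hypothetical sample data in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' cherry apple banana > "$tmpdir/file1"
printf '%s\n' date banana > "$tmpdir/file2"

# comm expects sorted input, so sort each file into a temporary copy first.
LC_ALL=C sort "$tmpdir/file1" > "$tmpdir/file1.sorted"
LC_ALL=C sort "$tmpdir/file2" > "$tmpdir/file2.sorted"

# -23 suppresses columns 2 and 3, leaving the lines found only in file1.
only_in_file1=$(LC_ALL=C comm -23 "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")
# -13 suppresses columns 1 and 3, leaving the lines found only in file2.
only_in_file2=$(LC_ALL=C comm -13 "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")

printf 'Only in file1:\n%s\n' "$only_in_file1"
printf 'Only in file2:\n%s\n' "$only_in_file2"
```

Two differences from the grep version are worth keeping in mind: the output comes out in sorted order rather than in the original line order, and repeated lines are paired off one by one, so a line that appears twice in file1 and once in file2 is still reported once.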