Text Processing: Grep - compare two files: Difference between revisions


Revision as of 11:16, 4 August 2022

Section 1

Source of the article: Ask Ubuntu: Comparing contents of two files (https://askubuntu.com/a/1030419/566421).

Get only the lines that exist in file1 but not in file2:

grep -Fxvf file2 file1 > diff_file

Where:

  • -F, --fixed-strings – treat PATTERNS as fixed strings, not regular expressions,
  • -x, --line-regexp – match only whole lines,
  • -v, --invert-match – select non-matching lines,
  • -f, --file=FILE – read PATTERNS from FILE, one per line.
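
To see why -F and -x matter, here is a minimal sketch that runs the same command on made-up sample data. The file contents and the temporary directory are assumptions for illustration only; they are not part of the original answer.

```shell
# Work in a throwaway directory so the sample files do not clobber real ones.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: as a regular expression "a.c" would also match
# "abc", and as a substring it also occurs inside "a.c.bak".
printf '%s\n' 'a.c' 'abc' 'a.c.bak' 'only-here' > file1
printf '%s\n' 'a.c' > file2

# -F: "a.c" is taken literally, so "abc" does not count as a match.
# -x: the whole line must match, so "a.c.bak" does not count as a match.
# -v: print the lines of file1 that matched no pattern from file2.
grep -Fxvf file2 file1 > diff_file

cat diff_file
```

Only the exact line a.c is filtered out; the three remaining lines of file1 end up in diff_file. Dropping -F or -x would remove more lines than intended.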

An inline script for a two-way comparison:

FILE1="file1"; FILE2="file2"; \
cat <(echo -e "\nOnly in $FILE1") \
    <(grep -Fvxf "$FILE2" "$FILE1") \
    <(echo -e "\nOnly in $FILE2") \
    <(grep -Fvxf "$FILE1" "$FILE2")

From the comments:

  • The problem with this solution is that it'll go super slow if you've got long files (it's O(N^2) on the length of the longer file). Sorting first and using something like diff or comm will be O(N log N).
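
The sort-then-compare approach suggested in that comment could look roughly like the sketch below. It uses sort and comm with temporary files; the sample data, the .sorted file names, and the variable names are assumptions made for this illustration.

```shell
# Work in a throwaway directory so the sample files do not clobber real ones.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data.
printf '%s\n' 'pear' 'apple' 'fig' > file1
printf '%s\n' 'fig' 'kiwi' 'apple' > file2

# comm expects both inputs sorted in the same collation order,
# so the locale is pinned for sort and comm alike.
LC_ALL=C sort file1 > file1.sorted
LC_ALL=C sort file2 > file2.sorted

# -2 hides lines unique to the second file, -3 hides common lines,
# which leaves the lines found only in the first file (and vice versa for -13).
only_in_file1=$(LC_ALL=C comm -23 file1.sorted file2.sorted)
only_in_file2=$(LC_ALL=C comm -13 file1.sorted file2.sorted)

printf 'Only in file1:\n%s\n' "$only_in_file1"
printf 'Only in file2:\n%s\n' "$only_in_file2"
```

Two differences from the grep version are worth keeping in mind: the result comes out in sorted order rather than in the original file order, and duplicate lines are paired one-to-one, so a line that occurs twice in file1 and once in file2 is still reported once as unique to file1.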