Using EmEditor, I want to delete every repeated instance of a string that occupies a full line, together with the line above it. For example, in the text below the repeated string is Cyperus esculentus (it could be anything else), and I want all of its instances deleted, including the preceding line, i.e. the language code. So far, what I have figured out is something like this:
.{2,3} \nCyperus esculentus\n
The problem is that I have to edit the regex for each text so that it contains whichever string is repeated in that particular text.
ar
سعد لذيذ
ast
Cyperus esculentus
azb
یئمهلی توپالاق
az
Yeməli topalaq
bo
ཆུ་འབྲུམ།
ca
Xufa
ceb
Cyperus esculentus
cs
Šáchor jedlý
de
Erdmandel
en
Cyperus esculentus
eo
Cyperus esculentus
es
Cyperus esculentus
eu
Bedaur
fa
اویار سلام زرد
fr
Souchet comestible
gl
Xunca doce
ha
Aya
he
גומא נאכל
id
Cyperus esculentus
it
Cyperus esculentus
ja
ショクヨウガヤツリ
la
Cyperus esculentus
nl
Knolcyperus
nv
Tłʼohigaaí
pl
Cibora jadalna
pt
Cyperus esculentus
ru
Чуфа
srn
Affo
sv
Jordmandel
th
แห้วไทย
tr
Yer bademi
uk
Смикавець їстівний
uz
Yerbodom
vi
Củ gấu tàu
war
Cyperus esculentus
zh
油莎草
The expected result is what is left after applying the regex I mentioned above (to clarify, in these texts there is only one string that is repeated, so the regex does not have to look for multiple different repeated strings):
ar
سعد لذيذ
azb
یئمهلی توپالاق
az
Yeməli topalaq
bo
ཆུ་འབྲུམ།
ca
Xufa
cs
Šáchor jedlý
de
Erdmandel
eu
Bedaur
fa
اویار سلام زرد
fr
Souchet comestible
gl
Xunca doce
ha
Aya
he
גומא נאכל
ja
ショクヨウガヤツリ
nl
Knolcyperus
nv
Tłʼohigaaí
pl
Cibora jadalna
ru
Чуфа
srn
Affo
sv
Jordmandel
th
แห้วไทย
tr
Yer bademi
uk
Смикавець їстівний
uz
Yerbodom
vi
Củ gấu tàu
zh
油莎草
This is what worked for me:
document.selection.StartOfDocument(false);
document.DeleteDuplicates("",eeIncludeAll);
document.selection.Replace("([a-z]{2,3} \\n)([a-z]{2,3} \\n)","\\2",eeFindReplaceCase | eeReplaceAll | eeFindReplaceRegExp,0);
document.selection.Replace("([a-z]{2,3} \\n)([a-z]{2,3} \\n)","\\2",eeFindReplaceCase | eeReplaceAll | eeFindReplaceRegExp,0);
document.selection.Replace("([a-z]{2,3} \\n)([a-z]{2,3} \\n)","\\2",eeFindReplaceCase | eeReplaceAll | eeFindReplaceRegExp,0);
In the Filter toolbar, select 1 from the Number of Additional Visible Lines Above Matched Lines, enter Cyperus esculentus, and press the Enter key.
Make sure the Block Multiple Changes button is clear (NOT set) in the same toolbar.
Select Select All and Delete on the Edit menu (or press Ctrl + A, Delete when the keyboard focus is in the editor).
If you would like to use a macro, here is the macro for you:
fs = document.filters;
fs.Clear();
fs.AddFind( "Cyperus esculentus", eeFindReplaceCase, 0 );
fs.VisibleLinesAbove = 1;
fs.VisibleLinesBelow = 0;
document.filters = fs;
document.selection.SelectAll();
document.selection.Delete();
fs.Clear();
document.filters = fs;
You can run this macro after you open your data file. To do this, save this code as, for instance, Filter.jsee, and then select this file from Select... in the Macros menu. Finally, open your data file, and select Run in the Macros menu while your data file is active. Make sure the Block Multiple Changes button is clear before you run the macro.
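For readers who want to check the expected outcome outside EmEditor, here is a minimal sketch in plain JavaScript (Node.js or a browser, not an EmEditor macro) of what the filter-and-delete steps amount to: every line containing the search string is removed along with the one line above it. The function name removeMatchesWithLineAbove and the sample data are made up for illustration; the sketch assumes lines are separated by "\n" and, like the filter above, matches case-sensitively.

```javascript
// Plain JavaScript sketch, NOT EmEditor macro code.
// removeMatchesWithLineAbove is a hypothetical helper name.
function removeMatchesWithLineAbove(text, needle) {
  const lines = text.split("\n");
  const drop = new Set();
  lines.forEach((line, i) => {
    if (line.includes(needle)) {
      drop.add(i);                // the matched line itself
      if (i > 0) drop.add(i - 1); // one additional line above it
    }
  });
  // Keep only the lines that were not marked for deletion.
  return lines.filter((_, i) => !drop.has(i)).join("\n");
}

// Small made-up sample in the same "code, name" layout as the question.
const filterSample = [
  "ast", "Cyperus esculentus",
  "ca", "Xufa",
  "en", "Cyperus esculentus",
].join("\n");

console.log(removeMatchesWithLineAbove(filterSample, "Cyperus esculentus"));
```

With this sample, only the ca / Xufa pair is left.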
References: EmEditor Macro Reference: Filters Collection
Updates
I understand that "Cyperus esculentus" could be any other phrase. Assuming the duplicates always appear on even-numbered lines, here is the macro you can use instead. This macro selects all even-numbered lines, bookmarks the duplicates among the selected lines, and deletes all bookmarked lines (plus the one line above each). Make sure the Block Multiple Changes button is clear before you run the macro.
editor.ExecuteCommandByID(4323); // clear all bookmarks
document.selection.StartOfDocument(false);
editor.ExecuteCommandByID(4208); // No Wrap
nLines = document.GetLines();
document.selection.LineDown(false,1);
for( i = 0; i < nLines; i += 2 ) {
    editor.ExecuteCommandByID(4153); // select character
    document.selection.CharRight(false,1);
    editor.ExecuteCommandByID(4153);
    document.selection.StartOfLine(false,eeLineView | eeLineHomeText);
    document.selection.LineDown(false,2);
}
document.DeleteDuplicates("",eeSortSelectionOnly | eeBookmark | eeIncludeAll); // bookmark all duplicates in selected lines
document.selection.Collapse();
// filter bookmarked lines only
fs = document.filters;
fs.Clear();
fs.AddFind( "", 0, eeExFindBookmarkedOnly );
fs.VisibleLinesAbove = 1;
fs.VisibleLinesBelow = 0;
document.filters = fs;
document.selection.SelectAll();
document.selection.Delete(1); // delete all filtered lines
fs.Clear();
document.filters = fs;
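The logic this second macro relies on can also be sketched in plain JavaScript, which may help when checking the result on a small sample before running the macro on a large file. This is only an illustration under the same assumption as above (the text is a sequence of two-line pairs, with the candidate string on every even-numbered line); it is not EmEditor macro code, and the name removeRepeatedPairs is hypothetical.

```javascript
// Plain JavaScript sketch, NOT EmEditor macro code.
// removeRepeatedPairs is a hypothetical helper name.
// Assumes the layout: odd lines = language code, even lines = name.
function removeRepeatedPairs(text) {
  const lines = text.split("\n");

  // Count how often each even-numbered line (index 1, 3, 5, ...) occurs.
  const counts = new Map();
  for (let i = 1; i < lines.length; i += 2) {
    counts.set(lines[i], (counts.get(lines[i]) || 0) + 1);
  }

  // Keep a pair only if its second line occurs exactly once.
  const kept = [];
  for (let i = 0; i < lines.length; i += 2) {
    const value = lines[i + 1];
    if (value !== undefined && counts.get(value) > 1) continue;
    kept.push(lines[i]);
    if (value !== undefined) kept.push(value);
  }
  return kept.join("\n");
}

// Small made-up sample; the repeated string is not hard-coded anywhere.
const pairSample = [
  "ca", "Xufa",
  "ast", "Cyperus esculentus",
  "de", "Erdmandel",
  "en", "Cyperus esculentus",
].join("\n");

console.log(removeRepeatedPairs(pairSample));
```

Unlike the first macro, nothing here names the repeated string: any value that appears on more than one even-numbered line is dropped together with its language code.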