Shortcut: COM:REGEX
This is a list of regular expressions that bots can use for localisation and general fixes. Some of them are fairly trivial and should be combined with other tasks; regexes marked as minor should not be run on their own. If you use any regexes that you would like to share, please add them below.
Everything is case-insensitive unless specified otherwise. The expressions should be executed from top to bottom. If any of them cause problems, please report this on the talk page. They have been reasonably well tested, but there are no guarantees.
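The two conventions above (case-insensitive by default, applied from top to bottom) can be sketched in Python as follows. This is only an illustration: the `Rule` type and `apply_rules` function are hypothetical names, Python's `re` module stands in for whatever engine a bot actually uses, and the two sample rules are copied from the "Junk cleanup" and "Formatting" tables further down.

```python
import re
from typing import NamedTuple

class Rule(NamedTuple):
    find: str
    replace: str
    case_sensitive: bool = False  # page default: case-insensitive

# Two sample rules, kept in the order in which they appear on this page.
RULES = [
    Rule(r"__ *NOTOC *__", "", case_sensitive=True),  # "Unnecessary __NOTOC__"
    Rule(r"\n{3,}", r"\n\n"),                         # "Delete surplus lines"
]

def apply_rules(text, rules=RULES):
    """Apply every rule once, in list order."""
    for rule in rules:
        flags = 0 if rule.case_sensitive else re.IGNORECASE
        text = re.sub(rule.find, rule.replace, text, flags=flags)
    return text

print(apply_rules("__NOTOC__\n\n\n\n== {{int:filedesc}} ==\n"))
```

Order matters here: removing `__NOTOC__` first can leave extra blank lines behind, which the later rule then collapses.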
Localization/Internationalization
Headings
Task
Find
Replace
Notes
"Summary" heading
Add "== {{int:filedesc}} ==" to file pages where it is missing
[Minor]; ideally done after all regex changes
"Summary" heading
(?:Краткое[ _]+)?описание|Beschreibung\,[ _]+Quelle|Quelle|Beschreibung|वर्णन|sumario|descri(ption|pción|ção do arquivo)|achoimriú)( */ *(?:summary|(?:Краткое[ _]+)?описание|Beschreibung\,[ _]+Quelle|Quelle|Beschreibung|वर्णन|sumario|descri(ption|pción|ção do arquivo)|achoimriú))? *\:? *\1</source>
$1 {{int:filedesc}} $1
[MultiLine]
"Licensing" heading
)?(za(?: +d\'uso)?|Лицензирование|li[zcs]en[zcs](e|ing|ia)?(?:\s+information)?( */ *(za(?: +d\'uso)?|Лицензирование|li[zcs]en[zcs](e|ing|ia)?(?:\s+information)?))?|\{\{\s*int:license\s*\}\})(\]\])? *\:? *\1</source>
$1 {{int:license-header}} $1
[MultiLine]
"Original upload log" headings
history)|file ?history|ursprüngliche bild-versionen) *\:? *\1</source>
$1 {{original upload log}} $1
[MultiLine]
Remove duplicate headings
<syntaxhighlight lang="text" enclose="none">^ *(\=+) *(.*?) *\=+ *[\r\n]+\=+ *\2 *\1 *$</source>
$1 $2 $1
[MultiLine]; run multiple times
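As a rough illustration of why the duplicate-heading rule needs to be run multiple times, the sketch below ports the pattern to Python and reapplies it until the text stops changing. The function name `collapse_duplicate_headings` is made up for this example, and Python's `re` module is only a stand-in for the engine a bot actually uses.

```python
import re

# Pattern from the "Remove duplicate headings" row ([MultiLine], case-insensitive).
DUPLICATE_HEADING = re.compile(
    r"^ *(=+) *(.*?) *=+ *[\r\n]+=+ *\2 *\1 *$",
    re.MULTILINE | re.IGNORECASE,
)

def collapse_duplicate_headings(text):
    """Reapply the rule until no further change occurs."""
    while True:
        new_text = DUPLICATE_HEADING.sub(r"\1 \2 \1", text)
        if new_text == text:
            return new_text
        text = new_text

sample = "== Summary ==\n== Summary ==\n== Summary ==\ntext"
print(collapse_duplicate_headings(sample))
```

A single pass only merges the first two of three identical headings, because matches cannot overlap; the loop takes care of the remaining pair.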
Multilingual tags
Task
Find
Replace
Notes
{{Unknown}}
\s*(?:author|artist)\s*=\s*)(?:unknown?|\{\{\s*unknown\s*\}\}|\?+|unkown|unidentified|αγνωστος|sconosciuto|ignoto|desconocido|inconnu|inconnue|not given|not known|desconhecido|unbekannt|неизвестно|Не известен|neznana|nieznany|непознат|okänd|sconossùo|未知|ukjent|onbekend|nich kennt|ലഭ്യമല്ല|непознат|نهناسرا|descoñecido|不明|ignoto|óþekktur|tak diketahui|ismeretlen|nepoznat|לא ידוע|ûnbekend|tuntematon|نامعلوم|teadmata|nekonata|άγνωστος|ukendt|neznámý|desconegut|Неизвестен|ned bekannt|غير معروف|невідомий)\s*?\;?\.?\s*?(\
\r|\n)</source>
$1{{unknown|author}}$2
{{Own}} (part 1)
\s*source\s*=\s*)(?:own work)?\s*(?:-|;|</?br *[/\\]?>)?\s*(?:own(?: work(?: by uploader)?)?|(?:œuvre |travail )?personnel(?:le)?|self[- ]made|création perso|selbst fotografiert|obra pr[òo]pia|trabajo propr?io)\s*?(?:\(own work\))?\.? *(\
\r|\n)</source>
$1{{own}}$2
{{Own}} (part 2)
(\|\s*source\s*=\s*)(?:\{\{\s*[a-z]{2,3} *\|)? *(?:own(?: work(?: by uploader)?)?|travail personnel|self[- ]made|création perso|selbst fotografiert|obra pr[òo]pia|trabajo propr?io) *(?:\}\})? *(?:\{\{\s*[a-z]{2,3} *\|)? *(?:\(?(?:own *work)\)?)? *(?:\}\})?(\||\}\}|\r|\n) (broken! Example: "{{Information | source = selbst fotografiert }}\newline")
$1{{own}}$2
{{Own}} (part 3)
\s*source\s*=\s*)(?:own[^a-z]*work|opera[^a-z]*propria|trabajo[^a-z]*propio|travail[^a-z]*personnel|eigenes[^a-z]*werk|eigen[^a-z]*werk|собственная[^a-z]*работа|投稿者自身による作品|自己的作品|praca[^a-z]*pw[łl]asna|Obra(?:[^a-z]*do)?[^a-z]*pr[oó]prio|Treball[^a-z]*propi|Собствена[^a-z]*творба|Vlastní[^a-z]*dílo|Eget[^a-z]*arbejde|Propra[^a-z]*verko|Norberak[^a-z]*egina|عمل[^a-z]*شخصي|اثر[^a-z]*شخصی|자작|अपना[^a-z]*काम|נוצר[^a-z]*על[^a-z]*ידי[^a-z]*מעלה[^a-z]*היצירה|Karya[^a-z]*sendiri|Vlastito[^a-z]*djelo[^a-z]*postavljača|Mano[^a-z]*darbas|A[^a-z]*feltöltő[^a-z]*saját[^a-z]*munkája|Karya[^a-z]*sendiri|Eget[^a-z]*verk|Oper[aă][^a-z]*proprie|Vlastné[^a-z]*dielo|Lastno[^a-z]*delo|Сопствено[^a-z]*дело|Oma[^a-z]*teos|Eget[^a-z]*arbete|Yükleyenin[^a-z]*kendi[^a-z]*çalışması|Власна[^a-z]*робота|Sariling[^a-z]*gawa|eie[^a-z]*werk|сопствено[^a-z]*дело|Eige[^a-z]*arbeid|პირადი[^a-z]*ნამუშევარი)\;?\.? *(\
\r|\n)</source>
$1{{own}}$2
{{Own}} (part 4)
\s*source\s*=\s*)(((?:\'\'+)?)([\"\']?)(?:selbst\W*erstellte?s?|selbst\W*gezeichnete?s?|self\W*made|eigene?s?)\W*?(?:arbeit|aufnahme|(?:ph|f)oto(?:gra(?:ph|f)ie)?)?\.?\4\3) *(\
\r|\n)</source>
$1{{own}}$5
{{Self-photographed}}
\s*source\s*=\s*)(?:self[^a-z]*photographed|selbst[^a-z]*(?:aufgenommen|(?:f|ph)otogra(?:f|ph)iert?)|投稿者撮影|投稿者の撮影)\s*?\.? *(\
\r|\n)</source>
$1{{self-photographed}}$2
{{Anonymous}}
\s*author\s*=\s*)(?:anonym(?:e|ous)?|anonyymi|anoniem|an[oòóô]n[yi]mo?|ismeretlen|不明(匿名)|미상|ανώνυμος|аноним(?:ен|ный художник)|neznámy|nieznany|مجهول|Ананім|Anonymní|Ezezaguna|Anonüümne|אלמוני|អនាមិក|Anonimas|അജ്ഞാതം|Анонимный автор|佚名)\s*?\.?\;?\s*?(\
\r|\n)</source>
$1{{anonymous}}$2
{{Unknown photographer}}
\s*author\s*=\s*)(?:unknown\s*photographer|photographer\s*unknown)\s*?\;?\.?\s*?(\
\r|\n)</source>
$1{{unknown photographer}}$2
{{Private collection}}
\s*gallery\s*=\s*)private(?: collection)? *(\
\r|\n)</source>
$1{{private collection}}$2
{{See below}}
\s*permission\s*=\s*)(?:see\s*below|див\.?\s*нижче|дивись\s*нижче)\s*?\;?\.?\s*?(\
\r|\n)</source>
$1{{see below}}$2
Task
Find
Replace
Notes
{{Original description page}} I
is|was) \[(?:https?:)?\/\/(?:www\.)?((?:[a-z\-]+\.)?wik[a-z]+(?:\-old)?)\.org\/w((?:\/shared)?)\/index\.php\?title\=(?:[a-z]+)(?:\:|%3A)([^\[\]\|}{]+?) +here(?:\]\.?|\.?\])(\s+All following user names refer to (?:\1(?:\.org)?\2|(?:wts|shared)\.oldwikivoyage)\.?)?</source>
{{original description page|$1$2|$3}}
{{Original description page}} II
%3A)([\w\%\-\.\~\:\/\?\#\[\]\@\!\$\&\'\(\)\*\+\,\;\=]+?)(?:| [^\]\n]*)\](?:\s*\,?\s*before it was transferr?ed to commons)?\.?</source>
{{original description page|$1|$2}}
{{Original description page}} III
\s*([a-z\-]+\.w[a-z]+)\s*\|\s*[^}\|\[{]+\}\})\s*using\s*\[\[\:en\:WP\:FTCG\|FtCG\]\]\.?</source>
$1{{transferred from|$3||[[:en:WP:FTCG|FtCG]]}} $2
Technique translations
These mainly apply to paintings and other artistic works.
Task
Find
Replace
Notes
Oil on canvas
\s*technique\s*=\s*)(?:\{\{\s*(?:en|de) *\|)? *(?:oil[ -]on[ -]canvas|öl[ -]auf[ -]leinwand) *(?:\}\})?(\
\r|\n)</source>
$1{{technique|oil|canvas}}$2
Oil on wood
\s*technique\s*=\s*)\{\{\s*de *\|\s*öl[ -]auf[ -]holz\s*\}\}(\
\r|\n)</source>
$1{{technique|oil|wood}}$2
Oil on oak
\s*technique\s*=\s*)\{\{\s*de *\|\s*öl[ -]auf[ -]eichenholz\s*\}\}(\
\r|\n)</source>
$1{{technique|oil|panel|wood=oak}}$2
Oil on panel
\s*technique\s*=\s*)(?:\{\{\s*en *\|)? *oil[ -]on[ -]panel *(?:\}\})?(\
\r|\n)</source>
$1{{technique|oil|panel}}$2
Watercolor
\s*technique\s*=\s*)\{\{\s*de *\|\s*aquarell\s*\}\}(\
\r|\n)</source>
$1{{technique|watercolor}}$2
Fresco
\s*technique\s*=\s*)\{\{\s*de *\|\s*fresko\s*\}\}(\
\r|\n)</source>
$1{{technique|fresco}}$2
{{Information}} fields
Task
Find
Replace
Notes
"Description" cleanup
\s*description\s*=)\s*(?:\{\{\s*description missing\s*\}\}|\s*description missing\s*?|(?:\{\{\s*en *\|) *(?:)?no original description(?: )? *(?:\}\})|(?:)?no original description(?: )? *) *(\
\r|\n)</source>
$1$2
"Permission" cleanup 1
\s*permission\s*=)\s*((?:\'\')?)(?:-|—|下記を参照|see(?: licens(?:e|ing|e +section))?(?: bell?ow)?|yes|oui)\s*?\,?\.?;?\s*?\2\s*?(\
\r|\n)</source>
$1$3
"Permission" cleanup 2
\s*permission\s*=)\s*\{\{(?:en\|)?\s*?see\sbell?ow\s*?\}\}\s*?(\
\r|\n)</source>
$1$2
"Other versions" cleanup
\s*other[_ ]versions\s*=)\s*(?:)?(?:-|—|no|none?(?: known)?|nein|yes|keine|\-+)\.?(?: )? *(\
\r|\n)</source>
$1$2
"Source" cleanup
\s*source\s*\=\s*[^*]+?)\n?\*\s*uploaded\s+by\s+\[\[user\:[^\]]+]](\
\r|\n)</source>
$1$2
File Upload Bot (Magnus Manske) used to add these lines, but the uploader can already be found in the file history of each uploaded file.
Dates
Most plausible years
Most digital photos are dated after 2000, so the most plausible years are those matching (200[0-9]|201[0-9]). For example, 19082006 is converted to 2006-08-19.
Task
Find
Replace
Notes
Conversion (yyyy[ -/.]mm[ -/.]dd)
\s*date\s*=\s*)(?:created|made|taken)? *(200[0-9]|201[0-9])(-| |/|\.|)(0[1-9]|1[0-2])\3(1[3-9]|2[0-9]|3[01])(\
\r|\n)</source>
$1$2-$4-$5$6
Conversion (yyyy[ -/.]dd[ -/.]mm)
\s*date\s*=\s*)(?:created|made|taken)? *(200[0-9]|201[0-9])(-| |/|\.|)(1[3-9]|2[0-9]|3[01])\3(0[1-9]|1[0-2])(\
\r|\n)</source>
$1$2-$5-$4$6
Conversion (mm[ -/.]dd[ -/.]yyyy)
\s*date\s*=\s*)(?:created|made|taken)? *(0[1-9]|1[0-2])(-| |/|\.|)(1[3-9]|2[0-9]|3[01])\3(200[0-9]|201[0-9])(\
\r|\n)</source>
$1$5-$2-$4$6
Conversion (dd[ -/.]mm[ -/.]yyyy)
\s*date\s*=\s*)(?:created|made|taken)? *(1[3-9]|2[0-9]|3[01])(-| |/|\.|)(0[1-9]|1[0-2])\3(200[0-9]|201[0-9])(\
\r|\n)</source>
$1$5-$4-$2$6
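The four conversions above share one idea: the day is restricted to 13-31, so it can never be mistaken for a month, and both separators must be identical. The sketch below shows that idea in Python on the bare date value only; the field prefix group of the original expressions is left out, and the names `LAYOUTS` and `to_iso` are hypothetical.

```python
import re

YEAR = r"(?P<y>200[0-9]|201[0-9])"
MONTH = r"(?P<m>0[1-9]|1[0-2])"
DAY = r"(?P<d>1[3-9]|2[0-9]|3[01])"  # 13-31 only: cannot be read as a month
SEP = r"(?P<sep>-| |/|\.|)"           # first separator, possibly empty
SAME = r"(?P=sep)"                     # second separator must equal the first

# Same order as the rows in the table above.
LAYOUTS = [
    YEAR + SEP + MONTH + SAME + DAY,   # yyyy mm dd
    YEAR + SEP + DAY + SAME + MONTH,   # yyyy dd mm
    MONTH + SEP + DAY + SAME + YEAR,   # mm dd yyyy
    DAY + SEP + MONTH + SAME + YEAR,   # dd mm yyyy
]

def to_iso(value):
    """Return yyyy-mm-dd if the value fits one layout, else the value unchanged."""
    for layout in LAYOUTS:
        match = re.fullmatch(layout, value.strip())
        if match:
            return "{y}-{m}-{d}".format(**match.groupdict())
    return value

print(to_iso("19082006"))
```

A value such as 05/06/2006 is deliberately left alone, since neither number can be identified as the day with certainty.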
Other plausible years
Try these only after applying the above. For example, 19781706 is converted to 1978-06-17.
Task
Find
Replace
Notes
Conversion (yyyy[ -/.]mm[ -/.]dd)
\s*date\s*=\s*)(?:created|made|taken)? *(1[89][0-9]{2})(-| |/|\.|)(0[1-9]|1[0-2])\3(1[3-9]|2[0-9]|3[01])(\
\r|\n)</source>
$1$2-$4-$5$6
Conversion (yyyy[ -/.]dd[ -/.]mm)
\s*date\s*=\s*)(?:created|made|taken)? *(1[89][0-9]{2})(-| |/|\.|)(1[3-9]|2[0-9]|3[01])\3(0[1-9]|1[0-2])(\
\r|\n)</source>
$1$2-$5-$4$6
Conversion (mm[ -/.]dd[ -/.]yyyy)
\s*date\s*=\s*)(?:created|made|taken)? *(0[1-9]|1[0-2])(-| |/|\.|)(1[3-9]|2[0-9]|3[01])\3(1[89][0-9]{2})(\
\r|\n)</source>
$1$5-$2-$4$6
Conversion (dd[ -/.]mm[ -/.]yyyy)
\s*date\s*=\s*)(?:created|made|taken)? *(1[3-9]|2[0-9]|3[01])(-| |/|\.|)(0[1-9]|1[0-2])\3(1[89][0-9]{2})(\
\r|\n)</source>
$1$5-$4-$2$6
Task
Find
Replace
Notes
Conversion ({{date|yyyy|mm|dd}})
\s*date\s*=\s*)(?:created|made|taken)? *\{\{\s*date\|([0-9]{4})\|(0[1-9]|1[012])\|(0?[1-9]|1[0-9]|2[0-9]|3[01])\}\}(\
\r|\n)</source>
$1$2-$3-$4$5
{{Date}} function is built-in
Unknown date
\s*(?:date|year)\s*=\s*)(?:unknown?(?:\s*date)?|\?|unbekannte?s?(\s*datum)?)</source>
$1{{unknown|date}}
{{other date |century}}
\s*(?:date|year)\s*=\s*)(\d\d?)(?:st|nd|rd|th) *century *(\
\r|\n)</source>
$1{{other date|century|$2}}$3
{{other date |~}}
\s*(?:date|year)\s*=\s*)(?:cir)?ca?\.? *\s?(1\d{2})[\-\?] *(\
\r|\n)</source>
$1{{other date|~|${2}0|${2}9}}$3
{{other date |~}}
\s*(?:date|year)\s*=\s*)(?:cir)?ca?\.? *(\d{4}) *(\
\r|\n)</source>
$1{{other date|~|$2}}$3
{{other date |?}}
\s*(?:date|year)\s*=\s*)(?:unknown|\?+)\.? *(\
\r|\n)</source>
$1{{other date|?}}$2
{{Original upload date}} (original upload date)
\d{4}\-\d{2}\-\d{2}\}\})\s*(?:\(original\s*upload\s*date\)|\(\s*first\s*version\s*\);?\s*\{\{\s*original upload date\|\d{4}\-\d{2}\-\d{2}\}\}\s*\(\s*last\s*version\s*\))</source>
$1
{{Original upload date}} & {{According to EXIF data}}
\s*date\s*=\s*)(?:\{\{\s*date\|\s*(\d+)\s*\|\s*(\d+)\s*\|\s*(\d+)\s*\}\}|(\d{4})\-(\d{2})\-(\d{2}))\s*\(\s*(original upload date|according to EXIF data)\s*\)\s*?(\
\r|\n)</source>
$1{{$8|$2$5-$3$6-$4$7}}$9
{{Original upload date}} I
\s*date\s*=\s*)\{\{\s*date\s*\|\s*(\d+)\s*\|\s*(\d+)\s*\|\s*(\d+)\s*\}\}\s*\(\s*first\s*version\s*\)\;?\s*\{\{\s*date\s*\|\s*\d+\s*\|\s*\d+\s*\|\s*\d+\s*\}\}\s*\(\s*last\s*version\s*\)</source>
$1{{original upload date|$2-$3-$4}}
{{Original upload date}} II
\s*date\s*=\s*)(\d{4})\-(\d{2})\-(\d{2})\s*\(\s*first\s*version\s*\)\;?\s*(\d{4})\-(\d{2})\-(\d{2})\s*\(\s*last\s*version\s*\)</source>
$1{{original upload date|$2-$3-$4}}
{{Original upload date}} III
\s*date\s*=\s*\(?\s*)(?:Uploaded\s*on\s*Commons\s*at\s*[\d\-]*\s*[\d:]*\s*\(?UTC\)?\s*\/?\s*)?Original(?:ly)?\s*uploaded\s*at\s*([\d\-]*)\s*[\d:]*</source>
$1{{original upload date|$2}}
{{other date |s}}
\s*date\s*=\s*)(\d{1,3}0)\s*s</source>
$1{{other date|s|$2}}
{{other date |after}}
\s*date\s*=\s*)(?:after|post|بعد|desprès|po|nach|efter|μετά από|después de|pärast|پس از|après|despois do|לאחר|nakon|dopo il|по|na|após|după|после)\s*(\d{4})</source>
$1{{other date|after|$2}}
{{other date |before}}
\s*date\s*=\s*)(?:before|vor|pre|до|vör|voor|prior to|ante|antes de|قبل|Преди|abans|před|før|πριν από|enne|پیش از|ennen|avant|antes do|לפני|prije|prima del|пред|przed|înainte de|ранее|pred|före)[\s\-]*(\d{4})</source>
$1{{other date|before|$2}}
{{other date |or}}
\s*date\s*=\s*)(\d{4})\s*(?:or|أو|o|nebo|eller|oder|ή|ó|või|یا|tai|ou|או|vagy|または|или|അഥവാ|of|lub|ou|sau|или|ali|หรือ|和)\s*?(\d{4})</source>
$1{{other date|or|$2|$3}}
{{other date |between}}
\s*date\s*=\s*)(?:sometime\s*)?(?:between)\s*(\d{4})\s*(?:and|\-)?\s*?(\d{4})</source>
$1{{other date|between|$2|$3}}
{{other date |spring}}
\s*date\s*=\s*)(?:primavera(?:\s*de)?|jaro|forår|frühling|spring|printempo|Kevät|printemps|пролет|Vörjohr|früh[ \-]?jahr|voorjaar|wiosna|primăvara(?:\s*lui)?|весна|pomlad|våren|spring)\s*(\d{4})</source>
$1{{other date|spring|$2}}
{{other date |summer}}
\s*date\s*=\s*)(?:estiu|léto|somero|verano|Kesä|été|verán|estate|лето|zomer|lato|verão(?:\s*de)?|vara(?:\s*lui)?|poletje|sommaren|sommer|summer)\s*(\d{4})</source>
$1{{other date|summer|$2}}
{{other date |fall}}
\s*date\s*=\s*)(?:fall|autumn|tardor|podzim|Efterår|Herbst|aŭtuno|otoño|Syksy|outono(?:\s*de)?|automne|outono|autunno|есен|Harvst|herfst|jesień|toamna(?:\s*lui)?|осень|jesen|hösten)\s*(\d{4})</source>
$1{{other date|fall|$2}}
{{other date |winter}}
\s*date\s*=\s*)(?:winter|hivern|zima|Vinter|vintro|invierno|Talvi|hiver|inverno(?:\s*de)?|зима|iarna(?:\s*lui)?|зима|zima|vintern)\s*(\d{4})</source>
$1{{other date|winter|$2}}
{{other date |circa}}
\s*date\s*=\s*)(?:[zc]ir[kc]a|ungefähr|about|around|vers|حوالي|cca|etwa|περ\.?|cerca\s*de|حدود|noin|cara a|oko|około|около|c[\:\. ]?a?[\:\. ]?)\s*(\d{3,4})(?:\s*\-\s*(?:[zc]ir[kc]a|ungefähr|about|around|vers|حوالي|cca|etwa|περ\.?|cerca\s*de|حدود|noin|cara a|oko|około|около|c[\:\. ]?a?[\:\. ]?)?\s*(\d{3,4}))?</source>
$1{{other date|circa|$2|$3}}
empty argument fix
circa\|\d+)\|\}\}</source>
$1}}
{{other date |circa}}
\s*date\s*=\s*)(?:[zc]ir[kc]a|ungefähr|about|around|vers|حوالي|cca|etwa|περ\.?|cerca\s*de|حدود|noin|cara a|oko|około|около|c[\:\. ]?a?[\:\. ]?)\s*(\d{3,4})</source>
$1{{other date|circa|$2}}
(from metadata)
\s*date\s*=\s*)\{\{\s*ISOdate\s*\|\s*([\d\-]+)\s*\}\}\s*\(\s*from\s*metadata\s*\)</source>
$1{{according to EXIF|$2}}
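The decade rule in the table above writes its replacement as `${2}0` and `${2}9`: the braces separate the group number from the digit that follows it. A minimal Python sketch of the same idea is given below. It works on the bare field value only (so the decade is group 1 rather than group 2), and the function name `decade_to_other_date` is an assumption made for this example.

```python
import re

# Value part of the first "{{other date |~}}" row; the field prefix group is omitted.
DECADE = re.compile(r"(?:cir)?ca?\.? *\s?(1\d{2})[\-\?] *", re.IGNORECASE)

def decade_to_other_date(value):
    """Turn e.g. 'ca. 192-' into a first-year/last-year pair for the decade."""
    match = DECADE.fullmatch(value.strip())
    if not match:
        return value
    # \g<1>0 means "group 1, then a literal 0"; a bare \10 would mean group 10.
    return match.expand(r"{{other date|~|\g<1>0|\g<1>9}}")

print(decade_to_other_date("ca. 192-"))
```

In Python the unambiguous form is `\g<1>0`, which plays the same role as `${2}0` in the replacement column.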
Junk cleanup
Task
Find
Replace
Notes
{{ImageUpload}} removal
<syntaxhighlight lang="text" enclose="none">\s*\n?</source>
[Minor]
Uncategorized comment
<syntaxhighlight lang="text" enclose="none"> * *</source>
[Minor]; usually left behind after categorizing
"Categories" comment
<syntaxhighlight lang="text" enclose="none"> * *\n?</source>
[Minor]
"move approved by"
\n)*?)(?:This image was moved from *\[\[:?(?:File|image):?[^\]\[{}]*\]\]\.?)?</source>
$1
Useless templates (if they take no parameters)
Art\.|bots|football[ _]+kit|template[ _]+other|s|tl|tlxs|template|template[ _]+link|temp|tls|tlx|tl1|tlp|tlsx|tlsp|mbox|tmbox(?:\/core)?|lan|jULIANDAY|file[ _]+title|nowrap|plural|time[ _]+ago|time[ _]+ago\/core|toolbar|red|green|sp|other date|max|max\/2|str[ _]+left|str[ _]+right|music|date|cite[ _]+book|citation\/core|citation\/make[ _]+link|citation\/identifier|citation|cite|cite[ _]+book|citation\/authors|citation\/make[ _]+link|cite[ _]+journal|cite[ _]+patent|cite[ _]+web|hide in print|only in print|parmPart|error|crediti|fontcolor|transclude|trim|navbox|navbar|section[ _]+link|yesno|center|unused|•|infobox\/row)\s*\}\}</source>
Useless full URL
\s*(?:https?:)?\/\/ticket\.wikimedia\.org\/otrs\/index\.pl\?Action\s*\=\s*AgentTicketZoom&(?:amp;)?TicketNumber\=(\d+)\s*\}\}</source>
{{PermissionOTRS|id=$1}}
Unnecessary __NOTOC__
<syntaxhighlight lang="text" enclose="none">__ *NOTOC *__</source>
[Case sensitive] [Minor]; Common.css prevents file pages from showing TOCs
Remove empty lang templates
ab|ace|af|ak|als|am|an|ang|ar|arc|arz|as|ast|av|ay|az|ba|bar|bcl|be|bg|bh|bi|bjn|bm|bn|bo|bpy|br|bs|bug|bxr|ca|cbk-zam|cdo|ce|ceb|ch|cho|chr|chy|ckb|co|cr|crh|cs|csb|cu|cv|cy|da|de|diq|dsb|dv|dz|ee|el|eml|en|eo|es|et|eu|ext|fa|ff|fi|fiu-vro|fj|fo|fr|frp|frr|fur|fy|ga|gag|gan|gd|gl|glk|gn|got|gu|gv|ha|hak|haw|he|hi|hif|ho|hr|hsb|ht|hu|hy|hz|ia|id|ie|ig|ii|map-bms|ik|ilo|io|is|it|iu|ja|jbo|jv|ka|kaa|kab|kbd|kg|ki|kj|kk|kl|km|kn|ko|kr|krc|ks|ksh|ku|kv|kw|ky|la|lad|lb|lbe|lez|lg|li|lij|roa-rup|lmo|ln|lo|lt|ltg|lv|mdf|mg|mh|mhr|mi|mk|ml|mn|mo|mr|mrj|ms|mt|mus|mwl|my|myv|mzn|na|nah|nap|nds|nds-nl|ne|new|ng|nl|nn|no|nov|nrm|nso|nv|ny|oc|om|or|os|pa|pag|pam|pap|pcd|pdc|pfl|pi|pih|pl|pms|pnb|pnt|ps|pt|qu|rm|rmy|rn|ro|roa-tara|ru|rue|rw|sa|sah|sc|scn|sco|sd|se|sg|sh|si|sk|sl|sm|sn|so|sq|sr|srn|ss|st|stq|su|sv|sw|szl|ta|te|tet|tg|th|ti|tk|tn|to|zh-hans|tpi|tr|ts|tt|tum|tw|ty|tyv|udm|ug|uk|ur|uz|ve|vec|vep|vi|vls|vo|wa|war|wo|wuu|xal|xh|xmf|yi|yo|za|zea|zh|zh-hant|zh-hk|zh-min-nan|zh-sg|zu)\s*(?:|\
\s*1=)?\s*\}\} *(\
\r|\n)</source>
$1
Ignores those followed by text (incorrect usage but still indicates the language)
Remove void parameter (wrong syntax)
(\s*\
\}\})</source>
$1$2
Links
Task
Find
Replace
Notes
External to interwiki (part 1)
(wikt)ionary|wiki(n)ews|wiki(b)ooks|wiki(q)uote|wiki(s)ource|wiki(v)ersity|wiki(voy)age)\.(?:com|net|org)/wiki/([^\]\[{|}\s"]*) +([^\n\]]+)\]</source>
[[$2$3$4$5$6$7$8:$1:$9|$10]]
Make sure not to touch credit lines that require a link to the file page: after this regex, such a link becomes a self-link, which is rendered as bold text.
External to interwiki (part 2)
(incubator)|(quality))\.wikimedia\.(?:com|net|org)/wiki/([^\]\[{|}\s"]*) +([^\n\]]+)\]</source>
[[$1$2$3:$4|$5]]
See above
External to wikilink (local)
net|org)/wiki/([^\]\[{|}\s"]*) +([^\n\]]+)\]</source>
[[:$1|$2]]
See above
Interlanguage
sv|nl|de|fr|ru|it|es|ceb|vi|war|pl|ja|pt|zh|uk|ca|no|fa|fi|id|ar|cs|ko|ms|hu|ro|zh-yue|sr|tr|min|sh|kk|eo|eu|sk|da|lt|bg|he|hr|sl|hy|uz|et|vo|nn|gl|bat-smg|simple|hi|la|el|az|th|oc|ka|mk|be|new|tt|pms|tl|ta|te|cy|lv|ce|be-x-old|ht|ur|bs|sq|br|jv|mg|lb|mr|is|ml|pnb|ba|af|my|bn|ga|lmo|yo|fy|an|cv|tg|ky|nds-nl|sw|ne|io|gu|sco|bpy|scn|nds|ku|ast|qu|su|als|gd|kn|am|ckb|ia|nap|bug|wa|mn|pa|arz|mzn|si|zh-min-nan|yi|fo|sah|vec|sa|bar|nah|os|or|pam|hsb|se|li|mrj|mi|ilo|co|hif|bcl|gan|frr|bo|rue|mhr|glk|fiu-vro|ps|tk|pag|vls|gv|xmf|diq|km|kv|zea|csb|crh|hak|vep|sc|ay|dv|map-bms|so|nrm|rm|udm|koi|kw|ug|stq|bh|lad|wuu|lij|eml|fur|mt|szl|gn|pi|as|pcd|gag|cbk-zam|ksh|nov|ang|ie|nv|ace|ext|frp|mwl|ln|lez|sn|dsb|pfl|krc|haw|pdc|kab|xal|rw|myv|to|arc|kl|roa-tara|bjn|kbd|lo|ha|pap|av|tpi|mdf|lbe|jbo|na|wo|bxr|ty|srn|kaa|ig|nso|tet|kg|ab|ltg|roa-rup|zu|za|cdo|tyv|chy|tw|rmy|om|cu|tn|chr|bi|got|pih|sm|rn|bm|ss|mo|iu|sd|pnt|ki|xh|ts|zh-classical|ee|ak|ti|fj|lg|ks|ff|sg|ny|ve|cr|st|dz|ik|tum|ch|ng|ii|cho|mh|aa|kj|ho|mus|kr|hz):([^\]\[\|\}\{]+)\]\]</source>
[[:$1:$2]]
Interlanguage links in the File namespace do not make sense; categories should be used instead. This regex therefore converts them to normal links and leaves them for manual cleanup.
Categories
These are mainly to improve machine-readability when performing other category work.
Task
Find
Replace
Notes
Normalize categories
[^]]*)?\]\] *</source>
[[Category:$1$2]]
Run this before the other category fixes
Remove empty [[Category:]]
<syntaxhighlight lang="text" enclose="none">\[\[category: *\]\](?:\n( *\[\[category:))?</source>
$1
Remove double [[Category:[[Category:...]]]]
<syntaxhighlight lang="text" enclose="none">\[\[category:(\[\[category:[^]]*\]\])[ ]*\]\]</source>
$1
One category per line
<syntaxhighlight lang="text" enclose="none">\[\[category:([^]]+)\]\] *\[\[category:([^]]+)\]\]</source>
[[Category:$1]]\n[[Category:$2]]
Run multiple times
Remove duplicates
<syntaxhighlight lang="text" enclose="none">(\[\[[Cc]ategory:)([^]]+\]\])(.*?)\1\2\n?</source>
$1$2$3
Run multiple times, case sensitive
Remove blank lines between categories
<syntaxhighlight lang="text" enclose="none">(\[\[category:[^]]+\]\]\n)\n+(\[\[category:)</source>
$1$2
[Minor]
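Three of the category fixes above can be chained as in the following Python sketch. It is an illustration under stated assumptions rather than a reference implementation: `until_stable` and `tidy_categories` are hypothetical names, `[^]]` is written as `[^\]]` (equivalent in Python), and the duplicate rule is compiled with `re.DOTALL` so that `.` can span line breaks. The table does not say whether the original relies on such an option, so treat that flag as a guess.

```python
import re

def until_stable(pattern, repl, text):
    """Reapply one substitution until the text no longer changes."""
    while True:
        new_text = pattern.sub(repl, text)
        if new_text == text:
            return text
        text = new_text

ONE_PER_LINE = re.compile(
    r"\[\[category:([^\]]+)\]\] *\[\[category:([^\]]+)\]\]", re.IGNORECASE)
# Case-sensitive, as noted in the table; DOTALL is an assumption (see above).
DUPLICATE = re.compile(
    r"(\[\[[Cc]ategory:)([^\]]+\]\])(.*?)\1\2\n?", re.DOTALL)
BLANK_LINES = re.compile(
    r"(\[\[category:[^\]]+\]\]\n)\n+(\[\[category:)", re.IGNORECASE)

def tidy_categories(text):
    text = until_stable(ONE_PER_LINE, r"[[Category:\1]]\n[[Category:\2]]", text)
    text = until_stable(DUPLICATE, r"\1\2\3", text)
    # Looped here too: each match consumes the start of the next category.
    text = until_stable(BLANK_LINES, r"\1\2", text)
    return text

sample = "[[category:A]] [[Category:B]]\n\n[[Category:A]]\n\n[[Category:C]]\n"
print(tidy_categories(sample))
```

Because the duplicate rule is case-sensitive, `[[category:A]]` and `[[Category:A]]` only count as duplicates once an earlier step has normalized the prefix.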
Formatting
Task
Find
Replace
Notes
Delete surplus lines
<syntaxhighlight lang="text" enclose="none">\n{3,}</source>
\n\n
[Minor]
Fix incorrect line break syntax
<syntaxhighlight lang="text" enclose="none"></?br( )?(/)?\\?></source>
<br$1$2>
This fixes only incorrect syntax (so <br>, <br/>, and <br /> are preserved)
Remove {{}}, [[]], <gallery></gallery>, etc.
\[\[\]\]|<gallery>\s*</gallery>|\[\[:?File *: *\]\])</source>
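The line break fix in the table above can be tried out with the short Python sketch below. The function name `fix_line_breaks` is hypothetical, and a replacement function is used instead of `<br$1$2>` so that optional groups that did not match are inserted as empty strings.

```python
import re

# Pattern from the "Fix incorrect line break syntax" row (case-insensitive).
BAD_BR = re.compile(r"</?br( )?(/)?\\?>", re.IGNORECASE)

def fix_line_breaks(text):
    def rebuild(match):
        space = match.group(1) or ""  # optional space before the slash
        slash = match.group(2) or ""  # optional self-closing slash
        return "<br" + space + slash + ">"
    return BAD_BR.sub(rebuild, text)

print(fix_line_breaks("one</br>two<br\\>three<br />four"))
```

Valid forms are matched as well but are rewritten to themselves, apart from the tag name being lowercased.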