Regular Expressions – Arma 3

From Bohemia Interactive Community
(Page creation)
 
m (Text replacement - "{{HashLink" to "{{Link")
(7 intermediate revisions by the same user not shown)
Line 1: Line 1:
{{TOC|side}}
A '''Regular Expression''' (or regex/regexp) is an advanced text search format involving specific codes.
A '''Regular Expression''' (or regex/regexp) is an advanced text search format involving specific codes.
{{Feature|informative|
{{Feature|informative|
* For a detailed explanation of what a Regular Expression (Regex) is, please visit {{Wikipedia|Regular expression|the Wikipedia page}} about it.
* For a detailed explanation of what a Regular Expression (Regex) is, please visit {{Wikipedia|Regular expression|the Wikipedia page}} about it.
* The following pages allow to test regular expressions:
* The following pages allow to test regular expressions:
** {{ExternalLink|link= https://regexr.com/}}
** {{Link|link= https://regexr.com/}}
** {{ExternalLink|link= https://regex101.com/}}
** {{Link|link= https://regex101.com/}}
}}
}}


Line 10: Line 11:
== Quick Guide ==
== Quick Guide ==


* <tt>a</tt> means "the a character"
* {{hl|a}} means "the a character"
* <tt>.</tt> means "any character"
* {{hl|.}} means "any character"
* <tt>.?</tt> means "any character zero or one time"
* {{hl|.?}} means "any character zero or one time"
* <tt>.+</tt> means "any character from one to infinity times"
* {{hl|.+}} means "any character from one to infinity times"
* <tt>.*</tt> means "any character zero to infinity times"
* {{hl|.*}} means "any character zero to infinity times"
* <tt>.{3,5}</tt> means "any (identical) character that is present three to five times"
* {{hl|.{3,5}}} means "any (identical) character that is present three to five times"
* <tt>[a c]</tt> is a '''group''' and means "a character that is '''either''' a '''or''' c '''or''' space"
* {{hl|[a c]}} is a '''group''' and means "a character that is '''either''' a '''or''' c '''or''' space"
* <tt>[a-z]</tt> is a '''range''' and means "any character between a and z" ('''not''' between A and Z in the event of a case-sensitive search!)
* {{hl|[a-z]}} is a '''range''' and means "any character between a and z" ('''not''' between A and Z in the event of a case-sensitive search!)
* <tt>[a-zA-Z0-9]</tt> is a '''range''' and means "any character between a and z '''or''' A and Z '''or''' 0 and 9"
* {{hl|[a-zA-Z0-9]}} is a '''range''' and means "any character between a and z '''or''' A and Z '''or''' 0 and 9"
* <tt>[^a-z]</tt> is a '''negative range''' and means "any character '''not''' between a and z"
* {{hl|[^a-z]}} is a '''negative range''' and means "any character '''not''' between a and z"
* <tt>Arma [0-9]</tt> means anything from "Arma 1" to "Arma 9" (going through 2, 3, 4 etc)
* {{hl|Arma [0-9]}} means anything from "Arma 0" to "Arma 9" (going through 1, 2, 3, 4 etc)
{{Feature|informative|To match a specific character that is used in regex syntax, escape it with <tt>\</tt>, e.g {{ic|I\.\.\. don't know\?}}}}
{{Feature|informative|To match a specific character that is used in regex syntax, escape it with {{hl|\}}, e.g {{hl|I\.\.\. don't know\?}}}}
 
 
== Commands ==
 
See [[:Category:Command Group: Strings - Regular Expression|Command Group: Strings - Regular Expression]].




== Flags ==
== Flags ==


In order to adjust the behaviour of the regex commands, certain flags can be set when using them. Flags are specified at the end of the pattern and start with <tt>/</tt>. Flags need to be lowercase. If there are any non-flag characters in the flags they will be ignored and considered part of the pattern itself.
In order to adjust the behaviour of the regex commands, certain flags can be set when using them. Flags are specified at the end of the pattern and start with {{hl|/}}. Flags need to be lowercase. If there are any non-flag characters in the flags they will be ignored and considered part of the pattern itself.
{{Feature|Informative|If no flags are specified, the default flags are set to <tt>g</tt> and <tt>i</tt>. It is valid to specify <tt>/</tt> to indicate no flags, first match only, case sensitive - see {{HashLink|#Examples}}.}}
{{Feature|Informative|If no flags are specified, the default flags are set to {{hl|g}} and {{hl|i}}. It is valid to specify {{hl|/}} to indicate no flags, first match only, case sensitive - see {{Link|#Examples}}.}}


{| class="wikitable" style="font-size: 0.9em"
{| class="wikitable" style="font-size: 0.9em"
Line 36: Line 42:
| g
| g
| Global
| Global
| Only relevant for [[regexReplace]] and [[regexFind]]. '''''Missing''''' the global flag sets <tt>format_first_only</tt> flag [https://en.cppreference.com/w/cpp/regex/match_flag_type (source)] and:
| Only relevant for [[regexReplace]] and [[regexFind]]. '''''Missing''''' the global flag sets {{hl|format_first_only}} flag [https://en.cppreference.com/w/cpp/regex/match_flag_type (source)] and:
* '''only replaces the first occurrence''' with [[regexReplace]]
* '''only replaces the first occurrence''' with [[regexReplace]]
* '''only returns the first element''' with [[regexFind]]
* '''only returns the first element''' with [[regexFind]]
Line 56: Line 62:
== Examples ==
== Examples ==


<code>"1, 42, and 10e10" [[regexFind]] ["[0-9]"]; {{cc|matches "1", "4", "2", "1", "0" "1" and "0"}}</code>
<sqf>"1, 42, and 10e10" regexFind ["[0-9]"]; // matches "1", "4", "2", "1", "0" "1" and "0"</sqf>
<code>"1, 42, and 10e10" [[regexFind]] ["[0-9]+"]; {{cc|matches "1", "42", "10" and "10", avoiding commas and spaces}}</code>
<sqf>"1, 42, and 10e10" regexFind ["[0-9]+"]; // matches "1", "42", "10" and "10", avoiding commas and spaces</sqf>
<code>"Hello there!" [[regexMatch]] "There"; {{cc|matches - default flags g and i are active}}
<sqf>
"Hello there!" [[regexMatch]] "There/"; {{cc|no flags are set - as "There" is different from "there" and search is case-sensitive, so the match fails}}
"Hello there!" regexMatch "There"; // matches - default flags g and i are active
"Hello there!" [[regexMatch]] "There/i"; {{cc|matches - only the i flag is active}}</code>
"Hello there!" regexMatch "There/"; // no flags are set - as "There" is different from "there" and search is case-sensitive, so the match fails
<code>"I like garlic, onions and cheese" [[regexFind]] ["([^ ]+)(?:,| and) "]; {{cc|matches "garlic" and "onions"}}</code>
"Hello there!" regexMatch "There/i"; // matches - only the i flag is active
<code>"Existing Arma: Arma 0, ArmA, ArmA 1, Arma 1, Arma 2, Arma 3, Arma 4, Arma 5, Arma 2035" [[regexFind]] ["(Arma [0-3])[^0-9]/g"]; {{cc|matches "Arma 0", "Arma 1", "Arma 2", "Arma 3"}}
</sqf>
<sqf>"I like garlic, onions and cheese" regexFind ["([^ ]+)(?:,| and) "]; // matches "garlic" and "onions"</sqf>
<sqf>"Existing Arma: Arma 0, ArmA, ArmA 1, Arma 1, Arma 2, Arma 3, Arma 4, Arma 5, Arma 2035" regexFind ["(Arma [0-3])[^0-9]/g"]; // matches "Arma 0", "Arma 1", "Arma 2", "Arma 3"</sqf>




{{GameCategory|arma3|Editing}}
{{GameCategory|arma3|Editing}}

Revision as of 18:43, 4 January 2023

A Regular Expression (or regex/regexp) is a pattern syntax for advanced text searching, in which certain characters have a special meaning.


Quick Guide

  • a means "the a character"
  • . means "any character"
  • .? means "any character, zero or one time"
  • .+ means "any character, one or more times"
  • .* means "any character, zero or more times"
  • .{3,5} means "any three to five characters" (the characters do not have to be identical)
  • [a c] is a character set (also called a character class) and means "a character that is either a or c or space"
  • [a-z] is a range and means "any character between a and z" (not between A and Z in the event of a case-sensitive search!)
  • [a-zA-Z0-9] is a range and means "any character between a and z or A and Z or 0 and 9"
  • [^a-z] is a negative range and means "any character not between a and z"
  • Arma [0-9] means anything from "Arma 0" to "Arma 9" (going through 1, 2, 3, 4 etc)
To match a specific character that is used in regex syntax, escape it with \, e.g. I\.\.\. don't know\?
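
The entries above can be tried out with regexFind, as in the following minimal sketch. The variable names and the input string are made up for illustration, and no return values are shown, since the exact result layout is described on the regexFind command page rather than here.

```sqf
// Hypothetical input string and variable names, for illustration only.
private _text = "Arma 3, ArmA 2 and arma 10";

// [0-9]+ : one or more consecutive digits
private _numbers = _text regexFind ["[0-9]+"];

// Arma [0-9] : the literal text "Arma " followed by exactly one digit.
// No flags are given, so the defaults g and i apply and the search
// is case-insensitive ("ArmA 2" and "arma 1" are candidates too).
private _titles = _text regexFind ["Arma [0-9]"];

// \. and \? are escaped, so they only match a literal dot and question mark.
private _escaped = "I... don't know?" regexFind ["I\.\.\. don't know\?"];
```

Pasting the same patterns into one of the test pages listed at the top of the article is a quick way to see which parts of the text each one selects.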


Commands

See Command Group: Strings - Regular Expression.


Flags

The behaviour of the regex commands can be adjusted with flags. Flags are specified at the end of the pattern, start with /, and must be lowercase. If the flags contain any non-flag character, they are not applied as flags and are considered part of the pattern itself.

If no flags are specified, the default flags g and i are used. It is valid to end the pattern with a bare / to indicate no flags at all (first match only, case-sensitive); see Examples.
Flag | Name | Description
g | Global | Only relevant for regexReplace and regexFind. Missing the global flag sets the format_first_only flag (source), so that regexReplace only replaces the first occurrence and regexFind only returns the first element.
i | Case-insensitive | N/A
n | noSubs | « When performing matches, all marked sub-expressions (expr) are treated as non-marking sub-expressions (?:expr). No matches are stored in the supplied std::regex_match structure and mark_count() is zero » – C++ reference (source)
o | Optimize | Optimizes the pattern: pattern creation is slower, but the pattern executes more efficiently (source)
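
The effect of the g flag on regexReplace, as described in the table above, can be sketched as follows. The input strings and variable names are hypothetical, and the two-element [pattern, replacement] argument form is assumed from the regexReplace command reference; the comments state what the flag descriptions lead one to expect rather than verified output.

```sqf
// Hypothetical inputs and variable names, for illustration only.

// Pattern ends with a bare "/": no flags, so only the first "-"
// should be replaced.
private _firstOnly = "a-b-c" regexReplace ["-/", "+"];

// g flag: every "-" should be replaced.
private _all = "a-b-c" regexReplace ["-/g", "+"];

// i flag only: case-insensitive, but without g still first occurrence only.
private _firstAnyCase = "ARMA arma Arma" regexReplace ["arma/i", "X"];
```

Writing the flags explicitly, even when they equal the defaults, makes it obvious to a later reader whether a first-only or a global replacement was intended.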


Examples

"1, 42, and 10e10" regexFind ["[0-9]"]; // matches "1", "4", "2", "1", "0", "1" and "0"
"1, 42, and 10e10" regexFind ["[0-9]+"]; // matches "1", "42", "10" and "10", avoiding commas and spaces
"Hello there!" regexMatch "There"; // matches - default flags g and i are active
"Hello there!" regexMatch "There/"; // no flags are set, so the search is case-sensitive; "There" is different from "there" and the match fails
"Hello there!" regexMatch "There/i"; // matches - only the i flag is active
"I like garlic, onions and cheese" regexFind ["([^ ]+)(?:,| and) "]; // matches "garlic" and "onions"
"Existing Arma: Arma 0, ArmA, ArmA 1, Arma 1, Arma 2, Arma 3, Arma 4, Arma 5, Arma 2035" regexFind ["(Arma [0-3])[^0-9]/g"]; // matches "Arma 0", "Arma 1", "Arma 2", "Arma 3"