Difference between revisions of "Arma 3: Regular Expressions"

From Bohemia Interactive Community

Revision as of 23:55, 15 November 2021

A Regular Expression (or regex/regexp) is a text search pattern in which certain characters have a special meaning.


Quick Guide

  • a means "the a character"
  • . means "any character"
  • .? means "any character zero or one time"
  • .+ means "any character one or more times"
  • .* means "any character zero or more times"
  • .{3,5} means "any character three to five times" (the characters do not have to be identical)
  • [a c] is a character set and means "a character that is either a or c or space"
  • [a-z] is a range and means "any character between a and z" (not between A and Z in the event of a case-sensitive search!)
  • [a-zA-Z0-9] is a range and means "any character between a and z or A and Z or 0 and 9"
  • [^a-z] is a negative range and means "any character not between a and z"
  • Arma [0-9] means anything from "Arma 0" to "Arma 9" (going through 1, 2, 3 etc.)
To match a character that has a special meaning in regex syntax, escape it with \, e.g. I\.\.\. don't know\? matches the literal text "I... don't know?"
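
The Quick Guide entries can be checked mechanically. The sketch below uses Python's re module purely as a stand-in (it is not the engine Arma 3 uses, and Python is case-sensitive by default, unlike the default flags described on this page); the helper name fully_matches is hypothetical. The constructs shown behave the same way in most common regex dialects.

```python
import re

# Python's `re` is a stand-in here; it is NOT the engine Arma 3 uses.
# Each entry: (pattern, text, whether the whole text should match).
checks = [
    (r"a", "a", True),
    (r".", "x", True),
    (r".?", "", True),               # zero or one character
    (r".+", "", False),              # needs at least one character
    (r".*", "", True),               # zero or more characters
    (r".{3,5}", "abc", True),        # three different characters are fine
    (r".{3,5}", "ab", False),        # too short
    (r"[a c]", " ", True),           # the set contains a, c and space
    (r"[a-z]", "Q", False),          # case-sensitive in this sketch
    (r"[a-zA-Z0-9]", "Q", True),
    (r"[^a-z]", "7", True),          # negated range
    (r"Arma [0-9]", "Arma 3", True),
    (r"I\.\.\. don't know\?", "I... don't know?", True),  # escaped . and ?
]


def fully_matches(pattern: str, text: str) -> bool:
    """Return True when the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


# True for every row whose actual outcome equals the expected one.
results = [fully_matches(p, t) == expected for p, t, expected in checks]
```
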


Commands

See Command Group: Strings - Regular Expression.


Flags

To adjust the behaviour of the regex commands, flags can be set when using them. Flags are specified at the end of the pattern, start with /, and must be lowercase. If the text after the / contains any non-flag character, it is not read as flags and is considered part of the pattern itself.

If no flags are specified, the default flags g and i are set. Ending the pattern with a bare / is valid and means no flags: first match only and case-sensitive - see Examples.
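
The flag rules above can be summarised as a small parsing sketch. This is only an illustration of the prose, not the game's actual parser: the helper split_flags is hypothetical, the flag letters are taken from the table on this page, and it is an assumption that the last / in the string is the one that separates pattern and flags.

```python
from typing import FrozenSet, Tuple

VALID_FLAGS = frozenset("gino")   # flag letters listed in the table on this page
DEFAULT_FLAGS = frozenset("gi")   # defaults used when no flags are specified


def split_flags(expr: str) -> Tuple[str, FrozenSet[str]]:
    """Split 'pattern/flags' into (pattern, flags) following the rules above.

    Hypothetical helper that mirrors the prose, not the engine's real parser.
    Assumption: the LAST slash separates pattern and flags.
    """
    head, sep, tail = expr.rpartition("/")
    if not sep:
        # No slash at all: the default flags apply.
        return expr, DEFAULT_FLAGS
    if all(ch in VALID_FLAGS for ch in tail):
        # Only valid lowercase flags (or nothing) after the slash.
        return head, frozenset(tail)
    # A non-flag character follows the slash: everything is pattern.
    return expr, DEFAULT_FLAGS
```

Under these assumptions, split_flags("There/") yields the pattern There with no flags, while split_flags("There/I") keeps There/I as the pattern because the uppercase I is not a valid flag.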
Flag | Name | Description
g | Global | Only relevant for regexReplace and regexFind. Missing the global flag sets the format_first_only flag (source) and only replaces the first occurrence with regexReplace, and only returns the first element with regexFind.
i | Case-insensitive | N/A
n | noSubs | « When performing matches, all marked sub-expressions (expr) are treated as non-marking sub-expressions (?:expr). No matches are stored in the supplied std::regex_match structure and mark_count() is zero » – C++ reference (source)
o | Optimize | Optimizes the pattern: pattern creation is slower, but the pattern executes more efficiently (source)
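
The effect of the g, i and n flags can be approximated outside the game. The sketch below again uses Python's re module as a stand-in, so the mapping is an assumption rather than Arma's implementation: count=1 and re.search play the role of a missing g flag, re.IGNORECASE plays the role of i, and non-capturing groups (?:...) imitate n. The o flag has no direct counterpart here and is left out.

```python
import re

# Python's `re` is a stand-in; the flag mapping below is an approximation.
text = "Arma 1, arma 2, Arma 3"

# With "g": every occurrence is replaced / returned.
replace_all = re.sub(r"Arma", "ArmA", text)
find_all = re.findall(r"Arma [0-9]", text)

# Without "g": only the first occurrence is replaced / returned.
replace_first = re.sub(r"Arma", "ArmA", text, count=1)
first = re.search(r"Arma [0-9]", text)
find_first = [first.group(0)] if first else []

# With "i": letter case is ignored, so "arma 2" is found as well.
find_all_ci = re.findall(r"Arma [0-9]", text, flags=re.IGNORECASE)

# With "n": sub-expressions are treated as non-marking, so nothing is captured.
marking = re.search(r"(Arma) ([0-9])", text).groups()
non_marking = re.search(r"(?:Arma) (?:[0-9])", text).groups()
```
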


Examples

"1, 42, and 10e10" regexFind ["[0-9]"]; // matches "1", "4", "2", "1", "0" "1" and "0" "1, 42, and 10e10" regexFind ["[0-9]+"]; // matches "1", "42", "10" and "10", avoiding commas and spaces "Hello there!" regexMatch "There"; // matches - default flags g and i are active "Hello there!" regexMatch "There/"; // no flags are set - as "There" is different from "there" and search is case-sensitive, so the match fails "Hello there!" regexMatch "There/i"; // matches - only the i flag is active "I like garlic, onions and cheese" regexFind ["([^ ]+)(?:,| and) "]; // matches "garlic" and "onions" "Existing Arma: Arma 0, ArmA, ArmA 1, Arma 1, Arma 2, Arma 3, Arma 4, Arma 5, Arma 2035" regexFind ["(Arma [0-3])[^0-9]/g"]; // matches "Arma 0", "Arma 1", "Arma 2", "Arma 3"