HashMap: Difference between revisions

From Bohemia Interactive Community
Revision as of 11:20, 26 January 2021

Arma 3
HashMaps were introduced in Arma 3 version 2.01.

Overview

A HashMap is a specialized data structure that contains key-value pairs.
HashMaps provide (near) constant-time lookup for keys, making them highly efficient at finding the value associated with a specific key - even if there is a very large number of keys.
See Wikipedia to learn more about the underlying technology.

While HashMaps and Arrays share many traits (and SQF command names), there are important differences, and HashMaps should not be regarded as a new or improved replacement for the Array.


HashMap Basics

HashMaps are specifically designed for (near) constant-time lookup of keys.

Consider the following example to understand what that means: Let's say we have an Array that looks like this ...

_playerDataArray = [["Player_A_UID", "Data A-1", "Data A-2", ...], ["Player_B_UID", "Data B-1", "Data B-2", ...], ["Player_C_UID", "Data C-1", "Data C-2", ...], ...];

... and we want to find a specific UID so that we can retrieve the corresponding data. We can easily achieve that with the findIf command:

_index = _playerDataArray findIf {_x # 0 isEqualTo "Wanted_UID"};
_data = _playerDataArray # _index;

But what findIf actually does for us is something like this:

{
    if (_x # 0 isEqualTo "Wanted_UID") then {
        breakWith _forEachIndex;
    };
} forEach _playerDataArray;

Now consider how many isEqualTo comparisons this forEach-loop has to perform: If the element identified by "Wanted_UID" is stored at or near the end of _playerDataArray, then our code has to go through (almost) the entire Array to find it - and the same is the case if the element we are looking for does not exist in our Array.

Every single one of these comparisons takes some time - not much, but it adds up. This is no problem as long as our Array is relatively small, but it becomes a serious issue when the Array starts growing: With 10,000 elements in the Array, up to 10,000 comparisons might be necessary just to find a single element; with 100,000 elements, up to 100,000 comparisons, and so on. In other words, the number of steps (and therefore the time) needed to find an element grows linearly with the Array size.

And this is where HashMaps come in. Simply put: When a key-value pair is inserted into a HashMap, a hash function is applied to the key to determine the position at which the key-value pair is going to be stored. Then, when we want to retrieve the value associated with a specific key, the same hash function (Arma 3 uses FNV-1a 64-bit) is applied to that key and the resulting position tells the HashMap exactly where to find the key-value pair that we are looking for.

_data = _playerDataHashMap get "Wanted_UID";

Since the process to find a specific key-value pair is always the same (apply hash function, look at the resulting position, return the value that is stored there), it also always takes (more or less) the same amount of time - regardless of the size of the HashMap.

And that is what (near) constant-time lookup for keys means. It comes with very useful benefits: Searching for, retrieving, modifying or removing an element can all be completed in (near) constant time; whether the HashMap contains 10, 10,000 or 1,000,000 elements does not matter. That is the advantage of HashMaps.
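As a minimal sketch of the idea above, the Array from the example can be converted into a HashMap once and then queried by key. The sample data and the choice to store everything after the UID as the value are assumptions made for illustration only.

```sqf
// Hypothetical sample data in the same layout as _playerDataArray above.
private _playerDataArray = [
	["Player_A_UID", "Data A-1", "Data A-2"],
	["Player_B_UID", "Data B-1", "Data B-2"],
	["Player_C_UID", "Data C-1", "Data C-2"]
];

// Build the HashMap once: key = UID, value = the remaining elements of that entry.
private _playerDataHashMap = createHashMap;
{
	_playerDataHashMap set [_x # 0, _x select [1, count _x - 1]];
} forEach _playerDataArray;

// Every later lookup is a single get, no matter how many players are stored.
private _data = _playerDataHashMap get "Player_B_UID";	// ["Data B-1", "Data B-2"]
```

Building the HashMap still visits every Array element once, so the benefit only shows when the same data is looked up repeatedly.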


Working with HashMaps

Key Types

Because of the requirement for the keys to be hashable (and constant), not all Data Types can be used as keys.

Supported types are limited to:

  • Array keys can only contain supported types.
  • Array keys are deep-copied on insertion and cannot be modified when retrieved via keys or inside forEach.
  • The virtual type HashMapKey is a combination of all the supported types.
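The following sketch illustrates the two notes on Array keys above. The variable names and the grid-coordinate use case are hypothetical.

```sqf
// Hypothetical example: an Array key built from supported types (here: Numbers).
private _gridMap = createHashMap;
private _key = [3, 7];
_gridMap set [_key, "bunker"];

// The key was deep-copied on insertion, so changing _key afterwards
// does not change the key stored inside the HashMap.
_key set [0, 99];

private _found = _gridMap get [3, 7];		// "bunker"
private _hasChangedKey = [99, 7] in _gridMap;	// false
```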

Creating a HashMap

// Example of an empty HashMap
private _myMap = createHashMap;
count _myMap;			// returns 0

// Example of a prefilled HashMap
private _myFilledMap = createHashMapFromArray [["a",1], ["b",2], ["c", 3]];
count _myFilledMap;	// returns 3

Setting an element

private _myMap = createHashMap;
_myMap set [1, "hello there"];	// _myMap is [[1, "hello there"]]

Inserting an element with a key that already exists inside the HashMap will overwrite the value stored under that key.

private _myMap = createHashMapFromArray [["a",1], ["b",2]];
_overwritten = _myMap set ["a", 1337];	// _myMap is now [["a",1337], ["b",2]] and _overwritten is true
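Because set reports whether an existing key was overwritten, its return value can be used to spot repeated keys. This is only a sketch; the input list and variable names are made up for the example.

```sqf
// Hypothetical input containing repeated entries.
private _input = ["a", "b", "a", "c", "b"];

private _seen = createHashMap;
private _duplicates = [];
{
	// set returns true when the key was already present (and has now been overwritten).
	if (_seen set [_x, true]) then {
		_duplicates pushBack _x;
	};
} forEach _input;

// _duplicates is ["a", "b"], _seen holds the three distinct keys
```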

Getting an element

Values are retrieved by their key:

private _myMap = createHashMapFromArray [["a",1], ["b",2]];
_myMap get "a";	// returns 1
_myMap get "z";	// returns Nothing
_myMap getOrDefault ["z", "NotFound"];	// returns "NotFound"
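Since get returns Nothing for a missing key, the result should be checked before it is used. One possible way to do that, shown here as a sketch, is isNil with a code block:

```sqf
private _myMap = createHashMapFromArray [["a",1], ["b",2]];

// isNil evaluates the code and reports whether the result is Nothing.
private _hasZ = !isNil { _myMap get "z" };	// false
private _hasA = !isNil { _myMap get "a" };	// true

// getOrDefault avoids the check when a sensible fallback value exists.
private _z = _myMap getOrDefault ["z", 0];	// 0
```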

Removing an element

You can remove (delete) elements from the HashMap using deleteAt with the element's key:

private _myMap = createHashMapFromArray [["a",1], ["b",2]];
_myMap deleteAt "b";	// _myMap is now [["a",1]]

Checking if an element exists

You can check if a key is present in the HashMap using the in command:

private _myMap = createHashMapFromArray [["a",1], ["b",2]];
"a" in _myMap;	// returns true
"z" in _myMap;	// returns false

Counting elements

The count command can be used to return the number of key-value pairs stored in the HashMap:

private _myMap = createHashMapFromArray [["a",1], ["b",2]];
count _myMap;	// returns 2

Retrieving keys

You can retrieve an Array of all keys in the HashMap using the keys command:

private _myMap = createHashMapFromArray [["a",1], ["b",2]];
keys _myMap; // returns ["a", "b"] (the order of the keys is not guaranteed)

HashMap variables

A HashMap variable is a reference to the HashMap (see Wikipedia); this means that if the HashMap is edited, all scripts and functions using this HashMap will see the changes.

private _myMap = createHashMapFromArray [["a",1], ["b",2], ["c",3]];
private _myNewMap = _myMap;
_myMap set ["z", 4];
_myNewMap get "z"; // will be 4

A HashMap set through setVariable does not need to be assigned again:

player setVariable ["myMap", createHashMapFromArray [["a",1], ["b",2], ["c",3]]];
private _myMap = player getVariable "myMap";
_myMap set ["z", 4];
player getVariable "myMap"; // is [["a",1], ["b",2], ["c",3], ["z",4]]
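The same reference behaviour applies when a HashMap is passed to a function: the function works on the caller's HashMap, so nothing has to be returned. The function name _fnc_addScore and the score data below are hypothetical.

```sqf
// Hypothetical helper: adds points to a player's score inside the given HashMap.
private _fnc_addScore = {
	params ["_scores", "_uid", "_points"];
	_scores set [_uid, (_scores getOrDefault [_uid, 0]) + _points];
};

private _scores = createHashMap;
[_scores, "Player_A_UID", 5] call _fnc_addScore;
[_scores, "Player_A_UID", 3] call _fnc_addScore;

private _total = _scores get "Player_A_UID";	// 8
```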

Copying a HashMap

private _myMap = createHashMapFromArray [["a",1], ["b",2]];
private _myNewMap = _myMap;
_myMap set ["a", 1337];
_myNewMap get "a"; // will be 1337

A plain assignment only copies the reference, so both variables point to the same HashMap. In order to avoid this behaviour, copy the HashMap with + (plus):

private _myMap = createHashMapFromArray [["a",1], ["b",2]];
private _myNewMap = +_myMap;
_myMap set ["a", 1337];
_myNewMap get "a"; // still 1

Arrays stored as key or value in the HashMap will also be deep-copied.
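A short sketch of what the deep copy means for an Array stored as a value (variable names are illustrative):

```sqf
private _myMap = createHashMapFromArray [["list", [1, 2, 3]]];
private _sameMap = _myMap;	// second reference to the same HashMap
private _copiedMap = +_myMap;	// independent copy, including the stored Array

// Modify the Array stored in the original HashMap.
(_myMap get "list") pushBack 4;

private _sameCount = count (_sameMap get "list");	// 4
private _copiedCount = count (_copiedMap get "list");	// 3
```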


Advanced usage

Iterating through a HashMap

In general, HashMaps have to be considered unordered. While iterating through them is possible with forEach, it is less efficient than looping through Arrays.

private _myMap = createHashMapFromArray [["a",1], ["b",2]];
{ systemChat str [_x, _y] } forEach _myMap;

When iterating through a HashMap with forEach, _x contains the key of the current element and _y contains the corresponding value.
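If a predictable order is needed, one option (a sketch, not the only approach) is to sort the keys first and then look up each value:

```sqf
private _myMap = createHashMapFromArray [["b",2], ["c",3], ["a",1]];

// keys returns the keys in no particular order, so sort them (ascending) first.
private _sortedKeys = keys _myMap;
_sortedKeys sort true;

// Collect [key, value] pairs in sorted key order.
private _pairs = _sortedKeys apply { [_x, _myMap get _x] };
// _pairs is [["a",1], ["b",2], ["c",3]]
```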


Common errors

Scalar Key precision

Numbers in Arma 3 are floating point numbers, and because there are gaps between floating point numbers, rounding is necessary - see also Wikipedia. For example, 87654316, 87654317, 87654318, 87654319, 87654320, 87654321, 87654322, 87654323 and 87654324 will all be rounded to and treated as the same value by the game engine (because the actual value of each of these numbers cannot be represented as an Arma 3 floating point number). Similar problems occur with fractional numbers:

// This will return false:
(0.3 + 0.4 == 0.7)

This means that using very large numbers or fractional numbers as HashMap keys has to be done cautiously to avoid accidentally overwriting existing keys.
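The effect on HashMap keys can be sketched with two of the numbers from the list above. Using Strings as keys instead is shown as one possible workaround; that suggestion is an assumption of this example and not part of the original text.

```sqf
private _byNumber = createHashMap;
_byNumber set [87654320, "first"];
_byNumber set [87654321, "second"];	// rounds to the same Number as 87654320
private _numberCount = count _byNumber;	// 1, the second set overwrote the first

// Possible workaround (assumption): keep such identifiers as Strings.
private _byString = createHashMap;
_byString set ["87654320", "first"];
_byString set ["87654321", "second"];
private _stringCount = count _byString;	// 2
```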


See Also