Lazarus
Programming => General => Topic started by: julkas on June 13, 2019, 08:41:52 am
-
What is time complexity of the following functions - Length, Low, High?
Thanks.
-
High and low are builtin and mostly compiletime, or cheap (e.g. open arrays).
Length differs in implementation on the type, but should be fairly cheap also
-
Considering that we're talking about strings and arrays here they all have constant time complexity as Low() is either 0 or 1 depending on the type and High() is either constant for static arrays or can be read by a simple lookup from the meta data header of the type (or the hidden high parameter for open array parameters). Same is the case for Length() (High() and Length() are usually implemented by either Length() being High() + 1 or High() being Length() - 1 depending on the type).
-
E.g. a string or dynamic array in Pascal has a time complexity of O(1) compared to C, where it is always O(N). That's because length is stored on these types and C needs strlen to iterate over a string to determine its length. Since length is stored Low and High are also O(1)
-
So,
- For static arrays and short strings: Length, Low, High - O(1).
- For PChar, PWideChar: Length, High - O(n).
Length also supports arguments of type PChar and PWideChar, in which case it is identical to the StrLen and WStrLen functions, respectively. In this case, the function actually calculates the length of the null-terminated string, and its execution time is proportional to the string length because the terminating null character is searched through a linear scan.
- For dynamic arrays: Length, Low, High - O(1).
Please confirm.
-
That is correct. Furthermore longstrings and unicodestrings are also O(1), not only shortstrings. Native Pascal string types always store length at negative offset.
The PChar family is always O(n). So be careful with casts: will slow down your code to C speed.. ::) 8)
-
Except for length as the number of characters ("code points") in a default (UTF-8) string.
-
Except for length as the number of characters ("code points") in a default (UTF-8) string.
Yes, but the time complexity is still better than O(n). UTF8 is only default in Lazarus, though, not in FPC. For UTF8 it is O(log(n)) I believe, but it can also be that the correct length is stored, meaning O(1) again. I will have to check that.
[edit]
Looks like the correct length is stored.
-
If you change a char, you would have to rescan the string anyway, because the length can change. For example, if you change a char that adds an apostrophe to a plain char, the length becomes +1.
-
If you change a char, you would have to rescan the string anyway, because the length can change. For example, if you change a char that adds an apostrophe to a plain char, the length becomes +1.
Length - Low -High are read operations. Of course write operations have a different O, but that was not the question. The time complexity for length-low-high for native Pascal strings is always O(1).
The new length is stored at write time. For shortstring and Ansistring that is still O(1), but for the unicodestring type write operations may be O(n). UTF8 is not a native string type, but write operations are also O(n).
-
No, I mean, if you have an UTF-8 string and you change
MyString[x]
(with Low <= x <= High) from a byte that denotes 'add an apostrophe to the current code point' to the char 'Z', the length of the array stays the same, but the length of the text += 1. You can either make an index that points to all the code points (too much overhead), scan the string when you write something to it, or scan it when you want to know the length of the text. If you scan it after writing, you increase the cost of that operation simply in the hope that writing happens less than looking up the length of the text. It's unlikely that is done. Ergo, looking up the length of the text requires scanning the array.
-
Frankly I am not very interested in such things, if you do not understand modification is a write operation.
Ergo: read operations are - given the question - O(1) as opposed to write operations, which have a different O as i explained.
It is always handy to read a slight introduction:
https://en.wikipedia.org/wiki/Big_O_notation
Theory does matter..... <grumpy >:D , oooooh that was a long time ago...>
Fun: https://www.youtube.com/watch?v=5A7hSaoRv0g O:-) 8-)