string
The string type differs from the text type mainly in how it is indexed. Use string types for identifier strings that may contain special characters, which the analyzer would otherwise drop.
Index
String values are normalised and lowercased for the index document.
Strings are normalized using the icu_normalizer and converted to lower case using the icu_folding token filter.
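The effect of this step can be approximated with Python's standard library. This is only a rough sketch: it assumes NFKC normalization plus case folding is close enough for illustration, the function name normalize_for_index is hypothetical, and the real ICU filters fold more than this (accents, for instance).

```python
import unicodedata


def normalize_for_index(value: str) -> str:
    """Approximate the index normalization (hypothetical helper).

    Assumption: NFKC normalization followed by case folding. This is
    not the ICU implementation, only an illustration of the idea.
    """
    normalized = unicodedata.normalize("NFKC", value)
    return normalized.casefold()


# Special characters are kept; only the case changes.
print(normalize_for_index("Hall/7$"))  # hall/7$
```

Note that the special characters / and $ survive, which is the point of using the string type for identifiers.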
All strings for indexed documents are split into chunks of 8000 UTF-8 characters. When matching full text in analysed form, values longer than 8000 characters cannot easily be matched.
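The chunking can be pictured with the following sketch. It assumes that "characters" means Unicode code points and that chunks are cut at fixed offsets with no overlap; neither detail is confirmed here, and split_into_chunks is a hypothetical name.

```python
CHUNK_SIZE = 8000  # chunk length stated in this section


def split_into_chunks(value: str, size: int = CHUNK_SIZE) -> list:
    """Cut a string into consecutive pieces of at most `size` characters.

    Assumption: fixed offsets, no overlap, counted in code points.
    """
    return [value[i:i + size] for i in range(0, len(value), size)]


chunks = split_into_chunks("a" * 20000)
print([len(c) for c in chunks])  # [8000, 8000, 4000]
```

Under these assumptions, a match that spans a chunk boundary would have to be found across two separate pieces, which illustrates why long values are hard to match in full.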
Sorting
Sorting of string values works as it does for text. In addition to the text sorting, a purely alphanumerical version is stored in the index alongside the numerically sortable variant. This lets values sort in the order Car 10, Car 11, Car 12, Car 100. Some special replacements are always applied.
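One common way to obtain a numerically sortable variant is to zero-pad every run of digits, so that ordinary string comparison orders the numbers by value. The sketch below illustrates that idea only; whether the index builds its sort key this way is an assumption, and numeric_sort_key and the padding width are hypothetical.

```python
import re


def numeric_sort_key(value: str, width: int = 10) -> str:
    """Build a sort key in which digit runs are zero-padded (hypothetical).

    Assumption: padding to a fixed width is enough for the numbers
    that occur in the values being sorted.
    """
    return re.sub(r"[0-9]+", lambda m: m.group().zfill(width), value.lower())


cars = ["Car 100", "Car 11", "Car 10", "Car 12"]

# Plain string order puts "Car 100" before "Car 11".
print(sorted(cars))  # ['Car 10', 'Car 100', 'Car 11', 'Car 12']

# With the padded key, the numbers sort by value.
print(sorted(cars, key=numeric_sort_key))  # ['Car 10', 'Car 11', 'Car 12', 'Car 100']
```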
Export
The XML looks the same as for text.
In this example, the column ref is exported with the value hall/7$.
The CSV and JSON export the string as is.