String Functions

This document contains the string functions supported by e6data.

| | - Concatenation Operator

returns the final string expression by concating two expressions

> select 'hello' || 'world';
'helloworld'

> select concat('hello ', 'world')
'hello world'

LIKE - Like Operator

The like operator is used to match a specified pattern in a string expression. The pattern contains regular characters like %, _

select colA from table where colA like 'P_%'

ILIKE - ILike Operator

The ilike is a case-insensitive operator used to match a specified pattern in a string expression. The pattern contains regular characters like %, _

> select colA from table where colA ilike 'P_%'
> select colA from table where colA not ilike 'P_%'

RLIKE - RLike Operator

The ilike is a case-insensitive operator used to match a specified pattern in a string expression. The pattern contains regular characters like %, _

> select 'E6data' RLIKE 'E[0-9][a-z]+'
true

CHARACTER_LENGTH / CHAR_LENGTH / LEN / LENGTH ( <expr> )

Returns the length of a given string expression.

> select character_length('e6data');
6

REPLACE( <expr>, <search> [, <replace> ] )

expr - string expression to be searched/modified. search - string expression to be searched for in the input expression. replace - optional string expression to be replaced with the search expression. Default is an empty string.

Replaces all instances of search with replace in string expression

> select replace('hard worker', 'hard', 'smart');
'smart worker'

> select replace('hard worker', 'hard');
'worker'

TRIM( <expr> )

Trims/removes leading and trailing whitespace in the input string expression.

> select trim('   Fastest database engine  ');
'Fastest database engine'

LTRIM( <expr> )

Removes leading whitespace in the input string expression.

> select ltrim('  Fastest database engine--');
'Fastest database engine--'

RTRIM( <expr> )

Removes trailing whitespace in the input string expression.

> select rtrim('Fastest database engine  ');
'Fastest database engine'

LOWER( <expr> )

Returns the input string expression converted to lowercase characters.

> select lower('FASTEST DataBAse ENgine- 2022');
'fastest database engine- 2022'

UPPER( <expr> )

Returns the input string expression converted to uppercase characters.

> select upper('Fastest database engine- 2022');
'FASTEST DATABASE ENGINE- 2022'

SUBSTRING( <expr>, start, length )

start - numeric expression representing a starting position in the string. length - numeric expression representing the length of the substring.

Returns a substring from an input string expression of the given length beginning from the start index.

Note: Indexing starts from 1.

> select substring('Fastest database engine', 9, 8)
'database'

SUBSTR( <expr>, start, length )

This is an alias of the substring function

INITCAP( <expr> )

Returns string expression with the first letter of each word converted to uppercase.

> select initcap('hello world');
'Hello World'

CHARINDEX( <expr> , string [, startindex] )

Returns the first occurrence of the starting position of the input expression within string. Default value of startindex is 1.

> select charindex('a', 'abc')
1

> select charindex('a', 'abc', 2)
0

POSITION( <expr> in <expr> from [startindex] )

Returns first occurrence of the starting position of the input expression within string. Default value of startindex is 1.

> select position('6' in 'e6data-e6data');
2

> select position('6' in 'e6data-e6data' from 3);
9

RIGHT ( <string>, <offset> )

Returns the rightmost substring of its input. Offset index starts from 1.

Supported datatypes are string, integer, decimal, date, timestamp

> SELECT right('abc', 2);
a

LEFT ( <string>, <offset> )

Returns the leftmost substring of its input. Offset index starts from 1

Supported datatypes are string, integer, decimal, date, timestamp

> SELECT left('abc', 2);
c

LOCATE ( <substring> , <string> [, <start_position> ] )

Returns the position of the first occurrence of a substring in a string

Index starts with 1

> SELECT LOCATE('6', 'e6data');
2
> SELECT LOCATE('6', 'e6data', 3);
0

CONTAINS_SUBSTR( <expr> , <search_value_literal> )

Returns TRUE if the value exists, otherwise returns FALSE.

Supported datatype are, string, integer, decimal, date, timestamp

> SELECT CONTAINS_SUBSTR('abcdef', 'ef');
TRUE

Limitations

  • Complex types are not supported.

  • Unicode characters are not supported.

INSTR( <expr>, subvalue [, position, occurrence] )

Returns the lowest position of subvalue in value.

> SELECT INSTR('helloooooooo','oo', 1, 3)
9

Limitations

  • Complex types are not supported.

SOUNDEX( <expr> )

Returns a string that contains a phonetic representation of the input string/integer.

> select soundex('winter')
w536

SPLIT( <expr>, [delimiter] )

Splits <expr> using the delimiter argument.

The default delimiter is the comma , Supported Datatype are string

> SELECT SPLIT('A-B-C-D-E-F-G-H', '-' )	 
[ A, B, C, D, E, F, G, H ]

Limitations

  • Even after converting to Varchar, the passing of a date or timestamp is not supported.

SPLIT_PART( <expr>, <delimiter>, <position> )

Splits a given <expr> based on the <delimiter> Supported Datatype are string

> SELECT SPLIT_PART('abc@def@ghi','@',2)
def

Limitations

  • Unicode characters are not supported

ASCII( <expr> )

Return the ASCII code for the first character of a given string. Supported Datatype are string

> select ascii('a')
97

REPEAT( <input_string> , <repeat_value> )

Create a string by duplicating the input by number of repetitions specified.

Supported datatypes: string, integer, decimal

> select repeat('a', 4)
aaaa

Usages:

  • Output is NULL, if either <input_string> or <repeat_value> is NULL

  • Unicodes are not supported in <input_string>

ENDSWITH / ENDS_WITH( <expr_1> , <expr_2> )

Returns TRUE if the first expression ends with second expression

Support Datatype is String

> SELECT ENDSWITH('e6data','ta');
true

STARTSWITH / STARTS_WITH( <expr_1> , <expr_2> )

Returns TRUE if the first expression starts with second expression

Support Datatype is String

> SELECT STARTS_WITH('e6data','ta');
false

STRPOS ( <string>, <substring> [, <start_position> ] )

Returns the position of the first occurrence of a substring in a string

Index starts with 1

> SELECT STRPOS('e6data','6');
2
> SELECT STRPOS('e6data','6', 3);
0

LPAD ( <expr>, <length> [, <pad> ] )

LPAD function is used to left-pad a string with a specified character or set of characters to a certain length

> SELECT LPAD('123', 5, '0');
00123

RPAD ( <expr>, <length> [, <pad> ] )

RPAD function is used to right-pad a string with a specified character or set of characters to a certain length

> SELECT RPAD('123', 5, '0');
12300

REVERSE ( <expr> )

Returns the reversed order of characters of string, integer and boolean values

Support datatype: INT/DOUBLE/VARCHAR/BOOLEAN

> select reverse(53);
35

TO_CHAR/TO_VARCHAR ( <expr> )

TO_CHAR/TO_VARCHAR ( <date_or_timestamp_expr> [, '<format>' ] )

This function transforms the given expression into a string.

Support datatype: INT/FLOAT/DOUBLE/STRING/DATE/TIMESTAMP/BOOL

> select to_char(cast('2022-01-02 18:09:22' as timestamp), 'dd MMM yyyy hh:mm:ss a');
02 Jan 2022 06:09:22 PM

Usages:

  1. we support java format string for date and timestamp type. However we are not supporting zone as the format

  2. In the formate string we don’t support # as it is a reserved keyword.

  3. We don’t support format parameter for other data type except for date and timestamp type

Last updated

#930: Cross account hive GCP

Change request updated