Scalar functions list¶
boolean
functions¶
- ifthenelse function
-
ifthenelse
(condition, x, y)¶ Returns x if condition is true, else returns y.
Note
The difference with the if construct in most programming languages is that x and y expressions will always be evaluated.
Examples:
>>> sql("select ifthenelse(1>0,'yes','no') as answer") answer ------ yes
-
date
functions¶
- activityindex function
-
activityIndex
(date, c1, c2) → int¶
Specialized function that classifies the provided date argument into a 6-point scale (0 to 5)
Examples:
>>> table1(''' ... '2009-01-01T01:32:03Z' ... '2010-01-01T00:03:13Z' ... '2010-12-31T00:03:13Z' ... '2011-04-01T00:03:13Z' ... ''') >>> sql("select activityIndex(a) from table1") activityIndex(a) ---------------- 0 1 3 5
-
- cleantimezone function
-
cleantimezone
(date) → date¶
Specialized function that removes timezone information from date string
Examples:
>>> table1(''' ... '2009-01-01T01:03:13+0100' ... '2009-01-01T01:03:13-0100' ... '2009-01-01T01:03:13+01:00' ... '2009-01-01T01:03:13-01:00' ... '2009-01-01T01:03:13+01' ... '2009-01-01T01:03:13-01' ... ''') >>> sql("select cleantimezone(a) from table1") cleantimezone(a) ------------------- 2009-01-01 01:03:13 2009-01-01 01:03:13 2009-01-01 01:03:13 2009-01-01 01:03:13 2009-01-01 01:03:13 2009-01-01 01:03:13
-
- date2iso function
-
date2iso
(sec) → ISO Datetime¶
Converts an input date to ISO-8601 date format. It tries to autodetect, the input date format.
Examples:
>>> table1(''' ... 2007-12-31 ... 2010-01-01 ... 2010W06 ... "18/Jan/2011:11:13:00 +0100" ... ''')
>>> sql("select date2iso(a) from table1") date2iso(a) ------------------------- 2007-12-31T00:00:00+00:00 2010-01-01T00:00:00+00:00 2010-02-05T00:00:00+00:00 2011-01-18T11:13:00+01:00
-
- datestrf2isoweek function
-
dateisoweek2week52
(sec) → isoweek¶
Converts an ISOweek (having weeks in range [0,53]) to an ISOweek format which has weeks in range [1,53]. This function is usefull for producing week statistics which do not have incomplete weeks.
Examples:
>>> table1(''' ... 2007-12-31 ... 2010-01-01 ... ''')
>>> sql("select strftime('%YW%W',a) from table1") strftime('%YW%W',a) ------------------- 2007W53 2010W00
>>> sql("select datestrf2isoweek(strftime('%YW%W',a)) from table1") datestrf2isoweek(strftime('%YW%W',a)) ------------------------------------- 2007W53 2009W53
-
- sectohuman function
-
sectohuman
(sec) → human readable format¶
Converts a number of seconds to human readable format.
Examples:
>>> table1(''' ... 3 ... 63 ... 10000 ... 100000 ... 1000000 ... ''') >>> sql("select sectohuman(a) from table1") sectohuman(a) ------------------------------ 3 sec 1 min 3 sec 2 hours 46 min 40 sec 1 day 3 hours 46 min 40 sec 11 days 13 hours 46 min 40 sec
-
evals
functions¶
- pyeval function
-
pyeval
(expression)¶
Evaluates with Python the expression/s given and returns the result
>>> sql("pyeval '1+1'") pyeval('1+1') ------------- 2 >>> sql("select var('test')") Traceback (most recent call last): ... OperatorError: YeSQL SQLError: Operator VAR: Variable 'test' does not exist >>> sql("select var('test', pyeval('1+1'))") var('test', pyeval('1+1')) -------------------------- 2 >>> sql("select var('test')") var('test') ----------- 2 >>> sql('''pyeval '1+1' '"-"' '3+1' ''') pyeval('1+1','"-"','3+1') ------------------------- 2-4 >>> sql("var 'testvar' of select 5") var('testvar',(select 5)) ------------------------- 5 >>> sql("pyeval 'testvar+5'") pyeval('testvar+5') ------------------- 10 >>> sql('''pyeval keywords('lala') ''') pyeval('keywords(''lala'')') ---------------------------- lala
-
- pyfun function
-
pyfun
(pyfunction, parameters)¶
Calls a python function and returns the result. If an error occurs, it throws an exception.
>>> sql("select pyfun('math.sqrt', 25)") pyfun('math.sqrt', 25) ---------------------- 5.0 >>> sql("select pyfun('math.log10', 100)") pyfun('math.log10', 100) ------------------------ 2.0 >>> sql("select pyfun('math.log10', -1)") Traceback (most recent call last): ... OperatorError: YeSQL SQLError: Operator PYFUN: math.log10: math domain error
-
- pyfunerrtonul function
-
pyfunerrtonul
(pyfunction, parameters)¶
Calls a python function and returns the result. If an error occurs it returns null.
>>> sql("select pyfunerrtonul('math.sqrt', -1)") pyfunerrtonul('math.sqrt', -1) ------------------------------ None >>> sql("select pyfunerrtonul('math.log10', -1)") pyfunerrtonul('math.log10', -1) ------------------------------- None
-
query function
- subst function
-
subst
(str, variables)¶
Substitutes the special text markers with the variables values.
>>> sql('''subst 'Variable %s has value %s' 'var1' '5' ''') subst('Variable %s has value %s','var1','5') -------------------------------------------- Variable var1 has value 5
>>> sql('''select subst('Variable %s has value %d','var2',5) ''') subst('Variable %s has value %d','var2',5) ------------------------------------------ Variable var2 has value 5
>>> sql('''var 'testvar' 'testvalue' ''') var('testvar','testvalue') -------------------------- testvalue
>>> sql('''select subst('Variable %{testvar}1 %{testvar1} has value %s', 5) ''') subst('Variable testvalue1 %{testvar1} has value %s', 5) -------------------------------------------------------- Variable testvalue1 %{testvar1} has value %s
-
fileops
functions¶
- fileextension function
-
fileextension
(text) → text¶
Returns the extension of a given text argument.
Examples:
>>> table1(''' ... "http://www.test.com/lalala.gif" ... "http://www.test.com/lalala.GIF" ... ''') >>> sql("select fileextension(a) from table1") fileextension(a) ---------------- .gif .gif
-
- filetext function
-
filetext
(filename) → text¶
Returns the contents of the file in a single value
Examples:
>>> sql("select filetext('testing/sales.tsv')") filetext('testing/sales.tsv') ---------------------------------- Cars 2010-01 Athens 200 Cars 2010-02 Athens 130 Bikes 2010-01 NY 10 Bikes 2010-02 NY 30 Cars 2010-01 NY 100 Cars 2010-02 NY 160 Cars 2010-01 Paris 70 Cars 2010-02 Paris 20 Bikes 2010-01 Paris 100 Bikes 2010-02 Paris 20 Boats 2010-01 Paris 200
-
formating
functions¶
- dateformat function
-
dateformat
(date[, inpformat, outformat])¶
Returns date formatted with outformat. Default inpformat is %d-%m-%y and default outformat is %Y-%m-%d. To change inpformat just provide as second parameter the input format. To change outformat, both format parameters must be provided. If date is not formatted according to inpformat null value is returned.
Check formatting pattern details
Examples:
>>> sql("select dateformat('28-01-09') as date") date ---------- 2009-01-28
When date doesn’t meet the format null is returned
>>> sql("select dateformat('32-01-09') as date") date ---- None >>> sql("select dateformat('Thu, 28 Jun 2001 14:17:15 +0000','%a, %d %b %Y %H:%M:%S +0000') as date") date ---------- 2001-06-28 >>> sql("select dateformat('28-01-09','%d-%m-%y','%x') as date") date -------- 01/28/09
-
- strjoin function
-
strjoin
(value1, value2, ....[, 'params', formatting options])¶
Returns a string with value1,value2,…. formatted according to formatting options. Literal parameter ‘params’ must precede formating options. Default formatting is comma separated values.
Examples:
>>> sql("select strjoin('First','Second','Third',100) as joinedvals") joinedvals ----------------------- First,Second,Third,100 >>> sql("select strjoin('First','Second','Third',100,'params','delimiter: ') as joinedvals") joinedvals ----------------------- First Second Third 100 >>> sql("select strjoin('lola',2,10,'llalallall','params','delimiter:%','quoting:QUOTE_ALL') as joinedvals") joinedvals ----------------------------- "lola"%"2"%"10"%"llalallall" >>> sql("select strjoin('value, with delimiter ',2,10,'new','params','dialect:csv') as joinedvals") joinedvals ---------------------------------- "value, with delimiter ",2,10,new
-
strsplit function
strsplit
(str[, formatting options]) → [C1,C2,....]¶Splits str according to formatting options, default behavior is to split on space. It is a multiset operator that returns one row.
See also
- Multisets functions
Formatting options:
dialect: tsv/csv
Formats field as tab/comma separated values with minimal quoting
delimiter: A string used to separate fields. It defaults to ‘ ‘
doublequote: t/f
Controls how instances of quotechar appearing inside a field should be themselves be quoted. When True, the character is doubled. When False, the escapechar is used as a prefix to the quotechar. It defaults to True. On output, if doublequote is False and no escapechar is set, Error is raised if a quotechar is found in a field
escapechar: A one-character string used by the writer to escape the delimiter if quoting is set to QUOTE_NONE and the quotechar if doublequote is False. On reading, the escapechar removes any special meaning from the following character. It defaults to None, which disables escaping.
quotechar: A one-character string used to quote fields containing special characters, such as the delimiter or quotechar, or which contain new-line characters. It defaults to ‘”’.
quoting: Controls when quotes should be generated by the writer and recognized by the reader. It can take on any of the QUOTE_* constants and defaults to QUOTE_MINIMAL. Possible values are QUOTE_ALL, QUOTE_NONE, QUOTE_MINIMAL, QUOTE_NONNUMERIC
skipinitialspace: t/f
When True, whitespace immediately following the delimiter is ignored. The default is False
Returned multiset schema:
- C1,C2…
- Text values produced from splitting.
Examples:
>>> sql("select strsplit('First,Second,Third', 'dialect:csv')") C1 | C2 | C3 ---------------------- First | Second | Third >>> sql("select strsplit('-First-%Second%-Third-','delimiter:%','quotechar:-')") C1 | C2 | C3 ---------------------- First | Second | Third
- strsplitv function
-
strsplitv
(str[, formatting options]) → [C1]¶
Splits in rows str according to formatting options, default behavior is to split on space. It is a multiset operator that returns one column but many rows. Formatting options are like in
strsplit()
function.See also
- Multisets functions
>>> sql("select strsplitv('First,Second,Third','dialect:csv')") C1 ------ First Second Third >>> sql("select strsplitv('First Second Third')") C1 ------ First Second Third
-
htmlops
functions¶
- htmladdbreaks function
-
url
(href, linktext)¶
Returns the a url pointing to href and having the link text linktext.
Examples:
>>> sql("select htmladdbreaks('very-long/string') as brokenhtml") brokenhtml -------------------------- very-<wbr>long/<wbr>string
-
- htmldecode function
-
htmldecode
(str)¶
Returns the html decoded str.
Examples:
>>> sql("select htmldecode('("die+wunderbaren+jahre")') as query") query ------------------------- ("die+wunderbaren+jahre") >>> sql("select htmldecode(null) as query") query ----- None
-
- htmlencode function
-
htmldecode
(str)
Returns the html decoded str.
Examples:
>>> sql("select htmldecode('("die+wunderbaren+jahre")') as query") query ------------------------- ("die+wunderbaren+jahre") >>> sql("select htmldecode(null) as query") query ----- None
-
- htmllink function
-
htmllink
(href, linktext)¶
Returns the an html link pointing to href and having the link text linktext.
Examples:
>>> sql("select htmllink('http://somewhere.org') as url") url ----------------------------------------------------------------- <a href="http://somewhere.org">http://<wbr>somewhere.<wbr>org</a>
>>> sql("select htmllink('somewhere.org') as url") url ----------------------------------------------------- <a href="http://somewhere.org">somewhere.<wbr>org</a>
>>> sql("select htmllink('somewhere.org', 'go somewhere') as url") url ----------------------------------------------- <a href="http://somewhere.org">go somewhere</a>
-
- htmlstriptags function
Strips the html tags of input. It also converts “<br>” tags to new lines. If a default_tag_conversion is provided then tags that would have been erased are converted to default_tag_conversion.
Examples:
>>> sql("select htmlstriptags('<tag1>asdf<>as< br>df<p class = lala>spaced</sp>paragraph</p>anotherline<tag2> w<sup>3</sup>') as query") query ------------------------------------------- asdfas df spaced paragraph anotherline w_3
>>> sql("select htmlstriptags('<tag1>asdf<>as< br>df<p class = lala>spaced</sp>paragraph</p>anotherline<tag2> w<sup>3</sup>', '***') as query") query ---------------------------------------------------- ***asdf***as df spaced paragraph anotherline*** w_3
>>> sql("select htmlstriptags(null) as query") query -----
- urldecode function
-
urldecode
(str)¶
Returns the url decoded str.
Examples:
>>> sql("select urldecode('where%2Ccollid%3Dcolid+and+u%3D%27val%27') as query") query ------------------------------ where,collid=colid and u='val'
>>> sql("select urldecode(null) as query") query ----- None
-
- urlencode function
-
urlescape
(str)¶
Returns the escaped URL.
Examples:
>>> sql("select urlencode('where, collid=colid') as query") query ----------------------- where%2C+collid%3Dcolid
-
- urllocation function
-
urllocation
(str) → str¶
Returns the location part of provided URL.
Examples:
>>> table1(''' ... http://www.test.com/apath/bpath/fname.pdf ... http://www.test.com/search.csv;p=5?q=test#hl=en ... ''') >>> sql("select urllocation(a) from table1") urllocation(a) ----------------------------------------- http://www.test.com/apath/bpath/fname.pdf http://www.test.com/search.csv
-
- urlquery2jdict function
-
urlquery2jdict
(URL or URL_query_part) → JDICT¶
Converts the query part of a URL into a JSON associative array.
Examples:
>>> table1(''' ... 'url_ver=ver1&url_tim=2011-01-01T00%3A02%3A40Z' ... 'url_tim=2011-01-01T00%3A02%3A40Z&url_ver=ver1' ... http://www.test.com/search.csv;p=5?lang=test&ver=en ... ''') >>> sql("select urlquery2jdict(a) from table1") urlquery2jdict(a) --------------------------------------------------- {"url_tim":"2011-01-01T00:02:40Z","url_ver":"ver1"} {"url_tim":"2011-01-01T00:02:40Z","url_ver":"ver1"} {"lang":"test","ver":"en"}
-
- urlsplit function
-
urlsplit
(text1[, text2, ...]) → multiset¶
Breaks a given URL into multiple fields. The returned table schema is:
scheme: What type the URL is (e.g. http, ftp …) netloc: Network location of URL (e.g. www.text.com) path: Path part of URL (e.g. /data/2010/). It always has a slash at the end filename: Filename part of URL type: Mime type of URL, or if not a mime type exists, the extension part of filename. subtype: Mime subtype of URL. params: All parameters following ‘;’ in URL. query: All parameters following ‘?’ in URL. fragment: All parameters following ‘#’ in URL. Examples:
>>> table1(''' ... http://www.test.com/apath/bpath/fname.pdf ... http://www.test.com/search.csv;p=5?q=test#hl=en ... ''') >>> sql("select urlsplit(a) from table1") scheme | netloc | path | filename | type | subtype | params | query | fragment ------------------------------------------------------------------------------------------------------- http | www.test.com | /apath/bpath/ | fname.pdf | application | pdf | | | http | www.test.com | / | search.csv | csv | | p=5 | q=test | hl=en
-
iptools
functions¶
- ip2long function
-
ip2long
(ip) → int¶
Converts a decimal dotted quad IP string to long integer IP format. It can take either one column of IP strings or 4 columns each having one part of the IP address.
Examples:
>>> sql("select ip2long('123.123.123.123')") ip2long('123.123.123.123') -------------------------- 2071690107 >>> sql("select ip2long(123,123,123,123)") ip2long(123,123,123,123) ------------------------ 2071690107
-
- ip_prefix function
-
ip_prefix
(ip, class_number) → ip¶
Returns the subnetwork class of an IP address.
Examples:
>>> sql("ip_prefix '123.34.24.54' ") ip_prefix('123.34.24.54') ------------------------- 4 >>> sql("ip_prefix '123.34.24.54' '3'") ip_prefix('123.34.24.54','3') ----------------------------- 123.34.24 >>> sql("ip_prefix '123.34.24.54' '2'") ip_prefix('123.34.24.54','2') ----------------------------- 123.34
-
- long2ip function
-
long2ip
(int) → ip¶
Convert longint IP to dotted quad string
Examples:
>>> sql("select long2ip('2071690107')") long2ip('2071690107') --------------------- 123.123.123.123 >>> sql("select long2ip(2071690107)") long2ip(2071690107) ------------------- 123.123.123.123
-
jpacks
functions¶
- j2nl function
-
j2nl
(jpack) → text¶
Converts multiple input jpacks to a newline separated text.
Examples:
>>> sql("select j2nl('[1,2,3]')") j2nl('[1,2,3]') --------------- 1 2 3
>>> sql("select j2nl('[1,2,3]','a')") j2nl('[1,2,3]','a') ------------------- 1 2 3 a
>>> sql("select j2nl('a', 'b')") j2nl('a', 'b') -------------- a b
-
- j2s function
-
j2s
(jpack) → space separated string¶
Converts multiple input jpacks to a space separated string. Newlines are converted to spaces.
Examples:
>>> sql("select j2s('[1,2,3]')") j2s('[1,2,3]') -------------- 1 2 3
>>> sql("select j2s('[1,2,3]','a')") j2s('[1,2,3]','a') ------------------ 1 2 3 a
>>> sql("select j2s('a', 'b')") j2s('a', 'b') ------------- a b
-
- j2t function
-
j2t
(jpack) → tabpack¶
Converts multiple input jpacks to a tab separated pack (tab separated values). If tab or newline characters are found in the source jpack they are converted to spaces.
Examples:
>>> sql("select j2t('[1,2,3]')") j2t('[1,2,3]') -------------- 1 2 3
>>> sql("select j2t('[1,2,3]','a')") j2t('[1,2,3]','a') ------------------ 1 2 3 a
>>> sql("select j2t('a', 'b')") j2t('a', 'b') ------------- a b
-
- jchars function
-
jletters
(text) → character jpack¶
Splits an input text into its composing characters.
Examples:
>>> sql("select jchars('this is a text')") jchars('this is a text') --------------------------------------------------------- ["t","h","i","s"," ","i","s"," ","a"," ","t","e","x","t"]
>>> sql("select jchars('another', 'text')") jchars('another', 'text') --------------------------------------------- ["a","n","o","t","h","e","r","t","e","x","t"]
-
- jcombinations function
-
jcombinations
(jpack, r) → multiset¶
Returns all length r combinations of jpack.
Examples:
>>> sql('''select jcombinations('["t1","t2","t3"]',2)''') C1 | C2 ------- t1 | t2 t1 | t3 t2 | t3
>>> sql('''select jcombinations('["t1","t2",["t3","t4"]]',2)''') C1 | C2 ---------------- t1 | t2 t1 | ["t3","t4"] t2 | ["t3","t4"]
>>> sql('''select jcombinations(null,2)''')
>>> sql('''select jcombinations('["t1","t2","t3","t4"]')''') C1 -- t1 t2 t3 t4
-
- jdict function
-
jdict
(key, value, key1, value1) → jdict¶
Returns a jdict of the keys and value pairs.
Examples:
>>> sql(''' select jdict('key1', 'val1', 'key2', 'val2') ''') jdict('key1', 'val1', 'key2', 'val2') ------------------------------------- {"key1":"val1","key2":"val2"}
>>> sql(''' select jdict('key', '{"k1":1,"k2":2}') ''') jdict('key', '{"k1":1,"k2":2}') ------------------------------- {"key":{"k1":1,"k2":2}}
>>> sql(''' select jdict('key', '["val1", "val2"]') ''') jdict('key', '["val1", "val2"]') -------------------------------- {"key":["val1","val2"]}
>>> sql(''' select jdict('1') ''') Traceback (most recent call last): ... OperatorError: YeSQL SQLError: Operator JDICT: At least two arguments required
-
- jdictgroupkey function
-
jdictgroupkey
(list_of_jdicts, groupkey1, groupkey2, ...)¶
It groups an array of jdicts into a hierarchical structure. The grouping is done first by groupkey1 then by groupkey2 and so on.
If no groupkeys are provided, then the first key of array’s first jdict is used as a groupkey.
Examples:
>>> sql('''select jdictgroupkey('[{"gkey":"v1", "b":1},{"gkey":"v1","b":2},{"gkey":"v2","b":1, "c":2}]') as j''') j --------------------------------------------- {"v1":[{"b":1},{"b":2}],"v2":[{"b":1,"c":2}]}
>>> sql('''select jdictgroupkey('[{"gkey":"v1", "b":1},{"gkey":"v1","b":2},{"gkey":"v2","b":1, "c":2}]', "gkey") as j''') j --------------------------------------------- {"v1":[{"b":1},{"b":2}],"v2":[{"b":1,"c":2}]}
>>> sql('''select jdictgroupkey('[{"gkey":"v1", "gkey2":"f1", "b":1},{"gkey":"v1", "gkey2":"f2", "b":2},{"gkey":"v1", "gkey2":"f2", "b":1, "c":2}]', "gkey", "gkey2") as j''') j -------------------------------------------------------------- {"v1":{"gkey2":{"f1":[{"b":1}],"f2":[{"b":2},{"b":1,"c":2}]}}}
-
- jdictkeys function
-
jdictkeys
(jdict) → jpack¶
Returns a jpack of the keys of input jdict
Examples:
>>> sql(''' select jdictkeys('{"k1":1,"k2":2}', '{"k1":1,"k3":2}') ''') jdictkeys('{"k1":1,"k2":2}', '{"k1":1,"k3":2}') ----------------------------------------------- ["k1","k2","k3"]
>>> sql(''' select jdictkeys('{"k1":1,"k2":2}') ''') jdictkeys('{"k1":1,"k2":2}') ---------------------------- ["k1","k2"] >>> sql(''' select jdictkeys('test') ''') jdictkeys('test') ----------------- [] >>> sql(''' select jdictkeys(1) ''') jdictkeys(1) ------------ []
-
- jdictsplit function
-
jdictsplit
(jdict[, key1, key2, ..]) → columns¶
If only the first argument (jdict) is provided, it returns a row containing the values of input jdict (sorted by the jdict keys).
If key values are also provided, it returns only the columns of which the keys have been provided.
Examples:
>>> sql(''' select jdictsplit('{"k1":1,"k2":2}') ''') k1 | k2 ------- 1 | 2
>>> sql(''' select jdictsplit('{"k1":1,"k2":2, "k3":3}', 'k3', 'k1', 'k4') ''') k3 | k1 | k4 -------------- 3 | 1 | None
-
- jdictsplitv function
-
jdictsplitv
(jdict[, key1, key2, ..]) → columns¶
If only the first argument (jdict) is provided, it returns rows containing the values of input jdict.
If key values are also provided, it returns only the columns of which the keys have been provided.
Examples:
>>> sql(''' select jdictsplitv('{"k1":1,"k2":2}') ''') key | val --------- k1 | 1 k2 | 2
>>> sql(''' select jdictsplitv('{"k1":1,"k2":2, "k3":3}', 'k3', 'k1', 'k4') ''') key | val --------- k3 | 3 k1 | 1
-
- jdictvals function
-
jdictvals
(jdict[, key1, key2, ..]) → jpack¶
If only the first argument (jdict) is provided, it returns a jpack of the values of input jdict (sorted by the jdict keys).
If key values are also provided, it returns only the keys that have been provided.
Examples:
>>> sql(''' select jdictvals('{"k1":1,"k2":2}') ''') jdictvals('{"k1":1,"k2":2}') ---------------------------- [1,2]
>>> sql(''' select jdictvals('{"k1":1,"k2":2, "k3":3}', 'k3', 'k1', 'k4') ''') jdictvals('{"k1":1,"k2":2, "k3":3}', 'k3', 'k1', 'k4') ------------------------------------------------------ [3,1,null] >>> sql(''' select jdictvals('{"k1":1}') ''') jdictvals('{"k1":1}') --------------------- 1 >>> sql(''' select jdictvals('{"k1":1}') ''') jdictvals('{"k1":1}') --------------------- 1 >>> sql(''' select jdictvals(1) ''') jdictvals(1) ------------ 1
-
- jexcept function
-
jexcept
(jpackA, jpackB) → jpack¶
Returns the items of jpackA except the items that appear on jpackB.
Examples:
>>> sql("select jexcept('[1,2,3]', '[1,2,3]')") jexcept('[1,2,3]', '[1,2,3]') ----------------------------- []
>>> sql("select jexcept('[1,2,3]', '[1,3]')") jexcept('[1,2,3]', '[1,3]') --------------------------- 2
-
- jfilterempty function
-
jfilterempty
(jpacks.) → jpack¶
Removes from input jpacks all empty elements.
Examples:
>>> sql("select jfilterempty('a', '', '[]')") jfilterempty('a', '', '[]') --------------------------- a
>>> sql("select jfilterempty('a','[null]',3)") jfilterempty('a','[null]',3) ---------------------------- ["a",3]
>>> sql("select jfilterempty('[3]', jpack('b', ''))") jfilterempty('[3]', jpack('b', '')) ----------------------------------- [3,"b"]
-
- jflatten function
-
jflattten
(jpacks) → jpack¶
Flattens all nested sub-jpacks.
Examples:
>>> sql(''' select jflatten('1', '[2]') ''') jflatten('1', '[2]') -------------------- ["1",2]
>>> sql(''' select jflatten('[["word1", 1], ["word2", 1], [["word3", 2], ["word4", 2]], 3]') ''') jflatten('[["word1", 1], ["word2", 1], [["word3", 2], ["word4", 2]], 3]') ------------------------------------------------------------------------- ["word1",1,"word2",1,"word3",2,"word4",2,3]
-
- jfrequentwords function
-
jfrequentwords
(args...) → jpack¶
Returns the frequent words of a text in a jpack
-
- jintersection function
-
jintersection
(jpackA, jpackB) → jpack¶
Returns the items of jpackA except the items that appear on jpackB.
Examples:
>>> sql("select jintersection('[1,2,3]', '[1,2,3]')") jintersection('[1,2,3]', '[1,2,3]') ----------------------------------- [1,2,3]
>>> sql("select jintersection('[1,2,3]', '[1,3]', 1)") jintersection('[1,2,3]', '[1,3]', 1) ------------------------------------ 1
-
- jlen function
-
jlen
(args...) → int¶
Returns the total length in elements of the input jpacks.
Examples:
>>> sql("select jlen('abc')") jlen('abc') ----------- 1
>>> sql("select jlen('a','b',3)") jlen('a','b',3) --------------- 3
>>> sql("select jlen('a', jpack('b',3))") jlen('a', jpack('b',3)) ----------------------- 3
>>> sql("select jlen('[1,2,3]')") jlen('[1,2,3]') --------------- 3
-
- jlengthiest function
-
jlengthiest
(jpacks.) → jpack¶
Returns the string with the greatest length contained in the jpacks.
Examples:
>>> sql("select jlengthiest('a', '', '[]')") jlengthiest('a', '', '[]') -------------------------- a
>>> sql("select jlengthiest('a','longer',3)") jlengthiest('a','longer',3) --------------------------- longer
>>> sql("select jlengthiest('[3]', jpack('b', ''))") jlengthiest('[3]', jpack('b', '')) ---------------------------------- 3
-
- jmerge function
-
jmerge
(jpacks) → jpack¶
Merges multiple jpacks into one jpack.
Examples:
>>> sql("select jmerge('[1,2,3]', '[1,2,3]', 'a', 3 )") jmerge('[1,2,3]', '[1,2,3]', 'a', 3 ) ------------------------------------- [1,2,3,1,2,3,"a",3]
-
- jmergeregexp function
-
jmergeregexp
(jpacks) → jpack¶
Creates a regular expression that matches all of the jpack’s contents. If the input jpack contains keyword pairs, then jmergeregexp returns a regular expression with named groups.
Examples:
>>> sql(''' select jmergeregexp('["abc", "def"]') ''') jmergeregexp('["abc", "def"]') ------------------------------ (?:abc)|(?:def)
>>> sql(''' select jmergeregexp('[["pos", "p1"], ["neg", "n1"], ["pos", "p2"]]') ''') jmergeregexp('[["pos", "p1"], ["neg", "n1"], ["pos", "p2"]]') ------------------------------------------------------------- (?P<neg>n1)|(?P<pos>p1|p2)
>>> sql(''' select jmergeregexp('[]') ''') jmergeregexp('[]') ------------------ _^
>>> sql(''' select jmergeregexp('["ab",""]') ''') jmergeregexp('["ab",""]') ------------------------- (?:ab)
-
- jmergeregexpnamed function
-
jmergeregexpnamed
(jpacks) → jpack¶
Creates a regular expression that matches all of the jpack’s contents with named groups. If the number of named groups in a regular expression is greater than 99, then the output will be a jpack of regular expressions.
Examples:
>>> sql(''' select jmergeregexpnamed('["abc", "def"]') ''') jmergeregexpnamed('["abc", "def"]') ----------------------------------- (abc)|(def)
-
- jngrams function
-
jngrams
(n, text) → jpack¶
Converts multiple input arguments into a jpack of ngrams.
Examples:
>>> sql("select jngrams(1,'This is a test phrase')") jngrams(1,'This is a test phrase') ------------------------------------------- [["This"],["is"],["a"],["test"],["phrase"]]
>>> sql("select jngrams(2,'This is a test phrase')") jngrams(2,'This is a test phrase') --------------------------------------------------------- [["This","is"],["is","a"],["a","test"],["test","phrase"]]
-
- jpack function
-
jpack
(args...) → jpack¶
Converts multiple input arguments into a single string. Jpacks preserve the types of their inputs and are based on JSON encoding. Single values are represented as themselves where possible.
Examples:
>>> sql("select jpack('a')") jpack('a') ---------- a
>>> sql("select jpack('a','b',3)") jpack('a','b',3) ---------------- ["a","b",3]
>>> sql("select jpack('a', jpack('b',3))") jpack('a', jpack('b',3)) ------------------------ ["a",["b",3]]
-
- jpermutations function
-
jpermutations
(jpack, r) → multiset¶
Returns all length r permutations of jpack.
Examples:
>>> sql('''select jpermutations('["t1","t2","t3"]',2)''') C1 | C2 ------- t1 | t2 t1 | t3 t2 | t1 t2 | t3 t3 | t1 t3 | t2
>>> sql('''select jpermutations('["t1","t2",["t3","t4"]]',2)''') C1 | C2 ------------------------- t1 | t2 t1 | ["t3","t4"] t2 | t1 t2 | ["t3","t4"] ["t3","t4"] | t1 ["t3","t4"] | t2
>>> sql('''select jpermutations(null,2)''')
>>> sql('''select jpermutations('["t1","t2","t3","t4"]')''') C1 -- t1 t2 t3 t4
-
- jrange function
-
jrange
(num) → jrange¶
Returns a jrange of integer numbers.
Examples:
>>> sql("select jrange(5)") jrange('a') ----------------- ["0","1","2","3"]
-
- jset function
-
jset
(jpacks) → jpack¶
Returns a set representation of a jpack, unifying duplicate items.
Examples:
>>> sql("select jset('[1,2,3]', '[1,2,3]', 'b', 'a', 3 )") jset('[1,2,3]', '[1,2,3]', 'b', 'a', 3 ) ---------------------------------------- [1,2,3,"a","b"]
-
- jsonpath function
-
jsonpath
(JSON, jsonpathexpr1, jsonpathexpr2) → multiset¶
Uses jsonpath expressions to pick values from inside a JSON input. If the outputs of all JSONpath expressions have the same number of elements in them, it splits the output into multiple rows.
Note
For more on JSONpath see: http://goessner.net/articles/JsonPath/
Examples:
>>> sql('''select jsonpath('{"d1":[{"name":"n1", "value":"v1"}, {"name":"n2", "value":"v2"}]}', '$.d1') ''') C1 ------------------------------------------------------- [{"name":"n1","value":"v1"},{"name":"n2","value":"v2"}]
>>> sql('''select jsonpath('{"d1":[{"name":"n1", "value":"v1"}, {"name":"n2", "value":"v2"}]}', '$.d1[*].name') ''') C1 -- n1 n2
>>> sql('''select jsonpath('{"d1":[{"name":"n1", "value":"v1"}, {"name":"n2", "value":"v2"}]}', '$.d1[*].name', '$.d1[*].value') ''') C1 | C2 ------- n1 | v1 n2 | v2
>>> sql('''select jsonpath('{"d1":[{"name":"n1", "value":"v1"}, {"name":"n2", "value":"v2"}]}', '$.d1[*].name', '$.d1[*].nonexisting') ''') C1 | C2 --------- n1 | None n2 | None
>>> sql('''select jsonpath('{"d1":[{"name":"n1", "value":"v1"}, {"name":"n2"}]}', '$.d1[*].name', '$.d1[*].value') ''') C1 | C2 ---------------- ["n1","n2"] | v1
>>> sql('''select jsonpath('{"d1":[{"name":"n1", "value":"v1"}, {"name":"n2", "value":"v2"}]}', '$.nonexisting') ''')
-
- jsonstrict function
-
jsonstrict
(args...) → json string¶
Sometimes we wish to process json lists from another application. Jsonstrict function tries to always create json compatible lists. So it always returns json lists.
Examples:
>>> sql("select jsonstrict('a')") jsonstrict('a') --------------- ["a"]
>>> sql("select jsonstrict('a','b',3)") jsonstrict('a','b',3) --------------------- ["a","b",3]
>>> sql("select jsonstrict('a', jpack('b',3))") jsonstrict('a', jpack('b',3)) ----------------------------- ["a",["b",3]]
-
- jsort function
-
jsort
(jpacks) → jpack¶
Sorts the input jpacks.
Examples:
>>> sql("select jsort('[1,2,3]', '[1,2,3]', 'b', 'a', 3 )") jsort('[1,2,3]', '[1,2,3]', 'b', 'a', 3 ) ----------------------------------------- [1,1,2,2,3,3,3,"a","b"]
-
- jsplice function
-
jsplice
(jpack, range1_start, range1_end, ...) → jpack¶
Splices input jpack. If only a single range argument is provided, it returns input jpack’s element in provided position. If defined position index is positive, then it starts counting from the beginning of input jpack. If defined position is negative it starts counting from the end of input jpack.
If more than one range arguments are provided, then the arguments are assumed to be provided in pairs (start, end) that define ranges inside the input jpack that should be joined together in output jpack.
Examples:
>>> sql(''' select jsplice('[1,2,3,4,5]',0) ''') jsplice('[1,2,3,4,5]',0) ------------------------ 1
>>> sql(''' select jsplice('[1,2,3,4,5]',-1) ''') jsplice('[1,2,3,4,5]',-1) ------------------------- 5
>>> sql(''' select jsplice('[1,2,3,4,5]',10) ''') jsplice('[1,2,3,4,5]',10) ------------------------- None
>>> sql(''' select jsplice('[1,2,3,4,5]', 0, 3, 0, 2) ''') jsplice('[1,2,3,4,5]', 0, 3, 0, 2) ---------------------------------- [1,2,3,1,2]
>>> sql(''' select jsplice('[1,2,3,4,5]', 2, -1) ''') jsplice('[1,2,3,4,5]', 2, -1) ----------------------------- [3,4]
-
- jsplit function
-
jsplit
(jpacks) → [C1, C2, ...]¶
Splits horizontally a jpack.
Examples:
>>> sql("select jsplit('[1,2,3]', '[3,4,5]')") C1 | C2 | C3 | C4 | C5 | C6 --------------------------- 1 | 2 | 3 | 3 | 4 | 5
-
- jsplitv function
-
jsplitv
(jpacks) → [C1]¶
Splits vertically a jpack.
Examples:
>>> sql("select jsplitv(jmerge('[1,2,3]', '[1,2,3]', 'b', 'a', 3 ))") C1 -- 1 2 3 1 2 3 b a 3
-
- jzip function
-
jzip
(args...) → json string¶
It combines the corresponding elements of input jpacks.
Examples:
>>> sql('''select jzip('["a", "b"]', '[1,2]','[4,5]')''') jzip('["a", "b"]', '[1,2]','[4,5]') ----------------------------------- [["a",1,4],["b",2,5]]
-
- jzipdict function
-
jzipdict
(args...) → json string¶
It combines the correspinding elements of input jpacks into a jdict.
Examples:
>>> sql('''select jzipdict('["a", "b"]', '[1,2]','[4,5]')''') jzipdict('["a", "b"]', '[1,2]','[4,5]') --------------------------------------- {"a":[1,4],"b":[2,5]}
-
- nl2j function
-
nl2j
(text) → jpack¶
Converts a text with newlines to a jpack.
-
- s2j function
-
s2j
(tabpack) → jpack¶
Converts a space separated pack to a jpack.
Examples:
>>> sql("select s2j('1 2 3 ')") s2j('1 2 3 ') -------------- ["1","2","3"]
-
- t2j function
-
t2j
(tabpack) → jpack¶
Converts a tab separated pack to a jpack.
Examples:
>>> sql("select t2j(j2t('[1,2,3]'))") t2j(j2t('[1,2,3]')) ------------------- ["1","2","3"]
>>> sql("select t2j('asdfasdf')") t2j('asdfasdf') --------------- ["asdfasdf"]
-
langtools
functions¶
- detectlang function
-
detectlang
(text1, text2, ...) → text¶
Detects the language of a snippet of text by analysing its statistical properties.
Examples:
>>> sql("detectlang 'ελληνικά'") detectlang('ελληνικά') ------------------------------ greek-utf >>> sql("detectlang this is in english") detectlang('this is in english') -------------------------------- english >>> sql("detectlang ceci est en français") detectlang('ceci est en français') ----------------------------------- french >>> sql("detectlang este es el español") detectlang('este es el español') --------------------------------- spanish
-
- stem function
-
stem
(text1, text2, ...) → text¶
Does stemming according to the porter algorithm.
Examples:
>>> sql("stem 'cutting and creating'") stem('cutting and creating') ---------------------------- cut and creat
>>> sql("stem ceci est en français cutting") stem('ceci est en français cutting') ------------------------------------- ceci est en françai cut
-
- stem_en function
-
stem_en
(text1, text2, ...) → text¶
Detects if the input is in english and only then does the porter stemming else it returns the input arguments concatenated
Examples:
>>> sql("stem_en 'cutting and creating'") stem_en('cutting and creating') ------------------------------- cut and creat
>>> sql("stem_en ceci est en français cutting") stem_en('ceci est en français cutting') ---------------------------------------- ceci est en français cutting
-
logparse
functions¶
- apachelogsplit function
-
apachelogsplit
(apache_log_line) → [ip, ident, authuser, date, request, status, bytes, referrer, useragent]¶
Breaks a single apache log row into multiple fields.
Examples:
>>> table1(''' ... '1.1.1.1 - - [01/Feb/2001:01:02:03 +0001] "HEAD /test.com HTTP/1.1" 200 - "-" "reftest"' ... ''') >>> sql("select apachelogsplit(a) from table1") ip | ident | authuser | date | method | uri | httpver | status | bytes | referrer | useragent ---------------------------------------------------------------------------------------------------------------------------- 1.1.1.1 | None | None | 2001-02-01T01:02:03+0001 | HEAD | /test.com | 1.1 | 200 | None | None | reftest
-
mathops
functions¶
- farith function
-
farith
(calc) → float or Fraction¶
Takes as input a mathematical expression in polish notation and computes the result using fractional computation
Examples:
>>> sql("select farith('+',5,7)" ) farith('+',5,7) --------------- 12
>>> sql("select farith('-','*','/',15,'-',7,'+',1,1,3,'+',2,'+',1,1)" ) farith('-','*','/',15,'-',7,'+',1,1,3,'+',2,'+',1,1) ---------------------------------------------------- 5
-
- gaussdistribution function
-
gaussdistribution
(mean, sigma) → float¶
Returns a gaussian distribution. Sigma is the standard deviation of the distribution
Examples:
>>> sql("select gaussdistribution(10,5)") gaussdistribution(10,5) ----------------------- ...
-
- randomrange function
-
randomrange
(start, end, step) → int¶
Returns a random number in the defined range
Examples:
>>> sql("select randomrange(0, 68, 1)") randomrange(0, 68, 1) --------------------- ...
>>> sql("select randomrange(0, 68)") randomrange(0, 68) ------------------ ...
-
- safediv function
-
safediv
(int, int, int) → int¶
Returns the first argument, when the division of the two subsequent numbers includes zero in denominator (i.e. in third argument)
Examples:
>>> sql("select safeDiv(1,5,0)") safeDiv(1,5,0) -------------- 1
-
- sqroot function
-
sqroot
(int) → int¶
Returns the square root of a given argument.
Examples:
>>> table1(''' ... 25 ... ''') >>> sql("select sqroot(a) from table1") sqroot(a) --------- 5.0
-
- tonumber function
-
tonumber
(variable) → int or float¶
Convert variable, whose type is str or unicode, to int or float, if it is feasible
Examples:
>>> sql("select tonumber('12.3') as val") val ---- 12.3
>>> sql("select tonumber(12.3) as val") val ---- 12.3
>>> sql("select tonumber('not a number') as val") val ------------ not a number
>>> sql("select tonumber(null) as val") val ---- None
-
settings
functions¶
- setlog function
-
setlog
(filename)¶
Sets the log file path/filename for exec operator
-
- setting function
-
setting
(setting_name[, value])¶
Sets and returns a setting’s value
Examples:
>>> sql("setting 'tracing' 0") setting('tracing','0') ---------------------- False
>>> sql("setting 'tracing'") setting('tracing') ------------------ False
-
- toggle function
-
toggle
(setting_name)¶
Toggles a boolean global setting
Examples:
>>> sql("toggle lala") toggle('lala') -------------- lala not found
>>> sql("toggle tracing") toggle('tracing') -------------------- tracing is now: True
-
similarity
functions¶
- jaccard function
-
jaccard
(jpack1, jpack2)¶
Return jaccard similarity value of two jpacks.
Example:
>>> table1(''' ... user1 movie1 20 ... user1 movie2 30 ... user2 movie1 40 ... user2 movie3 90 ... user2 movie4 90 ... user3 movie1 40 ... user3 movie3 80 ... user4 movie1 70 ... user4 movie2 10 ... ''')
NOTE that only column b is jgrouped because jaccard operates on packs as sets, not weighted values, So packing also column c would not make any difference.
>>> sql("""select u1.userid,u2.userid, jaccard(u1.pk, u2.pk) as similarity ... from ... (select a as userid, jgroup(b) as pk from table1 group by a) as u1, ... (select a as userid, jgroup(b) as pk from table1 group by a) as u2 ... where u1.userid<u2.userid""") userid | userid | similarity -------------------------------- user1 | user2 | 0.25 user1 | user3 | 0.333333333333 user1 | user4 | 1.0 user2 | user3 | 0.666666666667 user2 | user4 | 0.25 user3 | user4 | 0.333333333333
-
- sorensendice function
-
sorensendice
(jpack1, jpack2)¶
Return jaccard similarity value of two jpacks.
Example:
>>> table1(''' ... user1 movie1 20 ... user1 movie2 30 ... user2 movie1 40 ... user2 movie3 90 ... user2 movie4 90 ... user3 movie1 40 ... user3 movie3 80 ... user4 movie1 70 ... user4 movie2 10 ... ''')
NOTE that only column b is jgrouped because jaccard operates on packs as sets, not weighted values, So packing also column c would not make any difference.
>>> sql("""select u1.userid,u2.userid, sorensendice(u1.pk, u2.pk) as similarity ... from ... (select a as userid, jgroup(b) as pk from table1 group by a) as u1, ... (select a as userid, jgroup(b) as pk from table1 group by a) as u2 ... where u1.userid<u2.userid""") userid | userid | similarity ---------------------------- user1 | user2 | 0.4 user1 | user3 | 0.5 user1 | user4 | 1.0 user2 | user3 | 0.8 user2 | user4 | 0.4 user3 | user4 | 0.5
-
stopwords
functions¶
- filterstopwords function
-
filterstopwords
(str) → str¶
Returns the input text with the stopwords removed. The case of the first letter matters.
Examples:
>>> table1(''' ... 'this and wood' 'NO more No words' ... 'No more stop words' 'more free time' ... ''') >>> sql("select filterstopwords(a,b) from table1") filterstopwords(a,b) -------------------- wood NO words stop words free time
-
stringdist
functions¶
- damlevendist function
-
damlevendist
(str1, str2)¶
Returns int which is the damerau-levenshtein distance between str1 and str2
Examples:
>>> sql("select damlevendist('HURQBOHP','QKHOZ') ") damlevendist('HURQBOHP','QKHOZ') -------------------------------- 6
-
- levendist function
-
levendist
(str1, str2)¶
Returns int which is the levenshtein distance between str1 and str2
Examples:
>>> sql("select levendist('HURQBOHP','QKHOZ') ") levendist('HURQBOHP','QKHOZ') ----------------------------- 7
-
- quickstrdist function
-
damlevendist
(str1, str2)
Returns int which is a string distance between str1 and str2, based on Python’s difflib library. It is a lot faster than levendist or damlevendist.
Examples:
>>> sql("select quickstrdist('HURQBOHP','QKHOZ') ") quickstrdist('HURQBOHP','QKHOZ') -------------------------------- 8
-
termsetops
functions¶
- tset function
-
termsetdiff
(termset1, termset2) → termset¶
Returns the termset that is the difference of sets of termset1 - termset2.
Examples:
>>> table1(''' ... 't1 t2 t3' 't2 t3' ... 't3 t2 t1' 't3 t4' ... ''') >>> sql("select tset(a,b) from table1") tset(a,b) ----------- t1 t2 t3 t1 t2 t3 t4
-
- tsetcombinations function
-
tsetcombinations
(termset, r) → termset¶
Returns all the termset combinations of length r. It is a multiset operator that returns one column but many rows.
See also
- Multisets functions
>>> sql("select tsetcombinations('t1 t2 t3 t4',2)") C1 ----- t1 t2 t1 t3 t1 t4 t2 t3 t2 t4 t3 t4
-
- tsetdiff function
-
termsetdiff
(termset1, termset2) → termset
Returns the termset that is the difference of sets of termset1 - termset2.
Examples:
>>> table1(''' ... 't1 t2 t3' 't2 t3' ... 't3 t2 t1' 't3 t4' ... ''') >>> sql("select tsetdiff(a,b) from table1") tsetdiff(a,b) ------------- t1 t1 t2
-
text
functions¶
- comprspaces function
-
comprspaces
(text1[, text2, ...]) → text¶
This function strips (from the beginning and the end) and compresses the spaces in its input.
Examples:
>>> table1(''' ... ' an example with spaces ' 'another example with spaces ' ... ''') >>> sql("select comprspaces(a,b) from table1") comprspaces(a,b) -------------------------------------------------- an example with spaces another example with spaces
-
- contains function
-
contains
(str1, str2) → bool¶
Returns true if string str1 contains str2.
Examples:
>>> sql("select contains('test string', 'str') as test ") test ---- 1 >>> sql("select contains('test string', 'nostr') as test ") test ---- 0
-
- cqlfields function
This functions returns the keywords inside a single column or aggregated from multiple columns. It plays well with Unicode.
The difference of cqlkeywords to keywords is that cqlkeywords also strips cql syntax like “title all” or “author all”.
>>> table1(''' ... '(title all "scrieri") and (creator all "arghezi") and (title all "other")' ... '("maschinenschreiben") and (language all "ger")' ... '("sauer") and ("übungsbuch")' ... ''') >>> sql("select cqlfields(a) from table1") cqlfields(a) ------------------- title creator title language
- cqlkeywords function
-
cqlkeywords
(text1[, text2, ...]) → text¶
Returns the keywords inside a single column (text1) or aggregated from multiple columns.
The difference of cqlkeywords to keywords is that cqlkeywords also strips cql syntax like “title all” or “author all” and plain cql directives like ‘creator’, ‘title’…
Examples:
>>> table1(''' ... first(second) third+fourth ... πρωτο(δευτερο) τριτο_τέταρτο ... 'πέμπτο all' 'έκτο title all τεστ' ... 'title all and something' 'other' ... 'title and something' 'other' ... ''') >>> sql("select cqlkeywords(a,b) from table1") cqlkeywords(a,b) --------------------------------------------------- first second third fourth πρωτο δευτερο τριτο_τέταρτο έκτο τεστ something other and something other
-
- crc32 function
-
crc32
(args) → int¶
Returns the CRC32 of args. Numbers are converted to text before hashing is performed.
Examples:
>>> sql("select crc32(65)") crc32(65) ---------- 2658551721
>>> sql("select crc32(6,5)") crc32(6,5) ---------- 1565899724
>>> sql("select crc32(5)") crc32(5) ---------- 2226203566
>>> sql("select crc32('5')") crc32('5') ---------- 1201448970
-
escapechars function
- hashmd5 function
-
hashmd5
(args)¶
Returns an MD5 hash of args. Numbers are converted to text before hashing is performed.
Examples:
>>> sql("select hashmd5(65)") hashmd5(65) -------------------------------- fc490ca45c00b1249bbe3554a4fdf6fb >>> sql("select hashmd5(6,5)") hashmd5(6,5) -------------------------------- f0d95c20cde50e3ca03cab53f986b6c3 >>> sql("select hashmd5(5)") hashmd5(5) -------------------------------- e4da3b7fbbce2345d7772b0674a318d5 >>> sql("select hashmd5('5')") hashmd5('5') -------------------------------- 7000aaf68ca7a93da0af3d03850571c2
-
- hashmd5mod function
-
hashmd5mod
(args, divisor) → int¶
Returns the modulo with divisor number of the MD5 hash of args. Numbers are converted to text before hashing is performed.
Examples:
>>> sql("select hashmd5mod(65, 3)") hashmd5mod(65, 3) ----------------- 0
>>> sql("select hashmd5mod(6,5, 4)") hashmd5mod(6,5, 4) ------------------ 2
>>> sql("select hashmd5mod(5, 5)") hashmd5mod(5, 5) ---------------- 3
>>> sql("select hashmd5mod('5', 5)") hashmd5mod('5', 5) ------------------ 4
-
- hashmodarchdep function
-
hashmodarchdep
(args, divisor) → int¶
Returns a hash of the args.
Note
This hash function is architecture dependent (32bit vs 64bit).
Examples:
>>> sql("select hashmodarchdep(65,5)") hashmodarchdep(65,5) -------------------- ...
>>> sql("select hashmodarchdep(6,5)") hashmodarchdep(6,5) ------------------- ...
>>> sql("select hashmodarchdep(5,5)") hashmodarchdep(5,5) ------------------- ...
>>> sql("select hashmodarchdep('5',5)") hashmodarchdep('5',5) --------------------- ...
-
- hashmodarchdep2 function
-
hashmodarchdep2
(arg, divisor) → int¶
Returns a hash of the args.
Note
This hash function is architecture dependent (32bit vs 64bit). It is specialized for 2 parameters
Examples:
>>> sql("select hashmodarchdep2(65,5)") hashmodarchdep2(65,5) --------------------- ...
>>> sql("select hashmodarchdep2(6,5)") hashmodarchdep2(6,5) -------------------- ...
>>> sql("select hashmodarchdep2(5,5)") hashmodarchdep2(5,5) -------------------- ...
>>> sql("select hashmodarchdep2('5',5)") hashmodarchdep2('5',5) ---------------------- ...
-
- isvalidutf8 function
-
isvalidutf8
(text) → 1/0¶
Returns 1 if the input text is in valid UTF-8 format, or 0 if not. This function is used to find corrupted UTF-8 strings with a heuristic based on non common characters.
Examples:
>>> table1(''' ... test ... δοκιμή! ... sévignà ... évezred ... ''') >>> sql("select isvalidutf8(a) from table1") isvalidutf8(a) -------------- 1 1 1 1
-
- keywords function
-
keywords
(text1[, text2, ...]) → text¶
Returns the keywords inside a single column (text1) or aggregated multiple columns.
Examples:
>>> table1(''' ... first(second) third+fourth ... πρωτο(δευτερο) τριτο+τέταρτο ... 'πέμπτο all' 'qwer.zxcv' ... ''') >>> sql("select keywords(a,b) from table1") keywords(a,b) --------------------------------------------------- first second third fourth πρωτο δευτερο τριτο τέταρτο πέμπτο all qwer zxcv
-
- kwnum function
-
kwnum
(text1[, text2, ...]) → int¶
Returns the number of simple keywords in a string. Its input should be words separated by spaces, as returned by cqlkeywords or keywords.
Examples:
>>> table1(''' ... 'word1 word2 word3' ... 'word1 word2' ... 'word' ... ''') >>> sql("select kwnum(a) from table1") kwnum(a) -------- 3 2 1
-
- normalizetext function
-
normalizetext
(text1[, text2, ...]) → text¶
Normalizes a text by replacing all the non-words except s
-
,.; with ‘_’
Examples:
>>> table1(''' ... first(second) third+fourth ... πρωτο(δευτερο) τριτο+τέταρτο ... 'πέμπτο all' 'έκτο title all τεστ' ... ''') >>> sql("select normalizetext(a,b) from table1") normalizetext(a,b) ---------------------------------------------------- first_second_ third_fourth πρωτο_δευτερο_ τριτο_τέταρτο πέμπτο all έκτο title all τεστ
- normuni function
-
normuni
(str)¶
Returns str normalised in the composed unicode normal form without replacing same look characters. For example this ‘À’ character can be encoded with one or two different characters,
normuni()
returns an one-character encoded version. This function is important to check true strings equality.Functions
sunitouni()
andunitosuni()
are used in the examples to make it more comprehensive.Examples:
Note
Returned results in the next two examples should look the same, if not that is a bug at the combined characters rendering of the shell that the specifications was created.
>>> sql("select sunitouni('C\u0327') as test ") test ---- Ç >>> sql("select normuni(sunitouni('C\u0327')) as test ") test ---- Ç >>> sql("select unitosuni(normuni(sunitouni('C\u0327'))) as test ") test ------ \u00c7
-
- regexpcountuniquematches function
-
regexpcountuniquematches
(pattern, expression)¶ Returns the number of matches of pattern in expression.
Examples:
>>> sql("regexpcountuniquematches 'start' 'start end start' ") regexpcountuniquematches('start','start end start') --------------------------------------------------- 1
>>> sql("regexpcountuniquematches 'start end' 'start end start' ") regexpcountuniquematches('start end','start end start') ------------------------------------------------------- 1
-
- regexpcountwithpositions function
-
regexpcountwithpositions
(pattern, expression, start = 0, min = 0.5, multiply = 1)¶ Returns a score of positioned matches of pattern in expression.
Examples:
>>> sql("regexpcountwithpositions 'start' 'start end start' ") regexpcountwithpositions('start','start end start') --------------------------------------------------- 1.75
>>> sql("regexpcountwithpositions 'start' 'start end start' ") regexpcountwithpositions('start','start end start') --------------------------------------------------- 1.75
>>> sql("regexpcountwithpositions 'first' 'first second third fourth'") regexpcountwithpositions('first','first second third fourth') ------------------------------------------------------------- 0.75
>>> sql("regexpcountwithpositions 'fourth' 'first second third fourth'") regexpcountwithpositions('fourth','first second third fourth') -------------------------------------------------------------- 1.5
>>> sql("regexpcountwithpositions 'fourth' 'first second third fourth' 1") regexpcountwithpositions('fourth','first second third fourth','1') ------------------------------------------------------------------ 0.5
-
- regexpcountwords function
-
regexpcountwords
(pattern, expression)¶ Returns the number of matches of pattern in expression. If a match includes more than one words then it returns the number of the words.
Examples:
>>> sql("regexpcountwords 'start' 'start end start' ") regexpcountwords('start','start end start') ------------------------------------------- 2
>>> sql("regexpcountwords 'start end' 'start end start' ") regexpcountwords('start end','start end start') ----------------------------------------------- 2
-
- regexpr function
-
regexp
(pattern, expression[, replacestr])¶
This function returns a match to the first parenthesis of pattern or replaces the matches of pattern in expression with replacestr. Pattern Syntax is according to python’s re module.
Examples use inversion.
Examples:
>>> table1(''' ... 25 ... ''')
>>> sql("regexpr 'start\s(\w+)\send' 'start otherword end' ") regexpr('start\s(\w+)\send','start otherword end') -------------------------------------------------- otherword
>>> sql("regexpr '\W+' '@#$%@$#% tobereplaced @#$%@#$%' 'nonword' ") regexpr('\W+','@#$%@$#% tobereplaced @#$%@#$%','nonword') --------------------------------------------------------- nonwordtobereplacednonword
>>> sql("select regexpr('(\w+).*?(\w+)', 'one two three')") regexpr('(\w+).*?(\w+)', 'one two three') ----------------------------------------- ["one","two"]
-
- regexprfindall function
-
regexprfindall
(pattern, text)¶
This function returns all matches of pattern in text.
Examples:
>>> sql("select regexprfindall('\w+', 'one')") regexprfindall('\w+', 'one') ---------------------------- ["one"]
>>> sql("select regexprfindall('\w+', 'one two three')") regexprfindall('\w+', 'one two three') -------------------------------------- ["one","two","three"]
-
- regexprmatches function
-
regexprmatches
(pattern, arg)¶
This function returns true if the pattern matches arg or false otherwise.
Examples use inversion.
Examples:
>>> sql("regexprmatches '(a)' 'qwer a qwer' ") regexprmatches('(a)','qwer a qwer') ----------------------------------- 1
-
- stripchars function
-
stripchars
(str[, stripchars])¶
Returns str removing leading and trailing whitespace characters or stripchars characters if given. Works like python’s strip function.
Examples:
>>> sql("select stripchars(' initial and final spaces ') as test ") test ------------------------ initial and final spaces >>> sql("select stripchars(' <initial and final spaces> ',' <>') as test ") test ------------------------ initial and final spaces >>> sql("select stripchars(null)") stripchars(null) ---------------- None
-
- sunitouni function
-
sunitouni
(str)¶
Returns str replacing literal unicode code points to their string representation.
Examples:
>>> sql("select sunitouni('br\u00fbl\u00e9') as test ") test ------- brûlé >>> sql("select sunitouni('\u that is not a unicode code point') as test ") test ----------------------------------- \u that is not a unicode code point >>> sql("select sunitouni(null)") sunitouni(null) --------------- None >>> sql("select sunitouni(9)") sunitouni(9) ------------ 9
-
- textreferences function
-
textreferences
(text, maxlen = 5, pattern = (|_)(1|2)d{3, 3}(|_))¶
Returns the “Reference” section of documents. To find it, it searches for parts of the document that have a high density of pattern matches.
Examples:
>>> table1(''' ... eeeeeeeeeeeeee ... gggggggggggggg ... aaaaaaaaaaaaaa ... bbbbbbbbbbbbbb ... aaa_1914_ccccc ... bbb_2014_bbbbb ... dddd_2008_ddddddd ... cccc_2005_ccccc ... ccccc_2014_ccccc ... dddddd_2009_ddddd ... gggggggggggggg ... ''')
>>> sql("select textreferences(group_concat(a,'\n'),1,'(|_)(1|2)\d{3,3}(|_)') as a from table1") a -------------------------------------------------------------------------------------------------- aaa_1914_ccccc bbb_2014_bbbbb dddd_2008_ddddddd cccc_2005_ccccc ccccc_2014_ccccc dddddd_2009_ddddd
If an inadequate amount of newlines is found, it returns the text as is.
>>> sql("select textreferences(group_concat(a,'.')) from table1") textreferences(group_concat(a,'.')) ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- eeeeeeeeeeeeee.gggggggggggggg.aaaaaaaaaaaaaa.bbbbbbbbbbbbbb.aaa_1914_ccccc.bbb_2014_bbbbb.dddd_2008_ddddddd.cccc_2005_ccccc.ccccc_2014_ccccc.dddddd_2009_ddddd.gggggggggggggg
>>> sql("select textreferences('')") textreferences('') ------------------
-
- textwindow function
-
textwindow
(text, previous_word_count = 0, next_word_count = 0, middle_word_count = 1, pattern = None)¶
Returns a rolling window over the text. The window includes previous_word_count words before the middle word and next_word_count words after the middleword. Optionally you may choose more than one words to be in the middle, and filter your window with a regular expression pattern
If the value of prev_word_count or next_word_count is negative, and a pattern exists then all matches of the pattern are filtered out from prev and next output.
Examples:
>>> sql("select textwindow('This is a test phrase') ") middle ------ This is a test phrase
>>> sql("select textwindow('This is a test phrase',1,1) ") prev1 | middle | next1 ----------------------- | This | is This | is | a is | a | test a | test | phrase test | phrase |
>>> sql("select textwindow('This is a test phrase',1,1,2) ") prev1 | middle | next1 ---------------------------- | This is | a This | is a | test is | a test | phrase a | test phrase | test | phrase |
>>> sql("select textwindow('This is a test phrase with pdb codes: 1abc 2bcd 3cde 4bde ',-2,1,2,'\d\w{3}' ) ") prev1 | prev2 | middle | next1 ---------------------------------- pdb | codes: | 1abc 2bcd | 3cde pdb | codes: | 2bcd 3cde | 4bde pdb | codes: | 3cde 4bde | pdb | codes: | 4bde |
>>> sql("select textwindow('This is a test phrase (123) for filtering middle with a number',1,1,'\d+') ") prev1 | middle | next1 ----------------------- phrase | (123) | for
-
- textwindow2s function
-
textwindow2s
(text, prev_word_count, middle_word_count, next_word_count, pattern)¶ Returns a rolling window in the text. The window includes numberofprev words before the middle word and numberofnext words after the middleword. You may filter your window using a pattern.
Examples:
>>> sql("select textwindow2s('This is a test phrase',2,1,1) ") prev | middle | next ------------------------- | This | is This | is | a This is | a | test is a | test | phrase a test | phrase |
>>> sql("select textwindow2s('This is a test phrase',2,1,1, '\w{4}') ") prev | middle | next ------------------------ | This | is is a | test | phrase a test | phrase |
-
- uniqueterms function
-
uniqueterms
(text1[, text2, ...]) → text¶
Returns the unique terms of an input string.
Examples:
>>> table1(''' ... 'word1 word2 word2' ... 'word1 word2 word1' ... 'word' ... ''') >>> sql("select uniqueterms(a) from table1") uniqueterms(a) -------------- word1 word2 word1 word2 word
-
- unitosuni function
-
unitosuni
(str)¶
Returns str replacing non-ascii characters with their equivalent unicode code point literal at the u00 format.
Examples:
>>> sql("select unitosuni('brûlé') as test ") test --------------- br\u00fbl\u00e9 >>> sql("select sunitouni(null)") sunitouni(null) --------------- None >>> sql("select unitosuni(9)") unitosuni(9) ------------ 9
-
- utf8clean function
-
utf8clean
(text) → text¶
Removes control characters from input utf-8 text.
Examples:
>>> table1(''' ... test ... δοκιμή! ... sévignà ... évezred ... ''') >>> sql("select utf8clean(a) from table1") utf8clean(a) ------------- test δοκιμή! sévignà évezred
-
tzconverter
functions¶
- tzconverter function
-
tzconverter
(timestamp, offset)¶
Returns timestamps converted from UTC to target timezone, indicated by the offset parameter.
Example:
>>> table1('''
… “2010-12-05T00:00:00+00:00” … “2010-12-05T00:01:00+00:00” … “2010-12-05T00:02:00+00:00” … ‘’‘)
… ‘’‘) >>> sql(“select a, tzconverter(a,’-01:00’) from table1 “) a | tzconverter(a,’-01:00’) —————————————————– 2010-12-05T00:00:00+00:00 | 2010-12-04T23:00:00-01:00 2010-12-05T00:01:00+00:00 | 2010-12-04T23:01:00-01:00 2010-12-05T00:02:00+00:00 | 2010-12-04T23:02:00-01:00
… ‘’‘) >>> sql(“select a, tzconverter(a,’-01’) from table1 “) a | tzconverter(a,’-01’) ————————————————– 2010-12-05T00:00:00+00:00 | 2010-12-04T23:00:00-01 2010-12-05T00:01:00+00:00 | 2010-12-04T23:01:00-01 2010-12-05T00:02:00+00:00 | 2010-12-04T23:02:00-01
>>> sql("select a, tzconverter(a,'-0100') from table1 ") a | tzconverter(a,'-0100') ---------------------------------------------------- 2010-12-05T00:00:00+00:00 | 2010-12-04T23:00:00-0100 2010-12-05T00:01:00+00:00 | 2010-12-04T23:01:00-0100 2010-12-05T00:02:00+00:00 | 2010-12-04T23:02:00-0100
>>> sql("select a, tzconverter(a,'+00:30') from table1 ") a | tzconverter(a,'+00:30') ----------------------------------------------------- 2010-12-05T00:00:00+00:00 | 2010-12-05T00:30:00+00:30 2010-12-05T00:01:00+00:00 | 2010-12-05T00:31:00+00:30 2010-12-05T00:02:00+00:00 | 2010-12-05T00:32:00+00:30
-
util
functions¶
- execprogram function
-
execprogram
(stdin=null, program_name, parameters[, raise_error]) → text or blob¶
Function execprogram executes a shell command and returns its output. If the value of the first argument is not null, the arguments value will be pushed in program’s Standard Input.
If the program doesn’t return a 0 return code, then a YeSQL error will be raised, containing the contents of the program’s error stream.
If the last argument of execprogram is set to null, then all program errors will be returned as null (see “cat non_existent_file” examples below).
Every one of the program’s parameters must be provided as different arguments of the execprogram call (see “cat -n” example below).
Note
Function execprogram tries by default to convert the program’s output to UTF-8. If the conversion isn’t succesfull, then it returns the output as a binary blob.
Examples:
>>> table1(''' ... echo test ... echo 1 ... ''') >>> sql("select execprogram(null, a, b) from table1") execprogram(null, a, b) ----------------------- test 1
>>> sql("select execprogram(null, null, '-l')") Traceback (most recent call last): ... OperatorError: YeSQL SQLError: Operator EXECPROGRAM: Second parameter should be the name of the program to run
>>> sql("select execprogram(null, null, '-l', null)") execprogram(null, null, '-l', null) ----------------------------------- None
>>> sql("select execprogram('test', 'cat')") execprogram('test', 'cat') -------------------------- test
>>> sql('''select execprogram('test', 'cat', '-n')''') execprogram('test', 'cat', '-n') -------------------------------- 1 test
>>> sql("select execprogram(null, 'NON_EXISTENT_PROGRAM')") Traceback (most recent call last): ... OperatorError: YeSQL SQLError: Operator EXECPROGRAM: [Errno 2] No such file or directory
>>> sql("select execprogram(null, 'cat', 'non_existent_file')") Traceback (most recent call last): ... OperatorError: YeSQL SQLError: Operator EXECPROGRAM: cat: non_existent_file: No such file or directory
>>> sql("select execprogram(null, 'cat', 'non_existent_file', null)") execprogram(null, 'cat', 'non_existent_file', null) --------------------------------------------------- None
-
- failif function
-
failif
(condition[, messsage])¶
If condition is true, raises an error. If message is provided, the message is included in raised error.
Examples:
>>> sql("select failif(1=1,'exception') as answer") Traceback (most recent call last): ... OperatorError: YeSQL SQLError: Operator FAILIF: exception
>>> sql("select failif(1=0,'exception') as answer") answer ------ 0
>>> sql("select failif(1=1) as answer") Traceback (most recent call last): ... OperatorError: YeSQL SQLError: Operator FAILIF: an error was found
-
- gz function
-
gz
(text) → gzip compressed blob¶
Function gz compresses its input with gzip’s maximum compression level.
Examples:
>>> table1(''' ... "qwerqewrqwerqwerqwerqwerqwer" ... "asdfasdfasdfasdfasdfasdfsadf" ... ''') >>> sql("select length(a), length(gz(a)) from table1") length(a) | length(gz(a)) ------------------------- 28 | 20 28 | 18
-
- sleep function
-
sleep
(seconds)¶
This function waits for the given number of seconds before returning. The seconds parameters can be fractional (e.g. 0.1 will sleep for 100 milliseconds).
Examples:
>>> sql("select sleep(0.1)") sleep(0.1) ---------- 0.1
-
- ungz function
-
ungz
(blob) → text¶
Function ungz decompresses gzip blobs. If the input blobs aren’t gzip compressed, then it just returns them as they are.
Examples:
>>> table1(''' ... "qwerqwerqwer" ... "asdfasdfasdf" ... ''') >>> sql("select ungz(gz(a)) from table1") ungz(gz(a)) ------------ qwerqwerqwer asdfasdfasdf
>>> sql("select ungz('string'), ungz(123)") ungz('string') | ungz(123) -------------------------- string | 123
-
- urlrequest function
-
urlrequest
([null, ]url) → response¶
This functions connects to the url (via GET HTTP method) and returns the request’s result. If first parameter is null, then in case of errors null will be returned.
Examples:
>>> sql("select urlrequest('http://www.google.com/not_existing')") Traceback (most recent call last): ... HTTPError: HTTP Error 404: Not Found
>>> sql("select urlrequest(null, 'http://www.google.com/not_existing') as result") result ------ None
-
- urlrequestpost function
-
urlrequestpost
(data_jdict, [null, ]url) → response¶
This functions connects to the url (via POST HTTP method), submits the data_jdict, and returns the request’s result. If second parameter is null, then in case of errors null will be returned.
Examples:
>>> sql('''select urlrequestpost('{"POST_param_name":"data"}', 'http://www.google.com/not_existing')''') Traceback (most recent call last): ... HTTPError: HTTP Error 404: Not Found
>>> sql('''select urlrequestpost('["POST_param_name","data"]', null, 'http://www.google.com/not_existing') as result''') result ------ None
>>> sql("select urlrequestpost(jdict('param1','value1'), null, 'http://www.google.com/not_existing') as result") result ------ None
>>> sql("select urlrequestpost(jpack('param1','value1'), null, 'http://www.google.com/not_existing') as result") result ------ None
-
variables
functions¶
- flowname function
-
flowname
([str])¶
Sets and retrieves, ‘flowname’ variable
Examples:
>>> sql("flowname test flow ") flowname('test flow') --------------------- test flow >>> sql("flowname") flowname() ---------- test flow >>> sql("flowname 'arg1' arg2") Traceback (most recent call last): ... OperatorError: YeSQL SQLError: Operator FLOWNAME: Flowname accepts only 1 argument
-
- getvar function
-
getvar
(varname) → value¶
Returns the value of varname.
Examples:
>>> sql("var 't' 5") var('t','5') ------------ 5 >>> sql("getvar 't'") getvar('t') ----------- 5 >>> sql("select getvar('t', random())") getvar('t', random()) --------------------- 5
-
- requirevars function
-
requirevars
(varname1[, varname2, ...])¶
Checks if all variables (varname1,…) exist. If not it throws an exception.
Examples:
>>> sql("var 'cv1' 5") var('cv1','5') -------------- 5 >>> sql("var 'cv2' 10") var('cv2','10') --------------- 10 >>> sql("requirevars 'cv1' 'cv2'") requirevars('cv1','cv2') ------------------------ 1 >>> sql("requirevars cv1 cv2") requirevars('cv1 cv2') ---------------------- 1 >>> sql("requirevars 'cv1' 'testvar'") Traceback (most recent call last): ... OperatorError: YeSQL SQLError: Operator REQUIREVARS: Variable testvar isn't initialized
-
- setexecdb function
-
setexecdb
(str)¶
Sets the database path/filename for exec operator.
-
- var function
-
var
(varname[, value]) → value¶
Sets (if both varname and value are given) or returns (if only varname is given) the contents of a variable.
Examples:
>>> sql("var 'v'") Traceback (most recent call last): ... OperatorError: YeSQL SQLError: Operator VAR: Variable 'v' does not exist >>> sql("var 'v' 5") var('v','5') ------------ 5 >>> sql("var 'v'") var('v') -------- 5 >>> sql("select var('v')") var('v') -------- 5
-