Q-SQL
How to read table meta
what each column does:
- c: column name
- t: column type
- f: foreign keys
- a: attributes - modifiers for performance characters. see details here
c | t f a
-----------------
date | d
month | m
vendor | s
time | p s
fby
Use
fby when you find yourself trying to apply a filter to the aggregated column of a table
produced by
select...by...
The basic syntax is:
(aggregation;data) fby group
where
aggregation is the aggregation function you want to apply,
data is the column
you want to filter, and
group is the column you want to group by.
Example
Say you have this intermediate table, and you want to select from the original table
jan09 based on
type of vendor, but only records with
duration greater than
avgDuration, which is
simply the average of
duration.
q) res: select avgDuration: avg duration by vendor from jan09
q) res
vendor | avgDuration
----------------------
CMT | 6.480017e+11
DDS | 7.600017e+11
VTS | 7.290017e+11
Without
fby, you need to do it like this:
First generate this intermediate table, then join
jan09 with this intermediate table.
select from jan09 lj res where duration > avgDuration
But with
fby, you can do it like this:
select from jan09 where duration > (avg,duration) fby vendor
More on how to use fby
Keep in mind that
fby is used as a filter (a where clause) on the aggregated column of a table, so
the three signs are 1. a filter, 2. an aggregated value, and 3. a value to group on. Here are more examples:
-
Which vendor has the largest number of trips when only considering trips shorter than the
average duration of each vendor?
Where to use fby: "when only considering trips shorter than the
average duration of each vendor?"
Constructed fby: (avg; duration) fby vendor
-
Which payment type produces the highest average tip when only trups with a fare larger than the average
for each vendor is considered?
Where to use fby: "when only trups with a fare larger than the average
for each vendor is considered?"
Constructed fby: (avg; fare) fby vendor
xbar
This, together with
by, is used to group time/numeric values in to buckets.
For example, for each severn day, find the min temperature and the max temperature.
select min mintemp, max maxtemp by 7 xbar date from weather
Functional form
Functional q-sql
For functional queries, the general form is
queryFunc[table; whereClause; byClause; returnClause]
In the following sections, we will learn how to use
parse to write functional forms.
queryFunc
queryFunc is always
? for
select,
exec, and
!
for
update,
delete.
Let's start with a simplest query and see what the
parse output looks like.
q) parse"select from trade"
? // queryFunc
`trade // table
() // whereClause
0b // byClause
() // returnClause
When you
parse, the five elements of a functional query will be listed one by one. However,
sometimes you need to modify the output a little. We will show some common modifications in sections below.
whereClause
whereClause is a list of functional expressions. If it's a single statement, you need to
enlist.
Example of writing the where clause from
parse:
q) parse"select from trade where sym=`JPM, size>20"
?
`trade
,((=;`sym;,`JPM);(>;`size;20)) // whereClause to be translated
0b
()
This
where output cannot be used directly. You need to translate it, two principles (applicable to
all following sections on other components of the functional query):
- ignore the first comma at the very beginning
- for other commas, use
enlist, such as the one before `JPM
Thus the
whereClause for this statement is actually
((=;`sym; enlist `JPM);(>;`size;20)).
For
exec only, you need to use
whereClause together with
returnClause,
otherwise it will only return the last record that satisfy the where clause.
If the above example is changed to
exec, you need a
returnClause that is a dictionary
x!x:cols[trade].
byClause
byClause is a dictionary. Usually you don't need to translate the output from
parse.
If you
select by sym, exchange..., the
byClause in it will just be
`sym`exchange!`sym`exchange.
But if there is
xbar in the
by, then you need to translate it.
q)parse"select by twoHour:0D02:00:00 xbar time from trade"
?
`trade
()
(,`twoHour)!,(k){$[9h=t:abs[@x];x*r+y=x*1+r:(y:9h$y)div x;x*y div x:$[16h=t;"j"$x;x]]};0D02:00:00.000000000;`time)
()
The way to translate is to put it into a dictionary like this
enlist[`twoHour]!enlist (xbar; 0D02:00:00; `time)
In addtion, with functional by, you can use group by on
exec. It will just behave as if it were
select. You can pass a list instead of dictionary as
byClause if it's
exec too.
returnClause
returnClause can be a list or a dictionary, and depending on what
queryFunc you're
using, it means different things.
- for
select and exec, it can be a list or a dictionary. It will return only
those columns you ask to return. We usually use it for aggregation functions.
select open:first price, high: max price from trade
The returnClause for the functional form of this query will be
`open`high!((first:;`price);(max;`price)) // first will show up as * in parse output
- for
update, it can be a list or a dictionary. It will return all columns, with the columns
in returnClause updated.
update size+3 from trade
The returnClause for the functional form of this query will be
(enlist `size)!enlist (+;`size;3)
- for
delete, it can only be a list. These columns will be deleted.
delete size from trade
The returnClause for the functional form of this query will be
enlist `price
optional arguments
For
select statement, you can pass two addtional arguments.
?[table; whereClause; byClause; returnClause; windowSize; indexSort]
windowSize specifies the rows to be returned.
For example, 2 means the first two rows, -2 means the last two rows, a list 2 2 means start at row index=2 and
return 2 rows (so the third and fourth row).
indexSort can only be used when
windowSize is specified.
If you want to return all rows, you can set
windowSize:0W but it has to be there.
This argument specifies how you'd sort the output, e.g.,
(idesc; `price) means sort in descending
order then return indices.