Syntax comparison to QUEL #69

dumblob · 2022-02-09T14:08:39Z

Before prql gets implemented, I'd like to see some comparison of the proposed syntax with QUEL.

https://en.wikipedia.org/wiki/QUEL_query_languages

QUEL is a more readable but fully composable alternative to SQL. It was created by a mathematician and fully implemented in POSTGRES 4.2 (yes, POSTGRES later had that frontend thrown away and replaced with SQL due to market pressure).

Btw, I'd strongly recommend that everyone read the paper "What Goes Around Comes Around" by M. Stonebraker and co.

It's a summary of 35 years of data model proposals (and thus what query languages are designed around), grouped into 9 different eras. The outcome of the paper is a list of lessons learned:

Lesson 1: Physical and logical data independence are highly desirable
Lesson 2: Tree structured data models are very restrictive
Lesson 3: It is a challenge to provide sophisticated logical reorganizations of tree structured data
Lesson 4: A record-at-a-time user interface forces the programmer to do manual query optimization, and this is often hard.
Lesson 5: Directed graphs are more flexible than hierarchies but more complex
Lesson 6: Loading and recovering directed graphs is more complex than hierarchies
Lesson 7: Set-a-time languages are good, regardless of the data model, since they offer much improved physical data independence.
Lesson 8: Logical data independence is easier with a simple data model than with a complex one.
Lesson 9: Technical debates are usually settled by the elephants of the marketplace, and often for reasons that have little to do with the technology.
Lesson 10: Query optimizers can beat all but the best record-at-a-time DBMS application programmers.
Lesson 11: Functional dependencies are too difficult for mere mortals to understand. Another reason for KISS (Keep It Simple Stupid).
Lesson 12: Unless there is a big performance or functionality advantage, new constructs will go nowhere.
Lesson 13: Packages will not sell to users unless they are in “major pain”
Lesson 14: Persistent languages will go nowhere without the support of the programming language community.
(yes, there is a numbering mistake in the paper)
Lesson 14: The major benefits of OR is two-fold: putting code in the data base (and thereby bluring the distinction between code and data) and a general purpose extension mechanism that allows OR DBMSs to quickly respond to market requirements.
Lesson 15: Widespread adoption of new technology requires either standards and/or an elephant pushing hard.
Lesson 16: Schema-later is a probably a niche market
Lesson 17: XQuery is pretty much OR SQL with a different syntax
Lesson 18: XML will not solve the semantic heterogeneity either inside or outside the enterprise.


max-sixty · 2022-02-12T19:19:16Z

Thanks @dumblob — this was interesting reading. And they were correct about the future of XML!

Please continue adding things like this that you think people would find interesting; and any direct implications on PRQL are welcome too.

dumblob · 2022-02-12T19:52:54Z

> Thanks @dumblob — this was interesting reading.

You're welcome!

> And they were correct about the future of XML!

Yep. Totally!

Btw, I'd still be interested in a feature comparison between prql and QUEL. I kept this to myself until now, but I'll say it: QUEL seems only slightly less understandable than prql ("holy grail") while being (much) more lightweight and pure (mathematically, implementation-wise, etc.).

So the question is whether prql shouldn't be made closer to QUEL (identical, even, with extensions on top). If not, why reinvent the wheel (something the cited paper criticized)?

max-sixty · 2022-02-12T21:07:47Z

> Btw. I'd still be interested in some comparison of the features between prql and QUEL.

I'd welcome a more detailed comparison — I'm not the best person to be doing this given my lack of familiarity with it — but on initial viewing:

QUEL has a similar take on pipelines, which is a foundational principle for PRQL
But QUEL's pipelines seem stateful with the DB — e.g. replace s (age=s.age+1) executes an update student set age=age+1 query
There are a bunch of syntax differences

We need to send the full query given the performance impact — particularly for analytical queries, which are our main target.

Arguably we could restrict ourselves to the select syntax; e.g. from Wikipedia:

range of E is EMPLOYEE
retrieve into W
(COMP = E.Salary / (E.Age - 18))
where E.Name = "Jones"

...but this seems more verbose, and goes against some of the feedback we've incorporated into PRQL — e.g. every transformation starting with a function name.
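For reference, here is a minimal sketch of what the QUEL example above corresponds to in SQL, run through Python's built-in sqlite3 module. The table and column names come from the Wikipedia example; the sample rows are made up for illustration, and the mapping of "retrieve into W" to CREATE TABLE ... AS is an assumption about the closest SQL counterpart.

```python
import sqlite3

# Hypothetical sample data; only the table/column names come from the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Name TEXT, Salary REAL, Age INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [("Jones", 52000.0, 42), ("Smith", 60000.0, 38)],
)

# Rough SQL counterpart of:
#   range of E is EMPLOYEE
#   retrieve into W (COMP = E.Salary / (E.Age - 18)) where E.Name = "Jones"
conn.execute(
    """
    CREATE TABLE W AS
    SELECT E.Salary / (E.Age - 18) AS COMP
    FROM EMPLOYEE AS E
    WHERE E.Name = 'Jones'
    """
)

rows = conn.execute("SELECT COMP FROM W").fetchall()
print(rows)  # one row holding Jones's computed COMP value
```

The range variable E plays the same role as the SQL table alias, so the two forms line up almost clause for clause.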

What's your take? What would you take from QUEL and put into PRQL?

wtkhan · 2022-11-03T01:55:52Z

@max-sixty, I know this issue's been closed, but wanted to add this example of QUEL's powerful nested aggregations feature to the discussion. From Wikipedia:

retrieve (
  a = count(y.i by y.d where y.str = "ii*" or y.str = "foo"),
  b = max(count(y.i by y.d))
)

Is this something PRQL could support?

max-sixty · 2022-11-03T20:47:31Z

Yes, I also find the inline transformations great. There are a couple here:

a filter / where, e.g. count(y where y.str = "ii*"); ref Inline filters #82
The groupby; i.e. max(count(y.i by y.d)) is also nice

I'm not sure of the best way to express this in PRQL. At full verbosity, a=count(y.i by y.d where y.str = "ii*") would be something like:

derive a = (| filter y.str = "ii*" | group [y.d] (aggregate [count y.i]))

...which is not exactly pretty, with lots of syntax. (It's also not clear where the l_value a= should go: in the derive, or as a = count y.i?)
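For comparison, here is a rough sketch of what the two QUEL expressions would take in plain SQL, again via Python's sqlite3. It rests on several assumptions: the table y and its rows are hypothetical, "ii*" is read as a prefix wildcard and rendered as LIKE 'ii%', and the by-clause is read as a plain GROUP BY (QUEL's own semantics for aggregate functions may differ in detail).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE y (i INTEGER, d TEXT, str TEXT)")
conn.executemany(
    "INSERT INTO y VALUES (?, ?, ?)",
    [
        (1, "x", "iia"),
        (2, "x", "foo"),
        (3, "x", "bar"),
        (4, "z", "iib"),
        (5, "z", "baz"),
    ],
)

# a = count(y.i by y.d where y.str = "ii*" or y.str = "foo")
# Read here as: per-group count of the rows that pass the inline filter.
a = conn.execute(
    """
    SELECT d, COUNT(i) AS a
    FROM y
    WHERE str LIKE 'ii%' OR str = 'foo'
    GROUP BY d
    ORDER BY d
    """
).fetchall()

# b = max(count(y.i by y.d))
# An aggregate over an aggregate needs a subquery in SQL.
b = conn.execute(
    "SELECT MAX(c) FROM (SELECT COUNT(i) AS c FROM y GROUP BY d)"
).fetchone()[0]

print(a, b)
```

The subquery for b is the part QUEL lets you write inline, which is what makes the nested form compact.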

PRQL does benefit from clearly specifying the resulting type; a downside of count(y where y.str = "ii*") is that it doesn't specify whether it's an aggregate or not — rank(y where y.str = "ii*") (or some function like regex_match) returns a column rather than a value, but looks the same without knowing the type of count / rank / regex_match. I wrote more on this at "What’s going on with this aggregate function?" in the FAQ.
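The aggregate-versus-column distinction can be sketched in plain Python. The count and rank functions below are hypothetical stand-ins, not PRQL or QUEL APIs: both are called the same way on a filtered column, and only their return types show that one collapses the column to a single value while the other returns a value per row.

```python
from typing import List


def count(values: List[str]) -> int:
    """Aggregate: collapses a column into a single value."""
    return len(values)


def rank(values: List[str]) -> List[int]:
    """Window-style function: one result per input row (1-based; ties share the lowest rank)."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


column = ["iib", "foo", "iia", "iic"]
# Inline filter, analogous to: y where y.str = "ii*"
filtered = [v for v in column if v.startswith("ii")]

n = count(filtered)     # a single value
ranks = rank(filtered)  # a column, same length as the filtered input

print(n, ranks)
```

Without the type annotations, count(filtered) and rank(filtered) look identical at the call site, which is the ambiguity described above.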

Lmk if you have any thoughts!

max-sixty closed this as completed Feb 12, 2022
