PostgreSQL DISTINCT ON with different ORDER BY

I want to run this query:

    SELECT DISTINCT ON (address_id) purchases.address_id, purchases.*
    FROM purchases
    WHERE purchases.product_id = 1
    ORDER BY purchases.purchased_at DESC

But I get this error:

    PG::Error: ERROR: SELECT DISTINCT ON expressions must match initial ORDER BY expressions

Adding address_id as the first ORDER BY expression silences the error, but I really don't want to sort by address_id. Is it possible to do this without ordering by address_id?

The documentation says:

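DISTINCT ON ( expression [, …] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. […] Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first. […] The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s).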

Official documentation

So you'll have to add address_id to the ORDER BY.
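To make the rule in the quoted documentation concrete, here is a rough Python emulation of what DISTINCT ON (address_id) does: sort, then keep the first row of each run of equal address_id values. This is only a sketch of the semantics, not how PostgreSQL implements it; the sample rows and the function name distinct_on_address are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows standing in for the purchases table.
purchases = [
    {"id": 1, "address_id": 10, "purchased_at": "2024-01-05"},
    {"id": 2, "address_id": 10, "purchased_at": "2024-03-01"},
    {"id": 3, "address_id": 20, "purchased_at": "2024-02-10"},
    {"id": 4, "address_id": 20, "purchased_at": "2024-01-20"},
]


def distinct_on_address(rows):
    """Emulate DISTINCT ON (address_id) ... ORDER BY address_id, purchased_at DESC."""
    # Python's sort is stable: sorting by the secondary key first and the
    # grouping key second yields ORDER BY address_id, purchased_at DESC.
    ordered = sorted(rows, key=itemgetter("purchased_at"), reverse=True)
    ordered = sorted(ordered, key=itemgetter("address_id"))
    # Keep only the first row of each run of equal address_id values.
    return [next(group) for _, group in groupby(ordered, key=itemgetter("address_id"))]


latest = distinct_on_address(purchases)
# A second, independent sort gives the final presentation order.
by_date = sorted(latest, key=itemgetter("purchased_at"), reverse=True)
```

The "first row per group" step only means something once rows with the same address_id sit next to each other, which is why the grouping column has to lead the sort. Any other final ordering needs a separate sort afterwards.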

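Alternatively, if you're looking for the full rows that contain the most recent purchase for each address_id, with the result sorted by purchased_at, then you're trying to solve a greatest-N-per-group problem, which can be solved with the following approaches: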

The general solution, which should work in most DBMSs:

    SELECT t1.*
    FROM purchases t1
    JOIN (
        SELECT address_id, max(purchased_at) max_purchased_at
        FROM purchases
        WHERE product_id = 1
        GROUP BY address_id
    ) t2
      ON t1.address_id = t2.address_id
     AND t1.purchased_at = t2.max_purchased_at
    ORDER BY t1.purchased_at DESC
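As a quick way to try the join-on-max approach without a PostgreSQL server, here is a small sketch that runs the same query against an in-memory SQLite database from Python. The table layout (an id column, text timestamps) and the sample rows are assumptions made for the example, not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: only the columns the query needs, plus an id.
conn.execute(
    "CREATE TABLE purchases ("
    "id INTEGER PRIMARY KEY, address_id INTEGER, "
    "product_id INTEGER, purchased_at TEXT)"
)
conn.executemany(
    "INSERT INTO purchases (id, address_id, product_id, purchased_at) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, 10, 1, "2024-01-05"),
        (2, 10, 1, "2024-03-01"),
        (3, 20, 1, "2024-02-10"),
        (4, 20, 1, "2024-01-20"),
        (5, 30, 2, "2024-04-01"),  # other product, excluded by the WHERE clause
    ],
)

query = """
SELECT t1.id, t1.address_id, t1.purchased_at
FROM purchases t1
JOIN (
    SELECT address_id, max(purchased_at) AS max_purchased_at
    FROM purchases
    WHERE product_id = 1
    GROUP BY address_id
) t2
  ON t1.address_id = t2.address_id
 AND t1.purchased_at = t2.max_purchased_at
ORDER BY t1.purchased_at DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

One thing to keep in mind with this approach: if two purchases for the same address share the same maximum purchased_at, the join returns both rows, whereas DISTINCT ON keeps exactly one row per address.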

A more PostgreSQL-oriented solution, based on @hkf's answer:

    SELECT *
    FROM (
        SELECT DISTINCT ON (address_id) *
        FROM purchases
        WHERE product_id = 1
        ORDER BY address_id, purchased_at DESC
    ) t
    ORDER BY purchased_at DESC

The problem is clarified, extended and solved here: Selecting rows ordered by some column and distinct on another

You can order by address_id in a subquery, then order by whatever you want in the outer query.

    SELECT *
    FROM (
        SELECT DISTINCT ON (address_id) purchases.address_id, purchases.*
        FROM "purchases"
        WHERE "purchases"."product_id" = 1
        ORDER BY address_id DESC
    ) p  -- a subquery in FROM needs an alias before PostgreSQL 16
    ORDER BY purchased_at DESC

A subquery can solve it:

    SELECT *
    FROM (
        SELECT DISTINCT ON (address_id) *
        FROM purchases
        WHERE product_id = 1
    ) p
    ORDER BY purchased_at DESC;

The leading ORDER BY expressions have to agree with the DISTINCT ON columns, so you can't order by different columns in the same SELECT.

Only use an additional ORDER BY in the subquery if you want to pick a particular row from each set:

    SELECT *
    FROM (
        SELECT DISTINCT ON (address_id) *
        FROM purchases
        WHERE product_id = 1
        ORDER BY address_id, purchased_at DESC  -- get "latest" row per address_id
    ) p
    ORDER BY purchased_at DESC;

If purchased_at can be NULL, consider DESC NULLS LAST, since NULLs sort first in descending order by default.

Related, with more explanation:

  • Select first row in each GROUP BY group?
  • PostgreSQL sort by datetime asc, null first?

A window function can solve this in one pass:

    SELECT DISTINCT ON (address_id)
           LAST_VALUE(purchases.address_id) OVER wnd AS address_id
    FROM "purchases"
    WHERE "purchases"."product_id" = 1
    WINDOW wnd AS (
        PARTITION BY address_id
        ORDER BY purchases.purchased_at DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    )
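For comparison, a different window-function formulation that returns whole rows is to number the rows within each address_id with ROW_NUMBER() and keep number 1. Note this swaps in ROW_NUMBER() for the LAST_VALUE() used above. The sketch below runs it through Python's sqlite3 module, assuming the bundled SQLite is 3.25 or newer (the first version with window functions); the schema and sample rows are made up. The SQL itself uses only standard syntax, so it should be accepted by PostgreSQL as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data, for illustration only.
conn.execute(
    "CREATE TABLE purchases ("
    "id INTEGER PRIMARY KEY, address_id INTEGER, "
    "product_id INTEGER, purchased_at TEXT)"
)
conn.executemany(
    "INSERT INTO purchases (id, address_id, product_id, purchased_at) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, 10, 1, "2024-01-05"),
        (2, 10, 1, "2024-03-01"),
        (3, 20, 1, "2024-02-10"),
        (4, 20, 1, "2024-01-20"),
    ],
)

# Assumes SQLite >= 3.25 for window function support.
query = """
SELECT id, address_id, purchased_at
FROM (
    SELECT p.*,
           ROW_NUMBER() OVER (
               PARTITION BY address_id
               ORDER BY purchased_at DESC
           ) AS rn
    FROM purchases p
    WHERE product_id = 1
) ranked
WHERE rn = 1                 -- latest purchase per address_id
ORDER BY purchased_at DESC   -- final order is independent of the partitioning
"""
latest_rows = conn.execute(query).fetchall()
```

Because the ranking happens inside the window and the final ORDER BY is applied outside it, the result can be sorted by purchased_at alone, which is what the question asks for.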

For anyone using Flask-SQLAlchemy, this worked for me:

    from app import db
    from app.models import Purchases
    from sqlalchemy.orm import aliased
    from sqlalchemy import desc

    # DISTINCT ON (address_id), wrapped as a subquery named "purchases"
    stmt = Purchases.query.distinct(Purchases.address_id).subquery('purchases')
    alias = aliased(Purchases, stmt)
    distinct = db.session.query(alias)
    # order_by() returns a new query object, so keep the result
    distinct = distinct.order_by(desc(alias.purchased_at))

You can also do this using a GROUP BY clause:

    -- purchases.* is only accepted if the GROUP BY columns include the table's primary key
    SELECT purchases.address_id, purchases.*
    FROM "purchases"
    WHERE "purchases"."product_id" = 1
    GROUP BY address_id, purchases.purchased_at
    ORDER BY purchases.purchased_at DESC