MySQL: SUM() over distinct rows

I'm looking for help using SUM() in my SQL query:

    SELECT links.id,
           count(DISTINCT stats.id) AS clicks,
           count(DISTINCT conversions.id) AS conversions,
           sum(conversions.value) AS conversion_value
    FROM links
    LEFT OUTER JOIN stats ON links.id = stats.parent_id
    LEFT OUTER JOIN conversions ON links.id = conversions.link_id
    GROUP BY links.id
    ORDER BY links.created DESC;

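I use DISTINCT because I'm doing a GROUP BY, and it ensures that the same row is not counted more than once.

The problem is that SUM(conversions.value) counts the "value" of each row more than once (because of the grouping).

I basically want SUM(conversions.value) to be computed once for each DISTINCT conversions.id.

Is that possible?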
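To make the symptom concrete, here is a minimal sketch that runs the query above against a tiny data set. It assumes SQLite (through Python's standard `sqlite3` module) as a stand-in for MySQL, and the table contents are invented for illustration: link 1 has three clicks and two conversions worth 5 and 2, link 2 has one click and no conversions.

```python
import sqlite3

# Hypothetical sample data; column names follow the query in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links (id INTEGER PRIMARY KEY, created TEXT);
    CREATE TABLE stats (id INTEGER PRIMARY KEY, parent_id INTEGER);
    CREATE TABLE conversions (id INTEGER PRIMARY KEY, link_id INTEGER, value INTEGER);
    INSERT INTO links VALUES (1, '2024-01-01'), (2, '2024-01-02');
    INSERT INTO stats VALUES (1, 1), (2, 1), (3, 1), (4, 2);
    INSERT INTO conversions VALUES (1, 1, 5), (2, 1, 2);
""")

rows = conn.execute("""
    SELECT links.id,
           count(DISTINCT stats.id)       AS clicks,
           count(DISTINCT conversions.id) AS conversions,
           sum(conversions.value)         AS conversion_value
    FROM links
    LEFT OUTER JOIN stats ON links.id = stats.parent_id
    LEFT OUTER JOIN conversions ON links.id = conversions.link_id
    GROUP BY links.id
    ORDER BY links.created DESC
""").fetchall()

# The two counts are right, but link 1's conversion_value is inflated:
# each conversion row is repeated once per click before SUM() runs.
for row in rows:
    print(row)
```

On this data the counts come out as expected, while the sum for link 1 is three times the true total of 7, once for each click.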

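I may be wrong, but from what I understand:

  • conversions.id is the primary key of your conversions table
  • stats.id is the primary key of your stats table

So for each conversions.id, at most one links.id is affected.

Your query is a bit like taking the Cartesian product of two sets: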

    [clicks]
    SELECT * FROM links LEFT OUTER JOIN stats ON links.id = stats.parent_id

    [conversions]
    SELECT * FROM links LEFT OUTER JOIN conversions ON links.id = conversions.link_id

and for each link you get sizeof([clicks]) x sizeof([conversions]) rows.

As you noted, the number of unique conversions in your query can be obtained with

 count(distinct conversions.id) = sizeof([conversions]) 

The DISTINCT removes all the duplicate rows that [clicks] contributes to the Cartesian product.

But clearly

 sum(conversions.value) = sum([conversions].value) * sizeof([clicks]) 

In your case, since

    count(*) = sizeof([clicks]) x sizeof([conversions])
    count(*) = sizeof([clicks]) x count(distinct conversions.id)

you have

 sizeof([clicks]) = count(*)/count(distinct conversions.id) 
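These identities can be checked numerically. The sketch below is only an illustration under stated assumptions: SQLite via Python's `sqlite3` stands in for MySQL, and the rows are invented (one link with three clicks and two conversions worth 5 and 2).

```python
import sqlite3

# Hypothetical data: a single link with 3 clicks and 2 conversions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links (id INTEGER PRIMARY KEY, created TEXT);
    CREATE TABLE stats (id INTEGER PRIMARY KEY, parent_id INTEGER);
    CREATE TABLE conversions (id INTEGER PRIMARY KEY, link_id INTEGER, value INTEGER);
    INSERT INTO links VALUES (1, '2024-01-01');
    INSERT INTO stats VALUES (1, 1), (2, 1), (3, 1);
    INSERT INTO conversions VALUES (1, 1, 5), (2, 1, 2);
""")

# Aggregates computed over the joined (cross-product) rows of link 1.
count_all, clicks, conversions, inflated_sum = conn.execute("""
    SELECT count(*),
           count(DISTINCT stats.id),
           count(DISTINCT conversions.id),
           sum(conversions.value)
    FROM links
    LEFT OUTER JOIN stats ON links.id = stats.parent_id
    LEFT OUTER JOIN conversions ON links.id = conversions.link_id
    WHERE links.id = 1
""").fetchone()

# The sum computed directly on the conversions table, without the join.
true_sum = conn.execute(
    "SELECT sum(value) FROM conversions WHERE link_id = 1"
).fetchone()[0]

print(count_all == clicks * conversions)   # rows = clicks x conversions
print(inflated_sum == true_sum * clicks)   # SUM is multiplied by the click count
print(clicks == count_all // conversions)  # sizeof([clicks]) recovered from counts
```

All three checks print `True` for this data, which matches the reasoning above for a link that has at least one click and one conversion.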

So I would test your query with

    SELECT links.id,
           count(DISTINCT stats.id) AS clicks,
           count(DISTINCT conversions.id) AS conversions,
           sum(conversions.value)*count(DISTINCT conversions.id)/count(*) AS conversion_value
    FROM links
    LEFT OUTER JOIN stats ON links.id = stats.parent_id
    LEFT OUTER JOIN conversions ON links.id = conversions.link_id
    GROUP BY links.id
    ORDER BY links.created DESC;
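A quick way to try this query is to run it on a small data set. The sketch below assumes SQLite via Python's `sqlite3` as a stand-in for MySQL, with invented rows (link 1: three clicks, conversions worth 5 and 2; link 2: one click, no conversions). The `1.0 *` factor is an addition of this sketch: SQLite's `/` truncates integers, whereas MySQL's `/` returns a decimal.

```python
import sqlite3

# Hypothetical sample data; column names follow the query above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links (id INTEGER PRIMARY KEY, created TEXT);
    CREATE TABLE stats (id INTEGER PRIMARY KEY, parent_id INTEGER);
    CREATE TABLE conversions (id INTEGER PRIMARY KEY, link_id INTEGER, value INTEGER);
    INSERT INTO links VALUES (1, '2024-01-01'), (2, '2024-01-02');
    INSERT INTO stats VALUES (1, 1), (2, 1), (3, 1), (4, 2);
    INSERT INTO conversions VALUES (1, 1, 5), (2, 1, 2);
""")

rows = conn.execute("""
    SELECT links.id,
           count(DISTINCT stats.id)       AS clicks,
           count(DISTINCT conversions.id) AS conversions,
           -- 1.0 * forces floating-point division in SQLite (sketch-only tweak)
           1.0 * sum(conversions.value) * count(DISTINCT conversions.id) / count(*)
                                          AS conversion_value
    FROM links
    LEFT OUTER JOIN stats ON links.id = stats.parent_id
    LEFT OUTER JOIN conversions ON links.id = conversions.link_id
    GROUP BY links.id
    ORDER BY links.created DESC
""").fetchall()

# Link 1: the inflated sum is scaled back down to the true total.
# Link 2: no conversions, so the value stays NULL (None in Python).
for row in rows:
    print(row)
```

On this data the rescaled sum for link 1 equals the true total of 7. That holds because every conversion of a link is repeated the same number of times, once per click.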

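Keep me posted! Jerome

For an explanation of why you are seeing incorrect numbers, read this.

I think Jerome has a handle on what is causing the error. Bryson's query will work, although the subquery in the SELECT may be inefficient.

Use the following query: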

    SELECT links.id
         , ( SELECT COUNT(*)
             FROM stats
             WHERE links.id = stats.parent_id
           ) AS clicks
         , conversions.conversions
         , conversions.conversion_value
    FROM links
    LEFT JOIN
         ( SELECT link_id
                , COUNT(id) AS conversions
                , SUM(conversions.value) AS conversion_value
           FROM conversions
           GROUP BY link_id
         ) AS conversions ON links.id = conversions.link_id
    ORDER BY links.created DESC
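The idea in this query is to aggregate conversions per link before joining, so the clicks can never multiply the conversion rows. Here is a minimal sketch of that technique, assuming SQLite via Python's `sqlite3` as a stand-in for MySQL and invented sample rows. The derived-table alias `conv` is a naming choice of this sketch, not part of the original query.

```python
import sqlite3

# Hypothetical sample data: link 1 has 3 clicks and conversions worth 5 and 2;
# link 2 has 1 click and no conversions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links (id INTEGER PRIMARY KEY, created TEXT);
    CREATE TABLE stats (id INTEGER PRIMARY KEY, parent_id INTEGER);
    CREATE TABLE conversions (id INTEGER PRIMARY KEY, link_id INTEGER, value INTEGER);
    INSERT INTO links VALUES (1, '2024-01-01'), (2, '2024-01-02');
    INSERT INTO stats VALUES (1, 1), (2, 1), (3, 1), (4, 2);
    INSERT INTO conversions VALUES (1, 1, 5), (2, 1, 2);
""")

rows = conn.execute("""
    SELECT links.id,
           -- clicks are counted in a correlated subquery, not via a join
           (SELECT COUNT(*) FROM stats WHERE stats.parent_id = links.id) AS clicks,
           conv.conversions,
           conv.conversion_value
    FROM links
    LEFT JOIN (
        -- one row per link, so the outer join cannot duplicate conversions
        SELECT link_id,
               COUNT(id)  AS conversions,
               SUM(value) AS conversion_value
        FROM conversions
        GROUP BY link_id
    ) AS conv ON links.id = conv.link_id
    ORDER BY links.created DESC
""").fetchall()

for row in rows:
    print(row)
```

Note that a link without conversions gets NULL for both conversion columns here, rather than a count of 0. Wrapping the count in `COALESCE(..., 0)` would be one way to change that if it matters.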

Jerome's solution is actually wrong and can produce incorrect results!

 sum(conversions.value)*count(DISTINCT conversions.id)/count(*) as conversion_value 

Let's assume the following table:

    conversions
    id  value
    1   5
    1   5
    1   5
    2   2
    3   1

The correct sum of value over the distinct ids would be 8. Jerome's formula produces:

    sum(conversions.value) = 18
    count(distinct conversions.id) = 3
    count(*) = 5
    18*3/5 = 10.8 != 8
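The comparison can be reproduced by loading those five rows into a scratch table and evaluating both expressions. This is a sketch under assumptions: SQLite via Python's `sqlite3` stands in for MySQL, the table name `joined` is hypothetical, and `1.0 *` is added only because SQLite's `/` truncates integers.

```python
import sqlite3

# The five (id, value) rows from the table above, as a hypothetical scratch table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE joined (id INTEGER, value INTEGER);
    INSERT INTO joined VALUES (1, 5), (1, 5), (1, 5), (2, 2), (3, 1);
""")

# The rescaling formula applied to all five rows.
formula_value = conn.execute("""
    SELECT 1.0 * sum(value) * count(DISTINCT id) / count(*)
    FROM joined
""").fetchone()[0]

# Sum that takes each distinct id exactly once.
distinct_sum = conn.execute("""
    SELECT sum(value)
    FROM (SELECT id, max(value) AS value FROM joined GROUP BY id) AS per_id
""").fetchone()[0]

print("formula:", formula_value)
print("sum over distinct ids:", distinct_sum)
print("formula matches:", formula_value == distinct_sum)
```

The two numbers differ because the ids are not duplicated equally often here (id 1 appears three times, ids 2 and 3 once), so a single rescaling factor cannot correct all of them.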

I use a subquery to do this. It eliminates the problems with grouping. So the query would look something like this:

 SELECT COUNT(DISTINCT conversions.id) ... (SELECT SUM(conversions.value) FROM ....) AS Vals 

How about something like this:

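    -- Outer query: one row per link, built from one row per (link, conversion).
    SELECT t.id,
           MAX(t.clicks) AS clicks,
           COUNT(t.conversion_id) AS conversions,
           SUM(t.conversion_value) AS conversion_value
    FROM (SELECT l.id AS id,
                 l.created AS created,
                 COUNT(DISTINCT s.id) AS clicks,
                 c.id AS conversion_id,
                 MAX(c.value) AS conversion_value  -- collapses the per-click duplicates
          FROM links l
          LEFT JOIN stats s ON l.id = s.parent_id
          LEFT JOIN conversions c ON l.id = c.link_id
          GROUP BY l.id, l.created, c.id) t
    GROUP BY t.id, t.created
    ORDER BY t.created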

This will do the trick: just divide the sum by the number of times each conversion id is duplicated.

    SELECT a.id,
           MAX(a.clicks) AS clicks,
           SUM(a.conversion_value / NULLIF(a.duplicates, 0)) AS conversion_value,
           COUNT(a.conversion_id) AS conversions
    FROM (SELECT links.id,
                 links.created,
                 conversions.id AS conversion_id,
                 COUNT(DISTINCT stats.id) AS clicks,
                 COUNT(conversions.id) AS duplicates,  -- times the join repeated this conversion
                 SUM(conversions.value) AS conversion_value
          FROM links
          LEFT OUTER JOIN stats ON links.id = stats.parent_id
          LEFT OUTER JOIN conversions ON links.id = conversions.link_id
          GROUP BY conversions.id, links.id, links.created) AS a
    GROUP BY a.id, a.created
    ORDER BY a.created DESC