Handling inheritance with overriding efficiently

I have the following two data structures.

First, a list of properties applied to object triples:

    Object1  Object2  Object3  Property  Value
    O1       O2       O3       P1        "abc"
    O1       O2       O3       P2        "xyz"
    O1       O3       O4       P1        "123"
    O2       O4       O5       P1        "098"

Second, an inheritance tree:

    O1
        O2
            O4
        O3
            O5

Or, viewed as a relation:

    Object  Parent
    O2      O1
    O4      O2
    O3      O1
    O5      O3
    O1      null

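The semantics are that O2 inherits properties from O1; O4 from O2 and O1; O3 from O1; and O5 from O3 and O1, in that order of precedence.
NOTE 1: I have an efficient way to select all children or all parents of a given object. This is currently implemented with left and right indexes, but hierarchyid could also work. This does not seem important right now.
NOTE 2: I have a mechanism in place that ensures the "Object" column always contains all possible objects, even when they do not really have to be there (i.e. they have no parent or children defined). This makes it possible to use inner joins rather than severely less efficient outer joins.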
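To make NOTE 1 concrete, here is a minimal Python sketch of the left/right index test. The index values are one valid numbering of the sample tree, and the names `NESTED`, `is_ancestor_or_self` and `descendants_or_self` are assumptions made for illustration, not part of the real schema.

```python
# Nested-set (left/right index) numbering of the sample tree.
# Each entry is object -> (left index, right index).
NESTED = {
    "O1": (1, 10),
    "O2": (2, 5),
    "O4": (3, 4),
    "O3": (6, 9),
    "O5": (7, 8),
}


def is_ancestor_or_self(ancestor, node):
    """True when node's left index lies inside ancestor's [left, right] range."""
    a_left, a_right = NESTED[ancestor]
    n_left, _ = NESTED[node]
    return a_left <= n_left <= a_right


def descendants_or_self(ancestor):
    """All objects in the subtree rooted at ancestor, including itself."""
    return sorted(n for n in NESTED if is_ancestor_or_self(ancestor, n))


print(descendants_or_self("O2"))
```

The same range test is what a join condition such as "child left index between parent left and right index" expresses in SQL.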

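The objective is: given a pair of (Property, Value), return all object triples that have that property with that value, either defined explicitly or inherited from a parent.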

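NOTE 1: An object triple (X,Y,Z) is considered a "parent" of triple (A,B,C) when either X = A or X is a parent of A, and the same holds for (Y,B) and (Z,C).
NOTE 2: A property defined on a closer parent "overrides" the same property defined on a more distant parent.
NOTE 3: When (A,B,C) has two parents, (X1,Y1,Z1) and (X2,Y2,Z2), then (X1,Y1,Z1) is considered the "closer" parent when:
(a) X2 is a parent of X1, or
(b) X2 = X1 and Y2 is a parent of Y1, or
(c) X2 = X1 and Y2 = Y1 and Z2 is a parent of Z1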

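In other words, the "closeness" in ancestry for triples is decided by the first components of the triples, then by the second components, then by the third components. This rule establishes an unambiguous partial order for triples in terms of ancestry.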

For example, given the pair (P1, "abc"), the result set of triples will be:

    O1, O2, O3    -- Defined explicitly
    O1, O2, O5    -- Because O5 inherits from O3
    O1, O4, O3    -- Because O4 inherits from O2
    O1, O4, O5    -- Because O4 inherits from O2 and O5 inherits from O3
    O2, O2, O3    -- Because O2 inherits from O1
    O2, O2, O5    -- Because O2 inherits from O1 and O5 inherits from O3
    O2, O4, O3    -- Because O2 inherits from O1 and O4 inherits from O2
    O3, O2, O3    -- Because O3 inherits from O1
    O3, O2, O5    -- Because O3 inherits from O1 and O5 inherits from O3
    O3, O4, O3    -- Because O3 inherits from O1 and O4 inherits from O2
    O3, O4, O5    -- Because O3 inherits from O1 and O4 inherits from O2 and O5 inherits from O3
    O4, O2, O3    -- Because O4 inherits from O1
    O4, O2, O5    -- Because O4 inherits from O1 and O5 inherits from O3
    O4, O4, O3    -- Because O4 inherits from O1 and O4 inherits from O2
    O5, O2, O3    -- Because O5 inherits from O1
    O5, O2, O5    -- Because O5 inherits from O1 and O5 inherits from O3
    O5, O4, O3    -- Because O5 inherits from O1 and O4 inherits from O2
    O5, O4, O5    -- Because O5 inherits from O1 and O4 inherits from O2 and O5 inherits from O3

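Note that the triple (O2, O4, O5) is absent from this list. This is because property P1 is defined explicitly for the triple (O2, O4, O5), which prevents that triple from inheriting the property from (O1, O2, O3). Also note that the triple (O4, O4, O5) is absent. This is because that triple inherits its value P1 = "098" from (O2, O4, O5), which is a closer parent than (O1, O2, O3).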
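The rules and the example above can be checked with a small brute-force reference model. This is only a sketch of the semantics, not the production query: it walks each component's ancestor chain nearest-first, so the first defining triple it meets is the closest parent. The names `PARENT`, `DEFINED`, `chain`, `effective_value` and `triples_with` are assumptions made for illustration.

```python
from itertools import product

# Inheritance tree as a child -> parent relation.
PARENT = {"O1": None, "O2": "O1", "O4": "O2", "O3": "O1", "O5": "O3"}

# Explicit definitions: (Object1, Object2, Object3, Property) -> Value
DEFINED = {
    ("O1", "O2", "O3", "P1"): "abc",
    ("O1", "O2", "O3", "P2"): "xyz",
    ("O1", "O3", "O4", "P1"): "123",
    ("O2", "O4", "O5", "P1"): "098",
}


def chain(obj):
    """obj followed by its ancestors, nearest first."""
    out = []
    while obj is not None:
        out.append(obj)
        obj = PARENT[obj]
    return out


def effective_value(triple, prop):
    """Value of prop for triple, taken from the closest defining parent triple."""
    a, b, c = triple
    # product() varies the last component fastest, so candidates are visited
    # in closeness order: first component, then second, then third.
    for x, y, z in product(chain(a), chain(b), chain(c)):
        if (x, y, z, prop) in DEFINED:
            return DEFINED[(x, y, z, prop)]
    return None


def triples_with(prop, value):
    """All object triples whose effective value of prop equals value."""
    objs = sorted(PARENT)
    return [t for t in product(objs, repeat=3) if effective_value(t, prop) == value]


result = triples_with("P1", "abc")
print(len(result), ("O2", "O4", "O5") in result, ("O4", "O4", "O5") in result)
```

A model like this is too slow for real data, but it is handy as an oracle when comparing the output of faster queries.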

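The straightforward way to do this is the following. First, for every triple that a property is defined for, select all possible child triples: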

    select
        Children1.Id as O1, Children2.Id as O2, Children3.Id as O3,
        tp.Property, tp.Value
    from TriplesAndProperties tp

    -- Select corresponding objects of the triple
    inner join Objects as Objects1 on Objects1.Id = tp.O1
    inner join Objects as Objects2 on Objects2.Id = tp.O2
    inner join Objects as Objects3 on Objects3.Id = tp.O3

    -- Then add all possible children of all those objects
    inner join Objects as Children1 on Objects1.Id [isparentof] Children1.Id
    inner join Objects as Children2 on Objects2.Id [isparentof] Children2.Id
    inner join Objects as Children3 on Objects3.Id [isparentof] Children3.Id

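But this is not the whole story: if a triple inherits the same property from several parents, this query yields conflicting results. Therefore, the second step is to select just one of those conflicting results: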

    select * from
    (
        select
            Children1.Id as O1, Children2.Id as O2, Children3.Id as O3,
            tp.Property, tp.Value,
            row_number() over(
                partition by Children1.Id, Children2.Id, Children3.Id, tp.Property
                order by Objects1.[depthInTheTree] desc,
                         Objects2.[depthInTheTree] desc,
                         Objects3.[depthInTheTree] desc
            ) as InheritancePriority
        from
            ... (see above)
    ) as x
    where InheritancePriority = 1

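The window function row_number() over( ... ) does the following: for every unique combination of object triple and property, it sorts all values by the ancestral distance from the triple to the parents that the value is inherited from, and then I select only the first value in the resulting list. A similar effect can be achieved with GROUP BY and ORDER BY, but I find the window function semantically cleaner (the execution plans they produce are identical). The point is that I need to select the closest contributing ancestor, and for that I need to group and then sort within the group.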
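The two steps can be tried end to end on the sample data. The sketch below uses SQLite through Python's sqlite3 module rather than SQL Server, so it only illustrates the logic, not the performance problem. The column names `L`, `R` and `Depth` stand in for the left/right indexes and [depthInTheTree] and are assumptions; window functions need SQLite 3.25 or newer.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Objects (Id TEXT PRIMARY KEY, L INT, R INT, Depth INT);
INSERT INTO Objects VALUES
  ('O1', 1, 10, 0), ('O2', 2, 5, 1), ('O4', 3, 4, 2),
  ('O3', 6, 9, 1), ('O5', 7, 8, 2);
CREATE TABLE TP (O1 TEXT, O2 TEXT, O3 TEXT, Property TEXT, Value TEXT);
INSERT INTO TP VALUES
  ('O1','O2','O3','P1','abc'), ('O1','O2','O3','P2','xyz'),
  ('O1','O3','O4','P1','123'), ('O2','O4','O5','P1','098');
""")

QUERY = """
SELECT O1, O2, O3, Property, Value FROM (
  SELECT c1.Id AS O1, c2.Id AS O2, c3.Id AS O3, tp.Property, tp.Value,
         row_number() OVER (
           PARTITION BY c1.Id, c2.Id, c3.Id, tp.Property
           -- deeper defining objects are closer ancestors
           ORDER BY p1.Depth DESC, p2.Depth DESC, p3.Depth DESC
         ) AS InheritancePriority
  FROM TP tp
  -- objects of the defining triple
  JOIN Objects p1 ON p1.Id = tp.O1
  JOIN Objects p2 ON p2.Id = tp.O2
  JOIN Objects p3 ON p3.Id = tp.O3
  -- every descendant (or self) of each defining object
  JOIN Objects c1 ON c1.L BETWEEN p1.L AND p1.R
  JOIN Objects c2 ON c2.L BETWEEN p2.L AND p2.R
  JOIN Objects c3 ON c3.L BETWEEN p3.L AND p3.R
) AS ranked
WHERE InheritancePriority = 1 AND Property = ? AND Value = ?
ORDER BY O1, O2, O3
"""

rows = con.execute(QUERY, ("P1", "abc")).fetchall()
print(len(rows))
```

Note that the Property/Value filter sits outside the ranked subquery: filtering by Value before ranking would hide the overriding rows and bring back the overridden triples.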

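And finally, I can now simply filter the result set by Property and Value.

This scheme works. Very reliably and predictably. It has proven very powerful for the business task it implements.

The only trouble is, it is awfully slow.
One might point out that joining seven tables could be slowing things down, but that is actually not the bottleneck.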

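According to the actual execution plan I get from SQL Management Studio (as well as SQL Profiler), the bottleneck is the sorting. The problem is that, to satisfy my window function, the server has to sort by Children1.Id, Children2.Id, Children3.Id, tp.Property, Objects1.[depthInTheTree] descending, Objects2.[depthInTheTree] descending, Objects3.[depthInTheTree] descending, and it cannot use any index, because the values come from a cross join of several tables.

EDIT: Following Michael Buen's suggestion (thank you, Michael), I have posted the whole puzzle on sqlfiddle here. The execution plan shows that the Sort operation accounts for 32% of the whole query, and that share will grow with the total number of rows, because all the other operations use indexes.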

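Usually in such cases I would use an indexed view, but not here, because indexed views cannot contain self-joins, and there are six of them.

The only way I can think of so far is to create six copies of the Objects table and use them for the joins, thus enabling an indexed view.
Has it come to the point where I am reduced to that kind of hack? Despair sets in.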

I have 3 possible answers.

The SQL Fiddle for your question is here: http://sqlfiddle.com/#!3/7c7a0/3/0

The SQL Fiddle for my answer is here: http://sqlfiddle.com/#!3/5d257/1

Warnings:

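  1. The query analyzer is not enough - I noticed that some answers were rejected because their query plans are more expensive than the original query. The analyzer is only a guide. Depending on the actual data set, hardware, and use case, a more expensive query can return results faster than a cheaper one. You have to test in your environment.
  2. The query analyzer is not reliable - even if you find a way to remove the "most expensive step" from a query, it often makes no difference to how the query performs.
  3. Query changes alone rarely mitigate schema/design problems - some answers were rejected because they involve schema-level changes such as triggers and additional tables. A complex query that resists optimization is a strong sign that the problem lies in the underlying design or in my expectations. You may not like it, but you may have to accept that the problem is not solvable at the query level.
  4. Indexed views cannot contain a row_number()/partition clause - working around the self-join problem by creating six copies of the Objects table is not enough to let you create the indexed view you suggested. I tried it in this sqlfiddle. If you uncomment the last "create index" statement, you get an error because your view "contains a ranking or aggregate window function".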

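Working answers:

  1. Left join instead of row_number() - you can use a query with left joins to exclude results that have been overridden lower in the tree. Removing the final "order by" from this query actually eliminates the sort that has been plaguing you! The execution plan for this query is still more expensive than your original, but see disclaimer #1 above.
  2. Indexed view for part of the query - using some serious query magic (based on this technique), I created an indexed view for part of the query. This view can be used to enhance the original question query or answer #1.
  3. Materialize into a well-indexed table - someone else suggested this answer, but they may not have explained it well. Unless your result set is very large or you update the source tables very frequently, materializing the query results and keeping them up to date with a trigger is the best way to solve this kind of problem. Once you have created a view for the query, it is easy enough to test this option. You can reuse answer #2 to speed up the trigger, and then improve it further over time. (You are talking about creating six copies of your tables; try this first. It guarantees that the performance of the select you care about will be as good as it can be.)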
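The left-join exclusion from answer #1 can be sketched in isolation as an anti-join over a pre-expanded candidate table. This is an illustration under stated assumptions, not the predicate used in the fiddle: it compares the parent left indexes lexicographically (first, then second, then third component), the `Candidates` table and its rows are made up for the example, and property P9 is purely hypothetical. It runs on SQLite through Python's sqlite3 module.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- One row per (child triple, contributing parent triple).
-- PL1..PL3 are the left indexes of the contributing parent's objects.
CREATE TABLE Candidates (
    O1 TEXT, O2 TEXT, O3 TEXT, Property TEXT, Value TEXT,
    PL1 INT, PL2 INT, PL3 INT
);
INSERT INTO Candidates VALUES
  ('O1','O2','O3','P1','abc',  1, 2, 6),  -- defined on (O1,O2,O3) itself
  ('O4','O4','O5','P1','abc',  1, 2, 6),  -- inherited from (O1,O2,O3)
  ('O4','O4','O5','P1','098',  2, 3, 7),  -- inherited from (O2,O4,O5)
  ('O4','O4','O5','P9','far',  1, 3, 7),  -- hypothetical, from (O1,O4,O5)
  ('O4','O4','O5','P9','near', 2, 2, 6);  -- hypothetical, from (O2,O2,O3)
""")

ANTI_JOIN = """
SELECT a.O1, a.O2, a.O3, a.Property, a.Value
FROM Candidates a
LEFT JOIN Candidates b
  ON  b.O1 = a.O1 AND b.O2 = a.O2 AND b.O3 = a.O3
  AND b.Property = a.Property
  -- b comes from a closer parent: among the ancestors of one object,
  -- a larger left index means a deeper, and therefore closer, ancestor
  AND (   b.PL1 > a.PL1
       OR (b.PL1 = a.PL1 AND b.PL2 > a.PL2)
       OR (b.PL1 = a.PL1 AND b.PL2 = a.PL2 AND b.PL3 > a.PL3))
-- keep only rows for which no closer candidate exists
WHERE b.O1 IS NULL
ORDER BY a.Property, a.O1, a.O2, a.O3
"""

kept = con.execute(ANTI_JOIN).fetchall()
for row in kept:
    print(row)
```

The hypothetical P9 rows show why the comparison is lexicographic: the parent (O2,O2,O3) wins on the first component even though its second and third components are further away.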

Here is the schema part of my answer from the sqlfiddle:

    Create Table Objects
    (
        Id int not null identity primary key,
        LeftIndex int not null default 0,
        RightIndex int not null default 0
    )

    alter table Objects add ParentId int null references Objects

    CREATE TABLE TP
    (
        Object1 int not null references Objects,
        Object2 int not null references Objects,
        Object3 int not null references Objects,
        Property varchar(20) not null,
        Value varchar(50) not null
    )

    insert into Objects(LeftIndex, RightIndex) values(1, 10)
    insert into Objects(ParentId, LeftIndex, RightIndex) values(1, 2, 5)
    insert into Objects(ParentId, LeftIndex, RightIndex) values(1, 6, 9)
    insert into Objects(ParentId, LeftIndex, RightIndex) values(2, 3, 4)
    insert into Objects(ParentId, LeftIndex, RightIndex) values(3, 7, 8)

    insert into TP(Object1, Object2, Object3, Property, Value) values(1,2,3, 'P1', 'abc')
    insert into TP(Object1, Object2, Object3, Property, Value) values(1,2,3, 'P2', 'xyz')
    insert into TP(Object1, Object2, Object3, Property, Value) values(1,3,4, 'P1', '123')
    insert into TP(Object1, Object2, Object3, Property, Value) values(2,4,5, 'P1', '098')

    create index ix_LeftIndex on Objects(LeftIndex)
    create index ix_RightIndex on Objects(RightIndex)
    create index ix_Objects on TP(Property, Value, Object1, Object2, Object3)
    create index ix_Prop on TP(Property)
    GO

    ---------- QUESTION ADDITIONAL SCHEMA --------
    CREATE VIEW TPResultView AS
    Select O1, O2, O3, Property, Value
    FROM
    (
        select Children1.Id as O1, Children2.Id as O2, Children3.Id as O3,
            tp.Property, tp.Value,
            row_number() over(
                partition by Children1.Id, Children2.Id, Children3.Id, tp.Property
                order by Objects1.LeftIndex desc, Objects2.LeftIndex desc, Objects3.LeftIndex desc
            ) as Idx
        from tp

        -- Select corresponding objects of the triple
        inner join Objects as Objects1 on Objects1.Id = tp.Object1
        inner join Objects as Objects2 on Objects2.Id = tp.Object2
        inner join Objects as Objects3 on Objects3.Id = tp.Object3

        -- Then add all possible children of all those objects
        inner join Objects as Children1 on Children1.LeftIndex between Objects1.LeftIndex and Objects1.RightIndex
        inner join Objects as Children2 on Children2.LeftIndex between Objects2.LeftIndex and Objects2.RightIndex
        inner join Objects as Children3 on Children3.LeftIndex between Objects3.LeftIndex and Objects3.RightIndex
    ) as x
    WHERE idx = 1
    GO

    ---------- ANSWER 1 SCHEMA --------
    CREATE VIEW TPIntermediate AS
    select tp.Property, tp.Value
        , Children1.Id as O1, Children2.Id as O2, Children3.Id as O3
        , Objects1.LeftIndex as PL1, Objects2.LeftIndex as PL2, Objects3.LeftIndex as PL3
        , Children1.LeftIndex as CL1, Children2.LeftIndex as CL2, Children3.LeftIndex as CL3
    from tp

    -- Select corresponding objects of the triple
    inner join Objects as Objects1 on Objects1.Id = tp.Object1
    inner join Objects as Objects2 on Objects2.Id = tp.Object2
    inner join Objects as Objects3 on Objects3.Id = tp.Object3

    -- Then add all possible children of all those objects
    inner join Objects as Children1 WITH (INDEX(ix_LeftIndex)) on Children1.LeftIndex between Objects1.LeftIndex and Objects1.RightIndex
    inner join Objects as Children2 WITH (INDEX(ix_LeftIndex)) on Children2.LeftIndex between Objects2.LeftIndex and Objects2.RightIndex
    inner join Objects as Children3 WITH (INDEX(ix_LeftIndex)) on Children3.LeftIndex between Objects3.LeftIndex and Objects3.RightIndex
    GO

    ---------- ANSWER 2 SCHEMA --------
    -- Partial calculation using an indexed view
    -- Circumvented the self-join limitation using a black magic technique, based on
    -- http://jmkehayias.blogspot.com/2008/12/creating-indexed-view-with-self-join.html
    CREATE TABLE dbo.multiplier (i INT PRIMARY KEY)

    INSERT INTO dbo.multiplier VALUES (1)
    INSERT INTO dbo.multiplier VALUES (2)
    INSERT INTO dbo.multiplier VALUES (3)
    GO

    CREATE VIEW TPIndexed
    WITH SCHEMABINDING
    AS
    SELECT tp.Object1, tp.object2, tp.object3, tp.property, tp.value,
        SUM(ISNULL(CASE M.i WHEN 1 THEN Objects.LeftIndex ELSE NULL END, 0)) as PL1,
        SUM(ISNULL(CASE M.i WHEN 2 THEN Objects.LeftIndex ELSE NULL END, 0)) as PL2,
        SUM(ISNULL(CASE M.i WHEN 3 THEN Objects.LeftIndex ELSE NULL END, 0)) as PL3,
        SUM(ISNULL(CASE M.i WHEN 1 THEN Objects.RightIndex ELSE NULL END, 0)) as PR1,
        SUM(ISNULL(CASE M.i WHEN 2 THEN Objects.RightIndex ELSE NULL END, 0)) as PR2,
        SUM(ISNULL(CASE M.i WHEN 3 THEN Objects.RightIndex ELSE NULL END, 0)) as PR3,
        COUNT_BIG(*) as ID
    FROM dbo.tp
    cross join dbo.multiplier M
    inner join dbo.Objects
        on (M.i = 1 AND Objects.Id = tp.Object1)
        or (M.i = 2 AND Objects.Id = tp.Object2)
        or (M.i = 3 AND Objects.Id = tp.Object3)
    GROUP BY tp.Object1, tp.object2, tp.object3, tp.property, tp.value
    GO

    -- This index is mostly useless but required
    create UNIQUE CLUSTERED index pk_TPIndexed on dbo.TPIndexed(property, value, object1, object2, object3)
    -- Once we have the clustered index, we can create a nonclustered that actually addresses our needs
    create NONCLUSTERED index ix_TPIndexed on dbo.TPIndexed(property, value, PL1, PL2, PL3, PR1, PR2, PR3)
    GO

    -- NOTE: this View is not indexed, but it uses the indexed view
    CREATE VIEW TPIndexedResultView AS
    Select O1, O2, O3, Property, Value
    FROM
    (
        select Children1.Id as O1, Children2.Id as O2, Children3.Id as O3,
            tp.Property, tp.Value,
            row_number() over(
                partition by tp.Property, Children1.Id, Children2.Id, Children3.Id
                order by tp.Property, Tp.PL1 desc, Tp.PL2 desc, Tp.PL3 desc
            ) as Idx
        from TPIndexed as TP WITH (NOEXPAND)

        -- Then add all possible children of all those objects
        inner join Objects as Children1 WITH (INDEX(ix_LeftIndex)) on Children1.LeftIndex between TP.PL1 and TP.PR1
        inner join Objects as Children2 WITH (INDEX(ix_LeftIndex)) on Children2.LeftIndex between TP.PL2 and TP.PR2
        inner join Objects as Children3 WITH (INDEX(ix_LeftIndex)) on Children3.LeftIndex between TP.PL3 and TP.PR3
    ) as x
    WHERE idx = 1
    GO

    -- NOTE: this View is not indexed, but it uses the indexed view
    CREATE VIEW TPIndexedIntermediate AS
    select tp.Property, tp.Value
        , Children1.Id as O1, Children2.Id as O2, Children3.Id as O3
        , PL1, PL2, PL3
        , Children1.LeftIndex as CL1, Children2.LeftIndex as CL2, Children3.LeftIndex as CL3
    from TPIndexed as TP WITH (NOEXPAND)

    -- Then add all possible children of all those objects
    inner join Objects as Children1 WITH (INDEX(ix_LeftIndex)) on Children1.LeftIndex between TP.PL1 and TP.PR1
    inner join Objects as Children2 WITH (INDEX(ix_LeftIndex)) on Children2.LeftIndex between TP.PL2 and TP.PR2
    inner join Objects as Children3 WITH (INDEX(ix_LeftIndex)) on Children3.LeftIndex between TP.PL3 and TP.PR3
    GO

    ---------- ANSWER 3 SCHEMA --------
    -- You're talking about making six copies of the TP table
    -- If you're going to go that far, you might as well go the trigger route
    -- The performance profile is much the same - slower on insert, faster on read
    -- And instead of still recalculating on every read, you'll be recalculating
    -- only when the data changes.
    CREATE TABLE TPResult
    (
        Object1 int not null references Objects,
        Object2 int not null references Objects,
        Object3 int not null references Objects,
        Property varchar(20) not null,
        Value varchar(50) not null
    )
    GO

    create UNIQUE index ix_Result on TPResult(Property, Value, Object1, Object2, Object3)

    --You'll have to imagine this trigger, sql fiddle doesn't want to do it
    --CREATE TRIGGER tr_TP
    --ON TP
    --  FOR INSERT, UPDATE, DELETE
    --AS
    --  DELETE FROM TPResult
    --
    -- For this example we'll just insert into the table once
    INSERT INTO TPResult
    SELECT O1, O2, O3, Property, Value
    FROM TPResultView

Here is the query part of my answer from the sqlfiddle:

    -------- QUESTION QUERY ----------
    -- Original query, modified to use the view I added
    SELECT O1, O2, O3, Property, Value
    FROM TPResultView
    WHERE property = 'P1' AND value = 'abc'
    -- Your assertion is that this order by is the most expensive part.
    -- Sometimes converting queries into views allows the server to
    -- optimize them better over time.
    -- NOTE: removing this order by has no effect on this query.
    -- ORDER BY O1, O2, O3
    GO

    -------- ANSWER 1 QUERY ----------
    -- A different way to get the same result.
    -- Query optimizer says this is more expensive, but I've seen cases where
    -- it says a query is more expensive but it returns results faster.
    SELECT O1, O2, O3, Property, Value
    FROM (
        SELECT A.O1, A.O2, A.O3, A.Property, A.Value
        FROM TPIntermediate A
        LEFT JOIN TPIntermediate B
            ON A.O1 = B.O1 AND A.O2 = B.O2 AND A.O3 = B.O3
            AND A.Property = B.Property
            AND
            (
                -- Find any rows with Parent LeftIndex triplet that is greater than this one
                (A.PL1 < B.PL1 AND A.PL2 < B.PL2 AND A.PL3 < B.PL3)
                OR
                -- Find any rows with LeftIndex triplet that is greater than this one
                (A.CL1 < B.CL1 AND A.CL2 < B.CL2 AND A.CL3 < B.CL3)
            )
        -- If this row has any rows that match the previous two cases, exclude it
        WHERE B.O1 IS NULL
    ) AS x
    WHERE property = 'P1' AND value = 'abc'
    -- NOTE: Removing this order _DOES_ reduce query cost removing the "sort" action
    -- that has been the focus of your question.
    -- However, it wasn't clear from your question whether this order by was required.
    --ORDER BY O1, O2, O3
    GO

    -------- ANSWER 2 QUERIES ----------
    -- Same as above but using an indexed view to partially calculate results
    SELECT O1, O2, O3, Property, Value
    FROM TPIndexedResultView
    WHERE property = 'P1' AND value = 'abc'
    -- Your assertion is that this order by is the most expensive part.
    -- Sometimes converting queries into views allows the server to
    -- optimize them better over time.
    -- NOTE: removing this order by has no effect on this query.
    --ORDER BY O1, O2, O3
    GO

    SELECT O1, O2, O3, Property, Value
    FROM (
        SELECT A.O1, A.O2, A.O3, A.Property, A.Value
        FROM TPIndexedIntermediate A
        LEFT JOIN TPIndexedIntermediate B
            ON A.O1 = B.O1 AND A.O2 = B.O2 AND A.O3 = B.O3
            AND A.Property = B.Property
            AND
            (
                -- Find any rows with Parent LeftIndex triplet that is greater than this one
                (A.PL1 < B.PL1 AND A.PL2 < B.PL2 AND A.PL3 < B.PL3)
                OR
                -- Find any rows with LeftIndex triplet that is greater than this one
                (A.CL1 < B.CL1 AND A.CL2 < B.CL2 AND A.CL3 < B.CL3)
            )
        -- If this row has any rows that match the previous two cases, exclude it
        WHERE B.O1 IS NULL
    ) AS x
    WHERE property = 'P1' AND value = 'abc'
    -- NOTE: Removing this order _DOES_ reduce query cost removing the "sort" action
    -- that has been the focus of your question.
    -- However, it wasn't clear from your question whether this order by was required.
    --ORDER BY O1, O2, O3
    GO

    -------- ANSWER 3 QUERY ----------
    -- Returning results from a pre-calculated table is fast and easy
    -- Unless you are doing many more inserts than reads, or your result
    -- set is very large, this is a fine way to compensate for a poor design
    -- in one area of your database.
    SELECT Object1 as O1, Object2 as O2, Object3 as O3, Property, Value
    FROM TPResult
    WHERE property = 'P1' AND value = 'abc'
    ORDER BY O1, O2, O3

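You could speed this up by materializing the join into an indexed table. This has the drawback of taking up space and having to be written to disk, but it has the advantage that the slow part can use an index.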

    insert into joinedresult
    select
        Children1.Id as O1, Children2.Id as O2, Children3.Id as O3,
        tp.Property, tp.Value,
        Objects1.[depthInTheTree] as O1D,
        Objects2.[depthInTheTree] as O2D,
        Objects3.[depthInTheTree] as O3D
    from
        ... (see above)

Make sure joinedresult has an index on [O1, O2, O3, Property, O1D, O2D, O3D], and clear it before each run. Then:

    select * from
    (
        -- joinedresult already exposes the materialized columns,
        -- so they are referenced by their own names here
        select
            O1, O2, O3, Property, Value,
            row_number() over(
                partition by O1, O2, O3, Property
                order by O1D desc, O2D desc, O3D desc
            ) as InheritancePriority
        from joinedresult
    ) as x
    where InheritancePriority = 1

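Have you tried an index (or a PK) with the "Value" column first, the "Property" column second, the "Object1" column third, the "Object2" column fourth, and the "Object3" column fifth? I am assuming that "Value" is more restrictive than "Property".

I am also assuming that the Id column is the primary key and that there is a foreign key relationship between ParentId and Id.

How does this query perform?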

    with
    -- First, get all combinations that match the property/value pair.
    validTrip as (
        select Object1, Object2, Object3
        from TriplesAndProperties
        where value = @value
          and property = @property
    ),
    -- Recursively flatten the inheritance hierarchy of Object1, 2 and 3.
    o1 as (
        select Id, 0 as InherLevel from Objects
        where Id in (select Object1 from validTrip)
        union all
        select rec.Id, base.InherLevel + 1
        from Objects rec
        inner join o1 base on rec.ParentId = base.Id
    ),
    o2 as (
        select Id, 0 as InherLevel from Objects
        where Id in (select Object2 from validTrip)
        union all
        select rec.Id, base.InherLevel + 1
        from Objects rec
        inner join o2 base on rec.ParentId = base.Id
    ),
    o3 as (
        select Id, 0 as InherLevel from Objects
        where Id in (select Object3 from validTrip)
        union all
        select rec.Id, base.InherLevel + 1
        from Objects rec
        inner join o3 base on rec.ParentId = base.Id
    )
    -- select the Id triple.
    select o1.Id as O1, o2.Id as O2, o3.Id as O3
    -- match every option in o1, with every option in o2, with every option in o3.
    from o1
    cross join o2
    cross join o3
    -- order by the inheritance level.
    order by o1.InherLevel, o2.InherLevel, o3.InherLevel;

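In this case hierarchical queries, i.e. WITH RECURSIVE ... or proprietary equivalents such as CONNECT BY, are your friend.

The way to solve your particular problem is: start at the leaf and ascend to the root, aggregating as you go and excluding anything already found.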
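As a minimal sketch of the "start at the leaf and ascend to the root" idea, the recursive query below walks from one object up its parent chain and records the distance at each step. It runs on SQLite through Python's sqlite3 module; the `Objects(Id, ParentId)` layout and the `up` CTE name are assumptions for illustration. SQL Server writes the same recursion with plain WITH, without the RECURSIVE keyword.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Objects (Id TEXT PRIMARY KEY, ParentId TEXT REFERENCES Objects(Id));
INSERT INTO Objects VALUES
  ('O1', NULL), ('O2', 'O1'), ('O4', 'O2'), ('O3', 'O1'), ('O5', 'O3');
""")

ANCESTORS = """
WITH RECURSIVE up(Id, ParentId, Distance) AS (
    -- anchor: the starting object itself, at distance 0
    SELECT Id, ParentId, 0 FROM Objects WHERE Id = ?
    UNION ALL
    -- step: move to the parent of the row found last
    SELECT o.Id, o.ParentId, up.Distance + 1
    FROM Objects o
    JOIN up ON o.Id = up.ParentId
)
SELECT Id, Distance FROM up ORDER BY Distance
"""


def ancestors(obj):
    """obj and its ancestors with their distance from obj, nearest first."""
    return con.execute(ANCESTORS, (obj,)).fetchall()


print(ancestors("O4"))
```

Because the rows come back nearest first, the first ancestor that defines a property is the one whose value applies; anything further up is already overridden.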

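I guess your table is fairly large, hence the slowness. In that case I also guess you have multiple properties (from 2 to many). If so, I suggest you move the "where property = 'P1'" inside the CTE. This filters out a large part of the data early, which can make your query faster by up to a factor equal to the number of properties.

Something like: http://sqlfiddle.com/#!3/7c7a0/92/0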

Caching is the KEY to making a query faster. It reduces the calculations you must make. You want to create an index because you want to CACHE and save WORK. Below are two possibilities to do this.

Option 1

The SQL database sorts because of your windowing function. And you say the windowing function is too slow as it is.

I don't know how well this will work, but it might work.

Instead of sorting by a number of columns, you could try sorting by a single column – "closeness".

Let's define closeness as some abstract integer for now. Instead of your windowing function, you can instead have the following SQL:

    select * from
    (
        select
            Children1.Id as O1, Children2.Id as O2, Children3.Id as O3,
            tp.Property, tp.Value,
            row_number() over(
                partition by Children1.Id, Children2.Id, Children3.Id, tp.Property
                order by closeness DESC
            ) as InheritancePriority
        from
            ... (see above)
    ) as x
    where InheritancePriority = 1

closeness can be a column defined in the TriplesAndProperties table. For each object, you can define its "closeness" as its distance from the root node (O1). Then we can define closeness(tuple) = closeness(Object1)*100 + closeness(Object2)*10 + closeness(Object3)

This way, the tuple furthest from the root is the one you want.

To avoid sorting, you just have to make sure that closeness is indexed.
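A small Python sketch of the closeness formula from Option 1, using the depths of the sample tree. The names `DEPTH` and `closeness` are assumptions for illustration. One caveat worth making explicit: the weights 100 and 10 only preserve the first-then-second-then-third ordering while every per-object depth stays below 10, so the sketch checks for that.

```python
# Distance of each sample object from the root O1.
DEPTH = {"O1": 0, "O2": 1, "O3": 1, "O4": 2, "O5": 2}


def closeness(triple, base=10):
    """Pack the three depths into one sortable integer.

    With base=10 this is depth1*100 + depth2*10 + depth3. Every depth
    must be smaller than base, otherwise a deep second or third component
    would spill into the next position and break the ordering.
    """
    d1, d2, d3 = (DEPTH[obj] for obj in triple)
    if max(d1, d2, d3) >= base:
        raise ValueError("depth does not fit into one position of the chosen base")
    return d1 * base * base + d2 * base + d3


print(closeness(("O1", "O2", "O3")), closeness(("O2", "O4", "O5")))
```

For trees deeper than nine levels, a larger base (or keeping three separate columns) would be needed; which is preferable depends on the real data.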


Option 2

I am VERY sure that this will work.

Define your TriplesAndProperties table to have the columns: Object1, Object2, Object3, Property, Value, Effective_Object1, Effective_Object2, Effective_Object3, Closeness.

Notice that here I also define closeness as a column.

When you insert/update a tuple into your table, (X,Y,Z), instead, you want to insert:

    (X,Y,Z,Property,Value,X,Y,Z,0)
    (X,Y,Z,Property,Value,X,Y,Z.child,1)
    (X,Y,Z,Property,Value,X,Y,Z.grandchild,2)
    (X,Y,Z,Property,Value,X,Y.child,Z,10)
    (X,Y,Z,Property,Value,X,Y.child,Z.child,11)
    (X,Y,Z,Property,Value,X,Y.child,Z.grandchild,12)
    (X,Y,Z,Property,Value,X,Y.grandchild,Z,20)
    (X,Y,Z,Property,Value,X,Y.grandchild,Z.child,21)
    (X,Y,Z,Property,Value,X,Y.grandchild,Z.grandchild,22)
    ...
    ...

This means that instead of inserting/updating/deleting a single row in your table, you are inserting up to ~20 rows. That's not too bad.

Then your query is VERY EASY.

You just say:

    SELECT * FROM
    (
        SELECT
            Effective_Object1, Effective_Object2, Effective_Object3,
            Property, Value,
            row_number() over(
                partition by Effective_Object1, Effective_Object2, Effective_Object3, Property
                -- In this option Closeness is the distance from the defining triple
                -- (0 for the triple itself), so the smallest value is the closest parent.
                order by Closeness ASC
            ) AS InheritancePriority
        FROM TriplesAndProperties
    ) AS x
    WHERE InheritancePriority = 1;

In this option, you have to make sure closeness is indexed, you can just index by the tuple (Effective_Object1, Effective_Object2, Effective_Object3, Property, Closeness).
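The row expansion behind Option 2 can be sketched in Python as follows. This is an illustration under assumptions, not the answer's exact procedure: it expands all three components (the listing above only shows Y and Z varying), weights the per-component distances by 100, 10 and 1, and treats the smallest distance as the closest parent. The names `PARENT`, `descendants_with_distance`, `expand` and `resolve` are hypothetical.

```python
from itertools import product

# Sample inheritance tree as a child -> parent relation.
PARENT = {"O1": None, "O2": "O1", "O4": "O2", "O3": "O1", "O5": "O3"}


def descendants_with_distance(obj):
    """(descendant_or_self, distance from obj) for the whole subtree of obj."""
    found, frontier = [], [(obj, 0)]
    while frontier:
        node, dist = frontier.pop()
        found.append((node, dist))
        frontier.extend((c, dist + 1) for c, p in PARENT.items() if p == node)
    return found


def expand(x, y, z, prop, value):
    """Rows to store for one explicit definition on (x, y, z)."""
    rows = []
    for (ex, dx), (ey, dy), (ez, dz) in product(
        descendants_with_distance(x),
        descendants_with_distance(y),
        descendants_with_distance(z),
    ):
        # Weights assume every distance stays below 10.
        rows.append((x, y, z, prop, value, ex, ey, ez, dx * 100 + dy * 10 + dz))
    return rows


def resolve(rows):
    """Per effective triple and property, keep the value with the smallest distance."""
    best = {}
    for _x, _y, _z, prop, value, ex, ey, ez, dist in rows:
        key = (ex, ey, ez, prop)
        if key not in best or dist < best[key][0]:
            best[key] = (dist, value)
    return {key: value for key, (_dist, value) in best.items()}


table = expand("O1", "O2", "O3", "P1", "abc") + expand("O2", "O4", "O5", "P1", "098")
effective = resolve(table)
print(len(table), effective[("O4", "O4", "O5", "P1")])
```

The expansion moves the cost to write time: each explicit definition stores one row per descendant triple, and the read side only has to pick one row per group.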


In both cases you have some amount of caching: data that adds no new information as such, but that saves a certain amount of calculation or work.