SQL: compare the data in two tables

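I have two tables, TableA and TableB, with the same column format. For example, both TableA and TableB have the columns

    A B C D E F

where A and B are the primary keys.

How can I write SQL to check whether TableA and TableB, which have the same primary keys, contain exactly the same values in every column?

In other words, I want to confirm that the two tables hold exactly the same data.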

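Depending on the flavor of SQL your database management system uses, you should be able to use MINUS or EXCEPT.

    select * from tableA
    minus
    select * from tableB

If the query returns no rows, every row of tableA is also present in tableB. Run the same query with the tables swapped to check the other direction; if both return no rows, the data is exactly the same.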

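Using relational operators:

    SELECT * FROM TableA
    UNION
    SELECT * FROM TableB
    EXCEPT
    SELECT * FROM TableA
    INTERSECT
    SELECT * FROM TableB;

For Oracle, change EXCEPT to MINUS.

A slightly picky point: the above relies on operator precedence, which according to the SQL standard is implementation-dependent, so YMMV. It works on SQL Server, where the precedence is:

  1. Expressions in parentheses
  2. INTERSECT
  3. EXCEPT and UNION, evaluated from left to right.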
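
To avoid depending on precedence at all, the same symmetric-difference query can be written with explicit parentheses. This is only a sketch: it assumes TableA and TableB have identical column lists and a DBMS that accepts parenthesized set operands (SQL Server and PostgreSQL do). On Oracle, replace EXCEPT with MINUS.

```sql
-- Rows that appear in either table, minus rows that appear in both.
-- An empty result means the two tables contain the same set of rows.
(
    SELECT * FROM TableA
    UNION
    SELECT * FROM TableB
)
EXCEPT
(
    SELECT * FROM TableA
    INTERSECT
    SELECT * FROM TableB
);
```

Because every operand is parenthesized, the result no longer depends on how a particular engine ranks INTERSECT against UNION and EXCEPT.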

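dietbuddha has a good answer. Where MINUS or EXCEPT is not available, one option is to UNION ALL the two tables, group by all the columns, and make sure there are two of everything:

    SELECT col1, col2, col3
    FROM (SELECT * FROM tableA UNION ALL SELECT * FROM tableB) data
    GROUP BY col1, col2, col3
    HAVING count(*) != 2

Using EXISTS:

    SELECT c.ID
    FROM clients c
    WHERE EXISTS (SELECT c2.ID FROM clients2 c2 WHERE c2.ID = c.ID);

This returns all the IDs that exist in both tables. To get the differences, change EXISTS to NOT EXISTS.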
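
As a sketch of that change, using the same clients and clients2 tables, the query below lists the IDs present in clients but missing from clients2. Note that it compares only the ID column, not the other columns, and that it covers one direction only.

```sql
-- IDs that exist in clients but have no matching row in clients2.
SELECT c.ID
FROM clients c
WHERE NOT EXISTS (
    SELECT 1
    FROM clients2 c2
    WHERE c2.ID = c.ID
);
```

Swapping the two table names gives the IDs that exist only in clients2.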

Building on the relational-operator script above, I modified it to also show which table each entry comes from.

    DECLARE @table1 NVARCHAR(80) = 'table 1 name'
    DECLARE @table2 NVARCHAR(80) = 'table 2 name'
    DECLARE @sql NVARCHAR(1000)

    SET @sql = '
    SELECT ''' + @table1 + ''' AS table_name, * FROM
    (
        SELECT * FROM ' + @table1 + '
        EXCEPT
        SELECT * FROM ' + @table2 + '
    ) x
    UNION
    SELECT ''' + @table2 + ''' AS table_name, * FROM
    (
        SELECT * FROM ' + @table2 + '
        EXCEPT
        SELECT * FROM ' + @table1 + '
    ) y
    '

    EXEC sp_executesql @stmt = @sql

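For completeness, here is a stored procedure that uses the EXCEPT method to compare two tables and returns the result in a single table with three error statuses: ADD, DEL, and GAP. The tables must have the same primary key. You pass in the two tables and the fields to compare, for one or both tables.

Call it like this (the last argument is optional): ps_TableGap 'tbl1','Tbl2','fld1,fld2,fld3','fld4,fld5,fld6'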

    /****** Object: StoredProcedure [dbo].[ps_TableGap] Script Date: 10/03/2013 16:03:44 ******/
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    -- =============================================
    -- Author: Arnaud ALLAVENA
    -- Create date: 03.10.2013
    -- Description: Compare tables
    -- =============================================
    create PROCEDURE [dbo].[ps_TableGap]
        -- Add the parameters for the stored procedure here
        @Tbl1 as varchar(100), @Tbl2 as varchar(100), @Fld1 as varchar(1000), @Fld2 as varchar(1000) = ''
    AS
    BEGIN
        -- SET NOCOUNT ON added to prevent extra result sets from
        -- interfering with SELECT statements.
        SET NOCOUNT ON;

        --Variables
        --@Tbl1 = table 1
        --@Tbl2 = table 2
        --@Fld1 = Fields to compare from table 1
        --@Fld2 Fields to compare from table 2
        Declare @SQL varchar(8000) = '' --SQL statements
        Declare @nLoop int = 1 --loop counter
        Declare @Pk varchar(1000) = '' --primary key(s)
        Declare @Pk1 varchar(1000) = '' --first field of primary key
        declare @strTmp varchar(50) = '' --returns value in Pk determination
        declare @FldTmp varchar(1000) = '' --temporarily fields for alias calculation

        --If @Fld2 empty we take @Fld1
        --fields rules: fields to be compare must be in same order and type - always returns Gap
        If @Fld2 = '' Set @Fld2 = @Fld1

        --Change @Fld2 with Alias prefix xxx become _xxx
        while charindex(',',@Fld2) > 0
        begin
            Set @FldTmp = @FldTmp + (select substring(@Fld2,1,charindex(',',@Fld2)-1) + ' as _' + substring(@Fld2,1,charindex(',',@Fld2)-1) + ',')
            Set @Fld2 = (select ltrim(right(@Fld2,len(@Fld2)-charindex(',',@Fld2))))
        end
        Set @FldTmp = @FldTmp + @Fld2 + ' as _' + @Fld2
        Set @Fld2 = @FldTmp

        --Determinate primary key jointure
        --rule: same pk in both tables
        Set @nLoop = 1
        Set @SQL = 'Declare crsr cursor for select COLUMN_NAME from INFORMATION_SCHEMA.KEY_COLUMN_USAGE where TABLE_NAME = ''' + @Tbl1 + ''' or TABLE_SCHEMA + ''.'' + TABLE_NAME = ''' + @Tbl1 + ''' or TABLE_CATALOG + ''.'' + TABLE_SCHEMA + ''.'' + TABLE_NAME = ''' + @Tbl1 + ''' order by ORDINAL_POSITION'
        exec(@SQL)
        open crsr
        fetch next from crsr into @strTmp
        while @@fetch_status = 0
        begin
            if @nLoop = 1
            begin
                Set @Pk = 's.' + @strTmp + ' = b._' + @strTmp
                Set @Pk1 = @strTmp
                set @nLoop = @nLoop + 1
            end
            Else
                Set @Pk = @Pk + ' and s.' + @strTmp + ' = b._' + @strTmp
            fetch next from crsr into @strTmp
        end
        close crsr
        deallocate crsr

        --SQL statement build
        set @SQL = 'select case when s.' + @Pk1 + ' is null then ''Del'' when b._' + @Pk1 + ' is null then ''Add'' else ''Gap'' end as TypErr, '''
        set @SQL = @SQL + @Tbl1 +''' as Tbl1, s.*, ''' + @Tbl2 +''' as Tbl2 ,b.* from (Select ' + @Fld1 + ' from ' + @Tbl1
        set @SQL = @SQL + ' EXCEPT SELECT ' + @Fld2 + ' from ' + @Tbl2 + ')s full join (Select ' + @Fld2 + ' from ' + @Tbl2
        set @SQL = @SQL + ' EXCEPT SELECT ' + @Fld1 + ' from ' + @Tbl1 +')b on '+ @Pk

        --Run SQL statement
        Exec(@SQL)
    END

Be careful with duplicate rows (PostgreSQL example):

    SELECT unnest(ARRAY[1,2,2,3,3])
    EXCEPT
    SELECT unnest(ARRAY[1,1,2,3,3])
    UNION
    SELECT unnest(ARRAY[1,1,2,3,3])
    EXCEPT
    SELECT unnest(ARRAY[1,2,2,3,3])

The result is empty, even though the sources are different!

But:

    (
        SELECT unnest(ARRAY[1,2,2,3])
        EXCEPT ALL
        SELECT unnest(ARRAY[2,1,2,3])
    )
    UNION
    (
        SELECT unnest(ARRAY[2,1,2,3])
        EXCEPT ALL
        SELECT unnest(ARRAY[1,2,2,3])
    )

works.
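
To see why EXCEPT ALL matters, here is a sketch that applies it to the first pair of arrays, the ones plain EXCEPT could not tell apart. It assumes PostgreSQL (unnest and EXCEPT ALL are not available everywhere), and it uses UNION ALL rather than UNION so that surplus copies are not collapsed.

```sql
-- EXCEPT ALL subtracts row counts instead of removing duplicates,
-- so a value that occurs more often on one side is still reported.
(
    SELECT unnest(ARRAY[1,2,2,3,3])
    EXCEPT ALL
    SELECT unnest(ARRAY[1,1,2,3,3])
)
UNION ALL
(
    SELECT unnest(ARRAY[1,1,2,3,3])
    EXCEPT ALL
    SELECT unnest(ARRAY[1,2,2,3,3])
);
```

This returns two rows: a 2 (the first array has one more 2) and a 1 (the second array has one more 1), in no guaranteed order.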

Another approach is an enhanced query based on the answers from dietbuddha and IanMc. The query includes a description to help show where rows exist and where they are missing. (Note: this is for SQL Server.)

    (
        select 'InTableA_NoMatchInTableB' as Msg, * from tableA
        except
        select 'InTableA_NoMatchInTableB', * from tableB
    )
    union all
    (
        select 'InTableB_NoMatchInTableA' as Msg, * from tableB
        except
        select 'InTableB_NoMatchInTableA', * from tableA
    )

MySQL does not support MINUS. With performance in mind, this is a fast query:

    SELECT t1.id, t1.id
    FROM t1 inner join t2 using (id)
    where concat(t1.C, t1.D, ...) <> concat(t2.C, t2.D, ...)
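
One caveat with comparing concatenated strings: in MySQL, CONCAT returns NULL if any argument is NULL, and different column values can concatenate to the same string ('ab' and 'c' versus 'a' and 'bc'). A possible alternative, sketched here for just the two columns C and D named above, is to compare column by column with MySQL's NULL-safe equality operator `<=>`.

```sql
-- <=> treats two NULLs as equal and never returns NULL itself,
-- so NOT (a <=> b) is true exactly when the two values differ.
SELECT t1.id
FROM t1
INNER JOIN t2 USING (id)
WHERE NOT (t1.C <=> t2.C)
   OR NOT (t1.D <=> t2.D);
```

Like the original query, this only reports rows whose id exists in both tables; ids missing from one side need a separate check.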

Enhancing dietbuddha's answer...

    select * from
    (
        select * from tableA
        minus
        select * from tableB
    )
    union all
    select * from
    (
        select * from tableB
        minus
        select * from tableA
    )

I ran into the same problem in SQL Server and wrote this T-SQL script to automate the process (this is actually a watered-down version; my full version writes all the differences out for easy reporting).

Update 'MyTable' and 'MyOtherTable' to the names of the tables you want to compare.

    DECLARE @ColName varchar(100)
    DECLARE @Table1 varchar(100) = 'MyTable'
    DECLARE @Table2 varchar(100) = 'MyOtherTable'

    IF (OBJECT_ID('tempdb..#col') IS NOT NULL) DROP TABLE #col

    SELECT IDENTITY(INT, 1, 1) RowNum, c.name
    INTO #col
    FROM SYS.Objects o
    JOIN SYS.columns c on o.object_id = c.object_id
    WHERE o.name = @Table1
      AND NOT c.Name IN ('List','Columns','YouWantToIgnore')

    DECLARE @Counter INT = (SELECT MAX(RowNum) FROM #col)

    -- Compare the two tables one column at a time, joined on Identifier
    WHILE @Counter > 0
    BEGIN
        SET @ColName = (SELECT name FROM #Col WHERE RowNum = @Counter)
        EXEC ('SELECT t1.Identifier
                    ,t1.'+@ColName+' AS '+@Table1+@ColName+'
                    ,t2.'+@ColName+' AS '+@Table2+@ColName+'
               FROM '+@Table1+' t1
               LEFT JOIN '+@Table2+' t2 ON t1.Identifier = t2.Identifier
               WHERE t1.'+@ColName+' <> t2.'+@ColName)
        SET @Counter = @Counter - 1
    END
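
Note that `<>` evaluates to unknown when either side is NULL, so the script above does not report a value that is NULL in one table and non-NULL in the other. One NULL-safe predicate in SQL Server is EXISTS over an EXCEPT of the two values, because EXCEPT treats two NULLs as equal. A minimal sketch for a single column follows; SomeColumn is a hypothetical column name, and Identifier is the key column the script already assumes.

```sql
-- Reports rows whose SomeColumn values differ, including the case
-- where exactly one side is NULL. SomeColumn is a placeholder name.
SELECT t1.Identifier,
       t1.SomeColumn AS Table1Value,
       t2.SomeColumn AS Table2Value
FROM MyTable t1
INNER JOIN MyOtherTable t2
        ON t1.Identifier = t2.Identifier
WHERE EXISTS (SELECT t1.SomeColumn EXCEPT SELECT t2.SomeColumn);
```

The same predicate could be substituted for the `<>` comparison inside the dynamic SQL of the loop.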

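I wrote this to compare the results of a rather nasty view that I ported from Oracle to SQL Server. It creates a pair of temp tables, #DataVariances and #SchemaVariances, which hold the differences in (you guessed it) the data in the tables and the schema of the tables themselves.

It requires both tables to have a primary key, but if a source table does not have one, you can load it into tempdb with an identity column.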

    declare @TableA_ThreePartName nvarchar(max) = ''
    declare @TableB_ThreePartName nvarchar(max) = ''
    declare @KeyName nvarchar(max) = ''

    /***********************************************************************************************
        Script to compare two tables and return differences in schema and data.
        Author: Devin Lamothe 2017-08-11
    ***********************************************************************************************/
    set nocount on

    -- Split three part name into database/schema/table
    declare @Database_A nvarchar(max) = (
        select left(@TableA_ThreePartName,charindex('.',@TableA_ThreePartName) - 1))
    declare @Table_A nvarchar(max) = (
        select right(@TableA_ThreePartName,len(@TableA_ThreePartName) - charindex('.',@TableA_ThreePartName,len(@Database_A) + 2)))
    declare @Schema_A nvarchar(max) = (
        select replace(replace(@TableA_ThreePartName,@Database_A + '.',''),'.' + @Table_A,''))

    declare @Database_B nvarchar(max) = (
        select left(@TableB_ThreePartName,charindex('.',@TableB_ThreePartName) - 1))
    declare @Table_B nvarchar(max) = (
        select right(@TableB_ThreePartName,len(@TableB_ThreePartName) - charindex('.',@TableB_ThreePartName,len(@Database_B) + 2)))
    declare @Schema_B nvarchar(max) = (
        select replace(replace(@TableB_ThreePartName,@Database_B + '.',''),'.' + @Table_B,''))

    -- Get schema for both tables
    declare @GetTableADetails nvarchar(max) = '
        use [' + @Database_A +']
        select COLUMN_NAME
             , DATA_TYPE
          from INFORMATION_SCHEMA.COLUMNS
         where TABLE_NAME = ''' + @Table_A + '''
           and TABLE_SCHEMA = ''' + @Schema_A + '''
        '
    create table #Table_A_Details (
        ColumnName nvarchar(max)
      , DataType nvarchar(max)
    )
    insert into #Table_A_Details
    exec (@GetTableADetails)

    declare @GetTableBDetails nvarchar(max) = '
        use [' + @Database_B +']
        select COLUMN_NAME
             , DATA_TYPE
          from INFORMATION_SCHEMA.COLUMNS
         where TABLE_NAME = ''' + @Table_B + '''
           and TABLE_SCHEMA = ''' + @Schema_B + '''
        '
    create table #Table_B_Details (
        ColumnName nvarchar(max)
      , DataType nvarchar(max)
    )
    insert into #Table_B_Details
    exec (@GetTableBDetails)

    -- Get differences in table schema
    select ROW_NUMBER() over (order by a.ColumnName, b.ColumnName) as RowKey
         , a.ColumnName as A_ColumnName
         , a.DataType as A_DataType
         , b.ColumnName as B_ColumnName
         , b.DataType as B_DataType
      into #FieldList
      from #Table_A_Details a
      full outer join #Table_B_Details b
        on a.ColumnName = b.ColumnName
     where a.ColumnName is null
        or b.ColumnName is null
        or a.DataType <> b.DataType

    drop table #Table_A_Details
    drop table #Table_B_Details

    select coalesce(A_ColumnName,B_ColumnName) as ColumnName
         , A_DataType
         , B_DataType
      into #SchemaVariances
      from #FieldList

    -- Get differences in table data
    declare @LastColumn int = (select max(RowKey) from #FieldList)
    declare @RowNumber int = 1
    declare @ThisField nvarchar(max)
    declare @TestSql nvarchar(max)

    create table #DataVariances (
        TableKey nvarchar(max)
      , FieldName nvarchar(max)
      , TableA_Value nvarchar(max)
      , TableB_Value nvarchar(max)
    )

    delete from #FieldList where A_DataType in ('varbinary','image') or B_DataType in ('varbinary','image')

    while @RowNumber <= @LastColumn
    begin
        -- Look up the field at the top of the loop so the first row is not skipped
        set @ThisField = (select coalesce(A_ColumnName,B_ColumnName) from #FieldList a where RowKey = @RowNumber)
        set @TestSql = '
            select coalesce(a.[' + @KeyName + '],b.[' + @KeyName + ']) as TableKey
                 , ''' + @ThisField + ''' as FieldName
                 , a.[' + @ThisField + '] as [TableA_Value]
                 , b.[' + @ThisField + '] as [TableB_Value]
              from [' + @Database_A + '].[' + @Schema_A + '].[' + @Table_A + '] a
             inner join [' + @Database_B + '].[' + @Schema_B + '].[' + @Table_B + '] b
                on a.[' + @KeyName + '] = b.[' + @KeyName + ']
             where ltrim(rtrim(a.[' + @ThisField + '])) <> ltrim(rtrim(b.[' + @ThisField + ']))
                or (a.[' + @ThisField + '] is null and b.[' + @ThisField + '] is not null)
                or (a.[' + @ThisField + '] is not null and b.[' + @ThisField + '] is null)
            '
        insert into #DataVariances
        exec (@TestSql)
        set @RowNumber = @RowNumber + 1
    end

    drop table #FieldList

    print 'Query complete. Select from #DataVariances to verify data integrity or #SchemaVariances to verify schemas match. Data types varbinary and image are not checked.'