在entity framework中插入的最快方式

我正在寻找插入到entity framework最快的方式。

我问这是因为你有一个活跃的TransactionScope的情况下，插入是巨大的（4000 +）。它可能会持续超过10分钟（交易的默认超时），这将导致不完整的交易。

要对你的问题发表评论：

“… SavingChanges（ 对于每个logging ）…”

这是你能做的最糟糕的事！调用SaveChanges()为每个logging减慢批量插入非常closures。我会做几个简单的testing，这将很可能提高性能：

在ALLlogging后调用SaveChanges()一次。
在例如100条logging之后调用SaveChanges() 。
在例如100条logging之后调用SaveChanges()并处理上下文并创build一个新的上下文。
禁用更改检测

对于批量插入，我正在尝试使用如下模式：

 using (TransactionScope scope = new TransactionScope()) { MyDbContext context = null; try { context = new MyDbContext(); context.Configuration.AutoDetectChangesEnabled = false; int count = 0; foreach (var entityToInsert in someCollectionOfEntitiesToInsert) { ++count; context = AddToContext(context, entityToInsert, count, 100, true); } context.SaveChanges(); } finally { if (context != null) context.Dispose(); } scope.Complete(); } private MyDbContext AddToContext(MyDbContext context, Entity entity, int count, int commitCount, bool recreateContext) { context.Set<Entity>().Add(entity); if (count % commitCount == 0) { context.SaveChanges(); if (recreateContext) { context.Dispose(); context = new MyDbContext(); context.Configuration.AutoDetectChangesEnabled = false; } } return context; }

我有一个testing程序，向数据库中插入560.000个实体（9个标量属性，没有导航属性）。有了这个代码，它在不到3分钟的时间内工作。

对于性能来说，在“许多”logging（“100”或1000左右的“许多” SaveChanges()之后调用SaveChanges()是非常重要的。它还提高了SaveChanges后处理上下文的性能，并创build一个新的上下文。这将清除所有实体的上下文， SaveChanges不会这样做，实体仍然以Unchanged状态连接到上下文。在上下文中，连接实体的规模在不断增大，逐步减慢了插入的速度。所以，在一段时间之后清除它是有帮助的。

以下是我的560.000个实体的一些测量：

commitCount = 1，recreateContext = false： 很多小时 （这是你当前的过程）
commitCount = 100，recreateContext = false： 超过20分钟
commitCount = 1000，recreateContext = false： 242秒
commitCount = 10000，recreateContext = false： 202秒
commitCount = 100000，recreateContext = false： 199秒
commitCount = 1000000，recreateContext = false： 内存不足exception
commitCount = 1，recreateContext = true： 超过10分钟
commitCount = 10，recreateContext = true： 241秒
commitCount = 100，recreateContext = true： 164秒
commitCount = 1000，recreateContext = true： 191秒

上面第一个testing的行为是性能非常非线性，随着时间的推移极端降低。（“许多小时”是一个估计，我从来没有完成这个testing，20分钟后我停在了50.000个实体。）在所有其他testing中，这种非线性行为并不是那么重要。

这个组合提高了速度。

 context.Configuration.AutoDetectChangesEnabled = false; context.Configuration.ValidateOnSaveEnabled = false;

最快的方法是使用我开发的批量插入扩展。

它使用SqlBulkCopy和自定义数据读取器来获得最大的性能。因此，它比使用常规插入或AddRange快20多倍 EntityFramework.BulkInsert与EF AddRange

用法非常简单

 context.BulkInsert(hugeAmountOfEntities);

你应该看看使用System.Data.SqlClient.SqlBulkCopy 。这里是文档，当然还有很多在线的教程。

对不起，我知道你正在寻找一个简单的答案让EF做你想做的事情，但批量操作并不是真正的ORMs的意思。

我同意Adam Rackis。 SqlBulkCopy是将批量logging从一个数据源传输到另一个数据源的最快方式。我用这个来复制20K的logging，花了不到3秒。看看下面的例子。

 public static void InsertIntoMembers(DataTable dataTable) { using (var connection = new SqlConnection(@"data source=;persist security info=True;user id=;password=;initial catalog=;MultipleActiveResultSets=True;App=EntityFramework")) { SqlTransaction transaction = null; connection.Open(); try { transaction = connection.BeginTransaction(); using (var sqlBulkCopy = new SqlBulkCopy(connection, SqlBulkCopyOptions.TableLock, transaction)) { sqlBulkCopy.DestinationTableName = "Members"; sqlBulkCopy.ColumnMappings.Add("Firstname", "Firstname"); sqlBulkCopy.ColumnMappings.Add("Lastname", "Lastname"); sqlBulkCopy.ColumnMappings.Add("DOB", "DOB"); sqlBulkCopy.ColumnMappings.Add("Gender", "Gender"); sqlBulkCopy.ColumnMappings.Add("Email", "Email"); sqlBulkCopy.ColumnMappings.Add("Address1", "Address1"); sqlBulkCopy.ColumnMappings.Add("Address2", "Address2"); sqlBulkCopy.ColumnMappings.Add("Address3", "Address3"); sqlBulkCopy.ColumnMappings.Add("Address4", "Address4"); sqlBulkCopy.ColumnMappings.Add("Postcode", "Postcode"); sqlBulkCopy.ColumnMappings.Add("MobileNumber", "MobileNumber"); sqlBulkCopy.ColumnMappings.Add("TelephoneNumber", "TelephoneNumber"); sqlBulkCopy.ColumnMappings.Add("Deleted", "Deleted"); sqlBulkCopy.WriteToServer(dataTable); } transaction.Commit(); } catch (Exception) { transaction.Rollback(); } } }

我已经调查过Slauma的答案了（这真是太棒了，谢谢你的提议），而且我减less了批量，直到达到最佳速度。看着Slauma的结果：

commitCount = 1，recreateContext = true：超过10分钟
commitCount = 10，recreateContext = true：241秒
commitCount = 100，recreateContext = true：164秒
commitCount = 1000，recreateContext = true：191秒

从1到10，从10到100，从100到1000插入速度再次下降，可见速度增加。

所以我把注意力集中在当你将批量减less到10到100之间的时候发生的事情，这里是我的结果（我使用不同的行内容，所以我的时间是不同的值）：

 Quantity | Batch size | Interval 1000 1 3 10000 1 34 100000 1 368 1000 5 1 10000 5 12 100000 5 133 1000 10 1 10000 10 11 100000 10 101 1000 20 1 10000 20 9 100000 20 92 1000 27 0 10000 27 9 100000 27 92 1000 30 0 10000 30 9 100000 30 92 1000 35 1 10000 35 9 100000 35 94 1000 50 1 10000 50 10 100000 50 106 1000 100 1 10000 100 14 100000 100 141

根据我的结果，批量大小的实际最佳值为30左右。这个数字比10和100都less。问题是，我不知道为什么是最优的，我也不能find合理的解释。

如果Add（）实体依赖于上下文中的其他预加载实体（例如，导航属性），则Dispose（）上下文会产生问题

我使用类似的概念来保持我的语境，以达到相同的性能

但是，而不是Dispose（）上下文并重新创build，我只是分离已经存在的实体SaveChanges（）

 public void AddAndSave<TEntity>(List<TEntity> entities) where TEntity : class { const int CommitCount = 1000; //set your own best performance number here int currentCount = 0; while (currentCount < entities.Count()) { //make sure it don't commit more than the entities you have int commitCount = CommitCount; if ((entities.Count - currentCount) < commitCount) commitCount = entities.Count - currentCount; //eg Add entities [ i = 0 to 999, 1000 to 1999, ... , n to n+999... ] to conext for (int i = currentCount; i < (currentCount + commitCount); i++) _context.Entry(entities[i]).State = System.Data.EntityState.Added; //same as calling _context.Set<TEntity>().Add(entities[i]); //commit entities[n to n+999] to database _context.SaveChanges(); //detach all entities in the context that committed to database //so it won't overload the context for (int i = currentCount; i < (currentCount + commitCount); i++) _context.Entry(entities[i]).State = System.Data.EntityState.Detached; currentCount += commitCount; } }

用try catch和TrasactionScope（）来包装它，如果你需要的话，在这里不要显示它们来保持代码的清洁

正如其他人所说的SqlBulkCopy是如果你想要真正好的插入性能的话。

实现起来有点麻烦，但有些库可以帮助你。有一些在那里，但我会无耻地插上我自己的图书馆： https : //github.com/MikaelEliasson/EntityFramework.Utilities#batch-insert-entities

您需要的唯一代码是：

  using (var db = new YourDbContext()) { EFBatchOperation.For(db, db.BlogPosts).InsertAll(list); }

那快多less呢？很难说，因为它取决于如此多的因素，计算机性能，networking，对象大小等等我做的性能testing表明，25k实体可以插入本地主机上的标准方式大约10s如果你优化你的EFconfiguration像在其他答案中提到。大约300ms的EFUtilities。更有意思的是，我使用这种方法在不到15秒的时间内保存了300万个实体，平均每秒约20万个实体。

如果你需要插入相关的数据，那么一个问题是当然的。这可以使用上面的方法有效地进入sql服务器，但它需要你有一个Id生成策略，可以让你在父代的应用程序代码中生成id，所以你可以设置外键。这可以通过使用GUID或者HiLo id代码来完成。

我会推荐这篇文章如何使用EF进行批量插入。

entity framework和慢批量插入

他探索了这些领域并比较了性能：

默认EF（57分钟完成添加30,000条logging）
用ADO.NET代码代替（相同的30,000为25 秒）
上下文膨胀 – 通过使用每个工作单元的新上下文保持活动的上下文图小（相同的30,000次插入需要33秒）
大列表 – closuresAutoDetectChangesEnabled（将时间缩短到20秒）
批量（至16秒）
DbTable.AddRange（） – （性能在12范围内）

尝试使用存储过程 ，将获得您要插入的数据的XML。

另一个select是使用Nuget提供的SqlBulkTools。它使用起来非常简单，并且具有一些强大的function。

例：

 var bulk = new BulkOperations(); var books = GetBooks(); using (TransactionScope trans = new TransactionScope()) { using (SqlConnection conn = new SqlConnection(ConfigurationManager .ConnectionStrings["SqlBulkToolsTest"].ConnectionString)) { bulk.Setup<Book>() .ForCollection(books) .WithTable("Books") .AddAllColumns() .BulkInsert() .Commit(conn); } trans.Complete(); }

有关更多示例和高级用法，请参阅文档。免责声明：我是这个图书馆的作者，任何意见都是我自己的意见。

我正在寻找插入到entity framework最快的方式

有一些支持批量插入的第三方库可用：

Z.EntityFramework.Extensions（推荐）
EFUtilities
EntityFramework.BulkInsert

请参阅： entity framework批量插入库

select批量插入库时要小心。 只有entity framework扩展支持所有types的关联和inheritance，而且它是唯一支持的。

免责声明 ：我是entity framework扩展的所有者

这个库允许您执行您的场景所需的所有批量操作：

批量SaveChanges
大容量插入
批量删除
批量更新
批量合并

例

 // Easy to use context.BulkSaveChanges(); // Easy to customize context.BulkSaveChanges(bulk => bulk.BatchSize = 100); // Perform Bulk Operations context.BulkDelete(customers); context.BulkInsert(customers); context.BulkUpdate(customers); // Customize Primary Key context.BulkMerge(customers, operation => { operation.ColumnPrimaryKeyExpression = customer => customer.Code; });

据我所知， EntityFramework no BulkInsert来增加巨大插入的性能。

在这种情况下，您可以使用ADO.net中的SqlBulkCopy来解决您的问题

下面是一个使用entity framework和使用SqlBulkCopy类的现实例子之间的性能比较：如何将复杂对象批量插入SQL Server数据库

正如其他人已经强调的那样，ORM并不意味着用于批量操作。它们提供了灵活性，关注点和其他好处的分离，但批量操作（散装阅读除外）不是其中之一。

我已经对@Slauma的例子进行了一个通用的扩展，

 public static class DataExtensions { public static DbContext AddToContext<T>(this DbContext context, object entity, int count, int commitCount, bool recreateContext, Func<DbContext> contextCreator) { context.Set(typeof(T)).Add((T)entity); if (count % commitCount == 0) { context.SaveChanges(); if (recreateContext) { context.Dispose(); context = contextCreator.Invoke(); context.Configuration.AutoDetectChangesEnabled = false; } } return context; } }

用法：

 public void AddEntities(List<YourEntity> entities) { using (var transactionScope = new TransactionScope()) { DbContext context = new YourContext(); int count = 0; foreach (var entity in entities) { ++count; context = context.AddToContext<TenancyNote>(entity, count, 100, true, () => new YourContext()); } context.SaveChanges(); transactionScope.Complete(); } }

使用SqlBulkCopy ：

 void BulkInsert(GpsReceiverTrack[] gpsReceiverTracks) { if (gpsReceiverTracks == null) { throw new ArgumentNullException(nameof(gpsReceiverTracks)); } DataTable dataTable = new DataTable("GpsReceiverTracks"); dataTable.Columns.Add("ID", typeof(int)); dataTable.Columns.Add("DownloadedTrackID", typeof(int)); dataTable.Columns.Add("Time", typeof(TimeSpan)); dataTable.Columns.Add("Latitude", typeof(double)); dataTable.Columns.Add("Longitude", typeof(double)); dataTable.Columns.Add("Altitude", typeof(double)); for (int i = 0; i < gpsReceiverTracks.Length; i++) { dataTable.Rows.Add ( new object[] { gpsReceiverTracks[i].ID, gpsReceiverTracks[i].DownloadedTrackID, gpsReceiverTracks[i].Time, gpsReceiverTracks[i].Latitude, gpsReceiverTracks[i].Longitude, gpsReceiverTracks[i].Altitude } ); } string connectionString = (new TeamTrackerEntities()).Database.Connection.ConnectionString; using (var connection = new SqlConnection(connectionString)) { connection.Open(); using (var transaction = connection.BeginTransaction()) { using (var sqlBulkCopy = new SqlBulkCopy(connection, SqlBulkCopyOptions.TableLock, transaction)) { sqlBulkCopy.DestinationTableName = dataTable.TableName; foreach (DataColumn column in dataTable.Columns) { sqlBulkCopy.ColumnMappings.Add(column.ColumnName, column.ColumnName); } sqlBulkCopy.WriteToServer(dataTable); } transaction.Commit(); } } return; }

秘密是插入一个相同的空白临时表。插入物快速地照亮。然后从那里运行一个插入到主大表中。然后截断准备下一批的临时表。

即。

 insert into some_staging_table using Entity Framework. -- Single insert into main table (this could be a tiny stored proc call) insert into some_main_already_large_table (columns...) select (columns...) from some_staging_table truncate table some_staging_table

你有没有试过插入一个后台工作人员或任务？

在我的情况下，即时插入7760个寄存器，分布在182个不同的表与外键关系（由NavigationProperties）。

没有任务，花了2分半钟。在一个任务（ Task.Factory.StartNew(...) ）中，花了15秒。

我只将SaveChanges()添加到上下文后所有的实体。（确保数据完整性）

所有这里写的解决scheme都没有帮助，因为当你执行SaveChanges（）时，插入语句被逐个发送到数据库，这就是Entity的工作方式。

如果你的数据库和返回的时间是50毫秒，那么插入所需的时间是logging数×50毫秒。

你必须使用BulkInsert，这里是链接： https ：//efbulkinsert.codeplex.com/

我使用它将插入时间从5-6分钟减less到10-12秒。

您可以使用散装包库。 Bulk Insert 1.0.0版本用于Entity框架> = 6.0.0的项目中。

更多的描述可以在这里find – 批量操作源代码

[POSTGRESQL的新解决scheme]嘿，我知道这是一个相当古老的post，但我最近遇到类似的问题，但我们正在使用Postgresql。我想用有效的bulkinsert，原来是相当困难的。我还没有find任何适当的免费图书馆这样做在这个数据库。我只find了这个帮手： https ：//bytefish.de/blog/postgresql_bulk_insert/这也是在Nuget。我写了一个小的映射器，它自动映射属性的方式entity framework：

 public static PostgreSQLCopyHelper<T> CreateHelper<T>(string schemaName, string tableName) { var helper = new PostgreSQLCopyHelper<T>("dbo", "\"" + tableName + "\""); var properties = typeof(T).GetProperties(); foreach(var prop in properties) { var type = prop.PropertyType; if (Attribute.IsDefined(prop, typeof(KeyAttribute)) || Attribute.IsDefined(prop, typeof(ForeignKeyAttribute))) continue; switch (type) { case Type intType when intType == typeof(int) || intType == typeof(int?): { helper = helper.MapInteger("\"" + prop.Name + "\"", x => (int?)typeof(T).GetProperty(prop.Name).GetValue(x, null)); break; } case Type stringType when stringType == typeof(string): { helper = helper.MapText("\"" + prop.Name + "\"", x => (string)typeof(T).GetProperty(prop.Name).GetValue(x, null)); break; } case Type dateType when dateType == typeof(DateTime) || dateType == typeof(DateTime?): { helper = helper.MapTimeStamp("\"" + prop.Name + "\"", x => (DateTime?)typeof(T).GetProperty(prop.Name).GetValue(x, null)); break; } case Type decimalType when decimalType == typeof(decimal) || decimalType == typeof(decimal?): { helper = helper.MapMoney("\"" + prop.Name + "\"", x => (decimal?)typeof(T).GetProperty(prop.Name).GetValue(x, null)); break; } case Type doubleType when doubleType == typeof(double) || doubleType == typeof(double?): { helper = helper.MapDouble("\"" + prop.Name + "\"", x => (double?)typeof(T).GetProperty(prop.Name).GetValue(x, null)); break; } case Type floatType when floatType == typeof(float) || floatType == typeof(float?): { helper = helper.MapReal("\"" + prop.Name + "\"", x => (float?)typeof(T).GetProperty(prop.Name).GetValue(x, null)); break; } case Type guidType when guidType == typeof(Guid): { helper = helper.MapUUID("\"" + prop.Name + "\"", x => (Guid)typeof(T).GetProperty(prop.Name).GetValue(x, null)); break; } } } return helper; }

我以下面的方式使用它（我有一个名为Undertaking的实体）：

 var undertakingHelper = BulkMapper.CreateHelper<Model.Undertaking>("dbo", nameof(Model.Undertaking)); undertakingHelper.SaveAll(transaction.UnderlyingTransaction.Connection as Npgsql.NpgsqlConnection, undertakingsToAdd));

我用事务展示了一个例子，但是也可以使用从上下文检索的正常连接来完成。 undertakingsToAdd是可以枚举的正常实体logging，我想批量插入到DB中。

经过几个小时的研究和尝试，我已经得到了这个解决scheme，正如你所期望的那样，速度更快，最终易于使用和免费！我真的build议你使用这个解决scheme，不仅仅是出于上面提到的原因，还因为它是唯一一个和Postgresql本身没有问题的解决scheme，许多其他解决scheme在SqlServer上完美地工作。

首先 – 当在发布而不是在debugging中编译项目时，它的工作速度要快得多（大约是10倍）

其次 – 如果存在严重的性能问题，请将代码隔离，并使用Table-Valued-Parameters将其重写为ADO。它会工作得更快。

在entity framework中插入的最快方式

什么是__stdcall？

目标文件中未parsing的外部符号

索引（从零开始）必须大于或等于零

.net中的交易

如何sorting在C中的结构数组？

什么是一个std :: map扩展初始化列表看起来像？

在entity framework中更新排除属性

在C ++中通过引用传递参数时的默认值

使用C ++基类构造函数？

Visual Studio 2005 Express在哪里？