使用VBA在excel中融化/重塑?

我正在调整一个新的工作,我与同事分享的大部分工作都是通过MS Excel进行的。 我经常使用数据透视表,因此需要“堆叠”的数据,就是我为此所依赖的R中reshape (reshape2)包中的melt()函数的输出。

任何人都可以让我开始在VBAmacros来完成这个,还是已经存在?

macros观纲要是:

  1. 在Excel工作簿中select一个单元格范围。
  2. 开始“融化”的macros。
  3. macros将创build一个提示,“inputID列的数量”,在那里你可以input识别信息的前几列。 (例如下面的R代码是4)。
  4. 在标题为“melt”的excel文件中创build一个新的工作表,用于堆叠数据,并创build一个名为“variable”的新列,与原始select的数据列标题相同。

换句话说,输出看起来和R中简单执行这两行的输出完全一样:

 require(reshape) melt(your.unstacked.dataframe, id.vars = 1:4) 

这是一个例子:

 # unstacked data > df1 Year Month Country Sport No_wins No_losses High_score Total_games 2 2010 5 USA Soccer 4 3 5 9 3 2010 6 USA Soccer 5 3 4 8 4 2010 5 CAN Soccer 2 9 7 11 5 2010 6 CAN Soccer 4 8 4 13 6 2009 5 USA Soccer 8 1 4 9 7 2009 6 USA Soccer 0 0 3 2 8 2009 5 CAN Soccer 2 0 6 3 9 2009 6 CAN Soccer 3 0 8 3 # stacking the data > require(reshape) > melt(df1, id.vars=1:4) Year Month Country Sport variable value 1 2010 5 USA Soccer No_wins 4 2 2010 6 USA Soccer No_wins 5 3 2010 5 CAN Soccer No_wins 2 4 2010 6 CAN Soccer No_wins 4 5 2009 5 USA Soccer No_wins 8 6 2009 6 USA Soccer No_wins 0 7 2009 5 CAN Soccer No_wins 2 8 2009 6 CAN Soccer No_wins 3 9 2010 5 USA Soccer No_losses 3 10 2010 6 USA Soccer No_losses 3 11 2010 5 CAN Soccer No_losses 9 12 2010 6 CAN Soccer No_losses 8 13 2009 5 USA Soccer No_losses 1 14 2009 6 USA Soccer No_losses 0 15 2009 5 CAN Soccer No_losses 0 16 2009 6 CAN Soccer No_losses 0 17 2010 5 USA Soccer High_score 5 18 2010 6 USA Soccer High_score 4 19 2010 5 CAN Soccer High_score 7 20 2010 6 CAN Soccer High_score 4 21 2009 5 USA Soccer High_score 4 22 2009 6 USA Soccer High_score 3 23 2009 5 CAN Soccer High_score 6 24 2009 6 CAN Soccer High_score 8 25 2010 5 USA Soccer Total_games 9 26 2010 6 USA Soccer Total_games 8 27 2010 5 CAN Soccer Total_games 11 28 2010 6 CAN Soccer Total_games 13 29 2009 5 USA Soccer Total_games 9 30 2009 6 USA Soccer Total_games 2 31 2009 5 CAN Soccer Total_games 3 32 2009 6 CAN Soccer Total_games 3 

我有两篇文章,包括可用的代码和可下载的工作簿,在我的博客上的Excel / VBA中做这个工作:

http://yoursumbuddy.com/data-normalizer

http://yoursumbuddy.com/data-normalizer-the-sql/

代码如下:

 'Arguments 'List: The range to be normalized. 'RepeatingColsCount: The number of columns, starting with the leftmost, ' whose headings remain the same. 'NormalizedColHeader: The column header for the rolled-up category. 'DataColHeader: The column header for the normalized data. 'NewWorkbook: Put the sheet with the data in a new workbook? ' 'NOTE: The data must be in a contiguous range and the 'rows that will be repeated must be to the left, 'with the rows to be normalized to the right. Sub NormalizeList(List As Excel.Range, RepeatingColsCount As Long, _ NormalizedColHeader As String, DataColHeader As String, _ Optional NewWorkbook As Boolean = False) Dim FirstNormalizingCol As Long, NormalizingColsCount As Long Dim ColsToRepeat As Excel.Range, ColsToNormalize As Excel.Range Dim NormalizedRowsCount As Long Dim RepeatingList() As String Dim NormalizedList() As Variant Dim ListIndex As Long, i As Long, j As Long Dim wbSource As Excel.Workbook, wbTarget As Excel.Workbook Dim wsTarget As Excel.Worksheet With List 'If the normalized list won't fit, you must quit. If .Rows.Count * (.Columns.Count - RepeatingColsCount) > .Parent.Rows.Count Then MsgBox "The normalized list will be too many rows.", _ vbExclamation + vbOKOnly, "Sorry" Exit Sub End If 'You have the range to be normalized and the count of leftmost rows to be repeated. 'This section uses those arguments to set the two ranges to parse 'and the two corresponding arrays to fill FirstNormalizingCol = RepeatingColsCount + 1 NormalizingColsCount = .Columns.Count - RepeatingColsCount Set ColsToRepeat = .Cells(1).Resize(.Rows.Count, RepeatingColsCount) Set ColsToNormalize = .Cells(1, FirstNormalizingCol).Resize(.Rows.Count, NormalizingColsCount) NormalizedRowsCount = ColsToNormalize.Columns.Count * .Rows.Count ReDim RepeatingList(1 To NormalizedRowsCount, 1 To RepeatingColsCount) ReDim NormalizedList(1 To NormalizedRowsCount, 1 To 2) End With 'Fill in every i elements of the repeating array with the repeating row labels. For i = 1 To NormalizedRowsCount Step NormalizingColsCount ListIndex = ListIndex + 1 For j = 1 To RepeatingColsCount RepeatingList(i, j) = List.Cells(ListIndex, j).Value2 Next j Next i 'We stepped over most rows above, so fill in other repeating array elements. For i = 1 To NormalizedRowsCount For j = 1 To RepeatingColsCount If RepeatingList(i, j) = "" Then RepeatingList(i, j) = RepeatingList(i - 1, j) End If Next j Next i 'Fill in each element of the first dimension of the normalizing array 'with the former column header (which is now another row label) and the data. With ColsToNormalize For i = 1 To .Rows.Count For j = 1 To .Columns.Count NormalizedList(((i - 1) * NormalizingColsCount) + j, 1) = .Cells(1, j) NormalizedList(((i - 1) * NormalizingColsCount) + j, 2) = .Cells(i, j) Next j Next i End With 'Put the normal data in the same workbook, or a new one. If NewWorkbook Then Set wbTarget = Workbooks.Add Set wsTarget = wbTarget.Worksheets(1) Else Set wbSource = List.Parent.Parent With wbSource.Worksheets Set wsTarget = .Add(after:=.Item(.Count)) End With End If With wsTarget 'Put the data from the two arrays in the new worksheet. .Range("A1").Resize(NormalizedRowsCount, RepeatingColsCount) = RepeatingList .Cells(1, FirstNormalizingCol).Resize(NormalizedRowsCount, 2) = NormalizedList 'At this point there will be repeated header rows, so delete all but one. .Range("1:" & NormalizingColsCount - 1).EntireRow.Delete 'Add the headers for the new label column and the data column. .Cells(1, FirstNormalizingCol).Value = NormalizedColHeader .Cells(1, FirstNormalizingCol + 1).Value = DataColHeader End With End Sub 

你会这样称呼它:

 Sub TestIt() NormalizeList ActiveSheet.UsedRange, 4, "Variable", "Value", False End Sub 

微软最近推出了Power Query,这是一个Excel加载项,它为Excel中的数据操作添加了许多有趣的function和function,包括你在找什么。

加载项中的实际function称为“Unpivot列”,这在本文中进行了解释。 这是它的要点:

  1. 下载并安装加载项
  2. 打开你的Excel / CSV文件
  3. select你想融化/重塑的表/范围
  4. 在“Power Query”选项卡中,单击“From Table”,这将打开“查询编辑器”
  5. select你想融化/重塑的列(ctrl或shift-select,不要拖动)
  6. 在“变换”选项卡中,单击“不透明列”(也可以在返回Excel之前应用其他变换)
  7. 在“主页”标签中点击“closures并加载”。 这将在Excel中创build一个新的表/查询对象,并获得所需的结果。

对于任何寻找可视化方式来规范化Excel数据的人来说,请看这个video教程:

http://www.youtube.com/watch?v=xmqTN0X-AgY

首先创build一个Userform,并用两个RefEdit字段将其命名为Unpivot_Form – rng_id和value_id以及一个submit / gobutton。 我也是一个R用户,rng_id是包含id的范围,而value_id包含值; 这两个范围包含标题。

做两个macros:

 Sub unpivot() Unpivot_Form.Show End Sub 

另一个macros在该字段的提交/去button中:

 Private Sub submit_Click() 'Code to unpivot (convert wide to long for excel) Dim rng_id, rng_id_header, val_id As Range Dim colvar, emptyrow, col As Integer Dim new_sheet As Worksheet 'Put val_id range into a range object Set val_id = Range(value_id.Value) 'Determine the parameter for the value id range 'This is used for the looping later on numrows = val_id.Rows.Count numcols = val_id.Columns.Count 'Resize changes the "block" to the size defined by the row and column 'Offset moves the "block" Set rng_id_header = Range(range_id.Value).Resize(1) Set rng_id = Range(range_id.Value).Offset(1, 0).Resize(numrows - 1) Set new_sheet = Worksheets.Add 'Set up the first column and first batch of id vars new_sheet.Activate Range("A65535").End(xlUp).Activate rng_id_header.Copy ActiveCell colvar = Range("XFD1").End(xlToLeft).Column + 1 Range("XFD1").End(xlToLeft).Offset(, 1).Value = "Variable" Range("XFD1").End(xlToLeft).Offset(, 1).Value = "Value" 'Start populating the value ids For col = 1 To numcols 'populate var_id 'determine last row emptyrow = Range("A65535").End(xlUp).Row + 1 'no need to activate to source to copy rng_id.Copy new_sheet.Cells(emptyrow, 1) 'copy the variable val_id.Offset(, col - 1).Resize(1, 1).Copy new_sheet.Range(Cells(emptyrow, colvar), Cells(emptyrow + numrows - 2, colvar)) 'copy the value val_id.Offset(1, col - 1).Resize(numrows - 1, 1).Copy new_sheet.Range(Cells(emptyrow, colvar + 1), Cells(emptyrow + numrows - 2, colvar + 1)) Next Unload Me End Sub 

请享用!

或使用:

 Sub M_snb_000() With sheet1.Cells(1).CurrentRegion sn = .Resize(, .Columns.Count + 1) End With For j = 4 To UBound(sn, 2) - 1 With Sheet2.Cells(2 + (UBound(sn) - 1) * (j - 4), 1) .Resize(UBound(sn) - 1, 5) = Application.Index(sn, Evaluate("row(2:" & UBound(sn) & ")"), Array(1, 2, 3,UBound(sn, 2), j)) .Resize(UBound(sn) - 1, 1).Offset(, 3) = sn(1, j) End With Next End Sub