Drawing a waveform with AVAssetReader

I read a song from the iPod library using an asset URL (in the code it is named audioUrl). I can play it in a number of ways, I can cut it, I can do some processing with it, but... I really don't understand what to do with the CMSampleBufferRef to get the data for drawing a waveform! I need information about the peak values; how can I get it this way (or maybe some other way)?

NSError *error = nil;
AVAssetReader *reader = [[AVAssetReader alloc] initWithAsset:audioUrl error:&error];

AVAssetTrack *songTrack = [audioUrl.tracks objectAtIndex:0];
AVAssetReaderTrackOutput *output = [[AVAssetReaderTrackOutput alloc] initWithTrack:songTrack outputSettings:nil];
[reader addOutput:output];
[output release];

NSMutableData *fullSongData = [[NSMutableData alloc] init];
[reader startReading];

while (reader.status == AVAssetReaderStatusReading) {

    AVAssetReaderTrackOutput *trackOutput = (AVAssetReaderTrackOutput *)[reader.outputs objectAtIndex:0];
    CMSampleBufferRef sampleBufferRef = [trackOutput copyNextSampleBuffer];

    if (sampleBufferRef) { /* what am I going to do with this? */ }
}

Please help me!

I was looking for something similar and decided to "roll my own". I realize this is an old post, but in case anyone else is looking for this, here is my solution. It is relatively quick and dirty and normalizes the image to "full scale". The images it creates are "wide", i.e. you need to put them in a UIScrollView or otherwise manage the display.

This is based on some of the answers to this question.

Sample output

(image: sample waveform)

EDIT: I have added logarithmic versions of the averaging and rendering methods; see the end of this answer for the alternate versions and the comparison outputs. I personally prefer the original linear version, but decided to post the logarithmic one in case someone can improve on the algorithm used.

You will need these imports:

#import <MediaPlayer/MediaPlayer.h>
#import <AVFoundation/AVFoundation.h>

First, a generic rendering method that takes a pointer to averaged sample data and returns a UIImage. Note that these samples are not playable audio samples.

-(UIImage *) audioImageGraph:(SInt16 *) samples
                normalizeMax:(SInt16) normalizeMax
                 sampleCount:(NSInteger) sampleCount
                channelCount:(NSInteger) channelCount
                 imageHeight:(float) imageHeight {

    CGSize imageSize = CGSizeMake(sampleCount, imageHeight);
    UIGraphicsBeginImageContext(imageSize);
    CGContextRef context = UIGraphicsGetCurrentContext();

    CGContextSetFillColorWithColor(context, [UIColor blackColor].CGColor);
    CGContextSetAlpha(context,1.0);
    CGRect rect;
    rect.size = imageSize;
    rect.origin.x = 0;
    rect.origin.y = 0;

    CGColorRef leftcolor = [[UIColor whiteColor] CGColor];
    CGColorRef rightcolor = [[UIColor redColor] CGColor];

    CGContextFillRect(context, rect);
    CGContextSetLineWidth(context, 1.0);

    float halfGraphHeight = (imageHeight / 2) / (float) channelCount;
    float centerLeft = halfGraphHeight;
    float centerRight = (halfGraphHeight*3);
    float sampleAdjustmentFactor = (imageHeight/ (float) channelCount) / (float) normalizeMax;

    for (NSInteger intSample = 0; intSample < sampleCount; intSample++) {
        SInt16 left = *samples++;
        float pixels = (float) left;
        pixels *= sampleAdjustmentFactor;
        CGContextMoveToPoint(context, intSample, centerLeft-pixels);
        CGContextAddLineToPoint(context, intSample, centerLeft+pixels);
        CGContextSetStrokeColorWithColor(context, leftcolor);
        CGContextStrokePath(context);

        if (channelCount==2) {
            SInt16 right = *samples++;
            float pixels = (float) right;
            pixels *= sampleAdjustmentFactor;
            CGContextMoveToPoint(context, intSample, centerRight - pixels);
            CGContextAddLineToPoint(context, intSample, centerRight + pixels);
            CGContextSetStrokeColorWithColor(context, rightcolor);
            CGContextStrokePath(context);
        }
    }

    // Create new image
    UIImage *newImage = UIGraphicsGetImageFromCurrentImageContext();

    // Tidy up
    UIGraphicsEndImageContext();

    return newImage;
}
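To make the expected input concrete, here is a minimal, hypothetical smoke test (not part of the original answer): it fills a small buffer with synthetic "averaged" values and renders a mono graph. The buffer size and values are made up; in real use the data comes from the renderPNGAudioPictogramForAsset: method shown next.

// Hypothetical smoke test: 200 synthetic "averaged" samples, mono, 100 px tall.
NSInteger sampleCount = 200;
SInt16 *samples = malloc(sampleCount * sizeof(SInt16));
SInt16 maxSample = 0;
for (NSInteger i = 0; i < sampleCount; i++) {
    samples[i] = (SInt16)(1000 + arc4random_uniform(9000)); // fake averaged amplitude
    if (samples[i] > maxSample) maxSample = samples[i];
}
UIImage *graph = [self audioImageGraph:samples
                          normalizeMax:maxSample
                           sampleCount:sampleCount
                          channelCount:1
                           imageHeight:100];
free(samples);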

Next, a method that takes an AVURLAsset and returns PNG image data.

- (NSData *) renderPNGAudioPictogramForAsset:(AVURLAsset *)songAsset {

    NSError * error = nil;
    AVAssetReader * reader = [[AVAssetReader alloc] initWithAsset:songAsset error:&error];
    AVAssetTrack * songTrack = [songAsset.tracks objectAtIndex:0];

    NSDictionary* outputSettingsDict = [[NSDictionary alloc] initWithObjectsAndKeys:
        [NSNumber numberWithInt:kAudioFormatLinearPCM],AVFormatIDKey,
        //  [NSNumber numberWithInt:44100.0],AVSampleRateKey, /*Not Supported*/
        //  [NSNumber numberWithInt: 2],AVNumberOfChannelsKey, /*Not Supported*/
        [NSNumber numberWithInt:16],AVLinearPCMBitDepthKey,
        [NSNumber numberWithBool:NO],AVLinearPCMIsBigEndianKey,
        [NSNumber numberWithBool:NO],AVLinearPCMIsFloatKey,
        [NSNumber numberWithBool:NO],AVLinearPCMIsNonInterleaved,
        nil];

    AVAssetReaderTrackOutput* output = [[AVAssetReaderTrackOutput alloc] initWithTrack:songTrack outputSettings:outputSettingsDict];
    [reader addOutput:output];
    [output release];

    UInt32 sampleRate,channelCount;

    NSArray* formatDesc = songTrack.formatDescriptions;
    for(unsigned int i = 0; i < [formatDesc count]; ++i) {
        CMAudioFormatDescriptionRef item = (CMAudioFormatDescriptionRef)[formatDesc objectAtIndex:i];
        const AudioStreamBasicDescription* fmtDesc = CMAudioFormatDescriptionGetStreamBasicDescription(item);
        if(fmtDesc) {
            sampleRate = fmtDesc->mSampleRate;
            channelCount = fmtDesc->mChannelsPerFrame;
            // NSLog(@"channels:%u, bytes/packet: %u, sampleRate %f",fmtDesc->mChannelsPerFrame, fmtDesc->mBytesPerPacket,fmtDesc->mSampleRate);
        }
    }

    UInt32 bytesPerSample = 2 * channelCount;
    SInt16 normalizeMax = 0;

    NSMutableData * fullSongData = [[NSMutableData alloc] init];
    [reader startReading];

    UInt64 totalBytes = 0;
    SInt64 totalLeft = 0;
    SInt64 totalRight = 0;
    NSInteger sampleTally = 0;

    NSInteger samplesPerPixel = sampleRate / 50;

    while (reader.status == AVAssetReaderStatusReading){

        AVAssetReaderTrackOutput * trackOutput = (AVAssetReaderTrackOutput *)[reader.outputs objectAtIndex:0];
        CMSampleBufferRef sampleBufferRef = [trackOutput copyNextSampleBuffer];

        if (sampleBufferRef){
            CMBlockBufferRef blockBufferRef = CMSampleBufferGetDataBuffer(sampleBufferRef);

            size_t length = CMBlockBufferGetDataLength(blockBufferRef);
            totalBytes += length;

            NSAutoreleasePool *wader = [[NSAutoreleasePool alloc] init];

            NSMutableData * data = [NSMutableData dataWithLength:length];
            CMBlockBufferCopyDataBytes(blockBufferRef, 0, length, data.mutableBytes);

            SInt16 * samples = (SInt16 *) data.mutableBytes;
            int sampleCount = length / bytesPerSample;
            for (int i = 0; i < sampleCount; i++) {

                SInt16 left = *samples++;
                totalLeft += left;

                SInt16 right;
                if (channelCount==2) {
                    right = *samples++;
                    totalRight += right;
                }

                sampleTally++;

                if (sampleTally > samplesPerPixel) {

                    left = totalLeft / sampleTally;
                    SInt16 fix = abs(left);
                    if (fix > normalizeMax) {
                        normalizeMax = fix;
                    }
                    [fullSongData appendBytes:&left length:sizeof(left)];

                    if (channelCount==2) {
                        right = totalRight / sampleTally;
                        SInt16 fix = abs(right);
                        if (fix > normalizeMax) {
                            normalizeMax = fix;
                        }
                        [fullSongData appendBytes:&right length:sizeof(right)];
                    }

                    totalLeft = 0;
                    totalRight = 0;
                    sampleTally = 0;
                }
            }

            [wader drain];

            CMSampleBufferInvalidate(sampleBufferRef);
            CFRelease(sampleBufferRef);
        }
    }

    NSData * finalData = nil;

    if (reader.status == AVAssetReaderStatusFailed || reader.status == AVAssetReaderStatusUnknown){
        // Something went wrong.
        return nil;
    }

    if (reader.status == AVAssetReaderStatusCompleted){
        NSLog(@"rendering output graphics using normalizeMax %d",normalizeMax);
        // NOTE: sampleCount / channelCount here assume 2 channels of 16-bit averaged samples
        UIImage *test = [self audioImageGraph:(SInt16 *) fullSongData.bytes
                                 normalizeMax:normalizeMax
                                  sampleCount:fullSongData.length / 4
                                 channelCount:2
                                  imageHeight:100];
        finalData = imageToData(test);
    }

    [fullSongData release];
    [reader release];
    return finalData;
}
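As a point of reference, a minimal usage sketch (my own, not part of the original answer) might look like this, assuming the method lives on the same object and using a placeholder file path:

// Hypothetical caller: the path and variable names are placeholders.
NSURL *audioFileURL = [NSURL fileURLWithPath:@"/path/to/cached/audio.m4a"];
AVURLAsset *asset = [[AVURLAsset alloc] initWithURL:audioFileURL options:nil];
NSData *pictogramData = [self renderPNGAudioPictogramForAsset:asset];
UIImage *waveform = pictogramData ? [UIImage imageWithData:pictogramData] : nil;
[asset release]; // pre-ARC, matching the rest of this answer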

Bonus round: finally, if you want to be able to play back the audio using AVAudioPlayer, you will need to cache it to your app's cache folder. Since I was doing that anyway, I decided to also cache the image data and wrapped the whole thing into a UIImage category. You need to include this open-source offering (TSLibraryImport, used in the code below) to extract the audio, plus some code to handle the background-threading functionality.
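For orientation, the category interface implied by the code below might look roughly like this; the category name is my own guess, and only the method names are taken from the code that follows.

// Hypothetical header (the category name "WaveformPictogram" is an assumption).
@interface UIImage (WaveformPictogram)

+ (NSString *) assetCacheFolder;
+ (NSString *) cachedAudioPictogramPathForMPMediaItem:(MPMediaItem *)item;
+ (NSString *) cachedAudioFilepathForMPMediaItem:(MPMediaItem *)item;
+ (NSURL *) cachedAudioURLForMPMediaItem:(MPMediaItem *)item;

- (id) initWithMPMediaItem:(MPMediaItem *)item
           completionBlock:(void (^)(UIImage *delayedImagePreparation))completionBlock;

@end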

First, some generic class methods for handling path names are defined:

//#define imgExt @"jpg"
//#define imageToData(x) UIImageJPEGRepresentation(x,4)

#define imgExt @"png"
#define imageToData(x) UIImagePNGRepresentation(x)

+ (NSString *) assetCacheFolder {
    NSArray *assetFolderRoot = NSSearchPathForDirectoriesInDomains(NSCachesDirectory, NSUserDomainMask, YES);
    return [NSString stringWithFormat:@"%@/audio", [assetFolderRoot objectAtIndex:0]];
}

+ (NSString *) cachedAudioPictogramPathForMPMediaItem:(MPMediaItem*) item {
    NSString *assetFolder = [[self class] assetCacheFolder];
    NSNumber * libraryId = [item valueForProperty:MPMediaItemPropertyPersistentID];
    NSString *assetPictogramFilename = [NSString stringWithFormat:@"asset_%@.%@",libraryId,imgExt];
    return [NSString stringWithFormat:@"%@/%@", assetFolder, assetPictogramFilename];
}

+ (NSString *) cachedAudioFilepathForMPMediaItem:(MPMediaItem*) item {
    NSString *assetFolder = [[self class] assetCacheFolder];
    NSURL * assetURL = [item valueForProperty:MPMediaItemPropertyAssetURL];
    NSNumber * libraryId = [item valueForProperty:MPMediaItemPropertyPersistentID];
    NSString *assetFileExt = [[[assetURL path] lastPathComponent] pathExtension];
    NSString *assetFilename = [NSString stringWithFormat:@"asset_%@.%@",libraryId,assetFileExt];
    return [NSString stringWithFormat:@"%@/%@", assetFolder, assetFilename];
}

+ (NSURL *) cachedAudioURLForMPMediaItem:(MPMediaItem*) item {
    NSString *assetFilepath = [[self class] cachedAudioFilepathForMPMediaItem:item];
    return [NSURL fileURLWithPath:assetFilepath];
}

Now, the init method that does "the business":

- (id) initWithMPMediaItem:(MPMediaItem*) item
           completionBlock:(void (^)(UIImage* delayedImagePreparation))completionBlock {

    NSFileManager *fman = [NSFileManager defaultManager];
    NSString *assetPictogramFilepath = [[self class] cachedAudioPictogramPathForMPMediaItem:item];

    if ([fman fileExistsAtPath:assetPictogramFilepath]) {

        NSLog(@"Returning cached waveform pictogram: %@",[assetPictogramFilepath lastPathComponent]);
        self = [self initWithContentsOfFile:assetPictogramFilepath];
        return self;
    }

    NSString *assetFilepath = [[self class] cachedAudioFilepathForMPMediaItem:item];
    NSURL *assetFileURL = [NSURL fileURLWithPath:assetFilepath];

    if ([fman fileExistsAtPath:assetFilepath]) {

        NSLog(@"scanning cached audio data to create UIImage file: %@",[assetFilepath lastPathComponent]);
        [assetFileURL retain];
        [assetPictogramFilepath retain];

        [NSThread MCSM_performBlockInBackground: ^{

            AVURLAsset *asset = [[AVURLAsset alloc] initWithURL:assetFileURL options:nil];
            NSData *waveFormData = [self renderPNGAudioPictogramForAsset:asset];

            [waveFormData writeToFile:assetPictogramFilepath atomically:YES];

            [assetFileURL release];
            [assetPictogramFilepath release];

            if (completionBlock) {

                [waveFormData retain];
                [NSThread MCSM_performBlockOnMainThread:^{

                    UIImage *result = [UIImage imageWithData:waveFormData];
                    NSLog(@"returning rendered pictogram on main thread (%d bytes %@ data in UIImage %0.0fx %0.0f pixels)",waveFormData.length,[imgExt uppercaseString],result.size.width,result.size.height);

                    completionBlock(result);
                    [waveFormData release];
                }];
            }
        }];

        return nil;

    } else {

        NSString *assetFolder = [[self class] assetCacheFolder];
        [fman createDirectoryAtPath:assetFolder withIntermediateDirectories:YES attributes:nil error:nil];

        NSLog(@"Preparing to import audio asset data %@",[assetFilepath lastPathComponent]);

        [assetPictogramFilepath retain];
        [assetFileURL retain];

        TSLibraryImport* import = [[TSLibraryImport alloc] init];
        NSURL * assetURL = [item valueForProperty:MPMediaItemPropertyAssetURL];

        [import importAsset:assetURL toURL:assetFileURL completionBlock:^(TSLibraryImport* import) {
            //check the status and error properties of
            //TSLibraryImport

            if (import.error) {

                NSLog (@"audio data import failed:%@",import.error);

            } else {

                NSLog (@"Creating waveform pictogram file: %@", [assetPictogramFilepath lastPathComponent]);
                AVURLAsset *asset = [[AVURLAsset alloc] initWithURL:assetFileURL options:nil];
                NSData *waveFormData = [self renderPNGAudioPictogramForAsset:asset];

                [waveFormData writeToFile:assetPictogramFilepath atomically:YES];

                if (completionBlock) {

                    [waveFormData retain];
                    [NSThread MCSM_performBlockOnMainThread:^{

                        UIImage *result = [UIImage imageWithData:waveFormData];
                        NSLog(@"returning rendered pictogram on main thread (%d bytes %@ data in UIImage %0.0fx %0.0f pixels)",waveFormData.length,[imgExt uppercaseString],result.size.width,result.size.height);

                        completionBlock(result);
                        [waveFormData release];
                    }];
                }
            }

            [assetPictogramFilepath release];
            [assetFileURL release];
        }];

        return nil;
    }
}

An example of invoking this:

-(void) importMediaItem {

    MPMediaItem* item = [self mediaItem];

    // since we will be needing this for playback, save the url to the cached audio.
    [url release];
    url = [[UIImage cachedAudioURLForMPMediaItem:item] retain];

    [waveFormImage release];

    waveFormImage = [[UIImage alloc] initWithMPMediaItem:item completionBlock:^(UIImage* delayedImagePreparation){

        waveFormImage = [delayedImagePreparation retain];
        [self displayWaveFormImage];
    }];

    if (waveFormImage) {
        [waveFormImage retain];
        [self displayWaveFormImage];
    }
}

Logarithmic versions of the averaging and rendering methods:

#define absX(x) (x<0?0-x:x)
#define minMaxX(x,mn,mx) (x<=mn?mn:(x>=mx?mx:x))
#define noiseFloor (-90.0)
#define decibel(amplitude) (20.0 * log10(absX(amplitude)/32767.0))

-(UIImage *) audioImageLogGraph:(Float32 *) samples
                   normalizeMax:(Float32) normalizeMax
                    sampleCount:(NSInteger) sampleCount
                   channelCount:(NSInteger) channelCount
                    imageHeight:(float) imageHeight {

    CGSize imageSize = CGSizeMake(sampleCount, imageHeight);
    UIGraphicsBeginImageContext(imageSize);
    CGContextRef context = UIGraphicsGetCurrentContext();

    CGContextSetFillColorWithColor(context, [UIColor blackColor].CGColor);
    CGContextSetAlpha(context,1.0);
    CGRect rect;
    rect.size = imageSize;
    rect.origin.x = 0;
    rect.origin.y = 0;

    CGColorRef leftcolor = [[UIColor whiteColor] CGColor];
    CGColorRef rightcolor = [[UIColor redColor] CGColor];

    CGContextFillRect(context, rect);
    CGContextSetLineWidth(context, 1.0);

    float halfGraphHeight = (imageHeight / 2) / (float) channelCount;
    float centerLeft = halfGraphHeight;
    float centerRight = (halfGraphHeight*3);
    float sampleAdjustmentFactor = (imageHeight/ (float) channelCount) / (normalizeMax - noiseFloor) / 2;

    for (NSInteger intSample = 0; intSample < sampleCount; intSample++) {
        Float32 left = *samples++;
        float pixels = (left - noiseFloor) * sampleAdjustmentFactor;
        CGContextMoveToPoint(context, intSample, centerLeft-pixels);
        CGContextAddLineToPoint(context, intSample, centerLeft+pixels);
        CGContextSetStrokeColorWithColor(context, leftcolor);
        CGContextStrokePath(context);

        if (channelCount==2) {
            Float32 right = *samples++;
            float pixels = (right - noiseFloor) * sampleAdjustmentFactor;
            CGContextMoveToPoint(context, intSample, centerRight - pixels);
            CGContextAddLineToPoint(context, intSample, centerRight + pixels);
            CGContextSetStrokeColorWithColor(context, rightcolor);
            CGContextStrokePath(context);
        }
    }

    // Create new image
    UIImage *newImage = UIGraphicsGetImageFromCurrentImageContext();

    // Tidy up
    UIGraphicsEndImageContext();

    return newImage;
}

- (NSData *) renderPNGAudioPictogramLogForAsset:(AVURLAsset *)songAsset {

    NSError * error = nil;
    AVAssetReader * reader = [[AVAssetReader alloc] initWithAsset:songAsset error:&error];
    AVAssetTrack * songTrack = [songAsset.tracks objectAtIndex:0];

    NSDictionary* outputSettingsDict = [[NSDictionary alloc] initWithObjectsAndKeys:
        [NSNumber numberWithInt:kAudioFormatLinearPCM],AVFormatIDKey,
        //  [NSNumber numberWithInt:44100.0],AVSampleRateKey, /*Not Supported*/
        //  [NSNumber numberWithInt: 2],AVNumberOfChannelsKey, /*Not Supported*/
        [NSNumber numberWithInt:16],AVLinearPCMBitDepthKey,
        [NSNumber numberWithBool:NO],AVLinearPCMIsBigEndianKey,
        [NSNumber numberWithBool:NO],AVLinearPCMIsFloatKey,
        [NSNumber numberWithBool:NO],AVLinearPCMIsNonInterleaved,
        nil];

    AVAssetReaderTrackOutput* output = [[AVAssetReaderTrackOutput alloc] initWithTrack:songTrack outputSettings:outputSettingsDict];
    [reader addOutput:output];
    [output release];

    UInt32 sampleRate,channelCount;

    NSArray* formatDesc = songTrack.formatDescriptions;
    for(unsigned int i = 0; i < [formatDesc count]; ++i) {
        CMAudioFormatDescriptionRef item = (CMAudioFormatDescriptionRef)[formatDesc objectAtIndex:i];
        const AudioStreamBasicDescription* fmtDesc = CMAudioFormatDescriptionGetStreamBasicDescription(item);
        if(fmtDesc) {
            sampleRate = fmtDesc->mSampleRate;
            channelCount = fmtDesc->mChannelsPerFrame;
            // NSLog(@"channels:%u, bytes/packet: %u, sampleRate %f",fmtDesc->mChannelsPerFrame, fmtDesc->mBytesPerPacket,fmtDesc->mSampleRate);
        }
    }

    UInt32 bytesPerSample = 2 * channelCount;
    Float32 normalizeMax = noiseFloor;
    NSLog(@"normalizeMax = %f",normalizeMax);

    NSMutableData * fullSongData = [[NSMutableData alloc] init];
    [reader startReading];

    UInt64 totalBytes = 0;
    Float64 totalLeft = 0;
    Float64 totalRight = 0;
    Float32 sampleTally = 0;

    NSInteger samplesPerPixel = sampleRate / 50;

    while (reader.status == AVAssetReaderStatusReading){

        AVAssetReaderTrackOutput * trackOutput = (AVAssetReaderTrackOutput *)[reader.outputs objectAtIndex:0];
        CMSampleBufferRef sampleBufferRef = [trackOutput copyNextSampleBuffer];

        if (sampleBufferRef){
            CMBlockBufferRef blockBufferRef = CMSampleBufferGetDataBuffer(sampleBufferRef);

            size_t length = CMBlockBufferGetDataLength(blockBufferRef);
            totalBytes += length;

            NSAutoreleasePool *wader = [[NSAutoreleasePool alloc] init];

            NSMutableData * data = [NSMutableData dataWithLength:length];
            CMBlockBufferCopyDataBytes(blockBufferRef, 0, length, data.mutableBytes);

            SInt16 * samples = (SInt16 *) data.mutableBytes;
            int sampleCount = length / bytesPerSample;
            for (int i = 0; i < sampleCount; i++) {

                Float32 left = (Float32) *samples++;
                left = decibel(left);
                left = minMaxX(left,noiseFloor,0);
                totalLeft += left;

                Float32 right;
                if (channelCount==2) {
                    right = (Float32) *samples++;
                    right = decibel(right);
                    right = minMaxX(right,noiseFloor,0);
                    totalRight += right;
                }

                sampleTally++;

                if (sampleTally > samplesPerPixel) {

                    left = totalLeft / sampleTally;
                    if (left > normalizeMax) {
                        normalizeMax = left;
                    }
                    // NSLog(@"left average = %f, normalizeMax = %f",left,normalizeMax);
                    [fullSongData appendBytes:&left length:sizeof(left)];

                    if (channelCount==2) {
                        right = totalRight / sampleTally;
                        if (right > normalizeMax) {
                            normalizeMax = right;
                        }
                        [fullSongData appendBytes:&right length:sizeof(right)];
                    }

                    totalLeft = 0;
                    totalRight = 0;
                    sampleTally = 0;
                }
            }

            [wader drain];

            CMSampleBufferInvalidate(sampleBufferRef);
            CFRelease(sampleBufferRef);
        }
    }

    NSData * finalData = nil;

    if (reader.status == AVAssetReaderStatusFailed || reader.status == AVAssetReaderStatusUnknown){
        // Something went wrong. Handle it.
    }

    if (reader.status == AVAssetReaderStatusCompleted){
        // You're done. It worked.
        NSLog(@"rendering output graphics using normalizeMax %f",normalizeMax);
        UIImage *test = [self audioImageLogGraph:(Float32 *) fullSongData.bytes
                                    normalizeMax:normalizeMax
                                     sampleCount:fullSongData.length / (sizeof(Float32) * 2)
                                    channelCount:2
                                     imageHeight:100];
        finalData = imageToData(test);
    }

    [fullSongData release];
    [reader release];
    return finalData;
}

Comparison outputs

Linear
(image: linear plot of the start of "Warm It Up" by the Acme Swing Company)

Logarithmic
(image: logarithmic plot of the start of "Warm It Up" by the Acme Swing Company)

You should be able to get the audio buffer from your sampleBufferRef and then iterate over those values to build your waveform:

CMBlockBufferRef buffer = CMSampleBufferGetDataBuffer(sampleBufferRef);

CMItemCount numSamplesInBuffer = CMSampleBufferGetNumSamples(sampleBufferRef);
AudioBufferList audioBufferList;

CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(
    sampleBufferRef,
    NULL,
    &audioBufferList,
    sizeof(audioBufferList),
    NULL,
    NULL,
    kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment,
    &buffer
);

// this copies your audio out to a temp buffer, but you should be able to
// iterate through audioBufferList.mBuffers[0].mData directly instead.
// Note: the sample type must match the reader's output settings; with
// 16-bit linear PCM (as in the answer above) these would be SInt16 values.
SInt32 *readBuffer = (SInt32 *)malloc(numSamplesInBuffer * sizeof(SInt32));
memcpy(readBuffer, audioBufferList.mBuffers[0].mData, numSamplesInBuffer * sizeof(SInt32));
// when finished: free(readBuffer); CFRelease(buffer);
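Building on that, here is a hedged sketch (my own, not part of this answer) of the iteration itself, pulling out the per-bucket peak values the question asks about. It assumes the track output was configured for 16-bit interleaved linear PCM, as in the answer above, and the bucket size is arbitrary.

// Hedged sketch: scan 16-bit PCM samples for the peak amplitude per bucket.
// Assumes the AVAssetReaderTrackOutput was configured for 16-bit linear PCM.
SInt16 *pcm = (SInt16 *)audioBufferList.mBuffers[0].mData;
size_t sampleCount = audioBufferList.mBuffers[0].mDataByteSize / sizeof(SInt16);
size_t samplesPerBucket = 1024; // hypothetical bucket size, one peak per bucket
SInt16 peak = 0;
NSMutableData *peaks = [NSMutableData data];
for (size_t i = 0; i < sampleCount; i++) {
    SInt16 amplitude = abs(pcm[i]);
    if (amplitude > peak) peak = amplitude;
    if ((i + 1) % samplesPerBucket == 0) {
        [peaks appendBytes:&peak length:sizeof(peak)];
        peak = 0;
    }
}
// 'peaks' now holds one SInt16 peak value per bucket, ready to be drawn.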