How to process raw UDP packets so that they can be decoded by a decoder filter in a DirectShow source filter

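Long story:

  1. There is an H264 / MPEG-4 source.
  2. I am able to connect to this source with the RTSP protocol.
  3. I am able to get raw UDP packets with the RTP protocol.
  4. I then send those raw UDP packets to a decoder [h264 / mpeg-4] [DS Source Filter].
  5. But those "raw" UDP packets cannot be decoded by the decoder [h264 / mpeg-4] filter.

In short:

How do I process this raw UDP data so that it can be decoded by an H264 / MPEG-4 decoder filter? Can anyone clearly identify the steps I have to take with an H264 / MPEG stream?

Extra info:

I am able to do this with FFmpeg... and I really cannot figure out how FFmpeg processes the raw data so that it can be decoded by a decoder.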

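Piece of cake!

1. Get the data

As I can see, you already know how to do that (start an RTSP session, SETUP an RTP/AVP/UDP;unicast; transport, and get the user datagrams)... but if you are in doubt, ask.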

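No matter the transport (UDP or TCP), the data format is the same:

  • RTP data: [RTP Header - 12 bytes][Video data]
  • UDP: [RTP data]
  • TCP: [$ - 1 byte][Transport Channel - 1 byte][RTP data length - 2 bytes][RTP data]

So to get the video data from UDP, you only need to strip off the first 12 bytes, which represent the RTP header (the fixed part is 12 bytes; it is longer when the packet carries CSRC entries or a header extension). But beware, you still need the header: it gives you the video timing information, and for MPEG4 the packetization information!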
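The header handling above can be sketched in C# roughly as follows. This is a minimal sketch, not code from any library: `RtpPacketInfo` and `RtpParser` are hypothetical names, and the field layout is the one defined by RFC 3550. It keeps the marker bit and timestamp (needed later) and also skips CSRC entries, a header extension, and padding if they are present.

```csharp
using System;

// Hypothetical container for the RTP fields needed by the later steps.
public sealed class RtpPacketInfo
{
    public bool Marker;            // marker bit, used for MPEG4 reassembly
    public int PayloadType;        // 7-bit payload type from the SDP
    public ushort SequenceNumber;  // lets you detect loss and reordering
    public uint Timestamp;         // video timing in RTP clock units
    public byte[] Payload;         // [Video data]
}

public static class RtpParser
{
    // Splits one UDP datagram into RTP header fields and the video payload.
    public static RtpPacketInfo Parse(byte[] datagram)
    {
        if (datagram == null || datagram.Length < 12)
            throw new ArgumentException("Too short for an RTP header.");
        if ((datagram[0] >> 6) != 2)
            throw new ArgumentException("Not RTP version 2.");

        bool hasPadding = (datagram[0] & 0x20) != 0;
        bool hasExtension = (datagram[0] & 0x10) != 0;
        int csrcCount = datagram[0] & 0x0F;

        // Fixed header is 12 bytes, each CSRC entry adds 4 bytes.
        int offset = 12 + 4 * csrcCount;
        if (hasExtension)
        {
            if (datagram.Length < offset + 4)
                throw new ArgumentException("Truncated header extension.");
            // Extension length is counted in 32-bit words after its 4-byte header.
            int words = (datagram[offset + 2] << 8) | datagram[offset + 3];
            offset += 4 + 4 * words;
        }

        // With padding, the last byte holds the number of padding bytes.
        int end = datagram.Length;
        if (hasPadding) end -= datagram[datagram.Length - 1];
        if (offset > end)
            throw new ArgumentException("Malformed RTP packet.");

        var payload = new byte[end - offset];
        Buffer.BlockCopy(datagram, offset, payload, 0, payload.Length);

        return new RtpPacketInfo
        {
            Marker = (datagram[1] & 0x80) != 0,
            PayloadType = datagram[1] & 0x7F,
            SequenceNumber = (ushort)((datagram[2] << 8) | datagram[3]),
            Timestamp = ((uint)datagram[4] << 24) | ((uint)datagram[5] << 16)
                      | ((uint)datagram[6] << 8) | datagram[7],
            Payload = payload
        };
    }
}
```

Each datagram received from the UDP socket would be passed to `RtpParser.Parse`, and `Payload` is what the depacketization step below works on.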

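For TCP, you need to read bytes until you get the byte $. Then read one more byte, which is the transport channel that the following data belongs to (when the server responds to the SETUP request it says: Transport: RTP/AVP/TCP;unicast;interleaved=0-1, which means that VIDEO DATA will have TRANSPORT_CHANNEL = 0 and VIDEO RTCP DATA will have TRANSPORT_CHANNEL = 1). You want the video data, so we expect 0... Then read one short (2 bytes, big-endian) that represents the length of the RTP data that follows, read that many bytes, and from there do the same as for UDP.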
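A rough C# sketch of that read loop is shown below. `InterleavedReader` is a hypothetical helper, not part of any library, and it assumes the stream is positioned somewhere in the interleaved data. Scanning for `$` is a simplification: a real client also has to parse the RTSP text responses that share the same connection.

```csharp
using System;
using System.IO;

public static class InterleavedReader
{
    // Reads the next interleaved frame: [$][channel][2-byte length][RTP data].
    // Returns the RTP data, or null when the stream ends first.
    public static byte[] ReadFrame(Stream stream, out int channel)
    {
        channel = -1;

        // Skip bytes until the '$' marker is found.
        int b;
        while ((b = stream.ReadByte()) != '$')
        {
            if (b < 0) return null; // end of stream
        }

        // Next byte is the transport channel (0 = video RTP in the example above).
        channel = stream.ReadByte();

        // Length of the RTP data, big-endian.
        int hi = stream.ReadByte();
        int lo = stream.ReadByte();
        if (channel < 0 || hi < 0 || lo < 0) return null;
        int length = (hi << 8) | lo;

        // Stream.Read may return fewer bytes than requested, so loop.
        var data = new byte[length];
        int read = 0;
        while (read < length)
        {
            int n = stream.Read(data, read, length - read);
            if (n <= 0) return null;
            read += n;
        }
        return data;
    }
}
```

Frames whose `channel` is the RTP channel from the SETUP response are then handled exactly like UDP datagrams; frames on the RTCP channel can be ignored for decoding purposes.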

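2. Depacketize the data

H264 and MPEG4 data are usually packetized (in the SDP there is a packetization-mode parameter that can have the values 0, 1 and 2; what each of them means, and how to depacketize it, you can see here) because there is a limit on what one endpoint can send through TCP or UDP in a single packet, called the MTU. It is usually 1500 bytes or less. So if a video frame is larger than that (and it usually is), it needs to be fragmented (packetized) into MTU-sized fragments. This can be done by the encoder/streamer on both TCP and UDP transport, or you can rely on IP to fragment and reassemble the video frame on the other side... The first option is much better if you want smooth, error-resilient video over UDP and TCP.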

H264: To check whether the RTP data (arrived over UDP, or interleaved over TCP) holds a fragment of one larger H264 video frame, you must know how a fragment looks when it is packetized:

H264 FRAGMENT

    First byte:  [ 3 NAL UNIT BITS | 5 FRAGMENT TYPE BITS ]
    Second byte: [ START BIT | END BIT | RESERVED BIT | 5 NAL UNIT BITS ]
    Other bytes: [ ... VIDEO FRAGMENT DATA ... ]

Now, put the VIDEO DATA of the first packet into a byte array called Data and get the following info:

    int fragment_type = Data[0] & 0x1F;
    int nal_type = Data[1] & 0x1F;
    int start_bit = Data[1] & 0x80;
    int end_bit = Data[1] & 0x40;

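If fragment_type == 28, then the video data following it represents a video frame fragment. The next check is whether start_bit is set; if it is, that fragment is the first one in the sequence. You use it to reconstruct the NAL byte of the fragmented unit (an IDR frame, for example) by taking the first 3 bits from the first payload byte (3 NAL UNIT BITS) and combining them with the last 5 bits from the second payload byte (5 NAL UNIT BITS), so you get a byte like this: [3 NAL UNIT BITS | 5 NAL UNIT BITS]. Then write that NAL byte first into a clear buffer, followed by the VIDEO FRAGMENT DATA from that fragment.

If start_bit and end_bit are both 0, just write the VIDEO FRAGMENT DATA (skipping the first two payload bytes that identify the fragment) to the buffer.

If start_bit is 0 and end_bit is 1, that means it is the last fragment: write its VIDEO FRAGMENT DATA (skipping the first two bytes that identify the fragment) to the buffer, and now you have your video frame reconstructed!

Bear in mind that the RTP data holds the RTP header in its first 12 bytes, that for a fragmented frame you never write the first two payload bytes into the defragmentation buffer, and that you need to reconstruct the NAL byte and write it first. If you mess something up here, the picture will be partial (half of it will be gray or black, or you will see artifacts).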
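Put together, the fragment handling described above might look like the following C# sketch. `H264Depacketizer` is a hypothetical class, not taken from any library. It only handles single NAL unit packets (types 1 to 23) and type 28 fragments; aggregation packets and lost or reordered packets are deliberately left out.

```csharp
using System;
using System.IO;

public sealed class H264Depacketizer
{
    private readonly MemoryStream _buffer = new MemoryStream();
    private bool _inFragment;

    // Feeds one RTP payload (RTP header already removed).
    // Returns a complete NAL unit (NAL byte + data) or null if more fragments are needed.
    public byte[] Push(byte[] data)
    {
        if (data == null || data.Length < 2) return null;

        int fragmentType = data[0] & 0x1F;

        // Types 1..23: the payload already is one whole NAL unit.
        if (fragmentType >= 1 && fragmentType <= 23)
        {
            _inFragment = false;
            return (byte[])data.Clone();
        }

        // Only type 28 fragments are handled in this sketch.
        if (fragmentType != 28) return null;

        bool startBit = (data[1] & 0x80) != 0;
        bool endBit = (data[1] & 0x40) != 0;

        if (startBit)
        {
            // Clear the buffer and write the reconstructed NAL byte first:
            // top 3 bits of byte 0 combined with the low 5 bits of byte 1.
            _buffer.SetLength(0);
            _buffer.WriteByte((byte)((data[0] & 0xE0) | (data[1] & 0x1F)));
            _inFragment = true;
        }
        else if (!_inFragment)
        {
            return null; // the start fragment was never seen, drop this one
        }

        // Never copy the two bytes that identify the fragment.
        _buffer.Write(data, 2, data.Length - 2);

        if (!endBit) return null;

        _inFragment = false;
        return _buffer.ToArray();
    }
}
```

Every payload is pushed in arrival order; whenever `Push` returns a non-null array, that array is one reassembled NAL unit ready for step 3.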

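MPEG4: This is an easy one. You need to check the MARKER_BIT in the RTP header. That bit is set (1) if the video data represents a whole video frame, and it is 0 if the video data is one video frame fragment. So to depacketize, you need to look at the MARKER_BIT. If it is 1, that's it: just read the video data bytes.

WHOLE FRAME:

  [MARKER = 1] 

PACKETIZED FRAME:

  [MARKER = 0], [MARKER = 0], [MARKER = 0], [MARKER = 1] 

The first packet that has MARKER_BIT=0 is the first video frame fragment; all the packets that follow, up to and including the first one with MARKER_BIT=1, are fragments of the same video frame. So what you need to do is:

  • While MARKER_BIT=0, place the VIDEO DATA in the depacketization buffer
  • Place the next VIDEO DATA where MARKER_BIT=1 into the same buffer
  • The depacketization buffer now holds one whole MPEG4 frame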
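The three bullets above translate almost directly into code. The following C# sketch uses a hypothetical `MarkerFrameAssembler` class and assumes packets arrive in order and without loss; a real receiver would also check RTP sequence numbers.

```csharp
using System;
using System.IO;

public sealed class MarkerFrameAssembler
{
    private readonly MemoryStream _buffer = new MemoryStream();

    // Appends the video data of one packet to the depacketization buffer.
    // Returns the whole frame when the marker bit is set, otherwise null.
    public byte[] Push(byte[] videoData, bool markerBit)
    {
        _buffer.Write(videoData, 0, videoData.Length);

        // MARKER_BIT = 0: more fragments of the same frame will follow.
        if (!markerBit) return null;

        // MARKER_BIT = 1: the buffer now holds one whole frame.
        byte[] frame = _buffer.ToArray();
        _buffer.SetLength(0); // start clean for the next frame
        return frame;
    }
}
```

A packet that carries a whole frame (marker set on the first packet) comes straight back from `Push`, so whole and packetized frames go through the same path.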

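3. Process the data for the decoder (NAL byte stream)

When you have depacketized video frames, you need to make a NAL byte stream. It has the following format:

  • H264: 0x000001[SPS], 0x000001[PPS], 0x000001[VIDEO FRAME], 0x000001...
  • MPEG4: 0x000001[Visual Object Sequence Start], 0x000001[VIDEO FRAME]

RULES:

  • Every frame MUST be prepended with the 3-byte code 0x000001, no matter the codec
  • Every stream MUST start with CONFIGURATION INFO: for H264 that is the SPS and PPS frames, in that order (sprop-parameter-sets in the SDP), and for MPEG4 the VOS frame (config parameter in the SDP)

So you need to build a configuration buffer for H264 and MPEG4, prepended with the 3 bytes 0x000001, send it first, and then prepend each depacketized video frame with the same 3 bytes and send that to the decoder.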
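For H264, building that stream could be sketched in C# as below. `ByteStreamWriter` is a hypothetical helper. The sketch relies on sprop-parameter-sets being a comma-separated list of Base64-encoded NAL units, which is how RFC 6184 defines it; the MPEG4 config parameter is not handled here.

```csharp
using System;
using System.IO;

public static class ByteStreamWriter
{
    // The 3-byte code that precedes every unit in the byte stream.
    private static readonly byte[] StartCode = { 0x00, 0x00, 0x01 };

    // Writes 0x000001 followed by one depacketized unit (SPS, PPS or video frame).
    public static void WriteUnit(Stream output, byte[] unit)
    {
        output.Write(StartCode, 0, StartCode.Length);
        output.Write(unit, 0, unit.Length);
    }

    // Writes the H264 configuration taken from the SDP attribute
    // sprop-parameter-sets, for example "<base64 SPS>,<base64 PPS>".
    public static void WriteH264Config(Stream output, string spropParameterSets)
    {
        foreach (string part in spropParameterSets.Split(','))
        {
            if (part.Length == 0) continue;
            WriteUnit(output, Convert.FromBase64String(part));
        }
    }
}
```

The intended order is one call to `WriteH264Config` at the start of the stream, then one `WriteUnit` call per depacketized frame; the `Stream` can be a `MemoryStream` whose contents are handed to the decoder.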

If you need any clarification, just comment... 🙂

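With the UDP packets you receive bits of an H.264 stream, which you are expected to depacketize into H.264 NAL units, which in turn you typically push into the DirectShow pipeline from your filter.

The NAL units will be formatted as DirectShow media samples, and possibly also as part of the media type (the SPS / PPS NAL units).

The depacketization steps are described in RFC 6184, RTP Payload Format for H.264 Video. That format is the payload part of RTP traffic, which is defined by RFC 3550, RTP: A Transport Protocol for Real-Time Applications.

Clear, but not quite short.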

I have an implementation of this at https://net7mma.codeplex.com/

Here is the relevant code:

    using System;
    using System.IO;
    using System.Linq;

    /// <summary>
    /// Implements Packetization and Depacketization of packets defined in <see href="https://tools.ietf.org/html/rfc6184">RFC6184</see>.
    /// </summary>
    public class RFC6184Frame : Rtp.RtpFrame
    {
        /// <summary>
        /// Start code prefix written before each Nal Unit
        /// </summary>
        static byte[] NalStart = { 0x00, 0x00, 0x01 };

        public RFC6184Frame(byte payloadType) : base(payloadType) { }

        public RFC6184Frame(Rtp.RtpFrame existing) : base(existing) { }

        public RFC6184Frame(RFC6184Frame f) : this((Rtp.RtpFrame)f) { Buffer = f.Buffer; }

        public System.IO.MemoryStream Buffer { get; set; }

        /// <summary>
        /// Creates any <see cref="Rtp.RtpPacket"/>'s required for the given nal
        /// </summary>
        /// <param name="nal">The nal</param>
        /// <param name="mtu">The mtu</param>
        public virtual void Packetize(byte[] nal, int mtu = 1500)
        {
            if (nal == null || nal.Length == 0) return;

            int nalLength = nal.Length;

            if (nalLength >= mtu)
            {
                //FU indicator: F and NRI bits of the nal header with type 28 (FU - A)
                byte fuIndicator = (byte)((nal[0] & 0xE0) | 28);

                //The nal type travels in the last 5 bits of the FU header
                byte nalType = (byte)(nal[0] & Common.Binary.FiveBitMaxValue);

                //Leave room for the FU indicator and FU header (the RTP header is not accounted for)
                int fragmentSize = Math.Max(1, mtu - 2);

                //The nal header itself is not sent, the receiver reconstructs it
                int offset = 1;

                while (offset < nalLength)
                {
                    int size = Math.Min(fragmentSize, nalLength - offset);

                    //Start bit on the first fragment, end bit if no more data remains
                    bool start = offset == 1, end = offset + size >= nalLength;

                    byte fuHeader = (byte)((start ? 0x80 : 0) | (end ? 0x40 : 0) | nalType);

                    //Add the packet, the marker is set on the last fragment
                    Add(new Rtp.RtpPacket(2, false, false, end, PayloadTypeByte, 0, SynchronizationSourceIdentifier, HighestSequenceNumber + 1, 0, new byte[] { fuIndicator, fuHeader }.Concat(nal.Skip(offset).Take(size)).ToArray()));

                    //Move the offset
                    offset += size;
                }
            } //Should check for first byte to be 1 - 23?
            else Add(new Rtp.RtpPacket(2, false, false, true, PayloadTypeByte, 0, SynchronizationSourceIdentifier, HighestSequenceNumber + 1, 0, nal));
        }

        /// <summary>
        /// Creates <see cref="Buffer"/> with a H.264 RBSP from the contained packets
        /// </summary>
        public virtual void Depacketize() { bool sps, pps, sei, slice, idr; Depacketize(out sps, out pps, out sei, out slice, out idr); }

        /// <summary>
        /// Parses all contained packets and writes any contained Nal Units in the RBSP to <see cref="Buffer"/>.
        /// </summary>
        /// <param name="containsSps">Indicates if a Sequence Parameter Set was found</param>
        /// <param name="containsPps">Indicates if a Picture Parameter Set was found</param>
        /// <param name="containsSei">Indicates if Supplemental Enhancement Information was found</param>
        /// <param name="containsSlice">Indicates if a Slice was found</param>
        /// <param name="isIdr">Indicates if a IDR Slice was found</param>
        public virtual void Depacketize(out bool containsSps, out bool containsPps, out bool containsSei, out bool containsSlice, out bool isIdr)
        {
            containsSps = containsPps = containsSei = containsSlice = isIdr = false;

            DisposeBuffer();

            this.Buffer = new MemoryStream();

            //Get all packets in the frame
            foreach (Rtp.RtpPacket packet in m_Packets.Values.Distinct())
            {
                //ProcessPacket resets its out parameters, so accumulate the flags per packet
                bool sps, pps, sei, slice, idr;

                ProcessPacket(packet, out sps, out pps, out sei, out slice, out idr);

                containsSps |= sps;
                containsPps |= pps;
                containsSei |= sei;
                containsSlice |= slice;
                isIdr |= idr;
            }

            //Order by DON?
            this.Buffer.Position = 0;
        }

        /// <summary>
        /// Depacketizes a single packet.
        /// </summary>
        /// <param name="packet"></param>
        /// <param name="containsSps"></param>
        /// <param name="containsPps"></param>
        /// <param name="containsSei"></param>
        /// <param name="containsSlice"></param>
        /// <param name="isIdr"></param>
        internal protected virtual void ProcessPacket(Rtp.RtpPacket packet, out bool containsSps, out bool containsPps, out bool containsSei, out bool containsSlice, out bool isIdr)
        {
            containsSps = containsPps = containsSei = containsSlice = isIdr = false;

            //Starting at offset 0
            int offset = 0;

            //Obtain the data of the packet (without source list or padding)
            byte[] packetData = packet.Coefficients.ToArray();

            //Cache the length
            int count = packetData.Length;

            //Must have more than 2 bytes
            if (count <= 2) return;

            //Determine if the forbidden bit is set and the type of nal from the first byte
            byte firstByte = packetData[offset];

            //bool forbiddenZeroBit = ((firstByte & 0x80) >> 7) != 0;

            byte nalUnitType = (byte)(firstByte & Common.Binary.FiveBitMaxValue);

            //o  The F bit MUST be cleared if all F bits of the aggregated NAL units are zero; otherwise, it MUST be set.
            //if (forbiddenZeroBit && nalUnitType <= 23 && nalUnitType > 29) throw new InvalidOperationException("Forbidden Zero Bit is Set.");

            //Determine what to do
            switch (nalUnitType)
            {
                //Reserved - Ignore
                case 0:
                case 30:
                case 31:
                    {
                        return;
                    }
                case 24: //STAP - A
                case 25: //STAP - B
                case 26: //MTAP - 16
                case 27: //MTAP - 24
                    {
                        //Move to Nal Data
                        ++offset;

                        //Todo Determine if need to Order by DON first.
                        //EAT DON for ALL BUT STAP - A
                        if (nalUnitType != 24) offset += 2;

                        //Consume the rest of the data from the packet
                        while (offset < count)
                        {
                            //Determine the nal unit size (it counts the nal header but not these two size bytes)
                            int tmp_nal_size = Common.Binary.Read16(packetData, offset, BitConverter.IsLittleEndian);

                            offset += 2;

                            //If the nal had data then write it
                            if (tmp_nal_size > 0)
                            {
                                //For DOND and TSOFFSET
                                switch (nalUnitType)
                                {
                                    case 26: //MTAP - 16
                                        {
                                            //SKIP DOND and TSOFFSET
                                            offset += 3;
                                            goto default;
                                        }
                                    case 27: //MTAP - 24
                                        {
                                            //SKIP DOND and TSOFFSET
                                            offset += 4;
                                            goto default;
                                        }
                                    default:
                                        {
                                            //Read the nal type but don't move the offset
                                            byte nalHeader = (byte)(packetData[offset] & Common.Binary.FiveBitMaxValue);

                                            // 6 is SEI, 7 is SPS and 8 is PPS
                                            if (nalHeader > 5)
                                            {
                                                if (nalHeader == 6)
                                                {
                                                    Buffer.WriteByte(0);
                                                    containsSei = true;
                                                }
                                                else if (nalHeader == 7)
                                                {
                                                    Buffer.WriteByte(0);
                                                    containsSps = true;
                                                }
                                                else if (nalHeader == 8)
                                                {
                                                    Buffer.WriteByte(0);
                                                    containsPps = true;
                                                }
                                            }

                                            if (nalHeader == 1) containsSlice = true;

                                            if (nalHeader == 5) isIdr = true;

                                            //Done reading
                                            break;
                                        }
                                }

                                //Write the start code
                                Buffer.Write(NalStart, 0, 3);

                                //Write the nal header and data
                                Buffer.Write(packetData, offset, tmp_nal_size);

                                //Move the offset past the nal
                                offset += tmp_nal_size;
                            }
                        }

                        return;
                    }
                case 28: //FU - A
                case 29: //FU - B
                    {
                        /*
                         Informative note: When an FU-A occurs in interleaved mode, it always follows an FU-B, which sets its DON.
                         Informative note: If a transmitter wants to encapsulate a single NAL unit per packet and transmit packets out of their decoding order, STAP-B packet type can be used.
                        */

                        //Need more than 2 bytes
                        if (count > 2)
                        {
                            //Read the Header
                            byte FUHeader = packetData[++offset];

                            bool Start = ((FUHeader & 0x80) >> 7) > 0;

                            //bool End = ((FUHeader & 0x40) >> 6) > 0;
                            //bool Receiver = (FUHeader & 0x20) != 0;

                            //if (Receiver) throw new InvalidOperationException("Receiver Bit Set");

                            //Move to data
                            ++offset;

                            //Todo Determine if need to Order by DON first.
                            //DON Present in FU - B
                            if (nalUnitType == 29) offset += 2;

                            //Determine the fragment size
                            int fragment_size = count - offset;

                            //If the size was valid
                            if (fragment_size > 0)
                            {
                                //If the start bit was set
                                if (Start)
                                {
                                    //The nal type is in the last 5 bits of the FU Header
                                    byte nalType = (byte)(FUHeader & Common.Binary.FiveBitMaxValue);

                                    //Reconstruct the nal header
                                    //Use the first 3 bits of the first byte and last 5 bits of the FU Header
                                    byte nalHeader = (byte)((firstByte & 0xE0) | nalType);

                                    //Could have been SEI / SPS / PPS
                                    if (nalType > 5)
                                    {
                                        if (nalType == 6)
                                        {
                                            Buffer.WriteByte(0);
                                            containsSei = true;
                                        }
                                        else if (nalType == 7)
                                        {
                                            Buffer.WriteByte(0);
                                            containsSps = true;
                                        }
                                        else if (nalType == 8)
                                        {
                                            Buffer.WriteByte(0);
                                            containsPps = true;
                                        }
                                    }

                                    if (nalType == 1) containsSlice = true;

                                    if (nalType == 5) isIdr = true;

                                    //Write the start code
                                    Buffer.Write(NalStart, 0, 3);

                                    //Write the re-constructed header
                                    Buffer.WriteByte(nalHeader);
                                }

                                //Write the data of the fragment.
                                Buffer.Write(packetData, offset, fragment_size);
                            }
                        }

                        return;
                    }
                default:
                    {
                        // 6 is SEI, 7 is SPS and 8 is PPS
                        if (nalUnitType > 5)
                        {
                            if (nalUnitType == 6)
                            {
                                Buffer.WriteByte(0);
                                containsSei = true;
                            }
                            else if (nalUnitType == 7)
                            {
                                Buffer.WriteByte(0);
                                containsSps = true;
                            }
                            else if (nalUnitType == 8)
                            {
                                Buffer.WriteByte(0);
                                containsPps = true;
                            }
                        }

                        if (nalUnitType == 1) containsSlice = true;

                        if (nalUnitType == 5) isIdr = true;

                        //Write the start code
                        Buffer.Write(NalStart, 0, 3);

                        //Write the nal header and data
                        Buffer.Write(packetData, offset, count - offset);

                        return;
                    }
            }
        }

        internal void DisposeBuffer()
        {
            if (Buffer != null)
            {
                Buffer.Dispose();
                Buffer = null;
            }
        }

        public override void Dispose()
        {
            if (Disposed) return;
            base.Dispose();
            DisposeBuffer();
        }

        //To go to an Image...
        //Look for a SliceHeader in the Buffer
        //Decode Macroblocks in Slice
        //Convert Yuv to Rgb
    }

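There are also implementations of various other RFCs, which help to get the media playing in a MediaElement or in other software, or just to save it to disk.

Writing to a container format is underway.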