MPEG4 Encoder/Decoder DMO

Xvid is an open-source video codec library that follows the MPEG-4 standard. Xvid is published under the GNU GPL, so it can be used only in free and open-source projects.

In this article, I present XvidCoreDmo. I decided to write a DMO instead of a DirectShow filter because DMOs can be used in a much broader range of applications and are much easier to use, since no filter graph is required.


Microsoft® DirectX® Media Objects (DMOs) are COM-based data-streaming components. DMOs are very similar to DirectShow filters, but their API is much simpler, which makes them easier to develop, test, and use. One of the most important reasons to choose a DMO over a DirectShow filter is that a DMO does not require a filter graph.
If your system already uses DirectShow filters, the DMO Wrapper filter makes it simple to host a DMO in a filter graph. The wrapper handles all the messy work, and because it aggregates your DMO, you can still query it for the DMO's interfaces if you want to do something special.
Let me explain two ways to use XvidCoreDmo in your own application.

Using DMOs in DirectShow

Applications based on DirectShow can use DMOs in a filter graph through the DMO Wrapper filter. To add a DMO to the graph, create an instance of the DMO Wrapper filter and call its Init method, passing the CLSID of the DMO and its category.

The following images show a typical graph using the MPEG-4 decoder DMO.
[Figure: XvidCore Decoder graph]
[Figure: XvidCore Decoder render]
// Create the DMO Wrapper filter
CComPtr<IBaseFilter> pFilter;
HRESULT hr = pFilter.CoCreateInstance(CLSID_DMOWrapperFilter);
if (SUCCEEDED(hr))
{
    // Query for IDMOWrapperFilter
    CComQIPtr<IDMOWrapperFilter> pDmoWrapper = pFilter;
    if ( pDmoWrapper != NULL )
    {
        // Initialize the filter with the decoder's CLSID and category
        hr = pDmoWrapper->Init(CLSID_Mpeg4Dec, DMOCATEGORY_VIDEO_DECODER);

        if (SUCCEEDED(hr))
        {
            // Add the filter to the graph (pGraph is your existing graph pointer).
            hr = pGraph->AddFilter(pFilter, L"XvidCoreDec");
        }
    }
}

C++ developers may want to look at the VideoPlayer Demo and use its GraphBuilderImpl class.
.NET developers may want to consider the WindowsMedia library; its DSGraphBuilder class has the same features as the C++ version.

The XvidCoreDmo encoder accepts the following media types:
  • RGB - Basic Windows bitmap format, with 16, 24, or 32 bpp samples.
  • UYVY - YUV 4:2:2 (Y sampled at every pixel; U and V sampled at every second pixel horizontally on each line).
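
Whether you feed these formats to the encoder or allocate a buffer for decoded RGB output, you need to know how many bytes one frame occupies. The sketch below is one way to compute that. The helper names (RgbStride, RgbFrameSize, UyvyFrameSize) are hypothetical and not part of XvidCoreDmo. It assumes standard Windows DIB rows padded to a 4-byte boundary, and UYVY frames with an even width and no extra row padding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helpers for illustration; not part of XvidCoreDmo.

// Bytes per row of an uncompressed RGB DIB.
// Each row is padded up to the next 4-byte (32-bit) boundary.
inline std::size_t RgbStride(std::uint32_t width, std::uint32_t bitsPerPixel) {
    return ((static_cast<std::size_t>(width) * bitsPerPixel + 31) / 32) * 4;
}

// Bytes per frame of an uncompressed RGB DIB (16, 24, or 32 bpp).
inline std::size_t RgbFrameSize(std::uint32_t width, std::uint32_t height,
                                std::uint32_t bitsPerPixel) {
    return RgbStride(width, bitsPerPixel) * height;
}

// UYVY packs two pixels into four bytes (U0 Y0 V0 Y1), i.e. 16 bits per pixel.
// Assumes an even width and no additional row padding.
inline std::size_t UyvyFrameSize(std::uint32_t width, std::uint32_t height) {
    return static_cast<std::size_t>(width) * 2 * height;
}
```

For a 320x240 frame, this gives 960 bytes per row and 230,400 bytes per frame in RGB24, and 153,600 bytes per frame in UYVY.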

Using DMOs in Non-DirectShow Application

If you are developing a specialized application, DMOs are especially useful because you can create and initialize one directly and perform the conversion you are interested in without dealing with a filter graph. The snippets below show common tasks you may want to perform with a DMO.

Initializing DMO
CComPtr<IMpg4Dec> pMpg4Dec;

HRESULT InitializeDecoder(long lWidth, long lHeight)
{
    HRESULT hr = S_OK;

    // Create the decoder on first use
    // (assumption: CLSID_Mpeg4Dec is the class that exposes IMpg4Dec)
    if ( pMpg4Dec == NULL )
        hr = pMpg4Dec.CoCreateInstance(CLSID_Mpeg4Dec);
    if (FAILED(hr))
        return hr;

    DMO_MEDIA_TYPE mIn = { 0 };
    hr = MoInitMediaType(&mIn, sizeof(VIDEOINFOHEADER));
    if (FAILED(hr))
        return hr;

    DMO_MEDIA_TYPE mOut = { 0 };
    hr = MoInitMediaType(&mOut, sizeof(VIDEOINFOHEADER));
    if (FAILED(hr))
    {
        MoFreeMediaType(&mIn);
        return hr;
    }

    // Initialize Input type (Xvid-compressed video)
    mIn.majortype  = MEDIATYPE_Video;
    mIn.subtype    = MEDIASUBTYPE_xvid;
    mIn.formattype = FORMAT_VideoInfo;

    // Initialize video header structure
    ULONG biCompression = mIn.subtype.Data1; // Data1 carries the FOURCC
    VIDEOINFOHEADER *pvih = reinterpret_cast<VIDEOINFOHEADER*>(mIn.pbFormat);
    memset(pvih, 0, sizeof(VIDEOINFOHEADER));
    pvih->bmiHeader.biSize               = sizeof(BITMAPINFOHEADER);
    pvih->bmiHeader.biWidth              = lWidth; // set video width
    pvih->bmiHeader.biHeight             = lHeight; // set video height
    pvih->bmiHeader.biPlanes             = 1;
    pvih->bmiHeader.biBitCount           = 16;
    pvih->bmiHeader.biCompression        = biCompression;
    hr = pMpg4Dec->SetInputType(0, &mIn, 0);

    if (SUCCEEDED(hr))
    {
        // Initialize Output type (uncompressed 24-bit RGB)
        mOut.majortype  = MEDIATYPE_Video;
        mOut.subtype    = MEDIASUBTYPE_RGB24;
        mOut.formattype = FORMAT_VideoInfo;
        VIDEOINFOHEADER *pvioh = reinterpret_cast<VIDEOINFOHEADER*>(mOut.pbFormat);
        memset(pvioh, 0, sizeof(VIDEOINFOHEADER));
        pvioh->bmiHeader.biSize               = sizeof(BITMAPINFOHEADER);
        pvioh->bmiHeader.biWidth              = lWidth;
        pvioh->bmiHeader.biHeight             = lHeight;
        pvioh->bmiHeader.biPlanes             = 1;
        pvioh->bmiHeader.biBitCount           = 24; //GetBitCount(&mOut.subtype);
        pvioh->bmiHeader.biCompression        = BI_RGB;
        hr = pMpg4Dec->SetOutputType(0, &mOut, 0);
    }

    if (SUCCEEDED(hr))
        hr = pMpg4Dec->AllocateStreamingResources();

    // Free the format blocks allocated by MoInitMediaType
    MoFreeMediaType(&mIn);
    MoFreeMediaType(&mOut);

    return hr;
}
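
The input type above copies subtype.Data1 into biCompression. That works for subtype GUIDs built from a FOURCC with the standard pattern XXXXXXXX-0000-0010-8000-00AA00389B71, where the first field is the four-character code itself. The portable sketch below shows how such a code is packed, with the same byte layout as the Windows MAKEFOURCC macro. MakeFourCC and the two constants are hypothetical names. The article does not say whether MEDIASUBTYPE_xvid follows this pattern or whether it uses the lowercase or uppercase spelling, so treat both as assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Packs four characters into a little-endian FOURCC (first character in the
// lowest byte). MakeFourCC is a hypothetical name used for illustration.
constexpr std::uint32_t MakeFourCC(char c0, char c1, char c2, char c3) {
    return  static_cast<std::uint32_t>(static_cast<unsigned char>(c0))
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(c1)) << 8)
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(c2)) << 16)
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(c3)) << 24);
}

// Assumption: the subtype's Data1 would equal one of these values if
// MEDIASUBTYPE_xvid is a FOURCC-based GUID.
constexpr std::uint32_t kFourCC_xvid = MakeFourCC('x', 'v', 'i', 'd');
constexpr std::uint32_t kFourCC_XVID = MakeFourCC('X', 'V', 'I', 'D');
```

Comparing the value of biCompression against constants like these is a quick way to confirm that the input type really describes an Xvid stream before passing it to SetInputType.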

Using IMediaBuffer to get samples
DMOs use the IMediaBuffer interface to manage multimedia samples. Typically, you implement your own version of this interface, as in the following class.
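//...IMediaBuffer implementation
class CBaseMediaBuffer : public IMediaBuffer {
public:
   CBaseMediaBuffer() : m_pData(NULL), m_ulSize(0), m_ulData(0), m_cRef(1) {}
   CBaseMediaBuffer(BYTE *pData, ULONG ulSize, ULONG ulData) :
      m_pData(pData), m_ulSize(ulSize), m_ulData(ulData), m_cRef(1) {}
   STDMETHODIMP_(ULONG) AddRef() {
      return InterlockedIncrement((long*)&m_cRef);
   }
   STDMETHODIMP_(ULONG) Release() {
      long l = InterlockedDecrement((long*)&m_cRef);
      if (l == 0)
         delete this;
      return l;
   }
   STDMETHODIMP QueryInterface(REFIID riid, void **ppv) {
      if (ppv == NULL) return E_POINTER;
      if (riid == IID_IUnknown) {
         *ppv = (IUnknown*)this;
      }
      else if (riid == IID_IMediaBuffer) {
         *ppv = (IMediaBuffer*)this;
      }
      else {
         *ppv = NULL;
         return E_NOINTERFACE;
      }
      AddRef(); // QueryInterface must return a counted reference
      return NOERROR;
   }
   // The data length may not exceed the buffer's maximum size
   STDMETHODIMP SetLength(DWORD ulLength) {
      if (ulLength > m_ulSize) return E_INVALIDARG;
      m_ulData = ulLength;
      return NOERROR;
   }
   STDMETHODIMP GetMaxLength(DWORD *pcbMaxLength) {
      if (pcbMaxLength == NULL) return E_POINTER;
      *pcbMaxLength = m_ulSize;
      return NOERROR;
   }
   // Either output pointer may be NULL if the caller does not need it
   STDMETHODIMP GetBufferAndLength(BYTE **ppBuffer, DWORD *pcbLength) {
      if (ppBuffer) *ppBuffer = m_pData;
      if (pcbLength) *pcbLength = m_ulData;
      return NOERROR;
   }
private:
   BYTE *m_pData;   // caller-owned memory
   ULONG m_ulSize;  // maximum size in bytes
   ULONG m_ulData;  // bytes of valid data
   ULONG m_cRef;    // reference count
};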
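
The class above starts its reference count at 1 and deletes itself when the count reaches zero, which is the usual COM lifetime rule. The portable sketch below models only that rule. RefCounted is a hypothetical class, and std::atomic stands in for the Interlocked functions; it is not COM code and not part of XvidCoreDmo. The private destructor is a design choice of the sketch: it forces heap allocation, so the `delete this` in Release is always valid.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical, simplified model of COM-style reference counting.
class RefCounted {
public:
    // The creator holds the first reference, so the count starts at 1.
    explicit RefCounted(bool* destroyedFlag)
        : m_cRef(1), m_destroyed(destroyedFlag) {}

    long AddRef() { return ++m_cRef; }

    long Release() {
        long remaining = --m_cRef;
        if (remaining == 0)
            delete this;   // last reference gone: the object frees itself
        return remaining;
    }

private:
    // Private destructor: the object can only be destroyed through Release().
    ~RefCounted() { if (m_destroyed) *m_destroyed = true; }

    std::atomic<long> m_cRef;
    bool* m_destroyed;     // lets the caller observe destruction
};
```

If you place a self-deleting buffer on the stack instead, every reference taken by the DMO must be released before the function returns, and the count must never reach zero.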
Processing Media Samples
ULONG DecodeFrame(
   LPBYTE pbData,
   ULONG lLength,
   LPBYTE pbBuffer,
   ULONG lSize)
{
    // Wrap the caller's memory; no copy is made
    CBaseMediaBuffer inputBuffer(pbData, lLength, lLength);
    CBaseMediaBuffer outputBuffer(pbBuffer, lSize, 0);
    HRESULT hr = S_OK;
    if ( pMpg4Dec == NULL )
        hr = InitializeDecoder(320,240); // hard-coded size; use your stream's dimensions
    if ( FAILED(hr) )
        return 0;

    DWORD dwStatus = 0;
    DMO_OUTPUT_DATA_BUFFER DMOBuffer = { 0 };
    DMOBuffer.pBuffer = &outputBuffer;
    DMOBuffer.dwStatus = 0;

    hr = pMpg4Dec->ProcessInput(0, &inputBuffer, 0, 0, 0);
    if ( SUCCEEDED(hr) )
        hr = pMpg4Dec->ProcessOutput(0, 1, &DMOBuffer, &dwStatus);

    DWORD cbLength = 0;
    if ( SUCCEEDED(hr) ) {
        // Number of bytes the decoder wrote to pbBuffer
        outputBuffer.GetBufferAndLength(0, &cbLength);
    }
    return cbLength;
}
IMediaObject::ProcessOutput returns S_OK when output data is available, S_FALSE when no output was produced, and an error code on failure. If you are building an RTP server, you can fragment the encoded stream and send it over the network. If you are decoding, you can simply convert the stream to a native format (an RGB bitmap) and render it to the screen.
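
For the RTP case, fragmenting means cutting each encoded frame into pieces small enough to fit in one network packet. The sketch below shows only that chunking step. SplitFrame is a hypothetical helper, and the 1400-byte limit in the usage note is an assumed payload size. A real RTP payload format also adds headers and has its own rules about where a frame may be split, none of which is covered here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: splits one encoded frame into chunks of at most
// maxPayload bytes. The last chunk holds whatever remains.
std::vector<std::vector<std::uint8_t>> SplitFrame(const std::uint8_t* data,
                                                  std::size_t length,
                                                  std::size_t maxPayload) {
    std::vector<std::vector<std::uint8_t>> chunks;
    if (data == nullptr || maxPayload == 0)
        return chunks;   // nothing sensible to split

    for (std::size_t offset = 0; offset < length; offset += maxPayload) {
        // Copy the next slice, clamped to the bytes that are left.
        std::size_t n = std::min(maxPayload, length - offset);
        chunks.emplace_back(data + offset, data + offset + n);
    }
    return chunks;
}
```

With a 1400-byte limit, a 3000-byte frame is split into three chunks of 1400, 1400, and 200 bytes.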


I hope this gives you a good overview of how to use XvidCoreDmo in your own application. Remember that Xvid is released under the GNU GPL, and this library is subject to the same terms.



12/10/2008: Mpeg4Dmo - initial release
01/20/2009: Mpeg4Dmo - Updated, improved color transformation
01/20/2009: Mpeg4Dmo - Updated, sync Xvid 1.2.2 release
03/25/2011: Mpeg4Dmo - Updated, sync Xvid 1.3.1 release