[Libav-user] How can we extract frames from a video stream in MPEG-4 AVC format

Carson Harper carsonharper at gmail.com
Thu Apr 7 07:26:55 CEST 2011


Hey Amir,

Could you tell us a little bit more about what you want to do with the "raw"
YUV or RGB data? For instance, do you want to save those individual frames
as still images (BMP, PNG...) or are you wanting to transcode them for use
in another video stream?

If you just want to save frames to disk, try:

ffmpeg -i /path/to/your/video.mp4 -f image2 frame_%06d.png

See here <http://www.catswhocode.com/blog/19-ffmpeg-commands-for-all-needs> for
some other useful CLI stuff.

I recently went through the tutorials and came up with a C++
routine that converts the first video stream of an input file into a vector
of ImageMagick Image objects, but it would be trivial to adapt it to
something else.

You mentioned wanting random access to frames by frame number, which is
something I'll need to write eventually anyway, but in the meantime this may
work for you as a lazy solution. I mostly deal with small/short video clips,
which is why I can get away with loading the whole video into memory.

I tried to tweak a few things to make it easier to understand, but you
will definitely still need to change things to get it to work for you.
Obviously, I'm throwing my own custom exceptions, etc. You could return an
array of raw (uint8_t *) buffers instead if you don't want the ImageMagick
dependency, and work with the pixel data directly from there.

Hope this helps,

CH
-------------- next part --------------
vector<Image> getBackgroundFrames(const string& inputVidPath,
		size_t numFrames, int width, int height) {

	vector<Image> videoFrames;
	try {

		//NOTE: This is your output format, you could
		//change it to YUV or whatever you want and it
		//should still work; the PixelFormat values are
		//defined in libavutil's pixfmt.h
		enum PixelFormat pixelFmt = PIX_FMT_RGBA;
		AVFormatContext *formatCtx = NULL;
		AVCodecContext *codecCtx = NULL;
		AVCodec *codec = NULL;
		AVFrame *srcFrame = NULL;
		AVFrame *destFrame = NULL;
		uint8_t *buffer = NULL;
		AVPacket packet;
		SwsContext* swsCtx = NULL;

		int frameFinished;
		int numBytes;

		const char* vidPath = inputVidPath.c_str();
		int videoStream = -1;

		/* TODO: Try only registering video's codec
		 *  But currently only takes .325 millis */
		av_register_all();

		if (av_open_input_file(&formatCtx, vidPath, NULL, 0, NULL) != 0)
			throw(chEx("av_open_input_file() unable to open: %s", vidPath));

		if (av_find_stream_info(formatCtx) < 0)
			throw(chEx("av_find_stream_info() failed"));

		//NOTE: Grabs first video stream it can find, nothing else
		for (unsigned int i = 0; i < formatCtx->nb_streams; ++i)
			if (formatCtx->streams[i]->codec->codec_type == CODEC_TYPE_VIDEO) {
				videoStream = i;
				break;
			}
		if (videoStream == -1)
			throw(chEx("Unable to find video stream"));

		// Get a pointer to the codec context for the video stream
		codecCtx = formatCtx->streams[videoStream]->codec;

		codec = avcodec_find_decoder(codecCtx->codec_id);
		if (codec == NULL)
			throw(chEx("Unsupported codec"));

		// Open codec
		if (avcodec_open(codecCtx, codec) < 0)
			throw(chEx("Unable to open codec"));

		// Allocate frames
		srcFrame = avcodec_alloc_frame();
		destFrame = avcodec_alloc_frame();

		if (destFrame == NULL || srcFrame == NULL)
			throw(chEx("Unable to allocate av_frames"));

		numBytes = avpicture_get_size(pixelFmt, width, height);

		buffer = (uint8_t *) av_malloc(numBytes * sizeof(uint8_t));
		if (buffer == NULL)
			throw(chEx("av_malloc() failed"));

		avpicture_fill((AVPicture *) destFrame, buffer, pixelFmt, width, height);

		//avcodec_pix_fmt_to_codec_tag(codecCtx->pix_fmt);
		//	char pix_fmt_buf[256];
		//	avcodec_pix_fmt_string(pix_fmt_buf, 256, codecCtx->pix_fmt);
		//	cout << "\npix_fmt:\t" << pix_fmt_buf << "\n";

		swsCtx = sws_getContext(codecCtx->width, codecCtx->height,
				codecCtx->pix_fmt, width, height, pixelFmt, SWS_BICUBIC, NULL,
				NULL, NULL);
		if (swsCtx == NULL)
			throw(chEx("sws_getContext() failed"));

		while (av_read_frame(formatCtx, &packet) >= 0) {

			// Is this a packet from the video stream?
			if (packet.stream_index == videoStream) {

				// Decode video frame
				avcodec_decode_video2(codecCtx, srcFrame, &frameFinished,
						&packet);

				// Did we get a video frame?
				if (frameFinished) {
					sws_scale(swsCtx, srcFrame->data, srcFrame->linesize, 0,
							codecCtx->height, destFrame->data,
							destFrame->linesize);
					//NOTE: destFrame->data now points to raw RGBA pixel data
					//so you could do something else with it (E.g. transcode)
					videoFrames.push_back(
							Image(width, height, "RGBA", CharPixel,
									destFrame->data[0]));

				}
			}

			// Free the packet that was allocated by av_read_frame
			av_free_packet(&packet);
			if (videoFrames.size() >= numFrames)
				break;
		}

		sws_freeContext(swsCtx);
		av_free(buffer);
		av_free(destFrame);
		av_free(srcFrame);

		// Close the codec
		avcodec_close(codecCtx);

		// Close the video file
		av_close_input_file(formatCtx);

		if (videoFrames.empty())
			throw(chEx("Unable to get any frames from video %s", vidPath));

		//If more frames are needed, copy from the beginning
		for (size_t i = 0; videoFrames.size() < numFrames; ++i)
			videoFrames.push_back(videoFrames.at(i));

		for (size_t i = 0; i < videoFrames.size(); ++i)
			videoFrames[i].modifyImage();

	} catch (chEx& tle) {
		videoFrames.clear();
		throw(tle);
	} catch (...) {
		videoFrames.clear();
		throw(chEx("Fatal error in Tl_FFMPEG::getBackgroundFrames"));
	}

	return videoFrames;
}

