[MEncoder-users] OCR for subtitles

Nicolas George nicolas.george at normalesup.org
Sat Apr 26 12:19:01 CEST 2008


Hi.

This is something of a shameless plug, but I think it may be useful to some
people around here. If not, I apologize for the spam.

Over the last few months, I have written a tool to perform OCR on vobsub
subtitles.

The other similar tools around did not suit me. Unlike them, my system
accepts only exact matches, so it never outputs a wrong character, except
when two glyphs are identical (I and l in some fonts). Furthermore, it
segments the image into connected components rather than horizontal
intervals, which makes it more reliable when glyphs overlap due to kerning.
And last but not least, it handles italics (and other text styling) and
outputs ASS.
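
To illustrate the two ideas above (connected components instead of
horizontal intervals, and exact matching instead of guessing), here is a
minimal Python sketch. It is not taken from the tool itself: the function
names, the '#' bitmap encoding, the tiny glyph table and the choice of
8-connectivity are all assumptions made for the example. It also treats
every component as a whole glyph, so multi-part glyphs such as "i" would
need an extra grouping step that is not shown.

```python
from collections import deque


def horizontal_intervals(bitmap):
    """Baseline for comparison: split the line at blank pixel columns."""
    width = len(bitmap[0]) if bitmap else 0
    inked = [any(row[x] == '#' for row in bitmap) for x in range(width)]
    runs, start = [], None
    for x, has_ink in enumerate(inked):
        if has_ink and start is None:
            start = x
        elif not has_ink and start is not None:
            runs.append((start, x - 1))
            start = None
    if start is not None:
        runs.append((start, width - 1))
    return runs


def connected_components(bitmap):
    """Group ink pixels that touch (8-connectivity), ordered left to right."""
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    seen, components = set(), []
    for r in range(height):
        for c in range(width):
            if bitmap[r][c] != '#' or (r, c) in seen:
                continue
            # Breadth-first flood fill starting from this unvisited pixel.
            queue, pixels = deque([(r, c)]), []
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny in (y - 1, y, y + 1):
                    for nx in (x - 1, x, x + 1):
                        if (0 <= ny < height and 0 <= nx < width
                                and bitmap[ny][nx] == '#'
                                and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
            components.append(pixels)
    components.sort(key=lambda p: min(x for _, x in p))
    return components


def normalize(pixels):
    """Shift a shape to the origin so it can be used as a dictionary key."""
    top = min(y for y, _ in pixels)
    left = min(x for _, x in pixels)
    return frozenset((y - top, x - left) for y, x in pixels)


def glyph_key(rows):
    """Build a lookup key from a reference glyph drawn with '#'."""
    return normalize([(y, x) for y, row in enumerate(rows)
                      for x, ch in enumerate(row) if ch == '#'])


def recognize(bitmap, known):
    """Exact-match lookup: an unknown shape yields None, never a guess."""
    return [known.get(normalize(p)) for p in connected_components(bitmap)]


# Hypothetical glyph table with two hand-drawn shapes.
KNOWN = {
    glyph_key(["#####", "..#..", "..#..", "..#..", "..#.."]): "T",
    glyph_key(["...#", "..#.", ".#..", "#..."]): "/",
}

# "T/" with kerning: the slash tucks under the bar of the T, so no pixel
# column is blank between the two glyphs.
LINE = [
    "#####...",
    "..#....#",
    "..#...#.",
    "..#..#..",
    "..#.#...",
]

print(horizontal_intervals(LINE))
print(recognize(LINE, KNOWN))
```

On this sample the column-based split sees a single run covering both
glyphs, while the component-based split separates them and each one is
found in the table. A shape that is not in the table comes back as None
and can be handed to the user rather than guessed.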

A copy of the source tree can be found here:
http://gitorious.org/projects/exocr
It is not a clean, finished project, but it is usable as is.

I hope this proves useful to people other than me.

Regards,

-- 
  Nicolas George