[MEncoder-users] Convert VOB subtitles to SRT format
Grozdan Nikolov
microchip at chello.be
Thu Aug 2 11:16:33 CEST 2007
On Thursday 02 August 2007 02:30, Peter Cordes wrote:
> On Mon, Jul 30, 2007 at 05:48:22PM +0200, Grozdan Nikolov wrote:
> > Hi,
> >
> > I know this is not mencoder related but I'm only subscribed to this
> > list.... I have a subtitle here (subtitle.sub and subtitle.idx files - 1
> > language) and I want to convert it to the SRT format. How do I do it? I
> > read at
> > http://www.mplayerhq.hu/DOCS/HTML/en/subosd.html that it is possible but
> > neither the man page nor the link above is very clear on how to do it.
> > Can someone give me a command line example?
>
> I just did this the other day. avidemux has a nice tool for OCRing a
> vobsub into a SubRip (.srt) sub. You give it the .idx, and choose which
> language you want. It works by matching exact glyphs, so it always gets
> I vs. l wrong because they always look the same in all the vobsubs I've
> seen. You have to type in the character for everything it hasn't seen yet,
> but once you've hit most of the alphabet it doesn't take long. If you see
> parts of letters, change the threshold it uses to draw contours of things.
> It's better to have to tell it that it's found tt or ff than to deal with
> the left half of an m, or something.
Whoa... thanks for your extensive answer. I will try it out.
>
> I had mixed results trying to type in non-ASCII characters with accents
> using xvkbd (an on-screen keyboard which can switch to different layouts:
> Spanish, French, German, UK (for the pound sign), etc.). For English, it
> worked perfectly.
>
> When you're done, save your glyphs to a file so you can load them next
> time. avidemux's file selector remembers what directory you were in, so
> the easiest thing to do is make a symlink to your glyph file in your home
> dir from wherever you're working on the movie. Then you don't have to go
> far in the file selector dialog, and then back.
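A minimal sketch of that symlink trick, assuming the glyph file lives in the home directory and the link goes into the directory where the movie is being worked on. The function name and both paths are made up for illustration; avidemux itself does not care what the glyph file is called.

```shell
# link_glyphs GLYPH_FILE WORK_DIR
# GLYPH_FILE should be an absolute path (e.g. a file under "$HOME"),
# otherwise the link would point at nothing once created in WORK_DIR.
link_glyphs() {
    glyphs=$1
    work_dir=$2
    # -s creates a symbolic link; -f replaces a stale link left by an earlier run
    ln -s -f "$glyphs" "$work_dir/$(basename "$glyphs")"
}
```

Called as, say, link_glyphs "$HOME/my.glyphs" "$PWD", this leaves a link next to the movie, so the file selector finds the glyphs without leaving the directory it remembered.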
>
> To spell check the srt, I used subtitleeditor. It has a spell checker.
> It can even play the segment of the movie that goes with the selected
> subtitle. I'd like to find a spell checker that realized it was working on
> text OCRed from a sans-serif font, and would try switching l and I as a
> first option in the replace window. Unfortunately, subtitleeditor is not
> smart like that. (If those letters look the same to you in this email, go
> find a font where you can tell them apart. And o, O, and 0, while you're
> at it.)
>
>
> So, avidemux for vobsub -> srt, and subtitleeditor for srt spellcheck.
>
>
> I was making srt subs just for the forced vobsubs in the English subtitle
> track. I don't have rar-2.80 on my amd64, and I didn't want to leave the
> vobsubs uncompressed. I was able to merge them into an mkv with the movie,
> but mplayer (and vlc and xine) seem to ignore the .idx which goes into the
> codecprivate. I edited the .idx to say forced subs: ON before I mkvmerged
> them, and looking at the mkv with less I could see that line in there.
> Unfortunately, mplayer doesn't seem to actually only play forced subs.
> Also, it won't default to playing the vobsub track that has the default
> flag in an mkv, maybe because one sub track always ends up with the default
> flag, and mplayer is working around that annoyance. (I saw a post from
> someone who muxed an empty subtitle track as the default for that reason,
> presumably with a different player.)
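The .idx edit described above can be scripted instead of done by hand. This is only a sketch: it assumes the .idx contains a setting line that starts with "forced subs:" (as quoted above), and the function name is made up. It changes nothing about how any player treats the flag.

```shell
# set_forced_subs_on IDX_FILE
# Rewrites the "forced subs:" setting line of a vobsub .idx to ON.
set_forced_subs_on() {
    idx=$1
    tmp=$(mktemp) || return 1
    # Anchored, lowercase match: a comment line such as "# Forced subs: ..."
    # is left alone, only the setting line itself is rewritten.
    if sed 's/^forced subs:.*/forced subs: ON/' "$idx" > "$tmp"; then
        cat "$tmp" > "$idx"   # overwrite in place, keeping the file's permissions
    fi
    rm -f "$tmp"
}
```

Run it on the .idx before muxing, then check the result with grep '^forced subs' on the file.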
>
> There might be a bug in mplayer re: forced subs only being set in the
> ".idx" in an mkv, since pressing F switches to forced subs: disabled, and
> pressing it again switches to enabled (and then it really is enabled).
>
> Anyway, so I decided to just make srt subs for the forced subs. The
> problem became finding which subs were forced. spuunmux (in Ubuntu's
> dvdauthor package) can read a vobsub .sub and write an XML file and a
> directory of png images of the bitmaps. If there was an option to just
> write the xml, it would run much faster, but it doesn't take too long as it
> is.
>
> I had multiple movies to look for forced subs in, so I did:
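> mkdir -p /tmp/subs
> for movie in */; do
>   movie=${movie%/}
>   pushd "$movie" || continue
>   # unpack the vobsubs only if they aren't already extracted
>   ls *cd1*.sub >/dev/null 2>&1 || unrar x *.rar
>   for i in 1 2; do
>     spuunmux -o /tmp/subs/"$movie"."$i" *cd$i*.sub
>   done
>   popd
> done
> grep -l force /tmp/subs/*.xml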
>
> Then, to find which subs are the forced ones:
> grep force /tmp/subs/movie.xml
> (spuunmux seems to generate bogus timestamps. They're maybe off by a
> factor of some frame rate ratio? No idea why.)
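Since the timestamps are not trustworthy, it may be easier to go by image name. A rough sketch, assuming each subtitle in the spuunmux XML sits on a single line carrying an image="..." attribute and, when forced, force="yes". That layout is an assumption about the output, so check it against the real file first; the function name is made up.

```shell
# forced_images XML_FILE
# Prints the image file name of every subtitle entry marked as forced.
forced_images() {
    # keep only the lines flagged as forced, then pull out the image="..." value
    grep 'force="yes"' "$1" | sed -n 's/.*image="\([^"]*\)".*/\1/p'
}
```

The names it prints can be opened in an image viewer to read the text of each forced subtitle.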
>
> Then you can mark the forced ones in subtitleeditor, and then delete all
> the non-forced ones. (You can add a "name" column in subtitleeditor, and
> use it to put a mark before you start deleting, so the numbers still match
> up (after you delete one to correct for 0-based vs. 1-based.))
>
> You can always use an image viewer to see the text for a given subtitle
> number.
>
>
> Anyway, hope this helps.