A Guide to Movie Encoding
This is a guide to encoding and recoding movies, mostly on Linux, and also partly a rant against the most egregious practices.
I’m talking about encoding here, but just about every source you can get movies from is already encoded, be it DVDs, Blu-rays, modern cameras or files. Only very rarely will you get an unencoded stream, e.g. from a VHS. So all of this actually applies mostly to re-encoding.
Also, being on Linux, one of the main requirements is that all the formats are supported by open source software. I don’t care about any possible patent violations, because those would involve software patents, and these would have been granted illegally anyway.
The tools used, denoted by a fixed font, are Linux commands or Debian/Ubuntu packages, but most of the software is available on other platforms as well.
Use the source
The quality of the encoding relies most heavily on the quality of the source you have. The more artifacts — no matter where from, be it from the actual film, dust, scratches, the projector, the camera, the VHS-drive, or the more modern electronic encoding-artifacts — the bigger the encoding will get to retain the same quality as the source. Some of the worst things I’ve seen are early black and white movies with loads of dust, scratches and grain.
Basically, artifacts increase the entropy, and the more entropy the less compression is possible.
- Use the best source available. Usually that is the Blu-ray, unless the producer just upscaled from a DVD, in which case adjust the resolution back down to the DVD level, usually 720 wide (but 704 or 352 is possible).
- Codecs matter. Some are so notorious for introducing encoding artifacts that any re-encode will actually increase the file size. DIV3 is one such.
- Otherwise you might save 20% to 50% by re-encoding DIVX, XVID or DX50 with a better codec, with no loss in visible quality. And of course, with MPEG2 from DVDs you can save around 80-90% of the space, and with MPEG-4 AVC or VC-1 from Blu-rays around 50-80%, depending on your quality needs.
- Generally, a 500MB file encoded from a Blu-ray will look much better than a 500MB file encoded from a DVD at the same resolution. Actually, you might even get a better resolution from the Blu-ray at the same file size.
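The entropy argument above can be made tangible with a small sketch. It is only an illustration: it uses zlib, a general-purpose lossless compressor, as a stand-in for a video codec, and a synthetic grayscale gradient as a stand-in for a film frame. The function names and the noise level are made up for this example.

```python
import random
import zlib


def make_frame(width: int, height: int, noise: int, seed: int = 0) -> bytes:
    """Synthetic grayscale frame: a smooth horizontal gradient plus optional grain.

    `noise` is the maximum random offset added to each pixel (0 = clean frame).
    """
    rng = random.Random(seed)
    pixels = bytearray()
    for _ in range(height):
        for x in range(width):
            base = (x * 255) // (width - 1)  # smooth gradient from 0 to 255
            jitter = rng.randint(-noise, noise) if noise else 0
            pixels.append(min(255, max(0, base + jitter)))
    return bytes(pixels)


def compressed_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means better compression)."""
    return len(zlib.compress(data, 9)) / len(data)


clean = make_frame(320, 240, noise=0)
grainy = make_frame(320, 240, noise=20)  # simulated dust and grain

print(f"clean frame : {compressed_ratio(clean):.3f}")
print(f"grainy frame: {compressed_ratio(grainy):.3f}")
```

The grainy frame compresses far worse than the clean one, even though both show "the same picture". A real video encoder is lossy and much smarter, but it fights the same problem.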
Acquiring target
For the target, there are basically three factors that matter in the overall quality: container, codec and encoder. Apart from resolution, of course, but there the maximum is dictated by the source.
- Container is easy: It must support multiple audio streams, multiple subtitles, preferably in textual format (e.g. srt or ssa), and metadata, preferably also cover images. The Comparison of container formats makes clear that this is Matroska, probably followed by MP4.
- Codec is a bit more tricky. But basically, you want one of the newer ones; they consistently offer better quality at a lower file size. That pretty much leaves H.264 and VP9. You probably want H.264: Blu-rays mostly come in it already, and so do YouTube videos nowadays.
- Stop using DIV3, DIVX, XVID, DX50 right now. They’re vastly inferior compared to what modern codecs deliver in half the filesize.
- Audio codecs don’t have a large influence on file size. But as AC3 can’t do VBR, you don’t want that, and MP3 can’t do more than 2 channels. That leaves AAC and Opus as viable options, which happen to be the defaults to go with H.264 and VP9 respectively. Don’t use AC3, and don’t use DTS; both are obsolete.
- Fortunately, handbrake-gtk already comes with H.264 and AAC as defaults; you only need to set the container to Matroska, and you’re good. A quality factor RF of 20 is usually good; 25 is still acceptable; anything higher looks visibly bad.
- If you’ve already got a load of MP4 files encoded with H.264 and AAC, mmg (from mkvtoolnix-gui) can rewrite the container of the file to Matroska without re-encoding. It also supports adding more audio tracks, subtitles and image attachments.
- If you want to reduce the dimensions of the movie in order to reduce filesize, don’t go below a width of 720. Actually, rather reduce the quality somewhat before reducing dimensions; the visual impact is less noticeable.
- Don’t ever go for a “filesize of 700MB”, that’s just stupid. Nobody wants to put a movie on a CD (and actually most people wouldn’t have, even 15 years ago).
- But be careful about filesize. Sadly, there are still VFAT filesystems out there, some of them used by today’s “Smart” TVs, and FAT32 can’t hold files of 4GB or more (some older implementations already give up at 2GB).
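For those who prefer scripting over clicking, the recommendations above can be sketched as command lines. This is not what I described with handbrake-gtk and mmg: it swaps in ffmpeg (with its libx264 and aac encoders) for the encode and mkvmerge (from mkvtoolnix) for the remux, and it only builds the commands without running them. The helper names, file names and the size check are made up for the example, and depending on your source you may have to adjust the stream mapping.

```python
import shlex


def encode_cmd(src, dst, crf=20, width=None):
    """Build an ffmpeg command: H.264 video, AAC audio, container chosen by dst's extension."""
    cmd = [
        "ffmpeg", "-i", src,
        "-map", "0",                # keep every stream found in the source
        "-c:v", "libx264", "-crf", str(crf),
        "-c:a", "aac",
        "-c:s", "copy",             # pass subtitle streams through untouched
    ]
    if width is not None:
        # -2 lets ffmpeg pick an even height that preserves the aspect ratio
        cmd += ["-vf", f"scale={width}:-2"]
    return cmd + [dst]


def remux_cmd(src, dst):
    """Build an mkvmerge command that rewrites the container without re-encoding."""
    return ["mkvmerge", "-o", dst, src]


def fits(size_bytes, limit_bytes):
    """True if a file of size_bytes stays within the given filesystem limit."""
    return size_bytes <= limit_bytes


if __name__ == "__main__":
    print(shlex.join(encode_cmd("movie.vob", "movie.mkv")))
    print(shlex.join(encode_cmd("movie.m2ts", "movie.mkv", width=720)))
    print(shlex.join(remux_cmd("movie.mp4", "movie.mkv")))
    # Conservative check against the 2GB figure mentioned above
    print(fits(1_500_000_000, 2 * 1024**3))
```

A quality-based encode does not let you predict the file size up front, so the size check is meant for the finished file (e.g. fed from os.path.getsize). If it comes out too big, raise the quality factor a bit before you touch the dimensions.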
Dub stands for Dumb
There is only one reason for dubbing a movie — making it available to children who haven’t learned to read yet, and to the illiterate.
- Whoever had, or still has, the idea of doing a voice-over instead of just leaving the original language alone and subtitling it is a total moron. And so is everyone encoding a movie with such an audio track. However, it is acceptable to voice-over parts with foreign speakers in documentaries (but not the whole documentary!).
- If you still want to encode a dubbed audio track, make sure to also include the original language track. If it’s not possible with your container format, you’re using the wrong one.
- Since not everyone is expected to read every language, include all available subtitles. Again, if your container doesn’t allow that, you’re using the wrong one.
- Hardcoded subtitles (within the movie stream itself) probably mean you’re either a moron or using the wrong software. They’re only acceptable if the source had them too.
- Those pesky vobsub files, which are actually (MPEG) streams, can be OCR’d to text files (srt, ssa) with vobsub2srt. Whatever vobsub2srt cannot recognize can be OCR’d with SubRip (works with wine), for instance, but it will require heavy work. So you would be better off either getting them from opensubtitles.org or just including the vobsub.
- Subtitles that are out of sync can be fixed with subtitleeditor. If they just start late or early, you can also just set an offset within mmg (from mkvtoolnix-gui).
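If the subtitles are in srt format and merely start late or early, the fix is simple enough to sketch by hand. The following is a minimal illustration of what a constant offset does to the timing lines, not a replacement for the tools above; the function names are made up, and it assumes the usual `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing lines.

```python
import re

# Matches one srt timestamp such as 01:02:03,456
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def _shift(match, offset_ms):
    """Return the matched timestamp moved by offset_ms, clamped at zero."""
    h, m, s, ms = (int(g) for g in match.groups())
    total = ((h * 60 + m) * 60 + s) * 1000 + ms + offset_ms
    total = max(total, 0)
    h, rest = divmod(total, 3_600_000)
    m, rest = divmod(rest, 60_000)
    s, ms = divmod(rest, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def shift_srt(text, offset_ms):
    """Shift every timing line of an srt document by offset_ms milliseconds.

    Only lines containing '-->' are touched, so timestamps that happen to
    appear inside the dialogue stay as they are.
    """
    out = []
    for line in text.splitlines(keepends=True):
        if "-->" in line:
            line = TIMESTAMP.sub(lambda mt: _shift(mt, offset_ms), line)
        out.append(line)
    return "".join(out)


sample = "1\n00:00:01,000 --> 00:00:02,500\nHello\n"
print(shift_srt(sample, 1500))
```

A positive offset delays the subtitles, a negative one makes them appear earlier. Subtitles that drift over time (wrong frame rate) need scaling instead of an offset, which this sketch does not cover.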
Finishing Touches
After having a decent file, you might want to add metadata and (if applicable) cover-images.
- The minimum metadata you need to provide is title, year and director (yes, there are at least two movies with the same name, published the same year).
- If the movie is a known published one, can fetch the metadata, and my nfo2xml can convert it into a Matroska metadata XML which can be muxed in with mmg.
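If you’d rather write the minimum metadata by hand, a Matroska tags XML is small enough to generate yourself. The sketch below is not nfo2xml; it assumes the tag names TITLE, DATE_RELEASED and DIRECTOR and the target type value 50 (movie level) from the Matroska tagging conventions, and the movie data is a placeholder. Verify the result against your mkvtoolnix version before relying on it.

```python
import xml.etree.ElementTree as ET


def matroska_tags(title, year, director):
    """Return a Matroska tags XML document with title, year and director."""
    tags = ET.Element("Tags")
    tag = ET.SubElement(tags, "Tag")

    # TargetTypeValue 50 marks the tags as applying to the movie as a whole
    targets = ET.SubElement(tag, "Targets")
    ET.SubElement(targets, "TargetTypeValue").text = "50"

    fields = (
        ("TITLE", title),
        ("DATE_RELEASED", str(year)),
        ("DIRECTOR", director),
    )
    for name, value in fields:
        simple = ET.SubElement(tag, "Simple")
        ET.SubElement(simple, "Name").text = name
        ET.SubElement(simple, "String").text = value

    return '<?xml version="1.0" encoding="UTF-8"?>\n' + ET.tostring(tags, encoding="unicode")


# Placeholder data, not a real movie
print(matroska_tags("Example Movie", 1999, "Jane Doe"))
```

The resulting file can then be muxed in as global tags, either through the GUI or with mkvmerge’s --global-tags option.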