PDF to CBZ for e-reader
A small post about the "correct" way to convert a PDF to CBZ. But first, why? In my case, I
had just finished the novel I was reading and wanted to see whether the much-hyped House of Leaves
was all that. Typesetting is pretty important in that novel, so EPUB or even a
text PDF isn't going to cut it, which is why I obtained it as a scanned PDF (i.e. one containing a
raster image of each page).
When I tried it on my e-reader (Kobo Aura One running KOReader), each page took something like 10 to 20 seconds to turn. Unacceptable. Since I already had some experience from earlier e-reader conversion scripts, I decided to precompute as much as possible and save the result.
For this recipe, you'll need to know your e-reader's display resolution (1404x1872 for the Aura One) and its grey level count (16 for anything even slightly modern). Then it's just a matter of using the right tools:
    mkdir work && cd work

    # Convert the PDF to images fitting the final dimensions
    mutool draw -o %04d.ppm -w 1404 -h 1872 '../(2000) House of Leaves.pdf'

    # Convert to grayscale, apply a bit of sigmoidal contrast to clean up and
    # quantize to 16 colors with proper dithering
    ls *.ppm | pararun -p 'magick "$1" -colorspace gray -sigmoidal-contrast 7x50% \
        -dither FloydSteinberg -colors 16 -depth 8 -define png:compression-level=0 \
        "${1%.ppm}.png"'

    # Finally compress the results
    oxipng --ng -o3 -s *.png

    # And zip it
    zip '../(2000) House of Leaves.cbz' *.png
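If you don't have pararun at hand, the parallel conversion step can probably be approximated with find and xargs. This is only a sketch: it assumes pararun runs the quoted command once per input line with the file name as "$1", and it relies on the -0, -r and -P options found in GNU xargs (recent BSD versions accept them too). The function name convert_pages is made up for the example; the magick options are the ones from the recipe.

```shell
# Hypothetical stand-in for the pararun step: convert every .ppm page in a
# directory to a 16-level grayscale PNG, one magick process per page.
convert_pages() {
    # $1: directory holding the .ppm pages (defaults to the current one)
    find "${1:-.}" -maxdepth 1 -name '*.ppm' -print0 |
        xargs -0 -r -n 1 -P "$(nproc 2>/dev/null || echo 2)" sh -c '
            magick "$1" -colorspace gray -sigmoidal-contrast 7x50% \
                -dither FloydSteinberg -colors 16 -depth 8 \
                -define png:compression-level=0 "${1%.ppm}.png"' _
}
```

Calling `convert_pages .` from inside the work directory should then stand in for the `ls *.ppm | pararun ...` line.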
And there you are! Sure, the book went from 34 MB to 240 MB, but page turns are instantaneous
and it's much easier to read thanks to the increased contrast. Worth mentioning that this inspired
me to add a sigmoidal contrast operator to pyvips this afternoon (I still have to write the
sRGB (u8) -> scRGB (float) -> numpy -> sRGB (u8) codepath), and
that I'm planning to add a way to import GIMP Curves in there too.
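For the curious, here is a minimal sketch of what a sigmoidal contrast curve does to a single grey level. It assumes the formula given in the ImageMagick documentation for -sigmoidal-contrast, with contrast 7 and midpoint 50% as in the recipe; the function name sigmoidal is made up, and this is not the code ImageMagick or pyvips actually run.

```shell
# Map a grey level u in [0,1] through a sigmoidal contrast curve.
# $1: input level; $2: contrast (default 7); $3: midpoint (default 0.5)
sigmoidal() {
    awk -v u="$1" -v b="${2:-7}" -v a="${3:-0.5}" 'BEGIN {
        lo = 1 / (1 + exp(b * a))          # raw sigmoid at u = 0
        hi = 1 / (1 + exp(b * (a - 1)))    # raw sigmoid at u = 1
        s  = 1 / (1 + exp(b * (a - u)))    # raw sigmoid at u
        # Rescale so that 0 stays 0 and 1 stays 1
        printf "%.4f\n", (s - lo) / (hi - lo)
    }'
}
```

Black, white and the midpoint stay put, while levels below the midpoint get darker and levels above it get lighter, which is what cleans up a greyish scan.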