Speeding up cljqalbum
Part 2 of my cljq adventures, this time around one of my end-user applications: the music (album) query tools I use with rymscrap.
To make it short, everything went fine: I was able to migrate simply by swapping the command called via find -exec {} ; and got this:
    $ jqalbum 'has_genre("Sludge Metal") and year < 2000'
    /home/user/Music/Acid Bath/(1994) When the Kite String Pops/album.json
    /home/user/Music/Acid Bath/(1996) Paegan Terrorism Tactics/album.json
    …
    $ cljqalbum '(and (has-genre "Sludge Metal") (< year 2000))'
    /home/user/Music/Acid Bath/(1994) When the Kite String Pops/album.json
    /home/user/Music/Acid Bath/(1996) Paegan Terrorism Tactics/album.json
    …
But as for the performance…
    $ jq --version; sbcl --version
    jq-1.7.1
    SBCL 2.5.2
    $ hyperfine 'jqalbum '\''has_genre("Sludge Metal") and year < 2000'\''' \
                'cljqalbum '\''(and (has-genre "Sludge Metal") (< year 2000))'\'''
    Benchmark 1: jqalbum 'has_genre("Sludge Metal") and year < 2000'
      Time (mean ± σ):      3.531 s ±  0.028 s    [User: 1.949 s, System: 1.573 s]
      Range (min … max):    3.490 s …  3.570 s    10 runs

    Benchmark 2: cljqalbum '(and (has-genre "Sludge Metal") (< year 2000))'
      Time (mean ± σ):      8.373 s ±  0.055 s    [User: 3.444 s, System: 4.973 s]
      Range (min … max):    8.302 s …  8.456 s    10 runs

    Summary
      jqalbum 'has_genre("Sludge Metal") and year < 2000' ran
        2.37 ± 0.02 times faster than cljqalbum '(and (has-genre "Sludge Metal") (< year 2000))'
Unacceptable. But I don't cry uncle this easily so I started to investigate the two suspects that immediately came to mind:
- Simply the cost of launching SBCL, loading Quicklisp and ASDF/UIOP, reloading the FASL, etc… once per file, since find -exec spawns a new process for every match.
- Possibly those simple query functions' implementation (album.jq vs album.lisp), as I know cl-ppcre - the de facto standard CL regexp lib - can be a tad slow. Nothing I can do about it without ditching regexps, and that'd be both painful and unfair in the context of benchmarking.
So I decided to at least solve the first one and do everything inside a single Lisp process. Didn't even have to re-implement find, as uiop:launch-program gives me popen-like process spawning.
    (declaim (optimize (speed 3) (debug 0) (safety 0)))
    (require "asdf")
    (asdf:load-system "q3cpma-json-utils")
    (ql:quickload '("com.inuoe.jzon" "cl-ppcre" "iterate") :silent t)

    (defpackage #:cljqalbum
      (:use #:cl #:iterate)
      (:export #:toplevel))
    (in-package #:cljqalbum)

    ;; Query JSON along a literal (unevaluated) PATH.
    (defmacro ? (json &rest path)
      `(q3cpma-json:query ,json ',path))

    ;; Same as ?, but keep only the first result.
    (defmacro ?1 (json &rest path)
      `(car (q3cpma-json:query ,json ',path)))

    ;; Current album's parsed JSON; rebound dynamically for each file.
    (defparameter $ nil)
    (load (merge-pathnames "album" *load-truename*))

    (defun toplevel ()
      (destructuring-bind
          (form-str &optional (dir (uiop:native-namestring
                                    (merge-pathnames "Music/" (user-homedir-pathname)))))
          (uiop:command-line-arguments)
        (let ((find-process
                ;; Let find(1) list the files and read its output as a stream.
                (uiop:launch-program
                 `("find" "-L" ,dir "-type" "f" "-name" "album.json")
                 :output :stream))
              (form-fun
                ;; Read the query in this package and compile it once.
                (let ((*package* (find-package :cljqalbum)))
                  (compile nil `(lambda () ,(read-from-string form-str))))))
          (iter
            (for line = (read-line (uiop:process-info-output find-process) nil nil))
            (while line)
            (let (($ (com.inuoe.jzon:parse (uiop:parse-native-namestring line))))
              (when (funcall form-fun)
                (write-line line)))))))
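The part of toplevel that matters most for speed is that the query string is read and compiled only once, then called for each album with $ rebound dynamically. Here is a stripped-down sketch of that pattern without any external library. The names *album*, *query-package*, compile-query and filter-albums, the year symbol macro and the plist data are all hypothetical stand-ins; the real accessors live in album.lisp, which isn't shown in this post.

```lisp
;; Hypothetical stand-in for the post's special variable $.
(defvar *album* nil)

;; Package in which queries are read; captured when this file is loaded so
;; that symbols such as YEAR resolve to the definitions below.
(defvar *query-package* *package*)

;; Hypothetical accessor: lets a query mention YEAR as if it were a variable.
(define-symbol-macro year (getf *album* :year))

(defun compile-query (form-str)
  "Read FORM-STR once and compile it into a zero-argument function."
  (let ((*package* *query-package*))
    (compile nil `(lambda () ,(read-from-string form-str)))))

(defun filter-albums (form-str albums)
  "Return the elements of ALBUMS for which the compiled query is true."
  (let ((query (compile-query form-str)))
    (remove-if-not (lambda (album)
                     ;; Dynamic rebinding, like $ in toplevel.
                     (let ((*album* album))
                       (funcall query)))
                   albums)))
```

With these definitions, (filter-albums "(< year 2000)" '((:year 1994) (:year 2004))) keeps only the 1994 entry. The compilation cost is paid once per run rather than once per file, which is the same trade the single-process version makes.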
Let's see the result with both SBCL and Clozure CL:
    $ hyperfine 'jqalbum '\''has_genre("Sludge Metal") and year < 2000'\''' \
                'sbcl … '\''(and (has-genre "Sludge Metal") (< year 2000))'\''' \
                'ccl … '\''(and (has-genre "Sludge Metal") (< year 2000))'\'''
    Benchmark 1: jqalbum 'has_genre("Sludge Metal") and year < 2000'
      Time (mean ± σ):      3.530 s ±  0.027 s    [User: 1.923 s, System: 1.597 s]
      Range (min … max):    3.474 s …  3.560 s    10 runs

    Benchmark 2: sbcl … '(and (has-genre "Sludge Metal") (< year 2000))'
      Time (mean ± σ):     396.8 ms ±   2.9 ms    [User: 327.8 ms, System: 94.0 ms]
      Range (min … max):   393.4 ms … 402.4 ms    10 runs

    Benchmark 3: ccl … '(and (has-genre "Sludge Metal") (< year 2000))'
      Time (mean ± σ):      1.298 s ±  0.006 s    [User: 1.243 s, System: 0.082 s]
      Range (min … max):    1.292 s …  1.308 s    10 runs

    Summary
      sbcl … '(and (has-genre "Sludge Metal") (< year 2000))' ran
        3.27 ± 0.03 times faster than ccl … '(and (has-genre "Sludge Metal") (< year 2000))'
        8.90 ± 0.09 times faster than jqalbum 'has_genre("Sludge Metal") and year < 2000'
Alright, already crushing jq by a factor of 9! Now I'm kind of obligated to go all in and try with "executable images" (fat bundling of the runtime with a dumped image; around 40 MB):
    $ sbcl --no-userinit --load ~/.local/lib/quicklisp/setup.lisp --script cljq/make-cljqalbum.lisp cljqalbum.sbcl
    $ ccl --no-init --load ~/.local/lib/quicklisp/setup.lisp --load cljq/make-cljqalbum.lisp --eval '(uiop:quit)' -- cljqalbum.ccl
    $ hyperfine 'jqalbum '\''has_genre("Sludge Metal") and year < 2000'\''' \
                'cljqalbum.sbcl '\''(and (has-genre "Sludge Metal") (< year 2000))'\''' \
                'cljqalbum.ccl '\''(and (has-genre "Sludge Metal") (< year 2000))'\'''
    Benchmark 1: jqalbum 'has_genre("Sludge Metal") and year < 2000'
      Time (mean ± σ):      3.541 s ±  0.041 s    [User: 1.957 s, System: 1.574 s]
      Range (min … max):    3.480 s …  3.618 s    10 runs

    Benchmark 2: cljqalbum.sbcl '(and (has-genre "Sludge Metal") (< year 2000))'
      Time (mean ± σ):      57.9 ms ±   1.5 ms    [User: 41.5 ms, System: 41.5 ms]
      Range (min … max):    55.4 ms …  62.8 ms    50 runs

    Benchmark 3: cljqalbum.ccl '(and (has-genre "Sludge Metal") (< year 2000))'
      Time (mean ± σ):     176.8 ms ±   2.7 ms    [User: 151.0 ms, System: 51.7 ms]
      Range (min … max):   174.2 ms … 183.3 ms    16 runs

    Summary
      cljqalbum.sbcl '(and (has-genre "Sludge Metal") (< year 2000))' ran
        3.06 ± 0.09 times faster than cljqalbum.ccl '(and (has-genre "Sludge Metal") (< year 2000))'
       61.18 ± 1.71 times faster than jqalbum 'has_genre("Sludge Metal") and year < 2000'
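The make script itself isn't reproduced here. On SBCL, dumping an executable image generally comes down to a call to sb-ext:save-lisp-and-die, so the following is a minimal sketch under that assumption and not the actual make-cljqalbum.lisp. The main and build functions are hypothetical names, the example only echoes its arguments instead of running queries, and CCL would need ccl:save-application instead.

```lisp
;; SBCL-only sketch; MAIN and BUILD are hypothetical names.
(defun main ()
  "Entry point of the dumped executable: print each command-line argument."
  (format t "~{~a~%~}" (rest sb-ext:*posix-argv*)))

(defun build (output)
  "Dump the current image, runtime included, as an executable at OUTPUT.
This terminates the running Lisp process once the file is written."
  (sb-ext:save-lisp-and-die output
                            :executable t
                            :toplevel (lambda ()
                                        (main)
                                        (sb-ext:exit :code 0))))
```

Assuming the block is saved as build.lisp, something like sbcl --non-interactive --load build.lisp --eval '(build "hello")' would produce the binary. Everything loaded before the dump (libraries, compiled functions) is already in memory when the executable starts, which is where the startup savings come from.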
jq is now thoroughly steamrolled, not merely crushed. Mission accomplished. For the second performance point, I could try to replace cl-ppcre with the experimental one-more-re-nightmare, but I'm not super confident about the potential wins (and it doesn't have full POSIX ERE support yet).