Fast SEQUENCE iteration in Common Lisp

Published on 2025-12-13

Introduction and code

If you don't know what sequences are in CL, here's the gist: a sequence is either a linked list or a vector. The main idea is that operations like element search, comparison or deletion should share the same interface across both types.

Basically, sequences are a band-aid over the lack of a real iterator protocol (yes, even considering the mildly supported extensible sequences thing). But well, that's what we have to work with and it's not so bad really; we could be in barren C-ountry after all.

One of the problems of sequences is that if you want to iterate over these while supporting the conventional keywords (:start, :end, :key) without writing entirely separate (and gnarly) versions, there are only two unified interfaces provided by ANSI:

elt+length+loop (or do): naïve and simple, what's actually used under the hood by iterate's in-sequence driver. No free keyword forwarding and quadratic (thus extremely slow) on lists.
reduce: way more practical as it handles all the keywords for you (even :from-end) and much faster since any non-toy implementation internally dispatches over the actual sequence type. Still not ideal as it has the overhead of repeatedly funcalling your closure, and you don't have access to the often-needed raw element (without :key applied) or to the iteration index.
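
To make the trade-off concrete, here is a minimal sketch of both approaches on a toy summing task. The names sum-elt and sum-reduce are made up for illustration; only elt, length, loop and reduce come from the standard.

```lisp
;; Hypothetical helpers, for illustration only.

;; elt + length + loop: keywords are handled by hand, and each ELT call
;; walks a list from its head, hence the quadratic behaviour on lists.
(defun sum-elt (seq &key (start 0) end key)
  (loop :for i :from start :below (or end (length seq))
        :for el := (elt seq i)
        :sum (if key (funcall key el) el)))

;; reduce: :START, :END and :KEY are simply forwarded.
(defun sum-reduce (seq &key (start 0) end key)
  (reduce #'+ seq :start start :end end :key key :initial-value 0))
```

Both accept lists and vectors alike, e.g. (sum-elt '(1 2 3 4) :start 1) and (sum-reduce #(1 2 3 4) :start 1) each return 9; only the second stays linear on long lists.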

So when I made a naïve max-elt (for this year's Advent of Code) and was then forced to rewrite it to use reduce, I told myself I had to build that Lovecraftian macro to handle the required nested monomorphization/specialization… so here's the beast:

;; Support :FROM-END ? Would add yet another level of specialization though...
(defmacro do-sequence ((var seq &key (start 0) end key with-index with-raw-elt) &body body)
  "Iterate on a sequence with the usual keywords available. :WITH-INDEX (resp. :WITH-RAW-ELT)
takes an unevaluated symbol to use as index (resp. element without :KEY applied).

NB: since iteration is done via `(LOOP ... :DO (LOCALLY ,@BODY))`, said body can use RETURN and
contain declarations."
  (once-only (seq start end key)
    (macrolet ((impl (type has-key-p &optional has-end-p)
                 `(let ((ivar (or with-index (gensym "IDX")))
                        (rvar (cond (with-raw-elt     with-raw-elt)
                                    (,(not has-key-p) var)
                                    (t                (gensym "RAW")))))
                    `(loop
                       ,@(ecase ,type
                           (list
                            `(,@(when (or ,has-end-p with-index)
                                  `(:for ,ivar :of-type (integer 0 #.array-dimension-limit)
                                    :from ,start ,@(when ,has-end-p `(:below ,end))))
                              :for ,rvar :in (nthcdr ,start ,seq)))
                           (vector
                            `(:for ,ivar :of-type (integer 0 #.array-dimension-limit)
                              :from ,start :below (or ,end (length ,seq))
                              :for ,rvar := (aref ,seq ,ivar))))
                       ,@(cond (,has-key-p   `(:for ,var := (funcall ,key ,rvar)))
                               (with-raw-elt `(:for ,var := ,rvar)))
                       :do (locally ,@body)))))
      `(etypecase ,seq
         (list
          (cond ((and ,key ,end) ,(impl 'list t   t))
                (,key            ,(impl 'list t   nil))
                (,end            ,(impl 'list nil t))
                (t               ,(impl 'list nil nil))))
         ((simple-array * (*)) ;; Same impl. as VECTOR, monomorphize for performance
          (if ,key
              ,(impl 'vector t)
              ,(impl 'vector nil)))
         (vector
          (if ,key
              ,(impl 'vector t)
              ,(impl 'vector nil)))))))

Except for its use of the de facto standard once-only macro, it is ANSI CL compliant. You can copy it under the Zlib license, if you so wish.
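
If you'd rather not pull in a utility library just for this, a simplified once-only along the lines of the one in Practical Common Lisp should be enough for do-sequence. This is a sketch under that assumption, not Alexandria's implementation (which also accepts (name initform) specs); the square macro is only there as a hypothetical demo.

```lisp
;; Simplified ONCE-ONLY, after the version in Practical Common Lisp.
(defmacro once-only ((&rest names) &body body)
  "Evaluate the forms held by NAMES once, in order, and rebind NAMES to
gensyms holding the results while BODY builds its expansion."
  (let ((gensyms (loop :for n :in names :collect (gensym))))
    `(let (,@(loop :for g :in gensyms :collect `(,g (gensym))))
       `(let (,,@(loop :for g :in gensyms :for n :in names
                       :collect ``(,,g ,,n)))
          ,(let (,@(loop :for n :in names :for g :in gensyms
                         :collect `(,n ,g)))
             ,@body)))))

;; Hypothetical demo macro: X's form is evaluated a single time.
(defmacro square (x)
  (once-only (x)
    `(* ,x ,x)))
```

With this definition, (square (incf n)) increments n once rather than twice, which is exactly the guarantee do-sequence needs for its seq, start, end and key arguments.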

Benchmarks

Now, for some benchmarks! Here I compare the core loop of the aforementioned max-elt, written first with reduce and then with do-sequence, on large sequences of different types. Needing both the index and the raw element somewhat complicates the code, which illustrates why I included :with-index and :with-raw-elt.

(deftype index () `(integer 0 ,array-dimension-limit))
(deftype end ()   '(or null index))
(deftype key ()   '(or symbol (function (t) t) null))
(deftype test ()  '(or symbol (function (t t) t)))

(declaim (ftype (function (sequence test &key (:start index) (:end end) (:key key))
                          (values t (or null index)))
                max-elt-reduce max-elt-do-seq))

(defun max-elt-reduce (seq pred &key (start 0) end key)
  (let ((i start) maxi max)
    (declare (type index i))
    ;; :INITIAL-VALUE makes the first call receive the first element as EL
    ;; while I = START; I is bumped only after each element is examined.
    (reduce (if key ;; Hoist NIL key checking
                (lambda (kmax el)
                  (let ((kel (funcall key el)))
                    (prog1 (if (or (= i start) (funcall pred kel kmax))
                               (progn (setf maxi i max el) kel)
                               kmax)
                      (incf i))))
                (lambda (kmax el)
                  (prog1 (if (or (= i start) (funcall pred el kmax))
                             (progn (setf maxi i max el) el)
                             kmax)
                    (incf i))))
            seq :start start :end end :initial-value nil)
    (values max maxi)))

(defun max-elt-do-seq (seq pred &key (start 0) end key)
  (let (maxi rmax max)
    (do-sequence (el seq :start start :end end :key key :with-index i :with-raw-elt r)
      (when (or (= i start) (funcall pred el max))
        (setf maxi i rmax r max el)))
    (values rmax maxi)))


(defconstant +len+ 5000000)

(let ((l   (make-sequence 'list   +len+ :initial-element 0))
      (sv  (make-sequence 'vector +len+ :initial-element 0))
      (fsv (make-array +len+ :element-type 'fixnum :initial-element 0))
      (fv  (make-array +len+ :element-type 'fixnum :initial-element 0 :adjustable t)))
  (format t "Test,~A~%" (lisp-implementation-type))
  (loop :for (args name) :in
        `(((,l   ,#'>) "LIST")
          ((,l   ,#'> :start 100 :end ,(- +len+ 100) :key ,#'1+) "LIST (fiddly)")
          ((,sv  ,#'>) "SIMPLE-VECTOR")
          ((,fsv ,#'>) "(SIMPLE-ARRAY FIXNUM)")
          ((,fv  ,#'>) "(VECTOR FIXNUM)"))
        :do (let ((tref (nth-value 1 (measure (apply #'max-elt-reduce args))))
                  (tnew (nth-value 1 (measure (apply #'max-elt-do-seq args)))))
              (format t "~A,~D → ~D (~@D%)~%"
                      name
                      (round (* tref 1000))
                      (round (* tnew 1000))
                      (round (* 100 (- (/ tref tnew) 1)))))))

And here are the results on some popular implementations (durations in ms):

Test	SBCL	CCL	ECL	CLISP
LIST	39 → 26 (+50%)	124 → 64 (+93%)	979 → 499 (+96%)	372 → 251 (+48%)
LIST (fiddly)	46 → 39 (+18%)	129 → 77 (+69%)	1186 → 703 (+69%)	450 → 408 (+10%)
SIMPLE-VECTOR	44 → 32 (+38%)	140 → 81 (+71%)	966 → 600 (+61%)	417 → 314 (+33%)
(SIMPLE-ARRAY FIXNUM)	44 → 31 (+42%)	141 → 81 (+73%)	954 → 608 (+57%)	417 → 315 (+32%)
(VECTOR FIXNUM)	51 → 37 (+38%)	177 → 121 (+46%)	960 → 635 (+51%)	422 → 331 (+27%)

And another, simpler case (position-all, you can guess what it does) to exercise compilers a bit differently:

Code

(deftype queue () '(cons cons list))
(defun make-queue ()
  (let ((q (cons nil nil)))
    (setf (car q) q)
    q))

(declaim (ftype (function (t queue) queue) push-queue)
         (inline push-queue))
(defun push-queue (obj queue)
  (let ((new-tail (list obj)))
    (setf (cdar queue) new-tail
          (car queue)  new-tail))
  queue)


(deftype index () `(integer 0 ,array-dimension-limit))
(deftype end ()   '(or null index))
(deftype key ()   '(or symbol (function (t) t) null))
(deftype test ()  '(or symbol (function (t t) t)))

(declaim (ftype (function (t sequence &key (:start index) (:end end) (:key key) (:test test))
                          list)
                position-all-reduce position-all-do-seq))

(defun position-all-reduce (obj seq &key (start 0) end key (test #'eql))
  (let ((i start))
    (declare (type index i))
    (cdr
     (reduce (if key ;; Hoist NIL key checking
                 (lambda (acc el)
                   (when (funcall test (funcall key el) obj)
                     (push-queue i acc))
                   (incf i) ;; Bump after use so that I is EL's own index
                   acc)
                 (lambda (acc el)
                   (when (funcall test el obj)
                     (push-queue i acc))
                   (incf i)
                   acc))
             seq :start start :end end :initial-value (make-queue)))))

(defun position-all-do-seq (obj seq &key (start 0) end key (test #'eql))
  (let ((acc (make-queue)))
    (do-sequence (el seq :start start :end end :key key :with-index i)
      (when (funcall test el obj)
        (push-queue i acc)))
    (cdr acc)))


(defconstant +len+ 5000000)

(let* ((sv  (let ((tmp (make-sequence 'vector +len+ :initial-element 0)))
              (loop :for i :from 0 :below +len+ :by 1000
                    :do (setf (aref tmp i) 42))
              tmp))
       (l   (coerce sv 'list))
       (fsv (make-array +len+ :element-type 'fixnum :initial-contents l))
       (fv  (make-array +len+ :element-type 'fixnum :initial-contents l :adjustable t)))
  (format t "Test,~A~%" (lisp-implementation-type))
  (loop :for (args name) :in
        `(((42   ,l) "LIST")
          ((42   ,l :start 100 :end ,(- +len+ 100) :key ,#'1+) "LIST (fiddly)")
          ((42  ,sv) "SIMPLE-VECTOR")
          ((42 ,fsv) "(SIMPLE-ARRAY FIXNUM)")
          ((42  ,fv) "(VECTOR FIXNUM)"))
        :do (let ((tref (nth-value 1 (measure (apply #'position-all-reduce args))))
                  (tnew (nth-value 1 (measure (apply #'position-all-do-seq args)))))
              (format t "~A,~D → ~D (~@D%)~%"
                      name
                      (round (* tref 1000))
                      (round (* tnew 1000))
                      (round (* 100 (- (/ tref tnew) 1)))))))

Test	SBCL	CCL	ECL	CLISP
LIST	30 → 11 (+173%)	58 → 18 (+224%)	879 → 351 (+150%)	291 → 147 (+98%)
LIST (fiddly)	33 → 21 (+57%)	63 → 28 (+122%)	907 → 538 (+69%)	352 → 301 (+17%)
SIMPLE-VECTOR	29 → 20 (+45%)	70 → 33 (+110%)	826 → 458 (+80%)	310 → 220 (+41%)
(SIMPLE-ARRAY FIXNUM)	29 → 21 (+38%)	71 → 35 (+106%)	855 → 477 (+79%)	314 → 219 (+43%)
(VECTOR FIXNUM)	40 → 26 (+54%)	108 → 70 (+54%)	835 → 468 (+78%)	321 → 228 (+41%)

Benchmarking details

Versions used: SBCL 2.5.11, CCL 1.13, ECL 24.5.10, CLISP 2.49.92 (bytecode compiler used)
Hardware: AMD 5900X, 64 GB DDR4
Software: Gentoo Linux, linux 6.12, gcc 15.2, glibc 2.41
Misc.: optimize was set to (speed 3) (safety 0) (debug 0) for the sake of benchmarking; speedups are comparable without it. measure does a full GC before starting its timer.
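
The measure macro itself isn't shown here; to reproduce the numbers, something along these lines should do. It is a guess at its shape, inferred only from how it is called above (elapsed seconds as the second value), and the GC calls only cover SBCL and CCL; other implementations need their own equivalent.

```lisp
;; Hypothetical reconstruction of MEASURE, not the original definition.
(defmacro measure (&body body)
  "Run BODY after a full GC (where supported); return its primary value
and the elapsed wall-clock time in seconds."
  (let ((start (gensym "START"))
        (result (gensym "RESULT")))
    `(progn
       ;; Full GC before the timer starts; implementation-specific.
       #+sbcl (sb-ext:gc :full t)
       #+ccl  (ccl:gc)
       (let* ((,start (get-internal-real-time))
              (,result (progn ,@body)))
         (values ,result
                 (/ (- (get-internal-real-time) ,start)
                    internal-time-units-per-second))))))
```

The second value is an exact rational, which the (round (* tref 1000)) calls in the benchmark harness turn into milliseconds.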

Conclusion

Well, isn't that welcome? In my opinion, both the performance and readability gains were worth the effort, especially outside of SBCL, which seems to have a particularly well-optimized reduce.

Beware of code bloat and slower compilation though: such deep specialization with loop macros as leaves isn't free, so don't inline unless you intend to exploit your compiler's DCE (dead code elimination) pass.
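
To illustrate that last point, here's a small hand-written sketch with the same dispatch shape, declared inline. sum-seq and sum-fixnum-vector are hypothetical names; whether a given compiler actually prunes the dead branches is implementation-dependent, so check the disassembly before relying on it.

```lisp
;; Hypothetical example: a hand-specialized function with the same
;; ETYPECASE + key/no-key shape as a DO-SEQUENCE expansion.
(declaim (inline sum-seq))
(defun sum-seq (seq &key key)
  (etypecase seq
    (list
     (if key
         (loop :for x :in seq :sum (funcall key x))
         (loop :for x :in seq :sum x)))
    (vector
     (if key
         (loop :for x :across seq :sum (funcall key x))
         (loop :for x :across seq :sum x)))))

;; Once inlined here, SEQ's type is declared and KEY is a constant NIL,
;; so a compiler with a DCE pass may keep only the last of the four loops.
(defun sum-fixnum-vector (v)
  (declare (type (simple-array fixnum (*)) v))
  (sum-seq v))
```

Without the type declaration and the constant key, every inlined call site would carry all four loops, which is the bloat to watch out for.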