Project Isidore Design Notebook

Divus Isidorus Hispalensis ora pro nobis


Updated: 2022-06-24 Fri 19:39

Project Isidore is a personal website written in LISP. This article simultaneously serves as project documentation and enables literate programming via org-babel.


1. Introduction

Welcome! This is the documentation webpage of my personal website. The canonical source code is located at GitHub - HanshenWang/project-isidore: Personal Web Application. For usage, please see the user manual and the navigation bar located at the top of the website.

Navigation buttons are inserted next to each header to take one back to the table of contents.

Copyright (c) 2021 Hanshen Wang. Source code is under the GNU-AGPL-3.0 License. Blog content is available under the CC-BY-SA 4.0 License unless otherwise noted.

2. Common Lisp Environment Setup

He is like to a man building a house, who digged deep and laid the foundation upon a rock. And when a flood came, the stream beat vehemently upon that house: and it could not shake it: for it was founded on a rock. But he that heareth and doth not is like to a man building his house upon the earth without a foundation: against which the stream beat vehemently. And immediately it fell: and the ruin of that house was great.

–Luke 6:48-49

Why lisp? There are always more poets than non-poets in the world, or so I've heard.

Practically speaking, a complete and recent tour of the LISP world for beginners has already been written: Steve Losh - Intro to Common lisp. Steve Losh characterizes the field of web development – at least in Anno Domini 2021 – not as a hamster wheel of backwards incompatibility, but a hamster centrifuge. A house of sand, indeed. In technical aspects, Paul Graham's writings convey any advantages better than I can.

Below are the steps I have taken to set up a Common Lisp environment inside Spacemacs.

  • CL the Language = ANSI INCITS 226-1994
  • CL Implementation = Steel Bank Common Lisp. If you've deep pockets or love reading about the history of Lisp implementations, see: The History of Franz and Lisp | IEEE Journals & Magazine | IEEE Xplore
  • CL Library/Package Manager = Quicklisp
  • CL System Manager/Build Tooling = Another System Definition Facility (comes bundled with SBCL)
  • CL IDE = Spacemacs with SLY
  • Install Steel Bank Common Lisp, a Common Lisp implementation.

    Other implementations of Common Lisp the language exist, but SBCL is the premier open-source implementation as of this writing.

    Installing SBCL from your Linux distribution's package manager is the most straightforward way. On Debian/Ubuntu this is as simple as

    sudo apt install sbcl
    

    Downloading and unpacking a binary from http://www.sbcl.org/getting.html or building from source are options for later discretion. In the shell,

    sbcl
    * (+ 2 3)
    * (quit)
    

    will produce the startup SBCL banner and REPL, evaluate an expression, and quit.

  • Install Quicklisp - the Common Lisp Package Manager

    In the shell,

    curl -O https://beta.quicklisp.org/quicklisp.lisp
    sbcl --load quicklisp.lisp
    ;; Inside the SBCL prompt,
    (quicklisp-quickstart:install)
    (ql:add-to-init-file) ; autoload into sbcl initialization
    (quit)
    

    After loading – via (ql:quickload :project-name) – a project, it will be stored locally. It will be in a directory similar to:

    /home/$USER/quicklisp/dists/quicklisp/software/example-project-2020-01-01-git
    

    Also, (ql:where-is-system 'system-name) will return the system's location. project-name and system-name are interchangeable here.

    In order to (ql:quickload :your-local-project), Quicklisp looks in /home/$USER/quicklisp/local-projects/ for said project. If you desire to keep the project in some other folder, symlink that folder into local-projects. The first argument to ln -s is the existing project directory and the second is the link to create.

    ln -s /home/$USER/project-a /home/$USER/quicklisp/local-projects/project-a
    

    Not of primary importance yet is a definition of what constitutes a Lisp package or a Lisp system. It will be explained in proper order during your introduction to the language. To be more specific, it is covered in Peter Seibel's excellent pedagogical work, Practical Common Lisp, and other guides are located here and here. N.B. that (ql:quickload :your-local-project) also calls (asdf:load-system :your-local-project). The difference is that (ql:quickload) will download any missing system dependencies.

  • Install Spacemacs common-lisp layer - a Common Lisp IDE

    The below steps assume you are already familiar with Spacemacs. Inside Spacemacs, SPC-h-SPC RET common-lisp RET and follow the layer README.

    For those who are understandably wary of Emacs, other IDE options exist. Lisp is one of the oldest high-level languages (beaten by FORTRAN by a few years). With that rich tradition comes a time when the underlying hardware, operating system, and developer tooling were unified under lisp. Unfortunately, the closest one can reasonably come today is Emacs, a lisp interpreter running on C/UNIX/hardware. Emacs still offers the best-in-class experience amongst open-source offerings, but the notorious learning curve of Emacs can be tempered with a preset configuration: Spacemacs. For those on a Microsoft Windows machine, I have written installation instructions.

  • Optional: Enable goto definition for SBCL primitives

    Download CL Implementation source files and extract them to the location specified by (sb-ext:set-sbcl-source-location), which is set in your user configuration dotfile: /home/$USER/.sbclrc. Add it if not already present,

    (sb-ext:set-sbcl-source-location "/usr/share/sbcl-source/")
    

    Now you can use g d to call jump-to-definition and go to the raw documentation: the source code.

  • Optional: Enable offline access to the CL HyperSpec language reference

    The full text of the ANSI Common Lisp Standard (1994) is available online in HTML form. To have the reference handy offline and to be able to browse it within Emacs, first download and extract the HTML source for HyperSpec 7.0.tar.gz from the great Internet Archive.

    Then we can configure Emacs to open only HyperSpec links inside the Emacs web browser EWW and also inform our Common Lisp IDE of the HyperSpec location.

    ;; Emacs does not expand "$USER" inside a string literal, so build the
    ;; file URL from the home directory instead.
    (setq common-lisp-hyperspec-root
          (concat "file://" (expand-file-name "~/HyperSpec/")))
    ;; Optionally, execute the HYPERSPEC-LOOKUP function with a temporary
    ;; binding so that HyperSpec links, and only those, open in EWW.
    (advice-add 'hyperspec-lookup
                :around
                (lambda (orig-fun &rest args)
                  (let ((browse-url-browser-function #'eww-browse-url))
                    (apply orig-fun args))))
    

    Now , h H will call sly-hyperspec-lookup to peruse the symbol at point with the HyperSpec. Note the default behavior of sly-hyperspec-lookup is to open a web browser at the online HyperSpec.
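
The distinction between a package and a system, mentioned in the Quicklisp step above, can be sketched in a few lines. This is only an illustration: the names demo-package, demo-system and hello are hypothetical and do not appear in Project Isidore. The sketch assumes nothing beyond SBCL and its bundled ASDF.

```lisp
;; A package is a namespace for symbols; it lives inside the Lisp image.
(defpackage #:demo-package          ; hypothetical name
  (:use #:cl)
  (:export #:hello))
(in-package #:demo-package)

(defun hello (name)
  "Return a greeting for NAME."
  (format nil "Hello, ~a!" name))

(in-package #:cl-user)

;; A system is ASDF's unit of loading: a named set of files and dependencies.
(require "asdf")
(asdf:defsystem "demo-system"       ; hypothetical name
  :description "Empty system used only for illustration."
  :depends-on ()                    ; names of other systems
  :components ())                   ; normally (:file "...") entries

;; ASDF:LOAD-SYSTEM only loads what is already on disk or registered.
;; QL:QUICKLOAD would additionally download missing dependencies.
(asdf:load-system "demo-system")
(demo-package:hello "Isidore")
```

A system may define zero, one or many packages; the two concepts are independent, which is why both names show up when loading a project.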

Newcomers to the language are advised to dive right into the recommended reading. Now's a good chance to use org-babel for some note-taking. If you've forked the repo to play around with, a high-level overview of the project may be useful.

2.1. Literate Programming

Lisp was the language for research into artificial intelligence before the AI winter. Now the field calls itself machine intelligence. The aim remains much like the story of Icarus: to ape the most noble aspect of man, his rational nature. Until that is realized, humans will be first and foremost the most important audience for any computer language or software. Here is how I set up literate programming with lisp.

You have likely heard of org-babel, an extension of org-mode that allows one to interleave text and code. It is comparable to a more powerful Jupyter notebook. To get lisp in org-babel blocks, add the following to your init.el/user-config.el:

(org-babel-do-load-languages
 'org-babel-load-languages
 '((lisp . t)))

When evaluating a code block with C-c C-c (, , in spacemacs), make sure to start SLIME first (M-x slime RET). If you use SLY instead, start it with M-x sly RET and set org-babel-lisp-eval-fn to sly-eval.

(princ "Hello World!")

Some may be familiar with poly-org, a MELPA package which allows multiple major modes in one buffer. Naturally this comes in handy when using literate programming. It uses font-lock-mode to turn on the relevant major mode when your cursor is inside said code block. This saves you from having to call org-edit-special repeatedly.

Furthermore, for most languages you can only evaluate the entire code block. Not so for lisp. M-x slime-compile-defun and M-x slime-compile-region do as they say on the tin: compile the specific function or highlighted region at cursor. Poly-org breaks these functions slightly as they do not treat #+begin_src and #+end_src as the start and end-of-file respectively. The following emacs lisp snippet fixes that.

(with-eval-after-load "poly-org"
  ;; sly-compile-file sends entire .org file. Narrow to span as done in poly-R
  ;; https://github.com/polymode/poly-org/issues/25
  (when (fboundp 'advice-add)
    (advice-add 'slime-compile-file :around 'pm-execute-narrowed-to-span)
    (advice-add 'slime-compile-defun :around 'pm-execute-narrowed-to-span)
    (advice-add 'slime-load-file :around 'pm-execute-narrowed-to-span)
    (advice-add 'slime-eval-defun :around 'pm-execute-narrowed-to-span)
    (advice-add 'slime-eval-last-expression :around 'pm-execute-narrowed-to-span)
    (advice-add 'slime-eval-buffer :around 'pm-execute-narrowed-to-span)))

Since we've already shaved the editor yak up to this point, the last package I'd like to mention that makes literate programming possible is org-tanglesync. To "tangle" a file is, in literate programming parlance, to extract just the source code from a document. To de-tangle (two-way sync) has always been a problem, and traditionally org-babel-detangle has relied upon unsightly link comments to do so.

Org-tanglesync keeps your .git controlled source code and your .org mode file in sync. And it does that without any markings or artifacts on the tangled code, making collaboration easier. If a more elegant solution to de-tangle a file exists out in the wild, please do let me know.

3. Project Isidore System Definition

Bibliographies exist for a written corpus of work. The same sort of metadata is needed for a code base. In Common Lisp, it is the .asd file.

Project Isidore follows the common Model-view-controller (MVC) design pattern. It entails encapsulating data together with its processing (the model) and isolating it from the manipulation (the controller) and presentation (the view) part that has to be done on a user interface.

The project follows the ASDF package-inferred-system style of using defpackage forms to specify file inter-dependencies. The entry point of the dependency graph is packages.lisp. To make the code base easier to understand, the files are named after the parts of the MVC design pattern.

For an index of symbols, functions and definitions, see the Reference Manual.
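
As a rough sketch of how the package-inferred style and the MVC split fit together, consider the following. All names here (demo, demo/model, demo/view, demo/controller and their functions) are hypothetical and are not taken from Project Isidore. In a real package-inferred system each defpackage would sit at the top of its own file, and ASDF would derive the load order from the :use and :import-from clauses. Here the three "files" are placed in one listing so that it can be evaluated as is.

```lisp
;;;; Hypothetical demo.asd, shown as a comment only:
;;;;   (asdf:defsystem "demo"
;;;;     :class :package-inferred-system
;;;;     :depends-on ("demo/controller"))

;;; model.lisp: data and its processing.
(defpackage #:demo/model
  (:use #:cl)
  (:export #:find-article))
(in-package #:demo/model)

(defparameter *articles* '((1 . "Design Notebook") (2 . "User Manual")))

(defun find-article (id)
  "Return the title stored under ID, or NIL."
  (cdr (assoc id *articles*)))

;;; view.lisp: presentation only.
(defpackage #:demo/view
  (:use #:cl)
  (:export #:render-article))
(in-package #:demo/view)

(defun render-article (title)
  "Render TITLE as an HTML heading."
  (format nil "<h1>~a</h1>" (or title "Not found")))

;;; controller.lisp: glue. These :import-from clauses are what ASDF
;;; would read to infer that this file depends on the other two.
(defpackage #:demo/controller
  (:use #:cl)
  (:import-from #:demo/model #:find-article)
  (:import-from #:demo/view #:render-article)
  (:export #:show-article))
(in-package #:demo/controller)

(defun show-article (id)
  "Look up article ID in the model and hand it to the view."
  (render-article (find-article id)))

(in-package #:cl-user)
```

Calling (demo/controller:show-article 1) walks through all three layers.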

3.1. Libraries & System dependencies

  1. Finding new Libraries & Surveying the Ecosystem

    Looking for a library? In addition to online search queries, use command(s)

    (ql:system-apropos :library) ; search for term in quicklisp dist
    (ql:who-depends-on :library) ; usage in lisp ecosystem
    (ql-dist:dependency-tree :library) ; number of dependencies upstream
    ;; See quicklisp-stats README for example usage.
    (ql:quickload "quicklisp-stats") ; look at quicklisp download stats
    (ql:quickload "quicksearch")
    (qs:quicksearch "bluetooth" :du 100)
    

    In addition, look to Sabracrolleton's detailed reviews of Common Lisp libraries. Another great hint to library quality is the Github page for Zach Beane, who does the thankless job of maintaining Quicklisp.

    Quicklisp in one respect is less like Javascript's Node Package Manager and more like Debian's apt. Zach makes sure that ALL libraries on Quicklisp build together. He takes this burden upon himself so the end users might avoid dependency hell.

    A helper to gather all lisp system's dependencies · GitHub

  2. System Dependency graph

    Call sly-eval-buffer on the following code block to update the graph.

    ;; https://40ants.com/lisp-project-of-the-day/2020/05/0063-asdf-viz.html
    ;; Inside shell "sudo apt install graphviz".
    (ql:quickload :cl-dot)
    ;; Not present in quicklisp, retrieve from https://github.com/guicho271828/asdf-viz
    (ql:quickload :asdf-viz)
    (ql:quickload :project-isidore) ; also loads cl-ppcre.
    (setf cl-dot:*dot-path*
          (string-trim '(#\space #\newline)
                       (second (ppcre:split " "
                                            (nth-value 0 (uiop:run-program "whereis -b dot"
                                                                           :output :string))))))
    ;; Tilde char "~" in destination pathname throws an error.
    (asdf-viz:visualize-asdf-hierarchy
     (asdf:system-relative-pathname "project-isidore"
                                    "assets/project-isidore-dependency-graph.png")
     (list (asdf:find-system :project-isidore)))
    ;; "asdf-viz" also can draw class hierarchies and call graphs.
    
    Figure 1: Project Isidore dependency graph

    Transitive dependencies & Lines of Code from running,

    cd ~/quicklisp/dists/quicklisp/software/
    find . -name '*.lisp' | xargs wc -l
    
    Date        Version (Commit)   # of Libraries   LOC
    2021-11-05  v1.1.0 (7cc0598)   35               251532
    2021-12-23  v1.2.0 (fd7c9f3)   42               264852

  3. Javascript Dependencies
    • Highlight.js 11.2.0. Upgrade via editing :html-head under org-publish-project-alist.
    • Mathjax 2.7.0. See C-h v org-html-mathjax-options and org-html-mathjax-template.
  4. Lisp Web Example Projects & Misc. Resources

    Cool libraries to check out:

    Library         Author                 Function
    CL-CSV          Edward Marco Baringer  Spreadsheet
    CLOG            David Botton           Websockets
    CL-Unification  Marco Antoniotti       Generalized pattern matching
    Cells           Kenny Tilton           Dataflow (GUI)
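
The "# of Libraries" column under the System Dependency graph heading counts transitive dependencies. One way such a count could be obtained from inside the Lisp image is to walk ASDF's dependency declarations. The function below is a minimal sketch, not the method used to produce the table. The function name and the three demo systems are hypothetical, and conditional or versioned dependency specs are skipped for brevity.

```lisp
(require "asdf")

(defun transitive-dependencies (system-name)
  "Return a sorted list of the names of all systems SYSTEM-NAME depends on,
directly or indirectly."
  (let ((seen '()))
    (labels ((walk (name)
               (dolist (dep (asdf:system-depends-on (asdf:find-system name)))
                 ;; Plain names only; list specs such as (:version ...)
                 ;; or (:feature ...) are ignored in this sketch.
                 (when (or (stringp dep) (symbolp dep))
                   (let ((dep-name (asdf:coerce-name dep)))
                     (unless (member dep-name seen :test #'string=)
                       (push dep-name seen)
                       (walk dep-name)))))))
      (walk system-name))
    (sort seen #'string<)))

;; Three empty, hypothetical systems forming a chain: demo-a -> demo-b -> demo-c.
(asdf:defsystem "demo-c")
(asdf:defsystem "demo-b" :depends-on ("demo-c"))
(asdf:defsystem "demo-a" :depends-on ("demo-b"))

(transitive-dependencies "demo-a") ; => ("demo-b" "demo-c")
```

With the project loaded, (length (transitive-dependencies "project-isidore")) would give a comparable figure, though it need not match the table exactly.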

4. Org Publish Pipeline

In the /assets/blog/ folder, all HTML files are generated by org-publish. archive.html lists all published articles.

In Spacemacs you will need to enable the org layer. Org Mode comes with Org Publish built in. Org Publish takes advantage of Org's excellent export capabilities to generate not only the HTML but also the sitemap and RSS.

To publish/update an article:

  1. Prepare draft/existing notes for publishing, move .org file to input folder, remove extraneous links, update citations, check formatting etc.
  2. M-x org-publish RET blog RET
  3. Commit and push changes in git

My Org Publish config is part of my literate .spacemacs org file. I plan to publish the entire dotfile one day, similar to Org Mode - Organize Your Life In Plain Text! Until then, here is all the relevant configuration extracted.

Other exemplars here: good tutorials on org-publish : emacs, New and Improved: Two-Wrongs Now Powered By Org Mode

;;-------------------------------------------------------------------------
;; *** Org Ox Publish Config
;;-------------------------------------------------------------------------
(require 'ox-rss)
;; Following 2 lines are needed to exclude parent heading from table of contents but still export the content
;; https://emacs.stackexchange.com/questions/30183/orgmode-export-skip-ignore-first-headline-level
(require 'ox-extra)
(ox-extras-activate '(ignore-headlines))
;; Allows exporting bibtex citations to html
(require 'ox-bibtex)
;; Exclude default CSS from html export and add external stylesheet
(setq org-html-head-include-default-style nil)
;; Omit inline css as we use an imported stylesheet
(setq org-html-htmlize-output-type 'css)
;; https://www.taingram.org/blog/org-mode-blog.html
(setq org-export-global-macros
      '(("timestamp" . "@@html:<span class=\"timestamp\">[$1]</span>@@")))
(defun my/org-sitemap-date-entry-format (entry style project)
  "Format sitemap ENTRY of org-publish PROJECT as a dated link; STYLE is unused."
  (let ((filename (org-publish-find-title entry project)))
    (if (= (length filename) 0)
        (format "*%s*" entry)
      (format "{{{timestamp(%s)}}} [[file:%s][%s]]"
              (format-time-string "%Y-%m-%d"
                                  (org-publish-find-date entry project))
              entry
              filename))))
(setq org-publish-project-alist
      '(("blog"
         :components ("blog-content" "blog-rss"))
        ("blog-content"
         :base-directory "~/Dropbox/project-maria/blog"
         :html-extension "html"
         :base-extension "org"
         :recursive t
         :publishing-function org-html-publish-to-html
         :publishing-directory "~/project-isidore/assets/blog"
         :section-numbers t
         :table-of-contents t
         :exclude "rss.org"
         :with-title nil
         :auto-sitemap t
         :sitemap-filename "archive.org"
         :sitemap-title "Blog Archive"
         :sitemap-sort-files anti-chronologically
         :sitemap-style tree
         :sitemap-format-entry my/org-sitemap-date-entry-format
         ;; Use HTML5
         ;; https://orgmode.org/manual/HTML-doctypes.html#HTML-doctypes
         :html-doctype "html5"
         :html-html5-fancy t
         ;; Link to external custom stylesheet
         ;; If you need code highlight from highlight.js, include the latter three lines.
         :html-head "
                      <link rel=\"stylesheet\" type=\"text/css\" href=\"../global.css\"/>
                      <link rel=\"stylesheet\"
                            href=\"//cdnjs.cloudflare.com/ajax/libs/highlight.js/11.2.0/styles/base16/solarized-light.min.css\">
                      <script src=\"//cdnjs.cloudflare.com/ajax/libs/highlight.js/11.2.0/highlight.min.js\" defer></script>
                      <script>var hlf=function(){Array.prototype.forEach.call(document.querySelectorAll(\"pre.src\"),function(t){var e;e=t.getAttribute(\"class\"),e=e.replace(/src-(\\w+)/,\"src-$1 $1\"),console.log(e),t.setAttribute(\"class\",e),hljs.highlightBlock(t)})};addEventListener(\"DOMContentLoaded\",hlf);</script>"
         :html-preamble "
                                    <div class=\"header header-fixed\">
                                      <div class=\"navbar container\">
                                        <div class=\"logo\"><a href=\"/\">Hanshen Wang</a></div>
                                        <input type=\"checkbox\" id=\"navbar-toggle\" >
                                        <label for=\"navbar-toggle\"><i></i></label>
                                        <nav class=\"menu\">
                                          <ul>
                                            <li><a href=\"/about\">About</a></li>
                                            <li><a href=\"/work\">Work</a></li>
                                            <li><a href=\"/blog/archive.html\">Blog</a></li>
                                            <li><a href=\"/contact\">Contact</a></li>
                                          </ul>
                                        </nav>
                                      </div>
                                    </div>
                                    <h1 class=\"title\">%t</h1>
                                    <p class=\"subtitle\">%s</p> <br/>
                                    <p class=\"updated\"><a href=\"/contact#article-history\">Updated:</a> %C</p>"

         ;; Article Postamble includes
         ;; Javascript snippet to insert anchor links to Table of Contents
         ;; HTML Footer
         :html-postamble "<script>
                              const headers = Array.from( document.querySelectorAll('h2, h3, h4, h5, h6') );

                              headers.forEach( header => {
                                header.insertAdjacentHTML('afterbegin',
                                 '<a href=\"#table-of-contents\">&#8689;</a>'
                                );
                              });
                              </script>
                              <hr/>
                              <footer>
                                <div class=\"copyright-container\">
                                  <div class=\"copyright\">
                                    Comments? Corrections? <a
                                href=\"https://hanshenwang.com/contact\">
                                Please do reach out.</a><a
                                href=\"https://hanshenwang.com/blog/rss.xml\">
                                RSS Feed. </a><a
                                href=\"https://hanshenwang.com/subscribe\">
                                Mailing List. </a><br/>
                                    Copyright &copy; 2021 Hanshen Wang. Some Rights Reserved.<br/>
                                    Blog content is available under
                                    <a rel=\"license\" href=\"http://creativecommons.org/licenses/by-sa/4.0/\">
                                      CC-BY-SA 4.0
                                    </a> unless otherwise noted.
                                  </div>
                                  <div class=\"cc-badge\">
                                    <a rel=\"license\" href=\"http://creativecommons.org/licenses/by-sa/4.0/\">
                                      <img alt=\"Creative Commons License\"
                                           src=\"https://i.creativecommons.org/l/by-sa/4.0/88x31.png\"
                                           height=\"31\"
                                           width=\"88\">
                                    </a>
                                  </div>
                                  <div class=\"rss-badge\">
                                    <a rel=\"license\" href=\"http://hanshenwang.com/blog/rss.xml\">
                                      <img alt=\"Really Simple Syndication - RSS\"
                                           src=\"https://upload.wikimedia.org/wikipedia/en/thumb/4/43/Feed-icon.svg/50px-Feed-icon.svg.png\"
                                           height=\"50\"
                                           width=\"50\">
                                    </a>
                                  </div>
                                </div>

                                <div class=\"generated\">
                                  Created with %c on <a href=\"https://www.gnu.org\">GNU</a>/<a href=\"https://www.kernel.org/\">Linux</a>
                                </div>
                              </footer>"
         )
        ("blog-rss"
         :base-directory "~/Dropbox/project-maria/blog"
         :base-extension "org"
         :publishing-directory "~/project-isidore/assets/blog"
         :publishing-function publish-posts-rss-feed
         :rss-extension "xml"
         :html-link-home "http://hanshenwang.com/"
         :html-link-use-abs-url t
         :html-link-org-files-as-html t
         :exclude "archive.org"
         :auto-sitemap t
         :sitemap-function posts-rss-feed
         :sitemap-title "Hanshen Wang Blog RSS"
         :sitemap-filename "rss.org"
         :sitemap-style list
         :sitemap-sort-files anti-chronologically
         :sitemap-format-entry format-posts-rss-feed-entry)
        ))
;; https://alhassy.github.io/AlBasmala#Clickable-Headlines
(defun my/ensure-headline-ids (&rest _)
  "Give every Org tree that lacks a CUSTOM_ID one derived from its headline.
All non-alphanumeric characters are replaced with ‘-’.

If multiple trees end up with the same id property, issue a
message and undo any property insertion thus far.

E.g., ↯ We'll go on a ∀∃⇅ adventure
   ↦  We'll-go-on-a-adventure"
  (interactive)
  (let ((ids))
    (org-map-entries
     (lambda ()
       (org-with-point-at (point)
         (let ((id (org-entry-get nil "CUSTOM_ID")))
           (unless id
             (thread-last (nth 4 (org-heading-components))
               (s-replace-regexp "[^[:alnum:]']" "-")
               (s-replace-regexp "-+" "-")
               (s-chop-prefix "-")
               (s-chop-suffix "-")
               (setq id))
             (if (not (member id ids))
                 (push id ids)
               (message-box "Oh no, a repeated id!\n\n\t%s" id)
               (undo)
               (setq quit-flag t))
             (org-entry-put nil "CUSTOM_ID" id))))))))

;; Whenever html & md export happens, ensure we have headline ids.
(advice-add 'org-html-export-to-html   :before 'my/ensure-headline-ids)
(advice-add 'org-md-export-to-markdown :before 'my/ensure-headline-ids)
;; https://nicolasknoebber.com/posts/blogging-with-emacs-and-org.html
(defun format-posts-rss-feed-entry (entry _style project)
  "Format ENTRY for the posts RSS feed in PROJECT."
  (org-publish-initialize-cache "blog-rss")
  (let* ((title (org-publish-find-title entry project))
         (link (concat "blog/" (file-name-sans-extension entry) ".html"))
         (author (org-publish-find-property entry :author project))
         (pubdate (format-time-string (car org-time-stamp-formats)
                                      (org-publish-find-date entry project))))
    (message pubdate)
    (format "%s
                :properties:
                :rss_permalink: %s
                :author: %s
                :pubdate: %s
                :end:\n"
            title
            link
            author
            pubdate)))
(defun posts-rss-feed (title list)
  "Generate a sitemap of posts that is exported as a RSS feed.
TITLE is the title of the RSS feed.  LIST is an internal
representation for the files to include."
  (concat
   "#+TITLE: " title "\n#+EMAIL: [email protected]" "\n\n"
   (org-list-to-subtree list)))
(defun publish-posts-rss-feed (plist filename dir)
  "Publish PLIST to RSS when FILENAME is rss.org.
                DIR is the location of the output."
  (if (equal "rss.org" (file-name-nondirectory filename))
      (org-rss-publish-to-rss plist filename dir)))

5. Development Operations

Project Isidore follows the industry practice of automating part of development operations. In the context of this project, CI/CD is done on the GitHub Actions platform. Cheers to GitHub for allowing unlimited build minutes for open source projects! In the following breakdown, I explain how to run the steps on the local machine.

For the uninitiated, an excellent git porcelain (with spacemacs layer integration) is Magit. This, paired with [1], takes care of your version control needs.

5.1. Continuous Integration

Continuous integration provides some measure of quality assurance through unit, integration and regression testing upon each commit. This is done in MAKE.LISP; see also TESTS.LISP.

5.1.1. Testing

Project Isidore code coverage report.

There does exist a detailed evaluation of the many existing Common Lisp testing frameworks, the most recent iteration by Sabracrolleton. From this we select parachute as our testing framework of choice.

Project Isidore code aims to be portable Common Lisp within reason. Common Lisp Portability Library Status

Notes on Drakma

From: https://courses.cs.northwestern.edu/325/admin/json-client.php

Install Drakma

This should be easy.

(ql:quickload "drakma")

Using Drakma

Getting data from URLs with Drakma is simple:

(drakma:http-request url)

url needs to be a string containing a complete URL. Drakma will send a request for that URL to the server indicated, just as a browser or other user client would.

drakma:http-request returns seven values, as Lisp multiple values. If you need to save or use values other than the first, use multiple-value-bind or a similar form. The values returned, in order, are

  1. the body of the reply, either a string, when getting HTML or plain text, a binary array for images, audio, and JSON, or a file stream, if requested using a keyword parameter
  2. the HTTP status code
  3. an alist of the headers sent by the server
  4. the URI the reply comes from, which might be different than the request when redirects occur
  5. the stream the reply was read from
  6. a boolean indicating whether the stream should be closed
  7. the server's status text, e.g., "OK"
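
To show how those seven values might be picked apart with multiple-value-bind, here is a small sketch. So that it runs without a network connection or Drakma itself, it uses a hypothetical stand-in, fake-http-request, that returns seven values in the order listed above. fetch-json-text is likewise a made-up helper name. The octet-to-string conversion is a simplification that is only right for ASCII or Latin-1 bodies.

```lisp
(defun fake-http-request (url)
  "Hypothetical stand-in for DRAKMA:HTTP-REQUEST: same value order, canned data."
  (values (map '(vector (unsigned-byte 8)) #'char-code "{\"ok\":true}") ; body
          200                                       ; status code
          '((:content-type . "application/json"))   ; headers alist
          url                                       ; URI of the reply
          nil                                       ; stream
          t                                         ; close the stream?
          "OK"))                                    ; status text

(defun fetch-json-text (url &optional (requester #'fake-http-request))
  "Call REQUESTER on URL and return the body as a string, the status code
and the status text."
  (multiple-value-bind (body status headers uri reply-stream must-close reason)
      (funcall requester url)
    (declare (ignore headers uri reply-stream must-close))
    (unless (= status 200)
      (error "Request to ~a failed: ~a ~a" url status reason))
    (values (if (stringp body)
                body
                ;; JSON arrives as octets; this naive decoding assumes
                ;; every octet is a single ASCII/Latin-1 character.
                (map 'string #'code-char body))
            status
            reason)))

(fetch-json-text "https://example.com/api")
```

Once Drakma is loaded, passing #'drakma:http-request as the second argument should exercise the same code path against a real server.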

Debugging

From a former maintainer of SBCL, Nikodemus Siivola. The tracing and stickers functionality in the SLY IDE is also very useful. Call sly-compile-defun with a universal argument C-u to recompile with highest debug settings.

  1. Test case reduction is an essential skill, no matter the language or the environment. It is the art of reducing the code that provokes the issue (wrong result, an error, whatever) down to manageable size – including the full call path involved, and environmental issues like Slime vs no Slime, or current directory. The smaller the better in general, but it is a balancing act: if you can identify the issue using other methods in five minutes, it doesn't make sense to spend an hour or two boiling down the test case. …but when other methods are not producing results, and time is dragging on then you should consider this.
  2. Defensive programming. Not just coding to protect against errors, but coding so that your code is easy to debug. Avoid IGNORE-ERRORS and generally swallowing errors silently. Sometimes it is the right thing to do, but the more you do it, the harder BREAK-ON-SIGNALS becomes to use when you need it. Avoid SAFETY 0 like the plague – it can hide a multitude of sins. Avoid DEBUG 0 – it doesn't pay. Write PRINT-OBJECT methods for your objects, give your objects names to use when printing. Check that slots are bound before you use them in your PRINT-OBJECT methods. NEVER use INTERRUPT-THREAD or WITH-TIMEOUT unless you really know what you are doing and exactly why I'm telling you not to use them.
  3. Stop to think. Read the error messages, if any, carefully. Sometimes they're no help at all, but sometimes there are nuggets of information in them that a casual glance will miss.
  4. Know your environment. (This is what the question was really about, I know…)

3.0. M-. is gold. Plain v on a backtrace frame in Slime may also take you places if your code has a sufficiently high debug setting, but M-. should work pretty much always.

3.1. The Slime Inspector is one of my primary debugging tools – but that is probably affected by the kind of code I work on, so it might not be the same for everyone. Still, you should familiarize yourself with it – and use the fancy one. :)

3.2. While SBCL backtraces aren't at the moment the pretties ones in the world, try to make sense out of them. Just do (error "foo") in the REPL, and figure out what is going on. Experiment with both the plain SBCL debugger and the fancy Slime Debugger before you need to use them for real. They'll feel a lot less hostile that way. I'll write advice on interpreting the backtraces at another time.

3.3. Learn about BREAK-ON-SIGNALS and TRACE. Also note the SBCL extensions to TRACE.

3.4. The stepper isn't really a debugging tool, IMO – it is a tool for understanding control flow, which sometimes helps in debugging – but if you compile your code with DEBUG 3, then (STEP (FOO)) can take you to places.

3.5. Learn about M-RET (macroexpand) in Slime. Discover what happens if you do (SETF PRINT-GENSYM NIL) first, and understand the potential danger there – but also the utility of being easily able to copy-paste the expansion into your test case when you're trying to reduce it. (Replacing expansions of macros in the COMMON-LISP package is typically pointless, but replacing those from user packages can be golden.)

3.6. If all else fails, do (sb-ext:restrict-compiler-policy 'safety 3) and (sb-ext:restrict-compiler-policy 'debug 3) and recompile your code. Debugging should be easier now. If the error goes away, either (a) you had a type-error or similar in SAFETY 0 code that was breaking stuff but is now swallowed by an IGNORE-ERRORS or a HANDLER-CASE or (b) you may have found on SBCL bug: compiler policy should not generally speaking change codes behaviour – though there are some ANSI mandated things for SAFETY 3, and high DEBUG can inhibit tail-call optimizations which, as Schemers know, can matter.

3.7. DISASSEMBLE isn't normally a debugging tool, but sometimes it can help too. Depends on what the issue is.

4. Extensions to printf() debugging. Sometimes this is just the easiest thing. No shame in there.

4.1. Special variables are nice.

(LET ((*DEBUG* :MAGIC)) …) and elsewhere (WHEN (EQ :MAGIC *DEBUG*) (PRINT (LIST :FOO FOO)))

Because I'm lazy, I tend to use * as the magic variable, so I can also trivially set it in the REPL. This allows you to get the debugging output for stuff you are interested in only when certain conditions are true. Or you can use it to tell which code path the call is coming from, etc.

4.2. Don't be afraid to add (break "foo=~S" foo) and similar calls to the code you're debugging.

4.3. SB-DEBUG:BACKTRACE can sometimes be of use.
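
The special-variable trick from 4.1 can be sketched as follows. This is a minimal illustration, not code from Project Isidore: the names *DEBUG*, PARSE-VERSE, and INTERESTING-CALLER are hypothetical, and the debug output is sent to *TRACE-OUTPUT* so it stays out of the function's normal output.

```lisp
(defvar *debug* nil
  "Hypothetical switch: bind to :MAGIC to enable debug output.")

(defun parse-verse (string)
  "Hypothetical worker function containing a conditional debug print."
  (let ((n (parse-integer string :junk-allowed t)))
    ;; Prints only when some caller has dynamically bound *DEBUG* to :MAGIC.
    (when (eq :magic *debug*)
      (print (list :string string :n n) *trace-output*))
    n))

(defun interesting-caller (string)
  "Only calls that come through this code path produce debug output."
  (let ((*debug* :magic))
    (parse-verse string)))
```

Calling (parse-verse "23") directly stays silent, while (interesting-caller "23") prints the debug list. Because *DEBUG* is a special variable, the binding is visible in everything called inside the LET and disappears again when the LET exits.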

Logging

Heroku will store your application logs for you. Logging is a step above print debugging and can be thought of as "live" documentation. If you don't have the interactivity of the LISP REPL for diagnosis as physician, then you must debug through diagnosis as mortician. Good logs are as useful as detailed blood stains.

Slava Akhmechet asked:

I'm looking through the Hunchentoot log and occasionally I see the

couldn't write to #<SB-SYS:FD-STREAM for "a Connection reset by peer
couldn't write to #<SB-SYS:FD-STREAM for "a Broken pipe

I don't see any problems in the browser, I only found out about these messages because I looked at the log. Can someone think of a situation in which these errors would occur without interrupting user experience and why?

Edi Weitz replied (2007-05-11 19:45:03 UTC):

The user pressed "Stop" or went to another website before the page (including all images and other stuff) was fully loaded. "Broken pipe" and "Connection reset by peer" are pretty common error messages you'll find (for example) in every Apache logfile - this is nothing specific to Hunchentoot or Lisp.

HTH, Edi.

Generate Code Coverage Report
;;; Generate Code Coverage Report
;; SBCL specific contrib module.
(require :sb-cover)
;; Compiler knob: turn on generation of coverage instrumentation.
(declaim (optimize sb-cover:store-coverage-data))
;; Load while ensuring that it's recompiled with the new optimization
;; policy.
(asdf:load-system :project-isidore :force t)
(asdf:test-system :project-isidore)
;; HTML report output location
(sb-cover:report "../assets/code-coverage-report/")
;; Turn coverage instrumentation back off for subsequent compilation.
(declaim (optimize (sb-cover:store-coverage-data 0)))

5.2. Continuous Delivery

An executable binary is built for all 3 major operating system platforms on the x86-64 architecture. Binaries built this way may be downloaded through the GitHub release interface. This is done as the final step of MAKE.LISP by calling (asdf:make "system-name"). Users of more esoteric architecture/OS pairs will have to compile from source; refer to the SBCL compiler supported platform table.

Lisp being amenable to image based development means an executable binary in the Lisp world saves the entire global state (the stack is unwound), libraries and SBCL runtime included.
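
As a rough sketch of how (asdf:make "system-name") can produce an executable, a system definition may ask ASDF for its program-op. The system below is hypothetical (the name hello-isidore, the file main.lisp, and the MAIN entry point are assumptions for illustration) and is not Project Isidore's actual .asd file.

```lisp
;; hello-isidore.asd -- hypothetical system, for illustration only.
(asdf:defsystem "hello-isidore"
  :description "Toy system illustrating ASDF's program-op."
  :components ((:file "main"))          ; main.lisp is assumed to define HELLO-ISIDORE:MAIN
  :build-operation "program-op"         ; makes ASDF:MAKE dump an executable image
  :build-pathname "hello-isidore"       ; file name of the resulting binary
  :entry-point "hello-isidore:main")    ; function called when the binary starts

;; From a build script, once ASDF can find the system:
;; (asdf:make "hello-isidore")
```

On SBCL the resulting file bundles the runtime together with the saved image, which is why no separate Lisp installation is needed on the target machine.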

5.3. Continuous Deployment

Commit    Company                                               Service
N/A       Heroku                                                Platform (Buildpacks)
603da4f   Fly.io                                                Platform (Docker)
1c7a8e9   Oracle Cloud Infrastructure                           Virtual Private Server
TBA       Multiple old Thinkpads                                Physical Server(s)
TBA       Electronic scrap, a soldering iron, and oscilloscope  FULL STACK

Having to go through initial setup of a new computer dampens the otherwise joyous occasion of unwrapping new hardware. The pain is multiplied if it happens to be one's job to manage many computers. In its most general form, nerds around the world are busy tackling the problem of software reproducibility. NixOS and Guix System are the ones currently well-known. I would prefer Guix over NixOS myself on account of preferring Guile Scheme over the Nix Expression Language (self-admittedly designed with the goal of not being a full-featured, general purpose language). I was not able to get Guix System (version 1.3.0) working on Oracle's aarch64 A1-Flex VMs. Please email me if you have. In the meantime I am using and enjoying consfigurator for declarative server configuration, at least avoiding another Tower of Babel situation.

Current infrastructure diagram:

Note to self, before scaling this solution horizontally, consider the COST of doing so.

Figure 2: Project Isidore Infrastructure (pi-infra.svg)

Prerequisites:

  1. Deploy Project Isidore

    Root SSH access to a Debian machine is needed. Whether this is a physical server or a virtual private server through a cloud provider is up to you. I have a guided setup on Oracle Cloud Infrastructure.

    Read PRODUCTION.LISP to properly supply SSH credentials and SSL keys.

  2. DNS Resolution - associate domain name with public IP addresses

    Optional: Enable IPv6 on Oracle Cloud Infrastructure.

    Visitors to our website won't want to key in http://140.291.294.154:8080. Our Domain Registrar will point http://my-domain-name.com to http://140.291.294.154. Purchase a domain name if needed.

    I have no affiliation with any Domain Registrar companies, but know that Cloudflare offers at-cost (zero-margin) pricing for domain registration. I will also be making use of their excellent yet generous services in the next step and have done so for years. They seem like an ethical company from what I can gather from their blogs. Oracle has my gratitude, but my recommendation of Cloudflare comes without reservations.

    Cloudflare DNS Records is straightforward to use. Search online for current methods to create DNS A records (and/or AAAA records for IPv6) specific to your Domain Registrar. Port 80 is the default port for HTTP traffic, so entering http://my-domain-name.com is equivalent to entering http://my-domain-name.com:80. If you don't want visitors to have to key in http://my-domain-name.com:8080/ then either read on, use OCI's docs to redirect a subdomain to a URL with a port number, or configure your web server to listen on port 80, which on UNIX is a privileged port. Cloudflare can also redirect all HTTP requests to HTTPS, instead of doing this in the server's Nginx configuration. This allows me to close port 80 and only have port 443 exposed to the public.

  3. Cloudflare

    Cloudflare here acts as a reverse proxy (man-in-the-middle) between the website visitor (client) and the server; it provides Distributed Denial of Service (DDoS) protection as well as Content Delivery Network (CDN) caching services.

    Now by registering your domain with Cloudflare your domain will be served as HTTPS to the client. But communication from Cloudflare to our server is still via unencrypted HTTP. We remedy this by using Origin Certificates.

    After setting up Authenticated Origin Pulls the origin server now only accepts HTTP requests that use Cloudflare's valid client certificate. There will be no more directly connecting to the web server via IP address.

    This helps Cloudflare's Web Application Firewall and its blocking of automated WordPress vulnerability scanning. WordPress powers 43% of all websites. This means any public website will face a barrage of probing attacks targeting WordPress vulnerabilities. As an example, use Web Application Firewall to block all requests with URI paths containing "php" or "wp-includes" from reaching Nginx.

  4. Developing a remote LISP image

    Of course, with incremental redefinition in the LISP REPL and our object database, Rucksack, supporting update-instance-for-redefined-class, it is fun to imagine the flexibility we would gain by connecting a local sly to our production LISP image. Doing so gives us great observation capabilities and is a cool technique to add to our toolbox. Ron Garret's story of Debugging code from 60 million miles away, by now the stuff of internet legend, continues to serve as an inspiration.

    SLY User Manual, version 1.0.42

    For an example of how this application starts a slynk server see da1f7fa. There is no need to allow ingress on OCI's virtual cloud network or in firewalld for port 4005. Port 4005 on the local machine is forwarded to port 4005 on the remote machine through an SSH tunnel over port 22. Create the SSH tunnel on the local machine,

    ssh -L4005:localhost:4005 pi-server
    

    M-x sly-connect RET "localhost" RET 4005 RET.

    I was playing around with connecting sly to a remote LISP image. I wondered if slynk:create-server would still work after sb-ext:save-lisp-and-die. There's no reason to think it wouldn't, but I had to move the slynk:create-server form into the toplevel function passed to sb-ext:save-lisp-and-die. Otherwise "sudo ss -tulpn | grep LISTEN" didn't show the open port. I mention this in case it saves somebody else some time.

  5. Crack open a cold one.

    Optional: with the boys.
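
Returning to the slynk note in step 4: a saved SBCL image does not carry running threads or open sockets with it, so the server has to be started when the binary boots rather than at build time. The following is a minimal sketch under that assumption. The function names START-SLYNK-IF-AVAILABLE and MAIN are hypothetical and this is not the code from commit da1f7fa. Slynk is looked up at run time so that the snippet also loads in an image without it.

```lisp
(defun start-slynk-if-available (&key (port 4005))
  "Start a Slynk server on PORT if the SLYNK package is loaded, else return NIL."
  (let* ((package (find-package :slynk))
         (create-server (and package (find-symbol "CREATE-SERVER" package))))
    (when (and create-server (fboundp create-server))
      ;; :DONT-CLOSE T keeps the listener open after a client disconnects.
      (funcall create-server :port port :dont-close t))))

(defun main ()
  "Hypothetical toplevel function handed to SAVE-LISP-AND-DIE."
  (start-slynk-if-available :port 4005)
  ;; ... start the web server and block here ...
  )

;; Build step, run once in the build image (it terminates that image):
;; (sb-ext:save-lisp-and-die "project-isidore" :toplevel #'main :executable t)
```

With the SSH tunnel from step 4 in place, M-x sly-connect to localhost:4005 should then reach the server started by MAIN.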

5.3.1. Oracle Cloud Infrastructure

Oracle Corporation offers the most generous free tier in cloud computing (Infrastructure-as-a-Service) by far. I speculate that this is due to Amazon Web Services (AWS) capturing the biggest part of the pie, with Oracle competing with the likes of Google and IBM. Oracle historically has had disputes over their stewardship of Open Source software (Java/JVM patent issues, MySQL, OpenOffice, and Solaris) and – with a significant portion of their clients being Fortune 500 companies – you could say that it represents the best and worst parts of litigation-happy corporate America. But I'm not interested in traveling down the road to serfdom (love some Thomas Sowell as well) and neither am I interested in preaching on the idolatrous love of money. What I will relate below is my experience with Oracle Cloud Infrastructure (OCI), which has thus far been very positive. Thank you Oracle, OCI has been great to use. Create an account to get started.

See also:

Run Always Free Docker Container on Oracle Cloud Infrastructure | by Lucas Je…

Oracle cloud free tier quirks - The Tallest Dwarf

  1. Create a Compute instance

    Bring your own Image - Debian Stable.

    Download the correct .qcow2 file for your system architecture (ARM64 in our case). In OCI search for Storage > Object Storage & Archive > Buckets, and upload the file into a bucket. Go to Compute > Custom Images and click on Import Image. Make sure "Paravirtualized Mode" is selected. After importing, click Edit Details and tick the box for "VM.Standard.A1.Flex". Edit Custom Image Capabilities to ensure UEFI64 is preferred.

    Top left hamburger menu > Compute > Instances > Create Instance

    Select the imported Debian image and a shape labeled "always free".

    After the "Image and Shape" section is the "Networking" section. The defaults are fine here. Rename if desired and take note of the Virtual Cloud Network (VCN) name and subnet name.

    Under the subheading "Add SSH keys" we can choose to copy and paste the contents of a public key. Generate it like so:

    ssh-keygen -t ed25519 -a 100 -N "" -C "oracle-vm-key" -f ~/.ssh/oracle-vm-key
    

    The file oracle-vm-key contains the private key (-N "" means no passphrase protection). The file oracle-vm-key.pub contains the public key that we will give to cloud-init by pasting the contents of the public key ~/.ssh/oracle-vm-key.pub file.

    Note the ability to specify a custom boot volume size. I believe the minimum boot volume size is 47GB. So with the free allowance of 200GB it is possible to have 4 VM instances. For now I would rather avoid the added complexity of distributed computing. I enlarge the boot volume of my one VM to 200GB.

    After the instance is finished provisioning, write down the public IP address assigned to the VM.

  2. Setup Root SSH login into the Server

    ssh opc@host-address -i private-key-file
    

    Replace the host-address with the public IP assigned to the VM. Replace private-key-file with a reference to the file that contains the SSH private key. "opc" is the default user for Oracle Linux. For the Debian image, the default user would be "debian". So in our local shell,

    ssh debian@host-address -i ~/.ssh/oracle-vm-key
    # Setup Root SSH access.
    sudo sed -i -e 's/PermitRootLogin no/PermitRootLogin without-password/' /etc/ssh/sshd_config
    sudo cp -f /home/debian/.ssh/authorized_keys /root/.ssh/authorized_keys
    exit
    

    Create the file ~/.ssh/config on our local machine if it does not already exist and add the following lines to connect more ergonomically with $ ssh oci-a1-flex.

    Host oci-a1-flex
      User root
      HostName 140.291.294.154
      IdentityFile ~/.ssh/oracle-vm-key
      ControlPath ~/.ssh/%r@%h:%p
      ControlMaster auto
      ControlPersist yes
    

    Optional: connect from Emacs TRAMP.

    Now C-x C-f with the address /ssh:pi-server:/ ought to work. Popping a shell should also just work thanks to the magic of TRAMP.

  3. Setup Ingress Rules in Security List to open ports on Virtual Cloud Network.

    We can open the ports we need on our server, but we also need to open said ports on the virtual cloud network level.

    Top left hamburger menu > Networking > Virtual Cloud Networks > VCN-name > Subnet-name > Default Security List for VCN-name > Add Ingress Rules

    Example: To allow incoming requests from any IP address to port 443: set source CIDR to 0.0.0.0/0 for IPV4 (::/0 for IPV6) and leave Source Port Range blank. Destination Port Range is set to 443.

    Caution is required here when exposing our compute instance to the wild, wild internet. Oracle ought to and will shut down any hijacked bot VPSs and terminate the accounts. I have also seen at least one email screen capture of account termination due to torrenting copyrighted material (while using OCI as a VPN). I think it goes without saying that crypto-mining violates some end user license agreement. Consult Oracle's manual to safely run graphical applications. While googling "oracle cloud caveats", I was led to pay special attention to their FAQ.

    However, if you have more Ampere A1 Compute instances provisioned than are available for an Always Free tenancy, all existing Ampere A1 instances are disabled and then deleted after 30 days, unless you upgrade to a paid account. [emphasis mine] To continue using your existing Arm-based instances as an Always Free user, before your trial ends, ensure that your total use of OCPUs and memory across all the Ampere A1 Compute instances in your tenancy is within the Always Free limit.

    You are able to re-create any deleted instances. Still, given the upgrade process from a free account to pay-as-you-go relies on further fraud prevention through Cybersource, I would not be surprised if a share of user woes are unique to the free-tier classification and Oracle's interpretation of "The Always Free services will continue to be available for unlimited time as long as you use the services" (Ibid.)

    In my experience of upgrading an always free tenancy to pay-as-you-go, you can enter your billing details perfectly accurately and still have the upgrade fail. I even saw the test charge successfully debited and credited in my online banking portal, but the upgrade process still failed. I suppose my Gmail account was too new; I had to change my email to my old Outlook address for the upgrade process to complete.

    Perform due diligence: use the features all major cloud services provide to prevent going over budget.

    Top left hamburger menu > Billing & Cost Management > Budgets > Create Budget

5.4. Micro Benchmarks

The most likely worst-case scenario is a front-page post on a link aggregator website such as Reddit. The number of requests made to the server per client obviously varies depending on architecture and workload.

The plural of anecdote is data I hope, so if I'm allowed to draw a similarity between the application in the above anecdote and my own, I should ballpark for around 300 requests per second. Cloudflare is the star of the story here, and I will continue to sing their praises: their CDN allows the caching of my static assets.

I originally spent some time seeing if there was a convenient way to hide .html from the URL when serving blog entries. Org-publish generates my static blog, and what seemed like a minor annoyance at first proved useful when setting up Cloudflare page rules; it took one rule to tell Cloudflare to cache all URLs ending in .html. They have a very reasonable usage policy for their CDN (see section 2.8 of their EULA). OCI deserves applause in tandem, for their 10TB free data egress. Thank you to abuse-prevention teams in both these companies! It makes the homegrown part of the internet possible.

We use loader.io to benchmark our application.

15 clients per second for a duration of 1 minute. HTTP Resource: https://www.hanshenwang.com/bible/1-1-1/1-1-31

COMMIT   OS           CPU                         MEMORY    AVG. RESP (ms)  AVG. ERR RATE (%)
4eaaa66  Alpine 3.15  standard-1x                 512MB     N/A             N/A
cfe3566  Distroless   3x share-cpu-1x             3x 256MB  1820            0
e557bb5  Debian 11.3  4 OCPU VM.Standard.A1.Flex  24GB      3504            0

300 clients per second for a duration of 1 minute. HTTP Resource: https://www.hanshenwang.com/about

COMMIT   OS           CPU                         MEMORY    AVG. RESP (ms)  AVG. ERR RATE (%)
4eaaa66  Alpine 3.15  standard-1x                 512MB     1400.66         6.53
cfe3566  Distroless   3x share-cpu-1x             3x 256MB  43              0
e557bb5  Debian 11.3  4 OCPU VM.Standard.A1.Flex  24GB      43              0

300 clients per second for a duration of 1 minute. HTTP Resource: https://www.hanshenwang.com/assets/blog/installation-of-spacemacs.html

COMMIT   OS           CPU                         MEMORY    AVG. RESP (ms)  AVG. ERR RATE (%)
4eaaa66  Alpine 3.15  standard-1x                 512MB     N/A             N/A
cfe3566  Distroless   3x share-cpu-1x             3x 256MB  19              0
e557bb5  Debian 11.3  4 OCPU VM.Standard.A1.Flex  24GB      37              0

For a primer on high performance LISP web servers see Woo: a high-performance Common Lisp web server. It should be pointed out that the hunchentoot listed on Woo's benchmark graph is the single threaded version. The multi-threaded version benchmarks are more impressive. The article about Woo also fails to mention quux-hunchentoot which employs a thread-pooling taskmaster as an extension to Hunchentoot.

These stress tests are run with wrk (installed via sudo apt install wrk) on one of OCI's A1 Flex VMs with 4 cores and 24GB of RAM, at commit d0f10c5.

  1. Cl-tbnl-gserver-tmgr

    $ wrk -t4 -c100 -d10 "http://localhost:8081/about"
    Running 10s test @ http://localhost:8081/about
      4 threads and 100 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency     2.94ms    2.84ms  43.51ms   90.52%
        Req/Sec     0.91k   509.03     1.73k    46.00%
      27146 requests in 10.02s, 50.51MB read
    Requests/sec:   2709.57
    Transfer/sec:      5.04MB
    
    $ wrk -t4 -c100 -d10 "http://localhost:8081/about"
    Running 10s test @ http://localhost:8081/about
      4 threads and 100 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency     2.93ms    2.82ms  51.14ms   90.58%
        Req/Sec     0.91k   660.73     2.08k    64.67%
      27306 requests in 10.02s, 50.81MB read
    Requests/sec:   2726.07
    Transfer/sec:      5.07MB
    
    $ wrk -t4 -c100 -d10 "http://localhost:8081/about"
    Running 10s test @ http://localhost:8081/about
      4 threads and 100 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency     2.92ms    2.82ms  49.55ms   90.39%
        Req/Sec     1.37k   808.96     2.52k    52.00%
      27355 requests in 10.02s, 50.90MB read
    Requests/sec:   2731.18
    Transfer/sec:      5.08MB
    
  2. Default multi-threaded Hunchentoot

    $ wrk -t4 -c100 -d10 "http://localhost:8082/about"
    Running 10s test @ http://localhost:8082/about
      4 threads and 100 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency    64.00ms  131.31ms   1.95s    92.41%
        Req/Sec   504.07    391.26     1.67k    71.28%
      14308 requests in 10.05s, 26.62MB read
      Socket errors: connect 0, read 0, write 0, timeout 36
    Requests/sec:   1423.18
    Transfer/sec:      2.65MB
    
    $ wrk -t4 -c100 -d10 "http://localhost:8082/about"
    Running 10s test @ http://localhost:8082/about
      4 threads and 100 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency    65.20ms  142.16ms   1.96s    93.24%
        Req/Sec   416.33    321.55     1.55k    69.21%
      14800 requests in 10.08s, 27.54MB read
      Socket errors: connect 0, read 0, write 0, timeout 40
    Requests/sec:   1468.68
    Transfer/sec:      2.73MB
    
    $ wrk -t4 -c100 -d10 "http://localhost:8082/about"
    Running 10s test @ http://localhost:8082/about
      4 threads and 100 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency    65.67ms  145.15ms   2.00s    93.67%
        Req/Sec   452.20    421.83     2.00k    74.68%
      14152 requests in 10.06s, 26.33MB read
      Socket errors: connect 0, read 0, write 0, timeout 35
    Requests/sec:   1406.08
    Transfer/sec:      2.62MB
    
  3. Clack with Woo and libev-dev

    $ wrk -t4 -c100 -d10 "http://localhost:8083/about"
    Running 10s test @ http://localhost:8083/about
      4 threads and 100 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency    43.78ms   29.71ms 221.07ms   88.44%
        Req/Sec   634.03    234.05     0.91k    77.25%
      25272 requests in 10.02s, 47.02MB read
    Requests/sec:   2522.89
    Transfer/sec:      4.69MB
    
    $ wrk -t4 -c100 -d10 "http://localhost:8083/about"
    Running 10s test @ http://localhost:8083/about
      4 threads and 100 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency    55.64ms   72.08ms 597.50ms   92.30%
        Req/Sec   630.25    245.64     0.95k    76.80%
      24364 requests in 10.02s, 45.33MB read
    Requests/sec:   2432.66
    Transfer/sec:      4.53MB
    
    $ wrk -t4 -c100 -d10 "http://localhost:8083/about"
    Running 10s test @ http://localhost:8083/about
      4 threads and 100 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency    44.21ms   29.66ms 204.73ms   88.33%
        Req/Sec   629.64    227.31     0.89k    78.00%
      25085 requests in 10.01s, 46.67MB read
    Requests/sec:   2505.62
    Transfer/sec:      4.66MB
    
  4. Quux-Hunchentoot Thread Pool

    $ wrk -t4 -c100 -d10 http://localhost:8080/
    Running 10s test @ http://localhost:8080/
      4 threads and 100 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency     3.34ms    4.82ms 166.69ms   94.11%
        Req/Sec   820.70    791.93     2.29k    72.15%
      24371 requests in 10.04s, 160.35MB read
    Requests/sec:   2426.30
    Transfer/sec:     15.96MB
    $ wrk -t4 -c100 -d10 http://localhost:8080/
    Running 10s test @ http://localhost:8080/
      4 threads and 100 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency     3.13ms    2.97ms  43.67ms   93.27%
        Req/Sec     1.28k   452.76     1.89k    72.50%
      25593 requests in 10.05s, 168.39MB read
    Requests/sec:   2546.40
    Transfer/sec:     16.75MB
    $ wrk -t4 -c100 -d10 http://localhost:8080/about
    Running 10s test @ http://localhost:8080/about
      4 threads and 100 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency     2.86ms    2.70ms  52.46ms   94.02%
        Req/Sec     0.94k   653.16     2.06k    64.67%
      27966 requests in 10.04s, 52.09MB read
    Requests/sec:   2784.27
    Transfer/sec:      5.19MB
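
To compare the four servers at a glance, the three Requests/sec figures of each wrk run above can be averaged. The snippet below is only a convenience sketch; MEAN, *REQUESTS-PER-SECOND*, and SUMMARIZE are names made up here. Note that the Quux-Hunchentoot runs hit "/" twice and "/about" once, so its mean mixes two endpoints and is not strictly comparable with the others.

```lisp
(defun mean (numbers)
  "Arithmetic mean of a non-empty list of numbers."
  (/ (reduce #'+ numbers) (length numbers)))

(defparameter *requests-per-second*
  '((:cl-tbnl-gserver-tmgr 2709.57 2726.07 2731.18)
    (:hunchentoot          1423.18 1468.68 1406.08)
    (:woo                  2522.89 2432.66 2505.62)
    ;; First two runs requested "/", the third "/about".
    (:quux-hunchentoot     2426.30 2546.40 2784.27))
  "Requests/sec copied from the wrk output above, three runs per server.")

(defun summarize (&optional (stream *standard-output*))
  "Print one line per server with its mean requests per second."
  (loop for (server . runs) in *requests-per-second*
        do (format stream "~&~22A ~8,2F req/s~%" server (mean runs))))
```

Evaluating (summarize) in the REPL prints the per-server means.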
    

Because MAKE.LISP fiddles with compiler knobs in search of performance, a single project-isidore process in production can reach upwards of 3300 requests per second. To improve application resiliency, Nginx is used to load balance between 12 project-isidore processes, each at 2GB of RAM. Lastly, of note is the 4 Gbps network bandwidth offered by Oracle Cloud. At this point my bottleneck should be the Rucksack database, even though it offers concurrent transactions.

So with the bare minimum amount of testing done I can say with confidence that my website is well prepared given the constraints on cost, time, and money.

6. Data Persistence

Project Isidore uses an embedded database (Rucksack) rather than a more typical client-server RDBMS such as PostgreSQL. Extremely read-heavy data, or any data that ought to be cached, is stored using the in-memory object prevalence model (BKNR.Datastore). Moore's law has brought us significant improvements, and as a result SQLite is a viable choice for this application.

6.1. PostgreSQL

This section is now outdated. I implemented a trivial PostgreSQL mailing list but running what amounts to your own mail server that does not have its messages marked by the big email providers as spam is most definitely non-trivial. As of commit 7a4fc5d, I have switched to Mailchimp. On production instances, there's a savings of 138MB-92.8MB=45.2MB RAM.

PostgreSQL has very clear and structured documentation. Refer to the documentation to install PostgreSQL locally on your computer. Afterwards a good introduction to basic Create, Read, Update, Delete (CRUD) operations is here: 12 Steps to Build and Deploy Common Lisp in the Cloud (and Comparing Rails) |…

Documentation on Postmodern is better than your average Common Lisp library. Still, the official docs are usefully supplemented by examples, specifically examples using the Data Access Objects.

  1. To start PostgreSQL server process
# Install PostgreSQL.
sudo apt install postgresql
# Start server process. PostgreSQL defaults to PORT 5432
sudo service postgresql start
# Create database "Test"
createdb test
# Delete database "Test"
dropdb test
# Login as superuser to create user for database "Test"
sudo -u postgres psql
# See defparameter `*local-db-params*' in MODEL.LISP.
CREATE USER user1 WITH PASSWORD 'user1';
# Use Shell to login as host: localhost with database: test and user:user1
psql -h localhost -d test -U user1
# Once logged in,
test=# select * from tablename;

6.2. In-Memory Datastore

Design constraints imposed by the current deployment platform, Heroku & GitHub.

  • Heroku managed PostgreSQL free tier limitations = 10000 rows, 1GB disk capacity.
  • Heroku free tier dyno memory (automatic dyno shutdown at 1GB) = 512 MB.
  • Heroku free tier slug size = 500 MB.
  • GitHub large file limit = 100MB-2GB.

Characteristics of Bible dataset:

  • Read-only data.
  • Dataset should be available offline.
  • Non-expanding dataset.
  • Fits within Heroku free tier dyno memory (18-20MB). Online reports of 80-140MB RAM usage by hunchentoot + ironclad.
  • Limited developer resources mean instead of programming/debugging in LISP, I would need to master a second domain specific programming language: SQL.
  • Very cost sensitive (cut me some slack, I'm a college student).

Object Relational Mappers (ORM) are notoriously hard to get right. It is too bad the pure LISP persistence solutions (Allegrocache) remain proprietary. For open source solutions, I still think bknr.datastore is among the best for now. Rucksack by Arthur Lemmens is also worth playing with, but due to the restrictions of the Heroku ephemeral filesystem, the library with the best fit for my application would be bknr.datastore.
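
Object prevalence, the model behind BKNR.Datastore, keeps all live data in RAM and makes it durable by logging each transaction to disk and replaying the log at startup. The toy below sketches that idea only. It is not BKNR.Datastore's API: the names STORE, OPEN-STORE, EXECUTE, and STORE-GET are made up for illustration, there are no snapshots or locking, and it assumes keys and values print readably (strings, numbers, keywords).

```lisp
(defstruct store
  (table (make-hash-table :test #'equal)) ; live data, always in RAM
  log-path)                               ; append-only transaction log on disk

(defun apply-transaction (store txn)
  "Apply TXN, a list such as (:put key value) or (:del key), to the in-memory table."
  (destructuring-bind (op key &optional value) txn
    (ecase op
      (:put (setf (gethash key (store-table store)) value))
      (:del (remhash key (store-table store))))))

(defun open-store (log-path)
  "Rebuild the in-memory state by replaying every transaction found in LOG-PATH."
  (let ((store (make-store :log-path log-path)))
    (with-open-file (in log-path :if-does-not-exist nil)
      (when in
        (with-standard-io-syntax
          (let ((*read-eval* nil))        ; never evaluate code found in the log
            (loop for txn = (read in nil :eof)
                  until (eq txn :eof)
                  do (apply-transaction store txn))))))
    store))

(defun execute (store txn)
  "Append TXN to the log first, then apply it in memory."
  (with-open-file (out (store-log-path store)
                       :direction :output
                       :if-exists :append
                       :if-does-not-exist :create)
    (with-standard-io-syntax
      (print txn out)))
  (apply-transaction store txn))

(defun store-get (store key)
  "Reads never touch the disk: they are plain hash-table lookups."
  (gethash key (store-table store)))
```

Reads being ordinary memory accesses is what makes this model attractive for a read-only dataset such as the Bible text; the cost is that the whole dataset must fit in RAM.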

Memory-Centric Data Management: A Monash Information Services White Paper by Curt A. Monash, Ph.D., May 2006, accessible at http://www.monash.com/whitepapers.html

Object Prevalence : An In-Memory, No-Database Solution to Persistence | by Pa…

https://nicklevine.org/lisp-book/contents/chac.pdf

https://cl-pdx.com/static/persistence-lemmens.txt

6.3. BKNR.Datastore vs Rucksack vs Postmodern

Rucksack measurements are listed first, then BKNR.Datastore.

PROJECT-ISIDORE/VIEWS> (time (get-bible-text 23))
Evaluation took:
  0.000 seconds of real time
  0.000098 seconds of total run time (0.000095 user, 0.000003 system)
  100.00% CPU
  228,122 processor cycles
  0 bytes consed

PROJECT-ISIDORE/VIEWS> (time (get-bible-text 23))
Evaluation took:
  0.000 seconds of real time
  0.000020 seconds of total run time (0.000020 user, 0.000000 system)
  100.00% CPU
  36,872 processor cycles
  0 bytes consed

PROJECT-ISIDORE/VIEWS> (time (bible-page "1-1-1-2-2-2"))
Evaluation took:
  0.330 seconds of real time
  0.328973 seconds of total run time (0.328973 user, 0.000000 system)
  99.70% CPU
  821,152,853 processor cycles
  30,101,264 bytes consed

PROJECT-ISIDORE/VIEWS> (time (bible-page "1-1-1-2-2-2"))
Evaluation took:
  0.440 seconds of real time
  0.441031 seconds of total run time (0.431896 user, 0.009135 system)
  [ Run times consist of 0.017 seconds GC time, and 0.425 seconds non-GC time. ]
  100.23% CPU
  1,098,042,384 processor cycles
  91,860,080 bytes consed

PROJECT-ISIDORE/VIEWS> (time (bible-page "1-1-1-73-22-21"))
Evaluation took:
  8.990 seconds of real time
  8.998338 seconds of total run time (8.102798 user, 0.895540 system)
  [ Run times consist of 0.077 seconds GC time, and 8.922 seconds non-GC time. ]
  100.09% CPU
  22,446,254,265 processor cycles
  796,903,104 bytes consed

PROJECT-ISIDORE/VIEWS> (time (bible-page "1-1-1-73-22-21"))
Evaluation took:
  0.660 seconds of real time
  0.659741 seconds of total run time (0.641298 user, 0.018443 system)
  [ Run times consist of 0.025 seconds GC time, and 0.635 seconds non-GC time. ]
  100.00% CPU
  1,641,964,703 processor cycles
  327,536,880 bytes consed

A sampling of postmodern DAO speeds,

PROJECT-ISIDORE/MODEL> (time (friend-email (mailinglist-get 2)))
Evaluation took:
  0.010 seconds of real time
  0.002794 seconds of total run time (0.002794 user, 0.000000 system)
  30.00% CPU
  25,589,575 processor cycles
  32,432 bytes consed

"[email protected]"

And a regular SQL query done in postmodern,

PROJECT-ISIDORE/MODEL>  (time (pomo:with-connection (db-params) (pomo:query (:select 'email :from 'mailinglist :where (:= 'id 2)) :single)))
Evaluation took:
  0.010 seconds of real time
  0.003089 seconds of total run time (0.003089 user, 0.000000 system)
  30.00% CPU
  24,084,690 processor cycles
  16,368 bytes consed

"[email protected]"
1 (1 bit, #x1, #o1, #b1)

A sampling of cl-sqlite with the in-memory database,

SQLITE>
(time (execute-single *db* "select id from users where user_name = ?" "dvk"))
Evaluation took:
  0.000 seconds of real time
  0.000047 seconds of total run time (0.000045 user, 0.000002 system)
  100.00% CPU
  108,448 processor cycles
  0 bytes consed

2 (2 bits, #x2, #o2, #b10)

and with a regular disk based database,

SQLITE> (time (execute-single *dba* "select id from users where user_name = ?" "dvk"))
Evaluation took:
  0.000 seconds of real time
  0.000085 seconds of total run time (0.000081 user, 0.000004 system)
  100.00% CPU
  196,908 processor cycles
  0 bytes consed

2 (2 bits, #x2, #o2, #b10)

For this admittedly shallow testing, BKNR.Datastore and Rucksack perform admirably! Quicklisp-stats shows BKNR.Datastore is still in use, with the most recent example being a startup by an ex-Facebook engineer. The most recent mention I can find of Rucksack is by Ravenpack, at ELS2020.
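
Several of the TIME results above report 0.000 seconds of real time because a single call finishes below the clock's resolution. One rough way to compare such fast calls is to time many iterations and divide. The helper below is a sketch, not part of Project Isidore; MEAN-SECONDS-PER-CALL, *VERSES*, and LOOKUP-VERSE are hypothetical names, and the hash-table lookup is only a stand-in for a real datastore query.

```lisp
(defun mean-seconds-per-call (thunk &key (iterations 100000))
  "Call THUNK ITERATIONS times and return the mean wall-clock seconds per call."
  (let ((start (get-internal-real-time)))
    (dotimes (i iterations)
      (funcall thunk))
    (let ((elapsed (- (get-internal-real-time) start)))
      ;; Convert clock ticks to seconds, then average over all calls.
      (float (/ elapsed internal-time-units-per-second iterations) 1d0))))

;; Stand-in workload: a hash-table lookup instead of a real datastore query.
(defparameter *verses* (make-hash-table))
(setf (gethash 23 *verses*) "placeholder verse text")

(defun lookup-verse (id)
  (gethash id *verses*))
```

For example, (mean-seconds-per-call (lambda () (lookup-verse 23))) returns a double-float number of seconds. The figure includes the overhead of the loop and FUNCALL, so it is best used for relative comparisons.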

The quicklisp download stats for Rucksack show a 200% increase (from NIL > 223/239) around the months of January and February. I find it plausible that Ravenpack, who do data analysis for the financial sector, have their engineers repull all libraries from quicklisp once per year; I have noticed the same pattern for some other libraries.

BKNR.Datastore is the reason I am able to structure the Tabular Douay Rheims in the way that I have. Any disk based solution would have been too slow, forcing me to cache the pages or store them as static files. Many thanks to Hans Hubner, the author of said library. And yes, cl-store or simple serialization would have sufficed for my use case but I'll take Edmund Weitz's word when he recommends BKNR.Datastore over cl-store in his Common Lisp Recipes (2016).

The question of ORMs comes up again. As of the current writing, I am aware of a handful of options.

Elephant - interfaces to the C databases (Berkeley DB, PostgreSQL) most likely bitrotted, perhaps the author got hired to work on allegrocache? Quicklisp-stats show 35 downloads total for the past 2 years.

CLSQL - by far the greatest number of backends supported, with the necessary compromises that suggests. Tradition extends back many years, shares ideas with CommonSQL and has a SOLID track record. Packaged for Debian. Great documentation.

Hu.dwim.perec - Originally started as a fork of CLSQL. Greatly extends the ORM capabilities and is kept up to date: I think mostly by one person, Levente Mészáros. This is one of the libraries that shows almost exactly the same download pattern as Rucksack. So what's the catch? Very little documentation. You will have to dive deep into a bunch of hu.dwim.* packages and look at tests. Pretty much pulls in the entire hu.dwim.* ecosystem with it when downloaded from quicklisp. With great power comes…

Mito - Fukamachi-ware. Similar download stats to CLSQL. Very young project, started in 2016. Also see Crane, the stats show close to zero usage though.

If one is willing to make the trade-off of SQL for object persistence, then as far as I know there are really only two pure lisp options. I say "only" but I'm not aware of any other language with libraries comparable to the ones below.

Rucksack (open source) - as mentioned earlier, authored by hacker Arthur Lemmens (worked at Ravenbrook). Small, written in portable lisp, possessing performance that isn't bad at all, a real gem of a project. The mailing lists of Elephant and Rucksack show some attempt made to combine the ideas of Rucksack into a pure lisp backend for Elephant. Rucksack also shows a lack of recent updates, but unlike Elephant, has users to this day. Ain't it beautiful how the stability of the language shines through the library? Don't let the date of the last commit turn you away. Give it a shot, look at the talk-eclm2006.txt file under the /docs folder.

Allegrocache (proprietary, Franz Inc.) - what Rucksack could have been if you threw a bunch of money at the problem. Tightly integrated with the rest of the Allegro CL ecosystem. You do get what you pay for in this case. I have heard they have great support too. Allegrocache was originally based on ObjectStore (Dan Weinreb of Symbolics fame + others). Dan does a fair job at defending object oriented database management systems here. I would like to point out that Glenn D. House of 2Is (DoD contractor) testifies (21:30) to the conclusions found in Prechelt and Garret when comparing Lisp v. Java v. C/C++. Grammatech is also a DARPA funded shop that uses Common Lisp. I would also be remiss if I did not mention the recent milestone of CLASP (Common Lisp implementation with C++ interop) version 1.0. The geriatric IP of Symbolics is still closed source; rumor is there are still legacy DoD contracts and that American Express's fraud detection used to (up until the mid-2000s?) use Open Genera.

7. Case Study: Profiling and Performance

A thorough treatment of the generalities of optimization in Common Lisp can be found in Common Lisp Recipes (Weitz, 2016), pages 503-560. Dr. Weitz testifies that Lisp (SBCL) can confidently reach within a 5x ballpark of C. The gap narrows, obviously, when the work is mostly fixnum arithmetic. SBCL compiler contributor Paul Khuong has also testified to a ballpark of within 3x of C. Of course, time spent squabbling over language performance is better spent on data structures and algorithms, but a ballpark estimate is good to have. Python and Ruby, for example, are more than an order of magnitude slower than Lisp: a ballpark of 20x and 40x respectively. For a dynamically typed, high level language, Lisp performs admirably. Especially striking is the non-leaky abstraction and degree of control provided to the programmer at read, compile and runtime; see the "disassemble" function and runtime compilation tricks.
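To make the remark about runtime compilation and "disassemble" concrete, here is a minimal sketch. The names make-multiplier and *triple* are mine, invented for illustration; only compile, funcall and disassemble are standard.

```lisp
;; Hypothetical helper: build a lambda expression as a list at runtime
;; and hand it to the compiler, which returns a compiled function.
(defun make-multiplier (factor)
  "Return a compiled function that multiplies its fixnum argument by FACTOR."
  (compile nil `(lambda (x)
                  (declare (type fixnum x))
                  (* x ,factor))))

(defparameter *triple* (make-multiplier 3))

(funcall *triple* 14) ; => 42

;; Print the code the compiler generated. The listing is specific to the
;; implementation and platform, so it is not reproduced here.
;; (disassemble *triple*)
```

Uncommenting the last form at the REPL is a quick way to check whether a declaration actually changed the generated code.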

Lastly, Professor Scott Fahlman, one of the original designers of Common Lisp, weighs in on his experience circa 2020.

Short answer: I don’t know about the Clozure compiler, but the compiler used in the open-source CMU Common Lisp (CMU CL) system produces code that is very close in performance to C: a little faster for some benchmarks, a little slower for others.

But there are some things a Common Lisp user needs to understand in order to get that performance. Basically, you need to declare the types of variables and arguments carefully, and you should not do a lot of dynamic storage allocation (“consing”) in performance critical inner loops, usually just 10% or 20% of your total system.

(Steel Bank Common Lisp (SBCL) is essentially the same as CMU CL in terms of the performance of compiled code. The CMU CL open-source community split in two in December 1999 as a result of some disagreements about design and philosophy, and one branch was renamed SBCL. I believe that the parts of the compiler concerned with optimization have not changed much. I currently prefer SBCL for my work on the Scone knowledge-base system and other things.)

The Java compilers I know about produce code that is considerably slower than CMU CL and SBCL. I don’t know much about Haskell performance. C++ is similar to C in performance if you use it like C; I believe that if you make heavy use of the object-oriented features it is considerably slower.

Longer answer, for nerds: I was one of the core designers of Common Lisp. (We were sometimes referred to in those days as the “Gang of Five”.) I wrote what I believe was the first Common Lisp compiler (drawing heavily on the design of earlier compilers for Maclisp and Lisp Machine Lisp).

For many years I ran the CMU Common Lisp project. As part of that project, we developed a public-domain Common Lisp runtime that became the basis for a number of commercial Common Lisp implementations, adapted and supported by various large companies for their own machines.

David B. McDonald, in my group, spent something like four years developing a very highly optimizing compiler for Common Lisp. We called this the “Python” compiler, which caused some confusion when the Python programming language became popular more than a decade later. No relation.

(I proposed that we name our compiler “Python” because a python the snake eats a whole pig, then goes under a bush for several weeks to sleep. The pig makes its way slowly through the snake’s internal pipelines, ultimately emerging as a very compact pellet. Which is pretty much what compilers do, to one degree or another.)

At the time (the early 1980s) I was starting to work on programs for implementing (or simulating) artificial neural nets. These needed very efficient floating-point arithmetic and vector operations, and I wanted to be sure that we could efficiently program these things in CMU Common Lisp. But at the time, Common Lisp had the reputation of being really awful at floating point: a straightforward implementation would constantly be allocating “boxed” floating-point numbers that had to be garbage-collected later.

So Dave McDonald labored mightily over type inference and non-consing ways to handle floating point, and he got the job done. I wrote some neural-net programs in CMU CL, they were later translated into C for wider distribution (by an undergrad coding wizard who really knew what he was doing). The two versions were very close in runtime. When DARPA pulled the plug on support for Common Lisp development, CMU Common Lisp became an open-source project with a different set of developers/maintainers. Both our runtime and our “Python” compiler are part of that distribution (and of SBCL), though of course there has been some evolution since then.

What programmers need to know to get good performance in Common Lisp: People speak of Lisp as a “dynamically typed” language. I think it is more correct to call it (at least for the CMU CL implementation) an “optionally strongly typed” language. The philosophy is this: Programmers can say as much or as little as they like about the type of an argument or value. Whatever you say had better be true; you can tell the compiler to trust the declarations or to be suspicious. The more precisely you specify what the entity is, the more likely it will be that the compiler can do some clever optimization based on what you told it.

So, for example, you could say just “number” or “integer” or “integer between 0 and 1024” or “prime number between 0 and 1024”. If you use a very general declaration, the code will work, but it will have to do some runtime type-checking to see what kind of number it is dealing with. It must be ready to deal with some of the exotic number-types that Common Lisp supports: infinite-precision integers (“bignums”), ratios of integers, imaginary numbers, several levels of floating-point precision, and so on. There is a special, very compact and efficient format that can be used for small integers, but it can only be used if you tell the compiler what to expect.

The same is true of things other than integers: There are several array formats. You can specify what size/shape data to expect, or you can wait and see what someone hands you at runtime. When you get into object-oriented programming, you can tell the system what to expect (and it can figure out more internal data-types via type inference), or you can wait and see what you get and do a runtime type-dispatch to find the proper method to use. That takes time.

So, if you want good performance in Common Lisp, especially for arithmetic and array operations, you have to declare the types as precisely as you can. Then the compiler will do its magic.

Flash forward: Sometime around 2001, I was working for IBM Research (on leave for a while from CMU; long story…) and my project was to implement an early version of what became (after several restarts) my Scone knowledge-base system. At the time, the prevalent language in IBM Research was Java, and I knew that it would be hard to interest the IBM people in my system if I did it in Lisp. So I started out doing it in Java.

This system had essentially no arithmetic in it, but did a lot of pointer-chasing off into main memory, a lot of boolean operations, and had a few very intensive inner loops where it spent all its time. Common Lisp was clearly the right tool for this job, but I worked hard on the Java version. Finally, for reasons too complicated to go into here, I decided that I could no longer stand programming the system in Java (some very important facilities were missing, especially the Lisp macro system), so I decided to port the half-done system to Common Lisp. (As predicted, IBM then lost interest, and I returned to CMU.)

The performance-critical inner loops had already been programmed at that time, so I was able to compare the performance of the Lisp and Java versions. The Lisp version was about 3X faster. Part of that difference was because I was a very good Lisp programmer, and rather new at Java, though I had talked to Java experts about how to get good performance for code like mine. So some of the performance difference was experience, but mostly it was because at the time (and I think still today) Java did not do a lot of object-type inference at compile time. So pretty much every function call was a type-dispatch to find the right method, and that was slow.

There wasn’t a good way around this inefficiency except by writing the performance-critical code as long linear stretches of code with no function calls. And without a good macro system, that was much too tedious.
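Professor Fahlman's advice about declarations can be sketched in a few lines. This is my own illustration, not code from his answer; sum-generic and sum-floats are hypothetical names. The first function says nothing about types, so every addition must be ready for any kind of number. The second promises a specialized array of double-floats, which gives the compiler the chance to use unboxed floating-point arithmetic in the loop.

```lisp
;; No declarations: works on any list of numbers, but each addition
;; goes through generic arithmetic.
(defun sum-generic (numbers)
  (let ((sum 0))
    (dolist (n numbers sum)
      (incf sum n))))

;; Precise declarations: the argument must be a simple one-dimensional
;; array of double-floats, and the accumulator is a double-float.
(defun sum-floats (vec)
  (declare (type (simple-array double-float (*)) vec)
           (optimize (speed 3) (safety 1)))
  (let ((sum 0d0))
    (declare (type double-float sum))
    (dotimes (i (length vec) sum)
      (incf sum (aref vec i)))))

(defparameter *samples*
  (make-array 3 :element-type 'double-float
                :initial-contents '(1d0 2d0 3d0)))

(sum-generic '(1 2 3)) ; => 6
(sum-floats *samples*) ; => 6.0d0
```

Comparing (disassemble 'sum-generic) with (disassemble 'sum-floats) shows how much the declarations buy on a given implementation.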

What follows is an amateur recording of my struggles, useful to jog my memory in the future when working with the SBCL statistical profiler. I had a pretty good idea of which function call needed to be profiled, but if it were an unfamiliar system, I would try a library like Daniel Kochmański / metering · GitLab.

;; Load project into image.
(ql:quickload :project-isidore)
;; Load SBCL's statistical profiler.
(require :sb-sprof)
;; Count calls to every function whose name is a symbol in the CL-USER package.
(sb-sprof:profile-call-counts "CL-USER")
;; Profile and output both graph and flat formats.
(sb-sprof:with-profiling (:max-samples 5000
                          :report :graph
                          :loop t
                          :show-progress t)
  (project-isidore:bible-page "1-1-1-3-3-3"))
           Self        Total        Cumul
  Nr  Count     %  Count     %  Count     %    Calls  Function
------------------------------------------------------------------------
   1   2004  40.1   2004  40.1   2004  40.1        -  EQUALP
   2    299   6.0   1054  21.1   2303  46.1        -  (SB-PCL::EMF SB-MOP:SLOT-VALUE-USING-CLASS)
   3    205   4.1    205   4.1   2508  50.2        -  (LAMBDA (SB-PCL::.ARG0.) :IN "SYS:SRC;PCL;DLISP3.LISP")
   4    193   3.9    443   8.9   2701  54.0        -  (SB-PCL::FAST-METHOD BKNR.SKIP-LIST:SL-CURSOR-NEXT (BKNR.SKIP-LIST:SKIP-LIST-CURSOR))
   5    190   3.8   3896  77.9   2891  57.8        -  REMOVE-IF-NOT
   6    177   3.5    212   4.2   3068  61.4        -  COPY-LIST
   7    165   3.3    165   3.3   3233  64.7        -  (LAMBDA (SB-PCL::.ARG0.) :IN "SYS:SRC;PCL;BRAID.LISP")
   8    157   3.1    157   3.1   3390  67.8        -  (LAMBDA (SB-PCL::.ARG0.) :IN "SYS:SRC;PCL;PRECOM2.LISP")
   9    143   2.9    143   2.9   3533  70.7        -  foreign function syscall
  10    141   2.8    141   2.8   3674  73.5        -  SB-KERNEL:TWO-ARG-STRING-EQUAL
  11    134   2.7    134   2.7   3808  76.2        -  (LAMBDA (CLASS SB-KERNEL:INSTANCE SB-PCL::SLOTD) :IN SB-PCL::MAKE-OPTIMIZED-STD-SLOT-VALUE-USING-CLASS-METHOD-FUNCTION)
  12    121   2.4    121   2.4   3929  78.6        -  (LAMBDA (SB-KERNEL:INSTANCE) :IN SB-PCL::GET-ACCESSOR-FROM-SVUC-METHOD-FUNCTION)
  13    110   2.2    110   2.2   4039  80.8        -  (LAMBDA (SB-PCL::.ARG0.) :IN "SYS:SRC;PCL;BRAID.LISP")

Self is how much time was spent doing work directly in that function. Total is how much time was spent in that function and in the functions it called. As this is my own code, I know that REMOVE-IF-NOT calls EQUALP in a lot of functions, but the same hypothesis could plausibly be drawn from this data alone. Cumul is the running sum of the Self column. I have cut the table off at 80% in light of the Pareto principle. The hypothesis can be confirmed by looking at the graph-formatted portion of the profiler output, pasted below.

------------------------------------------------------------------------
  3840  76.8                   PROJECT-ISIDORE/MODEL:GET-BIBLE-UID [99]
    44   0.9                   REMOVE-IF-NOT [5]
    55   1.1                   PROJECT-ISIDORE/MODEL:GET-HAYDOCK-TEXT [96]
   190   3.8   3896  77.9   REMOVE-IF-NOT [5]
     1   0.0                   (LAMBDA (PROJECT-ISIDORE/MODEL::X) :IN PROJECT-ISIDORE/MODEL::FILTER-LIST-BY-VERSE) [114]
     1   0.0                   (SB-PCL::EMF SB-MOP:SLOT-VALUE-USING-CLASS) [2]
     1   0.0                   (LAMBDA (SB-PCL::.ARG0.) :IN "SYS:SRC;PCL;DLISP3.LISP") [3]
     1   0.0                   foreign function alloc_list [29]
    43   0.9                   (LAMBDA (PROJECT-ISIDORE/MODEL::X) :IN PROJECT-ISIDORE/MODEL::FILTER-LIST-BY-CHAPTER) [30]
   140   2.8                   SB-KERNEL:TWO-ARG-STRING-EQUAL [10]
  1421  28.4                   (LAMBDA (PROJECT-ISIDORE/MODEL::X) :IN PROJECT-ISIDORE/MODEL::FILTER-LIST-BY-BOOK) [17]
  2004  40.1                   EQUALP [1]
    44   0.9                   REMOVE-IF-NOT [5]
    47   0.9                   (LAMBDA (PROJECT-ISIDORE/MODEL::X) :IN PROJECT-ISIDORE/MODEL:GET-HAYDOCK-TEXT) [38]
------------------------------------------------------------------------
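As a sanity check on how the flat report's columns relate, the Cumul column can be rebuilt from the Self counts. This is a small sketch with names of my own choosing, and it assumes the run collected the full 5000 samples requested by :max-samples.

```lisp
;; Assumption: the profiler collected all 5000 requested samples.
(defparameter *total-samples* 5000)

;; Self counts of the first five rows of the flat report.
(defparameter *self-counts* '(2004 299 205 193 190))

(defun cumulative (counts)
  "Return the running sum of COUNTS."
  (loop for count in counts
        sum count into running
        collect running))

(defun percent (count)
  "Return COUNT as a percentage of *TOTAL-SAMPLES*, rounded to one decimal."
  (/ (round (* 1000 count) *total-samples*) 10.0))

(cumulative *self-counts*) ; => (2004 2303 2508 2701 2891)
```

The running sums match the Cumul counts printed for rows 1 through 5, and (percent 2004) comes out at roughly the 40.1% shown for EQUALP.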

GET-BIBLE-UID is expected to take up a large portion of function calls based on my design choices and some back of the napkin math. The profiler has confirmed the information. Let's see if we can't optimize this particular function further.

(time (project-isidore:bible-page "1-1-1-3-3-3"))
Evaluation took:
  64.490 seconds of real time
  64.673257 seconds of total run time (64.032109 user, 0.641148 system)
  [ Run times consist of 2.276 seconds GC time, and 62.398 seconds non-GC time. ]
  100.28% CPU
  160,990,605,382 processor cycles
  16,089,927,136 bytes consed

I replaced equalp with eql where appropriate in the file model.lisp. For the best explanation of the different equality predicates in Common Lisp, see Equality in Lisp - Eli Bendersky's website.

Lisp's equality operators are:

= compares only numbers, regardless of type.

eq compares symbols. Two objects are eq if they are actually the same object in memory. Don't use it for numbers and characters.

eql compares symbols similarly to eq, numbers (type sensitive) and characters (case sensitive)

equal compares more general objects. Two objects are equal if they are eql, strings of eql characters, bit vectors of the same contents, or lists of equal objects. For anything else, eq is used.

equalp is like equal, but more lenient. Comparison of numbers is type insensitive. Comparison of chars and strings is case insensitive. Lists, hashes, arrays and structures are equalp if their members are equalp. For anything else, eq is used.
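A few REPL forms make the differences between the predicates concrete. Every function here is standard Common Lisp; the sample values are mine.

```lisp
;; = compares numeric value, regardless of type.
(= 1 1.0)                          ; => T
;; eql is type sensitive for numbers and case sensitive for characters.
(eql 1 1.0)                        ; => NIL
(eql #\a #\A)                      ; => NIL
;; eq tests object identity: two freshly consed lists are never eq.
(eq (list 1 2) (list 1 2))         ; => NIL
;; equal descends into lists and strings, case sensitively.
(equal (list 1 2) (list 1 2))      ; => T
(equal "Genesis" "genesis")        ; => NIL
;; equalp ignores case and numeric type.
(equalp "Genesis" "genesis")       ; => T
(equalp 1 1.0)                     ; => T
;; string-equal is also case insensitive, but only accepts strings
;; (more precisely, string designators).
(string-equal "Genesis" "GENESIS") ; => T
```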

Evaluation took:
  74.060 seconds of real time
  74.232869 seconds of total run time (73.519834 user, 0.713035 system)
  [ Run times consist of 2.961 seconds GC time, and 71.272 seconds non-GC time. ]
  100.23% CPU
  184,855,087,325 processor cycles
  16,089,928,160 bytes consed

Would you look at that. Worse performance as a result of going from a more general equality predicate to a more specific equality predicate. I'm guessing SBCL does some fancy optimization tricks here.

From Edmund Weitz on string-equal/equalp

I would assume that on most implementations STRING-EQUAL is a bit faster (given the right optimization declarations) because it "knows" that its arguments are strings. It's most likely a micro-optimization that's only noticeable in tight loops.

It can also be self-documenting to use STRING-EQUAL because the reader of your code then knows that you expect both of its arguments to be strings.

Therefore switching EQUALP to STRING-EQUAL in FILTER-LIST-BY-BOOK gives me the following speedup.

Evaluation took:
  69.390 seconds of real time
  69.608650 seconds of total run time (68.808030 user, 0.800620 system)
  [ Run times consist of 2.482 seconds GC time, and 67.127 seconds non-GC time. ]
  100.32% CPU
  173,202,179,865 processor cycles
  16,089,928,912 bytes consed

Going back a few steps and using = instead of eql doesn't result in anything significant at all.

I decided to replace the functional approach of REMOVE-IF-NOT with the LOOP DSL. loops - Iterate through a list and check each element with your own condition… This surprisingly did nothing.

Instead of going through the same list three times and collecting one item at a time, I decided to remove the nested loops and go through the same list once but collect three items. This was more due to my unfamiliarity with loop keywords. Good speedup results.

Evaluation took:
  14.960 seconds of real time
  15.014469 seconds of total run time (14.824816 user, 0.189653 system)
  [ Run times consist of 0.963 seconds GC time, and 14.052 seconds non-GC time. ]
  100.36% CPU
  37,334,905,433 processor cycles
  6,407,417,456 bytes consed
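The single-pass idea can be sketched as follows. This is not the code from model.lisp: the verse structure, the sample data and both function names are hypothetical stand-ins, chosen only to contrast three chained filters with one walk over the list.

```lisp
;; Hypothetical stand-in for the persistent bible objects.
(defstruct verse book chapter number)

(defparameter *verses*
  (loop for book from 1 to 3
        append (loop for chapter from 1 to 3
                     append (loop for number from 1 to 3
                                  collect (make-verse :book book
                                                      :chapter chapter
                                                      :number number))))
  "27 sample verses: 3 books x 3 chapters x 3 verses.")

(defun find-verse-three-passes (book chapter number)
  "Filter the list three times, consing an intermediate list at each step."
  (let* ((by-book (remove-if-not (lambda (v) (eql book (verse-book v)))
                                 *verses*))
         (by-chapter (remove-if-not (lambda (v) (eql chapter (verse-chapter v)))
                                    by-book))
         (by-number (remove-if-not (lambda (v) (eql number (verse-number v)))
                                   by-chapter)))
    (first by-number)))

(defun find-verse-single-pass (book chapter number)
  "Walk the list once, test all three fields, and stop at the first match."
  (loop for v in *verses*
        when (and (eql book (verse-book v))
                  (eql chapter (verse-chapter v))
                  (eql number (verse-number v)))
          return v))
```

Both functions return the same object, but the single-pass version allocates no intermediate lists and can stop early, which is consistent with the drop in bytes consed above.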

Removing some list copying in the filter and get-bible-uid functions yielded:

Evaluation took:
  13.300 seconds of real time
  13.353905 seconds of total run time (13.183019 user, 0.170886 system)
  [ Run times consist of 0.731 seconds GC time, and 12.623 seconds non-GC time. ]
  100.41% CPU
  33,191,116,692 processor cycles
  6,406,945,232 bytes consed
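Why the copies could go: remove-if-not is non-destructive, so wrapping its argument in copy-list only adds consing. A minimal sketch with made-up data:

```lisp
(defparameter *numbers* (list 1 2 3 4 5 6))

;; remove-if-not returns a filtered sequence and leaves its argument intact,
(defparameter *evens* (remove-if-not #'evenp *numbers*))
;; so this defensive copy buys nothing except extra allocation.
(defparameter *evens-with-copy* (remove-if-not #'evenp (copy-list *numbers*)))

*numbers* ; => (1 2 3 4 5 6)
*evens*   ; => (2 4 6)
```

A copy only matters before the destructive counterparts such as delete-if-not, which are allowed to modify the list they are given.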

Proving once again that I am careless and forgetful, the following change resulted in a 133x speedup. I believe this function was written when bknr.datastore:store-objects-with-class was the only external function of bknr.datastore I was familiar with, and I was ignorant of bknr.datastore:store-object-with-id. Prior to the change, every single call to get-haydock-text would iterate through all 35817 verses of the bible to find one instance of haydock-text. Another concrete reminder to read the manual of whatever library I am using.

;; Before: scan every bible object in the store for a matching ID.
(defun get-haydock-text (bible-uid)
  "Returns a string if bible-uid is valid else return NIL.
The bible-uid can be found by calling `get-bible-uid' with valid arguments."
  (let ((cpylist (remove-if-not
                  (lambda (x)
                    (and x
                         (equalp
                          bible-uid
                          (slot-value x 'bknr.datastore::id))))
                  (copy-list
                   (bknr.datastore:store-objects-with-class
                    'bible)))))
    (if (slot-boundp (car cpylist) 'haydock-text)
        (slot-value (car cpylist) 'haydock-text)
        (format t "GET-HAYDOCK-TEXT called with invalid bible-uid ~a" bible-uid))))
;; After: fetch the object directly by its ID.
(defun get-haydock-text (bible-uid)
  "Returns a string if bible-uid is valid else return NIL.
The bible-uid can be found by calling `get-bible-uid' with valid arguments."
  (let ((verse (bknr.datastore:store-object-with-id bible-uid)))
    ;; Check for NIL first, in case no object has this ID: calling
    ;; SLOT-BOUNDP on NIL would signal an error.
    (if (and verse (slot-boundp verse 'haydock-text))
        (slot-value verse 'haydock-text)
        (format t "GET-HAYDOCK-TEXT called with invalid bible-uid ~a" bible-uid))))
Evaluation took:
  0.100 seconds of real time
  0.107482 seconds of total run time (0.097531 user, 0.009951 system)
  [ Run times consist of 0.007 seconds GC time, and 0.101 seconds non-GC time. ]
  107.00% CPU
  267,219,700 processor cycles
  40,850,160 bytes consed
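The shape of that mistake, stripped of the datastore, looks like this. The entry structure and both lookup functions are hypothetical, and the hash table merely stands in for an ID index; I am not claiming this is how bknr.datastore implements store-object-with-id.

```lisp
;; Hypothetical record type standing in for a stored object.
(defstruct entry id text)

(defparameter *entry-count* 35817)

(defparameter *entries*
  (loop for id below *entry-count*
        collect (make-entry :id id :text (format nil "note ~d" id))))

;; An index from ID to entry, built once up front.
(defparameter *entries-by-id*
  (let ((table (make-hash-table)))
    (dolist (e *entries* table)
      (setf (gethash (entry-id e) table) e))))

(defun lookup-by-scan (id)
  "Walk the whole list until an entry with ID turns up."
  (find id *entries* :key #'entry-id))

(defun lookup-by-index (id)
  "Fetch the entry for ID from the index, or NIL if there is none."
  (values (gethash id *entries-by-id*)))
```

Both lookups return the same object; the scan has to visit up to 35817 entries on every call, while the index lookup does not depend on the size of the collection.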

I admit this was less of an exercise in speeding up Common Lisp and more of a demonstration of human frailty.

Looking through the git log for commits of type "Perf" shows further optimization commits I have done. Current result for version 1.2.1,

Evaluation took:
  0.010 seconds of real time
  0.011488 seconds of total run time (0.011127 user, 0.000361 system)
  110.00% CPU
  28,639,406 processor cycles
  10,191,536 bytes consed

After the addition of regex generated cross-references in version 1.2.2,

Evaluation took:
  0.840 seconds of real time
  0.833305 seconds of total run time (0.813444 user, 0.019861 system)
  [ Run times consist of 0.002 seconds GC time, and 0.832 seconds non-GC time. ]
  99.17% CPU
  2,078,095,375 processor cycles
  80,577,968 bytes consed

8. User Manual

8.1. How do I find past versions of a blog article?

To view the entire revision history of an article, find and click the article title on the repository's blog subfolder. Then click on the History button to view specific, atomic changes. Project release notes are available for a general overview and inputting the article URL into the Internet Wayback Machine is also an option.

8.2. How do I unsubscribe from the mailing list?

To remove an email address from the mailing list, fill out and submit the form on the unsubscribe page. To resubscribe, visit https://www.hanshenwang.com/subscribe.

8.3. Can I visit this website offline?

To access the website offline, download the appropriate executable from the project release page. Releases · HanshenWang/project-isidore · GitHub.

Find the executable matching both your computer's processor architecture and operating system. Currently only the x86-64 architecture is supported, on the Ubuntu, MacOS and Windows operating systems.

Project Isidore does not offer signed binaries for MacOS, therefore you will have to manually execute the unsigned binaries. Please see https://lapcatsoftware.com/articles/unsigned.html for more details.

An audit of the source code can be done at any time. Please see the source repository as well as the third party dependencies.

Be advised that the program consumes around 50MB of RAM when used by a single user locally. Please understand the executable is provided AS IS, WITHOUT WARRANTY. See the provided COPYING.txt included in the download.

8.4. I can't find what I'm looking for. How is the documentation organized?

The documentation is organized according to the best practices outlined here: The documentation system — divio.

The closest thing to a tutorial as understood by the divio documentation system ought to be the development quickstart (present in the README.org) or embedded as close as possible to the end-user interface. How-to guides are meant to be placed here in the user manual. The Reference is auto-generated with the help of the Declt system. The Explanation is this Design Notebook blog article and git commit messages.

9. Reference Manual

Reference manuals are technical descriptions of Project Isidore's internal artifact architecture and how to operate it. For end users, please see the User Manual.

The Project Isidore Reference Manual is complete with cross-references to ASDF component dependencies, parents and children, classes' direct methods, super and subclasses, slot readers and writers, setf expanders' access and update functions, etc. The reference manual also includes exhaustive and multiple-entry indexes for every documented item.

  • System components (modules and files)
  • Packages
  • Exported and internal definitions of
    • Constants
    • Special variables
    • Symbol macros
    • Macros
    • Setf expanders
    • Compiler macros
    • Functions
    • Generic functions
    • Generic methods
    • Method combinations
    • Conditions
    • Structures
    • Classes
    • Types

With all that being said, when the boundary between user and developer is crossed, it makes much more sense to clone the source code and explore it in your LISP IDE. Auto generated manuals may be slightly more useful in LISP than other languages, thanks to the excellent introspection capabilities of SBCL and ASDF, but still are largely only useful for index generation.

9.1. Generate Reference Manual

The Declt Common Lisp library is used to generate the reference manual in .texi Texinfo format. GNU Texinfo is able to convert a single .texi source file into online HTML format, PDF documentation, and other formats. On a high level, Declt uses ASDF and SBCL contrib sb-introspect to query and extract documentation strings, lambda lists, slot type, allocation and initialization arguments, and definition source files. It requires no special architecture choices in a system, other than to conform to ASDF conventions. Beyond that, writing clear docstrings where the opportunity arises will yield great results.
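For a sense of what a documentation generator has to work with, here is a small sketch of a docstring and how to read it back at runtime. The function verse-reference is made up for the example; documentation is the standard Common Lisp accessor, and I make no claim about which mechanism Declt itself uses internally.

```lisp
;; Hypothetical function with the kind of docstring worth writing.
(defun verse-reference (book chapter verse)
  "Return a reference string such as \"1-2-3\" for BOOK, CHAPTER and VERSE."
  (format nil "~d-~d-~d" book chapter verse))

;; The docstring is attached to the function and can be queried later.
(documentation 'verse-reference 'function)

(verse-reference 1 2 3) ; => "1-2-3"
```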

The generate-doc.lisp script is run by a git pre-commit hook. Upon every commit, it regenerates the reference manual and includes the changes in the commit. The pre-commit hook starts out as the sample file located at /project-isidore/.git/hooks/pre-commit.sample.

Is this workflow overkill for the size of the codebase currently? Probably. It is nice to have an auto-generated manual though.

#!/bin/sh
#
# An example hook script to verify what is about to be committed.
# Called by "git commit" with no arguments.  The hook should
# exit with non-zero status after issuing an appropriate message if
# it wants to stop the commit.
#
# To enable this hook, rename this file to "pre-commit".

if git rev-parse --verify HEAD >/dev/null 2>&1
then
    against=HEAD
else
    # Initial commit: diff against an empty tree object
    against=$(git hash-object -t tree /dev/null)
fi

# Redirect output to stderr.
exec 1>&2

# Generate updated manual. SBCL must be installed. See documentation for
# environment setup
sbcl --script src/generate-doc.lisp
# Add updated manual.html to commit
git add assets/reference-manual.html

# COMMENT OUT THE WHITESPACE CHECK.
# If there are whitespace errors, print the offending file names and fail.
# exec git diff-index --check --cached $against --
;;;; generate-doc.lisp
;;; See subheading 'Generate Reference Manual' at
;;; https://www.hanshenwang.com/public/blog/project-isidore-doc.html/

(load "~/quicklisp/setup.lisp") ; Quicklisp is installed in default location
(ql:quickload :project-isidore) ; If you need to download dependencies
(ql:quickload :net.didierverna.declt)
;; Generate 'reference-manual.texi' in TEXI-DIRECTORY
(net.didierverna.declt:declt :project-isidore
                             :texi-name "reference-manual"
                             :texi-directory
                             (asdf:system-relative-pathname
                              :project-isidore "assets/")
                             :library-name "Project Isidore"
                             ;; links are machine specific
                             :hyperlinks nil
                             ;; :long will print generation time. This will be
                             ;; picked up by git. Otherwise I would pick :long
                             :declt-notice :short)
;; https://lispcookbook.github.io/cl-cookbook/os.html#input-and-output-from-subprocess
(defparameter *shell* (uiop:launch-program "bash" :input :stream
                                                  :output :stream))
;; Change to proper directory
(defparameter *reference-manual-path* (concatenate
                             'string "cd "
                             (namestring
                              (asdf:system-relative-pathname
                               :project-isidore "assets/"))))
(write-line *reference-manual-path* (uiop:process-info-input *shell*))
;; Convert .texi to .html
(write-line
   ;; For makeinfo flags, see
   ;; https://www.gnu.org/software/texinfo/manual/texinfo/texinfo.html#HTML-CSS
 "makeinfo --html 'reference-manual.texi' --no-split --css-include='global.css'"
 (uiop:process-info-input *shell*))
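(force-output (uiop:process-info-input *shell*))
;; Close stdin so bash sees end-of-file, then wait for makeinfo to finish
;; before quitting; otherwise the hook's `git add' may pick up a stale file.
(close (uiop:process-info-input *shell*))
(uiop:wait-process *shell*)
(uiop:quit)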

10. Project History & Credits

For previous project iterations and experience, see project-isidore-java repository on GitHub using Java Spring, and project-isidore-javascript currently on GitHub using NextJS. See also MEAN stack notes.

Credit must be given where credit is due. This website would not be possible without,

From the bottom of my heart, thank you!

References

[1] Scott Chacon. Pro Git. The Expert's Voice in Software Development. Apress, New York, NY, second edition, 2014.