Tags: lisp, common-lisp, sbcl, subsequence

How can I use Lisp subseq using colon (or other non-alphanumeric characters)?


I need to extract a substring from a string; the substring is enclosed by ":" and ";". E.g.

:substring;

But with Lisp (SBCL), I'm having trouble extracting the substring. When I run:

(subseq "8.I:123;" : ;)

I get:

#<THREAD "main thread" RUNNING {1000510083}>:
  illegal terminating character after a colon: #\

    Stream: #<SYNONYM-STREAM :SYMBOL SB-SYS:*STDIN* {1000025923}>

Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.

restarts (invokable by number or by possibly-abbreviated name):
  0: [ABORT] Exit debugger, returning to top level.

(SB-IMPL::READ-TOKEN #<SYNONYM-STREAM :SYMBOL SB-SYS:*STDIN* {1000025923}> #\:)

I've tried preceding the colon and semicolon with \ but that throws a different error. Can anyone advise? Thanks in advance for the help!


Solution

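  • As the documentation for subseq shows, start and end are bounding index designators: start must be a non-negative integer, and end must be a non-negative integer or nil.

    #\: and #\; are characters, so they can't be passed as indices. (Written bare, as in your call, : is read as the start of a keyword symbol and ; starts a comment, which is why the reader signals an error before subseq is ever called.) Instead, use the function position to find the first index of each character and pass those indices to subseq. You have to check that both indices exist and that the second one is greater than the first: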

    (let* ((string "8.I:123;")
           (pos1 (position #\: string))
           (pos2 (position #\; string)))
      (when (and pos1 pos2 (> pos2 pos1))
        (subseq string
                (1+ pos1)
                pos2)))
    
    => "123"
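
    One limitation of the snippet above is that it looks for the first #\; anywhere in the string, so it returns NIL for input such as "a;b:123;". A minimal sketch of a reusable variant follows; the name extract-between and its keyword parameters are hypothetical, not part of the original answer. It uses the standard :start argument of position so that the closing character is only searched for after the opening one:

    ```lisp
    ;; Hypothetical helper (not from the original answer): return the text
    ;; between the first OPEN-CHAR and the first CLOSE-CHAR that follows it,
    ;; or NIL if no such pair exists.
    (defun extract-between (string &key (open-char #\:) (close-char #\;))
      (let* ((start (position open-char string))
             ;; Search for CLOSE-CHAR only after OPEN-CHAR, so an earlier
             ;; closing character is ignored.
             (end (and start
                       (position close-char string :start (1+ start)))))
        (when end
          (subseq string (1+ start) end))))

    (extract-between "8.I:123;")      ; => "123"
    (extract-between "a;b:123;")      ; => "123"
    (extract-between "no delimiters") ; => NIL
    ```

    Because end is only computed when start is non-NIL, and is searched for from (1+ start), the explicit (> pos2 pos1) check is no longer needed.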
    

    This is a little cumbersome, so I suggest using a regex library. The following example uses CL-PPCRE:

    (load "~/quicklisp/setup.lisp")
    (ql:quickload :cl-ppcre)
    
    (cl-ppcre:all-matches-as-strings "(?<=:)([^;]*)(?=;)" "8.I:123;:aa;")

    => ("123" "aa")
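
    If pulling in a library is not an option, a similar result can be sketched with position and loop alone. The function name all-between and its keyword parameters are assumptions made for illustration; this is not a CL-PPCRE feature and may differ from the regex on unusual inputs:

    ```lisp
    ;; Hypothetical helper (not from the original answer): collect every
    ;; substring enclosed by OPEN-CHAR and the next CLOSE-CHAR, left to right.
    (defun all-between (string &key (open-char #\:) (close-char #\;))
      (loop with from = 0
            ;; Find the next opening character at or after FROM.
            for start = (position open-char string :start from)
            ;; Find the first closing character after that opening one.
            for end = (and start
                           (position close-char string :start (1+ start)))
            ;; Stop as soon as no complete pair remains.
            while end
            collect (subseq string (1+ start) end)
            ;; Resume scanning just past the closing character.
            do (setf from (1+ end))))

    (all-between "8.I:123;:aa;") ; => ("123" "aa")
    ```

    The scan resumes after each closing character, so enclosed substrings are returned in order and never overlap.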