Opened 11 years ago

Closed 11 years ago

#74 closed defect (fixed)

utf-8 bug in compile-file

Reported by: Anton Vodonosov
Owned by:
Priority: major
Milestone:
Component: Unicode
Version: 20d
Keywords:
Cc:

Description

When compiling the attached file with the :utf8 external format, as shown below, CMUCL signals a type error:

$ rlwrap ./lisps/cmucl-20d/bin/lisp -noinit -nositeinit -eval '(defpackage :cl-haml (:use :common-lisp))' -eval '(compile-file "read-insert.lisp" :external-format :utf8)'

; Python version 1.1, VM version Intel x86/sse2 on 2013-02-26 19:54:05.
; Compiling: /home/testgrid/read-insert.lisp 2013-02-26 19:53:52


; In: LAMBDA (STREAM::%SLOTS%)

;   (STREAM::OCTETS-TO-CHAR :UTF-8 STREAM::STATE
;                           (AREF STREAM::OCOUNT STREAM::K)
;                           (IF # # #)
;                           ...)
; --> LET IF LET STREAM::OCTETS-TO-CODEPOINT MULTIPLE-VALUE-BIND
; --> MULTIPLE-VALUE-CALL LABELS BLOCK LET DOTIMES DO BLOCK LET TAGBODY LET
; --> TAGBODY LET IF SETF LET* MULTIPLE-VALUE-BIND LET LET
; ==>
;   (SETQ #:G29 #:G40)
; Note: Doing signed word to integer coercion (cost 20) to #:G29.

Type-error in KERNEL::OBJECT-NOT-TYPE-ERROR-HANDLER:
   -79 is not of type (OR (MOD 536870911) NULL)
   [Condition of type TYPE-ERROR]

Restarts:
  0: [ABORT] Skip remaining initializations.

Debug  (type H for help)

("DEFSTRUCT COMPILER-ERROR-CONTEXT" 268431572 14)[:OPTIONAL]
Source: Error finding source:
Error in function DEBUG::GET-FILE-TOP-LEVEL-FORM:  Source file no longer exists:
  target:compiler/ir1util.lisp.
0] backtrace

0: ("DEFSTRUCT COMPILER-ERROR-CONTEXT" 268431572 14)[:OPTIONAL]
1: (C::NOTE-UNDEFINED-REFERENCE READ-HAML-INSERT-LINE :FUNCTION "")
2: (C::FIND-FREE-REALLY-FUNCTION READ-HAML-INSERT-LINE "")
3: (C::FIND-FREE-FUNCTION READ-HAML-INSERT-LINE "")
4: (C::GET-DEFINED-FUNCTION READ-HAML-INSERT-LINE)
5: (C::IR1-CONVERT-%DEFUN #<Continuation c1>
                          #<Continuation c2>
                          (C::%DEFUN 'READ-HAML-INSERT-LINE
                                     #'(LAMBDA # #)
                                     NIL
                                     '(DEFUN READ-HAML-INSERT-LINE # #)))
6: (C::IR1-CONVERT #<Continuation c1>
                   #<Continuation c2>
                   (C::%DEFUN 'READ-HAML-INSERT-LINE
                              #'(LAMBDA # #)
                              NIL
                              '(DEFUN READ-HAML-INSERT-LINE # #)))
7: (C::IR1-CONVERT-PROGN-BODY #<Continuation c1>
                              #<Continuation c3>
                              ((C::%DEFUN 'READ-HAML-INSERT-LINE #'# NIL '#)
                               NIL))
8: (C::IR1-CONVERT-PROGN-BODY 3
                              #<Continuation c1>
                              #<Continuation c3>
                              ((C::%DEFUN 'READ-HAML-INSERT-LINE #'# NIL '#)
                               NIL))[:EXTERNAL]
9: (C::IR1-CONVERT-AUX-BINDINGS #<Continuation c1>
                                #<Continuation c3>
                                ((C::%DEFUN 'READ-HAML-INSERT-LINE #'# NIL '#)
                                 NIL)
                                NIL
                                ...)
10: (C::IR1-CONVERT-DYNAMIC-EXTENT-BINDINGS #<Continuation c1>
                                            #<Continuation c3>
                                            ((C::%DEFUN 'READ-HAML-INSERT-LINE
                                                        #'#
                                                        NIL
                                                        '#)
                                             NIL)
                                            NIL
                                            ...)
11: (C::IR1-CONVERT-SPECIAL-BINDINGS #<Continuation c1> #<Continuation c3>
     ((C::%DEFUN 'READ-HAML-INSERT-LINE #'# NIL '#) NIL) NIL ...)
12: (C::IR1-CONVERT-LAMBDA-BODY
     ((C::%DEFUN 'READ-HAML-INSERT-LINE #'# NIL '#) NIL)
     NIL
     NIL
     NIL
     ...)
13: (C::IR1-CONVERT-LAMBDA-BODY 2
                                ((C::%DEFUN 'READ-HAML-INSERT-LINE #'# NIL '#)
                                 NIL)
                                NIL
                                #(0 2 3 4 5 ...)
                                ...)[:EXTERNAL]
14: (C::IR1-TOP-LEVEL
     (C::%DEFUN 'READ-HAML-INSERT-LINE
                #'(LAMBDA # #)
                NIL
                '(DEFUN READ-HAML-INSERT-LINE # #))
     ((C::%DEFUN 'READ-HAML-INSERT-LINE #'# NIL '#) C::ORIGINAL-SOURCE-START 0
      1)
     NIL)
15: (C::CONVERT-AND-MAYBE-COMPILE
     (C::%DEFUN 'READ-HAML-INSERT-LINE
                #'(LAMBDA # #)
                NIL
                '(DEFUN READ-HAML-INSERT-LINE # #))
     ((C::%DEFUN 'READ-HAML-INSERT-LINE #'# NIL '#) C::ORIGINAL-SOURCE-START 0
      1))
16: (C::PROCESS-FORM
     (C::%DEFUN 'READ-HAML-INSERT-LINE
                #'(LAMBDA # #)
                NIL
                '(DEFUN READ-HAML-INSERT-LINE # #))
     (C::ORIGINAL-SOURCE-START 0 1))
17: (C::PROCESS-FORM
     (DEFUN READ-HAML-INSERT-LINE (STREAM &OPTIONAL # #) (LIST +HAML+ #))
     (C::ORIGINAL-SOURCE-START 0 1))
18: (C::PROCESS-FORM 2
                     (DEFUN READ-HAML-INSERT-LINE (STREAM &OPTIONAL # #)
                       (LIST +HAML+ #))
                     (C::ORIGINAL-SOURCE-START 0 1))[:EXTERNAL]
19: (C::PROCESS-SOURCES #<Source-Info>)
20: ((FLET #:G0 C::SUB-COMPILE-FILE))
21: (C::SUB-COMPILE-FILE #<Source-Info> NIL)
22: (C::SUB-COMPILE-FILE 1 #<Source-Info> NIL)[:EXTERNAL]
23: (COMPILE-FILE "read-insert.lisp" :OUTPUT-FILE T :ERROR-FILE ...)
24: (EXTENSIONS::EVAL-SWITCH-DEMON
     #<Command Line Switch "eval" -- ("(compile-file \"read-insert.lisp\" :external-format :utf8)")>)
25: ((FLET EXTENSIONS::INVOKE-DEMON EXTENSIONS::INVOKE-SWITCH-DEMONS)
     #<Command Line Switch "eval" -- ("(compile-file \"read-insert.lisp\" :external-format :utf8)")>)
26: (EXTENSIONS::INVOKE-SWITCH-DEMONS
     (#<Command Line Switch "noinit"> #<Command Line Switch "nositeinit">
      #<Command Line Switch "eval" -- ("(defpackage :cl-haml (:use :common-lisp))")>
      #<Command Line Switch "eval" -- ("(compile-file \"read-insert.lisp\" :external-format :utf8)")>)
     (("-help" . #) ("help" . #) ("load" . #) ("eval" . #)))
27: ((LABELS LISP::%RESTART-LISP EXTENSIONS:SAVE-LISP))
28: ((LABELS LISP::RESTART-LISP EXTENSIONS:SAVE-LISP))

0]

This file is from the cl-haml library. It can be found in the Quicklisp 2013-01-28 release under the path quicklisp/dists/quicklisp/software/cl-haml-20130128-git/src/read-insert.lisp

Original source code repository: https://github.com/Unspeakable/cl-haml/blob/master/src/read-insert.lisp

All other CL implementations compile the file successfully.

Also, iconv doesn't report any UTF-8 problems in the file:

$ iconv -f UTF-8 read-insert.lisp -o /dev/null
$ echo $?

(AFAIK, iconv would exit with status 1 on an encoding problem and print the offset of the invalid octets.)
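
The iconv check can be made self-verifying by running it against one well-formed and one malformed sample. The sketch below is only an illustration and not part of the original report: `is_valid_utf8` is a hypothetical helper name, the two sample files are created just for the demonstration, and it assumes an iconv that exits with a non-zero status on an invalid input sequence.

```shell
# Hypothetical helper: succeeds only if iconv accepts the file as UTF-8.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

tmpdir=$(mktemp -d)

# Well-formed sample: U+00E9 encoded as the two octets C3 A9.
printf '\303\251\n' > "$tmpdir/good.txt"
# Malformed sample: the octet FF never appears in valid UTF-8.
printf '\377\n' > "$tmpdir/bad.txt"

# Record the outcome of each check as 0 (valid) or 1 (invalid).
if is_valid_utf8 "$tmpdir/good.txt"; then good_status=0; else good_status=1; fi
if is_valid_utf8 "$tmpdir/bad.txt"; then bad_status=0; else bad_status=1; fi

echo "good.txt: $good_status, bad.txt: $bad_status"
rm -rf "$tmpdir"
```

If the malformed sample is rejected while read-insert.lisp is accepted, that supports the conclusion that the file's encoding is not the problem.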

Compiling with CMUCL without the :utf8 external format also succeeds:

./lisps/cmucl-20d/bin/lisp -noinit -nositeinit -eval '(defpackage :cl-haml (:use :common-lisp))' -eval '(compile-file "read-insert.lisp")' 

Attachments (1)

read-insert.lisp (411 bytes) - added by Anton Vodonosov 11 years ago.
read-insert.lisp from cl-haml of quicklisp 2013-01-28

Change History (3)

Changed 11 years ago by Anton Vodonosov

Attachment: read-insert.lisp added

read-insert.lisp from cl-haml of quicklisp 2013-01-28

comment:1 Changed 11 years ago by Raymond Toy

Thanks for the detailed report. It appears to be a bug in file-position. Unicode buffering is quite messy so it will take a bit of time to work this out.

comment:2 Changed 11 years ago by toy.raymond@…

Resolution: fixed
Status: new → closed

commit e8f64b3f83455a82edad394c472481fadde6ccb5
Author: Raymond Toy <toy.raymond@…>
Date: Tue Feb 26 20:44:18 2013 -0800

Fix ticket:74

When accounting for the octets left in the in-buffer that we haven't read (or converted to characters), we were subtracting the index from the total in-buffer length. This is wrong if the file is less than the total in-buffer length. We should have subtracted from the actual number of octets in the in-buffer.
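
To see why the old subtraction misbehaves on a short file, the arithmetic can be sketched with shell integer math. This is only an illustration of the commit message, not CMUCL's code. The variable names are hypothetical, and so is the way the unread count is subtracted from the OS file offset to get a position. The 411-octet size comes from the attachment, but the buffer capacity (512) and the read index (22) are assumed values, picked so that the buggy result matches the -79 in the report. The ticket does not state the real ones.

```shell
in_buffer_length=512   # assumed total capacity of the in-buffer
octets_in_buffer=411   # octets actually read: the whole 411-byte attachment
index=22               # assumed offset of the next unread octet in the buffer
os_position=411        # the OS file offset is already at end of file

# Buggy accounting: treats the buffer as if it were completely full.
unread_wrong=$(( in_buffer_length - index ))
position_wrong=$(( os_position - unread_wrong ))

# Fixed accounting: counts only the octets that were really read.
unread_right=$(( octets_in_buffer - index ))
position_right=$(( os_position - unread_right ))

echo "buggy: unread=$unread_wrong position=$position_wrong"
echo "fixed: unread=$unread_right position=$position_right"
```

With these assumed numbers the buggy form yields a negative position, which cannot satisfy a non-negative type such as (OR (MOD 536870911) NULL). The fixed form yields a position equal to the index, as expected when the whole file fits in the buffer.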