Opened 19 years ago

Closed 19 years ago

Last modified 19 years ago

#2 closed enhancement (fixed)

external format guessing for incoming messages

Reported by: afuchs
Owned by: Erik Huelsmann
Priority: major
Milestone:
Component: chat
Version:
Keywords:
Cc:
Patch attached: yes

Description

(reposted from mail to cl-irc-devel) Hi,

the following patch is a proof-of-concept implementation of external format guessing for incoming messages (and customizable external formats for outgoing messages, defaulting conservatively to latin-1).

With that patch, cl-irc now opens a binary stream to the server, and opens flexi-streams on top of that. The outgoing part is pretty straightforward; the incoming part not so much (:

Reading works like this:

  • We read a line of latin-1 chars
  • We try to decode their char-codes (latin-1 is a 1:1 translation to char-codes, and where it isn't, I hope flexi-streams takes care of that (-:) using the list of external formats in *default-incoming-external-formats*.
  • When we find a decoding that doesn't signal an error, we build a message from that.
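The three steps above can be sketched as follows. This is an illustration in Python, not the Lisp code from the patch: the name `guess_decode` and the tuple `DEFAULT_INCOMING_ENCODINGS` (a stand-in for `*default-incoming-external-formats*`) are hypothetical, and the choice of candidate encodings is an assumption.

```python
# Hypothetical stand-in for *default-incoming-external-formats*.
# Order matters: latin-1 accepts any byte sequence, so it must come last.
DEFAULT_INCOMING_ENCODINGS = ("utf-8", "latin-1")


def guess_decode(latin1_line, encodings=DEFAULT_INCOMING_ENCODINGS):
    """Re-decode a line that was read as latin-1 text.

    Returns (text, encoding) for the first candidate encoding that
    decodes the underlying octets without raising an error.
    """
    # latin-1 maps every byte to the code point of the same value,
    # so encoding back to latin-1 recovers the original octets.
    octets = latin1_line.encode("latin-1")
    for encoding in encodings:
        try:
            return octets.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # this candidate rejected the bytes; try the next one
    raise ValueError("no candidate encoding could decode the line")
```

Because latin-1 never rejects input, putting it anywhere but last would make every later candidate unreachable; the same ordering concern presumably applies to any list of external formats used for guessing.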

Positive side effect: cl-irc should now work on windows because the external format allows specification of eol convention. Negative side effect: I didn't get parsing to work without #\Return, so this patch appends a #\Return character to the raw message.

Which brings me to the todo list:

  • DCC connections probably don't work. I don't care about dcc, so I won't fix them (:
  • The parser should work without #\Return.
  • reading latin1 and decoding from the char codes is ... ugly. But it's probably less ugly than doing our own buffering. Maybe somebody wants to investigate that. (:

Attachments (1)

external-format-guessing.patch (9.0 KB) - added by afuchs 19 years ago.
external format guessing patch


Change History (9)

Changed 19 years ago by afuchs

external format guessing patch

comment:1 Changed 19 years ago by Erik Huelsmann

Owner: changed from somebody to anonymous
Status: changed from new to assigned

DCC connections didn't work anyway, so there's no reason (yet) to make them work with this change.

comment:2 Changed 19 years ago by Erik Huelsmann

Owner: changed from anonymous to Erik Huelsmann
Status: changed from assigned to new

comment:3 Changed 19 years ago by Erik Huelsmann

Status: changed from new to assigned

comment:4 Changed 19 years ago by Erik Huelsmann

The second item in the TODO list (eliminate #\Return dependency) has been addressed in r149.

comment:5 Changed 19 years ago by Erik Huelsmann

Ok, I have looked into it, and we can't do our own buffering, since there's no platform-independent way to tell how much data we can read from the stream without blocking for more input.

We can OTOH use read-char to create a somewhat read-line like function: read-sequence-until; it reads a stream until a limiting sequence is matched. In our case, we could use '(13 10) as the limiting sequence.
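A minimal sketch of such a function, written in Python for illustration rather than taken from cl-irc: it reads one element at a time and stops once the tail of what has been read matches the limiting sequence. The signature and the behaviour at end of stream (return whatever was read) are assumptions, not details from the ticket.

```python
import io


def read_sequence_until(stream, limit=b"\r\n"):
    """Read single bytes from a binary stream until `limit` is matched.

    Returns the bytes read, including the limiting sequence. If the
    stream ends first, returns whatever was read (possibly b"").
    """
    buffer = bytearray()
    while not buffer.endswith(limit):
        byte = stream.read(1)  # one element per call, like read-char
        if not byte:
            break  # end of stream before the limit was seen
        buffer += byte
    return bytes(buffer)


# Usage: two CR LF terminated lines followed by an unterminated tail.
stream = io.BytesIO(b"PING :server\r\nNOTICE x\r\ntail")
first_line = read_sequence_until(stream)
second_line = read_sequence_until(stream)
```

Reading one element per call only asks for data that is needed to complete the current line, which is why it sidesteps the question of how much can be read without blocking.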

I noticed in the SBCL sources that read-char is a buffered call just as any other stream read action, so, at least for SBCL, the performance impact should be limited.

comment:6 Changed 19 years ago by Erik Huelsmann

DCC connections don't work at all, so, that point doesn't need addressing in this issue...

The only issue remaining is the reading of latin1. I think it's not only ugly, but there may be a little flaw in using that technique: when reading latin1 characters, the char-code of the internal representation of the character read will not necessarily be the same as the value in the latin1 encoding.
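The concern can be made concrete with a small check that tests whether a decoder maps every octet to the character code of the same value. The sketch below is in Python, where latin-1 is an exact identity mapping; whether the same holds for a given Lisp implementation's char-code is exactly the open question here, so this only illustrates what a mismatch looks like. The helper name `is_identity_mapping` is hypothetical, and cp1252 is used purely as an example of a latin-1 lookalike that is not 1:1.

```python
def is_identity_mapping(encoding):
    """Return True if every byte 0..255 decodes to the same code point."""
    for value in range(256):
        try:
            char = bytes([value]).decode(encoding)
        except UnicodeDecodeError:
            return False  # some byte has no character in this encoding
        if len(char) != 1 or ord(char) != value:
            return False  # character code differs from the octet value
    return True


latin1_is_identity = is_identity_mapping("latin-1")
cp1252_is_identity = is_identity_mapping("cp1252")
```

If the mapping is not an identity, recovering octets from character codes silently corrupts the affected bytes before any external-format guessing takes place.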

comment:7 Changed 19 years ago by Erik Huelsmann

Resolution: fixed
Status: changed from assigned to closed

Resolved in r150, which is heavily inspired by, but not

comment:8 Changed 19 years ago by Erik Huelsmann

(continued from the last comment)

... an immediate or even tweaked application of it.
