#2 closed enhancement (fixed)
external format guessing for incoming messages
Reported by: | afuchs | Owned by: | Erik Huelsmann |
---|---|---|---|
Priority: | major | Milestone: | |
Component: | chat | Version: | |
Keywords: | | Cc: | |
Patch attached: | yes |
Description
(reposted from mail to cl-irc-devel) Hi,
the following patch is a proof-of-concept implementation of external format guessing for incoming messages (and customizable external formats for outgoing messages, defaulting conservatively to latin-1).
With that patch, cl-irc now opens a binary stream to the server, and opens flexi-streams on top of that. The outgoing part is pretty straightforward; the incoming part not so much (:
Reading works like this:
- We read a line of latin-1 chars
- We try to decode their char-codes (latin1 is a 1:1 translation to char-codes, and where it isn't, I hope flexi-streams takes care of that (-:) using the list of external formats in *default-incoming-external-formats*.
- When we find a decoding that doesn't throw an error, we build a message from that.
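The reading steps above can be sketched roughly as follows. This is only an illustration, not the code from the attached patch: the names `*candidate-formats*` and `guess-decode` are hypothetical, the format list is an assumed example value, and the sketch assumes that flexi-streams is available through Quicklisp and signals `external-format-encoding-error` on undecodable input (its behaviour when `flexi-streams:*substitution-char*` is left unset).

```lisp
;; Assumption: Quicklisp is installed and can load flexi-streams.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload :flexi-streams))

;; Hypothetical stand-in for *default-incoming-external-formats*;
;; the actual value used by the patch may differ.
(defparameter *candidate-formats* '(:utf-8 :latin-1))

(defun guess-decode (latin-1-line &optional (formats *candidate-formats*))
  "Re-decode LATIN-1-LINE with the first format in FORMATS that does not fail.
Returns the decoded string and the format used, or NIL if every format fails."
  ;; Assumes each character's char-code (0..255) equals the octet it was
  ;; read from, i.e. that latin-1 maps 1:1 onto char-codes.
  (let ((octets (map '(vector (unsigned-byte 8)) #'char-code latin-1-line)))
    (dolist (format formats)
      (handler-case
          (return (values (flexi-streams:octets-to-string
                           octets :external-format format)
                          format))
        ;; This format cannot decode the octets; try the next one.
        (flexi-streams:external-format-encoding-error ()
          nil)))))
```

Since latin-1 assigns a character to every octet, it never fails to decode, so it only makes sense as the last entry in the list.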
Positive side effect: cl-irc should now work on Windows because the external format allows specification of the eol convention. Negative side effect: I didn't get parsing to work without #\Return, so this patch appends a #\Return character to the raw message.
Which brings me to the todo list:
- DCC connections probably don't work. I don't care about dcc, so I won't fix them (:
- The parser should work without #\Return.
- reading latin1 and decoding from the char codes is ... ugly. But it's probably less ugly than doing our own buffering. Maybe somebody wants to investigate that. (:
Attachments (1)
Change History (9)
Changed 19 years ago by
Attachment: | external-format-guessing.patch added |
---|
comment:1 Changed 19 years ago by
Owner: | changed from somebody to anonymous |
---|---|
Status: | new → assigned |
DCC connections didn't work anyway, so there's no reason (yet) to make them work with this change.
comment:2 Changed 19 years ago by
Owner: | changed from anonymous to Erik Huelsmann |
---|---|
Status: | assigned → new |
comment:3 Changed 19 years ago by
Status: | new → assigned |
---|
comment:4 Changed 19 years ago by
The second item in the TODO list (eliminate #\Return dependency) has been addressed in r149.
comment:5 Changed 19 years ago by
OK, I investigated this: we can't do our own buffering, since there's no platform-independent way to tell how much data we can read from the stream without blocking for more input.
We can OTOH use read-char to create a read-line-like function: read-sequence-until; it reads from a stream until a limiting sequence is matched. In our case, we could use '(13 10) as the limiting sequence.
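A minimal sketch of such a function, using only standard Common Lisp, might look like the following. The name `read-chars-until` and its signature are hypothetical and are not taken from cl-irc; the sketch works on characters via read-char and assumes that `(code-char 13)` and `(code-char 10)` are the carriage return and linefeed characters, as they are on ASCII/Unicode-based implementations.

```lisp
(defun read-chars-until (stream limit)
  "Read characters from STREAM until the string LIMIT has just been read.
Returns a fresh string that includes LIMIT, or whatever was read before
end of file if LIMIT never appears."
  (let ((buffer (make-array 64 :element-type 'character
                               :adjustable t :fill-pointer 0))
        (limit-length (length limit)))
    (loop for char = (read-char stream nil nil)
          while char
          do (vector-push-extend char buffer)
          ;; Stop as soon as the tail of the buffer equals LIMIT.
          until (and (>= (length buffer) limit-length)
                     (string= limit buffer
                              :start2 (- (length buffer) limit-length))))
    ;; Return a simple string instead of the adjustable buffer.
    (copy-seq buffer)))

;; Usage: read one CR LF terminated line from an in-memory stream.
(defparameter *crlf* (map 'string #'code-char '(13 10)))

(with-input-from-string (in (concatenate 'string "PING :a" *crlf* "PONG :b" *crlf*))
  (read-chars-until in *crlf*))
```

The sketch reads one character at a time, so it only ever asks the stream for input that is still needed to complete the current line.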
I noticed in the SBCL sources that read-char is a buffered call just as any other stream read action, so, at least for SBCL, the performance impact should be limited.
comment:6 Changed 19 years ago by
DCC connections don't work at all, so, that point doesn't need addressing in this issue...
The only issue remaining is the reading of latin1. I think it's not only ugly, but there may be a little flaw in using that technique: when reading latin1 characters, the char-code of the internal representation of the character read will not necessarily be the same as the value in the latin1 encoding.
comment:7 Changed 19 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Resolved in r150, which is heavily inspired by, but not
comment:8 Changed 19 years ago by
(continued from the last comment)
... an immediate or even tweaked application of it.
external format guessing patch