elinks-dev
Re: [elinks-dev] Line-drawing characters when dumping Web pages Kalle Olavi Niemitalo Mon Jun 08 17:00:39 2009
(moved from elinks-users to elinks-dev because of the patch)

Karl Ove Hufthammer <[EMAIL PROTECTED]> writes:

> When I use ELinks interactively, table borders are drawn using nice
> line-drawing characters. However, when I use ‘links --dump’, these are
> replaced by ugly -, | and + ASCII characters, even if I dump to UTF-8.
> Is there a way to retain the nice borders when dumping a Web page?

Not at the moment. The attached patch for elinks-0.12
(20dfdb284f9a23742800fb5b4023bef54c6ad982) implements this, but I'm
not sure it is the right solution, because e.g. KOI8-R also supports
line-drawing characters so the fix should preferably not be specific
to UTF-8. Comments?
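To make the KOI8-R concern concrete, here is a rough sketch of what a charset-neutral step might look like: start from the Unicode code point of the frame character, ask whether the output charset has it, and only fall back to ASCII when it does not. None of this is in the patch. The function names are hypothetical, the table covers only the single-line box characters of KOI8-R (0x80..0x8A), and the fallback is a simplification of what frame_dumb[] does.

```c
#include <stdint.h>

/* Hypothetical helper: KOI8-R byte for a Unicode single-line box-drawing
 * character, or 0 if this sketch does not cover it. */
static unsigned char unicode_to_koi8r_frame(uint32_t u)
{
	switch (u) {
	case 0x2500: return 0x80; /* horizontal line */
	case 0x2502: return 0x81; /* vertical line */
	case 0x250C: return 0x82; /* top-left corner */
	case 0x2510: return 0x83; /* top-right corner */
	case 0x2514: return 0x84; /* bottom-left corner */
	case 0x2518: return 0x85; /* bottom-right corner */
	case 0x251C: return 0x86; /* tee pointing right */
	case 0x2524: return 0x87; /* tee pointing left */
	case 0x252C: return 0x88; /* tee pointing down */
	case 0x2534: return 0x89; /* tee pointing up */
	case 0x253C: return 0x8A; /* cross */
	default:     return 0;
	}
}

/* Simplified ASCII fallback (an assumption, not a copy of frame_dumb[]):
 * '-' and '|' for straight lines, '+' for everything else. */
static unsigned char ascii_frame_fallback(uint32_t u)
{
	switch (u) {
	case 0x2500: return '-';
	case 0x2502: return '|';
	default:     return '+';
	}
}

/* Pick the byte to write for a frame character in a KOI8-R dump:
 * the native line-drawing byte if one exists, else ASCII. */
static unsigned char dump_frame_byte_koi8r(uint32_t u)
{
	unsigned char b = unicode_to_koi8r_frame(u);

	return b ? b : ascii_frame_fallback(u);
}
```

A real implementation would presumably go through the existing charset conversion tables rather than a hand-written switch, so the same code path would serve every output charset.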
From 827a77a6e5fad1f4dc69909090bf07fb7b84ee51 Mon Sep 17 00:00:00 2001
From: Kalle Olavi Niemitalo <[EMAIL PROTECTED]>
Date: Tue, 9 Jun 2009 01:48:42 +0300
Subject: [PATCH] Line-drawing characters in UTF-8 dumps
When dumping the document to a file, ELinks used to represent lines in
tables and HR elements as ASCII -+| characters. Now, if the output
charset is UTF-8, it uses Unicode line-drawing characters instead.
This change affects elinks --dump and File -> Save formatted document,
but not the Lua current_document_formatted function.
---
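For readers without the ELinks tree at hand, the conversion described above can be sketched outside ELinks roughly as follows. This is only an illustration, not the patched code: the two function names are hypothetical stand-ins for what cp2u() and encode_utf8() do in the patch, and the table covers only the single-line frame bytes that appear in frame_vt100_u[].

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: map a CP437 single-line frame byte to its Unicode
 * code point. Returns 0 for bytes this sketch does not cover. */
static uint32_t cp437_frame_to_unicode(unsigned char c)
{
	switch (c) {
	case 177: return 0x2592; /* medium shade */
	case 179: return 0x2502; /* vertical line */
	case 180: return 0x2524; /* tee pointing left */
	case 191: return 0x2510; /* top-right corner */
	case 192: return 0x2514; /* bottom-left corner */
	case 193: return 0x2534; /* tee pointing up */
	case 194: return 0x252C; /* tee pointing down */
	case 195: return 0x251C; /* tee pointing right */
	case 196: return 0x2500; /* horizontal line */
	case 197: return 0x253C; /* cross */
	case 217: return 0x2518; /* bottom-right corner */
	case 218: return 0x250C; /* top-left corner */
	default:  return 0;
	}
}

/* Encode a code point up to U+FFFF as NUL-terminated UTF-8 in out.
 * Returns the number of bytes written, not counting the NUL.
 * Surrogates and code points above U+FFFF are not handled here. */
static size_t utf8_encode(uint32_t u, unsigned char out[4])
{
	if (u < 0x80) {
		out[0] = (unsigned char) u;
		out[1] = 0;
		return 1;
	}
	if (u < 0x800) {
		out[0] = (unsigned char) (0xC0 | (u >> 6));
		out[1] = (unsigned char) (0x80 | (u & 0x3F));
		out[2] = 0;
		return 2;
	}
	out[0] = (unsigned char) (0xE0 | (u >> 12));
	out[1] = (unsigned char) (0x80 | ((u >> 6) & 0x3F));
	out[2] = (unsigned char) (0x80 | (u & 0x3F));
	out[3] = 0;
	return 3;
}
```

For example, frame byte 196 maps to U+2500, which encodes to the three bytes E2 94 80. Every box-drawing character is three bytes in UTF-8, which is why the dump loop writes the encoded buffer byte by byte instead of calling write_char() once.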
NEWS | 2 +
src/terminal/screen.c | 2 +-
src/terminal/terminal.h | 1 +
src/viewer/dump/dump-specialized.h | 39 +++++++++++++++++++++++------------
4 files changed, 29 insertions(+), 15 deletions(-)
diff --git a/NEWS b/NEWS
index a84f3f9..f407c7c 100644
--- a/NEWS
+++ b/NEWS
@@ -15,6 +15,8 @@ includes the changes listed under ``ELinks 0.11.6.GIT now''
below.
* minor bug 1017: To work around HTTP server bugs, disable
protocol.http.compression by default, until ELinks can report
decompression errors or automatically retry the connection.
+* enhancement: ``--dump'' and ``Save formatted document'' output
+ line-drawing characters if using UTF-8.
ELinks 0.12pre4:
----------------
diff --git a/src/terminal/screen.c b/src/terminal/screen.c
index 8f838a6..34c93d8 100644
--- a/src/terminal/screen.c
+++ b/src/terminal/screen.c
@@ -41,7 +41,7 @@ static const unsigned char frame_vt100[48] =
"aaaxuuukkuxkjjjkmvwtqnttmlvwtqnvvw
* characters encoded in CP437.
* When UTF-8 I/O is enabled, ELinks uses this array instead of
* ::frame_vt100[], and converts the characters from CP437 to UTF-8. */
-static const unsigned char frame_vt100_u[48] = {
+const unsigned char frame_vt100_u[48] = {
177, 177, 177, 179, 180, 180, 180, 191,
191, 180, 179, 191, 217, 217, 217, 191,
192, 193, 194, 195, 196, 197, 195, 195,
diff --git a/src/terminal/terminal.h b/src/terminal/terminal.h
index c2c1d79..3bf9d19 100644
--- a/src/terminal/terminal.h
+++ b/src/terminal/terminal.h
@@ -166,6 +166,7 @@ extern LIST_OF(struct terminal) terminals;
extern const unsigned char frame_dumb[];
+extern const unsigned char frame_vt100_u[];
struct terminal *init_term(int, int);
void destroy_terminal(struct terminal *);
diff --git a/src/viewer/dump/dump-specialized.h b/src/viewer/dump/dump-specialized.h
index f60aeed..6d21839 100644
--- a/src/viewer/dump/dump-specialized.h
+++ b/src/viewer/dump/dump-specialized.h
@@ -41,6 +41,9 @@ DUMP_FUNCTION_SPECIALIZED(struct document *document, int fd,
unsigned char *background = &color[3];
int width = get_opt_int("document.dump.width");
#endif /* DUMP_COLOR_MODE_TRUE */
+#ifdef DUMP_CHARSET_UTF8
+ const int cp437 = get_cp_index("cp437");
+#endif
for (y = 0; y < document->height; y++) {
int white = 0;
@@ -105,23 +108,11 @@ DUMP_FUNCTION_SPECIALIZED(struct document *document, int fd,
c = document->data[y].chars[x].data;
+#ifndef DUMP_CHARSET_UTF8
if ((attr & SCREEN_ATTR_FRAME)
&& c >= 176 && c < 224)
c = frame_dumb[c - 176];
-#ifdef DUMP_CHARSET_UTF8
- else {
- unsigned char *utf8_buf = encode_utf8(c);
-
- while (*utf8_buf) {
- if (write_char(*utf8_buf++,
- fd, buf, &bptr)) return -1;
- }
-
- x += unicode_to_cell(c) - 1;
-
- continue;
- }
-#endif /* DUMP_CHARSET_UTF8 */
+#endif /* !DUMP_CHARSET_UTF8 */
if (c <= ' ') {
/* Count spaces. */
@@ -136,10 +127,30 @@ DUMP_FUNCTION_SPECIALIZED(struct document *document, int fd,
white--;
}
+#ifdef DUMP_CHARSET_UTF8
+ if ((attr & SCREEN_ATTR_FRAME)
+ && c >= 176 && c < 224)
+ c = cp2u(cp437, frame_vt100_u[c - 176]);
+
+ {
+ unsigned char *utf8_buf = encode_utf8(c);
+
+ while (*utf8_buf) {
+ if (write_char(*utf8_buf++,
+ fd, buf, &bptr)) return -1;
+ }
+
+ x += unicode_to_cell(c) - 1;
+
+ continue;
+ }
+#else /* !DUMP_CHARSET_UTF8 */
/* Print normal char. */
if (write_char(c, fd, buf, &bptr))
return -1;
+#endif /* !DUMP_CHARSET_UTF8 */
}
+
#if defined(DUMP_COLOR_MODE_16) || defined(DUMP_COLOR_MODE_256) || defined(DUMP_COLOR_MODE_TRUE)
for (;x < width; x++) {
if (write_char(' ', fd, buf, &bptr))
--
1.6.3.1.25.g0f3af