musicbrainz-devel@musicbrainz.org

Subject: Re: [mb-devel] PostgreSQL 8.1 and illegal UTF-8 characters in the DB
From: Dave Evans
Date: Fri Aug 11 23:33:31 2006

On Sat, Jan 14, 2006 at 06:25:13AM +0100, Björn Krombholz wrote:
> Don't be so sure that it's only those 2 entries.
> It's not unlikely that similar errors are hidden in one of the other
> dumps or the real moderator table (the dumps include only the
> sanitized version), especially in the password field and the very old
> mod notes.

I just checked the whole DB like so:

./admin/ExportAllTables --no-compress --keep
cd /tmp/mbexport-whatever/mbdump
perl -MEncode=FB_CROAK,decode -ne 'eval {decode("utf-8",$_,FB_CROAK)}; print "$ARGV:$.:$_" if $@' *

Results:

5 bad rows in moderation_closed (id = 208986, 1405919, 2929418, 2929419,
2925226) and 1 bad row in moderation_note_closed (id=1031).

I haven't attempted to fix them.
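
For anyone who wants to repeat the check without Perl, here is a rough Python sketch of the same idea: read each dump file as raw bytes, try a strict UTF-8 decode line by line, and report the lines that fail. This is not the command that was run above; the function name find_invalid_utf8 and the output format are made up for illustration.

```python
import sys


def find_invalid_utf8(path):
    """Yield (line_number, raw_bytes) for every line that is not valid UTF-8.

    Hypothetical helper: the file is read in binary mode so that no
    decoding happens before the explicit strict check below.
    """
    with open(path, "rb") as fh:
        for lineno, raw in enumerate(fh, start=1):
            try:
                raw.decode("utf-8", errors="strict")
            except UnicodeDecodeError:
                yield lineno, raw


if __name__ == "__main__":
    # Usage (script name is an assumption): python check_utf8.py *
    for name in sys.argv[1:]:
        for lineno, raw in find_invalid_utf8(name):
            # repr() keeps the offending bytes visible instead of mangling them
            print(f"{name}:{lineno}:{raw!r}")
```

One difference worth knowing: this sketch restarts the line count for each file, whereas Perl's `$.` keeps counting across the files given on the command line unless ARGV is closed at each end of file.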

Just out of interest, now that a load of old mods have been removed from the DB
(see
ftp://ftp.musicbrainz.org/pub/musicbrainz/data/oldmods-20051230-224830-UTC-1.dat.gz),
the maximum row length across the whole DB is down to 34475 bytes (in
moderation_closed, of course).  Therefore if we wanted to we could probably
reduce '$max' (currently 315000) in ExportAllTables.
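
How the row-length figure was measured isn't shown here. One possible way to get a comparable number from the exported files is sketched below in Python, assuming one row per line in the dump and counting bytes without the trailing newline; whether that matches what '$max' actually limits in ExportAllTables is an assumption. The function name max_line_length is hypothetical.

```python
import sys


def max_line_length(path):
    """Return (length_in_bytes, line_number) of the longest line in a file.

    Assumption: one dump row per line; the trailing newline is not counted.
    Returns (0, 0) for an empty file.
    """
    longest, where = 0, 0
    with open(path, "rb") as fh:
        for lineno, raw in enumerate(fh, start=1):
            n = len(raw.rstrip(b"\n"))
            if n > longest:
                longest, where = n, lineno
    return longest, where


if __name__ == "__main__":
    # Print the longest line per file so the overall maximum is easy to spot
    for name in sys.argv[1:]:
        length, lineno = max_line_length(name)
        print(f"{name}: {length} bytes (line {lineno})")
```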

-- 
Dave Evans

PGP key: http://rudolf.org.uk/pgpkey
_______________________________________________
MusicBrainz-devel mailing list
[EMAIL PROTECTED]
http://lists.musicbrainz.org/mailman/listinfo/musicbrainz-devel