   1 #============================================================================
   2 # Enca v1.12 (2009-10-29)  guess and convert encoding of text files
   3 # Copyright (C) 2000-2003 David Necas (Yeti) <yeti@physics.muni.cz>
   4 # Copyright (C) 2009 Michal Cihar <michal@cihar.com>
   5 #============================================================================
   6
   7 List of user-visible changes in Enca
   8 More detailed log can be obtained from older changelogs or git log.
   9
  10 Legend: + new feature
  11         * change of behaviour (including disappearing of a feature)
  12         - bugfix
  13
  14 enca-1.12 2009-10-29
  15   - Fixes some minor memory leaks.
  16   - Fixes little problems in autoconf scripts.
  17
  18 enca-1.11 2009-09-25
  19   - Dropped scanf configure test which is not used at all.
  20   - Fixes some wrong format strings.
  21
  22 enca-1.10 2009-08-25
  23   + Enca is back alive or at least in maintenance mode.
  24   * Enca now lives in git repository, see <http://gitorious.org/enca>.
  25   - Add missing charset koi8u to belarussian language.
  26   - Fixed some typos in program and documentation.
  27
  28 enca-1.9 2005-12-18
  29   + support for HZ encoding
  30   * Big5 and GBK detection improved
  31   - enca.spec no longer installs docs to world-unreadable directory
  32
  33 enca-1.8 2005-11-24
  34   + Chinese (Big5 and GBK) support (thanks to Zuxy)
  35   * deb/ subdirectory is gone as there is finally an Enca package in Debian
  36     (thanks to Michal Cihar)
  37   - manual page clean-up (thanks to Michal Cihar)
  38
  39 enca-1.7 2005-02-27
  40   + new name type: preferred MIME name (option -m)
  41   - broken iconv detection on some systems was fixed
  42
  43 enca-1.6 2004-09-01
  44   * English language names (--list=languages, enca_language_english_name())
  45     were changed to lowercase to match common locale aliases
  46   - Win32, i.e. MinGW and Cygwin, build problems were fixed
  47
  48 enca-1.5 2004-05-30
  49   - crash on impossible recovery after iconv failure in pipe was fixed
  50   - rpm building problems on Mandrake Linux were fixed
  51
  52 enca-1.4 2004-05-12
  53   - dependency of guessing API on locales (via ctype functions) was fixed
  54   - --help text generation failure on some systems was fixed
  55
  56 enca-1.3 2003-12-24
  57   + [libenca] it's possible to get analyser option values, not just set them
  58   * a good BOM (byte order mark) increases the chance of being recognized for
  59     UCS-4 and UTF-8 too
  60   * external converter wrappers were moved from bin to libexec and the b-
  61     prefix was removed (though it still works)
  62   * external converters are no longer searched in PATH, nonstandard ones
  63     have to be specified with full path
  64
  65 enca-1.2 2003-11-26
  66   - fixed segfault in language detection for some locale setups
  67
  68 enca-1.1 2003-11-17
  69   - fixed losing data at the end of file when using external converters in a
  70     pipe (and maybe in other situations)
  71   - [libenca] enca_analyser_free() not freeing analyser completely was fixed
  72
  73 enca-1.0 2003-11-06
  74   * deprecated options -T, -R, -S, -u, -U, -m, and -M were finally removed
  75   * default HTML API docs installation path changed to the new gtk-doc style
  76     (DATADIR/gtk-doc/html/enca)
  77   * debian/ subdir moved to deb/ to allow official deb creation w/o too much
  78     hassle
  79
  80 enca-0.99.4 2003-07-15
  81   - several race conditions in librecode and iconv interfaces were fixed
  82   - temporary file names are much less predictable now
  83
  84 enca-0.99.3 2003-06-30
  85   * Debian package is back from death
  86   * failure to find external converter is now fatal
  87   - fixed build problems on FreeBSD (and probably other Unices)
  88   - libiconv is not used for `conversion to ASCII' since it never does the
  89     Right Thing, whatever it is
  90   - when conversion with libiconv fails, the file should now survive intact
  91   - fixed build problems on systems w/o libiconv (hopefully)
  92   - fixed distclean and uninstall targets to really clean and uninstall
  93     everything
  94   - fixed builds with separate source (read-only) and build directories
  95   - fixed builds with --without-libiconv and --without-librecode on GNU/Linux
  96   - external converter is not checked when it's not going to be used
  97
  98 enca-0.99.2 2003-06-25
  99   + EOL type is used to decide ambiguous cases, e.g. CP1250 is reported
 100     instead of ISO-8859-2/CRLF
 101   * --list languages by default prints English names, instead of ISO-639a
 102     codes, use -e or -r to get the old listing
 103   * if LC_CTYPE is something like en_US, more locale categories are examined
 104     to detect the language
 105   * cork charset was modified to contain \n, \r and \t in the same places as
 106     ASCII
 107   * some heuristics tuning
 108
 109 enca-0.99.1 2003-06-22
 110   + libenca pkg-config support
 111   * all libenca tuning parameters (-T, -R, -S, -u, -U, -m, and -M) were
 112     marked deprecated and are noop, Enca should DWIM
 113   * ambiguity is now always OK when the sample has the same meaning in all the
 114     charsets
 115   * deprecated `built-in-encodings' and `encodings' lists were removed
 116   * PAGER feature was removed
 117   - exchanged `latvian' and `lithuanian' language names were fixed (`lv' and
 118     `lt' were always OK)
 119   - missing tests for the new languages were added to the test suite
 120
 121 enca-0.99.0 2003-06-14
 122   + added some support for: Bulgarian, Croatian, Estonian, Hungarian, Latvian,
 123     Lithuanian, Slovene
 124   + a new algorithm for 8bit-dense languages (cyrillics), the old one is used
 125     as a fallback
 126   * removed support for non-transitive iconv (such a thing should not exist)
 127   * auxiliary tools in data are no longer built in regular builds,
 128     use --enable-maintainer-mode to rebuild them, create dists, etc.
 129   - fixed iconv interface surface check pickier than iconv itself inhibiting
 130     some otherwise possible conversions
 131   - fixed u+x permissions on temporary files (from 0.10.7)
 132   - fixed not deleting temporary files in iconv interface
 133   - fixed broken iconv interface behaviour in pipes
 134   - fixed iconvcap misdetecting Latin5 as ISO-8859-5
 135   - fixed casual `make distclean' failures
 136
 137 enca-0.10.7 2003-01-28
 138   - fixed interchanged iconv and cstocs encoding names
 139   - corrected(?) librecode surface interaction
 140   - fixed a temporary file creation race condition
 141   * added tex and utf8 to cstocs (names and b-cstocs)
 142
 143 enca-0.10.6 2002-10-22
 144   + enconv uses DEFAULT_CHARSET variable, exactly as recode
 145   - ENCAOPT works everywhere, albeit imperfectly
 146   - options -P and -p no longer imply -M too
 147   - ambiguous mode (-M) works again
 148   - pager is run so that help text doesn't disappear
 149   - standard input is printed as STDIN with -d, not as null
 150   - make check works again
 151   - it compiles without recode again
 152
 153 enca-0.10.5 2002-10-13
 154   + UTF-8 recognition in binary and otherwise messy files
 155   + detection of double-encoding from some 8bit charset to UTF-8
 156   + Cork encoding conversion
 157   * librecode interaction was (hopefully) improved
 158   - fixed some build-time problems
 159
 160 enca-0.10.4 2002-10-10
 161   + added Cork encoding support for Czech, Slovak and Polish
 162   - empty files are now considered convertible to any encoding
 163   - removed the so-called faster (in fact slower) I/O
 164   - fixed some more compile-time search path issues
 165
 166 enca-0.10.3 2002-09-22
 167   * added support for perl umap as external converter
 168   - fixed external converter wrappers to work with standard sh
 169   - fixed some compile-time library search path issues
 170
 171 enca-0.10.2 2002-09-15
 172   + target charset is automatically obtained from locales when called as
 173     enconv, new options --guess, --auto-convert
 174   + English language names can be used instead of ISO-639 codes everywhere
 175   - cs_SK and ru_UA locales are properly recognised as Slovak and Ukrainian
 176
 177 enca-0.10.1 2002-08-29
 178   + faster I/O
 179   * external converters can be disabled at build time
 180   - `-' is accepted for standard input
 181   - fixed broken built-in converter
 182   - fixed crashing on an unknown language
 183   - trivial (identity) conversions are not performed any more
 184   - help is now printed when input is a terminal and no argument specified
 185   - changed braindamaged <STDIN>, <STDOUT> to STDIN, STDOUT in messages
 186   - various small fixes and build-time improvements
 187
 188 enca-0.10.0 2002-08-26
 189   + added support for Ukrainian (CP1251, IBM855, ISO-8859-5, KOI8-U, maccyr,
 190     CP1125), Belarussian (CP1251, IBM866, ISO-8859-5, KOI8-UNI, maccyr,
 191     IBM855) and Polish (ISO-8859-2, ISO-8859-13, ISO-8859-16, Baltic, macce,
 192     IBM852, CP1250)
 193   + Enca library introduced
 194   * dropped native Debian package
 195   * --details no longer prints guessing details (now is mostly like --human)
 196   * --list=encodings, --list=built-in-encodings corrected to --list=charsets,
 197     --list=built-in-charsets (old names supported with a warning)
 198   * improved Czech and Slovak charsets detection
 199
 200 enca-0.9.4: 2002-03-03
 201   - built-in converter didn't convert more than the first 64kB of a file
 202
 203 enca-0.9.3: 2001-07-16
 204   + a native Debian package
 205   - fixed random reporting of nonsense results
 206   - fixed self-contradictory --details output when file was quoted-printable
 207     encoded
 208   - fixed poor performance on non-GNU/Linux
 209   - made pager less intrusive (instead of intrusive `less' ;-)
 210   - --list=encodings prints only `known' encodings
 211   - fixed several compile-time/portability problems
 212
 213 enca-0.9.2: 2001-07-13
 214   * --help and --license are displayed through pager (when possible)
 215   - fixed broken language hooks--they were never activated (from 0.9.1)
 216   - fixed reporting ASCII when a 7bit encoding was detected
 217   - fixed boundary-case behaviour when recovering from librecode failures
 218
 219 enca-0.9.1: 2001-06-25
 220   + support for Macintosh Cyrillic, including conversion
 221   + support for unusual UCS-4 byte orders (3412 and 2143)
 222   + new option --license printing full enca license
 223   * exit codes now make sense (0, 1, 2; where 2 means serious troubles)
 224   - temporary files are no longer world-readable
 225
 226 enca-0.9.0: 2001-03-26
 227   Serious incompatibilities:
 228   * -E and -C option letters exchanged (much better mnemonics)
 229   * converter wrappers renamed to b-cstocs and b-recode
 230   * finding only 7bit ASCII is no longer considered failure
 231   * need to use --language to set language (sometimes)
 232   * dull converter behaviour no longer supported, -x syntax changed
 233   * option -g removed (try --name=aliases)
 234   * option -c changed to --list=converters, listing format changed
 235   * option -l changed to --list=encodings, listing format changed
 236   * converter names are no longer case insensitive
 237   * no longer uses cstocs names as canonical
 238   * external converters are called with Enca's names, not cstocs's
 239
 240   Other changes:
 241   + support for slovak and russian (and `none') language
 242   + support for CP1251, IBM866, ISO-8859-5 and KOI8-R, including conversion
 243   + UCS-2, UCS-4, UTF-8, UTF-7 and LaTeX encoding recognition
 244   + many more encoding aliases accepted
 245   + long `GNU style' command line options
 246   + new output types: --enca-name, --iconv-name
 247   + output type --name=WORD allowing to select output type by name
 248   + ENCAOPT environment variable
 249   + language detection from locales
 250   + support for surfaces (experimental)
 251   + new option --list printing various listings
 252   + new converter wrapper b-map (for perl `map')
 253   + new option -m to reset -M back
 254   + new language filters
 255   + new options -u and -U to control multibyte encoding checks
 256   + included [generated] enca.spec into the tarball to allow `rpm -tb'
 257   * -d output improved
 258   * read limit changed to 16MB
 259   * librecode now run with flags diacritics_only and ascii_graphics
 260   - fixed broken -P options
 261   - fixed several build problems on non-GNU/Linux systems
 262   - fixed some missing and wrong characters in Unicode data
 263   - temporary copy of damaged original file is not deleted when rescue fails
 264
 265 enca-0.8.x: Since features planned for 0.8 and 0.9 happened to be developed
 266   simultaneously, this version number has been skipped.
 267
 268 enca-0.7.7: 2001-01-01
 269   + ability to use UNIX98 iconv conversion functions
 270   + the word `none' can be used as -E parameter causing clearing of converter
 271     list
 272   - fixed disarranged help text, misspelled word `European' in macce long
 273     name, obsolete statements in manual page and other stuff of this kind
 274
 275 enca-0.7.6: 2000-11-20
 276   + any converter combination/order can be now specified with -E, old -E
 277     meaning is no longer valid
 278   + new option -c (list all valid converter names)
 279   * cork encoding not supported anymore
 280   * better verbosity
 281   * `/' is added to recode recoding requests thus partially solving the
 282     surface problem---surface never changes
 283   * some errors like specifying invalid value of threshold are no longer fatal,
 284     the bad values are ignored instead
 285   * handling of some exotic characters in built-in converter slightly changed
 286   - fixed several fatal bugs regarding stdin to stdout conversion
 287   - stdin is copied to stdout in case of failure whenever possible/applicable
 288
 289 enca-0.7.5: 2000-10-25
 290   * license changed to GNU GPL Version 2 (i.e. license version is explicitly
 291     specified)
 292   * prints error message when conversion is impossible
 293   * binary data filter improved/changed
 294   - falls back to external converter when GNU recode library cannot convert
 295     due to erroneous request
 296   - '' no longer causes enca to read from stdin
 297   - tries to restore files damaged by GNU recode library
 298
 299 enca-0.7.4: 2000-10-12
 300   + box-drawing characters are (carefully) filtered out when guessing
 301   - fixed intermixed behaviour in SMS/nonSMS modes
 302
 303 enca-0.7.3: 2000-10-09
 304   + blocks of probably binary data are filtered out when guessing
 305   * standard input is copied to standard output when its encoding is unknown
 306   - fixed reading only 4096 bytes from pipe (from 0.7.1)
 307
 308 enca-0.7.2: has never been released
 309   + GNU recode recoding chains made possible by starting -x (convert) parameter
 310     with `..'
 311   + second best guess is marked with `-' in -d (print details) output
 312
 313 enca-0.7.1: 2000-10-02
 314   * in case of nonfatal i/o failure enca continues processing remaining files
 315
 316 enca-0.7.0: 2000-09-26
 317   + standard input to standard output conversion
 318   + short message mode -M
 319   + ability to use GNU recode library
 320   + new output type -r (encoding name after RFC1345)
 321   + ability to convert cork internally
 322   + new external converter brecode (recode wrapper)
 323   + new output type -g (list of aliases)
 324   + new option -V (verbose)
 325   * -x (convert) parameters syntax changed to in_enc..out_enc (old syntax still
 326     supported, will be removed in 0.8.x)
 327   * option -e (disable external) no longer supported, empty string as -C
 328     (external converter) parameter can be used instead
 329   * encoding names specified as -x (convert) parameters are case insensitive
 330   * ascii is not considered unknown encoding (i.e. failure) so enca returns 0
 331   * -d (print details) output improved/changed/updated
 332   * -p (prefix result with file name) no longer prints conversion details
 333   * by default result is prefixed by file name when enca is run on more than
 334     one file
 335
 336 enca-0.6.2: 2000-08-17
 337   + help texts (-h and -v) made usable (thanx to Halef)
 338
 339 enca-0.6.1: 2000-08-15
 340   - tarball bugfix
 341
 342 enca-0.6.0: 2000-07-20
 343   + built-in converter
 344   + -x (convert) can now take form -x in_enc,out_enc causing enca to behave
 345     like a dull converter
 346   + new options -e and -E (disable internal/external converter)
 347   + new option -l (print internally-convertible encodings)
 348
 349 enca-0.5.0: 2000-07-17
 350   * -p (prefix result with file name) causes enca to print what is converted
 351     and how
 352   * iso8859-2/cp1250 recognition improved
 353   - doesn't spawn external converters as fast as is possible, but waits for
 354     them to return
 355   - fixed `Unrecognized encoding' when winner is 1250 (from 0.4.3)
 356   - corrected -d (print details) table alignment
 357
 358 enca-0.4.3: 2000-07-14
 359   * -d (print details) prints encodings alphabetically sorted
 360   - corrected short encoding name t1 -> cork
 361   - division-by-zero bugfixes
 362
 363 enca-0.4.2: has never been released
 364   * options -m/-M ([don't] use iso8859-2/cp1250 hack) no longer supported
 365   - fixed showing standard input as empty string (<STDIN> is printed now)
 366
 367 enca-0.4.1: 2000-07-12
 368   * default of 60 significant characters changed to 10
 369
 370 enca-0.4.0: 2000-07-10
 371   + first public release
 372