mirror of https://github.com/GNOME/libxml2.git synced 2025-05-12 22:53:36 +08:00

492 Commits

Author SHA1 Message Date
Adiel Mittmann
8a103793f2 Non ASCII character may be split at buffer end
* HTMLparser.c: make sure when we call xmlParserInputGrow in
  htmlCurrentChar, to reset the current pointer
2009-08-25 11:27:13 +02:00
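The entry above is about a read pointer going stale once the input buffer grows. A minimal sketch of that hazard in plain C follows. The `input_buf` type and `input_grow` helper are hypothetical names made up for illustration; they are not libxml2 API and do not reproduce how xmlParserInputGrow works internally.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical input buffer; it only illustrates the stale-pointer hazard. */
typedef struct {
    char *base;      /* start of the heap allocation */
    size_t len;      /* bytes currently stored */
    const char *cur; /* read position inside base */
} input_buf;

/*
 * Append n bytes of data to the buffer. realloc may move the allocation,
 * so the read position is saved as an offset and rebuilt afterwards.
 * Returns 0 on success, -1 on allocation failure (buffer left unchanged).
 */
static int input_grow(input_buf *in, const char *data, size_t n)
{
    size_t off;
    char *p;

    assert(in != NULL && in->base != NULL && in->cur != NULL);
    if (n == 0)
        return 0;                       /* nothing to append */
    off = (size_t)(in->cur - in->base); /* position as an offset */
    p = realloc(in->base, in->len + n);
    if (p == NULL)
        return -1;
    memcpy(p + in->len, data, n);
    in->base = p;
    in->len += n;
    in->cur = p + off;                  /* reset the current pointer */
    return 0;
}
```

A caller that keeps its own local copy of `cur`, for instance while decoding a multi-byte character that straddles the old end of the buffer, would need to re-read it from the structure after each call.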
Markus Kull
56a03035bf 572129 speed up parsing of large HTML text nodes
* HTMLparser.c: use a different lookup function htmlParseLookupChars()
  to avoid the quadratic behaviour
2009-08-24 19:00:23 +02:00
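The quadratic behaviour mentioned above typically comes from rescanning the whole buffer every time a new chunk arrives. The sketch below shows one way to avoid that by remembering how far the previous scan got. It is an illustration only: `scan_state` and `lookup_chars` are hypothetical names, and the code is not the body of htmlParseLookupChars().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scanner state; not libxml2's parser context. */
typedef struct {
    const char *buf;    /* data received so far */
    size_t len;         /* number of valid bytes in buf */
    size_t check_index; /* first byte not yet examined */
} scan_state;

/*
 * Look for any of the nstops bytes in stops within the unexamined part
 * of the buffer. Returns the offset of the first match, or -1 if none
 * is present yet. On a miss the scan position is saved, so the next
 * call only inspects newly arrived bytes.
 */
static long lookup_chars(scan_state *st, const char *stops, size_t nstops)
{
    size_t i;

    assert(st != NULL && st->check_index <= st->len);
    for (i = st->check_index; i < st->len; i++) {
        if (memchr(stops, st->buf[i], nstops) != NULL) {
            st->check_index = 0; /* found: the next lookup starts fresh */
            return (long)i;
        }
    }
    st->check_index = st->len;   /* miss: resume here after more data */
    return -1;
}
```

With the saved index, each byte of a large text node is examined once across all calls, so the total work grows linearly with the input size.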
Daniel Veillard
b468f7444c Remove a pedantic warning 2009-08-24 18:45:33 +02:00
Daniel Veillard
856c668c1a Fix HTML parsing with 0 character in CDATA
* HTMLparser.c: a 0 byte before the end of the input needs special-case
  handling: raise the error and return a space instead
2009-08-24 18:16:56 +02:00
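The special case described above can be sketched as follows. This is a simplified illustration with a hypothetical `current_char` helper and a plain error flag; the real parser reports the error through its own error machinery.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the byte at pos, or 0 at the end of input. A 0 byte found
 * before the end is flagged through *err and replaced by a space, so
 * the caller does not mistake it for the end of input.
 */
static int current_char(const char *buf, size_t len, size_t pos, int *err)
{
    assert(buf != NULL && err != NULL);
    if (pos >= len)
        return 0;       /* genuine end of input */
    if (buf[pos] == '\0') {
        *err = 1;       /* raise the error */
        return ' ';     /* and hand back a space instead */
    }
    return (unsigned char)buf[pos];
}
```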
Daniel Veillard
029a04d265 541335 HTML avoid creating 2 head or 2 body elements
* HTMLparser.c: check when we see a head or a body tag and avoid
  autogenerating them
* include/libxml/parser.h: the values for ctxt->html change depending
  on the head or body tags being seen
2009-08-24 12:50:23 +02:00
Daniel Veillard
6339c1a886 541237 error correcting missing end tags in HTML
* HTMLparser.c: make sure /p closes the FONTSTYLE list of elements
2009-08-24 11:59:51 +02:00
Daniel Veillard
db4ac221f0 Fix a small problem on previous HTML parser patch 2009-08-22 17:58:31 +02:00
Daniel Veillard
e77db16ab1 592430 - HTML parser runs into endless loop
* HTMLparser.c: fix the problem with detection erroring absolutely, and
  properly popping up the stack when in EOF, also passes XML_PARSE_HUGE
  when decoding options.
2009-08-22 11:32:38 +02:00
Daniel Veillard
7459c595a0 588441 allow '.' in HTML Names even if invalid
* HTMLparser.c: just allow '.' in htmlParseHTMLName list of characters
2009-08-13 10:10:29 +02:00
Daniel Veillard
533ec0e073 579317 Try to find the HTML encoding information
* HTMLparser.c: if we hit an encoding error before parsing a potential
  <meta> with the info look in the input buffer to see if we can find
  it instead of forcing a blind switch to ISO-8859-1
2009-08-12 23:00:22 +02:00
Jiri Netolicky
446e126de5 576368 – htmlChunkParser with special attributes
* HTMLparser.c: htmlChunkParsing failed when the chunk ends inside an
  element after some attribute which has a '>' char in its value.
2009-08-07 17:05:36 +02:00
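The failure described above arises when a '>' inside an attribute value is taken for the end of the tag. A minimal sketch of a quote-aware search follows; `find_tag_end` is a hypothetical helper written for illustration, not the code used in HTMLparser.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the offset of the '>' that closes a tag starting at buf[0],
 * ignoring any '>' inside a single- or double-quoted attribute value.
 * Returns -1 when the chunk ends before the tag does, meaning the
 * caller should wait for more data.
 */
static long find_tag_end(const char *buf, size_t len)
{
    char quote = 0; /* active quote character, 0 when outside a value */
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        char c = buf[i];
        if (quote != 0) {
            if (c == quote)
                quote = 0;   /* closing quote ends the value */
        } else if (c == '"' || c == '\'') {
            quote = c;       /* opening quote starts a value */
        } else if (c == '>') {
            return (long)i;  /* unquoted '>' closes the tag */
        }
    }
    return -1;
}
```

For a chunk such as `<a title="x>y" href=z>`, the search skips the '>' inside the title value and only stops at the final one.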
Daniel Veillard
4d3e2da7f8 * HTMLparser.c: make sure we keep line numbers, fixes #580705,
based on Aaron Patterson's patch
Daniel
2009-05-15 17:55:45 +02:00
Roland Steiner
04f8eef852 * HTMLparser.c: a broken HTML table attributes initialization,
fixes #581803, by Roland Steiner <rolandsteiner@google.com>
Daniel
2009-05-12 09:16:16 +02:00
Daniel Veillard
7f4547cdbd preparing the release of 2.7.2 fix the Solaris portability issue
* configure.in doc/* NEWS: preparing the release of 2.7.2
* dict.c: fix the Solaris portability issue
* parser.c: additional cleanup on #554660 fix
* test/ent13 result/ent13* result/noent/ent13*: added the
  example in the regression test suite.
* HTMLparser.c: handle leading BOM in htmlParseElement()
Daniel

svn path=/trunk/; revision=3799
2008-10-03 07:58:23 +00:00
Daniel Veillard
a57ba4ce96 fix an HTML parsing error on large data sections reported by Mike Day add
* HTMLparser.c: fix an HTML parsing error on large data sections
  reported by Mike Day
* test/HTML/utf8bug.html result/HTML/utf8bug.html.err
  result/HTML/utf8bug.html.sax result/HTML/utf8bug.html: add the
  reproducer to the test suite
daniel

svn path=/trunk/; revision=3797
2008-09-25 16:06:18 +00:00
Daniel Veillard
4cc67bb77e patch from Robert Schwebel, allows compiling the example if configured
* doc/examples/reader3.c: patch from Robert Schwebel, allows
  compiling the example if configured without output support, fixes
  #545582
* Makefile.am: add testrecurse to the make check tests
* HTMLparser.c: if the parser got an encoding argument it should be
  used over what the meta specifies, patch fixing #536346
Daniel

svn path=/trunk/; revision=3785
2008-08-29 19:58:23 +00:00
Daniel Veillard
ae0765b681 more progress against the official regression tests small cleanup for
* runxmlconf.c: more progress against the official regression tests
* runsuite.c: small cleanup for non-leak reports
* include/libxml/tree.h: parsing flags and other properties are
  now added to the document node, this is generally useful and
  allows making Name and NmToken validations based on the parser
  flags, more specifically the 5th edition of XML or not
* HTMLparser.c tree.c: small side effects for the previous changes
* parser.c SAX2.c valid.c: the bulk of the changes are here,
  the parser and validation behaviour can be affected, parsing
  flags need to be copied, lots of changes. Also fixing various
  validation problems in the regression tests.
Daniel

svn path=/trunk/; revision=3762
2008-07-31 19:54:59 +00:00
Daniel Veillard
ed86dc2383 applied patch from Ashwin fixing a number of realloc problems improve
* uri.c: applied patch from Ashwin fixing a number of realloc problems
* HTMLparser.c: improve handling for misplaced html/head/body
Daniel

svn path=/trunk/; revision=3740
2008-04-24 11:58:41 +00:00
Daniel Veillard
36de63e71d apparently it's okay to forget the semicolon after entity refs in HTML,
* HTMLparser.c: apparently it's okay to forget the semicolon after
  entity refs in HTML, fixing char refs parsing accordingly based on
  T. Manske patch, this should fix #517653
Daniel

svn path=/trunk/; revision=3726
2008-04-03 09:05:05 +00:00
Daniel Veillard
35fcbb84d2 patch from Arnold Hendriks improving parsing of html within html bogus
* HTMLparser.c: patch from Arnold Hendriks improving parsing of
  html within html bogus data, still not a complete fix though
Daniel

svn path=/trunk/; revision=3704
2008-03-12 21:43:39 +00:00
Daniel Veillard
c5b43cc03a avoid stopping parsing when encountering out of range characters in an
* HTMLparser.c: avoid stopping parsing when encountering
  out of range characters in an HTML file, report and
  continue processing instead, should fix #472696
Daniel

svn path=/trunk/; revision=3675
2008-01-11 07:41:39 +00:00
Daniel Veillard
640f89ef61 fix definition for <embed> to avoid error when saving back, patch from
* HTMLparser.c: fix definition for <embed> to avoid error
  when saving back, patch from Stefan Behnel fixing 495213
Daniel

svn path=/trunk/; revision=3671
2008-01-11 06:24:09 +00:00
Daniel Veillard
861101d1fa fixed bug #381877, avoid reading over the end of stream when generating an
* HTMLparser.c: fixed bug #381877, avoid reading past the end
  of the stream when generating a UTF-8 encoding error.
Daniel

svn path=/trunk/; revision=3627
2007-06-12 08:38:57 +00:00
Daniel Veillard
491e58e575 applied patch from Michael Day to add support for <embed> Daniel
* HTMLparser.c: applied patch from Michael Day to add support for <embed>
Daniel

svn path=/trunk/; revision=3611
2007-05-02 16:15:18 +00:00
Daniel Veillard
739e9d0981 Dohh !
Daniel

svn path=/trunk/; revision=3610
2007-04-27 09:33:58 +00:00
Daniel Veillard
4d1320fa5b Jean-Daniel Dupas pointed out a couple of problems in htmlCreateDocParserCtxt.
* HTMLparser.c: Jean-Daniel Dupas pointed out a couple of problems
  in htmlCreateDocParserCtxt.
Daniel

svn path=/trunk/; revision=3609
2007-04-26 08:55:33 +00:00
Daniel Veillard
42720248e6 change the way script/style are parsed to not try to detect comments,
* HTMLparser.c: change the way script/style are parsed to
  not try to detect comments, reported by Mike Day
* result/HTML/doc3.*: affects the result of that test
Daniel

svn path=/trunk/; revision=3598
2007-04-16 07:02:31 +00:00
William M. Brack
e978ae25ca fixed memory access error on parsing of meta data which had errors (bug
* HTMLparser.c: fixed memory access error on parsing of meta data
  which had errors (bug #382206).  Also cleaned up a few warnings
  by adding some additional DECL macros.

svn path=/trunk/; revision=3593
2007-03-21 06:16:02 +00:00
Daniel Veillard
1032ac4c5c applied patch from Steven Rainwater to fix UTF8ToHtml behaviour on code
* HTMLparser.c: applied patch from Steven Rainwater to fix
  UTF8ToHtml behaviour on code points which are not mappable to
  predefined HTML entities, fixes #377544
Daniel
2006-11-23 16:18:30 +00:00
Daniel Veillard
772869fe10 change htmlCtxtReset() following Michael Day bug report and suggestion.
* HTMLparser.c: change htmlCtxtReset() following Michael Day bug
  report and suggestion.
Daniel
2006-11-08 09:16:56 +00:00
Daniel Veillard
890fd9f9f3 applied a reworked version of Usamah Malik patch to avoid growing the
* HTMLparser.c: applied a reworked version of Usamah Malik patch
  to avoid growing the parser stack in some autoclose cases, should
  fix #361221
Daniel
2006-10-27 12:53:28 +00:00
Daniel Veillard
af616a7386 fix one problem found in htmlCtxtUseOptions() and pointed out in #340591
* HTMLparser.c: fix one problem found in htmlCtxtUseOptions()
  and pointed out in #340591
Daniel
2006-10-17 20:18:39 +00:00
Daniel Veillard
8a82ae12c3 fixed the 2 stupid bugs affecting htmlReadDoc() and htmlReadIO() this
* HTMLparser.c: fixed the 2 stupid bugs affecting htmlReadDoc() and
  htmlReadIO() this should fix #340322
Daniel
2006-10-17 20:04:10 +00:00
Daniel Veillard
c47d263049 fixing HTML minimized attribute values to be generated internally if not
* HTMLparser.c: fixing HTML minimized attribute values to be generated
  internally if not present, fixes bug #332124
* result/HTML/doc2.htm.sax result/HTML/doc3.htm.sax
  result/HTML/wired.html.sax: this affects the SAX event stream for
  a few test cases
Daniel
2006-10-17 16:13:27 +00:00
Daniel Veillard
48519092e5 fixing HTML entities in attributes parsing bug #362552 added to the
* HTMLparser.c: fixing HTML entities in attributes parsing bug #362552
* result/HTML/entities2.html* test/HTML/entities2.html: added to
  the regression suite
Daniel
2006-10-17 15:56:35 +00:00
Daniel Veillard
7e30356556 fix #348252 if the document claims to be in a different encoding in the
* HTMLparser.c: fix #348252 if the document claims to be in a
  different encoding in the meta tag and it's obviously wrong,
  don't screw up the end of the content.
Daniel
2006-10-16 13:14:55 +00:00
Daniel Veillard
68716a772c fix a chunking and script bug #347708 Daniel
* HTMLparser.c: fix a chunking and script bug #347708
Daniel
2006-10-16 09:32:17 +00:00
Daniel Veillard
28aac0b0f4 remove a warning check with uppercase for AIX iconv() should fix #352644
* HTMLparser.c: remove a warning
* encoding.c: check with uppercase for AIX iconv() should fix #352644
* doc/examples/Makefile.am: partially handle one bug report
Daniel
2006-10-16 08:31:18 +00:00
Daniel Veillard
f1a27c659e added --html --memory to test htmlReadMemory to test #321632 added various
* xmllint.c: added --html --memory to test htmlReadMemory to
  test #321632
* HTMLparser.c: added various initialization calls which may help
  #321632 but not conclusive
* testapi.c tree.c include/libxml/tree.h: fixed compilation with
  --with-minimum --with-sax1 and --with-minimum --with-schemas
  fixing #326442
Daniel
2006-10-13 22:33:03 +00:00
Daniel Veillard
34c647cfae exports htmlNewParserCtxt() as Michael Day pointed out this is needed to
* HTMLparser.c include/libxml/HTMLparser.h: exports htmlNewParserCtxt()
  as Michael Day pointed out this is needed to use htmlCtxtRead*()
Daniel
2006-09-21 06:53:59 +00:00
Daniel Veillard
065abe8565 applied const'ification of strings patch from Matthias Clasen Daniel
* HTMLparser.c: applied const'ification of strings patch from
  Matthias Clasen
Daniel
2006-07-03 08:55:04 +00:00
Daniel Veillard
30e7607b7a a bunch of small cleanups based on coverity reports. Daniel
* HTMLparser.c parser.c parserInternals.c pattern.c uri.c: a bunch
  of small cleanups based on coverity reports.
Daniel
2006-03-09 14:13:55 +00:00
Daniel Veillard
499cc9204f try to fix xmlParseInNodeContext when operating on an HTML document.
* HTMLparser.c libxml.h parser.c: try to fix xmlParseInNodeContext
  when operating on an HTML document.
Daniel
2006-01-18 17:22:35 +00:00
Daniel Veillard
6a0baa0cd8 fixed a number of warnings shown by HP-UX compiler and reported by Rick
* HTMLparser.c configure.in parserInternals.c runsuite.c runtest.c
  testapi.c xmlschemas.c xmlschemastypes.c xmlstring.c: fixed a number
  of warnings shown by HP-UX compiler and reported by Rick Jones
Daniel
2005-12-10 11:11:12 +00:00
Daniel Veillard
b990008f05 script HTML parser error fix, corrects bug #319715 added test from Michael
* HTMLparser.c: script HTML parser error fix, corrects bug #319715
* result/HTML/53867* test/HTML/53867.html: added test from Michael Day
  to the regression suite
Daniel
2005-10-25 12:36:29 +00:00
Daniel Veillard
2cf36a1cc1 typo fix from Michael Day Daniel
* HTMLparser.c: typo fix from Michael Day
Daniel
2005-10-25 12:21:29 +00:00
Daniel Veillard
36d73403ff Applied the last patch from Gary Coady for #304637 changing the behaviour
* HTMLparser.c: Applied the last patch from Gary Coady for #304637
  changing the behaviour when text nodes are found in body
* result/HTML/*: this changes the output of some tests
Daniel
2005-09-01 09:52:30 +00:00
Daniel Veillard
8874b94cd2 added a parser XML_PARSE_COMPACT option to allocate small text nodes (less
* HTMLparser.c parser.c SAX2.c debugXML.c tree.c valid.c xmlreader.c
  xmllint.c include/libxml/HTMLparser.h include/libxml/parser.h:
  added a parser XML_PARSE_COMPACT option to allocate small
  text nodes (less than 8 bytes on 32 bits, less than 16 bytes on 64 bits)
  directly within the node, various changes to cope with this.
* result/XPath/tests/* result/XPath/xptr/* result/xmlid/*: this
  slightly changes the output
Daniel
2005-08-25 13:19:21 +00:00
Daniel Veillard
ea4b0baef2 added a recovery mode for the HTML parser based on the suggestions of bug
* HTMLparser.c include/libxml/HTMLparser.h: added a recovery mode
  for the HTML parser based on the suggestions of bug #169834 by
  Paul Loberg
Daniel
2005-08-23 16:06:08 +00:00
Daniel Veillard
d2755a8134 fixed an uninitialized memory access spotted by valgrind Daniel
* HTMLparser.c: fixed an uninitialized memory access spotted by
  valgrind
Daniel
2005-08-07 23:42:39 +00:00