mirror of https://github.com/GNOME/libxml2.git synced 2025-05-08 21:07:54 +08:00

147 Commits

Author SHA1 Message Date
Nick Wellnhofer
9bbffec568 doc: Move brief to top, params to bottom of doc comments 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
cb1635a642 doc: Use @since command 2025-05-02 19:05:25 +02:00
Nick Wellnhofer
e78e05c990 doc: Fix autolinks to functions
Unfortunately, autolinks in .c files aren't converted by Doxygen for
some reason.
2025-05-02 17:45:31 +02:00
Nick Wellnhofer
f7c412874b doc: Remove more comment block headers 2025-05-02 17:41:26 +02:00
Nick Wellnhofer
e525564f65 doc: Remove empty lines at start of block
These lines were left over after automatic conversion.
2025-05-02 11:42:05 +02:00
Nick Wellnhofer
e549622bc5 doc: Convert documentation to Doxygen
Automated conversion based on a few regexes.
2025-05-01 23:23:42 +02:00
Nick Wellnhofer
69879da88f doc: Remove email addresses from documentation
Also remove authorship information from generated files, hash.c and
globals.c which were rewritten.
2025-05-01 23:23:42 +02:00
Nick Wellnhofer
61890e399d doc: Prepare for conversion to Doxygen
Fix many params in internal functions (not really necessary but Doxygen
warns about that in XML mode).

Fix formatting in a few corner cases that automatic conversion can't
handle.

Rearrange some DOC_DISABLE blocks.
2025-05-01 23:23:42 +02:00
Nick Wellnhofer
a8d8a70c51 uri: Fix handling of Windows drive letters
Allow drive letters in URI paths. Technically, these should be treated
as URI schemes, but this is not what users expect. This also makes sure
that paths with drive letters are resolved as filesystem paths and
unescaped, for example when used in libxslt's document() function.

Should fix #832.
2025-01-27 14:28:29 +01:00
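The drive-letter rule from commit a8d8a70c51 can be illustrated with a minimal sketch. This is not libxml2's code: the function name `has_drive_letter` and the exact test (one ASCII letter followed by a colon) are assumptions made for illustration only.

```python
import string


def has_drive_letter(path: str) -> bool:
    """Hypothetical check: one ASCII letter followed by ':' starts the string.

    A strict URI parser would read "C:" as a one-letter scheme; this sketch
    treats it as a Windows drive letter instead, which is the behaviour the
    commit message says users expect.
    """
    return (
        len(path) >= 2
        and path[0] in string.ascii_letters
        and path[1] == ":"
    )


if __name__ == "__main__":
    for candidate in ["C:/docs/a.xml", "http://example.org/", "docs/a.xml"]:
        print(candidate, has_drive_letter(candidate))
```

The trade-off is the one the commit message names: a genuine one-letter URI scheme can no longer be told apart from a drive letter.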
Nick Wellnhofer
8b2d9ac45b uri: Check reallocations for overflow 2024-12-21 19:37:38 +01:00
Nick Wellnhofer
5d36664fc9 memory: Deprecate xmlGcMemSetup 2024-07-16 17:42:10 +02:00
Nick Wellnhofer
e6f25fdc7c uri: Fix documentation of xmlBuildRelativeURI 2024-06-27 11:55:33 +02:00
Nick Wellnhofer
54c6c7e416 uri: Only set file scheme for special Windows paths
Fixes 2ce70cde.

Also fix a test case.
2024-06-24 20:08:27 +02:00
Nick Wellnhofer
2ce70cde46 uri: Handle filesystem paths in xmlBuildRelativeURISafe
This mainly fixes issues on Windows but should also fix a few general
corner cases.

Should fix #745.
2024-06-23 21:48:16 +02:00
Nick Wellnhofer
28b9bb0309 uri: Enable Windows paths on Cygwin 2024-06-22 22:07:45 +02:00
Nick Wellnhofer
4c3d22b059 uri: Fix xmlBuildURI with NULL base
Don't try to parse URI if base is NULL. Fixes functions like xmlParseDTD
with certain filenames.

Should fix #742.
2024-06-20 21:15:08 +02:00
Nick Wellnhofer
4467b89143 Add missing argument checks for new API functions 2024-06-12 13:57:20 +02:00
Nick Wellnhofer
e75e878e02 doc: Update and fix documentation 2024-05-20 14:23:39 +02:00
Nick Wellnhofer
ab63197149 uri: Keep fragment intact when resolving filesystem paths 2023-12-28 17:07:03 +01:00
Nick Wellnhofer
8ab1b122c4 Fix filename and URI handling
Many strings are passed to the library that could be either URIs or
filesystem paths. We now assume that strings are a URI if they contain
the substring "://". This means that they have a scheme and an
authority. Otherwise, URI resolution wouldn't make much sense.

Fix xmlBuildURI to work with filesystem paths. If the base URI doesn't
contain "://" it is treated as filename. The resolved URI is unescaped,
appended and the result is normalized. Rewrite xmlNormalizePath to
handle Windows quirks.

All special handling for Windows paths is removed in xmlCanonicPath.
If the path looks like a URI, only escape characters allowed in Legacy
Extended IRIs.

Make xmlPathToURI only call xmlCanonicPath. The additional round-trip
through URI parser and serializer seems useless.

Add a helper function xmlConvertUriToPath in xmlIO.c which checks for
file URIs and unescapes them.

Always process strings with xmlCanonicPath in xmlLoadExternalEntity.
This should be harmless now.

Should help with #334, #387, #611.
2023-12-25 23:38:40 +01:00
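The "://" rule stated in commit 8ab1b122c4 is simple enough to sketch. The function name `looks_like_uri` is hypothetical, and the sketch covers only the substring test from the commit message, not the resolution, unescaping, or normalization steps.

```python
def looks_like_uri(text: str) -> bool:
    """Hypothetical helper: treat a string as a URI only if it contains "://".

    Per the commit message, such a string has both a scheme and an
    authority; anything else is handled as a filesystem path.
    """
    return "://" in text


if __name__ == "__main__":
    for candidate in ["http://example.org/doc.xml", "dir/doc.xml", "C:\\dir\\doc.xml"]:
        kind = "URI" if looks_like_uri(candidate) else "filesystem path"
        print(f"{candidate!r}: {kind}")
```

Under this rule a string such as `file:/tmp/doc.xml` counts as a filesystem path, because it has a scheme but no "://".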
Nick Wellnhofer
28913232f6 uri: Clean up special parsing modes
Add function to handle unreserved check. Give flags meaningful names.
Add support to allow ucschars from Legacy Extended IRIs.
2023-12-25 23:38:40 +01:00
Nick Wellnhofer
da996c8d0f uri: Report malloc failures
Fix many places where malloc failures weren't reported, for example
after calling xmlStrdup.

Introduce new public API functions that return a separate error code if
a memory allocation fails:

- xmlParseURISafe
- xmlBuildURISafe
- xmlBuildRelativeURISafe

Update the fuzzer to check whether malloc failures are reported.
2023-12-11 22:05:47 +01:00
Nick Wellnhofer
699299cae3 globals: Stop including globals.h 2023-09-20 22:07:40 +02:00
Nick Wellnhofer
f65133fc04 uri: Add explicit cast in xmlSaveUri
Fix -fsanitize=implicit-conversion error. We should probably
percent-escape the host name here.
2023-01-24 11:32:15 +01:00
Nick Wellnhofer
ae0c9cfa05 uri: Fix handling of port numbers
Allow port number without host, real fix for #71.

Also compare port numbers in xmlBuildRelativeURI.

Fix handling of port numbers in xmlUriEscape.
2022-12-13 01:43:49 +01:00
Nick Wellnhofer
8ed40c621b Revert "uri: Allow port without host"
This reverts commit f30adb54f55e4e765d58195163f2a21f7ac759fb.

Fixes #460.
2022-12-13 00:51:33 +01:00
Nick Wellnhofer
f30adb54f5 uri: Allow port without host
Don't set port to -1 when host is missing. Host can be empty according
to spec.

Fixes #71.
2022-11-20 21:16:03 +01:00
Nick Wellnhofer
76d6b0d768 html: Don't escape ASCII chars in href attributes
In several cases, href attributes can contain ASCII characters which are
illegal in URIs. Escaping them often does more harm than good.

Fixes #321.
2022-11-20 21:16:03 +01:00
Nick Wellnhofer
6843fc726f Remove or annotate char casts 2022-09-01 04:31:30 +02:00
Nick Wellnhofer
2cac626976 Don't use sizeof(xmlChar) or sizeof(char) 2022-09-01 03:35:19 +02:00
Nick Wellnhofer
0f568c0b73 Consolidate private header files
Private functions were previously declared

- in header files in the root directory
- in public headers guarded with IN_LIBXML
- in libxml.h
- redundantly in source files that used them.

Consolidate all private header files in include/private.
2022-08-26 02:11:56 +02:00
Nick Wellnhofer
2489c1d024 Remove useless __CYGWIN__ checks
From what I can tell, some really early Cygwin versions from around
1998-2000 used to erroneously define _WIN32. This was eventually fixed,
but these days, the `defined(_WIN32) && !defined(__CYGWIN__)` idiom is
unnecessary.

Now, we only check for __CYGWIN__ in xmlexports.h when deciding whether
to use __declspec.
2022-02-28 22:58:35 +01:00
Nick Wellnhofer
346c3a930c Remove elfgcchack.h
The same optimization can be enabled with -fno-semantic-interposition
since GCC 5. clang has always used this option by default.
2022-02-20 21:49:04 +01:00
Nick Wellnhofer
0596d67ddc Add explicit cast in xmlURIUnescapeString
Avoids an integer conversion warning with UBSan.
2022-01-25 01:39:41 +01:00
Elliott Hughes
7c06d99e1f Fix xmlURIEscape memory leaks.
Found by running the fuzz/uri.c fuzzer under asan (internal Android bug
171610679).

Always free `ret` when exiting on failure. I've moved the definition of
NULLCHK down past where ret is always initialized to make it clear that
this is safe.

This patch also fixes the indentation of two of the NULLCHK call sites
to make it more obvious that NULLCHK isn't `if`-like.
2020-11-09 18:17:01 +01:00
Nick Wellnhofer
b46016b870 Allow port numbers up to INT_MAX
Also return an error on overflow.
2020-10-17 18:03:09 +02:00
Nick Wellnhofer
20c60886e4 Fix typos
Resolves #133.
2020-03-08 17:41:53 +01:00
Jared Yanovich
2a350ee9b4 Large batch of typo fixes
Closes #109.
2019-09-30 18:04:38 +02:00
Nick Wellnhofer
f9fce96313 Fix unsigned integer overflow
It's defined behavior but -fsanitize=unsigned-integer-overflow is
useful to discover bugs.
2019-05-20 13:38:22 +02:00
Thomas Holder
a71b98ec9d cleanup: remove some unreachable code 2018-11-29 22:25:35 +01:00
Thomas Holder
b1f87c0e43 Fix building relative URIs
Examples:

testURI --relative --base file:///a file:///b
New correct result: b
Old incorrect result: ../b

testURI --relative --base file:///a file:///
New correct result: ./
Old incorrect result: ../

testURI --relative --base file:///a/b file:///a/
New correct result: ./
Old incorrect result: ../../a/
2018-11-29 22:19:44 +01:00
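The three corrected results listed in commit b1f87c0e43 are consistent with comparing the directory parts of the two paths. The sketch below is an assumption-laden illustration, not the implementation of xmlBuildRelativeURI: it works on the path component only, and the names `split_path` and `relative_path` are hypothetical.

```python
def split_path(path: str):
    """Split a path into its directory segments and its final name."""
    head, _, name = path.rpartition("/")
    dirs = [segment for segment in head.split("/") if segment]
    return dirs, name


def relative_path(base: str, target: str) -> str:
    """Express `target` relative to the directory that contains `base`."""
    base_dirs, _ = split_path(base)
    target_dirs, name = split_path(target)

    # Count the leading directory segments shared by both paths.
    common = 0
    for b, t in zip(base_dirs, target_dirs):
        if b != t:
            break
        common += 1

    # Climb out of the unshared base directories, then descend into target's.
    parts = [".."] * (len(base_dirs) - common) + target_dirs[common:]
    rel = "".join(part + "/" for part in parts) + name
    return rel or "./"


if __name__ == "__main__":
    print(relative_path("/a", "/b"))     # b
    print(relative_path("/a", "/"))      # ./
    print(relative_path("/a/b", "/a/"))  # ./
```

The `or "./"` fallback covers the case where the target is the base's own directory, which would otherwise produce an empty reference.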
Nick Wellnhofer
41c0a13fe7 Fix Windows compiler warnings in xmlCanonicPath
The code handling Windows paths assigned some char/xmlChar pointers
without explicit casts. Also remove an unused variable.
2017-10-09 13:46:44 +02:00
Daniel Veillard
3daee3f159 Problem resolving relative URIs
Raised by Matthias Pigulla <mp@webfactory.de>

In a nutshell we had that bug on URI composition after some fixes in
the area of localhost empty shortcuts :

./testURI --base file:///some/where file

Without patch: file:/some/file
With patch: file:///some/file
2017-08-28 21:12:14 +02:00
Nick Wellnhofer
91e5496780 Fix xmlBuildRelativeURI for URIs starting with './'
If the relative URI started with './', the 'pos' index was increased
which also affected indexing into the base path. Aside from producing
wrong results, this could also lead to a heap overread of the base
path buffer. The data read from beyond the buffer was only compared
to some char values, so this is mostly harmless.

Inside libxml2, xmlBuildRelativeURI is only called from xinclude.c.

Found with libFuzzer and ASan.
2017-06-10 17:41:42 +02:00
Nick Wellnhofer
d6b3645f9b Fix memory leak in xmlCanonicPath
Found with libFuzzer and ASan.
2017-05-27 15:59:18 +02:00
Michael Paddon
846cf015a7 Integer overflow parsing port number in URI
For https://bugzilla.gnome.org/show_bug.cgi?id=765566

In xmlParse3986Port(), uri->port can overflow when parsing the port number.
The type of uri->port is int, so the consequent behavior is undefined and
may differ between compilers and architectures.
2016-05-21 17:18:15 +08:00
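The overflow described in commit 846cf015a7 is avoided by checking the bound before each multiply-and-add step. Python integers do not overflow, so the sketch below emulates a C `int` with an explicit `INT_MAX`. The 32-bit value and the function name `parse_port` are assumptions; this is not the code from xmlParse3986Port().

```python
INT_MAX = 2**31 - 1  # assumption: a 32-bit C int


def parse_port(digits: str):
    """Return the port as an int, or None on a non-digit or on overflow."""
    port = 0
    for ch in digits:
        if not ("0" <= ch <= "9"):
            return None
        digit = ord(ch) - ord("0")
        # Reject before accumulating, so that port * 10 + digit can never
        # exceed INT_MAX (in C, the overflowing step would be undefined).
        if port > (INT_MAX - digit) // 10:
            return None
        port = port * 10 + digit
    return port


if __name__ == "__main__":
    print(parse_port("8080"))
    print(parse_port("2147483647"))
    print(parse_port("2147483648"))
```

Rearranging the comparison as `port > (INT_MAX - digit) // 10` keeps every intermediate value within range, which is what makes the same check safe to write in C.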
Daniel Veillard
beb7281055 Fix a problem properly saving URIs
As written by Martin Kletzander <mkletzan@redhat.com>:
Since commit 8eb55d782a2b9afacc7938694891cc6fad7b42a5, when you parse
and save a URI that has no server (or similar) part, two slashes
after the 'scheme:' get lost.  It means 'uri:///noserver' is turned
into 'uri:/noserver'.

basically
   foo:///only/path

means a host of "" while

   foo:/only/path

means no host at all

  So the best fix IMHO is to fix the URI parser to record the first
case as an empty host string and the second case as a NULL host string.

 I would not revert the initial patch; we should not 'invent' those
slashes, but we should instead keep the information when parsing that
it's a host-based path and that foo:/// means the presence of a host
but an empty one.

Once the resulting patch below is applied, all cases seem to be saved
properly:

thinkpad:~/XML -> ./testURI uri:/noserver
uri:/noserver
thinkpad:~/XML -> ./testURI uri:///noserver
uri:///noserver
thinkpad:~/XML -> ./testURI uri://server/foo
uri://server/foo
thinkpad:~/XML -> ./testURI uri:/noserver/foo
uri:/noserver/foo
thinkpad:~/XML -> ./testURI uri:///
uri:///
thinkpad:~/XML -> ./testURI uri://
uri://
thinkpad:~/XML -> ./testURI uri:/
uri:/
thinkpad:~/XML ->

  If you revert the initial patch, that last case fails.

The problem is that I don't want to change the xmlURI structure, to
minimize ABI breakage, so I could not extend the field. The natural
solution is to denote that uri:/// has an empty host by making
the uri server field an empty string, which works very well but breaks
applications (like libvirt ;-) that blindly check that uri->server
is not NULL before trying to reach it!
The simplest approach was to set the port to -1 in that case instead of 0.
Applications don't bother looking at the port if there is no server
string. This makes the patch more complex than a one-liner, but
it is better for ABI.
2014-10-03 19:22:39 +08:00
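The convention described in commit beb7281055 (server left NULL, port set to -1 to mark "host present but empty") can be sketched with a toy parser and serializer. This is a simplified illustration of that 2014 approach only, not libxml2's parser: the `Uri` class and the `parse` and `save` functions are hypothetical, and ports, queries, and fragments are ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Uri:
    scheme: str
    server: Optional[str]  # None stands in for a NULL server string
    port: int              # -1 marks "authority present but empty"
    path: str


def parse(text: str) -> Uri:
    scheme, _, rest = text.partition(":")
    if rest.startswith("//"):
        authority, slash, tail = rest[2:].partition("/")
        path = slash + tail
        if authority:
            return Uri(scheme, authority, 0, path)
        # Empty host: keep server as None so callers testing it are not
        # misled, and remember the "//" through the port sentinel.
        return Uri(scheme, None, -1, path)
    return Uri(scheme, None, 0, rest)


def save(uri: Uri) -> str:
    out = uri.scheme + ":"
    if uri.server is not None:
        out += "//" + uri.server
    elif uri.port == -1:
        out += "//"
    return out + uri.path


if __name__ == "__main__":
    for text in ["uri:/noserver", "uri:///noserver", "uri://server/foo", "uri://"]:
        print(text, "->", save(parse(text)))
```

Each input from the testURI session in the commit message round-trips unchanged through `parse` and `save` in this sketch.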
Dennis Filder
8eb55d782a xmlSaveUri() incorrectly recomposes URIs with rootless paths
For https://bugzilla.gnome.org/show_bug.cgi?id=731063

xmlSaveUri() of libxml2 (snapshot 2014-05-31 and earlier) returns
bogus values when called with URIs that have rootless paths
(e.g. "urx:b:b" becomes "urx://b%3Ab" where "urx:b%3Ab" would be
correct)
2014-06-13 14:56:14 +08:00
Michael Stahl
55b899a23a Support long path names on WNT
so we've got this patch to libxml2 2.7.6 in the LibreOffice code base,
inherited from OOo.  it fixes a definite problem, which is that Windows
has a rather low maximum path length restriction, and there is a special
trick on NT whereby path names can be prefixed with "\\?\", in which
case the maximum length is 32k, which ought to be sufficient even for
bloated office suites :)

I'll attach the patch to the xmlCanonicPath function.  note that i
didn't write this and am by no means an expert on either Microsoftean
platforms or libxml so maybe it's not the best way to do it.
2012-09-07 12:19:25 +08:00
Daniel Veillard
5756038650 Cleanup URI module memory allocation code
* uri.c: cleanup the code doing the allocations, set up a structured
  error handler to report memory errors, and set up an arbitrary
  limit on URI saving size
* error.c include/libxml/xmlerror.h: add a new FROM_URI indication
  for structured error reporting, also adding strings for schematron
  and buffer which were missing
2012-07-24 11:44:23 +08:00