Commit Graph

146 Commits

Author SHA1 Message Date
BogDan Vatra
1ca7de5258 Add "int http_body_is_final(const http_parser *parser)" method.
It's useful to check if the current chunk is the last one.
2012-09-01 03:04:48 +02:00
Ben Noordhuis
ad3b631d4f Turn normal_url_char into a bit array.
Makes http_parser slightly more cache friendly.
2012-08-30 00:04:09 +02:00
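A minimal sketch of the bit-array idea, assuming a 7-bit ASCII table packed one bit per character. The names `url_char_bits`, `bitmap_set`, `bitmap_test` and `bitmap_init_example`, and the table contents, are hypothetical and do not reproduce the real `normal_url_char` table.

```c
#include <assert.h>

/* One bit per 7-bit ASCII code: 128 bits packed into 16 bytes,
 * instead of one byte per character. */
static unsigned char url_char_bits[16];

/* Mark character c as allowed (hypothetical helper). */
static void bitmap_set(unsigned char c)
{
  assert(c < 128);
  url_char_bits[c >> 3] |= (unsigned char)(1u << (c & 7));
}

/* Return 1 if c is allowed, 0 otherwise. Bytes >= 128 are rejected
 * in this sketch. */
static int bitmap_test(unsigned char c)
{
  if (c >= 128) return 0;
  return (url_char_bits[c >> 3] >> (c & 7)) & 1;
}

/* Illustrative contents only: allow printable ASCII except space,
 * '#' and '?'. */
static void bitmap_init_example(void)
{
  unsigned char c;
  for (c = 0x21; c < 0x7f; c++) {
    if (c != '#' && c != '?') bitmap_set(c);
  }
}
```

The lookup costs one shift and one mask more than a byte table, but the whole table fits in 16 bytes.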
Ben Noordhuis
add3018ce7 Add bounds check to http_method_str(). 2012-08-29 23:22:53 +02:00
Ben Noordhuis
9f92347851 Make http_should_keep_alive() const correct. 2012-08-29 23:13:16 +02:00
Bertrand Paquet
a828edaf6a Add a comment 2012-07-25 00:36:04 +02:00
Bertrand Paquet
50faa793f4 Coding style : remove space before ++ 2012-07-25 00:20:03 +02:00
Bertrand Paquet
148984cd8d Rename s_req_host* to be compliant with RFC 2396 2012-07-25 00:16:53 +02:00
Bertrand Paquet
7f1b191d6f Minor speed improvement 2012-07-25 00:13:54 +02:00
Bertrand Paquet
d2ce562338 Use new state instead of pointer 2012-07-17 08:48:46 +02:00
Bertrand Paquet
bb29f43741 Coding style improvement 2012-07-17 08:37:39 +02:00
Bertrand Paquet
f6f761596e Small refactoring, add edge cases 2012-07-08 11:58:18 +02:00
Bertrand Paquet
7965096276 User info implementation 2012-07-08 02:04:12 +02:00
Bertrand Paquet
ed8475d49f Refactor host parsing to allow basic auth management 2012-07-08 02:04:07 +02:00
Ben Noordhuis
b97fdb0513 Don't assert() on whitespace in URL.
Be lenient about tabs and form feeds in non-strict mode.
2012-04-23 15:33:57 +02:00
Ben Noordhuis
8bec3ea459 Create method_strings array with HTTP_METHOD_MAP macro. 2012-03-12 02:18:35 +01:00
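The macro technique named in this commit can be sketched as an X-macro: one list generates both the enum and the string table, so the two cannot drift apart. `EXAMPLE_METHOD_MAP`, the `EX_` prefix and `example_method_str` are hypothetical names, and the method list is shortened for illustration. The lookup also includes a bounds check on the index.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, shortened method list used only for this sketch. */
#define EXAMPLE_METHOD_MAP(XX) \
  XX(0, DELETE)                \
  XX(1, GET)                   \
  XX(2, HEAD)                  \
  XX(3, POST)                  \
  XX(4, PUT)

/* Expand the list into enum constants EX_DELETE = 0, EX_GET = 1, ... */
enum example_method {
#define XX(num, name) EX_##name = num,
  EXAMPLE_METHOD_MAP(XX)
#undef XX
  EX_METHOD_COUNT
};

/* Expand the same list into the matching string table. */
static const char *example_method_strings[] = {
#define XX(num, name) #name,
  EXAMPLE_METHOD_MAP(XX)
#undef XX
};

/* Bounds-checked lookup: out-of-range values yield a placeholder. */
const char *example_method_str(unsigned int m)
{
  size_t count = sizeof(example_method_strings) /
                 sizeof(example_method_strings[0]);
  assert(count == (size_t)EX_METHOD_COUNT); /* enum and table in sync */
  return m < count ? example_method_strings[m] : "<unknown>";
}
```

Adding a method then means adding a single `XX(...)` line to the list.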
Nathan Rajlich
a3373d7627 add support for "SEARCH" request methods 2012-03-12 01:51:40 +01:00
Ben Noordhuis
62110efe7a Support PURGE request method.
Fixes joyent/node#2775.
2012-02-20 16:07:00 +01:00
David Gwynne
67568421e9 allow extra ? at the beginning of a query_string.
fixes joyent/http-parser issue #25
2012-02-19 00:22:26 +01:00
David Gwynne
8da60bc423 implement parsing of v6 addresses and rejection of 0-length host and ports.
the v6 parsing works by adding extra states for working with the
[] notation for v6 addresses. hosts and ports cannot be 0-length
because we stop url parsing from ending when we expect those fields to
begin.

http_parser_parse_url gets a free check for the correctness of
CONNECT urls (they can only be host:port).

this addresses the following issues:

i was bored and had my head in this space.
2012-02-18 02:01:49 +01:00
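A rough sketch of splitting a CONNECT-style `host:port` string with optional `[]` notation, rejecting 0-length hosts and ports. It is a standalone illustration, not the state machine this commit adds: `ex_span` and `ex_split_host_port` are hypothetical names, and the characters inside the brackets are not validated here.

```c
#include <assert.h>
#include <stddef.h>

/* Offset/length pair into the input buffer; nothing is copied. */
struct ex_span { size_t off, len; };

/* Split "host:port" or "[v6addr]:port".
 * Returns 0 on success, -1 on malformed input, including an empty
 * host or an empty port. */
int ex_split_host_port(const char *s, size_t n,
                       struct ex_span *host, struct ex_span *port)
{
  size_t i, j;
  assert(s != 0 && host != 0 && port != 0);

  if (n > 0 && s[0] == '[') {
    for (i = 1; i < n && s[i] != ']'; i++) {}
    if (i == n || i == 1) return -1;   /* no ']' or empty address */
    host->off = 1;
    host->len = i - 1;
    i++;                               /* skip ']' */
  } else {
    for (i = 0; i < n && s[i] != ':'; i++) {}
    if (i == 0) return -1;             /* empty host */
    host->off = 0;
    host->len = i;
  }

  if (i >= n || s[i] != ':') return -1; /* port separator required */
  i++;
  if (i == n) return -1;                /* empty port */
  for (j = i; j < n; j++) {
    if (s[j] < '0' || s[j] > '9') return -1;
  }
  port->off = i;
  port->len = n - i;
  return 0;
}
```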
David Gwynne
0499525110 Fix http_parser_parse_url for urls like "http://host/path".
Before this change it would include the last slash in the separator between the
schema and host as part of the host. We can't use the trick used for skipping the
separator before ports, query strings, and fragments because if it was a CONNECT
style url string (host:port) it would skip the first character of the hostname.

Work around this by introducing a few more states to represent these separators
in a url differently to what they're separating. This in turn lets us simplify
the url parsing so it can simply skip what it considers delimiters rather than
having to special case certain types of url parts and skip their prefixes.

Add tests for the http_parser_parse_url().

This compares the http_parser_url struct that http_parser_parse_url()
produces against one that we expect from the test. If they differ
then http_parser_parse_url() misbehaved.
2012-02-09 22:28:22 +01:00
Ben Noordhuis
c3153bd1a9 Eat CRLF between requests, even on connection:close.
Fixes #47.
2012-02-09 22:27:05 +01:00
Ben Noordhuis
f668e72380 Make content_length unsigned, add overflow checks. 2012-01-27 20:49:29 +01:00
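One common way to add such an overflow check while accumulating decimal digits is sketched below. This is an assumption about the general technique, not the parser's actual code: `accumulate_digit` and `parse_content_length` are hypothetical names, and `unsigned long` stands in for whatever unsigned type the field uses.

```c
#include <assert.h>
#include <limits.h>

/* Fold one decimal digit into an unsigned accumulator.
 * Returns 0 on success, -1 on a non-digit or if the result would
 * not fit. */
int accumulate_digit(unsigned long *value, char ch)
{
  unsigned long digit;
  assert(value != 0);
  if (ch < '0' || ch > '9') return -1;
  digit = (unsigned long)(ch - '0');
  /* value * 10 + digit <= ULONG_MAX
   * is equivalent to value <= (ULONG_MAX - digit) / 10,
   * which can be tested without overflowing. */
  if (*value > (ULONG_MAX - digit) / 10) return -1;
  *value = *value * 10 + digit;
  return 0;
}

/* Parse a NUL-terminated string of digits. Returns 0 on success. */
int parse_content_length(const char *s, unsigned long *out)
{
  unsigned long v = 0;
  assert(s != 0 && out != 0);
  if (*s == '\0') return -1;          /* empty value */
  for (; *s != '\0'; s++) {
    if (accumulate_digit(&v, *s) != 0) return -1;
  }
  *out = v;
  return 0;
}
```

The check runs before the multiplication, so the accumulator never wraps around.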
James McLaughlin
03e0d5292a Use "" instead of <> for the http_parser.h include.
This avoids having to specify -I when building.
2012-01-27 20:44:24 +01:00
Ben Noordhuis
3e626c6cb6 Don't use 'inline'.
'inline' is not a recognized C89 keyword, it made the build fail with strict or
older compilers (msvc 2008, gcc with -std=c89).

'inline' is also just a hint, one that gcc 4.4.3 in this particular case happily
ignored. Ergo, remove it.
2012-01-27 20:43:39 +01:00
Ivo Raisr
2a2f99f9cd http_parser_init does not clear status_code 2012-01-27 20:22:58 +01:00
Andre Caron
051d6fe219 Fixes build on MSVC. 2012-01-21 15:16:23 -05:00
Peter Griess
eb04bbe1fa Merge pull request #73 from pgriess/http-10-message-length
Get HTTP/1.1 message length logic working for HTTP/1.0
2012-01-13 06:41:41 -08:00
Peter Griess
d0bb867d1b Implement http_parser_pause().
Summary:
- Add http_parser_pause() API. A callback may invoke this at any time.
  This will cause http_parser_parse() to return indicating that it
  parsed less than the number of requested bytes and set an error to
  HBE_PAUSED. A paused parser will fail with HBE_PAUSED until it is
  un-paused with http_parser_pause().
- Stop using 'state', 'header_state', 'index', and 'nread' shadow
  variables and then updating their http_parser fields when we're done.
  Instead, update the live values as we go. This will make it possible
  to return from anywhere in the parser (say, due to EPAUSED) and have
  valid/expected state.
- Update state before making callbacks so that if they want to pause,
  we'll know the correct state already.
- Make sure that every callback has a state that uniquely identifies the
  next step so that we can resume in the right place if we were supposed
  to be paused.
- Clean up and re-factor CALLBACK() macros.
- Use CALLBACK() macros for (almost) all callbacks; on_headers_complete
  is still a special case. This includes on_body which we used to invoke
  manually with a long run of bytes. We now use a 'body' mark and hit
  its callback just like every other data callback.
- Clean up (most) gotos and replace with real states.
- Add some unit tests.

Fixes #70
2012-01-08 20:43:35 -06:00
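The pause mechanism described in this commit can be illustrated with a toy line counter. This is a sketch under simplifying assumptions, not the library's implementation: `ex_parser`, `ex_pause`, `ex_execute` and the callback are hypothetical names. It shows live state being updated before the callback runs, and a short byte count being returned once a callback has paused the parser.

```c
#include <assert.h>
#include <stddef.h>

struct ex_parser;
typedef int (*ex_cb)(struct ex_parser *);

struct ex_parser {
  int paused;      /* set by ex_pause(), checked before each byte */
  ex_cb on_line;   /* invoked after each '\n' */
  unsigned lines;  /* live state, updated as we go */
};

/* Pause (paused != 0) or un-pause (paused == 0) the parser. */
void ex_pause(struct ex_parser *p, int paused) { p->paused = paused; }

/* Consume up to len bytes. Returns the number of bytes consumed,
 * which is less than len if the parser is or becomes paused. */
size_t ex_execute(struct ex_parser *p, const char *data, size_t len)
{
  size_t i;
  assert(p != 0);
  for (i = 0; i < len; i++) {
    if (p->paused) return i;
    if (data[i] == '\n') {
      p->lines++;                     /* update state first ... */
      if (p->on_line) p->on_line(p);  /* ... then call back */
    }
  }
  return i;
}

/* Example callback: pause once two lines have been seen. */
static int ex_pause_on_second_line(struct ex_parser *p)
{
  if (p->lines == 2) ex_pause(p, 1);
  return 0;
}
```

Because the state is already current when the callback fires, the caller can resume from the returned offset after un-pausing.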
Peter Griess
b115d110a3 Don't wait for EOF on 0-length KA messages.
- Break EOF handling out of http_should_keep_alive() into
  http_message_needs_eof(), which we now use when determining what to do
  with a message of unknown length. This prevents us from falling into
  the s_body_identity_eof state in the cases where we actually *do* know
  the length of the message (e.g. because the response status was 204).
2012-01-07 17:51:04 -06:00
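The decision this commit describes, whether a message's end can only be detected by EOF, can be sketched as follows. The struct and function names are hypothetical simplifications, not the real `http_parser` fields. The no-body cases follow RFC 2616, section 4.4.

```c
#include <assert.h>

/* Hypothetical, simplified view of a parsed response. */
struct example_response {
  unsigned int status_code;
  int is_chunked;          /* Transfer-Encoding: chunked was seen */
  int has_content_length;  /* a Content-Length header was seen */
  int is_head_response;    /* response to a HEAD request */
};

/* Return 1 if the body can only be delimited by connection close. */
int example_needs_eof(const struct example_response *r)
{
  assert(r != 0);
  /* Responses that never carry a body. */
  if (r->status_code / 100 == 1 ||   /* 1xx, e.g. Continue */
      r->status_code == 204 ||       /* No Content */
      r->status_code == 304 ||       /* Not Modified */
      r->is_head_response) {
    return 0;
  }
  /* The length is known from the message framing. */
  if (r->is_chunked || r->has_content_length) return 0;
  return 1;
}
```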
Peter Griess
248fbc3ab4 Get HTTP/1.1 message length logic working for HTTP/1.0
- Port message length logic from #72 to HTTP/1.0.
- Add a bunch of unit tests for handling 0-length messages.
2012-01-07 17:50:16 -06:00
Peter Griess
d7675cd9a6 Add http_parser_parse_url().
- Add an http_parser_parse_url() method to parse a URL into its
  constituent components. This uses the same underlying parser
  as http_parser_parse() and doesn't do any data copies.
- Re-add the URL components in various test.c structures; validate
  them when parsing.
2012-01-07 16:53:11 -06:00
Peter Griess
48a4364fdd Remove some chars from tokens[] per RFC.
- Treat ' ' specially, as apparently IIS6.0 can send this in headers.
  Allow this character through if we're not in strict mode.
- Move some test code around so that test indices don't break when
  HTTP_PARSER_STRICT changes.

Fixes #13.
2012-01-06 15:40:31 -06:00
koichik
b47c44d7a6 Fix response body is not read
With HTTP/1.1, if neither Content-Length nor Transfer-Encoding is present,
section 4.4 of RFC 2616 suggests http-parser needs to read a response body
until the connection is closed (except when the response must not include a body).

See also joyent/node#2457.
Fixes #72
2012-01-05 22:17:12 -08:00
Felix Geisendörfer
2498961231 Accept HTTP/0.9 responses
See joyent/node#1711
2011-11-22 12:51:01 -08:00
Paul Querna
f1d48aa31c Move all data to before code to fix http parser for c89. 2011-10-02 00:36:16 -04:00
Fouad Mardini
2b2ba2da1a rename parser->errno to parser->http_errno; conflicts with errno.h where errno is defined as a macro 2011-07-24 18:49:54 +03:00
Peter Griess
53adfacad1 API CHANGE: Remove path, query, fragment CBs.
- Get rid of support for these callbacks in http_parser_settings.
- Retain state transitions between different URL portions in
  http_parser_execute() so that we're making the same correctness
  guarantees as before.
- These are being removed because making multiple callbacks for the same
  byte makes it more difficult to pause the parser.
2011-07-20 12:16:07 -05:00
Peter Griess
49faf2e9cd Merge pull request #53 from pgriess/callback_noclear
Get rid of CALLBACK_NOCLEAR().
2011-07-20 10:06:46 -07:00
Peter Griess
5469827542 Get rid of CALLBACK_NOCLEAR().
- This was only used by CALLBACK() (which then cleared the mark anyway),
  and the end of the http_parser_execute() body (after which they
  go out of scope).
2011-07-09 13:57:17 -05:00
Peter Griess
761a5eaeb1 Break out errno into its own field. 2011-07-09 11:51:13 -05:00
Jon Kolb
8153466643 Group POST refinements, test all request methods, make IS_ALPHA use LOWER internally 2011-06-20 12:42:57 -04:00
Peter Griess
9114e58a77 Facility to report detailed parsing errors.
- Add http_errno enum w/ values for many parsing error conditions. Stash
  this in http_parser.state if the 0x80 bit is set.
- Report line numbers on error generation if the (new) HTTP_PARSER_DEBUG
  cpp symbol is set. Increases http_parser struct size by 8 bytes in
  this case.
- Add http_errno_*() methods to help turning errno values into
  human-readable messages.
2011-06-19 13:25:03 -05:00
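Stashing an error code in a state byte behind the 0x80 bit, as this commit describes, might look roughly like the sketch below. The helper names and `EX_ERROR_BIT` are hypothetical, and the code assumes error values fit in the low 7 bits.

```c
#include <assert.h>

#define EX_ERROR_BIT 0x80u

/* Encode an error code (0..127) into a state byte by setting the
 * top bit. */
unsigned char ex_set_error(unsigned char code)
{
  assert(code < 0x80);
  return (unsigned char)(EX_ERROR_BIT | code);
}

/* Return nonzero if the state byte holds an error, not a state. */
int ex_is_error(unsigned char state)
{
  return (state & EX_ERROR_BIT) != 0;
}

/* Extract the error code from an error-flagged state byte. */
unsigned char ex_error_code(unsigned char state)
{
  return (unsigned char)(state & 0x7f);
}
```

Sharing one byte keeps the struct small, at the cost of losing the parser state once an error is recorded.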
Peter Griess
056bcd3672 Merge pull request #49 from pgriess/upgrade-off-by-one
Fix off-by-one in handling upgrade bodies.
2011-06-19 10:51:23 -07:00
Peter Griess
d4ca280af5 Fix off-by-one in handling upgrade bodies.
- When handling upgraded bodies, http_parser_execute() used to return
  one fewer bytes parsed than expected. This caused the final LF to be
  interpreted by the caller as part of the body.
- Add a bunch of upgrade body unit tests.
2011-06-18 18:57:32 -05:00
Cliff Frey
d5f0312eee remove unused LOWER(ch) 2011-06-18 11:47:02 -07:00
Jon Kolb
a6934445e8 Allow uppercase chars in IS_ALPHANUM 2011-06-18 13:46:29 -04:00
Peter Griess
f684abdcc5 Merge pull request #27 from a2800276/master
lowercasing in header after check for CR LF
2011-06-11 09:36:30 -07:00
Jon Kolb
dc314a3cb9 Return error when bad method starts with M or C 2011-06-10 13:36:36 -04:00
Sean Cunningham
b89f94414e Support multi-line folding in header values.
Normal value cb is called for subsequent lines.  LWS is skipped.
  Note that \t whitespace character is now supported after header field name.

  RFC 2616, Section 2.2
  "HTTP/1.1 header field values can be folded onto multiple lines if the
   continuation line begins with a space or horizontal tab. All linear
   white space, including folding, has the same semantics as SP. A
   recipient MAY replace any linear white space with a single SP before
   interpreting the field value or forwarding the message downstream."
2011-06-03 19:25:54 -04:00
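The folding semantics quoted above can be shown with a small standalone sketch that unfolds a header value in place. This is not how the parser handles it (per the commit, the value callback is called again for subsequent lines); `ex_unfold` is a hypothetical helper that only illustrates replacing a fold with a single SP.

```c
#include <assert.h>
#include <stddef.h>

/* Unfold in place: each CRLF followed by SP or HT, together with any
 * further SP/HT, is replaced by one SP. Returns the new length. */
size_t ex_unfold(char *s, size_t len)
{
  size_t r = 0, w = 0;   /* read and write positions, w <= r */
  assert(s != 0 || len == 0);
  while (r < len) {
    if (r + 2 < len && s[r] == '\r' && s[r + 1] == '\n' &&
        (s[r + 2] == ' ' || s[r + 2] == '\t')) {
      r += 2;            /* skip CRLF */
      while (r < len && (s[r] == ' ' || s[r] == '\t')) r++;
      s[w++] = ' ';      /* the whole fold collapses to one SP */
    } else {
      s[w++] = s[r++];
    }
  }
  return w;
}
```

A CRLF that is not followed by a space or tab is left untouched, since it ends the header rather than continuing it.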
Cliff Frey
3258e4a455 Fix build when char is unsigned by default.
I tested by building/testing with -funsigned-char.  Thanks to apaprocki for
pointing out this problem.
2011-06-03 14:15:33 -07:00