Sat, 10 Mar 2018 11:06:13 -0800 hgweb: expose input stream on parsed WSGI request object
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 10 Mar 2018 11:06:13 -0800] rev 36857
hgweb: expose input stream on parsed WSGI request object Our next step towards moving away from wsgirequest to our newer, friendlier parsedrequest type is input stream access. This commit exposes the input stream on the instance. Consumers in the HTTP protocol server switch to it. Because there were very few consumers of the input stream, we stopped storing a reference to the input stream on wsgirequest directly. All access now goes through parsedrequest. However, wsgirequest still may read from this stream as part of cgi.parse(). So we still need to create the stream from wsgirequest. Differential Revision: https://phab.mercurial-scm.org/D2771
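The ownership arrangement described above can be pictured with a minimal sketch. The class names `LegacyRequest` and `ParsedRequest` and the attribute names `req` and `bodyfh` are hypothetical stand-ins for illustration, not the actual hgweb types or attributes; the only assumption taken from the message is that the legacy object builds the parsed request and consumers reach the stream through it.

```python
import io


class ParsedRequest:
    """Hypothetical stand-in for the newer parsed request type."""

    def __init__(self, method, bodyfh):
        self.method = method
        # The parsed request is the single owner of the input stream.
        self.bodyfh = bodyfh


class LegacyRequest:
    """Hypothetical stand-in for the legacy request wrapper."""

    def __init__(self, environ):
        self.env = environ
        # PEP 3333 requires REQUEST_METHOD and wsgi.input in the environ.
        # No stream attribute is stored on this object itself.
        self.req = ParsedRequest(environ['REQUEST_METHOD'],
                                 environ['wsgi.input'])


env = {'REQUEST_METHOD': 'POST', 'wsgi.input': io.BytesIO(b'payload')}
legacy = LegacyRequest(env)
body = legacy.req.bodyfh.read()
```

Because only one object exposes the stream, two consumers cannot read from it independently by accident.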
Sat, 10 Mar 2018 10:56:10 -0800 hgweb: make parsedrequest part of wsgirequest
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 10 Mar 2018 10:56:10 -0800] rev 36856
hgweb: make parsedrequest part of wsgirequest This is kind of ugly. But an upcoming commit will teach parsedrequest about the input stream. Because the input stream is global state and can't be accessed without side-effects, we need to take actions to ensure that multiple consumers don't read from it independently. The easiest way to do this is for one object to hold a reference to both items having access to the input stream so that when a copy is made, we can remove the attribute from the other instance. So we create our parsed request instance from the wsgirequest constructor and hold a reference to it there. This is better than our new type holding a reference to wsgirequest because all the code for managing access will be temporary and we shouldn't pollute parsedrequest with this ugly history. Differential Revision: https://phab.mercurial-scm.org/D2770
Sat, 10 Mar 2018 11:03:45 -0800 hgweb: refactor the request draining code
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 10 Mar 2018 11:03:45 -0800] rev 36855
hgweb: refactor the request draining code The previous code for draining was only invoked in a few places in the wire protocol. Behavior wasn't consistent. Furthermore, it was difficult to reason about. With us converting the input stream to a capped reader, it is now safe to always drain the input stream when its size is known because we can never overrun the input and read into the next HTTP request. The only question is "should we?" This commit changes the draining code so every request is examined. Draining now kicks in for a few requests where it wouldn't before. But I think the code is sufficiently restricted so the behavior is safe. Possibly the most dangerous part of this code is the issuing of Connection: close for POST and PUT requests that don't have a Content-Length. I don't think there are any such uses in our WSGI application, so this should be safe. In the near future, I plan to significantly refactor the WSGI response handling. I anticipate this code evolving a bit. So any minor regressions around draining or connection closing behavior might be fixed as a result of that work. All tests pass with this change. That scares me a bit because it means we are lacking low-level tests for the HTTP protocol. Differential Revision: https://phab.mercurial-scm.org/D2769
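The two rules stated above (drain when the size is known, otherwise send Connection: close for POST and PUT) can be sketched roughly as follows. The function name `finish_request`, its parameters, and the chunk size are assumptions made for illustration; the real conditions in hgweb are more restricted than this and are not spelled out in the message.

```python
import io


def finish_request(method, content_length, stream, chunk=32768):
    """Deal with unread request input; return extra response headers.

    Hypothetical helper: ``stream`` is assumed to be capped at
    ``content_length`` bytes whenever that length is known.
    """
    if content_length is not None:
        # Size is known and the stream is capped, so reading to EOF
        # cannot run into the next HTTP request on the connection.
        while stream.read(chunk):
            pass
        return []
    if method in ('POST', 'PUT'):
        # Unknown body size: we cannot drain safely, so ask the server
        # to close the connection instead of reusing it.
        return [('Connection', 'close')]
    return []


body = io.BytesIO(b'x' * 100)
drained_headers = finish_request('POST', 100, body)
close_headers = finish_request('POST', None, io.BytesIO())
```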
Sat, 10 Mar 2018 10:48:34 -0800 hgweb: use a capped reader for WSGI input stream
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 10 Mar 2018 10:48:34 -0800] rev 36854
hgweb: use a capped reader for WSGI input stream Per PEP 3333, the input stream from WSGI should respect EOF and prevent reads past the end of the request body. However, not all WSGI servers guarantee this. Notably, our BaseHTTPServer based built-in HTTP server doesn't. Instead, it exposes the raw socket and you can read() from it all you want, getting the connection in a bad state by doing so. We have a "cappedreader" utility class that proxies a file object and prevents reading past a limit. This commit converts the WSGI input stream into a capped reader when the input length is advertised via Content-Length headers. "cappedreader" only exposes a read() method. PEP 3333 states that the input stream MUST also support readline(), readlines(hint), and __iter__(). However, since our WSGI application code only calls read() and since we're not manipulating the stream exposed by the WSGI server, we're not violating the spec here. Differential Revision: https://phab.mercurial-scm.org/D2768
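As a rough illustration of the idea, a capped reader can be written in a few lines. This is a sketch under stated assumptions, not Mercurial's own "cappedreader" code: the names `CappedReader` and `wrap_input` are hypothetical, and like the class described above it exposes only read().

```python
import io


class CappedReader:
    """Proxy a file object and refuse to read past ``limit`` bytes."""

    def __init__(self, fh, limit):
        self._fh = fh
        self._left = limit

    def read(self, n=-1):
        if self._left <= 0:
            return b''
        # Clamp unbounded or oversized reads to the remaining budget.
        if n < 0 or n > self._left:
            n = self._left
        data = self._fh.read(n)
        self._left -= len(data)
        return data


def wrap_input(environ):
    """Return a capped stream when Content-Length is advertised."""
    stream = environ['wsgi.input']
    length = environ.get('CONTENT_LENGTH')
    if length:
        return CappedReader(stream, int(length))
    # No advertised length: hand back the server's stream untouched.
    return stream


# A raw "socket" holding one 4-byte body followed by the next request.
raw = io.BytesIO(b'bodyNEXT REQUEST')
capped = wrap_input({'wsgi.input': raw, 'CONTENT_LENGTH': '4'})
first = capped.read()
second = capped.read()
```

Here the unbounded `read()` stops after the advertised four bytes, so the bytes belonging to the next request stay on the underlying stream.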
Sat, 10 Mar 2018 10:47:30 -0800 hgweb: document continuereader
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 10 Mar 2018 10:47:30 -0800] rev 36853
hgweb: document continuereader Differential Revision: https://phab.mercurial-scm.org/D2767
Thu, 08 Mar 2018 18:00:04 -0800 hgweb: remove wsgirequest.__iter__
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 08 Mar 2018 18:00:04 -0800] rev 36852
hgweb: remove wsgirequest.__iter__ This was added in d0db3462d568 in 2006. I can't find a justification for this method in PEP 3333. I suspect we were originally intending to use this type as the WSGI application (which should be iterable)? The tests all pass without this method. So let's nuke it. Differential Revision: https://phab.mercurial-scm.org/D2749
Thu, 08 Mar 2018 17:57:07 -0800 hgweb: remove wsgirequest.read()
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 08 Mar 2018 17:57:07 -0800] rev 36851
hgweb: remove wsgirequest.read() This was just a proxy to self.inp.read(). This method serves little value. Let's nuke it. Callers in the wire protocol server have been updated accordingly. Differential Revision: https://phab.mercurial-scm.org/D2748
Sat, 10 Mar 2018 10:46:08 -0800 hgweb: remove unused methods on wsgirequest
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 10 Mar 2018 10:46:08 -0800] rev 36850
hgweb: remove unused methods on wsgirequest writelines() isn't used in our code base. close() was a no-op. It is an optional method per PEP 3333. My eventual goal is to kill the wsgirequest class, which is why I'm removing code. Differential Revision: https://phab.mercurial-scm.org/D2747
Thu, 08 Mar 2018 17:17:48 -0800 wireprotoserver: remove unused argument from _handlehttperror()
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 08 Mar 2018 17:17:48 -0800] rev 36849
wireprotoserver: remove unused argument from _handlehttperror() Differential Revision: https://phab.mercurial-scm.org/D2746
Sat, 10 Mar 2018 10:44:56 -0800 hgweb: store and use request method on parsed request
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 10 Mar 2018 10:44:56 -0800] rev 36848
hgweb: store and use request method on parsed request PEP 3333 says that REQUEST_METHOD is always defined. Differential Revision: https://phab.mercurial-scm.org/D2745