HTTP.md (2519B)
+++
title = 'HTTP'
+++
# HTTP
a request-response protocol

client is a user agent (e.g. a browser), server is an application process, typically running on a different computer

URI: uniform resource identifier

- URN: uniform resource name
    - unique name for resource (like a book's ISBN)
    - relies on some authority to provide information
- URL: uniform resource locator
    - location on the web
    - expect HTTP GET request to provide information about this resource
- a modern form is IRI (internationalized resource identifier), which can also include Unicode characters, not just ASCII

URLs:
`http://www.example.org:5678/a/b.txt?tc=win&r=10#para5`

- contain
    - authority
        - host (FQDN, fully qualified domain name)
        - port
    - request-URI
        - path
        - query
    - fragment
- browser connects to authority over TCP
- request-URI is included in the start line of the request (/ is default)
- fragment is not sent to server; the browser uses it to scroll the user's view

Content negotiation

- multiple representations are available via the same URI
- client-server negotiation determines which one is returned

Request methods

- GET: get documents, no body in request
- POST: e.g. when you click submit, form information is included in the body
- HEAD: requests only the header fields to be returned

Request example:

```
GET /test.html HTTP/1.1
Host: www.example.org
…
```

Response example:

```
HTTP/1.1 200 OK
Date: <timestamp>
```

Header fields are included in the requests, like:

- user agent
- referrer (spelled `Referer` in the header itself)
- content type
- acceptable MIME types (data types like text/html, image/png, video/mp4)
- character encoding (most popular being UTF-8 encoding of Unicode)

HTTP response codes:

- three-digit number
- first digit is the class
    - 1: informational
    - 2: success
    - 3: redirect
    - 4: client error (the request was bad)
    - 5: server error (the server failed on a valid request)
- other two digits are extra information

HTTP servers:

- initially, there weren't that many; in the early 90s files were mostly served over FTP
- then NCSA released httpd for free, but its main developer left for Netscape and support stopped
- so developers started making patches, and turned it into "a patchy server"…the Apache server
- a main loop for a server looks like so:

```
listen on TCP port 80
while forever
    accept connection
    read request
    send response
```

- the server can't assume a single hostname of its own, because several hostnames may point to the same machine (virtual hosts) and each may need a different response; the `Host` header says which one the client asked for
- the server uses a config file to determine where "/" is (the document root)
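
The main loop, virtual hosts and document root described above can be sketched in Python. This is a minimal illustration under stated assumptions, not how Apache or any real server does it: the `DOCUMENT_ROOTS` table, its paths, the function names and port 8080 (used instead of 80 to avoid needing root privileges) are all hypothetical, and the sketch reads each request with a single `recv` call, which a real server could not rely on.

```python
import socket
from pathlib import Path

# Hypothetical virtual-host table: hostname -> document root.
# A real server would read this from its config file.
DOCUMENT_ROOTS = {
    "www.example.org": Path("/var/www/example"),
    "blog.example.org": Path("/var/www/blog"),
}


def parse_request(raw: bytes):
    """Split a raw request into (method, request_uri, headers)."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    # Start line: "GET /test.html HTTP/1.1"; raises ValueError if malformed.
    method, request_uri, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, request_uri, headers


def resolve(host: str, request_uri: str):
    """Map (Host header, request-URI) to a file path, or None if not servable."""
    hostname = host.split(":", 1)[0].lower()  # drop an optional ":port"
    root = DOCUMENT_ROOTS.get(hostname)
    if root is None:
        return None  # unknown virtual host
    path = request_uri.split("?", 1)[0]  # the query is not part of the file path
    if ".." in Path(path).parts:
        return None  # refuse to step outside the document root
    if path.endswith("/"):
        path += "index.html"  # assumed default document
    return root / path.lstrip("/")


def handle_request(raw: bytes) -> bytes:
    """Build a complete HTTP/1.1 response for one raw request."""
    try:
        method, request_uri, headers = parse_request(raw)
    except ValueError:
        return b"HTTP/1.1 400 Bad Request\r\nContent-Length: 0\r\nConnection: close\r\n\r\n"
    target = resolve(headers.get("host", ""), request_uri)
    if method not in ("GET", "HEAD"):
        status, body = "501 Not Implemented", b""
    elif target is None or not target.is_file():
        status, body = "404 Not Found", b"not found\n"
    else:
        status, body = "200 OK", target.read_bytes()
    head = f"HTTP/1.1 {status}\r\nContent-Length: {len(body)}\r\nConnection: close\r\n\r\n"
    # HEAD gets the same header fields as GET, but no body.
    return head.encode("ascii") + (b"" if method == "HEAD" else body)


def serve(port: int = 8080) -> None:
    """Listen once, then accept, read and respond forever."""
    with socket.create_server(("", port)) as server:
        while True:
            conn, _addr = server.accept()
            with conn:
                raw = conn.recv(65536)  # simplification: assumes one read is enough
                if raw:
                    conn.sendall(handle_request(raw))
```

Calling `serve()` starts the loop. Keeping `handle_request` free of socket code means the routing logic can be exercised with plain byte strings, for example `handle_request(b"GET / HTTP/1.1\r\nHost: unknown.example\r\n\r\n")`, which yields a 404 because that hostname is not in the table.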