Contents:

1. Introduction

In REBOL, a scheme is a set of definition and functions that handle the operations of a kind of port. This scheme handles the HTTP ports, allowing access to resources served via HTTP. Version 1.1 of the HTTP protocol is supported (although some features are not yet implemented at this point).

2. Overview

The scheme is defined by calling the make-scheme function. Our port spec adds the following words to the standard network port spec:

path
path of the HTTP request target;
method
HTTP request method (as a word);
headers
additional HTTP headers to send (block, gets evaluated into an object);
content
HTTP request content;
timeout
used for sync operations only at this point.

We also add the following words to the standard file info object:

response-line
HTTP response line (string!);
headers
HTTP response headers (object!).

When actor is defined as a block, make-scheme uses append to define it so that the function bodies are not bound to the actor context.

Overview

Support functions

make-scheme [
 name: 'http
 title: "HyperText Transport Protocol v1.1"
 spec: make system/standard/port-spec-net [
  path: %/
  method: 'get
  headers: [ ]
  content: none
  timeout: 15 ; used for sync ops only
 ]
 info: make system/standard/file-info [
  response-line:
  headers: none
 ]
 actor: [
  Actor functions
 ]
]

3. Actor functions

HTTP ports can be accessed at different levels of abstraction. In the higher level, you mostly want to read the contents of a HTTP URL, synchronously, with the system handling things like redirects, authentication, cookies and so on automatically. (Note that cookies do not automatically persist across sections; see [not yet implemented] for more info.)

In the lower level, you can do multiple requests with a single connection, and work asynchronously. The system still handles most of the stuff for you, but you have much more freedom in the handling of the responses. This is also useful to implement HTTP-based protocols, like WebDav.

Actor functions

Higher level actors
Lower level actors

3.1 Higher level actors

read is the highest level actor, when called on an url! or a port! without an awake function defined. In these cases, it is synchronous, and just does a GET request and returns the content.

Higher level actors

read: func [
 port [port!]
 ; refinements not handled yet
] [
 either any-function? :port/awake [
  Asynchronous read
 ] [
  sync-op port [ ]
 ]
]

write is still high level when called on an url! or a port that does not have an awake function (sync behavior). It allows specifying the method, headers and content of the HTTP request, then returns the content of the response. It is very similar to read.

value can be a block, where you can specify the HTTP method (as a word!) for the request (POST by default), the target (as a file! or url!, port/spec/path by default), any additional headers (as a block!, gets evaluated to an object) and the content for the request (any-string!); if you pass any other value, it gets form'ed (while any-string! values are sent as-is) and sent as the content of a POST request (with a Content-Type of "application/x-www-form-urlencoded").

Higher level actors +≡

write: func [
 port [port!]
 value
] [
 unless any [block? :value any-string? :value] [value: form :value]
 unless block? value [value: reduce [[Content-Type: "application/x-www-form-urlencoded"] value]]
 either any-function? :port/awake [
  Asynchronous write
 ] [
  sync-op port [parse-write-dialect port value]
 ]
]

3.1.1 The sync-op function

This function performs a sync operation on the port (eg. read or write). It sets up an awake function for the HTTP port (that will take care of performing the actual request when the port is ready, and returning from wait when the response is complete), opens and sets up the port if necessary, does the body block, then just waits and returns the received data. See The read-sync-awake function for the read-sync-awake function.

The sync-op function

sync-op: func [port body] [
 port/awake: :read-sync-awake
 unless port/state [open port port/state/close?: yes]
 do body
 ; if we're ready already, perform the request
 if port/state/state = 'ready [do-request port]
 unless port? wait [port/state/connection port/spec/timeout] [cause make error! "Timeout"]
 if port/state/close? [close port]
 port/awake: none ; reset so that next op will be sync too
 copy port
]

3.1.2 The read-sync-awake function

The awake function for synchronous read is very simple, and just starts the request when the port is ready, and returs from wait when the response is complete. For do-request, see Support functions.

The read-sync-awake function

read-sync-awake: func [event [event!] /local error] [
 switch/default event/type [
  ; read and wrote can be used to show progress
  connect ready [
   ; ready for a request
   ; users would call read port here - we optimize by calling do-request directly
   do-request event/port
   false
  ]
  done [
   ; response is complete, data is ready
   true
  ]
  ; do we need a close event??
  error [
   ; users can get the error using query - we get it directly
   error: event/port/state/error
   event/port/state/error: none
   cause error
  ]
  ; custom event can be used for custom response processing and
  ; handling of redirects
 ] [
  false
 ]
]

3.2 Lower level actors

The open actor is the one responsible of initializing the port and opening the TCP connection to the server. It is either called directly by users (on an url! or port! etc.), or indirectly when doing a read or write to an url! and so on.

Notice that open is always async, and only initiates the TCP connection; it returns immediately after setting up an awake function for the TCP port (see The awake function for the TCP connection), so you have to wait on the port for the connection to actually happen.

Lower level actors

open: func [
 port [port!]
 ; refinements not handled yet
 /local conn
] [
 ; noop if already open - should throw error?
 if port/state [return port]
 port/state: context [
  state: 'inited
  connection:
  error: none
  close?: no
  info: make port/scheme/info [ ]
 ]
 port/state/connection: conn: make port! [
  scheme: 'tcp
  host: port/spec/host
  port-id: port/spec/port-id
 ]
 conn/awake: :http-awake
 conn/locals: port ; locals is a bad name - need to be able to use custom word
 open conn
 port
]

read and write can also behave as async functions when used on an opened port. See Asynchronous read and Asynchronous write.

The open? actor checks if the port is open.

Lower level actors +≡

open?: func [
 port [port!]
] [
 ; not sure here
 found? all [port/state open? port/state/connection]
]

The close actor closes an open port.

Lower level actors +≡

close: func [
 port [port!]
] [
 if port/state [
  close port/state/connection
  port/state/connection/awake: none
  port/state: none
 ]
 port
]

The copy actor returns the port data, unless the request method was HEAD, in which case the response header is returned. You can use query to obtain more info.

Lower level actors +≡

copy: func [
 port [port!]
] [
 either all [port/spec/method = 'head port/state] [
  reduce [port/state/info/response-line port/state/info/headers]
 ] [
  if port/data [copy port/data]
 ]
]

query can be used at any time to obtain information from the port. In case there has been an error ('error event type), the actual error! value is returned; otherwise it just returns the port/state/info object.

Note that query also resets the error condition.

Lower level actors +≡

query: func [
 port [port!]
 /local error
] [
 either error? error: port/state/error [
  port/state/error: none
  error
 ] [
  port/state/info
 ]
]

length? returns the number of bytes of content that have been read so far. This does not include the header, or the overhead for the chunked transfer, and so on.

Lower level actors +≡

length?: func [
 port [port!]
] [
 either port/data [length? port/data] [0]
]

3.2.1 The awake function for the TCP connection

The http-awake function is set as the awake function of the TCP connection used by the HTTP port. It handles the events on the connection, reads the server's response and parses it, and sends events to the HTTP port's awake function.

The awake function for the TCP connection

http-awake: func [event /local port http-port d1 d2 h] [
 port: event/port
 http-port: port/locals
 switch/default event/type [
  read [
   http-port/awake make event! [type: 'read port: http-port]
   check-response http-port
  ]
  wrote [
   http-port/awake make event! [type: 'wrote port: http-port]
   http-port/state/state: 'reading-headers
   read port
   false
  ]
  lookup [open port false]
  connect [
   http-port/state/state: 'ready
   http-port/awake make event! [type: 'connect port: http-port]
  ]
  ; close event does not seem to work...
 ] [true]
]

3.2.2 Asynchronous read

read can be used for async reads. It does only work when called on a port in ready state, so should be called from the HTTP port's awake function when you get the 'connect or 'ready events. It will inintiate the HTTP request; you'll get 'write events while the request is being written, 'wrote when it has been sent, and 'read events while data is being read (useful for progress indicators - you can query the port to obtain info such as the response line, and length? port to get the number of bytes of the content that have been read so far). When all the response data has been downloaded, you'll get a 'done event.

Asynchronous read

unless open? port [cause-error 'Access 'not-open port/spec/ref]
; should be something like "port is not ready"
if port/state/state <> 'ready [cause make error! "Port not ready"]
do-request port
port

3.2.3 Asynchronous write

write can be used for async writes. It's almost the same as read except that you can specify custom values for the request (see Higher level actors and Support functions).

Asynchronous write

unless open? port [cause-error 'Access 'not-open port/spec/ref]
if port/state/state <> 'ready [cause make error! "Port not ready"]
parse-write-dialect port value
do-request port
port

4. Support functions

This section contains all the support functions used by the scheme.

Support functions

The sync-op function
The read-sync-awake function
The awake function for the TCP connection

make-http-request is used to generate the actual request sent to the server. Note, that the whole request is being generated in memory, then sent to the server. In some cases it may be desirable to generate the request incrementally. We plan to support this in the near future.

Support functions +≡

make-http-request: func [
 "Create an HTTP request (returns string!)"
 method [word! string!] "E.g. GET, HEAD, POST etc."
 target [file! string!] {In case of string!, no escaping is performed (eg. useful to override escaping etc.). Careful!}
 headers [block!] "Request headers (set-word! string! pairs)"
 ; maybe allow content to be a function that can insert directly into output,
 ; for maximum efficiency (no Content-Length in that case though)
 content [any-string! none!] {Request contents (Content-Length is created automatically). Empty string not exactly like none.}

 /local result
] [
 result: rejoin [
  uppercase form method #" "
  ; needs better escaping!!
  either file? target [next mold target] [target]
  " HTTP/1.1" CRLF
 ]
 foreach [word string] headers [
  repend result [mold word #" " string CRLF]
 ]
 if content [
  repend result ["Content-Length: " length? content CRLF]
 ]
 append result CRLF
 if content [append result content]
 result
]

The do-request function makes the HTTP request to the server.

WARNING

Need to change it so that it can write more than 32000 bytes per request.

Support functions +≡

do-request: func [
 "Perform an HTTP request"
 port [port!]
] [
 port/spec/headers: third make make object! [
  Accept: "*/*"
  Host: either port/spec/port-id <> 80 [
   rejoin [form port/spec/host #":" port/spec/port-id]
  ] [
   form port/spec/host
  ]
  User-Agent: "REBOL"
 ] port/spec/headers
 port/state/state: 'doing-request
 port/state/info/headers: port/state/info/response-line: port/data: none
 write port/state/connection
  make-http-request port/spec/method to file! any [port/spec/path %/]
   port/spec/headers port/spec/content
]

The parse-write-dialect function parses the block passed to write and extracts the request method, request target, request headers and request content. The values in the port object are set accordingly so that they can be used later on by the do-request function.

Support functions +≡

parse-write-dialect: func [port block] [
 parse block [
  [set block word! (port/spec/method: block) | (port/spec/method: 'post)]
  opt [set block [file! | url!] (port/spec/path: block)]
  [set block block! (port/spec/headers: block) | (port/spec/headers: [ ])]
  [set block any-string! (port/spec/content: block) | (port/spec/content: none)]
 ]
]

4.1 The check-response function

check-response is the function that checks the response header and handles server errors, redirects, and so on. Returns true if we are done (ie, port/data is the actual requested content).

Support functions +≡

check-response: func [port /local conn res h line] [
 conn: port/state/connection
 h: port/state/info/headers
 line: port/state/info/response-line
 Check if header has been received, and parse it
 If header has not been received completely yet, read more data
 res: false
 parse/all line [
  "HTTP/1." [#"0" | #"1"] some #" " [
   #"1" (Handle Informational response)
   |
   #"2" [Handle Success responses]
   |
   #"3" [Handle Redirect responses]
   |
   #"4" [Handle Client Error responses]
   |
   #"5" (Handle Server Error responses)
  ]
  |
  (
   port/state/error: make error! "HTTP response version not supported"
   res: port/awake make event! [type: 'error port: port]
   close port
  )
 ]
 res
]

4.1.1 Check if header has been received, and parse it

Check if header has been received, and parse it

if all [
 ; assuming HTTP 1.x response - need to handle legacy HTTP 0.9 too
 not h
 d1: find conn/data crlf
 d2: find/tail d1 crlf2
] [
 port/state/info/response-line: line: as-string copy/part conn/data d1
 port/state/info/headers: h: construct/with d1 http-response-headers
 if h/content-length [h/content-length: to integer! h/content-length]
 remove/part conn/data d2
 port/state/state: 'reading-data
]

4.1.2 If header has not been received completely yet, read more data

If header has not been received completely yet, read more data

unless h [
 read conn
 return false
]

4.1.3 Handle Informational response

Informational (1xx) responses are just ignored for now.

Handle Informational response

port/state/info/headers: port/state/info/response-line: port/data: none
port/state/state: 'reading-headers
read conn

4.1.4 Handle Success responses

Handle Success responses

["04" | "05"] ( ; No Content / Reset Content
 port/state/state: 'ready
 Send 'done and 'ready events
)
|
( ; Response has content
 either port/spec/method = 'head [
  ; no need to read content
  port/state/state: 'ready
  Send 'done and 'ready events
 ] [
  res: check-data port
  if all [not res port/state/state = 'ready] [Send 'done and 'ready events]
 ]
)

4.1.5 Handle Redirect responses

Handle Redirect responses

"03" ( ; See other
 either port/spec/method = 'head [
  port/state/state: 'ready
  res: port/awake make event! [type: 'custom port: port code: 0]
 ] [
  res: check-data port
 ]
 if all [not res port/state/state = 'ready] [
  either all [in h 'Location not res] [
   port/spec/method: 'get
   res: do-redirect port h/location
  ] [
   port/state/error: make error! "Redirect requires manual intervention"
   res: port/awake make event! [type: 'error port: port]
  ]
 ]
)
|
"04" (port/state/state: 'ready Send 'done and 'ready events) ; Not modified
|
"05" ( ; Use proxy
 port/state/state: 'ready
 port/state/error: make error! "Proxies not supported yet"
 res: port/awake make event! [type: 'error port: port]
)
|
( ; Any other redir
 either port/spec/method = 'head [
  port/state/state: 'ready
  res: port/awake make event! [type: 'custom port: port code: 0]
 ] [
  res: check-data port
 ]
 if all [not res port/state/state = 'ready] [
  either all [
   ; automatic redir only for GET or HEAD
   find [get head] port/spec/method
   in h 'Location
  ] [
   res: do-redirect port h/location
  ] [
   port/state/error: make error! "Redirect requires manual intervention"
   res: port/awake make event! [type: 'error port: port]
  ]
 ]
)

4.1.6 Handle Client Error responses

Handle Client Error responses

( ; in all cases...
 either port/spec/method = 'head [
  port/state/state: 'ready
 ] [
  check-data port
 ]
)
"01" ( ; Unauthorized
 port/state/error: make error! "Authentication not supported yet"
 res: port/awake make event! [type: 'error port: port]
)
|
"07" ( ; Proxy auth required
 port/state/error: make error! "Authentication and proxies not supported yet"
 res: port/awake make event! [type: 'error port: port]
)
|
; Any other case
(
 port/state/error: make error! join "Server error: " line
 res: port/awake make event! [type: 'error port: port]
)

4.1.7 Handle Server Error responses

Handle Server Error responses

either port/spec/method = 'head [
 port/state/state: 'ready
] [
 check-data port
]
port/state/error: make error! join "Server error: " line
res: port/awake make event! [type: 'error port: port]

4.1.8 Send 'done and 'ready events

Send 'done and 'ready events

res: port/awake make event! [type: 'done port: port]
unless res [res: port/awake make event! [type: 'ready port: port]]

4.1.9 Values needed by check-response

The above function needs crlf2 and the http-response-headers prototype object.

Support functions +≡

crlf2: {^M ^M }
http-response-headers: context [
 Content-Length:
 Transfer-Encoding: none
]

4.2 do-redirect and check-data

do-redirect handles redirections.

Support functions +≡

do-redirect: func [port [port!] new-uri [url! string! file!]] [
 new-uri: construct/with decode-url new-uri port/scheme/spec
 if new-uri/scheme <> 'http [
  port/state/error: make error! {Redirect to a protocol different from HTTP not supported}
  return port/awake make event! [type: 'error port: port]
 ]
 either all [
  ; we should probably check that the IP matches,
  ; not just the host name, however that requires an extra lookup...
  new-uri/host = port/spec/host
  new-uri/port-id = port/spec/port-id
 ] [
  ; redirect to same host
  port/spec/path: new-uri/path
  do-request port
  false
 ] [
  port/state/error: make error! "Redirect to other host - requires custom handling"
  port/awake make event! [type: 'error port: port]
 ]
]

check-data checks that the response content has been transferred completely.

Support functions +≡

check-data: func [port /local h res data out chunk-size mk1 mk2 trailer] [
 h: port/state/info/headers
 res: false
 case [
  ; transfer encoding needs to have precedence over content length
  h/transfer-encoding = "chunked" [
   Handle chunked transfer encoding
  ]
  integer? h/content-length [
   port/data: port/state/connection/data
   either h/content-length <= length? port/data [
    port/state/state: 'ready
    port/state/connection/data: make binary! 32000
    res: port/awake make event! [type: 'custom port: port code: 0]
   ] [
    read port/state/connection
   ]
  ]
  true [
   ; it is assumed that the server will close the connection
   port/data: port/state/connection/data
   read port/state/connection
  ]
 ]
 res
]

4.2.1 Handle chunked transfer encoding

Handle chunked transfer encoding

data: port/state/connection/data
out: port/data: make binary! length? data
until [
 either parse/all data [
  copy chunk-size some hex-digits thru crlf mk1: to end
 ] [
  chunk-size: to integer! to issue! chunk-size
  either chunk-size = 0 [
   if parse/all mk1 [
    crlf (trailer: "") to end | copy trailer to crlf2 to end
   ] [
    trailer: construct trailer
    append h third trailer
    port/state/state: 'ready
    res: port/awake make event! [type: 'custom port: port code: 0]
    clear data
   ]
   true
  ] [
   either parse/all mk1 [
    chunk-size skip mk2: crlf to end
   ] [
    insert/part tail out mk1 mk2
    remove/part data as-binary skip mk2 2
    empty? data
   ] [
    true
   ]
  ]
 ] [
  true
 ]
]
unless port/state/state = 'ready [read port/state/connection]

Charset used by check-data:

Support functions +≡

hex-digits: charset "1234567890abcdefABCDEF"