More about Port layers - the OSI Model

This is a follow-up to my Pruning down READ and WRITE article and the comments posted there. Sorry, it is a bit long, but this topic requires a deeper discussion.

OSI Layer Model

Before I begin, please review the OSI layer model. This is the standard layered model of network architecture that has been around for about 30 years.

Ok, so what does it all mean?

When you design any type of "operating system" (and that can include a runtime system such as a TCP stack), you try to break it down into clean, well-isolated (decoupled) layers. At least, that is always the goal. However, it's not always possible to make it perfect, because the layers interact, and sometimes that interaction is complex or the need for performance requires closer coupling than is ideal. (In fact, that's why after 10 years of being a total object-oriented system advocate, I backed away from it. But I digress; that is the subject of a separate blog.)

Layers in Ports

REBOL ports are no exception. You can apply the OSI model to ports (even though we use them for more than just networking).

For example, if you open a port, and you need to provide an awake function for it, then you are creating code at Layer-5, the session level. Layers 1-4 below it are handled by the operating system, its device drivers, and hardware.

If you don't need to write an awake function, then you are using a higher level port. For example, if you write:

data: read http://www.rebol.com

the HTTP port scheme knows about things like headers, transfer lengths, and content encoding. In this example, HTTP is working at layers 6 and 7. Part of HTTP is to send headers which must be interpreted. In other words, they obtain a context and a meaning. The data is no longer just a string of bytes.

In REBOL, we can create and use ports at different levels.

The lowest level is that of reading raw bytes. When you open a REBOL TCP port and start reading and writing data, you are sending bytes. The meaning of those bytes is not defined by the port itself; it is defined by your application. TCP I/O is a lower level operation.

However, once you begin to interpret the meaning of those bytes, you move to a higher level, and there can be multiple levels.
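To make the split between bytes and meaning concrete, here is a minimal sketch in Python (used here only as a neutral illustration language, not REBOL). It assumes a local socket pair standing in for a TCP connection, and an invented `temp=21.5` message format; the helper name `recv_exactly` is hypothetical. The transport only moves bytes; the application code at the end is what gives them meaning.

```python
import socket

def recv_exactly(sock, count):
    """Read exactly `count` bytes; a single recv may return fewer."""
    chunks = []
    remaining = count
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed before all bytes arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

# A connected pair of local sockets stands in for a TCP connection.
left, right = socket.socketpair()
try:
    payload = "temp=21.5".encode("ascii")
    left.sendall(payload)                      # lower level: bytes out
    raw = recv_exactly(right, len(payload))    # lower level: bytes in
finally:
    left.close()
    right.close()

# Higher level: the application decides what the bytes mean.
key, _, value = raw.decode("ascii").partition("=")
reading = float(value)
```

Nothing in the socket calls knows that the payload is ASCII text or that it carries a number; that knowledge lives entirely in the last two lines.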

So, you want it decoded?

One of the first steps to "get the meaning" is to decode the bytes. To convert a stream of bytes into a string of characters, you need to know how the characters are encoded. Are they in UTF-8 or UTF-16, or maybe the data are in a compressed format?

You can then ask, "how do we know the encoding?" Well, there are two main methods:

It is embedded in the data. Something about the data tells us its encoding. For example, Unicode text can begin with a BOM (byte order mark) that tells us its encoding. Many image formats include a "magic" signature to tell us what they are, e.g. JPEG files commonly carry a "JFIF" identifier near the start. An HTTP header tells us information about the encoding of the data.
It is specified separately from the data. The encoding is specified to a function that handles the data. This is the /as option we talked about before. For this to work, we must already know the encoding in advance. It came from somewhere external to the data. For example, if we noticed that the filename suffix was .jpg, then we can use that information for decoding the format. This is what MIME types are all about.

In reality we often use a mix of these. For example, we may examine HTTP data to find the content-encoding, which we then specify to a decoding function.
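The two methods, and the mix of them, can be sketched roughly as follows. This is a Python illustration rather than REBOL, and the names `BOMS`, `detect_encoding`, and `decode_text` are hypothetical. It assumes one particular policy: an embedded BOM wins, and an externally declared encoding is used only when no BOM is present. Other policies are equally defensible.

```python
import codecs

# Order matters: the UTF-32 LE BOM starts with the same two bytes as the
# UTF-16 LE BOM, so the longer signatures are checked first.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def detect_encoding(data, declared=None):
    """Return (encoding, bom_length).

    Method 1: look for an encoding embedded in the data (a BOM).
    Method 2: fall back to the encoding declared by the caller.
    """
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return declared, 0

def decode_text(data, declared=None):
    """Decode bytes to a string using the embedded or declared encoding."""
    encoding, skip = detect_encoding(data, declared)
    if encoding is None:
        raise ValueError("encoding is neither embedded nor declared")
    return data[skip:].decode(encoding)
```

For example, `decode_text(body, declared="utf-8")` would use the declared value (say, taken from an HTTP header) unless the body itself starts with a BOM.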

So, you want content?

Decoding often means a lot more than just converting bytes to characters. An image file contains structures that give you information about the image, such as its width, height, colors, compression, and more. Even a text file may be in REBOL format that gets loaded into block format so we can interpret its content.

In the design of a system, we must ask ourselves where we want this content decoding to happen, and how "automatic" it should be.

An extreme case would be to ask: If you read bytes from a TCP port, does "the system" automatically determine the bytes are an image and return to you a decoded image datatype?

I think we can obviously say no to that. We want TCP as a stream of bytes. Any other decoding should be part of a separate layer.

This layer rule about bytes versus content applies to a lot more than just images. In R3 it also applies to text, because we now care about Unicode, and have become more text sensitive.

Building the Layers

Ok, so we are now at the fun part: What is the smartest way to handle these layers?

Well, REBOL already has a basic model. We know that the load function is supposed to take data and produce content. We load REBOL code and data, we load an image, and we load a sound. We also know that we read raw bytes from a network or a file.

That basic model provides the top and the bottom layers, but the question is, do we want to access the intermediate layers too? In the past, we would write:

insert port "some text data"

and expect:

probe copy port
"some text data"

But now, we support Unicode, so we have added an additional requirement. What is the encoding of the text? And, where does it get decoded?

In order to handle that, we invented the /as refinement to indicate the encoding. Using the R3 port read method, it was:

text: read/as tcp-port 'utf-8

However, applying this refinement at a lower level makes the lower level a lot more complicated! At first, it may not seem like it, but once you get into the details, you find some problems. For example, UTF-8 can use multiple bytes per character. What happens if we read from TCP, and we don't get all the bytes necessary to decode the last character? We must hold the partial character in a special buffer and insert it at the head of the next part of the data stream. The same goes for CRLF text line conversions, where a CR may arrive at the end of one read and its LF at the start of the next.
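The buffering problem can be shown with a small sketch, again in Python rather than REBOL. The class name `ChunkDecoder` and its `feed` and `finish` methods are hypothetical. It relies on Python's standard incremental decoder to hold back an incomplete multi-byte sequence, and keeps its own flag for a trailing CR that may turn out to be half of a CRLF pair.

```python
import codecs

class ChunkDecoder:
    """Decode text arriving in arbitrary byte chunks, converting CRLF to LF."""

    def __init__(self, encoding="utf-8"):
        # The incremental decoder buffers an incomplete multi-byte
        # sequence internally until the remaining bytes arrive.
        self._decoder = codecs.getincrementaldecoder(encoding)()
        self._pending_cr = False

    def feed(self, chunk):
        """Return the text that is complete after receiving `chunk`."""
        text = self._decoder.decode(chunk)
        if self._pending_cr:
            # Put the held-back CR at the head of the new data.
            text = "\r" + text
            self._pending_cr = False
        if text.endswith("\r"):
            # Might be the first half of a CRLF pair; wait for more data.
            text = text[:-1]
            self._pending_cr = True
        return text.replace("\r\n", "\n")

    def finish(self):
        """Flush at end of stream; raises if a character was left incomplete."""
        text = self._decoder.decode(b"", final=True)
        if self._pending_cr:
            text = "\r" + text
            self._pending_cr = False
        return text
```

Feeding the bytes of "é" followed by CRLF one byte at a time yields an empty string for the first byte of "é", then "é", then an empty string for the CR, then a single "\n". That state (the partial character and the pending CR) is exactly what a raw TCP layer would have to carry if decoding were pushed down into it.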

In the past, the way we solved this problem was to use multi-level ports. For example, when you use the HTTP scheme and read from its port, it is not the actual TCP port, it is a virtual port. Within the HTTP scheme code itself is a hidden TCP port.

So, the HTTP port is doing whatever "magic decoding" it needs to do in order to create the result we need.

Using this approach, we can say that maybe we can allow read of an HTTP port to return properly decoded text. In such a case, read is returning a string datatype not a binary datatype. That may be fine.

Of course, we want to determine how far to go with that model. For example, if I read an image using HTTP, do we get an image returned from read? I'd say no. In that case it may return a byte stream, because we may not want the image fully decoded at this point. For example, if it is JPEG, we might want to write it as a JPG file.

So, it is conceivable that we can create a TCPS scheme, for TCP String, that provides a layer on top of the lower level TCP to deal with the necessary string encoding. The TCPS scheme can be implemented in REBOL itself, allowing it to be extended and improved without requiring native code changes. It would then be possible to write:

>> port: open tcps://example.com:8080
>> write port "a string"
>> print read port
   "got it"

And, because open now defines a concept of specification, it is possible to even provide information about the type of encoding we want to use. An example would be:

port: open [
       scheme: 'tcps
       host: "example.com"
       port-id: 8080
       encoding: 'utf-8
]

Of course, the TCPS scheme would be implemented with TCP, with that port being embedded within it. And, that's the main point: The lower level TCP layer does not need to deal with encoding and decoding. It just cares about bytes.
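The shape of such a layered scheme might look like the following sketch. It is written in Python, not REBOL, and is not the TCPS design itself: `LoopbackBytePort` is a hypothetical in-memory stand-in for a TCP port that deliberately returns data in tiny chunks, and `TextPort` is a hypothetical string layer that embeds it.

```python
import codecs

class LoopbackBytePort:
    """Stand-in for a TCP port: bytes in, bytes out, in small chunks."""

    def __init__(self, chunk_size=2):
        self._buffer = bytearray()
        self._chunk_size = chunk_size

    def write(self, data):
        if not isinstance(data, (bytes, bytearray)):
            raise TypeError("byte port accepts bytes only")
        self._buffer += data

    def read(self):
        """Return the next chunk, or b'' when nothing is waiting."""
        chunk = bytes(self._buffer[:self._chunk_size])
        del self._buffer[:self._chunk_size]
        return chunk

class TextPort:
    """String layer on top of a byte port: strings above, bytes below."""

    def __init__(self, inner, encoding="utf-8"):
        self._inner = inner            # the embedded lower-level port
        self._encoding = encoding
        self._decoder = codecs.getincrementaldecoder(encoding)()

    def write(self, text):
        self._inner.write(text.encode(self._encoding))

    def read(self):
        """Drain the inner port and return the text decoded so far."""
        parts = []
        while True:
            chunk = self._inner.read()
            if not chunk:
                break
            parts.append(self._decoder.decode(chunk))
        return "".join(parts)
```

A round trip such as `port = TextPort(LoopbackBytePort(chunk_size=1)); port.write("got it"); port.read()` returns the original string even though the inner port splits characters across reads. All encoding state lives in the upper layer, and the byte port never sees a string.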

Back to the goal...

So, is this a good approach?

It depends on what we want, doesn't it? In REBOL, we have this objective:

Simple things should be simple.
Complex things should be possible (and as simple as possible).

So, we want to be able to simply get text from a file, or data from a REBOL script, or an image from an image file. We want to write code like:

data: get-the-contents-of %file
data: get-the-contents-of web-url

Earlier, people objected to the idea that we might provide helper functions such as:

data: read-text %afile

because it would mean enumerating that for all datatypes, such as read-image, etc. I agree.

But, if we want to avoid that, we need to say that we are going to use a much smarter function, one that can properly identify the content and decode it.

I do not think of that as the main purpose of read. To me, that seems more like what load does. And, we are allowed to make load as smart as we want. For example, load may use a system table that contains suffixes and take a MIME-style approach to picking the decoder.

The result may be something like this:

image: load %photo.jpg
code: load %script.r
text: load %info.txt

where .jpg, .r, and even .txt are defined within a system/load-types table of some kind. In the case of .txt, the default would examine the BOM for UTF encoding. I think we can do a smart design job of keeping it simple.
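A suffix table of this kind could be sketched as below. This is Python for illustration only and does not describe an actual system/load-types implementation: the table contents, the MIME strings, and the loader functions are all assumptions, and the image and script loaders are placeholders that do no real decoding.

```python
import codecs
import os

def load_text(data):
    """Decode text, examining the BOM to choose between UTF-16 and UTF-8."""
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return data.decode("utf-16")     # this codec consumes the BOM
    return data.decode("utf-8-sig")      # strips a UTF-8 BOM if present

def load_image(data):
    # Placeholder: a real loader would decode the image structure.
    return {"kind": "image", "size": len(data)}

def load_script(data):
    # Placeholder: a real loader would parse the source into values.
    return load_text(data).split()

# Hypothetical table: suffix -> (MIME-style type, loader function).
LOAD_TYPES = {
    ".txt": ("text/plain", load_text),
    ".jpg": ("image/jpeg", load_image),
    ".r": ("text/x-rebol", load_script),
}

def load(name, data):
    """Pick a loader from the suffix of `name`; unknown suffixes stay raw."""
    suffix = os.path.splitext(name)[1].lower()
    if suffix not in LOAD_TYPES:
        return data
    _mime, loader = LOAD_TYPES[suffix]
    return loader(data)
```

The caller writes the same one-line call for every file type, and adding a new type means adding one table entry rather than a new read-xxx function.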

For things like network transfers, when we write more complex code such as handling our own transfers with our own port awake functions, I do not think it is a big problem to deal with a few more details, such as encoding.

And, of course, as we find patterns to more complex usage, we can create new schemes or add options to existing schemes, to help simple things remain simple.
