REBOL 3.0 Front Line
Recent Articles:
6-May-2008 - Simple, Clean VID Requirements [0132]
Over the years, I've written a lot about VID, the REBOL Visual Interface Dialect. VID is important because it is how we create GUIs in REBOL.
We are now ready to complete VID3 for R3.
We want to be sure we get it right. How do we measure "right"? Well, we each have different requirements, but in general, we should all be happy with it. And, I need to be happy with it too, which is critical.
My VID requirements can be boiled down to just a few lines, not even one page. Of course, we want all the normal features like speed and reliability, but here are the requirements I think are essential:
| Requirement | Details |
|---|---|
| Easy to build a "page" | We know how HTML pages are built. Simple ones are fairly easy. I want VID to be even easier, and it can be. It is not difficult, and only in this way can we get a lot of people, even non-programmers, to build VID pages. |
| All the basic GUI elements | We need all the basic GUI elements as a standard part of VID. What elements do I mean? Well, we need at least the HTML basic widgets/gadgets. I want to be able to create a selection text list or a table with just a few words, and with nothing complicated about it. (Of course, we will add a lot more than just the basic HTML widgets, since most apps need more than that.) |
| Smart defaults and options | We want VID to make smart choices for the common patterns. We know what these patterns are. VID2 taught us that. So has the web. If we are forced to specify the same argument or attribute multiple times, then we have failed this requirement. I would also include the need for smart options. If I want to limit a text field to just numeric entry, it should only require a special option word, not code. We can build a lot of GUIs quickly this way. |
| Not difficult to extend | It should not be difficult to modify VID or to add new styles, new looks, and new behaviors. Any user should be able to change the look of something as easily as (and maybe more easily than) they would change a CSS style sheet in HTML. If an "average scripter" cannot understand how to do this, then we have failed in our design. |
| Easy application interface | The GUI is just the front end of an application. It must be attached to application code that does something useful (what is often called the "model"). This "attachment" needs to be fairly simple and clear in VID. There are different methods depending on the type of application. In a quick script, I may want to place code in the GUI itself. For larger apps, I may want to keep the GUI well-isolated from any app code. |
So those are the requirements that we will use to evaluate our VID3 design. If I've missed any critical ones (we know there are many less critical ones), please let me know, and if I agree, I will add them to the above list.
Detailed discussion of this topic is in the R3-Alpha AltME world. Or, post your comments here.
25-Apr-2008 - Port documentation updated [0131]
Major revisions to the documentation for Ports in R3 have been made and published to REBOL 3.0 Ports on the wiki site. This includes numerous working REBOL 3 Port examples of how to do things like copy and transfer (over TCP) large files. Of course, you will need the latest alpha test version of R3 in order to run those.
21-Apr-2008 - More about Port layers - the OSI Model [0130]
This is a follow-up to my Pruning down READ and WRITE article and the comments posted there. Sorry, it is a bit long, but this topic requires a deeper discussion.
OSI Layer Model
Before I begin, please review the OSI layer model. This is the standard layered model of network architecture that has been around for about 30 years.
Ok, so what does it all mean?
When you design any type of "operating system" (and that can include a runtime system such as a TCP stack), you try to break it down into clean, well-isolated (decoupled) layers. At least, that is always the goal. However, it's not always possible to make it perfect, because the layers interact, and sometimes that interaction is complex or the need for performance requires closer coupling than is ideal. (In fact, that's why after 10 years of being a total object-oriented system advocate, I backed away from it. But I digress; that is the subject of a separate blog.)
Layers in Ports
REBOL ports are no exception. You can apply the OSI model to ports (even though we use them for more than just networking).
For example, if you open a port, and you need to provide an awake function for it, then you are creating code at Layer-5, the session level. Layers 1-4 below it are handled by the operating system, its device drivers, and hardware.
If you don't need to write an awake function, then you are using a higher level port. For example, if you write:
data: read http://www.rebol.com
the HTTP port scheme knows about things like headers, transfer lengths, and content encoding. In this example, HTTP is working at layers 6 and 7. Part of HTTP is to send headers which must be interpreted. In other words, they obtain a context and a meaning. The data is no longer just a string of bytes.
In REBOL, we can create and use ports at different levels.
The lowest level is that of reading raw bytes. When you open a REBOL TCP port and start reading and writing data, you are sending bytes. The meaning of those bytes is not defined by the port itself; it is defined by your application. TCP I/O is a lower level operation.
However, once you begin to interpret the meaning of those bytes, you move to a higher level, and there can be multiple levels.
So, you want it decoded?
One of the first steps to "get the meaning" is to decode the bytes.
To convert a stream of bytes into a string of characters, you need to know how the characters are encoded. Are they in UTF-8 or UTF-16, or maybe the data are in a compressed format?
You can then ask, "how do we know the encoding?" Well, there are two main methods:
- It is embedded in the data. Something about the data tells us its encoding. For example, Unicode can use a BOM to tell us its encoding. Many image formats include a "magic" signature to tell us what they are, e.g. JPEG uses "JFIF". An HTTP header tells us information about the encoding of the data.
- It is specified separately from the data. The encoding is specified to a function that handles the data. This is the /as option we talked about before. For this to work, we must already know the encoding in advance. It came from somewhere external to the data. For example, if we noticed that the filename suffix was .jpg, then we can use that information for decoding the format. This is what MIME types are all about.
In reality we often use a mix of these. For example, we may examine HTTP data to find the content-encoding, which we then specify to a decoding function.
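The two methods, and the way they mix, can be sketched outside REBOL. The following is a minimal illustration in Python, not R3 code: an embedded BOM is checked first, and only if none is found does the function fall back to an encoding declared elsewhere (for instance, taken from an HTTP header or a filename suffix). The names sniff_bom and decode_text are made up for this example, and letting the BOM win over the declared encoding is an assumption, not a rule stated above.

```python
import codecs

# Known BOMs. UTF-32 LE must be tested before UTF-16 LE,
# because its BOM starts with the same two bytes.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data):
    """Method 1: the encoding is embedded in the data.

    Returns (encoding, bom_length), or (None, 0) if no BOM is present.
    """
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def decode_text(data, declared=None, default="utf-8"):
    """Mix of both methods: BOM first, then the externally declared
    encoding (method 2), then a default."""
    encoding, skip = sniff_bom(data)
    if encoding is None:
        encoding = declared or default
    return data[skip:].decode(encoding)
```

The point of the split is that sniff_bom only looks at the data, while the declared argument carries knowledge that came from somewhere else.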
So, you want content?
Decoding often means a lot more than just converting bytes to characters. An image file contains structures that give you information about the image, such as its width, height, colors, compression, and more. Even a text file may be in REBOL format that gets loaded into block format so we can interpret its content.
In the design of a system, we must ask ourselves where do we want this content decoding to happen, and how "automatic" is it?
An extreme case would be to ask: If you read bytes from a TCP port, does "the system" automatically determine the bytes are an image and return to you a decoded image datatype?
I think we can obviously say no to that. We want TCP as a stream of bytes. Any other decoding should be part of a separate layer.
This layer rule about bytes versus content applies to a lot more than just images. In R3 it also applies to text, because we now care about Unicode, and have become more text sensitive.
Building the Layers
Ok, so we are now at the fun part: What is the smartest way to handle these layers?
Well, REBOL already has a basic model. We know that the load function is supposed to take data and produce content. We load REBOL code and data, we load an image, and we load a sound. We also know that we read raw bytes from a network or a file.
That basic model provides the top and the bottom layers, but the question is, do we want to access the intermediate layers too?
In the past, we would write:
insert port "some text data"
and expect:
probe copy port
"some text data"
But now, we support Unicode, so we have added an additional requirement. What is the encoding of the text? And, where does it get decoded?
In order to handle that, we invented the /as refinement to indicate the encoding. Using the R3 port read method, it was:
text: read/as tcp-port 'utf-8
However, applying this refinement at a lower level makes the lower level a lot more complicated! At first, it may not seem like it, but once you get into the details, you find some problems. For example, UTF-8 uses multiple bytes for some characters. What happens if we read from TCP, and we don't get all the bytes necessary to decode the last character? We must hold the incomplete bytes in a special buffer and insert them at the head of the next part of the data stream. The same goes for CRLF text line conversions.
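The held-byte problem is easy to demonstrate. The following is a minimal sketch in Python (not R3 code, and not how the R3 port layer is implemented) using the standard library's incremental decoder, which keeps an incomplete trailing sequence in its own buffer until the next chunk arrives. The function name decode_chunks and the sample data are made up for illustration.

```python
import codecs


def decode_chunks(chunks, encoding="utf-8"):
    """Decode a sequence of byte chunks as they arrive.

    An incomplete multi-byte character at the end of one chunk is held
    back by the decoder and completed by the start of the next chunk.
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    pieces = [decoder.decode(chunk) for chunk in chunks]
    # final=True raises an error if the stream ended mid-character.
    pieces.append(decoder.decode(b"", final=True))
    return pieces


data = "naïve".encode("utf-8")      # the "ï" takes two bytes: C3 AF
chunks = [data[:3], data[3:]]       # the split falls between C3 and AF
pieces = decode_chunks(chunks)      # ['na', 'ïve', '']
```

The first read yields only "na"; the lone C3 byte waits in the decoder until AF arrives. This is exactly the state that a byte-level TCP layer would have to carry if it were asked to decode text itself.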
In the past, the way we solved this problem was to use multi-level ports. For example, when you use the HTTP scheme and read from its port, it is not the actual TCP port, it is a virtual port. Within the HTTP scheme code itself is a hidden TCP port.
So, the HTTP port is doing whatever "magic decoding" it needs to do in order to create the result we need.
Using this approach, we can say that maybe we can allow read of an HTTP port to return properly decoded text. In such a case, read is returning a string datatype, not a binary datatype. That may be fine.
Of course, we want to determine how far to go with that model. For example, if I read an image using HTTP, do we get an image returned from read? I'd say no. In that case it may return a byte stream, because we may not want the image fully decoded at this point. For example, if it is JPEG, we might want to write it as a JPG file.
So, it is conceivable that we can create a TCPS scheme, for TCP String, that provides a layer on top of the lower level TCP to deal with the necessary string encoding. The TCPS scheme can be implemented in REBOL itself, allowing it to be extended and improved without requiring native code changes. It would then be possible to write:
>> port: open tcps://example.com:8080
>> write port "a string"
>> print read port
"got it"
And, because open now defines a concept of specification, it is possible to even provide information about the type of encoding we want to use. An example would be:
port: open [
    scheme: 'tcps
    host: "example.com"
    port-id: 8080
    encoding: 'utf8
]
Of course, the TCPS scheme would be implemented with TCP, with that port being embedded within it. And, that's the main point: The lower level TCP layer does not need to deal with encoding and decoding. It just cares about bytes.
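The shape of that layering can be sketched in Python. This is not the TCPS scheme itself and not REBOL code; the class name TextPort and its methods are hypothetical. It only shows a string layer that embeds a byte-level socket and keeps all encoding state to itself.

```python
import codecs
import socket


class TextPort:
    """Hypothetical string layer wrapped around a byte-level socket.

    The embedded socket only ever sees bytes; all encoding and decoding
    state lives in this wrapper.
    """

    def __init__(self, sock, encoding="utf-8"):
        self._sock = sock
        self._encoding = encoding
        # The incremental decoder holds partial characters between reads.
        self._decoder = codecs.getincrementaldecoder(encoding)()

    def write(self, text):
        """Encode a string and send it through the lower layer."""
        self._sock.sendall(text.encode(self._encoding))

    def read(self, max_bytes=4096):
        """Receive bytes from the lower layer and return decoded text.

        May return an empty string if only part of a character arrived.
        """
        return self._decoder.decode(self._sock.recv(max_bytes))

    def close(self):
        self._sock.close()
```

A caller would hand it an already connected socket, for example `TextPort(socket.create_connection(("example.com", 8080)))`, and from then on deal only in strings.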
Back to the goal...
So, is this a good approach?
It depends on what we want, doesn't it? In REBOL, we have this objective:
- Simple things should be simple.
- Complex things should be possible (and as simple as possible).
So, we want to be able to simply get text from a file, or data from a REBOL script, or an image from an image file. We want to write code like:
data: get-the-contents-of %file
data: get-the-contents-of web-url
Earlier, people objected to the idea that we might provide helper functions such as:
data: read-text %afile
because it would mean enumerating that for all datatypes, such as read-image, etc. I agree.
But, if we want to avoid that, we need to say that we are going to use a much smarter function, one that can properly identify the content and decode it.
I do not think of that as the main purpose of read. To me, that seems more like what load does. And, we are allowed to make load as smart as we want. For example, load may use a system table that contains suffixes and take a MIME-style approach.
The result may be something like this:
image: load %photo.jpg
code: load %script.r
text: load %info.txt
where .jpg, .r, and even .txt are defined within a system/load-types table of some kind. In the case of .txt, the default would examine the BOM for UTF encoding. I think we can do a smart design job of keeping it simple.
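A suffix-keyed table of this kind could look roughly like the sketch below. It is written in Python rather than REBOL, since system/load-types is only a proposal here. The names LOAD_TYPES, load_text, load_jpeg and load are hypothetical, and load_jpeg is a stand-in that only checks the JPEG start-of-image marker rather than decoding anything.

```python
import os


def load_text(data):
    # "utf-8-sig" strips a UTF-8 BOM if one is present. A fuller
    # version would also examine UTF-16 and UTF-32 BOMs.
    return data.decode("utf-8-sig")


def load_jpeg(data):
    # Stand-in for a real image decoder: it only verifies the JPEG
    # start-of-image marker (FF D8) and reports what it found.
    if not data.startswith(b"\xff\xd8"):
        raise ValueError("not a JPEG stream")
    return {"type": "image", "format": "jpeg", "size": len(data)}


# Hypothetical table mapping filename suffixes to content loaders.
LOAD_TYPES = {
    ".txt": load_text,
    ".jpg": load_jpeg,
}


def load(name, data):
    """Pick a loader from the filename suffix and apply it to raw bytes."""
    suffix = os.path.splitext(name)[1].lower()
    if suffix not in LOAD_TYPES:
        raise ValueError("no loader registered for %r" % suffix)
    return LOAD_TYPES[suffix](data)
```

Adding a new content type is then a one-line change to the table, which keeps the simple case simple without multiplying helper functions.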
For things like network transfers, when we write more complex code such as handling our own transfers with our own port awake functions, I do not think it is a big problem to deal with a few more details, such as encoding.
And, of course, as we find patterns to more complex usage, we can create new schemes or add options to existing schemes, to help simple things remain simple.