Comments on: Modules: another EXPORT method proposed
In R3, modules are easy to create. I've got one for generating HTML pages of all sorts, including web forms for various sites. The module is small, useful, easy to maintain... and before long it will be an integrated part of all the web site WIP wikis (once they move from R2 to R3.)
The HTML module is defined with the header:
Title: "HTML Formatting Functions"
Author: "Carl Sassenrath"
Exports: [reset-html emit emit-tag emit-link ...]
However, there is a practical matter to consider: I really don't want to maintain the Exports field directly. There are more than 30 words in it.
Some of you know that this issue has been on my mind for several years. In my original modules proposal (long ago) I made several suggestions. For example, the header could specify what not to export, then export everything not on that list.
Another solution, one to think about... is to add a keyword into the module making method. The word would be used like this:
export emit: funct ["Emit HTML...
The meaning of export is only important at module load time, where it is used to declare what names are exported. This method works well, because the module's loaded source block is no longer relevant to future evaluation. So, the export words are non-functional.
The main advantage is that the export declaration sits right next to the function it exports, making maintenance easier.
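To make the load-time idea concrete, here is a minimal sketch of how a loader could gather the export list from the top level of a module body. It is not the actual R3 loader code: collect-exports is a hypothetical helper, the sample body is made up, and the sketch assumes the R3 parse dialect (including its remove keyword). It only handles the "export name:" form.

```rebol
REBOL [Title: "Sketch: collecting EXPORT keywords at load time"]

; Hypothetical helper, not part of R3.
collect-exports: func [
    "Strip EXPORT keywords from a top-level block; return the exported words."
    code [block!]
    /local exports name
][
    exports: copy []
    parse code [
        any [
            ; EXPORT followed by a set-word: drop the keyword, record the word
            remove 'export set name set-word! (append exports to word! name)
            ; anything else is passed over whole, so nested blocks are not scanned
            | skip
        ]
    ]
    exports
]

; The body is only scanned here, never evaluated.
body: [
    export emit: func [text] [print text]
    helper: func [] []
    export reset-html: func [] []
]

print mold collect-exports body   ; the collected export list
print mold body                   ; the body with the EXPORT keywords removed
```

Because the scan is a single non-recursive pass over the top-level block, the collected list can be stored in the Exports field and the module body runs without ever seeing the keyword.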
An alternative that I should mention, because someone will no doubt mention it anyway, is:
#export emit: funct ["Emit HTML...
But I don't like that as much, because issues are not words; they are strings, and each one eats memory, whereas a word wouldn't require any extra memory. (But we need to finish that discussion soon... before alpha is over!)
This export method isn't a change we're going to be making this week, but we need to get your comments, because it will happen sooner or later!
Once you start writing more modules... you'll know what I mean.
I'd rather forgo the export concept completely and add /only to import instead.
it's much safer and the binding is done from the caller's pov, not the callee's pov.
I know what I need, not the module.
export becomes expose, such that the module only dictates what CAN be imported, not what is imported.
slim works like this and it's much cleaner, because the applications have complete control over word usage... not the modules.
plus, when you look at an application, you immediately see what func comes from what lib. slim has a dialect for importing words which even allows you to rename and/or prefix them on the fly, so that you can have two modules exposing all of the same words, being used without collision in the application.
this is based on 7 years of continual usage.
I've actually looked at adding /expose and /only support within modules... but with all the rest, haven't had the time to do it yet.
Max - is it really always true? If I am a module author, I know what you need. So I don't want to let you explicitly use variables/funcs you might think you need. But I have something like compressed modules in mind, where you want to hide stuff from the eyes of the programmer (think commercial modules).
Well, but then anyway - exported code is being a mezzanine. We would have to have something like Rugby's stubs for funcs, in order to protect module content from visibility ...
Erlang has a list of exported functions from module/file at the top too. The positive side of this at usage time is that you only need to look at the top of module and see a nice list of what it offers to you. You don't have to scroll down the source code seeking for exported words.
The other reason I like this is that export in front of funcname: makes it look different from other REBOL code (at least so far). It makes export look like a keyword. Although I suppose it can be just a function that takes the set-word and value, binds them and does whatever is needed for exporting.
I think that what Maxim says, that you could import selectively would be a very good feature which doesn't contradict having an export too.
For Carl's proposed method - does your example mean, that it constructs 'Exports header field during load time dynamically?
I think that for reflectivity, we still might need the list of exported stuff.
We could as well use kind of dialects, e.g. ALL, NOT (e.g. Apache uses Allow, Deny) combined with wildcards - emit* ... just an idea ...
Addition to my previous posts: I hope that modules in R3 don't just mean slightly better wrapped versions of the contexts we used in R2. I expect things like module compression, a checksum in the header, the ability to protect a module's internals (no path or other reflective access), etc. in the future. I know we can't get everything with the first release, but I hope such stuff is going to be possible.
Will there be any solution that will enable us to use short-and-sweet words in modules without the fear of overwriting each other's module words?
I wrote here a proposition that I was thinking about from a long time back. Would something like this be possible with new modules?
just an idea
Pekr, yes, collecting the exports list at load/import time is the only way this could work. When importing either into the system or as mixins, we need to have a simple list of exports. Reparsing is not an option - it's too slow. Plus, that list serves as documentation at runtime.
Encrypted modules would be better than compressed modules for commercial use, and Rebin as well if we can manage it. Explicit exports should be generated by the compression process so that you won't have stealth exports. Compressed, encrypted or rebin modules/scripts shouldn't process export clauses at runtime because of this.
- Module compression: On the todo list.
- Checksum in header: I'll look into it, but it is really unlikely that this would help since the point of the checksums is for them to be provided independently so they can serve as independent verification of the module. The checksum of a loaded module is available at runtime.
- We already have one protection infrastructure, we don't need another. If the protect and unprotect functions aren't enough, make them so - and don't forget to check the todo list for them in CureCode. We can make mezzanine/keyword wrappers for them if need be in specific cases.
- Import/export dialects: Deemed unnecessary so far, and badly so since adding them would add complexity to the programming process and to applications. Explicit import helper functions that wrap import may be added if there is a need for them, but at this point it is too soon to know the exact behavior that we might need to support.
Janko, export would be a keyword in the top level code of a module, at least at load/bind/make time. When the module code is actually running, it would need to be a local noop function with no arguments, unless we strip it from the source during the collection phase (this could be expensive, but might be required for safety).
The import model of R3's module system doesn't overwrite words by accident. It is designed to allow you or the module author to do so explicitly, but it doesn't happen implicitly.
Carl, I like the idea of cutting down on duplicate declarations, which export clauses would allow. How do you propose to collect the export list: parse, forfind? You can't use collect-words - it doesn't support this kind of thing. If parse I can start right away, since I have to go over that code again to add compressed modules.
I assume that only top-level export clauses would be supported, or else inner modules would break. It would help if you could also specify an explicit export list in the header, to which the export clause words would be added. The explicit export list helps with exporting constructed values from nested code, and with data locality.
An alternative to doing the "keyword becomes noop function" or "keyword gets stripped out" methods would be to set export to the set function after it's past the keyword phase, and then have it be used like this:
export 'emit funct ["Emit HTML...
This would allow you to export blocks of words at their point of initialization. The "keyword gets stripped out" method is probably the safest though.
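The "export becomes set" alternative can be pictured with a small sketch. It assumes the loader has already collected the export list, so that at run time export only has to perform the assignment. The word names are made up for illustration, and the code is meant to be run as plain top-level code in a build where export is not already handled as a loader keyword.

```rebol
REBOL [Title: "Sketch: EXPORT as an alias for SET"]

; Assumption: the export list was gathered earlier, so EXPORT
; now only needs to assign values.
export: :set

; One word at its point of initialization
export 'emit func [text] [print text]

; A block of words initialized together
export [page-width page-height] [800 600]

emit "hello"
print [page-width page-height]
```

The cost of this form is that the exported name is written as a lit-word rather than a set-word, so it no longer looks like an ordinary definition.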
Maxim, Slim is based on 7 years of continuous usage in single-threaded modular code. It is therefore not aimed at the target market of the R3 module system, which needs to be usable by people who haven't written modular code before in REBOL and shouldn't be required to know that they are using modular code when they are just writing scripts. Multitasking scripts at that.
This means no explicit import word list required - this was a deliberate choice. And importing means importing to the user context, or to the referring context for mixins - this is so that word overrides can be managed explicitly as needed, and so that multitasking safety can be managed.
Slim is an excellent module system for a very different platform and program model; we would have used it directly if it were appropriate to do so. Any major architectural differences between the model of the R3 module system and that of Slim were done on purpose, after months of deliberation and of thinking through the implications.
The export is definitely better than a list of functions to be exported. It would then be like Java's public keyword for methods. Perhaps the suggestion of Maxim to make an import/only refinement could be added in parallel.
So the module writer can define what may be used from outside and the module user can specify what she actually needs.
I like the idea of Maxim to import selectively functions from modules. Maybe extend it to a possibility to import not only from modules.
I like using EXPORT in front of funct much more than listing it in the header.
But EXPORT is (as Carl wrote) a one-shot load-time tag, not used after the module is loaded. Yet it becomes an integral part of the source code and changes the look of it.
How about hiding it a bit more like:
my-func: func/export ...
Could be used with does/export etc. as well.
Robert, we don't want to hide export, because exporting a word is really important: it affects the external effect of the module. That kind of thing needs to be visible.
And we really don't want to bury it in paths for some good reasons:
- That would add refinement processing overhead to every lower-level function building wrapper, every object builder, and so on.
- It wouldn't be supported for words that aren't assigned functions.
- It would interfere with functions that already have /export options.
- It would be prohibitively expensive. The module building code is mostly low-level mezzanine code that calls a few efficient native functions. If you use path element markers then you would have to recurse into every path! in the top-level code block to search for the word 'export, rather than a simple non-recursive scan that the keyword method would use. That would make module building so expensive that modules wouldn't be used.
If you want to make the export keywords less intrusive at runtime, simply remove them from the code at build time. The recent parse enhancements make that easy.
Reisacher, there was a long discussion in AltME yesterday about selective import. Go read it there - there's no sense in repeating it here. Suffice it to say that selective import is not supported in the Needs header for some very good reasons - module system model reasons, not implementation quirks. However, selective import is supported from the import function already, though it is tricky because of REBOL semantics. Wrapper functions can help with the tricky stuff.
Brian, there is just one thing that I want to correct because it gives a false impression of slim and R3 modules.
the main difference between slim and R3 is not capacity, flexibility, or scalability. It's in the management model.
One pushes code into the run-time (R3) and the other pulls it from the module (Slim).
slim's module model is not incompatible with multi-tasking, nor would it be harder to use under such circumstances.
R2 did not have threads, so slim obviously could not support them.
Slim follows python's module philosophy which gives control to the user of a module, not the creator.
On the other hand, the current R3 module concept is much closer to C's #define/#include model. This leads to very complex module loading order issues and then to module de-construction just so shared code can be accessible to two tools.
All in all, it means the modules have to be managed differently depending on what environment they are being used in.
Slim has none of these issues (and python almost none).
Now as I said, I do appreciate all the work that went into the module management and the fact that they now work as advertised and do fulfill their job. Now at least we have something to work with.
An addendum, Maxim:
R3's module system doesn't take any control out of the hands of the users of modules, none whatsoever. It just requires the user to apply the control differently and at a different part of the process. R3's module system focuses on management of the whole application, rather than on a module-by-module basis (what I referred to earlier as micromanagement). Either approach is valid, but whole application management is what we chose to focus on.
R3's module system is actually not at all like C's #include model (except for mixin modules). The dependency graphs are more like project files, or at worst makefiles.
Importing means something quite different in R3 than it does in Slim. You can do selective import with R3's module system, but importing into a module is not what the Needs header does (except for mixins). Instead, Needs imports into the shared user context (only shared within a task, btw). Needs is for requirements management and loading precedence, not for importing. Because of this, most modules and scripts won't have to specify a Needs header at all, or only a minimal one (just mixins in most cases). This was done to minimize the management overhead for the whole application, in order to make it easier for R3 to use.
Slim and the R3 module system are not in conflict - they are just focused on completely different parts of the application management process. When Maxim says something like "selective import", his definition of importing is something that doesn't exist in the R3 module system at all. Importing in R3 is a completely different concept (at least two, really) from anything in the Slim model. They're even complementary - you could integrate something like Slim with the R3 module system.
Still I think that selective import would be useful. Surely I don't want to use R3 modules, plus Slim. Hence the question is, if we can (e.g. later, once design settles), support something like that?
pekr, the answer is yes... I already plan on slim extensions to R3 module management.
Basically adding what Brian noted isn't yet defined for modules.
I just have more pressing issues to deal with... like showing a video of an R3 application using OpenGL. :-)
and I already know which app I will convert first.
modules and IoC containers
In the Java world there is an interesting discussion of why Tapestry IoC container did not follow Spring or Spring2 (not just the XML issues) ... and it got me wondering: I think of EXPORT as what is needed for dev guy to call an API, but architects want more developers to think in terms of "learn the framework" and "implement the method which the framework will call" (so-called Hollywood Principle). What will a module look like that expects the user to provide wordA and wordB and callback functions? Classic Sun javax precursor of this might be the Servlet interface as protocol-neutral.
RobertS, APIs of that nature are defined through functions which may be exported from a module. REBOL does dynamic typing, so many of the tricks that are required to make statically typed languages more dynamic simply aren't necessary in REBOL or most other dynamic languages.
What you call the Hollywood Principle is handled through a method called Duck Typing (as in: if it walks like a duck, it's a duck). You don't need formal interface types since object field lookup is checked at runtime. Required fields are simply documented, and then either assumed to be there (throwing an error if they aren't), screened with assert, or accessed with select, with default values or behavior used if they are missing.
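As a rough sketch of those three options (assume the field is there, screen with assert, or use select with a default), consider a framework function that accepts any object with the documented fields. The names run-request, on-request and site are hypothetical and exist only for this illustration.

```rebol
REBOL [Title: "Sketch: duck-typed callbacks"]

; Hypothetical framework entry point. HANDLER can be any object that
; happens to provide the documented fields; no interface type is declared.
run-request: func [
    handler [object!]
    request [string!]
    /local site
][
    ; Required field: screen for it early so a missing callback fails here
    assert [in handler 'on-request]

    ; Optional field: fall back to a default value when it is missing
    site: any [select handler 'site  "default-site"]

    ; Call the user-supplied callback
    handler/on-request rejoin [site "/" request]
]

; A user object that supplies only the required callback
my-handler: make object! [
    on-request: func [path [string!]] [rejoin ["handled: " path]]
]

print run-request my-handler "index.html"
```

An object without on-request would trigger an assertion error at the top of run-request rather than failing somewhere deeper in the framework.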
Another method used is to have the module contain factory functions and/or prototype objects if the data structures needed are more specific - this is what non-class-based OO languages do instead of classes.
It can be a little tricky for people who are coming from traditional, static, class-based OO languages to understand (at first): Most of the complicated management tricks that those languages need to build complex, dynamic systems are simply unnecessary with most modern dynamic languages. You don't need to jump through the hoops to work around the rules when those restrictions simply aren't there to begin with.
I've been meaning to look into the "inversion of control container" concept. Perhaps there is something there that would be of value to REBOL systems as well.
One way of saying this: Carl wishes to clarify Export and I am asking how to have a standard convention for EXPECTS, as in "expects" setwords x y z to be of type function declared as accepting [ devil-in-the-details-here-AKA-complexity-never-sleeps-only-creeps]
Not to say that I am not dense ;-)
Nor did I say (or even imply) that you were.
In dynamic languages (in this case I mean languages with runtime type checking rather than compile-time type checking), that kind of thing is usually done with documentation which the programmer is expected to read, sometimes supplemented with runtime validation code. There is no static validation unless you write it into a compiled dialect (the DO dialect is not compiled, nor can it be).
The overall error handling strategy that we adopted for R3 is that error! is the developer's best friend. Critical functions have been designed to validate the assumptions that they depend on early and throw errors if those assumptions aren't met, before they can cause any problems. And we're really picky about what errors get thrown - we want them to be as informative as possible (within reason). This way the errors can be used by the developer to help them fix their code.
Look into the assert function, which was added to support this error handling model. Also, look at the source of the load function, which uses assert liberally in this style. Most of the mezzanines have screening code which calls cause-error when the screening fails.
We haven't quite gone as far as Erlang in the "let it crash" model, but we're a lot closer than R2 was to that. The result is more bulletproof code for everyone.
I only try to report on the culture change that I am seeing in the corporate world: it is a shift from "check the API docs" to the much more demanding "understand what the framework expects you to implement" and that shift, I believe, requires much more in the way of "understand your language" and especially "understand the vulnerabilities" of that language. I remember first confidently looking at lib and, later, DLL exports. So my only thought is this: would a convention for exposing EXPECTS be useful? In Smalltalk we rejected interfaces but were wrong about so many things (class hierarchy sits on top of that pile ;-)
Functions aren't typed in REBOL, at runtime or load time, beyond their datatype - argument specs aren't part of the type of the function. This means that any EXPECTS that is exposed would by the nature of REBOL not be able to refer to some static type system. Instead it would need to be one of three things:
- Documentation only.
- Checked by external tools ahead of time, but not by REBOL itself.
- A runtime test.
Ignoring the first two because they are already possible now without changes to REBOL, let's focus on the last one. What you will find is that the assert/type function checks everything that you describe EXPECTS checking, except for the argument lists of functions assigned to variables. And functions check their own argument lists at call time. Any other invariants can be checked with assert.
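A short sketch of what such a runtime test might look like follows. The words on-open, retries and log-file are hypothetical stand-ins for values a module expects its user to provide, and the sketch assumes that assert/type accepts a block of datatypes as well as a single datatype for each word.

```rebol
REBOL [Title: "Sketch: EXPECTS-style checks with ASSERT"]

; Hypothetical values a framework module might expect from its user
on-open: func [port] [true]
retries: 3
log-file: none

; Word/type pairs: each word's value is checked against the type that follows
assert/type [
    on-open  function!
    retries  integer!
    log-file [file! none!]
]

; Any other invariant is an ordinary condition
assert [retries > 0]
```

The argument list of on-open is not checked here; as noted above, that check happens when the function is called.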
I agree that something like your EXPECTS proposal would be useful in R3. Which is why we already added it, though with a different name: assert.
My position is still that I want to be able to open a rebol script file in a text editor and read, using my eyes, what all the exports are, *without any damn programming involved*, no matter how easy it is.
what's wrong about marking the private words?
which case is more frequent?
a, export most of the words
b, hide most of the words
i would think it depends on the nature of the module:
a, short utility functions (DSL on top of default REBOL eval semantics)
b, more complex algorithm with a slim "API" (eg. a crypto lib)
what if we support both 'private or 'hide or 'internal and 'export BUT we can specify in the header which one is the default meaning.
Implemented and submitted to DevBase. The export keyword is scanned for and removed before the module code block is run. Exporting blocks of words is supported as well. See bug#1446 in CureCode for details.
onetom, mostly private words are more frequent in most modules.