Comments on: Leaky Functions

By default, all REBOL user-defined functions return the last value of their evaluation. For example, you can write:

add3: func [a b c] [a + b + c]

This is a shortcut for:

add3: func [a b c] [return a + b + c]

(And not using the return function is also faster.)

However, REBOL 3.0 will be adding modules, and modules have the ability to hide their words and internal data. This may be done for security reasons, but also for better programming abstraction enforcement. For example, you may use a module that processes passwords (that you don't want scripts to be able to see or access).

The problem is, if a programmer is not careful, the standard function return above can create leaky functions that may accidentally return important internal data.

Here's a simple illustration from part of a module:

passwords: []  ; private list of passwords

add-password: func [new-pass] [
    append passwords new-pass
]

The add-password function is exported, but the password list is not. Unfortunately, the programmer was not careful, and the add-password function ends up returning the entire password list as a result! Not good.
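
To make the leak concrete, here is a minimal sketch of what a caller would see. It assumes REBOL 2 behavior, where append returns the series it was given, and the word leaked is a hypothetical name used only for illustration:

```rebol
REBOL [Title: "Leaky function sketch"]

passwords: []  ; private list of passwords (hidden inside the module)

add-password: func [new-pass] [
    append passwords new-pass   ; last value evaluated, so it is returned
]

; Hypothetical caller outside the module. LEAKED is an illustrative name.
leaked: add-password "secret-1"

probe leaked                    ; shows the whole password block
probe same? leaked passwords    ; checks whether the caller holds the private series itself
```

If the assumption holds, the caller receives a reference to the private block, not a copy, so it could also modify the list.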

A simple solution is:

add-password: func [new-pass] [
    append passwords new-pass
    none  ; or, you could use exit too
]

but, some programmers will not remember (or know) to do that.

So, we have to ask, what's a good solution to this problem? Do we want to enforce a specific usage pattern (e.g. function blocks must always use a return or exit) or do we want to develop some kind of optional attribute (perhaps within the module specification) to help avoid this situation?

Of course, we cannot change the default function return case (as shown in the first example) because it would break far too much code. But I think we do have other options, and I'm sure there are some I've not thought of yet. Let's hear some of your ideas.

22 Comments

Comments:

Brian Hawley
9-May-2006 1:58 Are you thinking of a set of module options that can change the basic semantics of REBOL in subtle ways, like "Option Explicit" from Visual Basic?
Tomc
9-May-2006 2:54 let them leak. if a module writer wants fine control of what is returned that capability is there now.
the thing is, you create some new syntactic sugar that says this "function needs an explicit return" in which case the newbie/forgetful one has to know/understand/remember the new functions in addition to the normal functions. Is that effort really less than learning how to use existing functions?
isn't it simpler for the newbie/forgetful one to learn/remember ...
the last thing you have is what you are left with.
Tomc
9-May-2006 2:59 let them leak. if a module writer wants fine control of what is returned the capability is there now.
the thing is, you create some new syntactic sugar that says this "function needs an explicit return" in which case the newbie/forgetful one has to know/understand/remember the new function in addition to the normal function. Is that effort really less than learning how to use the existing function?
Isn't it simpler for the newbie/forgetful one to learn/remember that the last thing you have is what you pass on?
Marco
9-May-2006 2:59 :) Be careful: using return or not is currently not the same. If you use return, you can return an error!:
>> x: func [/local][return try [1 / 0]]
>> error? x
== true
>> y: func [/local][try [1 / 0]]
>> error? y
** Math Error: Attempt to divide by zero
** Where: y
** Near: 1 / 0
Steve Shireman
9-May-2006 2:59 :) Functions could have a refinement added as part of their datatype called /proprietary or /secure which otherwise defaults to leaky, but in secure code can be turned on. Perhaps this refinement could be set/unset in the environment, allowing the testing of existing code. This could provide a lint-security functionality, meaning that a programmer could test a script for leakiness and, when making it secure, identify the areas which need fixing.
Perhaps the default should be secure, but it can be set to not-secure to allow old scripts to work, so that security is the default.
Brett
9-May-2006 8:41 :) Not sure what modules will look like, but if modules distinguished between functions (or words?) that are visible/callable outside the module and those that are not - that would probably be enough of a "heads up" to the developer.
Brett
9-May-2006 8:45 :) ... And if visibility was a module feature, I'd prefer it default to invisible.
Sunanda
9-May-2006 10:35 This is a security issue, so you really want the default security to be safe rather than leaky.
That way, leakiness from a module is an explicit decision made by the programmer rather than an exploitable and embarrassing oversight.
That means that functions in modules that have no explicit RETURN need by default to return a conventional value rather than the last value they generated.
That does not need any additional syntax at all. Simply document that a function in a module always returns (say) NONE if it has no explicit RETURN value.
Gregg Irwin
9-May-2006 12:36 I don't know if we want things behaving differently in modules, at least I think I don't. Hard to say, since we don't know yet how we'll compose and generate modules, etc. For example, if you have a func that you "import" or compose into a module, does its behavior change?
On the one hand, there are bound to be some new things we'll have to do for PITL, but we don't want to lose the ease we have of PITS in REBOL today, and we don't want a visible line you cross where things change, forcing you to think in two modes.
A safe-func approach, as Steve alludes to, might work, or maybe we have tools that analyze "code" and warn you about potential issues (e.g. funcs in modules, that don't use RETURN).
REBOL, today, lets you get in all kinds of trouble--which is good and bad. Is the goal to make REBOL "safer", and what do we give up for that safety? If safety is a goal, and modules are the domain, what should the dialect look like?
Bertrand
9-May-2006 13:48 :) Pardon me in advance for my naïve question (I'm a newbie in programming!):
Is it possible to use a "silent" function as in newLISP ?
P.S.: Extract from the definition of "silent" : SILENT Evaluates one or more expressions... (similar to begin, but) suppresses console output of the return value and the following prompt. silent is often used when communicating from a remote application with newLISP, i.e.: GUI front-ends or other applications controlling newLISP, and the return value is not of interest.
Gregg Irwin
9-May-2006 17:37 I thought of something similar, e.g. an IGNORE function. I don't know how it fits into the grand scheme of things though.
Brett
10-May-2006 0:39 :) Strike my previous comments. Re-reading the module blog entry and this front line entry leaves me wondering if limiting access to module words is enough. For super security, should values (like block!, string! and issue!) be tied to a module too? If a non-exported value was about to be leaked, it could be converted to an error. Modules then are not just a namespace, but a memory space.
Anton Rolls
10-May-2006 2:20 I'm pretty much against it. It just seems like extra complication. The programmer should know what they're doing. On the other hand, it may not be your code that you wish to secure. Perhaps you have 50 functions written by somebody else. Still, the solution is just to check that NONE is at the end of each function body. That can be done easily by rebol, perhaps in a preprocessor.
Volker
10-May-2006 2:48 Calls across module-borders are different. I would focus on that. When a function is called across border, do additional checks.
Have something in the function-spec, func[return: block! ..] and enforce it.
Goldevil
10-May-2006 8:24 Just for compatibility reasons, I propose that the default behaviour must be the same as before, even in modules. Otherwise too much code has to be rewritten if we want to use it in modules.
I also think that it could be handy to allow the language behaviour to be changed during execution, the same way I start and stop tracing. I propose that when developing a module, the developer can choose to enforce security, but that it is never an obligation. I like that the default way is identical everywhere. I don't like the "visible line" as described by Gregg.
Personally, I use return much more than necessary because, when I read my code, I directly understand that a function returns a value that is intended to be used. Maybe, like the /local refinement, a /return refinement could be specified (with a block of expected datatypes) in a function. When exiting, if return is omitted or if it returns a value with the wrong datatype, the REBOL interpreter throws an error.
About code analysing.
Detection of functions in modules that don't use RETURN is not easy. A function can RETURN something in some cases but not always. The real behaviour is checked only if the interpreter detects the usage of return during execution.
myfunc: func [ a b] [ if a < b [return TRUE] ]
Yes, myfunc uses RETURN. But a code analyser can never know if it will really be executed.
More generally, about compatibility issues between REBOL 2 and a REBOL 3 with this kind of new behaviour, I hope that an easy way is always possible. I hope I won't need to rewrite some simple scripts.
I saw a proposal in the form of a script "do %compatibility.r". Why not ? Some instructions built in Rebol can be simply specified there :
64bit-int-mode FALSE
func-return-check FALSE
jit-binging TRUE
ordinal-tail_error TRUE
allow-redefine-aliases FALSE
OK, but I suppose that's not so simple. If we change these kinds of options, mezzanine functions (built-in) can work differently or even throw errors. Then the %compatibility.r script would certainly also have to contain a lot of rewritten mezzanine functions.
Changing standard behaviour such as "Leaky Functions" could imply a lot of compatibility work. We must think carefully before changing these standard REBOL behaviours.
It's not easy to imagine a more fantastic REBOL without throwing away thousands of lines of code... But that's very interesting :)
Jaime
10-May-2006 11:50 This seems as simple as introducing a new function construct such as SILENT or PROC. This way compatibility is retained, and for procedures where you are interested only in the side effect, you use the new word.
f1: func [a b][probe a + b]
f2: proc [a b][probe a + b]
a: f1 1 2 ;== 3
b: f2 1 2 ;== none
Both print the value 3, but one returns it and the other does not. This seems simple, and similar to what Pascal uses.
Defining PROC seems simple enough. It can be defined as a mezzanine, just like DOES, HAS, etc.
A more concerning thing is that the body of a func is very easy to access, and therefore so are its values, bindings and context.
Should modules restrict this type of access?
trans
10-May-2006 12:59 :) +1 for proc -- that just seems the most obvious solution.
Allen Kamp
10-May-2006 17:33 :) Can always add something like "sub" from BASIC.
sub: func [
    "Defines a user function with given spec and body, with no automatic return value."
    [catch]
    spec [block!] {Help string (opt) followed by arg words (and opt type and string)}
    body [block!] "The body block of the function"
][
    throw-on-error [make function! spec append body 'exit]
]
passwords: []
add-password: sub [new-pass] [ append passwords new-pass ]
Volker
10-May-2006 19:08 'proc : the problem: it's about accidents. If I am aware of the implications, I use 'exit, 'none, something. But often I don't bother about the return value, because I don't use the result anyway. And then the function tries to be smart and returns something, and if my last statement deals with passwords, bad luck..
That is something where the language must trap programmer-errors. Not inside a module, but across boundaries. Such leaks can sink the ship.
maximo
11-May-2006 14:10 everyone seems to be scared about compatibility.
modules are new... nothing to fix, and if we had unsafe code before (cause we couldn't easily secure it), then cool if one word in the module spec allows me to bar external probing and return value errors.
I'm with Volker on this one. internal module calls, must not deal with this, but if called from outside, it must be possible to trap it and deny it globally on the module.
although tempting, I'm against adding proc!, cause that will just add bloat to the core language which is already starting to be quite expansive.
Oldes
11-May-2006 14:10 I'm not scared about compatibility. I don't require compatibility. The code I have is working and I have no reason to change it (and run it from R3).
I'm looking forward to seeing new things in Rebol, and if I need some old code to be evaluated in R3, I don't see any problem with modifying it if there is some compatibility issue.
jeff
25-Aug-2006 14:59:20 how about a way to declare some module variables as private...? Then rebol can check that before returning the value.
Hope that isn't stupid; I'm a newbie.
