Comments on: FIND/in - a pattern I use a lot

I often store records as blocks. The blocks follow a standard format. For example, if you look at the R3 chat message file, you will see such a format.

So, this common code pattern appears a lot. I often use it for finding a user in a user database:

foreach user users [  ; or similar FORALL loop
    if user/3 = name [return user]
]

You will recognize this as a linear search. I use it in cases where it's not worth building a hash (map!) on the target field, normally because it's rarely needed, or the database is small.
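As written, the loop above needs an enclosing function for RETURN to exit from. A minimal sketch of that wrapper follows; the name find-user, the record layout [id email name], and the sample data are all hypothetical, chosen only to make the pattern runnable.

```rebol
; Hypothetical sample data: each record is [id email name]
users: [
    [1 "ann@example.com" "Ann"]
    [2 "bob@example.com" "Bob"]
]

; find-user is an illustrative name, not an existing function
find-user: func [
    "Linear search: first record whose third field equals NAME, or none"
    users [block!] "Block of record blocks"
    name [string!] "Value to match against field 3"
][
    foreach user users [
        if user/3 = name [return user]
    ]
    none  ; no record matched
]

print mold find-user users "Bob"
```

Note that this returns the record itself, not its position in the database, so it cannot be handed directly to something like REMOVE.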

It's time to abstract this type of search. The key elements are:

Search block of blocks
A specific field of each
That equals a given value

So, it could be something like:

user: find/in users 3 name

And user would probably be a position in the users block (as FIND normally returns), allowing us to do things like:

remove find/in users 3 name
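One way to try the idea out before settling the native-versus-mezzanine question is a small mezzanine. The sketch below is only an assumption about how FIND/in might behave: find-in is a hypothetical name, it takes the arguments in the order used above (field, then value), and it assumes every record is a block with at least that many fields.

```rebol
; Hypothetical sample data: each record is [id email name]
users: [
    [1 "ann@example.com" "Ann"]
    [2 "bob@example.com" "Bob"]
    [3 "cy@example.com" "Cy"]
]

; find-in is an illustrative mezzanine, not the proposed FIND/in itself
find-in: func [
    "Sketch: position of the first record whose FIELD equals VALUE, or none"
    records [block!] "Block of record blocks"
    field [integer!] "Index of the field within each record"
    value "Value to match"
][
    forall records [
        ; records now refers to the current position in the series
        if value = pick first records field [return records]
    ]
    none  ; no record matched
]

pos: find-in users 3 "Bob"
print index? pos        ; position of the matching record
print mold first pos    ; the record at that position

remove find-in users 3 "Bob"   ; drop that record from the database
print length? users
```

Because the result is a series position rather than the record, it composes with REMOVE, CHANGE, and INSERT in the same way a plain FIND result does.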

The method could be extended to blocks of objects:

rec: find/in objects word value

The same as:

foreach obj objects [  ; or similar FORALL loop
    if obj/:word = value [return obj]
]
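The object form can be sketched the same way. Again this is a guess at the behavior, not a definition of it: find-in-objects, the field names, and the sample objects are hypothetical, and the function assumes every object in the block actually has the requested word.

```rebol
; Hypothetical sample data
people: reduce [
    make object! [name: "Ann" role: 'admin]
    make object! [name: "Bob" role: 'guest]
]

; find-in-objects is an illustrative name, not an existing function
find-in-objects: func [
    "Sketch: position of the first object whose WORD field equals VALUE, or none"
    objects [block!] "Block of objects"
    word [word!] "Field to compare"
    value "Value to match"
    /local obj
][
    forall objects [
        obj: first objects
        if obj/:word = value [return objects]
    ]
    none  ; no object matched
]

pos: find-in-objects people 'role 'guest
rec: first pos
print rec/name
```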

This is one of those cases where you begin to ask, is this more of a native or a mezzanine function?

Of course, another approach would be to add a function that is a bit similar to an SQL select, but it's also more work, and I'm not sure how much use it would get.

Let me know your thoughts.

7 Comments

Comments:

-pekr-
1-Apr-2009 14:00:34 Hello. Here's my feedback.
I am not sure find/in does not mess with 'find for very specific purpose, which I am not sure is related to a search?
What is currently close to the functionality described above is 'forskip, no? There you specify the kind of record type (rows - block) of data you want to skip.
Another not much used concept: we established the remove-each native, which is pretty fast and has a dialect. Some time ago I thought about its counterparts - change-each, extract-each, etc.
Brian Hawley
1-Apr-2009 14:29:13 If you do this as a FIND option, you might consider that you are getting the value and accessor arguments reversed. The value is what you are searching for (name), while the /in argument (the accessor spec) is an additional option. So the call would be:
user: find/in users name 3

The option name seems weird. Aren't you always finding something in the data? Perhaps /at might be better.
This seems related to the FORFIND and GATHER proposals. GATHER would be like FIND/in/all, returning a block of all values found. FORFIND (your suggestion, Carl) would be like FORSKIP, but the skip is a FIND rather than a fixed value.
FORFIND has been waiting for some equivalent of the [throw] attribute to be added to R3 - until then mezzanine loop functions that do block arguments are a bad idea.
Brian Hawley
1-Apr-2009 14:30:58 You might consider that the name for the comparable option to EXTRACT is /index.
Robert
3-Apr-2009 3:36:13 To me this looks more like a FIND/DEEP way of searching. You search through "block OF blocks", "block OF objects" etc.
Overall a good pattern to add.
Kaj
4-Apr-2009 1:41:56 I could really use a similar pattern for string parsing: PARSE strings in a block.
-pekr-
4-Apr-2009 3:28:54 ... ah, my first message misses why I mentioned remove-each, change-each - I wanted to propose find-each - it could have its dialect as well as remove-each has, and sort function has ...
Gregg Irwin
11-Apr-2009 16:12:23 I generally prefer mezzanines over natives, as long as the performance isn't magnitudes different. And if it is, that's a hint about what natives might be useful for writing better mezzanines. That is, rather than hiding more functionality in natives, provide the natives I need to write killer mezzanines.
If you distill transformation and filtering functionality, what natives do you really need, that you can build everything else on? What would Lisp do? :)
I reiterate that sentiment--which I've stated before--because this is a good example of "I want something different". That is, Carl's proposal is to use ordinal values for block searches, but allow words for objects.
I don't want to be limited to index values on blocks. I generally avoid using blocks of values that are referenced by position because 1) the data isn't self-describing, 2) it's more fragile, and 3) user/3 just isn't meaningful. I don't like dealing with code I wrote that does it, and I *really* don't like dealing with code someone else wrote that does it.
