Comments on: Update on Comparison Functions (equality, sameness, etc.)

The next build will modify the comparison functions and operators for all datatypes according to our earlier discussion (See Numeric almost-equal, equal, strict-equal, and... ? below).

This is a major change, but should not affect most code. However, it does promote all comparison actions to natives. That really shouldn't change anything, but it's worth noting.

For validation, Ladislav has created a first pass at test files to verify the desired operation. The defining document is Comparisons as published in the DocBase wiki, and there are links to various test files. The official test files are stored in R3 Chat #23 -- just type "get *" to download them.

Ok... although comparison is now more consistent, there are still a few rough spots that need to be worked out. Let me list a few, and I'm sure there will be more:

The new definition made equality valid for similar (but not identical) datatypes. For example, "test" = <test>, but "test" == <test> remains false. This could be problematic because it really stretches the definition of equality. One solution would be to make EQUIV? weaker than EQUAL?, and allow EQUIV? "test" <test> to be true, but EQUAL? to be false. Let's talk about it.
In R3 there is no equality relationship between binary and string. A binary requires decoding before it can be compared with a string. Such decoding is beyond the scope of the comparison functions.
Comparison between binary and numeric datatypes should not be assumed. Again, this is an encoding issue. Use a TO conversion first.
In a prior release, I modified date equality (EQUAL? level) to also include time. But, we've long debated this issue, and perhaps it should remain as it was. (E.g. Jan 10th is Jan 10th, regardless of what time it is.)
ZERO? needs better definition. In my code, I use it quite often as a test of equality to zero. The deeper question is: what datatypes should it include? This trivial question does not have a trivial answer.

14 Comments

Comments:

Ladislav
8-Jul-2009 19:04:30 EQUIV? weaker than EQUAL? - this is problematic.
EQUAL? is not transitive, so there is space and demand for a transitive comparison ignoring datatype differences (when convenient, like not making distinctions between different word types, different any-string types, or different any-block types).
But, it is of advantage to have a hierarchy of comparisons, so the described new comparison shall be finer than EQUAL?.
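To make the transitivity point concrete, here is a minimal sketch in Python rather than REBOL. It assumes, for illustration only, that the loss of transitivity comes from a tolerance-based numeric comparison; the name approx_equal and the TOLERANCE value are hypothetical and do not describe how EQUAL? is implemented.

```python
# Illustrative model only: a tolerance-based "equal", not REBOL's EQUAL?.
TOLERANCE = 0.5  # hypothetical tolerance, chosen so the arithmetic below is exact


def approx_equal(x: float, y: float) -> bool:
    """Return True when x and y differ by at most TOLERANCE."""
    return abs(x - y) <= TOLERANCE


a, b, c = 0.0, 0.5, 1.0

a_b = approx_equal(a, b)  # neighbouring values compare as equal
b_c = approx_equal(b, c)  # neighbouring values compare as equal
a_c = approx_equal(a, c)  # the endpoints do not, so the relation is not transitive

print(a_b, b_c, a_c)  # prints: True True False
```

Any comparison that tolerates a small difference can chain this way, which is why a separate, transitive level of comparison is useful.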
Ladislav
8-Jul-2009 19:08:51 Binary compared to string: if it is convenient/more corresponding to the implementation, then let's define all comparison operators to yield False when comparing any-string to binary.
Ladislav
8-Jul-2009 19:11:03 binary versus numeric datatypes: yes, I do agree that it is more convenient when it yields False
Brian Hawley
8-Jul-2009 19:21:13 EQUAL? and = already did type-ignorant comparisons for numeric types - the proposed change just extends that to within string types, and within word types too. The equivalence level of EQUIV? (short for "equivalent") doesn't affect this, since the == op corresponds to STRICT-EQUAL? instead, which was always the one that compared types.
However, we can still make EQUAL? the second level if need be, and have SIMILAR? or ALIKE? be the first level. Note that this will promote EQUAL? and = to exact decimal comparison though, so NOT 0.3 = (0.1 + 0.1 + 0.1).
I'll look through the equality test code and remove the ones with an implied TO conversion from the binary! type. It's not just numeric types and strings that were affected by the test code - image!, bitset! and tuple! also have tests for equality to binary!.
As for date/time comparisons, I would prefer that a date with no time be = to a date with 0:00. The tests show a good balance.
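The decimal case mentioned above can be reproduced in any language that uses IEEE 754 doubles. The following is a sketch in Python, not REBOL: the == operator stands in for an exact comparison, and math.isclose (with its default relative tolerance) stands in for an approximate one. How REBOL's own levels map onto these two is an assumption of this illustration.

```python
import math

total = 0.1 + 0.1 + 0.1  # accumulates binary rounding error

exact_match = (total == 0.3)            # exact comparison of the two doubles
close_match = math.isclose(total, 0.3)  # approximate comparison, default rel_tol=1e-09

print(exact_match, close_match)  # prints: False True
```

This is the trade-off at stake: an exact = would make NOT 0.3 = (0.1 + 0.1 + 0.1) hold, while an approximate = hides the rounding error.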
Brian Hawley
8-Jul-2009 19:31:18 "or different any-block types"
That could be a problem. It is better to limit that to within the any-path set, rather than including block! or paren!. The any-path! types look and act a lot like the any-word! types, so equivalence within them makes sense. The other block types look and act too different from paths, so having them be EQUAL? would be too confusing.
If the price of having
>> equal? first [a/b] first [a/b:]
== true
is having
>> equal? [a b] 'a/b
== true
then I would rather have no equivalence within the block types at all.
Brian Hawley
8-Jul-2009 22:17:33 Note that in alpha 69 there is no equivalence within any-block, so the tests that assume there is fail. Rightly so IMHO, but your opinion may vary.
meijeru
9-Jul-2009 4:16:14 zero? could include time! My reason: one can perform addition on time! values and the addition of 0:00 leaves the value unchanged, which is part of the mathematical definition of zero (a + 0 = a). Subsidiary reason: for a date, it makes a difference if d/time = 0:00 or d/time = none (i.e. not specified). For the moment, zero? cannot be used to test that.
meijeru
9-Jul-2009 4:18:14 Right now, equiv? and not-equiv? do not have an operator counterpart. May I propose =~ for equiv?
Ladislav
9-Jul-2009 7:33:27 Re any-string and any-block types: see CureCode #1066 for the reason why I proposed this kind of equality.
Normand
9-Jul-2009 10:34:12 If meijeru's motivation stands, then the empty string is a zero? too. a: "abcd" concatenated with "" -> "abcd". It is as neutral an element for concatenation of strings as 0 is a neutral element for addition of integers and reals (or even time, and maybe dates too). But should the empty string also be a zero? to a date or time, defined as a harmless (no ill effect) but ill-conceived operation? Carl's question is about that. Where is the limit, and on what principle?
How could such a consideration be coded? Zero has multiple meanings, identifying some operands as the function name coded into the disguise of a sign to yield a special result, often a named result. (Functional programming gains from explicitness here, as laziness/eagerness permits a differing treatment of names as values.) An example in arithmetic is that zero is not neutral but absorbing for the multiplication operator, as a byproduct of the definition of multiplication by the neutral element. Because the latter is, reasonably (in the context of the operation of a demultiplication of self where 3 x 1 is 1 + 1 + 1), better interpretable as no operation of multiplication but rather an identity: 1 x 1 is rather a lapse of multiplication resulting in self, trivially. In that context, what then can multiplication by zero reasonably be? In ancient times, mathematicians concluded that it had an absorbing effect, an effect that, when multiplication is inverted (the division operator), led to infinite recurrence and was consequently banned there. At that time, they did not recognize the laziness/eagerness interpretation. But now we do. We see here that operations on trivial cases may not receive a trivial, botched answer. It calls for an understanding of comparisons of the effect of operators, with a consistent treatment of trivial cases.
My overly long comment only insists on the type of difficulty. The logic is not bivalent, True/False; rather, False divides into false as applicable and false as inadequate but tolerated for reasons of expediency and practicality, in the context of frequent conversions from binary to strings. As Ladislav states, let's resist considering equal what calls for an implicit conversion. But is False the most adequate answer? It depends.
My concern is that for comparisons and zero?, the returned FALSE -- or error comment -- and the function explanation should be clear about out-of-bounds usage of the comparison or predicate (0 = 1 FALSE is truly false as a consequence of the definitions of 0 and 1, not FALSE from comparing an apple and an orange as apples). We cannot simply reduce the difficulty by inferring from the trivial nature of computing, where everything boils down to base 2 (0 and 1) on a computer.
That we exploit an ambiguity of the multiple senses of the word 'false, for practical matters, I do not object to, if there is no more intelligible way to state it. But the manual should be clear on what is the correct context for the TRUE and FALSE results. An incorrect (inadequate) comparison is not false; it is a somewhat ill usage of comparisons. The nuance should be explicitly supported, maybe by an explicit, detailed error message instead of a rather specious 'false truth value.
Brian Hawley
9-Jul-2009 23:31:05 Meijeru, we tried =~ but the problem is that mathematically that means "approximately equal", which is less strong than "equal". EQUIV? is a stronger function than EQUAL? and =, so a less-strong operator doesn't work.
Brian Hawley
9-Jul-2009 23:36:59 Normand, ZERO? doesn't mean a neutral element, it means a zero element. In REBOL only types you can do math with, numeric types and time for instance, have zero values. You can't do math with strings.
And the point to #[true] and #[false] in REBOL is to remove that ambiguity you mention, when you need to :)
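The distinction between a zero element and a neutral element can be sketched as a predicate. This is a Python model under stated assumptions, not REBOL's ZERO?: the function name is_zero is hypothetical, timedelta stands in for time!, and returning False (rather than raising an error) for non-arithmetic types is one possible choice among those debated in this thread.

```python
from datetime import timedelta
from numbers import Number


def is_zero(value) -> bool:
    """Hypothetical ZERO?-like predicate: True only for the zero of an arithmetic type."""
    if isinstance(value, bool):
        return False  # bool counts as a Number in Python; excluded here by choice
    if isinstance(value, Number):
        return value == 0
    if isinstance(value, timedelta):
        return value == timedelta(0)
    return False  # strings and other non-arithmetic types have no zero value


print(is_zero(0), is_zero(0.0), is_zero(timedelta(0)))  # prints: True True True
print(is_zero(""), is_zero(timedelta(seconds=1)))       # prints: False False
```

Under this reading the empty string is neutral for concatenation but is not a zero, because strings are outside the set of types the predicate treats as arithmetic.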
meijeru
10-Jul-2009 8:08:51 For an op! counterpart to equiv?, I see your point about =~. Then what about
=!
... I see two disadvantages: elsewhere ! is used for NOT; and ! sounds stronger than ?, but
=?
is the strongest of the lot... Perhaps
=*
would do?
Normand
10-Jul-2009 16:48:56 Brian, thanks for #[true] #[false]. I was simply drawing attention to the topic. One reason I was suggesting that bivalence does not cover all cases without ambiguity is the case (in linguistics) where two functions mutually deny each other, so that you alternate between true and false (liar paradox). Mutual denial is used every day: in light switches in stairs, and one may say that money works alike (owed to be paid; once paid, one owes it). As you know, mutual denial is a function in reactive systems (from on, shut down; from down, boot up). Negation is defined as 'if true then _false_, else _true_', where _false_ and _true_ are the action of alternating the preceding truth value of the state of the system; the nuance is that it is assertive, it says the inverse truth value (it has a metalinguistic value, acting on the system itself). So ZERO? is not susceptible to being overloaded for something non-numeric? Good blog post from Carl then.
