Comments on: TO-HEX considerations for 64 bit integers
When using to-hex in REBOL 3.0 be aware that the returned
hexadecimal string will be longer due to the use of 64 bit integers.
For example, in REBOL 2.6, 32 bit integers resulted in:
>> to-hex 1
== #00000001
In REBOL 3.0, the results will be appropriate to 64 bit integers:
>> to-hex 1
== #0000000000000001
To help deal with this, a /size refinement has been added. It
specifies the number of hex digits to return. For example, to
request the same result as in REBOL 2.5, you can write:
>> to-hex/size 1 8
== #00000001
>> to-hex/size -1 8
== #FFFFFFFF
Of course, for numbers larger than 32 bits, this is problematic because the
result will be clipped at eight digits.
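The clipping can be illustrated with a small model of the arithmetic. The sketch below is written in Python rather than REBOL, and the helper name `to_hex_model` is hypothetical. It assumes the result is the low-order hex digits of the value's two's-complement form, which matches the outputs shown above but is not taken from the REBOL source.

```python
def to_hex_model(value: int, size: int = 16) -> str:
    """Model of to-hex: return the low `size` hex digits of `value`.

    Assumption: negative integers are shown in two's-complement form and
    digits beyond `size` are silently dropped (the clipping described above).
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    mask = (1 << (4 * size)) - 1      # keep only the low 4*size bits
    return format(value & mask, f"0{size}X")


if __name__ == "__main__":
    print(to_hex_model(1))            # 16 digits, the 64 bit default
    print(to_hex_model(1, 8))         # 00000001
    print(to_hex_model(-1, 8))        # FFFFFFFF
    print(to_hex_model(2**32, 8))     # 00000000: bit 32 is clipped away
```

In this model a value such as 2**32 becomes indistinguishable from 0 once it is limited to eight digits, which is why a fixed size of 8 is only safe for values that fit in 32 bits.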
Also, for convenience, the to-hex function has been expanded to allow
character and tuple values. For example:
>> to-hex first "REBOL"
== #52
>> to-hex first ##{005200450042004f004c} ; unicode
== #0052
The tuple datatype conversion to hex is handy for converting REBOL
colors to HTML color values:
>> to-hex 255.150.50
== #FF9632
(But be careful if the value contains an alpha value.)
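As a rough illustration of the color conversion and the alpha caveat, here is a Python sketch (not REBOL). The function names are hypothetical, and it assumes each tuple component becomes two hex digits in order, so a fourth (alpha) component would show up as two extra digits at the end.

```python
from typing import Sequence


def tuple_to_hex_model(components: Sequence[int]) -> str:
    """Model of to-hex on a tuple: two hex digits per component, in order."""
    for c in components:
        if not 0 <= c <= 255:
            raise ValueError("tuple components must be in the range 0..255")
    return "".join(f"{c:02X}" for c in components)


def html_color(components: Sequence[int]) -> str:
    """Build an HTML color from the first three components (red, green, blue).

    Any fourth component (assumed here to be alpha) is ignored, because a
    six-digit HTML color has no place for it.
    """
    if len(components) < 3:
        raise ValueError("need at least red, green and blue components")
    return "#" + tuple_to_hex_model(components[:3])


if __name__ == "__main__":
    print(tuple_to_hex_model((255, 150, 50)))       # FF9632
    print(tuple_to_hex_model((255, 150, 50, 128)))  # FF963280
    print(html_color((255, 150, 50, 128)))          # #FF9632
```

Under these assumptions, a tuple with an alpha value yields eight digits, so it has to be trimmed before it is used as a six-digit HTML color.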
18 Comments:
Brian Hawley 7-Jun-2006 23:32 |
Will you be able to use /size with non-integer values? This would allow you to deal with the alpha channel problem with tuples (to-hex/size 255.150.50.128 6 == #FF9632, to-hex/size 255.150.50 8 == #FF963200, or is it #00FF9632), and unicode extension (to-hex/size "0" 4 == #0030).
I like how you've dealt with this otherwise though. It is good to see another blog...
Brian Hawley 7-Jun-2006 23:44 |
More questions...
Would to-hex 255.150.50.128 return #FF963280 or #80FF9632?
What would be returned by to-hex when passed a unicode character that is outside of the basic multilingual plane, and thus can't be represented in 2 bytes? Do REBOL's Unicode strings even support those characters covered by the more extended UTF16 code points, or is it limited to UCS2 only?
Petr Krenzelok 8-Jun-2006 1:38 |
:)
I suggest /size refinement to be called /part. As for binary operations, there was a group with proposed functionality - current rebol support for easy binary conversions/operations is rather unsatisfactory.
More discussion in AltME Binary Tools group ...
Petr
John Niclasen 8-Jun-2006 2:51 |
Will to-hex 1 resulting in a 64-bit result break many scripts? And letting 64-bit be default, what will happen, when we go to 128-bit?
Maybe 32-bit should be default? Then /size would be needed to get 64-bit result. On the other hand, it makes good sense, that the default result should be the same as internal representation of integers.
Another idea, that Carl probably already thought about: would /single and /double refinements be useful? /single for 32-bit and /double for 64-bit. Or is that just confusing? We could still have /size.
Goldevil 8-Jun-2006 4:13 |
What values are supposed to be accepted by the /size refinement?
- bigger than 64 bits (to-hex 1 32)?
- unusual values (to-hex 1 10)?
Petr, I think that /part is better than /size only if its purpose is to shrink the output. If it allows longer values than 64 bits, this name is less explicit. Otherwise, I agree.
Ah unicode ! I love unicode. I need unicode.
I hope that string functions will handle unicode and/or will have /unicode refinements (uppercase, lowercase, mold, form, join, rejoin,...)
Carl Sassenrath 8-Jun-2006 16:16 |
All good comments. I will return to answer them later today (if the schedule holds).
Regarding unicode details, will give more info on that in a separate article.
Gregg Irwin 8-Jun-2006 17:23 |
Per John's question, is it necessary to make a 64-bit result the default for any reason? I imagine this change will break quite a few scripts, possibly in ways that won't be immediately apparent.
From a quick check here, it looks like I have about 30 scripts that would be affected. Not all are important, but some are reusable libraries (funcs) and such, used by other scripts.
Could it safely auto-size, between 32 and 64 bits, based on the integer value? What is the most common case?
Gregg Irwin 8-Jun-2006 17:25 |
In any case, it brings up the point that there needs to be a good doc about what changes in R3 will affect existing REBOL scripts, so we know what to look for.
Robert Lancaster 8-Jun-2006 19:20 |
:)
Why not have a Hex! datatype? e.g.
>> type? 0x3434
== Hex!
cos at the moment...
>> type? to-hex 34
== issue!
Try explaining that to people... and what about bigendian vs LittleEndian issues?
Cheers,
Rob Lancaster
Carl Sassenrath 8-Jun-2006 20:43 |
It would be possible to auto-scale the result of to-hex to make it more compatible. That is, if the most significant 32 bits are all 0s or all 1s, we can produce the 32 bit to-hex result.
Then, it would be up to you to decide if you need all 64 bits, or if your math only uses 32.
Carl Sassenrath 8-Jun-2006 20:47 |
Regarding hex as datatype: Hex is really just an encoding format, a string representation. (It's also not the only possibility for encoding, as you can use base-64 or even base-2.)
The fact that to-hex produces an issue is more of a convenience that came as a result of allowing to-integer to convert hex to integer because it knew that issues were hex strings. (That would not work if they were normal strings, in which to-integer would expect to find a base-10 representation.)
Carl Sassenrath 8-Jun-2006 20:51 |
On tuple conversion, yes we could allow /size to be specified for it.
On color-as-a-tuple to-hex representation, yes alpha is last.
On unicode encoding, see my new article.
On /part instead of /size, it's a good suggestion, because /part is used so many other places, it would come to mind easier than /size.
Carl Sassenrath 8-Jun-2006 20:52 |
And, Pekr, I agree regarding binary conversions. We must fix that situation.
Volker 9-Jun-2006 8:56 |
auto-conversion - i think that is confusing.
i like the to-hex/single.
Because if i update my scripts, search/replace is easy. "to-hex/size (some long expression here) 64" needs human intervention.
John Niclasen 9-Jun-2006 10:40 |
I like the /single (for single-precision) and /double (for double-precision) integers too, but we have to think ahead. What should the refinement be for 128-bit, 256-bit, etc.? It could be argued that those aren't needed, just as we have the words first, second, third, ..., which stop at some point.
It's crucial to decide what the default (without refinement) should be.
1) If default is internal representation, scripts will be broken, and we'll have problems again, when going to 128-bit.
2) If default is e.g. 32-bit, we have to use refinement to use 64-bit. But again, if default is 64-bit, you have to use refinement to go 32-bit.
Maybe this isn't a huge problem, because it's only about the TO-HEX function.
Gregg Irwin 9-Jun-2006 18:14 |
I don't like /single and /double.
1) they aren't used anywhere else in REBOL
2) they are very programmerish
3) they refer to floating point types, not integers
The default result, and compatibility are important issues IMO. Anytime you break compatibility, where the runtime is external from the code, you can break things by deploying a new runtime. Admittedly, R3 may break a lot of things, so it's not so much a question now as it is for the future, as John points out.
The default result is equally important, because having to override it 90% of the time will be a pain. The default should cater to the common case which, here, I think is a 32-bit result.
John Niclasen 11-Jun-2006 4:48 |
Good points, Gregg. They made me change my mind. :)
Gordon Raboud 15-Jun-2006 13:58 |
:)
1. Thanks Carl for expanding the function to allow character and tuple values.
2. I personally don't like to-hex/part and prefer to-hex/size. Perhaps both could be used, one being the synonym for the other.