Discussion:
[squeak-dev] #= ==> #hash issues
Chris Cunningham
2018-10-26 22:21:20 UTC
Hi.

I'm slowly (very) working towards creating a usable test for validating that,
for classes where #= is true, the #hash values will also be equal.

Last week, the Date issue showed up. This week?
Intervals:
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "false"
CharacterBlock:
| cb1 cb2 |
cb1 := (CharacterBlock new stringIndex: 5 text: 'StandardText' asText
topLeft: (***@100) extent: (***@20)).
cb2 := (CharacterBlock new stringIndex: 5 text: 'StandardText' asText
topLeft: (***@200) extent: (***@20)).
cb1 = cb2. "true"
cb1 hash = cb2 hash. "false"

These were found by comparing a random sampling of instances of classes
that implement #= or #hash (or both), and finding which have these deviant
properties. The hard part is figuring out instances that are going to have
issues - Date didn't show up in my prototype scanning. Also most classes
don't have instances floating around to compare.
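A minimal workspace sketch of the kind of pairwise check described above, not the actual test under development. It assumes Squeak/Pharo brace-array syntax, and the `samples` list is a hypothetical hand-picked stand-in for the random sampling of instances:

```smalltalk
| samples violations |
"Hypothetical sample set; the real scan draws instances from the image."
samples := { 0 to: 1. 0 to: 5/3. 1 to: 3. #(1 2 3) }.
violations := OrderedCollection new.
samples do: [:a |
	samples do: [:b |
		"Record every pair that is #= but whose hashes differ."
		(a = b and: [a hash ~= b hash])
			ifTrue: [violations add: a -> b]]].
violations
```

An empty result only says that these particular samples obey the rule; it says nothing about instances that were not sampled.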

Thanks,
-cbc
David T. Lewis
2018-10-26 23:44:28 UTC
Oh my, it sounds like you are tracking down a likely source of very
obscure intermittent bugs. Bravo.

Dave
Chris Cunningham
2018-10-28 17:23:40 UTC
Yes.
Looking at the latest 5.2b, I find that 0 = 0, but 0 hash does not = 0 hash
(for 2 existing instances of LargePositiveInteger). That is odd.
Unfortunately, I can't seem to actually capture the ones that are causing
the issue to investigate - they get normalized (or something) to regular 0
integers.

However, while looking at this, I noticed that the fallback code in
Integer>>digitCompare: is buggy.

If you evaluate
1 digitCompare: -1249. "1"
but, if you then comment out "<primitive: 'primDigitCompare'
module:'LargeInteger'>" in that method and run it again, you get:
1 digitCompare: -1249. "-1"

-cbc
Nicolas Cellier
2018-10-28 21:52:45 UTC
http://bugs.squeak.org/view.php?id=3380
Chris Cunningham
2018-10-29 00:34:49 UTC
Post by Nicolas Cellier
http://bugs.squeak.org/view.php?id=3380
and

Intervals:
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "false"

In the inbox is collections-cbc.810.mcz, which fixes both of these bugs.

You can test them out - #hash still replicates the bugs,
while #hashBetterFastArrayCompatible (on Interval) and
#hashBetterFastIntervalCompatible (on Array) make both work. The latter
also implements your idea of only testing some of the elements - the first
and last 16.

It slows down hash speed of Interval roughly an order of magnitude, though.
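To illustrate the "first and last 16 elements" idea, here is a hedged workspace sketch. The `sampledHash` block and its mixing constants are hypothetical and are not the code in collections-cbc.810.mcz; it only shows how a hash that consults the size and the sampled elements gives the same answer for any two sequenceable collections with the same contents:

```smalltalk
| sampledHash |
"Hypothetical helper, not the implementation from the inbox package."
sampledHash := [:seq | | h n |
	n := seq size.
	h := n.
	"Mix in the first 16 elements (or fewer)."
	1 to: (n min: 16) do: [:i |
		h := (h * 31 + (seq at: i) hash) bitAnd: 16r3FFFFFFF].
	"Mix in the last 16 elements, skipping any already visited."
	((n - 15) max: 17) to: n do: [:i |
		h := (h * 31 + (seq at: i) hash) bitAnd: 16r3FFFFFFF].
	h].
(sampledHash value: (1 to: 3)) = (sampledHash value: #(1 2 3)).
(sampledHash value: (0 to: 1)) = (sampledHash value: (0 to: 5/3)).
```

Both comparisons should answer true, since the raw stop value is never consulted. Walking the elements instead of reading three instance variables is also the likely reason such a hash is slower for Interval.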

If anyone has ideas I'd be interested. Failing that, I'll ruminate on them
for the next several days, and eventually push something in that fixes this
(meanwhile moving that package to treated).
-----
Here is the series of 'tests' that I did while working on these with
various timings.

{
[(1 to: 100 by: 1) hash] bench.
[(1 to: 100 by: 1) hashBetter] bench.
[(1 to: 100 by: 1) hashBetterAlsoFixBug3380] bench.
[(1 to: 100 by: 1) hashSlowerBetterAlsoFixBug3380] bench.
[(1 to: 100 by: 1) hashFastArrayCompatible] bench.
[(1 to: 100 by: 1) hashBetterFastArrayCompatible] bench.
'---'.
[(1 to: 100.3 by: 1) hash] bench.
[(1 to: 100.3 by: 1) hashBetter] bench.
[(1 to: 100.3 by: 1) hashBetterAlsoFixBug3380] bench.
[(1 to: 100.3 by: 1) hashSlowerBetterAlsoFixBug3380] bench.
[(1 to: 100.3 by: 1) hashFastArrayCompatible] bench.
[(1 to: 100.3 by: 1) hashBetterFastArrayCompatible] bench.
}

{
(0 to: 1) = (0 to: 5/3).
(0 to: 1) hash = (0 to: 5/3) hash.
(0 to: 1) hashBetter = (0 to: 5/3) hashBetter.
(0 to: 1) hashBetterAlsoFixBug3380 = (0 to: 5/3) hashBetterAlsoFixBug3380.
(0 to: 1) hashSlowerBetterAlsoFixBug3380 = (0 to: 5/3) hashSlowerBetterAlsoFixBug3380.
(0 to: 1) hashFastArrayCompatible = (0 to: 5/3) hashFastArrayCompatible.
(0 to: 1) hashBetterFastArrayCompatible = (0 to: 5/3) hashBetterFastArrayCompatible.
}


{
(1 to: 3) = #(1 2 3).
(1 to: 3) hash = #(1 2 3) hash.
(1 to: 3) hashBetter = #(1 2 3) hash.
(1 to: 3) hashBetterAlsoFixBug3380 = #(1 2 3) hash.
(1 to: 3) hashSlowerBetterAlsoFixBug3380 = #(1 2 3) hash.
(1 to: 3) hashFastArrayCompatible = #(1 2 3) hashFastIntervalCompatible.
(1 to: 3) hashBetterFastArrayCompatible = #(1 2 3) hashBetterFastIntervalCompatible.
}

-cbc
Nicolas Cellier
2018-10-29 19:42:38 UTC
Reminder: this is because Interval is used for text selection and/or cursor
position. From that POV, 3 to: 2 is not equal to 4 to: 3. From the collection
POV, they are both an empty collection. I think that I once proposed to
distinguish the two usages and introduce a TextInterval for that purpose.
Louis LaBrunda
2018-11-01 12:40:11 UTC
Hi Benoit,

On the latest version of VA Smalltalk:

VA Smalltalk V9.1 (32-bit); Image: 9.1 [413]
VM Timestamp: 4.0, 10/01/18 (100)

I see:

(0 to: 1) = (0 to: 5/3). "false"
(0 to: 1) hash = (0 to: 5/3) hash. "true"

Very interesting.

Lou
Post by Benoit St-Jean
Interesting!

Squeak 5.2
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "false"

Dolphin 7
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "false"

VisualWorks 8.1.1
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "true"

Pharo 5.0
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "false"

I don't have VAST installed on the PC I'm using right now. I'd be curious to see how other Smalltalks and/or GemStone handle this. So far, according to what I could test, only VW is right (according to the ANSI standard and just plain logic!)
I wonder how much code relies on this "behavior" out there!
But the ANSI Smalltalk draft is very clear on this (revision 1.9, page 53, http://wiki.squeak.org/squeak/uploads/172/standard_v1_9-indexed.pdf):
"If the value of receiver = comparand is true then the receiver and comparand *must* have equivalent hash values."
That's what I always thought (or was taught or even read in the Blue Book). Was this something that was changed at some point?
----------------
Benoît St-Jean
Yahoo! Messenger: bstjean
Pinterest: benoitstjean
Instagram: Chef_Benito
IRC: lamneth
Blogue: endormitoire.wordpress.com
"A standpoint is an intellectual horizon of radius zero". (A. Einstein)
--
Louis LaBrunda
Keystone Software Corp.
SkypeMe callto://PhotonDemon
Chris Cunningham
2018-11-02 14:44:00 UTC
ParcPlace-Digitalk VSE 3.1 (roughly 1999):

(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "true"

So, ancient VSE and current VisualWorks are consistent, and agree on where
they want to be. This is also the direction I want to take Squeak.
VA is also consistent, but #= doesn't match any other Smalltalk variant
that we've looked at.
Squeak, Pharo, Dolphin all currently have the same answer, but are not
consistent.

Interesting indeed.

thanks,
cbc
Louis LaBrunda
2018-11-02 16:12:44 UTC
Hi Chris,
I have been talking to the VA Smalltalk guys about this and they are thinking about it but haven't decided what to do
yet. It turns out that in VA Smalltalk, collections that use #hash (like Set) treat such intervals as distinct: the #=
test fails for intervals that cover the same range and have the same hash, so the failed #= test overrides the equal
hash value and the interval is added to the collection. I find this troubling.

Lou
Chris Cunningham
2018-11-02 16:52:37 UTC
Hi Louis,
The rules for = and hash are that if two objects are #=, then their hash
values have to be equal as well.

There is no statement about what it means for equality if two objects'
hashes are the same. This, I believe, is intentional.

The collection objects in (most? all?) Smalltalks behave similarly to VA's -
if objects have the same hash but are not equal, then they will both be in
the hashed collection (such as Set). The Squeak implementation is
described in Set>>scanFor: . This method also shows why having objects
equal but their hash not equal is so dangerous - if you had two objects
that are supposed to be one and the same and are in fact #= but don't have
the same hash, they can both show up in a Set together, or as keys in a
Dictionary together, which breaks what we would expect.

But getting back to VA's collection issue that you have issues with - they
are undoubtedly doing something similar in their collections that we do in
Squeak, which is what is expected (although not necessarily obvious).

A long time ago, I took advantage of this and just hard-coded the hash for
some of my classes to 1. This actually did work, but is a horrible (I mean
HORRIBLE) idea - it really, really slows down the system when you have more
than a couple instances of an object, but it does work.
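The danger described above can be probed with a short workspace sketch. It assumes Squeak/Pharo brace-array syntax, and the answers depend on the image and dialect it is evaluated in:

```smalltalk
| a b set |
a := 0 to: 1.
b := 0 to: 5/3.
set := Set new.
set add: a; add: b.
"Answers: are they equal, do the hashes agree, how many elements did the Set keep."
{ a = b. a hash = b hash. set size }
```

In an image where the first answer is true and the second false, the size may come out as 2, because the lookup for b can start at a different slot and never meet a. It is not guaranteed, though: two different hash values can still land on the same slot, in which case the #= test finds the existing element.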

-cbc
Chris Cunningham
2018-11-02 17:01:38 UTC
All of that said, I too find the VA behavior a bit troubling in this case. I rely
on (0 to: 1) = (0 to: 5/3) being true. VA not supporting this limits
cross-dialect portability; although I don't (personally) use VA, other
folks at work do and we do occasionally share code.

However, this implementation is internally consistent and obeys the = and
hash rule in this case. It's just not what I would want.

-cbc
Louis LaBrunda
2018-11-02 17:53:55 UTC
Hi Chris,

I know about and understand everything you have said about the spec and the way #= and #hash should work and relate to
each other. My problem with VA Smalltalk's implementation of #= is that it doesn't consider what an interval is and how
it is used and therefore what equals should mean. I would interpret two intervals as being equal if they span the same
range and not worry about how they got there. In the case where the increment (by) is an integer, the start and end
values map down to integers, and if those integers are the same in two intervals then the intervals span the same range
and should be considered equal. Any program using those intervals would expect them to work the same. In VA Smalltalk
they would work the same but you can't tell that with #=.

I have no problem with VA's hash based collections, they work the way they should given the way #hash and #= work. But
from a higher level, if I put a bunch of intervals in a Set and only wanted one entry for each range spanned, I would be
out of luck and probably confused as to why. Sure, this is Smalltalk and there are ways around this, if you know you
need to work around it. One could always add a method to intervals to "fix" the start and end values if the increment
is an integer.

I have heard from Instantiations and they plan to leave well enough alone at this point, since the #= method has been
this way since 1996. They are concerned that changing #= may break existing user code. I doubt it but I understand the
concern.

Lou
Tim Olson
2018-11-15 13:44:01 UTC
Interval >> size does the correct thing with the stop value, so maybe Interval >> = could use:

anInterval isInterval and:
[start = anInterval start and:
[step = anInterval step and:
[self size = anInterval size]]]
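
If #= were defined along those lines, a matching #hash would need to consult only the same three quantities. A minimal sketch follows; it assumes Interval keeps its instance variables named start and step, and the bitXor: mixing is a hypothetical choice, not a proposal from this thread:

```smalltalk
hash
	"Hypothetical Interval>>hash to pair with the #= sketch above:
	 it consults only start, step and size, never the raw stop value."
	^ (start hash bitXor: step hash) bitXor: self size hash
```

Note that such a hash would not agree with Array>>hash, so it only fits if an Interval is never #= to an Array with the same elements.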

— tim
Post by Louis LaBrunda
Hi Chris,
I know about and understand everything you have said about the spec and the way #= and #hast should work and relate to
each other. My problem with VA Smalltalks implementation of #= is that it doesn't consider what an interval is and how
it is used and therefor what equals should mean. I would interpret two intervals being equal if they span the same
range and not worry about how they got there. In the case where the increment (by) is an integer the start and end
values map down to integers and if those integers are the same in two intervals then the intervals span the same range
and should be considered equal. Any program using those intervals would expect them to work the same. In VA Smalltalk
they would work the same but you can't tell that with #=.
I have no problem with VA's hash based collections, they work the way they should given the way #hash and #= work. But
from a higher level, if I put a bunch of intervals in a Set and only wanted one entry for each range spanned, I would be
out of luck and probably confused as to why. Sure, this is Smalltalk and there are ways around this, if you know you
need to work around it. One could always add a method to intervals to "fix" the start and end values if the increment
is an integer.
I have heard from Instantiations and they plan to leave well enough alone at this point, since the #= method has been
this way since 1996. They are concerned that changing #= may break existing user code. I doubt it but I understand the
concern.
Lou
Post by Chris Cunningham
All of that said, I too find the VA troubling a bit in this case. I rely
on this (0 to: 1) = (0 to: 5/3) being true. VA not supporting is limits
cross-dialect portability, although I don't (personally) use VA, other
folks at work do and we do occasionally share code.
However, this implementation is internally consistent and obeys the = and
hash rule in this case. Its just not what I would want.
-cbc
Post by Chris Cunningham
Hi Louis,
Post by Louis LaBrunda
Hi Chris,
On Fri, 2 Nov 2018 07:44:00 -0700, Chris Cunningham <
Post by Chris Cunningham
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "true"
So, ancient VSE and current VisualWorks are consistent, and agree on where
they want to be. This is also the direction I want to take Squeak.
VA is also consistent, but #= doesn't match any other Smalltalk variant
that we've looked at.
Squeak, Pharo, Dolphin all currently have the same answer, but are not
consistent.
Interesting indeed.
I have been talking to the VA Smalltalk guys about this and they are
thinking about it but haven't decided what to do yet. It turns out that
collections in VA Smalltalk that use #hash (like Set) still add an interval
that covers the same range and has the same hash as one already present:
because the #= test fails, that failure overrides the equal hash value and
the interval is added to the collection. I find this troubling.
Lou
The rule for = and hash is that if two objects are #=, then their hash
values have to be equal as well.
There is no statement about what it means for equality if two objects' hashes
are the same. This, I believe, is intentional.
The collection objects in (most?all?) smalltalks behave similarly to VA's
- if objects have the same hash but are not equal, then they will both be
in the hashed collection (such as Set). The squeak implementation is
described in Set>>scanFor: . This method also shows why having objects
equal but their hash not equal is so dangerous - if you had two objects
that are supposed to be one and the same and are in fact #= but don't have
the same hash, they can both show up in a Set together, or as keys in a
Dictionary together, which breaks what we would expect.
But getting back to VA's collection issue that you have issues with - they
are undoubtedly doing something similar in their collections that we do in
Squeak, which is what is expected (although not necessarily obvious).
A long time ago, I took advantage of this and just hard-coded the hash for
some of my classes to 1. This actually did work, but is a horrible (I mean
HORRIBLE) idea - it really, really slows down the system when you have more
than a couple instances of an object, but it does work.
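
The failure mode described above can be probed from a workspace. This is only a sketch: the variable names are made up for illustration, and the answers depend on the dialect and version, so no results are claimed here beyond what the thread reports.

```smalltalk
"Workspace sketch: two intervals with the same elements, added to one Set."
| a b s |
a := 0 to: 1.
b := 0 to: 5/3.
s := Set new.
s add: a; add: b.
"Print this: equality, hash agreement, and how many entries the Set kept."
{ a = b. a hash = b hash. s size }
```

In an image where the first two answers are true and false, the Set may end up holding both intervals, because the lookup starts from the hash and may never reach the slot of the equal element. Whether it actually does depends on where the two hashes land in the table.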
-cbc
Post by Louis LaBrunda
Post by Chris Cunningham
thanks,
cbc
Post by Louis LaBrunda
Hi Benoit,
VA Smalltalk V9.1 (32-bit); Image: 9.1 [413]
VM Timestamp: 4.0, 10/01/18 (100)
(0 to: 1) = (0 to: 5/3). "false"
(0 to: 1) hash = (0 to: 5/3) hash. "true"
Very interesting.
Lou
On Thu, 1 Nov 2018 02:40:00 +0000 (UTC), Benoit St-Jean via Squeak-dev
<
Post by Chris Cunningham
Post by Louis LaBrunda
Interesting!
Squeak 5.2
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "false"
Dolphin 7
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "false"
VisualWorks 8.1.1
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "true"
Pharo 5.0
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "false"
I don't have VAST installed on the PC I'm using right now. I'd be
curious to see how other Smalltalks and/or GemStone handle this. So far
(according to what I could test), only VW is right (according to the ANSI
standard and just plain logic!).
I wonder how much code relies on this "behavior" out there!
But the ANSI Smalltalk draft is very clear on this (revision 1.9, page
53,
"If the value of receiver = comparand is true then the receiver and
comparand *must* have equivalent hash values."
That's what I always thought (or was taught or even read in the Blue
Book). Was this something that was changed at some point???
----------------
Benoît St-Jean
Yahoo! Messenger: bstjean
Pinterest: benoitstjean
Instagram: Chef_Benito
IRC: lamneth
Blogue: endormitoire.wordpress.com
"A standpoint is an intellectual horizon of radius zero". (A.
Einstein)
Post by Chris Cunningham
Post by Louis LaBrunda
--
Louis LaBrunda
Keystone Software Corp.
SkypeMe callto://PhotonDemon
Louis LaBrunda
2018-11-15 13:55:20 UTC
Permalink
Hi Tim,

After thoroughly discussing this with the VA Smalltalk guys I have concluded that the developer should be responsible
for creating intervals that have equal ranges that compare equal. For example intervals with mixed integers and real
values can have the same range but don't compare equal. If comparing equal is desired, the values should be made to
make that work.

Lou
Eliot Miranda
2018-11-15 15:50:31 UTC
Permalink
Post by Louis LaBrunda
Hi Tim,
After thoroughly discussing this with the VA Smalltalk guys I have concluded that the developer should be responsible
for creating intervals that have equal ranges that compare equal. For example intervals with mixed integers and real
values can have the same range but don't compare equal. If comparing equal is desired, the values should be made to
make that work.
IMO that’s a cop out. An implementation which compares two intervals
as equal if their elements are equal makes perfect sense and is easy
to implement. All that’s needed is that the implementation access
“self last” instead of “stop”.

Implementing newHash as a variant of hash that uses self last in place of stop,
I get the following in my image:

| insts s |
insts := Interval allInstances.
{ insts size.
  s := (insts select: [:i | i hash ~= i newHash]) size.
  s * 100.0 / insts size } "#(3267 0 0.0)"

So there's minimal risk of breaking anything by simply redefining hash (I
would also reformat #= as per my suggestion ;-) ).
Louis LaBrunda
2018-11-15 17:13:13 UTC
Permalink
Hi Eliot,

I really shouldn't speak for Instantiations but since I brought them into this conversation I will say this:

(0 to: 1) = (0 to: 5/3). "false"
(0 to: 1) hash = (0 to: 5/3) hash. "true"

The comparison of the intervals answers false. I argued strenuously that two intervals that cover the same range should
compare as equal. Unfortunately the ANSI standard is I think ambiguous on this point. It says that if two things
compare equal their hashes should be equal but here the two intervals don't compare equal. The VA Smalltalk code has
been this way for over 20 years. Changing it could impact an unknown amount of customer code. I eventually concluded
that even though the ranges were equal the objects were not and that their definition of equal was as valid as any
other. If this came up 20+ years ago, maybe they could be convinced to change their definition. Now I agree with them,
it is too late and too dangerous.

Since this is Smalltalk, if one is really interested in intervals that cover the same range comparing equal, there are
simple ways to make that work. Yes, moving code from Squeak to VA Smalltalk would need a little love but probably not
much.

Lou
Bert Freudenberg
2018-11-15 19:29:32 UTC
Permalink
Isn't this extremely simple to fix?

#= is implemented in terms of start, step, and last.

So why not implement #hash as

^(start hash bitXor: step hash) bitXor: self last hash

Then in the postscript do a

HashedCollection rehashAll

and that should be it, right?

I did a quick check using

Dictionary allInstances select: [ :dict | dict keys anySatisfy: #isInterval]

and the system seems fine after that.
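
Put together, the proposal might look roughly like the sketch below. It assumes Squeak's Interval instance variables start and step, and that #last answers the final element actually reached; the method comment and the workspace check are illustrative additions, not the code that went into trunk.

```smalltalk
"Method on Interval (sketch)"
hash
	"Combine the same three values that #= compares."
	^ (start hash bitXor: step hash) bitXor: self last hash

"Workspace, after installing the method"
HashedCollection rehashAll.
(0 to: 1) hash = (0 to: 5/3) hash.
```

With this definition the last expression should answer true, since both intervals have start 0, step 1 and last 1.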

- Bert -
Chris Muller
2018-11-15 21:32:09 UTC
Permalink
Post by Bert Freudenberg
Isn't this extremely simple to fix?
#= is implemented in terms of start, step, and last.
So why not implement #hash as
^(start hash bitXor: step hash) bitXor: self last hash
Then in the postscript do a
HashedCollection rehashAll
and that should be it, right?
I did a quick check using
Dictionary allInstances select: [ :dict | dict keys anySatisfy: #isInterval]
and the system seems fine after that.
You forgot about the universe that lies beyond the fringes of the
image. There be persistent data files out there. And users.
Possibly with systems that rely on Interval>>#hash.

But since most applications don't use non-Integer based Intervals
(since Interval doesn't really support them, by design, I guess),
there's no reason whatsoever to decimate the universe when Eliot's
suggestion simply fixes the corner case that no one is using anyway.

+1 to Eliot's suggestion to fix Interval>>#hash.

- Chris
Louis LaBrunda
2018-11-15 21:52:06 UTC
Permalink
Hi Guys,

I don't work for Instantiations, so this decision isn't mine to make. That said, I have to agree with their desire to
be cautious. There is no up side to them to change this and even though the down side should be small, there is no real
way of knowing how big or small it is.

Lou
Chris Cunningham
2018-11-15 21:53:39 UTC
Permalink
Nice. It seems like we have consensus on what to change.

I'll push these changes (with the tests) to trunk soon.

The fix I have for #hash was exactly what Eliot suggested.
I'll make sure to include the rehash as well (thanks for the code snippet
Bert!)
If no one objects strenuously, I'll also include Eliot's slight rewrite of
#= as well - it is marginally cleaner and equally fast, so now is a
reasonable time to include it.

I'll delay working on bug #3380 for now - to fix this, we'd have to also
add in a check on class in #= to make sure we aren't comparing an interval
to an array. Unless someone has been bitten by this recently, I'd rather
wait.
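
For anyone who wants to see whether the interval-versus-array comparison affects their image, a workspace probe along these lines may help. It assumes that the bug concerns an Interval and an Array holding the same elements; the variable names are illustrative and the answers vary by dialect and version, so none are claimed here.

```smalltalk
"Workspace sketch: compare an Interval with an Array of the same elements."
| interval array |
interval := 1 to: 3.
array := #(1 2 3).
"Print this: equality in both directions, then hash agreement."
{ interval = array. array = interval. interval hash = array hash }
```

If either of the first two answers is true while the third is false, the pair breaks the #= implies equal #hash rule in the same way the two intervals do.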

-cbc
Chris Muller
2018-11-15 22:20:40 UTC
Permalink
Post by Chris Cunningham
Nice. It seems like we have consensus on what to change.
I'll push these changes (with the tests) to trunk soon.
Hey, "seems" carries enough uncertainty to give us one final look
before trunk. By "these changes" are you referring to just the
Interval>>#hash or some Array changes, too? All we've seen so far are
Collections-cbc.810.mcz, could we get one look at your final draft
proposal before trunk?

On a less important note, I personally find a pure conditional
nomenclature more attractive than the embedded ifTrue:ifFalse:, like:

= anObject
    ^ self == anObject or:
        [ (anObject isInterval and:
            [ start = anObject first and:
                [ step = anObject increment and:
                    [ self last = anObject last ] ] ])
          or: [ super = anObject ] ]

For whatever and whenever you push, I'm sure you already were but,
just in case, I would be grateful if you would please base it solely
off the current top trunk version with no intermediate versions in the
ancestry. :-)

Thanks a lot for finding this and helping get it fixed!

Best Regards,
Chris

Post by Chris Cunningham
The fix I have for #hash was exactly what Eliot suggested.
I'll make sure to include the rehash as well (thanks for the code snippet, Bert!)
If no one objects strenuously, I'll also include Eliot's slight rewrite of #= - it is marginally cleaner and equally fast, so now is a reasonable time to include it.
I'll delay working on bug #3380 for now - to fix this, we'd have to also add in a check on class in #= to make sure we aren't comparing an interval to an array. Unless someone has been bitten by this recently, I'd rather wait.
-cbc
Post by Chris Muller
Post by Bert Freudenberg
Isn't this extremely simple to fix?
#= is implemented in terms of start, step, and last.
So why not implement #hash as
^(start hash bitXor: step hash) bitXor: self last hash
Then in the postscript do a
HashedCollection rehashAll
and that should be it, right?
I did a quick check using
Dictionary allInstances select: [ :dict | dict keys anySatisfy: #isInterval]
and the system seems fine after that.
You forgot about the universe that lies beyond the fringes of the
image. There be persistent data files out there. And users.
Possibly with systems that rely on Interval>>#hash.
But since most applications don't use non-Integer based Intervals
(since Interval doesn't really support them, by design, I guess),
there's no reason whatsoever to decimate the universe when Eliot's
suggestion simply fixes the corner case that no one is using anyway.
+1 to Eliot's suggestion to fix Interval>>#hash.
- Chris
Chris Cunningham
2018-11-15 22:45:23 UTC
Permalink
Hi Chris,

The changes will be limited to Interval, and will be changes to #= and hash
(and the interval test so this doesn't show up again).

I'll push the changes to inbox soon; and to trunk tomorrow/early next
week. The test will go to Trunk with the changes to inbox (the test will
be what I've pushed to the inbox minus the 3380 part).

And, yes, I'll rebase it off of the current trunk version - there have been
significant changes since my last proposal.

Interestingly:

= anObject
    ^ self == anObject or:
        [ (anObject isInterval and:
            [ start = anObject first and:
                [ step = anObject increment and:
                    [ self last = anObject last ] ] ])
            or: [ super = anObject ] ]

This is actually wrong - if the two objects being compared are both intervals
but differ in first, last, or increment, then it will still fall through and
check whether super #= returns true - that is not desirable. But I understand
the desire you mention here - which I believe is what Eliot was driving for
as well.
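
The difference between the two control flows can be sketched in a workspace with plain Booleans standing in for the real tests. The variable names are invented for illustration, and superSays is simply assumed to be true, standing in for an inherited #= that would answer true:

```smalltalk
| isInterval fieldsMatch superSays nested flattened |
isInterval := true.     "both objects are Intervals"
fieldsMatch := false.   "but first/increment/last differ"
superSays := true.      "assumption: the inherited #= would answer true"

"ifTrue:ifFalse: form: super is consulted only for non-Intervals"
nested := isInterval
	ifTrue: [fieldsMatch]
	ifFalse: [superSays].

"and:/or: form: super is also consulted when the field test fails"
flattened := (isInterval and: [fieldsMatch]) or: [superSays].

Transcript show: nested printString; cr.
Transcript show: flattened printString; cr.
```

With these inputs the nested form answers false while the flattened form answers true, which is the fall-through described above.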

-cbc
Bert Freudenberg
2018-11-16 00:38:16 UTC
Permalink
Somehow I missed Eliot's version, but unsurprisingly he had exactly the
same idea (use "last" not "stop" for hash). I'd still think bitXor: is
preferable to bitOr, that is the standard way in almost all hash methods.
But ...

BUT: I forgot about the super fallback in #=. That makes this discussion
pretty much moot, because since

#(1 2 3) = (1 to: 3) "true"

is true, this must also be true:

#(1 2 3) hash = (1 to: 3) hash "must be true"

So the only proper fix IMHO is to remove #hash from Interval (or replace it
with ^super hash and a proper comment)
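
A minimal workspace sketch of that requirement follows. The block named consistent is a hypothetical helper, not something in the image; what the last line prints depends on which Interval>>#hash the image has, so no expected output is given:

```smalltalk
| consistent array interval |
"consistent is a hypothetical helper: it answers false only when
 two objects are #= but their hashes differ"
consistent := [:a :b | (a = b) not or: [a hash = b hash]].

array := #(1 2 3).
interval := 1 to: 3.

Transcript show: (array = interval) printString; cr.
Transcript show: (consistent value: array value: interval) printString; cr.
```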

- Bert -
Eliot Miranda
2018-11-16 01:35:59 UTC
Permalink
Hi Bert,
Post by Bert Freudenberg
Somehow I missed Eliot's version, but unsurprisingly he had exactly the
same idea (use "last" not "stop" for hash). I'd still think bitXor: is
preferable to bitOr, that is the standard way in almost all hash methods.
But ...
BUT: I forgot about the super fallback in #=. That makes this discussion
pretty much moot, because since
#(1 2 3) = (1 to: 3) "true"
#(1 2 3) hash = (1 to: 3) hash "must be true"
So the only proper fix IMHO is to remove #hash from Interval (or replace
it with ^super hash and a proper comment)
We discussed this a couple of weeks ago. There is no need for
#(1 2 3) = (1 to: 3)
to be true.
#(1 2 3) = #[1 2 3]
isn’t true. And we have hasEqualElements:. So a more coherent approach is
for the hack that makes intervals equal to arrays be discarded, and the
hashes kept distinct.
Post by Bert Freudenberg
- Bert -
--
_,,,^..^,,,_
best, Eliot
Bert Freudenberg
2018-11-16 03:39:41 UTC
Permalink
Post by Eliot Miranda
Hi Bert,
Post by Bert Freudenberg
Somehow I missed Eliot's version, but unsurprisingly he had exactly the
same idea (use "last" not "stop" for hash). I'd still think bitXor: is
preferable to bitOr, that is the standard way in almost all hash methods.
But ...
BUT: I forgot about the super fallback in #=. That makes this discussion
pretty much moot, because since
#(1 2 3) = (1 to: 3) "true"
#(1 2 3) hash = (1 to: 3) hash "must be true"
So the only proper fix IMHO is to remove #hash from Interval (or replace
it with ^super hash and a proper comment)
We discussed this a couple of weeks ago. There is no need for
#(1 2 3) = (1 to: 3)
to be true.
#(1 2 3) = #[1 2 3]
isn’t true. And we have hasEqualElements:. So a more coherent approach
is for the hack that makes intervals equal to arrays be discarded, and the
hashes kept distinct.
Makes sense. The version you posted ("I would have written...") still
delegated to super>>= so I thought we wanted to keep that. But I agree that
it's of little utility.

- Bert -
Chris Cunningham
2018-11-16 05:52:32 UTC
Permalink
Post by Eliot Miranda
Hi Bert,
Post by Bert Freudenberg
Somehow I missed Eliot's version, but unsurprisingly he had exactly the
same idea (use "last" not "stop" for hash). I'd still think bitXor: is
preferable to bitOr, that is the standard way in almost all hash methods.
But ...
BUT: I forgot about the super fallback in #=. That makes this discussion
pretty much moot, because since
#(1 2 3) = (1 to: 3) "true"
#(1 2 3) hash = (1 to: 3) hash "must be true"
So the only proper fix IMHO is to remove #hash from Interval (or replace
it with ^super hash and a proper comment)
We discussed this a couple of weeks ago. There is no need for
#(1 2 3) = (1 to: 3)
to be true.
#(1 2 3) = #[1 2 3]
isn’t true. And we have hasEqualElements:. So a more coherent approach
is for the hack that makes intervals equal to arrays be discarded, and the
hashes kept distinct.
And I agreed with you weeks ago, but looking at it closer, the code
specifically says Interval is a species of Array.
Interestingly, ByteArray, which is a subclass of ArrayedCollection, doesn't
set its species, so its species is ByteArray. Which is desirable.

If we change the Interval #species to not be array, then many things break
with Interval - most notably #select: and #collect:, so a major overhaul
would be in store for that part of the code.

In line with Bert's allusion, if we removed the super = call, then #= is no
longer symmetric between Intervals and Arrays:
(1 to: 3) = #(1 2 3) "false"
#(1 2 3) = (1 to: 3) "true"

So, I'm just fixing the Interval only part and punting on the issue between
Interval and Array for now.

-cbc
Chris Cunningham
2018-11-20 19:11:58 UTC
Permalink
Sorry for the excessive delay in responding to these threads.
Post by Eliot Miranda
Hi Bert,
Post by Bert Freudenberg
Somehow I missed Eliot's version, but unsurprisingly he had exactly the
same idea (use "last" not "stop" for hash). I'd still think bitXor: is
preferable to bitOr, that is the standard way in almost all hash methods.
But ...
BUT: I forgot about the super fallback in #=. That makes this discussion
pretty much moot, because since
#(1 2 3) = (1 to: 3) "true"
#(1 2 3) hash = (1 to: 3) hash "must be true"
So the only proper fix IMHO is to remove #hash from Interval (or replace
it with ^super hash and a proper comment)
We discussed this a couple of weeks ago. There is no need for
#(1 2 3) = (1 to: 3)
to be true.
#(1 2 3) = #[1 2 3]
isn’t true. And we have hasEqualElements:. So a more coherent approach
is for the hack that makes intervals equal to arrays be discarded, and the
hashes kept distinct.
Actually, the hack is that interval is a subclass of SequenceableCollection
with species defined as Array. This makes lots of things very nice - like
#collect: and #select: just work. If we removed #species (which would be
necessary to make interval and array not be equal), that would require
re-implementing these two methods - and many, many more - from the
superclasses.

Basically, that hack is a fundamental part of how the class is built today.

Are we ok with us taking on that much of a change?

-cbc
Eliot Miranda
2018-11-22 15:36:18 UTC
Permalink
Hi Chris,
Post by Chris Cunningham
Sorry for the excessive delay in responding to these threads.
Post by Eliot Miranda
Hi Bert,
Somehow I missed Eliot's version, but unsurprisingly he had exactly the same idea (use "last" not "stop" for hash). I'd still think bitXor: is preferable to bitOr, that is the standard way in almost all hash methods. But ...
BUT: I forgot about the super fallback in #=. That makes this discussion pretty much moot, because since
#(1 2 3) = (1 to: 3) "true"
#(1 2 3) hash = (1 to: 3) hash "must be true"
So the only proper fix IMHO is to remove #hash from Interval (or replace it with ^super hash and a proper comment)
We discussed this a couple of weeks ago. There is no need for
#(1 2 3) = (1 to: 3)
to be true.
#(1 2 3) = #[1 2 3]
isn’t true. And we have hasEqualElements:. So a more coherent approach is for the hack that makes intervals equal to arrays be discarded, and the hashes kept distinct.
Actually, the hack is that interval is a subclass of SequenceableCollection with species defined as Array. This makes lots of things very nice - like #collect: and #select: just work. If we removed #species (which would be necessary to make interval and array not be equal), that would require re-implementing these two methods - and many, many more - from the superclasses.
Basically, that hack is a fundamental part of how the class is built today.
IMO it is not a hack. But it has nothing to do with whether an Interval with equal elements to an Array is equal to it. A ByteArray is also a SequenceableCollection and is not equal to an Array if it has equal elements. It has a different species to Array, but species exists, as you’ve noted, for the convenience of select: & collect: so that immutable collections can answer a suitable mutable class to be used to construct the result.
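
The role of species described here can be seen in a short workspace sketch. The variable names are illustrative, and the comments show what a Squeak image of this era is expected to print:

```smalltalk
| interval squares evens |
interval := 1 to: 5.

"Interval answers Array as its species, so derived collections are Arrays"
squares := interval collect: [:each | each * each].
evens := interval select: [:each | each even].

Transcript show: interval species printString; cr.   "Array"
Transcript show: squares printString; cr.            "#(1 4 9 16 25)"
Transcript show: evens printString; cr.              "#(2 4)"
Transcript show: #[1 2 3] species printString; cr.   "ByteArray"
```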
Post by Chris Cunningham
Are we ok with us taking on that much of a change?
No one is suggesting changing the species of Interval.
Post by Chris Cunningham
-cbc
Post by Eliot Miranda
- Bert -
--
_,,,^..^,,,_
best, Eliot
Eliot Miranda
2018-11-15 15:41:28 UTC
Permalink
Post by Tim Olson
[self size = anInterval size]]]
The current implementation is correct; it is effectively the same as
yours but has some obvious optimizations. Two intervals are equal if
they have the same sequence of elements, no matter how they are
written. Here it is:

= anObject

    ^ self == anObject
        ifTrue: [true]
        ifFalse: [anObject isInterval
            ifTrue: [start = anObject first
                and: [step = anObject increment
                and: [self last = anObject last]]]
            ifFalse: [super = anObject]]

which I would have written
= anObject
    ^self == anObject
        or: [anObject isInterval
            ifFalse: [super = anObject]
            ifTrue:
                [start = anObject first
                and: [step = anObject increment
                and: [self last = anObject last]]]]

The issue is with hash which accesses stop directly instead of last.
If hash read

hash
    "Hash is reimplemented because = is implemented."

    ^(((start hash bitShift: 2)
        bitOr: self last hash)
        bitShift: 1)
        bitOr: self size

(i.e. "bitOr: stop hash" => "bitOr: self last hash")
then things will be fine. And the hash of most common intervals will not change.
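
As a rough workspace check of why last rather than stop matters: proposedHash below is a stand-alone block mirroring the method body above through the public accessors first, last, and size. It is an illustration only, not code from the image:

```smalltalk
| a b proposedHash |
a := 0 to: 1.
b := 0 to: 5/3.

"proposedHash mirrors the proposed method body, using only
 public accessors (first, last, size)"
proposedHash := [:interval |
	(((interval first hash bitShift: 2)
		bitOr: interval last hash)
		bitShift: 1)
		bitOr: interval size].

Transcript show: b last printString; cr.     "1, although stop is 5/3"
Transcript show: ((proposedHash value: a) = (proposedHash value: b)) printString; cr.   "true"
```

Both intervals have first 0, last 1, and size 2, so any hash built only from those three values agrees for them.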
Post by Tim Olson
— tim
Post by Louis LaBrunda
Hi Chris,
I know about and understand everything you have said about the spec and the way #= and #hash should work and relate to
each other. My problem with VA Smalltalk's implementation of #= is that it doesn't consider what an interval is and how
it is used and therefore what equals should mean. I would interpret two intervals being equal if they span the same
range and not worry about how they got there. In the case where the increment (by) is an integer the start and end
values map down to integers and if those integers are the same in two intervals then the intervals span the same range
and should be considered equal. Any program using those intervals would expect them to work the same. In VA Smalltalk
they would work the same but you can't tell that with #=.
I have no problem with VA's hash based collections, they work the way they should given the way #hash and #= work. But
from a higher level, if I put a bunch of intervals in a Set and only wanted one entry for each range spanned, I would be
out of luck and probably confused as to why. Sure, this is Smalltalk and there are ways around this, if you know you
need to work around it. One could always add a method to intervals to "fix" the start and end values if the increment
is an integer.
I have heard from Instantiations and they plan to leave well enough alone at this point, since the #= method has been
this way since 1996. They are concerned that changing #= may break existing user code. I doubt it but I understand the
concern.
Lou
Post by Chris Cunningham
All of that said, I too find the VA behavior a bit troubling in this case. I rely
on (0 to: 1) = (0 to: 5/3) being true. VA not supporting this limits
cross-dialect portability; although I don't (personally) use VA, other
folks at work do and we do occasionally share code.
However, this implementation is internally consistent and obeys the = and
hash rule in this case. It's just not what I would want.
-cbc
Post by Chris Cunningham
Hi Louis,
Post by Louis LaBrunda
Hi Chris,
On Fri, 2 Nov 2018 07:44:00 -0700, Chris Cunningham <
Post by Chris Cunningham
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "true"
So, ancient VSE and current VisualWorks are consistent, and agree on where
they want to be. This is also the direction I want to take Squeak.
VA is also consistent, but #= doesn't match any other Smalltalk variant
that we've looked at.
Squeak, Pharo, Dolphin all currently have the same answer, but are not
consistent.
Interesting indeed.
I have been talking to the VA Smalltalk guys about this and they are
thinking about it but haven't decided what to do
yet. It turns out that the way collections (like Set) that use #hash in
VA Smalltalk work, because of the #= test
failing for intervals that cover the same range and have the same hash,
that it overrides the equal hash value and adds
the interval to the collection. I find this troubling.
Lou
The rules for = and hash are that if two objects are #=, then their hash
values have to be equal as well.
There is no statement about if two objects hashes are the same, what this
means for equality. This, I believe, is intentional.
The collection objects in (most?all?) smalltalks behave similarly to VA's
- if objects have the same hash but are not equal, then they will both be
in the hashed collection (such as Set). The squeak implementation is
described in Set>>scanFor: . This method also shows why having objects
equal but their hash not equal is so dangerous - if you had two objects
that are supposed to be one and the same and are in fact #= but don't have
the same hash, they can both show up in a Set together, or as keys in a
Dictionary together, which breaks what we would expect.
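
A minimal workspace sketch of that danger, using the two intervals from this thread. The printed size is deliberately not stated, since it depends on the Interval>>#hash in the image and on where the two hashes happen to land in the Set:

```smalltalk
| set |
set := Set new.
set add: (0 to: 1).
set add: (0 to: 5/3).

"Both intervals are #= in Squeak, so a size of 2 here would indicate
 that their hashes sent them to different slots"
Transcript show: set size printString; cr.
```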
But getting back to VA's collection issue that you have issues with - they
are undoubtedly doing something similar in their collections that we do in
Squeak, which is what is expected (although not necessarily obvious).
A long time ago, I took advantage of this and just hard-coded the hash for
some of my classes to 1. This actually did work, but is a horrible (I mean
HORRIBLE) idea - it really, really slows down the system when you have more
than a couple instances of an object, but it does work.
-cbc
Post by Louis LaBrunda
Post by Chris Cunningham
thanks,
cbc
Post by Louis LaBrunda
Hi Benoit,
VA Smalltalk V9.1 (32-bit); Image: 9.1 [413]
VM Timestamp: 4.0, 10/01/18 (100)
(0 to: 1) = (0 to: 5/3). "false"
(0 to: 1) hash = (0 to: 5/3) hash. "true"
Very interesting.
Lou
On Thu, 1 Nov 2018 02:40:00 +0000 (UTC), Benoit St-Jean via Squeak-dev
<
Interesting!
Squeak 5.2
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "false"
Dolphin 7
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "false"
VisualWorks 8.1.1
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "true"
Pharo 5.0
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "false"
I don't have VAST installed on the PC I'm using right now. I'd be
curious to see how other Smalltalk and/or GemStone handle this? So far
(according to what I could test, only VW is right (according to the ANSI
standard and just plain logic!)
I wonder how much code relies on this "behavior" out there!
But the ANSI Smalltalk draft is very clear on this (revision 1.9, page 53,
"If the value of receiver = comparand is true then the receiver and
comparand *must* have equivalent hash values."
That's what I always thought (or was taught or even read in the Blue
Book). Was this something that was changed at some point???
----------------
Benoît St-Jean
Yahoo! Messenger: bstjean
Pinterest: benoitstjean
Instagram: Chef_Benito
IRC: lamneth
Blogue: endormitoire.wordpress.com
"A standpoint is an intellectual horizon of radius zero". (A.
Einstein)
Post by Chris Cunningham
Post by Louis LaBrunda
--
Louis LaBrunda
Keystone Software Corp.
SkypeMe callto://PhotonDemon
Eliot Miranda
2018-11-15 15:29:53 UTC
Permalink
Hi Benoît,
Interesting!
Squeak 5.2
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "false"
Dolphin 7
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "false"
VisualWorks 8.1.1
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "true"
Pharo 5.0
(0 to: 1) = (0 to: 5/3). "true"
(0 to: 1) hash = (0 to: 5/3) hash. "false"
I don't have VAST installed on the PC I'm using right now. I'd be curious to see how other Smalltalk and/or GemStone handle this? So far (according to what I could test, only VW is right (according to the ANSI standard and just plain logic!)
I wonder how much code relies on this "behavior" out there!
"If the value of receiver = comparand is true then the receiver and comparand *must* have equivalent hash values."
That's what I always thought (or was taught or even read in the Blue Book). Was this something that was changed at some point???
Nothing was changed. It’s simply people not realizing there is a bug there. Hence the value of Chris’ tests.