Discussion:
[Gambas-user] I need a hint on how to deleted duplicate items in a array
Fernando Cabral
2017-06-27 12:26:34 UTC
Permalink
Hi

I have a sorted array that may contain several repeated items scattered all
over.

I have to do two different things at different times:
a) Eliminate the duplicates, leaving a single specimen of each repeated
item;
b) Eliminate the duplicates while keeping a count of how many times each
item originally appeared.

So, if I have, say

A
B
B
C
D
D

In the first option, I want to have
A
B
C
D
In the second option, I want to have
1 A
2 B
1 C
2 D

Any hints on how to do this using some Gambas built-in method?

Note: At present I do it with external calls to
the utilities sort and uniq.

Regards

- fernando
--
Fernando Cabral
Blog: http://fernandocabral.org
Twitter: http://twitter.com/fjcabral
e-mail: ***@gmail.com
Facebook: ***@fcabral.com.br
Telegram: +55 (37) 99988-8868
Wickr ID: fernandocabral
WhatsApp: +55 (37) 99988-8868
Skype: fernandojosecabral
Landline: +55 (37) 3521-2183
Mobile: +55 (37) 99988-8868

As long as there is a single person in the world without a home or without
food, no politician or scientist can boast of anything.
Hans Lehmann
2017-06-27 13:51:19 UTC
Permalink
Hello,

look here:

8<-----------------------------------------------------------------------------------------
Public Function RemoveMultiple(aStringListe As String[]) As String[]
  Dim iIndex As Integer
  Dim sElement As String

  iIndex = 0 ' initialization NOT necessary
  While iIndex < aStringListe.Count
    sElement = aStringListe[iIndex]
    ' Remove every occurrence of the current element ...
    While aStringListe.Find(sElement) <> -1
      aStringListe.Remove(aStringListe.Find(sElement))
    Wend
    ' ... then put a single copy back at its original position.
    ' (Re-adding only when the count was odd would drop items
    ' that occur an even number of times.)
    aStringListe.Add(sElement, iIndex)
    Inc iIndex
  Wend

  ' Note: the array passed in is modified in place
  Return aStringListe

End ' RemoveMultiple(...)
8<-----------------------------------------------------------------------------------------

Hans
gambas-buch.de
n***@nothingsimple.com
2017-06-27 14:33:30 UTC
Permalink
Another very effective and simple way would be:

You have your array with data
You create a new empty array.

Loop through each item in your array with data
If it's not in the new array, then add it.

Destroy the original array.
Keep the new one.
...something like (syntax may not be correct)

Public Function RemoveMultiple(a As String[]) As String[]

  Dim x As Integer
  Dim z As New String[]

  ' Arrays are zero-based, and Find() returns -1 when the item is absent
  For x = 0 To a.Max
    If z.Find(a[x]) = -1 Then z.Add(a[x])
  Next

  Return z

End

-Nando (Canada)
Gianluigi
2017-06-27 14:52:48 UTC
Permalink
My two cents.

Public Sub Main()

  Dim sSort As String[] = ["A", "B", "B", "B", "C", "D", "D", "E", "E", "E", "E", "F"]
  Dim bb As New Integer[] ' count per unique item (Integer, so counts above 255 fit)
  Dim sSingle As New String[] ' unique items
  Dim i, n As Integer

  ' Single pass over the sorted array: a run ends when the next item differs
  For i = 0 To sSort.Max - 1
    If sSort[i] = sSort[i + 1] Then
      Inc n
    Else
      sSingle.Push(sSort[i])
      bb.Push(n + 1)
      n = 0
    Endif
  Next
  ' Close the last run
  sSingle.Push(sSort[sSort.Max])
  bb.Push(n + 1)

  ' Option a): the unique items
  For i = 0 To sSingle.Max
    Print sSingle[i]
  Next
  ' Option b): count followed by item
  For i = 0 To bb.Max
    Print bb[i] & " " & sSingle[i]
  Next

End

Regards
Gianluigi
n***@nothingsimple.com
2017-06-27 15:59:38 UTC
Permalink
Well, there is complicated, then there is simplicity:
I tested this. Works for sorted, unsorted.
Can't be any simpler.

Public Function RemoveMultiple(a As String[]) As String[]

  Dim x As Integer
  Dim z As New String[]

  ' Arrays are zero-based, and Find() returns -1 when the item is absent
  For x = 0 To a.Max
    If z.Find(a[x]) = -1 Then z.Add(a[x])
  Next

  'if you want it sorted, do it here
  Return z

End

' - - - - -
use it this way:

myArray = RemoveMultiple(myArray)
'the z array is now myArray.
'the original array is destroyed because there are no references.
Fernando Cabral
2017-06-27 17:23:48 UTC
Permalink
Nando

The problem with this search-and-destroy method without pre-sorting is the
quadratic growth in the time needed to do the job. If my math is not wrong,
this is how quickly it gets unmanageable:

Items / Comparisons needed (worst-case scenario, n * (n - 1) / 2)
10 = 45
100 = 4,950
1,000 = 499,500
10,000 = 49,995,000

My program has to face a few thousand items, so not sorting does not seem a
good option.

Regards

- fernando
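
The sort-first idea can be sketched in Gambas itself rather than through external calls to sort and uniq. This is only a minimal sketch: UniqueSorted is a hypothetical name that does not come from the thread, and it assumes the default (binary, case-sensitive) string comparison is what is wanted. Once the array is sorted, equal items are adjacent, so each item needs a single comparison with its predecessor.

```gambas
' Hypothetical helper (not from the thread): returns the distinct items
' of aItems in sorted order, leaving the original array untouched.
Public Function UniqueSorted(aItems As String[]) As String[]

  Dim aSorted As String[] = aItems.Copy()
  Dim aUnique As New String[]
  Dim i As Integer

  If aSorted.Count = 0 Then Return aUnique

  aSorted.Sort()
  aUnique.Add(aSorted[0])
  For i = 1 To aSorted.Max
    ' Duplicates are adjacent after sorting: keep an item only
    ' when it differs from the one just before it
    If aSorted[i] <> aSorted[i - 1] Then aUnique.Add(aSorted[i])
  Next

  Return aUnique

End
```

It would be called as myArray = UniqueSorted(myArray). The dominant cost is then the sort, not the duplicate removal.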
Jussi Lahtinen
2017-06-27 18:43:50 UTC
Permalink
As Fernando stated, your code is good only for small arrays. But if someone
is going to use it, here is a correct implementation:

For x = 0 To a.Max
  If z.Find(a[x]) = -1 Then z.Add(a[x])
Next


z.Exist() might be faster... I don't know.



Jussi
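
Where neither the original order nor a prior sort matters, a Collection keyed by the item may avoid rescanning the result array for every item, and it yields the per-item count at the same time. The following is a sketch under assumptions, not something proposed in the thread: CountItems is a hypothetical name, and every item is assumed to be a non-empty string, since it is used as a Collection key.

```gambas
' Hypothetical helper (not from the thread): maps each distinct item
' to the number of times it occurs in aItems.
Public Function CountItems(aItems As String[]) As Collection

  Dim cCount As New Collection
  Dim sItem As String

  For Each sItem In aItems
    If cCount.Exist(sItem) Then
      cCount[sItem] = cCount[sItem] + 1
    Else
      cCount[sItem] = 1
    Endif
  Next

  Return cCount

End

Public Sub Main()

  Dim cCount As Collection = CountItems(["A", "B", "B", "C", "D", "D"])
  Dim vCount As Variant

  ' Each enumerated value is a count; Collection.Key is the matching item
  For Each vCount In cCount
    Print vCount; " "; cCount.Key
  Next

End
```

The keys alone answer option a), and the key/value pairs answer option b).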
Fernando Cabral
2017-06-27 18:51:43 UTC
Permalink
No, Jussi, I didn't say it is good only for small arrays. I said some
suggestions apply only to small arrays, because if I have to traverse the
array again and again, advancing one item at a time and then coming back to
the next item to repeat the scan, the time required grows quadratically.
This makes most of the suggestions unusable for large arrays. The arrays I
have might grow to thousands and thousands of items.

Regards

- fernando
Fabien Bodard
2017-06-30 10:44:07 UTC
Permalink
The best way is Nando's ... at least for Gambas.

As you do not need to care about the index value or the order,
the walk-ahead option is the better one.


Then, Fernando ... for big, big things... I think you need to use a DB.
Or a native language.... maybe an SQLite in-memory structure would be good.
--
Fabien Bodard
Fernando Cabral
2017-06-30 11:20:55 UTC
Permalink
Post by Fabien Bodard
The best way is Nando's ... at least for Gambas.
Since you don't have to care about the index value or the order,
the walk-ahead option is the better one.
Then, Fernando, for big, big things I think you need to use a DB
or a native language... maybe an SQLite in-memory structure would be good.
Fabien, since this is a one-time-only thing, I don't think I'd be better
off with a database.
Basically, I read a text file and then break it down into words, sentences
and paragraphs.
Next I count the items in each array (words, sentences, paragraphs).
Array.Count works wonderfully.
After that, I have to eliminate the duplicate words (Array.Words). But in
doing so, I also have to count how many times each word appeared.

Finally I sort Array.Sentences and Array.Paragraphs by size
(String.Len()). Array.Words is sorted by count + length. This is all
working well.

So, my quest is for the fastest way to eliminate the duplicate words while
I count them.
For the time being, here is a working solution based on the system's
sort | uniq. This is one of the versions I have been using:

Exec ["/usr/bin/uniq", "Unsorted.txt", "Sorted.srt2"] Wait
Exec ["/usr/bin/uniq", "-ci", "SortedWords.srt2", "SortedWords.srt3"] Wait
Exec ["/usr/bin/sort", "-bnr", "SortedWords.srt3"] To UniqWords

WordArray = Split(UniqWords, "\n")

So, I end up with the result I want. It's effective. Now, it would be more
elegant if I could do the same with Gambas. Of course, the sorting would be
easy with the built-in WordArray.Sort().
But how about the '"/usr/bin/uniq", "-ci" ...' part?

Regards

- fernando
ML
2017-06-30 12:32:50 UTC
Permalink
Not tried, but for the duplicate count, what about iterating over the word
array and copying each word to a keyed collection?
For any new word, the value (item) added would be 1 (integer), and the key
would be UCase$(word).
If an error happens, the handler would just Inc the keyed item value. So
(please note my syntax may be slightly off, especially in If Error):

Public Function CountWordsInArray(sortedWordArray As String[]) As Collection

  Dim wordCount As New Collection
  Dim currentWord As String

  For Each currentWord In sortedWordArray

    Try wordCount.Add(1, UCase$(currentWord))
    If Error Then
      Inc wordCount[UCase$(currentWord)]
      Error.Clear 'Is this needed, or even correct?
    End If

  Next

  Return wordCount

End

The returned collection should be sorted if the array was, and for each
item you will have a numeric count as the item and the word as the key.
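
One caveat, stated as an assumption rather than a tested fact: if
Collection.Add() replaces the value of an existing key instead of raising
an error, the Try / If Error branch never fires and every count stays at 1.
A minimal sketch that sidesteps the question by asking Collection.Exist()
first (CountWordsByKey and the sample words are hypothetical names, used
only for illustration):

```gambas
' Hypothetical variant of CountWordsInArray(): test for the key explicitly
' instead of relying on Add() raising an error for a duplicate key.
Public Function CountWordsByKey(aWords As String[]) As Collection

  Dim cCount As New Collection
  Dim sWord, sKey As String

  For Each sWord In aWords
    sKey = UCase$(sWord)
    If cCount.Exist(sKey) Then
      cCount[sKey] = cCount[sKey] + 1   ' word seen before: bump its count
    Else
      cCount[sKey] = 1                  ' first occurrence of this word
    Endif
  Next

  Return cCount

End

Public Sub Main()

  Dim cCount As Collection = CountWordsByKey(["a", "A", "b"])
  Dim iCount As Integer

  ' While enumerating, Collection.Key holds the key of the current value
  For Each iCount In cCount
    Print iCount; " "; cCount.Key
  Next

End
```

With the three sample words this should report a count of 2 for "A" and 1
for "B". The input does not have to be sorted for the counts to be right;
sorting only affects the order in which the keys come back.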
Hope it helps,
zxMarce.
Tobias Boege
2017-06-30 13:05:24 UTC
Permalink
Post by Fernando Cabral
For the time being, here is a working solution based on the system's
sort | uniq:
Exec ["/usr/bin/uniq", "Unsorted.txt", "Sorted.srt2"] Wait
Exec ["/usr/bin/uniq", "-ci", "SortedWords.srt2", "SortedWords.srt3"] Wait
Exec ["/usr/bin/sort", "-bnr", "SortedWords.srt3"] To UniqWords
Are those temporary files? You can avoid those by piping your data into the
processes and reading their output directly. Otherwise the Temp$() function
gives you better temporary files.
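
The Temp$() route could look roughly like this. It is a minimal sketch,
assuming the words arrive as one newline-separated string; CountWithTools
and the temp-file base names are hypothetical, and the -f flag on the first
sort is an addition so that uniq -i finds case-insensitive duplicates next
to each other:

```gambas
' Hypothetical helper: run sort | uniq -ci | sort -bnr through files
' created with Temp$(), which live in the interpreter's private temporary
' directory instead of the current directory.
Public Function CountWithTools(sWords As String) As String[]

  Dim sIn As String = Temp$("words")
  Dim sStep As String = Temp$("step")
  Dim sOut As String

  File.Save(sIn, sWords)
  Exec ["/usr/bin/sort", "-f", sIn] To sOut        ' 1. sort, folding case
  File.Save(sStep, sOut)
  Exec ["/usr/bin/uniq", "-ci", sStep] To sOut     ' 2. collapse and count
  File.Save(sStep, sOut)
  Exec ["/usr/bin/sort", "-bnr", sStep] To sOut    ' 3. highest count first

  ' The last argument drops the empty entry after the final newline
  Return Split(sOut, "\n", "", True)

End
```

Each returned entry is one line of uniq -c output, that is, a count
followed by the word.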
Post by Fernando Cabral
WordArray = Split(UniqWords, "\n")
So, I end up with the result I want. It's effective. Now, it would be more
elegant if I could do the same with Gambas. Of course, the sorting would be
easy with the built-in WordArray.Sort().
But how about the '"/usr/bin/uniq", "-ci" ...' part?
I feel like my other mail answered this, but I can give you another version
of that routine (which I said I would leave as an exercise to you):

' Remove duplicates in an array like "uniq -ci". String comparison is
' case insensitive. The i-th entry in the returned array counts how many
' times aStrings[i] (in the de-duplicated array) was present in the input.
' The data in ~aStrings~ is overwritten. Assumes the array is sorted.
Private Function Uniq(aStrings As String[]) As Integer[]
  Dim iSrc, iLast As Integer
  Dim aCount As New Integer[](aStrings.Count)

  If Not aStrings.Count Then Return []
  iLast = 0
  aCount[iLast] = 1
  For iSrc = 1 To aStrings.Max
    If String.Comp(aStrings[iSrc], aStrings[iLast], gb.IgnoreCase) Then
      Inc iLast
      aStrings[iLast] = aStrings[iSrc]
      aCount[iLast] = 1
    Else
      Inc aCount[iLast]
    Endif
  Next

  ' Now shrink the arrays to the memory they actually need
  aStrings.Resize(iLast + 1)
  aCount.Resize(iLast + 1)
  Return aCount
End

What, in my opinion, is at least theoretically better here than the other
proposed solutions is that it runs in linear time, while nando's is
quadratic[*]. (Of course, if you sort beforehand, it will become n*log(n),
which is still better than quadratic.)

Attached is a test script with some words. It runs the sort + uniq utilities
first and then Array.Sort() + the Uniq() function above. The program then
prints the *diff* between the two outputs. I get an empty diff, meaning that
my Gambas routines produce exactly the same output as the shell utilities.
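
A minimal usage sketch of Uniq() (this is not the attached script; the word
list is made up for illustration, and Main() is assumed to live in the same
module as the Private function):

```gambas
Public Sub Main()

  Dim aWords As String[] = ["b", "A", "a", "C", "B", "b"]
  Dim aCount As Integer[]
  Dim i As Integer

  ' Uniq() expects sorted input, so sort case-insensitively first
  aWords.Sort(gb.IgnoreCase)

  ' After the call aWords holds one entry per distinct word and
  ' aCount[i] tells how often aWords[i] occurred in the input
  aCount = Uniq(aWords)

  For i = 0 To aWords.Max
    Print aCount[i]; " "; aWords[i]
  Next

End
```

This should print three lines, with counts 2, 3 and 1 for the a, b and c
groups. Which spelling (upper or lower case) represents each group depends
on how the sort happened to order the equal entries.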

Regards,
Tobi

[*] He calls array functions Add() and Find() inside a For loop that runs
over an array of size n. Adding elements to an array or searching an
array have themselves worst-case linear complexity, giving quadratic
overall. My implementation reserves some more space in advance to
avoid calling Add() in a loop. Since the array is sorted, we can go
without Find(), too. Actually, as you may know, adding an element to
the end of an array can be implemented in amortized constant time
(as C++'s std::vector does), by wasting space, but AFAICS Gambas
doesn't do this, but I could be wrong.
--
"There's an old saying: Don't change anything... ever!" -- Mr. Monk
Gianluigi
2017-06-30 14:57:56 UTC
Permalink
What was wrong with my example, which was meant to do this?

Public Sub Main()

  Dim sSort As String[] = ["A", "B", "B", "B", "C", "D", "D", "E", "E", "E", "E", "F"]
  Dim s As String

  For Each s In ReturnArrays(sSort, False)
    Print s
  Next
  For Each s In ReturnArrays(sSort, True)
    Print s
  Next

End

Private Function ReturnArrays(SortedArray As String[], withNumber As Boolean) As String[]

  Dim sSingle, sWithNumber As New String[]
  Dim i, n As Integer

  For i = 0 To SortedArray.Max
    ' You can avoid with Tobias's trick (For i = 1 To ...)
    If i < SortedArray.Max Then
      If SortedArray[i] = SortedArray[i + 1] Then
        Inc n
      Else
        Inc n
        sSingle.Push(SortedArray[i])
        sWithNumber.Push(n & SortedArray[i])
        n = 0
      Endif
    Endif
  Next
  Inc n
  sSingle.Push(SortedArray[SortedArray.Max])
  sWithNumber.Push(n & SortedArray[SortedArray.Max])
  If withNumber Then
    Return sWithNumber
  Else
    Return sSingle
  Endif

End

Regards
Gianluigi
Tobias Boege
2017-06-30 15:21:49 UTC
Permalink
Post by Gianluigi
What was wrong with my example, which was meant to do this?
I wouldn't say there is anything *wrong* with it, but it also has quadratic
worst-case running time. You use String[].Push() which is just another name
for String[].Add(). Adding an element to an array (the straightforward way)
is done by extending the space of that array by one further element and
storing the value there. But extending the space of an array could potentially
require you to copy the whole array somewhere else (where you have enough
free memory at the end of the array to enlarge it). Doing worst-case analysis,
we have to assume that this bad case always occurs.

If you fill an array with n values, e.g.

Dim a As New Integer[]
For i = 1 To n
a.Add(i)
Next

then you loop n times and in the i-th iteration there will be already
i-many elements in your array. Adding one further element to it will,
in the worst case, require i copy operations to be performed. 9-year-old
C.F. Gauss will tell you that the amount of store operations is about n^2.
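
Spelled out, under the worst-case assumption made here that every Add()
copies all elements already in the array, the total number of copies for n
additions is Gauss's sum:

```latex
\[
  \sum_{i=1}^{n} i \;=\; \frac{n(n+1)}{2} \;\approx\; \frac{n^2}{2}
\]
```

For n = 10 that is 55 copies; for n = 100,000 it is about 5 * 10^9. Doubling
the input roughly quadruples the work, which is what "quadratic" means. This
is only an upper bound: an allocator that can often grow the array in place
will do far fewer copies in practice.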

And your function does two jobs simultaneously but only returns the result
of one of the jobs. The output you get is only worth half the time you spent.

Regards,
Tobi
Gianluigi
2017-06-30 15:44:04 UTC
Permalink
Post by Tobias Boege
I wouldn't say there is anything *wrong* with it, but it also has quadratic
worst-case running time. You use String[].Push() which is just another name
for String[].Add(). Adding an element to an array (the straightforward way)
is done by extending the space of that array by one further element and
storing the value there. But extending the space of an array could potentially
require you to copy the whole array somewhere else (where you have enough
free memory at the end of the array to enlarge it). Doing worst-case analysis,
we have to assume that this bad case always occurs.
If you fill an array with n values, e.g.
Dim a As New Integer[]
For i = 1 To n
a.Add(i)
Next
then you loop n times and in the i-th iteration there will be already
i-many elements in your array. Adding one further element to it will,
in the worst case, require i copy operations to be performed. 9-year-old
C.F. Gauss will tell you that the amount of store operations is about n^2.
Tobias, you are always kind; thank you very much.
Would it be possible for you to explain this in more elementary terms for
me (a poorly educated boy :-) )?
Post by Tobias Boege
And your function does two jobs simultaneously but only returns the result
of one of the jobs. The output you get is only worth half the time you spent.
I put two functions in one just to save space; this is only a simple
example. :-)

Regards
Gianluigi
Gianluigi
2017-06-30 15:58:36 UTC
Permalink
Sorry Tobias,
no further explanation is necessary.
I would not be able to understand it :-(
I accept what you have already explained to me as dogma, and I will try to
put it into practice by copying your code :-).

Thanks again.

Gianluigi
Gianluigi
2017-06-30 18:10:18 UTC
Permalink
Just out of curiosity: on my computer, my (double) function processes 10
million strings (first and last names) in about 3 seconds.
This is a very naive measurement using Timers and a limited set of first
names and surnames; e.g. Willy Weber came up 11051 times.

To demonstrate the goodness of Tobias' arguments, about 1 million 3 cents a
second I really understood (I hope) what he wanted to say.

Sorry about my response times, but today my modem works worse than my brain.

Regards
Gianluigi
Fernando Cabral
2017-07-01 03:18:42 UTC
Permalink
I thank you guys for the hints on counting and eliminating duplicates. In
the end, I resorted to something that is very simple and does the trick in
three steps. In the first step I sort the array.
In the second step I count the number of occurrences and prepend it to the
word itself (with a separator). In the third step I sort the array again,
so now I have it sorted by the number of occurrences from the largest to
the smallest.

That is all I need.

Nevertheless, I am concerned about the performance. For 69,725 words, of
which 8,987 were unique, it took 28 seconds for the code below to execute.
I can live with these 28 seconds if I have to, but I would still like to
find a faster solution.

On the other hand, I think I am close to the fastest possible solution.
Basically, the array will be traversed once only, no matter how many terms
and how many repetitions it may have.

(What do you think about this efficiency, Tobi?)

MatchedWords.Sort(gb.Ascent + gb.Language + gb.IgnoreCase)
For i = 0 To MatchedWords.Max
  n = 1
  For j = i + 1 To MatchedWords.Max
    If (Comp(MatchedWords[i], MatchedWords[j], gb.Language + gb.IgnoreCase) = 0) Then
      n += 1
    Else
      Break
    Endif
  Next
  UniqWords.Push(Format(n, "0###") & "#" & MatchedWords[i])
  i += (n - 1)
Next
UniqWords.Sort(gb.Descent + gb.Language + gb.IgnoreCase)
For i = 0 To UniqWords.Max
  Print UniqWords[i]
Next
Post by Gianluigi
Just for curiosity, on my computer, my function (double) processes 10
million strings (first and last name) in about 3 seconds.
Very naif measurement using Timers and a limited number of names and
surnames eg Willy Weber has come up 11051 times
To demonstrate the goodness of Tobias' arguments, about 1 million 3 cents a
second I really understood (I hope) what he wanted to say.
Sorry my response times but today my modem works worse than my brain.
Regards
Gianluigi
Post by Gianluigi
Sorry Tobias,
other explanations are not necessary.
I would not be able to understand :-(
I accept what you already explained to me as a dogma and I will try to
put
Post by Gianluigi
it into practice by copying your code :-).
Thanks again.
Gianluigi
Post by Gianluigi
Post by Tobias Boege
I wouldn't say there is anything *wrong* with it, but it also has quadratic
worst-case running time. You use String[].Push() which is just another name
for String[].Add(). Adding an element to an array (the straightforward way)
is done by extending the space of that array by one further element and
storing the value there. But extending the space of an array could potentially
require you to copy the whole array somewhere else (where you have
enough
Post by Gianluigi
Post by Gianluigi
Post by Tobias Boege
free memory at the end of the array to enlarge it). Doing worst-case analysis,
we have to assume that this bad case always occurs.
If you fill an array with n values, e.g.
Dim a As New Integer[]
For i = 1 To n
a.Add(i)
Next
then you loop n times and in the i-th iteration there will be already
i-many elements in your array. Adding one further element to it will,
in the worst case, require i copy operations to be performed.
9-year-old
Post by Gianluigi
Post by Gianluigi
Post by Tobias Boege
C.F. Gauss will tell you that the amount of store operations is about n^2.
Tobias you are always kind and thank you very much.
Is possible for you to explain this more elementarily, for me (a poorly
educated boy :-) )
Post by Tobias Boege
And your function does two jobs simultaneously but only returns the result
of one of the jobs. The output you get is only worth half the time you spent.
I did two functions in one, just to save space; this is a simple
example. :-)
Regards
Gianluigi
n***@nothingsimple.com
2017-07-01 23:24:56 UTC
Permalink
There are much faster ways... just a little more intricate.

Nando

---------- Original Message -----------
From: Fernando Cabral <***@gmail.com>
To: mailing list for gambas users <gambas-***@lists.sourceforge.net>
Sent: Sat, 1 Jul 2017 00:18:42 -0300
Subject: Re: [Gambas-user] I need a hint on how to deleted duplicate items in a array
Post by Fernando Cabral
I thank you guys for the hints on counting and eliminating duplicates. In
the end, I resorted to something that is very simple and does the trick in
three steps. In the first step I sort the array.
In the second step I count the number of occurrences and prepend it to the
word itself (with a separator). In the third step I sort the array again,
so now I have it sorted by the number of occurrences from the largest to
the smallest.
That is all I need.
Nevertheless, I am concerned with the performance. For 69,725 words, from
which 8,987 were unique, it took 28 seconds for the code below to execute.
I will survive this 28 seconds, if I have to. But I still would like to
find a faster solution.
On the other hand, I think I am close to the fastest possible solution.
Basically, the array will be traversed once only, no matter how many terms
and how many repetitions it may have.
(What do you think about this efficiency, Tobi?)

  MatchedWords.Sort(gb.ascent + gb.language + gb.IgnoreCase)
  For i = 0 To MatchedWords.Max
    n = 1
    For j = i + 1 To MatchedWords.Max
      If (Comp(MatchedWords[i], MatchedWords[j], gb.language + gb.ignorecase) = 0) Then
        n += 1
      Else
        Break
      Endif
    Next
    UniqWords.Push(Format(n, "0###") & "#" & MatchedWords[i])
    i += (n - 1)
  Next
  UniqWords.Sort(gb.descent + gb.language + gb.ignorecase)
  For i = 0 To UniqWords.Max
    Print UniqWords[i]
  Next
------- End of Original Message -------

Tobias Boege
2017-06-27 14:29:18 UTC
Permalink
Post by Fernando Cabral
Hi
I have a sorted array that may contain several repeated items scattered all
over.
a) Eliminate the duplicates leaving a single specimen from each repeated
item;
b) Eliminate the duplicates but having a count of the original number.
So, if I have, say
A
B
B
C
D
D
In the first option, I want to have
A
B
C
D
In the second option, I want to have
1 A
2 B
1 C
2 D
Any hints on how to do this using some Gambas buit in method?
Note; Presently I have been doing it using external calls to
the utilities sort and uniq.
Your first sentence is a bit confusing. First you say that your array is
sorted but then you say that duplicates may be scattered across the array.
There are notions of order (namely *preorder*) which are so weak that this
could happen, but are you actually dealing with a preorder on your items?
What are your items, anyway?

When I hear "sorted", I think of a partial order and if you have a partial
order, then sorted implies that duplicates are consecutive! Anyway, I don't
want to bore you with elementary concepts of order theory. There are ways
to handle preorders, partial orders and every stronger notion of order,
of course, from within Gambas. You simply have to ask a better question,
by giving more details.

If you have a sorting where duplicates are consecutive, the solution is
very easy: just go through the array linearly and kick out these consecutive
duplicates (which is precisely what uniq does), e.g. for integers:

  Dim aInts As Integer[] = ...
  Dim iInd, iLast As Integer

  If Not aInts.Count Then Return
  iLast = aInts[0]
  iInd = 1
  While iInd < aInts.Count
    If aInts[iInd] = iLast Then ' consecutive duplicate
      aInts.Remove(iInd, 1)
    Else
      iLast = aInts[iInd]
      Inc iInd
    Endif
  Wend

Note that the way I wrote it to get the idea across is not a linear-time
operation (it depends on the complexity of aInts.Remove()), but you can
achieve linear performance by writing better code. Think of it as an
exercise. (Of course, you can't hope to be more efficient than linear
time in a general situation.)
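One possible answer to that exercise is sketched below. It is not necessarily what Tobi had in mind: instead of calling Remove() for every duplicate, it moves each value that should be kept towards the front and cuts off the leftover tail once at the end. The name UniqInPlace is made up for this illustration, and the array is assumed to be sorted already.

```gambas
' Remove consecutive duplicates from a sorted Integer[] in a single pass.
' UniqInPlace is a hypothetical helper name; aInts must already be sorted.
Public Sub UniqInPlace(aInts As Integer[])

  Dim iRead As Integer
  Dim iWrite As Integer

  If aInts.Count = 0 Then Return

  iWrite = 0 ' index of the last element kept so far
  For iRead = 1 To aInts.Max
    If aInts[iRead] <> aInts[iWrite] Then
      Inc iWrite
      aInts[iWrite] = aInts[iRead] ' move the new value down
    Endif
  Next

  ' Drop the leftover tail in one operation.
  aInts.Resize(iWrite + 1)

End
```

Every element is read once and written at most once, so the loop itself is linear in the array length. Called on [1, 2, 2, 3, 4, 4], it leaves [1, 2, 3, 4] in the array.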

The counting task is solved with a similar pattern, but while you kick
an element out, you also increment a dedicated counter:

  Dim aInts As Integer[] = ...
  Dim aDups As New Integer[]
  Dim iInd, iLast As Integer

  If Not aInts.Count Then Return
  iLast = aInts[0]
  iInd = 1
  aDups.Add(0)
  While iInd < aInts.Count
    If aInts[iInd] = iLast Then ' consecutive duplicate
      aInts.Remove(iInd, 1)
      Inc aDups[aDups.Max]
    Else
      iLast = aInts[iInd]
      aDups.Add(0)
      Inc iInd
    Endif
  Wend

After this has executed, the array aInts will not contain duplicates (supposing
it was sorted before) and aDups[i] will contain the number of duplicates of
the item aInts[i] that were removed.
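Note that aDups[i] counts the copies that were removed, so an item that occurred twice gets a 1. For the "2 B" style of output asked for in the original question, a variant that reports the number of occurrences might look like the sketch below. It works on strings, leaves the input untouched, and assumes the array is sorted so that equal words are adjacent; the name CountRuns is made up for this illustration.

```gambas
' Collapse a sorted String[] into "count word" lines, e.g. "2 B".
' CountRuns is a hypothetical helper name; aWords must already be sorted.
Public Function CountRuns(aWords As String[]) As String[]

  Dim aResult As New String[]
  Dim iInd As Integer
  Dim iCount As Integer

  If aWords.Count = 0 Then Return aResult

  iCount = 1
  For iInd = 1 To aWords.Max
    If aWords[iInd] = aWords[iInd - 1] Then
      Inc iCount ' still inside the same run of equal words
    Else
      aResult.Add(CStr(iCount) & " " & aWords[iInd - 1]) ' close the finished run
      iCount = 1
    Endif
  Next
  aResult.Add(CStr(iCount) & " " & aWords[aWords.Max]) ' close the last run

  Return aResult

End
```

For the input A, B, B, C, D, D this returns "1 A", "2 B", "1 C", "2 D". The comparison with = is case-sensitive, and the result array is still filled with Add(), once per distinct word.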

Regards,
Tobi
--
"There's an old saying: Don't change anything... ever!" -- Mr. Monk
Fernando Cabral
2017-06-27 16:57:53 UTC
Permalink
Post by Tobias Boege
Your first sentence is a bit confusing. First you say that your array is
sorted but then you say that duplicates may be scattered across the array.
You are right. My fault. The array is sorted. What I meant by scattered was
that pairs, triplets or larger runs of duplicates may appear all over,
interspersed with non-duplicated items.

My items are either words or sentences (extracted from an ODT file).
After the extraction, the words (or sentences) are sorted with the method
Array.Sort(gb.Descent).

After sorting it is much more efficient to search for the duplicates, and
it can be done with some simple code (as some people have exemplified in
this thread).

So, my question is basically whether Gambas has some built-in method to
eliminate duplicates. The reason I am asking is that I am new to Gambas,
so I have found myself coding things that were not needed. For instance, I
coded some functions to do quicksort and bubble sort and then I found that
Array.Sort() was available. Therefore, I wasted my time coding those
quicksort and bubble sort functions.... :-(

Regards

- fernando
Tobias Boege
2017-06-27 17:08:11 UTC
Permalink
Post by Fernando Cabral
So, my question is basically whether Gambas has some built-in method to
eliminate duplicates. The reason I am asking is that I am new to Gambas,
so I have found myself coding things that were not needed. For instance, I
coded some functions to do quicksort and bubble sort and then I found that
Array.Sort() was available. Therefore, I wasted my time coding those
quicksort and bubble sort functions.... :-(
Ah, ok. I'm almost sure there is no built-in "uniq" function which gets
rid of consecutive duplicates, so you can go ahead and write your own :-)
--
"There's an old saying: Don't change anything... ever!" -- Mr. Monk