Discussion:
Exception in thread "main" java.lang.IllegalArgumentException: Unable to read output from "mahout -spark classpath"
go canal
2015-10-03 09:14:00 UTC
Hello, I am running a very simple Mahout application in Eclipse, but I get this error:
Exception in thread "main" java.lang.IllegalArgumentException: Unable to read output from "mahout -spark classpath". Is SPARK_HOME defined?
I have SPARK_HOME defined in Eclipse as an environment variable with the value /usr/local/spark-1.5.1.
What else do I need to include/set?

thanks, canal
Pat Ferrel
2015-10-04 16:23:18 UTC
Mahout 0.11.0 is built on Spark 1.4, so 1.5.1 is a bit of an unknown. I think the Mahout Shell does not run on 1.5.1.

That may not be the cause of the error above, which occurs when Mahout tries to create the set of jars to use in the Spark executors. The code runs `mahout -spark classpath` to get these, so something is missing in your environment in Eclipse. Does `mahout -spark classpath` run in a shell? If so, check whether your shell environment matches the one in Eclipse.

Also, what are you trying to do? I have some example Spark context creation code if you are using Mahout as a library.
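For what it's worth, a minimal sketch of that context creation looks something like this (assuming Mahout's Spark bindings are on the classpath; the master URL and app name are placeholders):

```scala
import org.apache.mahout.math.drm.DistributedContext
import org.apache.mahout.sparkbindings._

// A minimal sketch: mahoutSparkContext wraps SparkContext creation and, by
// default, locates the Mahout jars to ship to the executors -- this is the
// step that shells out to `mahout -spark classpath`, hence the error above
// when the environment isn't visible to the process.
object SamsaraContextExample extends App {
  implicit val sdc: DistributedContext = mahoutSparkContext(
    masterUrl = "local[2]",             // placeholder master URL
    appName = "mahout-samsara-example"  // placeholder app name
  )

  // ... distributed Samsara operations go here ...

  sdc.close()
}
```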


go canal
2015-10-05 02:27:35 UTC
Thank you very much for the help. I will try Spark 1.4.
I would like to try distributed matrix multiplication, but I am not sure whether sample code is available. I am very new to this stack. thanks, canal


Pat Ferrel
2015-10-06 20:09:29 UTC
Linear algebra is exactly what Mahout Samsara is about. In these docs, "in-core" means in-memory and "out-of-core" means distributed: http://mahout.apache.org/users/environment/out-of-core-reference.html
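For the sample-code question asked earlier, a minimal Samsara sketch of distributed matrix multiplication would look roughly like this (assuming an implicit distributed context is already in scope, as in the context-creation sketch above; the matrix values are made up):

```scala
import org.apache.mahout.math.scalabindings._
import org.apache.mahout.math.drm._
import org.apache.mahout.math.drm.RLikeDrmOps._

// Parallelize a small in-core matrix into a DRM (distributed row matrix).
val inCoreA = dense((1, 2, 3), (4, 5, 6))
val drmA = drmParallelize(inCoreA, numPartitions = 2)

// Distributed A' * A -- the Samsara optimizer plans this as Spark jobs.
val drmAtA = drmA.t %*% drmA

// Collect the (small) result back to an in-core matrix to inspect it.
val inCoreAtA = drmAtA.collect
```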

Dmitriy Lyubimov
2015-10-06 22:31:36 UTC
:) Strictly speaking, out-of-core is anything that is not in memory; e.g., sequential algorithms are generally also considered out-of-core.

BTW, I thought 0.11.x was for Spark 1.3? Or was that re-certified for 1.4 too?
go canal
2015-10-07 04:12:42 UTC
Thank you, Pat. I was having that issue when I was trying to do something like that.
Just curious: how should I prepare the data so that it satisfies drmDfsRead(path)? What is the DRM format, and how do I create a DRM file? thanks, canal


Dmitriy Lyubimov
2015-10-07 04:31:29 UTC
The DRM format is compatible at the persistence level with Mahout's MapReduce algorithms.

It is a Hadoop sequence file. The key can be one of:

-- a unique ordinal IntWritable, treated as a row number (i.e. nrow = max(int key)), or

-- Text, LongWritable, BytesWritable, or... I forget what else. These technically do not have to be unique, but they usually are. The number of operations available to matrices with "unnumbered" rows is therefore somewhat reduced. For example, expressions that imply a transposition as the final result are not possible, because it is impossible to map non-int row keys to int ordinal column indices.

The value of the DRM sequence file is always Mahout's VectorWritable. It is allowed to mix sparse and dense vector payloads.
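To make that concrete, here is a minimal sketch of creating and persisting a DRM via the Samsara DSL rather than writing the sequence file by hand (assuming an implicit distributed context is in scope; the path is a placeholder):

```scala
import org.apache.mahout.math.scalabindings._
import org.apache.mahout.math.drm._

// Parallelize an in-core matrix into a DRM, then persist it to DFS as a
// sequence file of IntWritable row keys and VectorWritable row vectors.
val inCoreA = dense((1, 2, 3), (4, 5, 6))
val drmA = drmParallelize(inCoreA, numPartitions = 2)
drmA.dfsWrite("/tmp/A.drm")  // placeholder path

// Reading it back is the drmDfsRead(path) call asked about above.
val drmB = drmDfsRead("/tmp/A.drm")
```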
go canal
2015-10-07 06:09:01 UTC
Thank you very much! thanks, canal


On Wednesday, October 7, 2015 12:31 PM, Dmitriy Lyubimov <***@gmail.com> wrote:


DRM format is compatible on persistence level with Mahout MapReduce
algorithms.

It is a Hadoop sequence file. The key is unique, can be one of

-- unique ordinal IntWriteable, treated as a row number (i.e. nrow=max(int
key)), or

-- Text, LongWritable, BytesWritable, or .. forget what else. This
technically do not have to be unique, but they usually are. The number of
operations available to matrices with "unnumbered" rows is therefore
somewhat reduced. For example, expressions that imply a transposition as a
final result, are not possible, because it is impossible to map non-int
rows to int ordinal indices of the columns.

The value of the DRM sequence file is always Mahout's VectorWritable. It is
allowed to mix sparse and dense vector payloads.
Post by go canal
thank you Pat. I was having that issue when I was trying to do something
like that.
Just curious, how should I prepare the data so that it can
satisfy drmDfsRead (path) ? DRM format and how to create the DRM file
? thanks, canal
      On Wednesday, October 7, 2015 4:09 AM, Pat Ferrel <
  Linear algebra stuff is what Mahout Samsara is all about. For these docs
in-core means in-memory and out of core means distributed
http://mahout.apache.org/users/environment/out-of-core-reference.html
Thank you very much for the help. I will try Spark 1.4.
I would like to try distributed matrix multiplication. not sure if there
are sample codes available. I am very new to this stack. thanks, canal
Mahout 0.11.0 is built on Spark 1.4 and so 1.5.1 is a bit unknown. I think
the Mahout Shell does not run on 1.5.1.
That may not be the error below, which is caused when Mahout tries to
create a set of jars to use in the Spark executors. The code runs `mahout
-spark classpath` to get these. So something is missing in your env in
Eclipse. Does `mahout -spark classpath` run in a shell, if so check to see
if you env matches in Eclipse.
Also what are you trying to do? I have some example Spark Context creation
code if you are using Mahout as a Library.
Exception in thread "main" java.lang.IllegalArgumentException: Unable to
read output from "mahout -spark classpath". Is SPARK_HOME defined?
I have SPARK_HOME defined in Eclipse as an environment variable with value
of /usr/local/spark-1.5.1.
What else I need to include/set ?
thanks, canal
Pat Ferrel
2015-10-07 15:47:19 UTC
There is also a reader for text-delimited files that creates a bi-directional dictionary for rows and columns while assigning the unique ordinal IntWritable keys Dmitriy mentions. The class is IndexedDataset. The Spark version of the companion object has a constructor that takes a PairRDD[(String, String)] of elements, and traits are provided that read text-delimited files. We are thinking about one that takes JSON or DataFrames, but in any case these are easy to construct. An IndexedDataset just wraps the DRM used in the linear algebra.
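For illustration, building one from string pairs might look roughly like this. This is a sketch only: the exact IndexedDatasetSpark factory signature varies across Mahout versions, and the user/item IDs are made up.

```scala
import org.apache.mahout.sparkbindings.indexeddataset.IndexedDatasetSpark

// Assuming a SparkContext `sc` is in scope: build an IndexedDataset from
// (rowID, columnID) string pairs. Ordinal int keys are assigned and
// bi-directional row/column dictionaries are kept alongside the DRM.
val pairs = sc.parallelize(Seq(
  ("user1", "itemA"),   // made-up IDs for illustration
  ("user1", "itemB"),
  ("user2", "itemA")
))
val indexed = IndexedDatasetSpark(pairs)(sc)

val drm = indexed.matrix         // the wrapped DRM
val rowDict = indexed.rowIDs     // String row ID <-> int ordinal
val colDict = indexed.columnIDs  // String column ID <-> int ordinal
```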

go canal
2015-10-09 02:37:03 UTC
I tried Spark 1.4.1 and got the same error. Then I saw the same error from the shell command, so I suspect it is an environment configuration problem.
I followed https://mahout.apache.org/general/downloads.html for the Mahout configuration.
So it seems to be a Spark configuration problem, I guess, although I can run the Spark examples without errors. I will need to figure out what is missing.
thanks, canal


Andrew Palumbo
2015-10-09 03:06:41 UTC
The Mahout 0.11.0 Shell requires Spark 1.3. Please try with Spark 1.3.1.