
Margus Roo –



mahout and recommenditembased

Posted on March 17, 2014 - March 29, 2014 by margusja

Let's imagine we have data about how users rated the products they have bought.

Each record has the form: userID, productID, rating.

With Mahout's recommenditembased job we can recommend new products to our users. Here is a simple command-line example of how to do this.

Let's create a file to hold our existing data about users, products and ratings.

vim intro.csv
1,101,5.0
1,102,3.0
1,103,2.5
2,101,2.0
2,102,2.5
2,103,5.0
2,104,2.0
3,101,2.5
3,104,4.0
3,105,4.5
3,107,5.0
4,101,5.0
4,103,3.0
4,104,4.5
4,106,4.0
5,101,4.0
5,102,3.0
5,103,2.0
5,104,4.0
5,105,3.5
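
If you would rather not open an editor, the same file can be written with a heredoc. This is only a convenience sketch; the awk check at the end is my own addition and simply confirms that every line has three comma-separated fields.

```shell
# Write the sample ratings to intro.csv without opening an editor.
cat > intro.csv <<'EOF'
1,101,5.0
1,102,3.0
1,103,2.5
2,101,2.0
2,102,2.5
2,103,5.0
2,104,2.0
3,101,2.5
3,104,4.0
3,105,4.5
3,107,5.0
4,101,5.0
4,103,3.0
4,104,4.5
4,106,4.0
5,101,4.0
5,102,3.0
5,103,2.0
5,104,4.0
5,105,3.5
EOF

# Sanity check: count the lines and any that do not have exactly three fields.
awk -F, 'NF != 3 { bad++ } END { print NR " lines, " bad+0 " malformed" }' intro.csv
# prints: 20 lines, 0 malformed
```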

Move it into HDFS (note that -moveFromLocal removes the local copy):
hdfs dfs -moveFromLocal intro.csv input/

We need an output directory in HDFS:
[speech@h14 ~]$ hdfs dfs -mkdir output

Now we can run the recommender, using Pearson correlation as the item similarity measure:
[speech@h14 ~]$ mahout/bin/mahout recommenditembased --input input/intro.csv --output output/recommendation -s SIMILARITY_PEARSON_CORRELATION

Our result will be in HDFS under output/recommendation. Each line holds a userID followed by a list of recommended itemID:score pairs.

[speech@h14 ~]$ hdfs dfs -cat output/recommendation/part-r-00000
1 [104:3.9258494]
3 [102:3.2698717]
4 [102:4.7433763]
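
The bracketed format is awkward to load into other tools, so here is a small sketch that flattens it into userID,itemID,score rows. The function name flatten_recs is hypothetical (it is not part of Mahout), and I assume the userID and the bracketed list are separated by whitespace, as in the output above.

```shell
# Hypothetical helper: turn "user [item:score,item:score]" lines
# into one "user,item,score" row per recommendation.
flatten_recs() {
  awk '{
    list = $2
    gsub(/\[|\]/, "", list)          # strip the surrounding brackets
    n = split(list, pairs, ",")      # one entry per recommended item
    for (i = 1; i <= n; i++) {
      split(pairs[i], kv, ":")       # kv[1] = itemID, kv[2] = score
      print $1 "," kv[1] "," kv[2]
    }
  }'
}

# Demo on the three lines shown above.
printf '1\t[104:3.9258494]\n3\t[102:3.2698717]\n4\t[102:4.7433763]\n' | flatten_recs
# prints:
# 1,104,3.9258494
# 3,102,3.2698717
# 4,102,4.7433763
```

Against the real job output it would be used as: hdfs dfs -cat output/recommendation/part-r-00000 | flatten_recs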

But what if we do not have ratings, only users and the items they have bought? We can still use Mahout's recommenditembased job by passing -b, which tells it to treat the input as boolean data.

[speech@h14 ~]$ vim boolean.csv
1,101
1,102
1,103
2,101
2,102
2,103
2,104
3,101
3,104
3,105
3,107
4,101
4,103
4,104
4,106
5,101
5,102
5,103
5,104
5,105
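
Since boolean.csv is just intro.csv without the rating column, it can also be derived instead of typed by hand. A minimal sketch, where drop_ratings is a hypothetical helper name of mine:

```shell
# Hypothetical helper: keep only the userID and itemID columns.
drop_ratings() {
  cut -d, -f1,2
}

# Demo on two sample rows.
printf '1,101,5.0\n2,104,2.0\n' | drop_ratings
# prints:
# 1,101
# 2,104

# intro.csv was moved into HDFS earlier, so the full file would be built with:
#   hdfs dfs -cat input/intro.csv | drop_ratings > boolean.csv
```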

[speech@h14 ~]$ hdfs dfs -moveFromLocal boolean.csv input/
[speech@h14 ~]$ mahout/bin/mahout recommenditembased --input /user/speech/input/boolean.csv --output output/boolean -b -s SIMILARITY_LOGLIKELIHOOD

[speech@h14 ~]$ hdfs dfs -cat /user/speech/output/boolean/part-r-00000
1 [104:1.0,105:1.0]
2 [106:1.0,105:1.0]
3 [103:1.0,102:1.0]
4 [105:1.0,102:1.0]
5 [106:1.0,107:1.0]
[speech@h14 ~]$

Posted in Machine Learning
