Alex,
Output of the $ bin/hadoop fsck / command after running an HBase data
insert command on a table is:
.....
.....
.....
.....
.....
/hbase/test/903188508/tags/info/4897652949308499876: Under replicated blk_-5193695109439554521_3133. Target Replicas is 3 but found 1 replica(s).
.
/hbase/test/903188508/tags/mapfiles/4897652949308499876/data: Under replicated blk_-1213602857020415242_3132. Target Replicas is 3 but found 1 replica(s).
.
/hbase/test/903188508/tags/mapfiles/4897652949308499876/index: Under replicated blk_3934493034551838567_3132. Target Replicas is 3 but found 1 replica(s).
.
/user/HadoopAdmin/hbase table.doc: Under replicated blk_4339521803948458144_1031. Target Replicas is 3 but found 2 replica(s).
.
/user/HadoopAdmin/input/bin.doc: Under replicated blk_-3661765932004150973_1030. Target Replicas is 3 but found 2 replica(s).
.
/user/HadoopAdmin/input/file01.txt: Under replicated blk_2744169131466786624_1001. Target Replicas is 3 but found 2 replica(s).
.
/user/HadoopAdmin/input/file02.txt: Under replicated blk_2021956984317789924_1002. Target Replicas is 3 but found 2 replica(s).
.
/user/HadoopAdmin/input/test.txt: Under replicated blk_-3062256167060082648_1004. Target Replicas is 3 but found 2 replica(s).
...
/user/HadoopAdmin/output/part-00000: Under replicated blk_8908973033976428484_1010. Target Replicas is 3 but found 2 replica(s).
Status: HEALTHY
Total size: 48510226 B
Total dirs: 492
Total files: 439 (Files currently being written: 2)
Total blocks (validated): 401 (avg. block size 120973 B) (Total open file blocks (not validated): 2)
Minimally replicated blocks: 401 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 399 (99.50124 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 1.3117207
Corrupt blocks: 0
Missing replicas: 675 (128.327 %)
Number of data-nodes: 2
Number of racks: 1
The filesystem under path '/' is HEALTHY
Please tell me what is wrong.
Aseem
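As a rough cross-check of the summary above (a Python sketch, not fsck's actual implementation; the variable names are made up for illustration), the reported figures agree with each other if the "Missing replicas" percentage is taken relative to the replicas that currently exist. The sketch also assumes that HDFS keeps at most one replica of a block per datanode, so with 2 datanodes a target of 3 cannot be reached.

```python
# Figures copied from the fsck summary in this message.
total_blocks = 401
avg_block_replication = 1.3117207
missing_replicas = 675
data_nodes = 2
target_replicas = 3  # per-file target shown on the "Under replicated" lines

# Replicas that exist right now (average replication times block count).
existing_replicas = round(total_blocks * avg_block_replication)

# Replicas the namenode would like to have in total.
wanted_replicas = existing_replicas + missing_replicas

# Assumption: the percentage fsck prints is missing / existing.
missing_pct = 100.0 * missing_replicas / existing_replicas

# Assumption: at most one replica of a block per datanode.
reachable_replicas = min(target_replicas, data_nodes)

print(existing_replicas, wanted_replicas, round(missing_pct, 3), reachable_replicas)
```

The wanted total of 1201 replicas also equals 399 blocks with a target of 3 plus 2 blocks with a target of 2 (399*3 + 2*2). That would fit files created while dfs.replication was still 3, although fsck does not say this directly.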
-----Original Message-----
From: Alex Loddengaard [mailto:alex-psgPW5cihnJWk0Htik3J/***@public.gmane.org]
Sent: Friday, April 10, 2009 11:04 PM
To: core-user-7ArZoLwFLBtd/SJB6HiN2Ni2O/***@public.gmane.org
Subject: Re: More Replication on dfs
Aseem,
How are you verifying that blocks are not being replicated? Have you run
fsck? *bin/hadoop fsck /*
I'd be surprised if replication really wasn't happening. Can you run fsck
and pay attention to "Under-replicated blocks" and "Mis-replicated blocks"?
In fact, can you just copy-paste the output of fsck?
Alex
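For anyone who wants to pull just those two counters out of a long fsck report, here is a minimal Python sketch. The sample text is copied from the output posted in this thread; the function name `block_counts` and the regular expression are illustrative assumptions about the summary layout, not part of Hadoop.

```python
import re

# Excerpt of an fsck summary, copied from the output posted in this thread.
FSCK_SUMMARY = """\
Minimally replicated blocks: 401 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 399 (99.50124 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
"""

def block_counts(summary):
    """Return {label: count} for every '<label> blocks: <count>' line."""
    pattern = re.compile(r"^[ \t]*(.+?) blocks:[ \t]+(\d+)", re.MULTILINE)
    return {label: int(count) for label, count in pattern.findall(summary)}

counts = block_counts(FSCK_SUMMARY)
print(counts["Under-replicated"], counts["Mis-replicated"])
```

Lines without a "blocks:" label, such as "Default replication factor", are simply skipped.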
On Thu, Apr 9, 2009 at 11:23 PM, Puri, Aseem wrote:
Hi,
I also tried the command $ bin/hadoop balancer, but the problem is still
the same.
Aseem
-----Original Message-----
Sent: Friday, April 10, 2009 11:18 AM
Subject: RE: More Replication on dfs
Hi Alex,
Thanks for sharing your knowledge. So far I have three machines, and I
need to check the behavior of Hadoop, so I want the replication factor to
be 2. I started my Hadoop server with replication factor 3. After that I
uploaded 3 files to run the word count program. But because all my files
are stored on one machine and also replicated to the other datanodes, my
MapReduce program takes its input from one datanode only. I want my files
to be on different datanodes so that I can check the functionality of
MapReduce properly.
Also, before starting my Hadoop server again with replication factor 2, I
formatted all datanodes and deleted all old data manually. Please suggest
what I should do now.
Regards,
Aseem Puri
-----Original Message-----
Sent: Friday, April 10, 2009 10:56 AM
Subject: Re: More Replication on dfs
To add to the question, how does one decide what the optimal replication
factor for a cluster is? For instance, what would be the appropriate
replication factor for a cluster consisting of 5 nodes?
Mithila
Alex Loddengaard wrote:
Did you load any files when replication was set to 3? If so, you'll have
<http://hadoop.apache.org/core/docs/r0.19.1/commands_manual.html#balance
http://hadoop.apache.org/core/docs/r0.19.1/hdfs_user_guide.html#Rebalancer
Note that most people run HDFS with a replication factor of 3. There have
been cases when clusters running with a replication of 2 discovered new
bugs, because replication is so often set to 3. That said, if you can do
it, it's probably advisable to run with a replication factor of 3 instead
of 2.
Alex
On Thu, Apr 9, 2009 at 9:56 PM, Puri, Aseem wrote:
Hi,
I am a new Hadoop user. I have a small cluster with 3 datanodes. In
hadoop-site.xml the value of the dfs.replication property is 2, but it is
still replicating data on 3 machines.
Please tell me why this is happening.
Regards,
Aseem Puri
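One reading of this thread, offered as an assumption rather than a confirmed diagnosis: dfs.replication only sets the default for files created afterwards, so files written while it was 3 keep a target of 3. The Python sketch below builds (but does not run) the argument list for the standard `hadoop fs -setrep` shell command, which changes the target of existing files. The helper name `setrep_command` is hypothetical, the `bin/hadoop` launcher path is taken from the commands quoted in the thread, and the `-w` flag is assumed to be available in the installed version.

```python
def setrep_command(path, replication, recursive=True, wait=False,
                   hadoop_bin="bin/hadoop"):
    """Build the argv for 'hadoop fs -setrep'; nothing is executed here."""
    if replication < 1:
        raise ValueError("replication must be at least 1")
    argv = [hadoop_bin, "fs", "-setrep"]
    if recursive:
        argv.append("-R")  # apply to every file under a directory
    if wait:
        argv.append("-w")  # wait until the new replication is reached
    argv += [str(replication), path]
    return argv

cmd = setrep_command("/", 2)
print(" ".join(cmd))
```

On a machine with Hadoop installed, the list could be passed to subprocess.run. Running fsck again afterwards would show whether the "Target Replicas" values changed.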