Posted on 2009-11-8 18:00
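By default, Hadoop writes a job's results to files. I have been working with HBase these past few days and wanted to combine the two so that the results are stored in HBase instead, so I modified Hadoop's WordCount code.
First, download HBase and Hadoop.
I won't go through the installation, there are plenty of guides online... basically all the same article (~_~!!!)
When you start HBase, Hadoop must be running as well, otherwise the code can never connect.
Write a class that wraps the HBase operations: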
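import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.BatchUpdate;
import org.apache.hadoop.hbase.util.Bytes;

public class MyClient {
    // Create the table with the given column families if it does not exist yet, then open it.
    public HTable createTableAndAddFamily(String tableName, String[] family) throws IOException {
        HBaseConfiguration config = new HBaseConfiguration();
        HBaseAdmin ha = new HBaseAdmin(config);
        if (!ha.tableExists(tableName)) {
            HTableDescriptor tableDec = new HTableDescriptor(tableName);
            for (int i = 0; i < family.length; i++) {
                // Use the i-th family name (the original concatenated the whole array).
                tableDec.addFamily(new HColumnDescriptor(family[i] + ":"));
            }
            ha.createTable(tableDec);
        }
        return new HTable(config, tableName);
    }

    // Write one cell: row `key`, column `family:lable`, value stored as bytes.
    public void addDateForTable(HTable table, String key, String family, String lable, String value) throws IOException {
        BatchUpdate batchUpdate = new BatchUpdate(key);
        batchUpdate.put(family + ":" + lable, Bytes.toBytes(value));
        table.commit(batchUpdate);
    }
}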
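To make the column naming a bit more concrete, here is a minimal sketch that uses an in-memory map as a stand-in for the table. It is not the HBase API: `ColumnNamingSketch`, `columnName`, `put` and `get` are hypothetical names made up for illustration. It only shows that a cell is addressed by a row key plus a `family:label` column name, and that an empty label gives a column such as `sum:`.

```java
import java.util.Map;
import java.util.TreeMap;

// In-memory stand-in for a table (assumption for illustration, NOT the HBase API).
class ColumnNamingSketch {
    // row key -> (column name "family:label" -> value)
    private final Map<String, Map<String, String>> rows = new TreeMap<>();

    // Build the full column name the same way addDateForTable does.
    static String columnName(String family, String label) {
        return family + ":" + label;
    }

    // Store a value in the cell addressed by (rowKey, family:label).
    void put(String rowKey, String family, String label, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .put(columnName(family, label), value);
    }

    // Read the cell back; returns null if the row or column is absent.
    String get(String rowKey, String family, String label) {
        Map<String, String> row = rows.get(rowKey);
        return row == null ? null : row.get(columnName(family, label));
    }

    public static void main(String[] args) {
        ColumnNamingSketch table = new ColumnNamingSketch();
        table.put("hello", "sum", "", "3");
        System.out.println(columnName("sum", ""));
        System.out.println(table.get("hello", "sum", ""));
    }
}
```

With an empty label, the count for the word `hello` is stored under the column `sum:` of the row `hello`, which mirrors how the wrapper class is called with `""` as the label.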
In WordCount, call the methods above during the reduce step:
// Nested inside the WordCount class; also import org.apache.hadoop.hbase.client.HTable.
public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
    MyClient my = new MyClient();

    public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
        int sum = 0;
        String[] family = {"word", "sum"};
        // Creates the table on first use; later calls just reopen it.
        HTable table = my.createTableAndAddFamily("wordCount", family);
        while (values.hasNext()) {
            sum += values.next().get();
        }
        // Row key is the word; the count goes into column "sum:" (empty label).
        my.addDateForTable(table, key.toString(), "sum", "", String.valueOf(sum));
        // Still emit the normal WordCount output as well.
        output.collect(key, new IntWritable(sum));
    }
}
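For anyone who wants to see the reduce logic without a running cluster, here is a rough sketch under some assumptions: a plain nested map stands in for the `wordCount` table, and `ReduceSketch` and `reduceWord` are hypothetical names, not part of Hadoop or HBase. It does the same two things as the reducer: sum the counts for one word, then store the total as a string under the `sum:` column of the row keyed by that word.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

// Stand-alone illustration of the reduce step (assumption: a map replaces the HBase table).
class ReduceSketch {
    // Sum all counts for one word and record the total in row `word`, column "sum:".
    static int reduceWord(String word, Iterator<Integer> counts,
                          Map<String, Map<String, String>> table) {
        int sum = 0;
        while (counts.hasNext()) {
            sum += counts.next();
        }
        table.computeIfAbsent(word, k -> new TreeMap<>())
             .put("sum:", String.valueOf(sum));
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> table = new TreeMap<>();
        reduceWord("hadoop", Arrays.asList(1, 1, 1).iterator(), table);
        reduceWord("hbase", Arrays.asList(1, 1).iterator(), table);
        System.out.println(table);
    }
}
```

After the two calls in `main`, the row `hadoop` holds `3` and the row `hbase` holds `2` in the `sum:` column, which is the shape of data the real job writes to the table.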
After that, rebuild the jar and run the Hadoop job; the data is added to HBase successfully.