HBase Quick Start: Standalone, Pseudo-Distributed, and Cluster Deployment, Plus Connecting from Java


2023-01-17 10:22:22

Understanding HBase Step by Step: An Introductory Tutorial

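What Is HBase?

HBase is a distributed, column-oriented NoSQL database designed for very large datasets. It stores its data on HDFS and relies on ZooKeeper for coordination between nodes.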

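Key HBase Concepts

Table: the container for data, made up of rows, column families, and columns.
Row: identified by a unique row key, much like a record in a relational database. Rows are stored sorted by row key.
Column family: a logical grouping within a table that keeps related columns together. Column families are declared when the table is created.
Column: a specific attribute within a column family, addressed as family:qualifier. HBase stores every value as an uninterpreted byte array, so numbers, text, booleans, and binary data must all be encoded to bytes by the client.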
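
One way to picture these concepts is as nested sorted maps: row key, then column family, then column qualifier, then value. The sketch below is a toy in-memory model written for this tutorial, not the HBase client API. The class name DataModelSketch and its methods are hypothetical, and it leaves out timestamps and cell versions, which real HBase also tracks.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy in-memory model of HBase's logical layout. This is NOT the HBase client API. */
public class DataModelSketch {

    // row key -> column family -> column qualifier -> value (raw bytes).
    // Real HBase compares row keys as raw bytes; for ASCII keys, String order matches.
    private final NavigableMap<String, NavigableMap<String, NavigableMap<String, byte[]>>> rows =
            new TreeMap<>();

    /** Stores one cell, creating the row and family maps on demand. */
    public void put(String rowKey, String family, String qualifier, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .computeIfAbsent(family, k -> new TreeMap<>())
            .put(qualifier, value.getBytes(StandardCharsets.UTF_8));
    }

    /** Returns the cell value decoded as UTF-8, or null if the cell is absent. */
    public String get(String rowKey, String family, String qualifier) {
        NavigableMap<String, NavigableMap<String, byte[]>> row = rows.get(rowKey);
        if (row == null) return null;
        NavigableMap<String, byte[]> cells = row.get(family);
        if (cells == null) return null;
        byte[] value = cells.get(qualifier);
        return value == null ? null : new String(value, StandardCharsets.UTF_8);
    }

    /** Returns the row keys in stored (sorted) order. */
    public List<String> rowKeys() {
        return new ArrayList<>(rows.keySet());
    }

    public static void main(String[] args) {
        DataModelSketch table = new DataModelSketch();
        table.put("row2", "cf1", "col1", "b");
        table.put("row10", "cf1", "col1", "a");
        table.put("row1", "cf1", "col1", "value1");

        System.out.println(table.get("row1", "cf1", "col1")); // value1
        System.out.println(table.rowKeys());                  // [row1, row10, row2]
    }
}
```

Note that row10 sorts before row2 because keys are compared character by character rather than numerically. This is why numeric row keys are often zero-padded.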

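HBase Deployment

Standalone Deployment

Step 1: Download an HBase release.
Step 2: Extract the archive into the directory of your choice.
Step 3: Set JAVA_HOME in conf/hbase-env.sh.
Step 4: From the HBase installation directory, run bin/start-hbase.sh. In standalone mode this one command starts the master, a region server, and an embedded ZooKeeper inside a single JVM, so there is no separate region server to start.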

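Pseudo-Distributed Deployment

This mode still uses a single machine, but the master, region server, and ZooKeeper each run as a separate process instead of sharing one JVM.
Step 5: Set hbase.cluster.distributed to true in conf/hbase-site.xml, then start HBase with bin/start-hbase.sh. To run extra daemons on the same host, use bin/local-master-backup.sh start 2 for a backup master and bin/local-regionservers.sh start 2 for an additional region server; the numeric argument is a port offset.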

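Cluster Deployment

A fully distributed deployment runs HBase across multiple machines.
Step 6: Download and extract HBase on every machine, and give all nodes the same configuration: hbase.cluster.distributed set to true, hbase.rootdir pointing at the shared HDFS, and hbase.zookeeper.quorum listing the ZooKeeper hosts.
Step 7: On one machine, start the HBase master with bin/hbase-daemon.sh start master.
Step 8: On each of the other machines, start a region server with bin/hbase-daemon.sh start regionserver.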

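Connecting to an HBase Cluster from Java

Step 9: Add the HBase client dependency (the hbase-client artifact) to your project.
Step 10: Create an HBase configuration object and set the ZooKeeper connection details.
Step 11: Open a connection to the cluster and obtain an admin object.
Step 12: Create a table with at least one column family, then write data to it.
Step 13: Read the data back from the table.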

Example Code

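import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseExample {

    public static void main(String[] args) throws Exception {
        // Build the client configuration and point it at ZooKeeper
        Configuration config = HBaseConfiguration.create();
        config.set("hbase.zookeeper.quorum", "localhost");
        config.set("hbase.zookeeper.property.clientPort", "2181");

        TableName tableName = TableName.valueOf("test_table");
        byte[] family = Bytes.toBytes("cf1");
        byte[] qualifier = Bytes.toBytes("col1");

        // Uses the HBase 2.x client API; try-with-resources closes the connection and admin
        try (Connection connection = ConnectionFactory.createConnection(config);
             Admin admin = connection.getAdmin()) {

            // Create the table with column family cf1 if it does not exist yet.
            // A table needs at least one column family before data can be written.
            if (!admin.tableExists(tableName)) {
                TableDescriptor descriptor = TableDescriptorBuilder.newBuilder(tableName)
                        .setColumnFamily(ColumnFamilyDescriptorBuilder.of("cf1"))
                        .build();
                admin.createTable(descriptor);
            }

            try (Table table = connection.getTable(tableName)) {
                // Write one cell: row1, cf1:col1 = value1
                Put put = new Put(Bytes.toBytes("row1"));
                put.addColumn(family, qualifier, Bytes.toBytes("value1"));
                table.put(put);

                // Read the cell back and print it
                Get get = new Get(Bytes.toBytes("row1"));
                Result result = table.get(get);
                System.out.println(Bytes.toString(result.getValue(family, qualifier)));
            }
        }
    }
}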