Migrating BTLike: Initializing MySQL-to-Elasticsearch Sync with go-mysql-elasticsearch (Illustrated Tutorial)

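Earlier we published several tutorials on setting up BTLike:

BTLike Golang crawler, LNMP panel, PHP front end: complete illustrated tutorial

BTLIKE PHP front-end pages, installation and download: illustrated tutorial

Installing and configuring the BTLike BT search engine on Vultr: illustrated tutorial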

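When migrating the environment, however, Elasticsearch needs to be backed up. In our case there is no Elasticsearch to back up from, so the only option is to rebuild the index by syncing from the database.

After searching through a lot of material online, we found that this can only be done through a middleware tool, and in the end we chose go-mysql-elasticsearch (written by a Chinese developer). The illustrated tutorial follows.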

go-mysql-elasticsearch project page: https://github.com/siddontang/go-mysql-elasticsearch

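This middleware is written in Golang, so a Golang environment has to be prepared first.

Installing the Golang environment on CentOS 6.x with yum

We will not repeat the detailed installation and initialization steps here; if you are unsure, see the earlier BTLike setup tutorials.

Recovery method: https://jiloc.com/42711.html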


Because of the unusual structure of BTLike's Elasticsearch index, go-mysql-elasticsearch cannot restore the data!

The rest of this article is only a usage example of go-mysql-elasticsearch!

Installing go-mysql-elasticsearch

yum install go
go get github.com/tools/godep
go get github.com/siddontang/go-mysql-elasticsearch
cd $GOPATH/src/github.com/siddontang/go-mysql-elasticsearch
make

Configuring MariaDB / MySQL

Notes from the official README (quoted as-is):

  • binlog format must be row.
  • binlog row image must be full for MySQL, you may lost some field data if you update PK data in MySQL with minimal or noblob binlog row image. MariaDB only supports full row image.
  • Can not alter table format at runtime.
  • MySQL table which will be synced must have a PK(primary key), multi columns PK is allowed now, e,g, if the PKs is (a, b), we will use “a:b” as the key. The PK data will be used as “id” in Elasticsearch.
  • You should create the associated mappings in Elasticsearch first, I don’t think using the default mapping is a wise decision, you must know how to search accurately.
  • mysqldump must exist in the same node with go-mysql-elasticsearch, if not, go-mysql-elasticsearch will try to sync binlog only.
  • Don’t change too many rows at same time in one SQL.
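The composite primary key rule in the notes above can be illustrated with a small sketch. The function name `es_document_id` is our own, chosen for illustration; the real tool is written in Go and may format values differently in edge cases, so treat this only as a picture of the "a:b" rule.

```python
def es_document_id(pk_values):
    """Join primary key column values with ':' to form a document id.

    Mirrors the rule quoted above: a PK of (a, b) becomes "a:b".
    Hypothetical helper, not part of go-mysql-elasticsearch.
    """
    return ":".join(str(value) for value in pk_values)


# A single-column PK is used unchanged; a two-column PK is joined with ':'.
print(es_document_id([42]))
print(es_document_id([7, "abc"]))
```

Because the id is derived from the primary key, a table without a primary key gives the tool nothing to build an id from, which is why the notes require one.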

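Edit the database configuration file (default location: /etc/my.cnf) and make sure it contains the following settings:

  1. Enable the binary log (log-bin)
  2. binlog_format must be row
  3. Set server_id to 1001
  4. binlog-row-image must be FULL

The snippet looks like this: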

[mysqld]
log-bin=mysql-bin
binlog_format=row
server_id=1001
binlog-row-image=full
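Before restarting, it can help to double-check that the file really contains these four settings. The following is a minimal sketch, assuming the file is simple enough for Python's `configparser` to read (no `!include` directives, no inline comments); `check_binlog_settings` is a hypothetical helper of our own, not part of MySQL or go-mysql-elasticsearch.

```python
import configparser

# Settings whose value matters; log-bin and server_id only need to be present.
REQUIRED_VALUES = {"binlog_format": "row", "binlog_row_image": "full"}


def check_binlog_settings(cnf_text):
    """Return a list of problems found in the [mysqld] section of cnf_text."""
    parser = configparser.ConfigParser(
        allow_no_value=True, strict=False, interpolation=None
    )
    parser.read_string(cnf_text)
    if not parser.has_section("mysqld"):
        return ["missing [mysqld] section"]

    # MySQL accepts '-' and '_' interchangeably in option names, so normalize.
    options = {
        name.replace("-", "_"): (value or "").strip()
        for name, value in parser.items("mysqld")
    }

    problems = []
    if "log_bin" not in options:
        problems.append("log-bin is not set")
    if not options.get("server_id"):
        problems.append("server_id is not set")
    for name, expected in REQUIRED_VALUES.items():
        if options.get(name, "").lower() != expected:
            problems.append(f"{name} should be {expected}")
    return problems


if __name__ == "__main__":
    with open("/etc/my.cnf") as handle:
        for problem in check_binlog_settings(handle.read()):
            print(problem)
```

An empty result only means the file looks right; the running server picks up the values after the restart.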

After changing the configuration, remember to restart the MySQL service:

/etc/init.d/mysql restart

Configuring go-mysql-elasticsearch:

vi etc/river.toml
# MySQL address, user and password
# user must have replication privilege in MySQL.
my_addr = "127.0.0.1:3306"
my_user = "root"
my_pass = "your-database-password"

# Elasticsearch address
es_addr = "ELASTICSEARCH_IP:9200"

# Path to store data, like master.info, and dump MySQL data
data_dir = "./var"

# Inner Http status address
stat_addr = "127.0.0.1:12800"

# pseudo server id like a slave
server_id = 1001        # this ID must match the server_id set in my.cnf above

# mysql or mariadb
flavor = "mysql"

# mysqldump execution path
# if not set or empty, ignore mysqldump.
mysqldump = "mysqldump"

# MySQL data source
[[source]]
schema = "torrent"      # database name

# Only below tables will be synced into Elasticsearch.
# "test_river_[0-9]{4}" is a wildcard table format, you can use it if you have many sub tables, like table_0000 - table_1023
# I don't think it is necessary to sync all tables in a database.
# These are the tables that need to be indexed
tables = ["torrent[0-9]{1}","torrenta","torrentb","torrentc","torrentd","torrente","torrentf"]

# Below is for special rule mapping
[[rule]]
schema = "torrent"         # database name
table = "torrent[0-9]{1}"  # table name
index = "torrent"          # index name; keep it the same as the one the program created earlier

[[rule]]
schema = "torrent"
table = "torrenta"
index = "torrent"

[[rule]]
schema = "torrent"
table = "torrentb"
index = "torrent"

[[rule]]
schema = "torrent"
table = "torrentc"
index = "torrent"

[[rule]]
schema = "torrent"
table = "torrentd"
index = "torrent"

[[rule]]
schema = "torrent"
table = "torrente"
index = "torrent"

[[rule]]
schema = "torrent"
table = "torrentf"
index = "torrent"
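The [[rule]] blocks in this configuration are nearly identical: every table entry in the torrent database is mapped to the same torrent index. As a rough sketch, they could be generated rather than typed by hand. `render_rules` is a hypothetical helper written for this article, and it only produces the three keys used in the configuration here.

```python
def render_rules(schema, tables, index):
    """Render one [[rule]] block per table entry, all pointing at one index."""
    blocks = []
    for table in tables:
        blocks.append(
            "[[rule]]\n"
            f'schema = "{schema}"\n'
            f'table = "{table}"\n'
            f'index = "{index}"\n'
        )
    return "\n".join(blocks)


# Same table entries as in the tables = [...] line of the configuration:
# one wildcard entry for torrent0..torrent9 plus torrenta..torrentf.
tables = ["torrent[0-9]{1}"] + [f"torrent{letter}" for letter in "abcdef"]
print(render_rules("torrent", tables, "torrent"))
```

The printed text can be pasted under the [[source]] section of river.toml.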

Run the sync command:

cd $GOPATH/src/github.com/siddontang/go-mysql-elasticsearch
./bin/go-mysql-elasticsearch -config=./etc/river.toml

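Depending on the size of the database, this step can take a long time. You can shut down the crawler programs while the sync is running.

It is a good idea to run this command inside screen.

Not familiar with screen? See: Linux screen basic usage, illustrated tutorial

From another terminal, run the following command to check progress: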

curl 127.0.0.1:12800/stat

server_current_binlog:(mysql-bin.000020, 343)
read_binlog:(mysql-bin.000018, 0)
insert_num:99397
update_num:0
delete_num:0

If the numbers above do not change, check the configuration options and files.
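To tell whether the numbers are moving without comparing them by eye, two snapshots of the stat output can be compared. This is a minimal sketch that assumes the output keeps the `key:value` layout shown in the sample above; `parse_stat`, `made_progress` and `fetch_stat` are hypothetical helpers of our own, and the URL is simply the stat_addr from river.toml.

```python
import re
import urllib.request

# Matches values such as "(mysql-bin.000018, 0)".
BINLOG_RE = re.compile(r"\((?P<file>[^,]+),\s*(?P<pos>\d+)\)")


def parse_stat(text):
    """Parse 'key:value' lines into a dict; binlog positions become tuples."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue  # skip blank or unexpected lines
        value = value.strip()
        match = BINLOG_RE.fullmatch(value)
        if match:
            stats[key] = (match.group("file"), int(match.group("pos")))
        elif value.isdigit():
            stats[key] = int(value)
        else:
            stats[key] = value
    return stats


def made_progress(before, after):
    """True if the read position or any row counter changed between snapshots."""
    keys = ("read_binlog", "insert_num", "update_num", "delete_num")
    return any(before.get(key) != after.get(key) for key in keys)


def fetch_stat(url="http://127.0.0.1:12800/stat"):
    """Fetch and parse the stat page (same address as the curl command)."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return parse_stat(response.read().decode())
```

Calling `fetch_stat()` twice, a minute or so apart, and passing both results to `made_progress` gives a quick yes/no answer on whether the sync is advancing.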

 
