本篇博客源自PostgreSQL中文社区李传成大神的博文,链接:http://www.postgres.cn/news/viewone/1/417
作者在此基础上做了案例模拟和参数讲解。

一、XLOGminer/WalMiner介绍

1、简单介绍

WalMiner是从PostgreSQL的WAL(write ahead logs)日志中解析出执行的SQL语句的工具,并能生成出对应的undo SQL语句。

2、版本控制

支持范围:支持所有版本HGDB

PG9.5.0之前的版本没有做过测试

PG9.5.0~PG10.X版本使用WalMiner_10_X

PG11之后的版本使用WalMiner_11_X

10.0.1

1.WalMiner支持解析minimal级别以上的任何wal日志级别。

2.无需将表设置为IDENTITY FULL模式。

3.增加对系统表修改的wal记录的解析。

4.他库解析时relfilenode不匹配的bug修改

11.0.1

1.WalMiner支持解析minimal级别以上的任何wal日志级别。

2.无需将表设置为IDENTITY FULL模式。

3.增加对系统表修改的wal记录的解析。

4.他库解析时relfilenode不匹配的bug修改

5.pg11支持性修改

 

二、测试实例

1、检查版本

查看本机环境数据库版本:

数据库版本:

highgo=# select kernel_version();

                              kernel_version                              

---------------------------------------------------------------------------

 HighGo Database V4.7 Enterprise Edition Release 4.7.7 - 64-bit Production

(1 row)

注意:由于本次测试环境是在瀚高数据库V4.7.7,故使用的工具名称为XLOGminer

 

2、参数设置

创建归档路径:

mkdir /hgdbbak/archive/ -p

chown highgo:highgo /hgdbbak/archive/ -R

 

必须设置如下三个参数,据库日志级别配置需要大于minimal

wal_level minimal, archive, hot_standby, or logical ,若想做最完整的日志挖掘,建议设置为logical。

 

alter system set wal_level = 'archive';

alter system set archive_mode = on;

alter system set archive_directory = '/hgdbbak/archive/';

修改后重启数据库生效。

 

3、创建xlogminer/walminer

瀚高数据库中已经默认添加该工具,只需直接执行如下SQL创建即可。

 

highgo=#  create extension xlogminer;

CREATE EXTENSION

注意HGDBV5之后为:

create extension walminer;

 

 

4、创建测试表

create table test (id int,namevarchar(2));

 

test=# create table gt_test1 (id int,passwd text,num int);

插入随机1000条数据:

test=# insert into gt_test1 select (random()*(100-1)+1)::integer,md5((random()*(10-1)+1)::text),generate_series(3001,4000);

INSERT 0 1000

test=# select * from gt_test1 limit 10;

 id |              passwd              | num 

----+----------------------------------+------

 68 | 7de6438f0d592accb9ad6c1eb809cda5 | 3001

 78 | dd46b18d04456faf136bf2b978f3c422 | 3002

 82 | 84bd8a226b46b967bf5fd3bf03f07ece | 3003

  7 | e98368efa32f33a1ca1aad9b2788cfe2 | 3004

 99 | 7447d95101f5770cdb3ae7f7240dc2dd | 3005

 83 | 07426b161bbaf24ec9bdbde478260998 | 3006

 65 | 3c45e107402b835759b3d6fad38edec4 | 3007

 26 | fdd53b61ec0b0d44d4317804afdbd18d | 3008

 75 | f9fb9392adb0cc7309376b4868723293 | 3009

 21 | 3a0cccbc42549d65d5a130647f079882 | 3010

(10 rows)

【注意1:早期版本需要:alter table gt_test1 replica identity FULL;】

 

5、产生归档并解析

切换归档:

select pg_switch_xlog();

 

新生成的归档日志:

[root@tqz ~]# ls -l /hgdbbak/archive/

total 16384

-rw-------. 1 highgo highgo 16777216 Jun 28 14:42 000000010000000000000001

 

把id=10的数据删除:

test=# select * from gt_test1 where id=10;

 id |              passwd              | num 

----+----------------------------------+------

 10 | 178bfd0749bb00f2ca6455d064ec1274 | 3070

 10 | d1b53c5689c9729b468a1b5c6733b4af | 3079

 10 | 87524ef3ec7d510c15e83dd64d5a4410 | 3091

 10 | 189a9a572223455c2f5b946e3d7f8532 | 3290

 10 | a716d4ad9316cd0b5a3d370a5957e497 | 3328

 10 | 72b03cd9cdc81cb049584ed643f565a3 | 3454

 10 | e71f98a808364dd90e2ef5929e8fef6e | 3520

 10 | c069fd0695e37d92eba0ed053e0d4c78 | 3655

 10 | 0a4c5857d36b1d6adecf900046418638 | 3682

 10 | 14124b2144de04227e798f148d93888a | 3692

 10 | 82fac12f7234eb9f9df4a70e3d0e2019 | 3696

 10 | 10a0dcdc65fb9e2edcf2a1543edcf922 | 3788

 10 | 6a17205d7aa1b7a82878e4b005bdd75c | 3887

(13 rows)



test=# delete from gt_test1 where id=10;

DELETE 13

test=# select * from gt_test1 where id=10;

 id | passwd | num

----+--------+-----

(0 rows)

 

 

切换归档:

select pg_switch_xlog();

 

 

[root@tqz ~]# ls -l /hgdbbak/archive/

total 32768

-rw-------. 1 highgo highgo 16777216 Jun 28 14:42 000000010000000000000001

-rw-------. 1 highgo highgo 16777216 Jun 28 14:51 000000010000000000000002

 

 

修改 id=13的数据num统一为1000:

test=# select * from gt_test1 where id=13;

 id |              passwd              | num 

----+----------------------------------+------

 13 | 20ebc85db9f92bc104d47ee7036d699e | 3057

 13 | 8f29b6896c36c4a879bb276d104194bd | 3071

 13 | 7d0472e1be168dafadc78d3c30c3d421 | 3168

 13 | 5e20989e3befa739b59b2e7c54967123 | 3208

 13 | 770618db9dd4a7e618c61bb7a9072bde | 3228

 13 | a40bc406d4f5eb68d5f7e867d3ddc816 | 3261

 13 | fe6168a6b3e3019399a6d9ccddee796c | 3329

 13 | 51a81b5de6ee618317ed18a462e8725e | 3361

 13 | e8ae6793a7457b33112e30049a7f9362 | 3563

 13 | 1d1e542d1c6187597df637a69be94a45 | 3876

 13 | b37fc81040d688470ad7c14c78c2ee4c | 3878

 13 | 365b61f83ca6ef087fd56b4229e3e3fe | 3993

(12 rows)



test=# update gt_test1 set num=1000 where id=13;

UPDATE 12

test=# select * from gt_test1 where id=13;

 id |              passwd              | num 

----+----------------------------------+------

 13 | 20ebc85db9f92bc104d47ee7036d699e | 1000

 13 | 8f29b6896c36c4a879bb276d104194bd | 1000

 13 | 770618db9dd4a7e618c61bb7a9072bde | 1000

 13 | fe6168a6b3e3019399a6d9ccddee796c | 1000

 13 | 1d1e542d1c6187597df637a69be94a45 | 1000

 13 | 7d0472e1be168dafadc78d3c30c3d421 | 1000

 13 | 5e20989e3befa739b59b2e7c54967123 | 1000

 13 | a40bc406d4f5eb68d5f7e867d3ddc816 | 1000

 13 | 51a81b5de6ee618317ed18a462e8725e | 1000

 13 | e8ae6793a7457b33112e30049a7f9362 | 1000

 13 | b37fc81040d688470ad7c14c78c2ee4c | 1000

 13 | 365b61f83ca6ef087fd56b4229e3e3fe | 1000

(12 rows)

切换归档:

select pg_switch_xlog();

 

[root@tqz ~]# ls -l /hgdbbak/archive/

total 49152

-rw-------. 1 highgo highgo 16777216 Jun 28 14:42 000000010000000000000001

-rw-------. 1 highgo highgo 16777216 Jun 28 14:51 000000010000000000000002

-rw-------. 1 highgo highgo 16777216 Jun 28 14:52 000000010000000000000003

 

 

添加WAL日志:

--直接添加文件

select xlogminer_xlogfile_add('/hgdbbak/archive/000000010000000000000001')  

test=# select xlogminer_xlogfile_add('/hgdbbak/archive/000000010000000000000001');

注意:  L0011: Get data dictionary from current database.

 xlogminer_xlogfile_add

------------------------

 1 file add success

(1 row)

 

test=# select xlogminer_xlogfile_list();

           xlogminer_xlogfile_list          

---------------------------------------------

 (/hgdbbak/archive/000000010000000000000001)

(1 row)

--直接添加目录

select xlogminer_xlogfile_add('/hgdbbak/archive') 

test=# select xlogminer_xlogfile_add('/hgdbbak/archive') ;

 xlogminer_xlogfile_add

------------------------

 2 file add success

(1 row)

 

test=# select xlogminer_xlogfile_list();

           xlogminer_xlogfile_list          

---------------------------------------------

 (/hgdbbak/archive/000000010000000000000001)

 (/hgdbbak/archive/000000010000000000000002)

 (/hgdbbak/archive/000000010000000000000003)

(3 rows)

 

执行解析

全日志解析:

test=# select xlogminer_start('null','null',0,0);

 xlogminer_start 

------------------

 xlogminer start!

(1 row)

全解析查询:

select * from xlogminer_contents;

解析查询1:

test=# select xid,timestamptz,op_undo from xlogminer_contents limit 10;

 xid  |          timestamptz          |                                                              op_undo                                                            

 

------+-------------------------------+---------------------------------------------------------------------------------------------------------------------------------

--

 1837 | 2019-06-28 14:38:41.673117+08 | DELETE FROM "public"."gt_test1" WHERE "id"=68 AND "passwd"='7de6438f0d592accb9ad6c1eb809cda5' AND "num"=3001 AND ctid = '(0,1)';

 1837 | 2019-06-28 14:38:41.673117+08 | DELETE FROM "public"."gt_test1" WHERE "id"=78 AND "passwd"='dd46b18d04456faf136bf2b978f3c422' AND "num"=3002 AND ctid = '(0,2)';

 1837 | 2019-06-28 14:38:41.673117+08 | DELETE FROM "public"."gt_test1" WHERE "id"=82 AND "passwd"='84bd8a226b46b967bf5fd3bf03f07ece' AND "num"=3003 AND ctid = '(0,3)';

 1837 | 2019-06-28 14:38:41.673117+08 | DELETE FROM "public"."gt_test1" WHERE "id"=7 AND "passwd"='e98368efa32f33a1ca1aad9b2788cfe2' AND "num"=3004 AND ctid = '(0,4)';

 1837 | 2019-06-28 14:38:41.673117+08 | DELETE FROM "public"."gt_test1" WHERE "id"=99 AND "passwd"='7447d95101f5770cdb3ae7f7240dc2dd' AND "num"=3005 AND ctid = '(0,5)';

 1837 | 2019-06-28 14:38:41.673117+08 | DELETE FROM "public"."gt_test1" WHERE "id"=83 AND "passwd"='07426b161bbaf24ec9bdbde478260998' AND "num"=3006 AND ctid = '(0,6)';

 1837 | 2019-06-28 14:38:41.673117+08 | DELETE FROM "public"."gt_test1" WHERE "id"=65 AND "passwd"='3c45e107402b835759b3d6fad38edec4' AND "num"=3007 AND ctid = '(0,7)';

 1837 | 2019-06-28 14:38:41.673117+08 | DELETE FROM "public"."gt_test1" WHERE "id"=26 AND "passwd"='fdd53b61ec0b0d44d4317804afdbd18d' AND "num"=3008 AND ctid = '(0,8)';

 1837 | 2019-06-28 14:38:41.673117+08 | DELETE FROM "public"."gt_test1" WHERE "id"=75 AND "passwd"='f9fb9392adb0cc7309376b4868723293' AND "num"=3009 AND ctid = '(0,9)';

 1837 | 2019-06-28 14:38:41.673117+08 | DELETE FROM "public"."gt_test1" WHERE "id"=21 AND "passwd"='3a0cccbc42549d65d5a130647f079882' AND "num"=3010 AND ctid = '(0,10)'

;

(10 rows)

解析查询2:

test=# select xid,timestamptz,op_undo from xlogminer_contents limit 10;

 xid  |          timestamptz          |                                                              op_undo                                                             

------+-------------------------------+-----------------------------------------------------------------------------------------------------------------------------------

 1837 | 2019-06-28 14:38:41.673117+08 | DELETE FROM "public"."gt_test1" WHERE "id"=68 AND "passwd"='7de6438f0d592accb9ad6c1eb809cda5' AND "num"=3001 AND ctid = '(0,1)';

 1837 | 2019-06-28 14:38:41.673117+08 | DELETE FROM "public"."gt_test1" WHERE "id"=78 AND "passwd"='dd46b18d04456faf136bf2b978f3c422' AND "num"=3002 AND ctid = '(0,2)';

 1837 | 2019-06-28 14:38:41.673117+08 | DELETE FROM "public"."gt_test1" WHERE "id"=82 AND "passwd"='84bd8a226b46b967bf5fd3bf03f07ece' AND "num"=3003 AND ctid = '(0,3)';

 1837 | 2019-06-28 14:38:41.673117+08 | DELETE FROM "public"."gt_test1" WHERE "id"=7 AND "passwd"='e98368efa32f33a1ca1aad9b2788cfe2' AND "num"=3004 AND ctid = '(0,4)';

 1837 | 2019-06-28 14:38:41.673117+08 | DELETE FROM "public"."gt_test1" WHERE "id"=99 AND "passwd"='7447d95101f5770cdb3ae7f7240dc2dd' AND "num"=3005 AND ctid = '(0,5)';

 1837 | 2019-06-28 14:38:41.673117+08 | DELETE FROM "public"."gt_test1" WHERE "id"=83 AND "passwd"='07426b161bbaf24ec9bdbde478260998' AND "num"=3006 AND ctid = '(0,6)';

 1837 | 2019-06-28 14:38:41.673117+08 | DELETE FROM "public"."gt_test1" WHERE "id"=65 AND "passwd"='3c45e107402b835759b3d6fad38edec4' AND "num"=3007 AND ctid = '(0,7)';

 1837 | 2019-06-28 14:38:41.673117+08 | DELETE FROM "public"."gt_test1" WHERE "id"=26 AND "passwd"='fdd53b61ec0b0d44d4317804afdbd18d' AND "num"=3008 AND ctid = '(0,8)';

 1837 | 2019-06-28 14:38:41.673117+08 | DELETE FROM "public"."gt_test1" WHERE "id"=75 AND "passwd"='f9fb9392adb0cc7309376b4868723293' AND "num"=3009 AND ctid = '(0,9)';

 1837 | 2019-06-28 14:38:41.673117+08 | DELETE FROM "public"."gt_test1" WHERE "id"=21 AND "passwd"='3a0cccbc42549d65d5a130647f079882' AND "num"=3010 AND ctid = '(0,10)';

(10 rows)



解析查询3:

test=# select xid,timestamptz,op_text from xlogminer_contents limit 10;

 xid  |          timestamptz          |                                                   op_text                                                   

------+-------------------------------+--------------------------------------------------------------------------------------------------------------

 1837 | 2019-06-28 14:38:41.673117+08 | INSERT INTO "public"."gt_test1"("id", "passwd", "num") VALUES(68, '7de6438f0d592accb9ad6c1eb809cda5', 3001);

 1837 | 2019-06-28 14:38:41.673117+08 | INSERT INTO "public"."gt_test1"("id", "passwd", "num") VALUES(78, 'dd46b18d04456faf136bf2b978f3c422', 3002);

 1837 | 2019-06-28 14:38:41.673117+08 | INSERT INTO "public"."gt_test1"("id", "passwd", "num") VALUES(82, '84bd8a226b46b967bf5fd3bf03f07ece', 3003);

 1837 | 2019-06-28 14:38:41.673117+08 | INSERT INTO "public"."gt_test1"("id", "passwd", "num") VALUES(7, 'e98368efa32f33a1ca1aad9b2788cfe2', 3004);

 1837 | 2019-06-28 14:38:41.673117+08 | INSERT INTO "public"."gt_test1"("id", "passwd", "num") VALUES(99, '7447d95101f5770cdb3ae7f7240dc2dd', 3005);

 1837 | 2019-06-28 14:38:41.673117+08 | INSERT INTO "public"."gt_test1"("id", "passwd", "num") VALUES(83, '07426b161bbaf24ec9bdbde478260998', 3006);

 1837 | 2019-06-28 14:38:41.673117+08 | INSERT INTO "public"."gt_test1"("id", "passwd", "num") VALUES(65, '3c45e107402b835759b3d6fad38edec4', 3007);

 1837 | 2019-06-28 14:38:41.673117+08 | INSERT INTO "public"."gt_test1"("id", "passwd", "num") VALUES(26, 'fdd53b61ec0b0d44d4317804afdbd18d', 3008);

 1837 | 2019-06-28 14:38:41.673117+08 | INSERT INTO "public"."gt_test1"("id", "passwd", "num") VALUES(75, 'f9fb9392adb0cc7309376b4868723293', 3009);

 1837 | 2019-06-28 14:38:41.673117+08 | INSERT INTO "public"."gt_test1"("id", "passwd", "num") VALUES(21, '3a0cccbc42549d65d5a130647f079882', 3010);

(10 rows)

结束walminer操作

--该函数作用为释放内存,结束日志分析,该函数没有参数

test=# select xlogminer_stop();

 xlogminer_stop 

-----------------

 xlogminer stop!

(1 row)

 

三、从非WAL产生的数据库中执行WAL日志解析

1、生产库操作

生成数据字典

test=# select xlogminer_build_dictionary('/hgdbbak/other');

 xlogminer_build_dictionary

----------------------------

 Dictionary build success!

(1 row)

 

 

2、测试库测试

load数据字典

highgo=# select xlogminer_load_dictionary('/hgdbbak/other');

 xlogminer_load_dictionary

---------------------------

 Dictionary load success!

(1 row)

 

add wal日志文件

highgo=# select xlogminer_xlogfile_add('/hgdbbak/archive') ;

 xlogminer_xlogfile_add

------------------------

 3 file add success

(1 row)

 

highgo=#  select xlogminer_xlogfile_list();

           xlogminer_xlogfile_list          

---------------------------------------------

 (/hgdbbak/archive/000000010000000000000001)

 (/hgdbbak/archive/000000010000000000000002)

 (/hgdbbak/archive/000000010000000000000003)

(3 rows)

 

remove wal日志文件

highgo=# select xlogminer_xlogfile_remove('/hgdbbak/archive');

 xlogminer_xlogfile_remove

---------------------------

 3 file remove success

(1 row)

解析过程与二、5相同,不在赘述

highgo=#  select xlogminer_xlogfile_list();

错误:  L0016: Xlogfilelist has not been loaded or has been removed.

highgo=# select xlogminer_stop();

 xlogminer_stop 

-----------------

 xlogminer stop!

(1 row)

注意:xlogminer_contents是xlogminer自动生成的临时表,因此当session断开再重新进入或其他session中解析数据不可见。这么做主要是基于安全考虑。 如果希望保留解析结果,可利用create xxx as select * from xlogminer_contents;写入普通表中

 

 

 

 

四、相关解释

1、参数解释

select walminer_start(’START_TIMSTAMP’,’STOP_TIMESTAMP’,’START_XID’,’STOP_XID’)

---如果分析全部日志:

select walminer_start('null','null',0,0);

---将系统表修改结果输出到$PGDATA/walminer/temp下:

select walminer_start('null','null',0,0,true);

 

START_TIMESTAMP:指定输出结果中最早的记录条目,即从该时间开始输出分析数据;若该参数值为空,则以分析日志列表中最早数据开始输出;若该参数值指定时间没有包含在所分析xlog列表中,即通过分析发现全部早于该参数指定时间,则返回空值。

STOP_TIMESTAMP:指定数据结果中最晚的记录条目,即输出结果如果大于该时间,则停止分析,不需要继续输出;如果该参数值为空,则从START_TIMESTAMP开始的所有日志都进行分析和输出。

START_XID:作用与START_TIMESTAMP相同,指定开始的XID值;

STOP_XID:作用与STOP_TIMESTAMP相同,指定结束的XID值

注意:两组参数只能有一组为有效输入,否则报错。

2、使用限制

本版本只解析DML语句,不处理DDL语句

执行了删除表、truncate表、更改表的表空间、更改表字段的类型、vacuum full,这样的DDL语句后,发生DDL语句之前的此表相关的DML语句不会再被解析。

解析结果依赖于数据字典。(举例:创建表t1,所有者为user1,但是中间将所有者改为user2。那解析结果中,所有t1相关操作所有者都将标示为user2)

wal日志如果发生缺失,在缺失的wal日志中发生提交的数据,都不会在解析结果中出现

解析结果中undo字段的ctid属性是发生变更“当时”的值,如果因为vacuum等操作导致ctid发生变更,这个值将不准确。对于有可能存在重复行的数据,我们需要通过这个值确定undo对应的tuple条数,不代表可以直接执行该undo语句。

执行了表字段drop的DDL语句后,发生DDL语句之前的这个字段相关的值都会被解析为encode('AD976BC56F',hex)的形式,另外自定义类型也会解析为这种形式

只能解析与数据字典时间线一致的wal文件

不建议使用walminer解析大宗copy语句(在同一个事务中插入大量数据行)产生的wal日志,这会导致解析过程中的效率低下和内存占用过高

Logo

开放原子开发者工作坊旨在鼓励更多人参与开源活动,与志同道合的开发者们相互交流开发经验、分享开发心得、获取前沿技术趋势。工作坊有多种形式的开发者活动,如meetup、训练营等,主打技术交流,干货满满,真诚地邀请各位开发者共同参与!

更多推荐