Page 2 of 9

orzdba工具安装注意事项

orzdba是一个监控mysql性能的一个比较好用的perl脚本,是淘宝开源的小工具,下载地址http://code.taobao.org/p/orzdba/src/trunk/

配置过程中除了参照《orzdba工具使用说明》外,还有一些注意事项

 

我的安装环境是

Red Hat Enterprise Linux Server release 6.3 (Santiago)

5.6.22-log MySQL Community Server (GPL)

 

安装tcprstat前,先安装包

yum install glibc-static-2.12-1.80.el6.x86_64.rpm

 

参照

http://my.oschina.net/moooofly/blog/157063安装tcprstat

安装tcprstat过程中,除了前面的那个glibc-static外,还需要安装5个包

yum -y install automake

yum install bison

yum install flex

yum install patch

yum install make

配置tcprstat的权限

chown mysql:mysql -R tcprstat

ln -sf /mysqldata/soft/tcprstat/src/tcprstat /usr/bin/tcprstat

chown root:root /usr/bin/tcprstat

chmod u+s /usr/bin/tcprstat

ll /usr/bin/tcprstat

lrwxrwxrwx 1 root root 37 Feb 11 11:22 /usr/bin/tcprstat -> /mysqldata/soft/tcprstat/src/tcprstat

查看目录/mysqldata/soft/tcprstat/src

[root@open_mysql1 src]# ll

total 1996

-rw-r–r– 1 mysql mysql    3006 Feb 10 15:48 capture.c

-rw-r–r– 1 mysql mysql    1057 Feb 10 15:48 capture.h

-rw-r–r– 1 mysql mysql    4015 Feb 10 15:48 functions.c

-rw-r–r– 1 mysql mysql    1019 Feb 10 15:48 functions.h

-rw-r–r– 1 mysql mysql    4325 Feb 10 15:48 local-addresses.c

-rw-r–r– 1 mysql mysql    1065 Feb 10 15:48 local-addresses.h

-rw-r–r– 1 mysql mysql   41829 Feb 11 11:07 Makefile

-rw-r–r– 1 mysql mysql    1800 Feb 10 15:48 Makefile.am

-rw-r–r– 1 mysql mysql   47676 Feb 11 11:07 Makefile.in

-rw-r–r– 1 mysql mysql   10157 Feb 10 15:48 output.c

-rw-r–r– 1 mysql mysql    1315 Feb 10 15:48 output.h

-rw-r–r– 1 mysql mysql    3970 Feb 10 15:48 process-packet.c

-rw-r–r– 1 mysql mysql    1139 Feb 10 15:48 process-packet.h

-rw-r–r– 1 mysql mysql   10411 Feb 10 15:48 stats.c

-rw-r–r– 1 mysql mysql    1958 Feb 10 15:48 stats.h

-rw-r–r– 1 mysql mysql    7845 Feb 10 15:48 stats-hash.c

-rw-r–r– 1 mysql mysql    1480 Feb 10 15:48 stats-hash.h

-rwsr-xr-x 1 root  root   291086 Feb 11 11:08 tcprstat

-rw-r–r– 1 mysql mysql    5970 Feb 10 15:48 tcprstat.c

-rw-r–r– 1 mysql mysql   10720 Feb 11 11:07 tcprstat-capture.o

-rw-r–r– 1 mysql mysql   10272 Feb 11 11:07 tcprstat-functions.o

-rw-r–r– 1 mysql mysql    1339 Feb 10 15:48 tcprstat.h

-rw-r–r– 1 mysql mysql   13568 Feb 11 11:08 tcprstat-local-addresses.o

-rw-r–r– 1 mysql mysql   23312 Feb 11 11:08 tcprstat-output.o

-rw-r–r– 1 mysql mysql   16072 Feb 11 11:08 tcprstat-process-packet.o

-rwxr-xr-x 1 mysql mysql 1258608 Feb 11 11:08 tcprstat-static

-rw-r–r– 1 mysql mysql   10720 Feb 11 11:08 tcprstat_static-capture.o

-rw-r–r– 1 mysql mysql   10272 Feb 11 11:08 tcprstat_static-functions.o

-rw-r–r– 1 mysql mysql   13568 Feb 11 11:08 tcprstat_static-local-addresses.o

-rw-r–r– 1 mysql mysql   23312 Feb 11 11:08 tcprstat_static-output.o

-rw-r–r– 1 mysql mysql   16072 Feb 11 11:08 tcprstat_static-process-packet.o

-rw-r–r– 1 mysql mysql   16056 Feb 11 11:08 tcprstat_static-stats-hash.o

-rw-r–r– 1 mysql mysql   27232 Feb 11 11:08 tcprstat_static-stats.o

-rw-r–r– 1 mysql mysql   18576 Feb 11 11:08 tcprstat_static-tcprstat.o

-rw-r–r– 1 mysql mysql   16056 Feb 11 11:08 tcprstat-stats-hash.o

-rw-r–r– 1 mysql mysql   27232 Feb 11 11:08 tcprstat-stats.o

-rw-r–r– 1 mysql mysql   18576 Feb 11 11:07 tcprstat-tcprstat.o

普通的yum,rpm方式安装version模块会报错,用如下安装方式解决

yum install perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker

yum -y install perl-CPAN

yum install perl-Params-Validate

perl -MCPAN -e ‘install “Module::Build”‘

perl -MCPAN -e ‘install “File::Lockfile”‘

 

执行上面命令后,应该安装好了version, Class-Data-Inheritable, Module-Build, File::Lockfile等模块,可以用脚本查看是否安装成功

$ cat check_module.pl

#!/usr/bin/perl

use ExtUtils::Installed;

my $inst = ExtUtils::Installed->new();

print join “\n”,$inst->modules();

print “\n”;

对于orzdba脚本,在160行修改root密码,才能运行orzdba脚本,修改后样子如下:

my $MYSQL    = qq{mysql -s –skip-column-names -uroot -P$port -pLisx_new_123 };

改成这样后,会出现mysql5.6的明文密码告警:Warning: Using a password on the command line interface can be insecure.我是如下解决的

orzdba1

弄一个免密码的本地可以访问的管理员账户,用这个账户监控?

这个方案也不行,因为每一行都要回车确认

 

我的解决方案,在mysql的根目录新建隐藏文件,权限为400

-bash-4.1$ cat .my.cnf

[client]

password=lisx123

user=dbalisx

 

并且修改orzdba脚本的第160行处

my $MYSQL    = qq{mysql -s –skip-column-names -udbalisx -P$port };

orzdba2

好了,可以痛快的使用orzdba脚本了。

如何学习MySQL

转自高手的帖子:

1.坚持阅读官方手册,看MySQL书籍作用不会特别大;(挑选跟工作相关的内容优先阅读,例如InnoDB存储引擎,MySQL复制,查询优化)

2.阅读官方手册,同时对阅读的内容做对应的测试;

3.结合你现在的工作内容,多实战即可;

4.外部的BLOG ,也包过我的 http://mysqlops.com 只是作为参考即可,更多要培养自己的分析思考的模式;

备注:

国内人写的MySQL书籍都不要作为重点,包过我可能2014年出版一本关于MySQL的书籍,推荐大家只作为辅助的,可以上厕所的时候看看,坚持官方手册+测试+实战+思考总结为主的模式。

看来耐心看完官方文档是十分重要的!

pt-table-checksum、pt-table-sync核对主从库一致性

一.下载并安装工具
http://www.percona.com/downloads/percona-toolkit/
目前最新的版本是percona-toolkit_2.2.12.tar.gz
上传到服务器后,解压缩,并设置到环境变量
在mysql用户的环境变量文件增加路径
vi .bash_profile
export PATH=$PATH:/mysqldata/soft/percona-toolkit-2.2.12/bin

二.使用pt-table-checksum命令查找不一致的数据
主要关注的列是DIFFS,为0则一致,为1则不一致
-bash-4.1$ pt-table-checksum –nocheck-replication-filters –no-check-binlog-format h=localhost,u=root,p=passwd,P=3306
TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE
01-21T14:00:20 0 0 34 1 0 0.304 test.test_inv_log
01-21T14:00:20 0 0 9 1 0 0.304 test.test_inventory
01-21T14:00:21 0 0 2 1 0 0.061 test.test_lastexpressno
01-21T14:00:21 0 0 38 1 0 0.070 test.test_location
01-21T14:00:21 0 0 97 1 0 0.068 test.test_login_history
01-21T14:00:21 0 1 33 1 0 0.065 test.test_menu
01-21T14:00:21 0 0 11 1 0 0.061 test.test_pkd
01-21T14:00:21 0 0 1 1 0 0.061 test.test_promotion
01-21T14:00:21 0 0 1 1 0 0.064 test.test_promotion_condition
01-21T14:00:21 0 0 2 1 0 0.060 test.test_promotion_gift

实验1:在从库多插入记录,制造差异
上面标红的一行是我故意到从库手工插入的一条记录
为了制造不一致的数据,我停掉了双主复制的中的从库到主库的反向复制,关闭了只读参数,手工插入了一条记录。

pt-table-checksum还可以带很多参数,比较重要的用法如下:
-bash-4.1$ pt-table-checksum –nocheck-replication-filters –no-check-binlog-format [–replicate=rep_test.checksums –databases=rep_test –tables=test1] h=localhost,u=root,p=passwd,P=3306

中间的黑色部分,中括号里面的部分是可选的,
–nocheck-replication-filters :不检查复制过滤器,建议启用。后面可以用–databases来指定需要检查的数据库。
–no-check-binlog-format : 不检查复制的binlog模式,要是binlog模式是ROW,则会报错。
–replicate-check-only :只显示不同步的信息。
–replicate= :把checksum的信息写入到指定表中,建议直接写到被检查的数据库当中,如果不指定,默认会在主库新建一个database叫percona,检查结果存放在percona的checksums表中。
–databases= :指定需要被检查的数据库,多个则用逗号隔开。
–tables= :指定需要被检查的表,多个用逗号隔开

我们再次尝试一下,加2个好用的参数,replicate-check-only和database参数
-bash-4.1$ pt-table-checksum –nocheck-replication-filters –no-check-binlog-format –replicate-check-only –databases=test h=localhost,u=root,p=passwd,P=3306

使用这命令效果不好,不建议带replicate-check-only参数,因为输出的列不全
只使用—databases参数指定要核对的数据库比较好
-bash-4.1$ pt-table-checksum –nocheck-replication-filters –no-check-binlog-format –databases=test h=localhost,u=root,p=passwd,P=3306

TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE
01-21T14:34:30 0 0 34 1 0 0.303 test.test_inv_log
01-21T14:34:30 0 0 9 1 0 0.061 test.test_inventory
01-21T14:34:30 0 0 2 1 0 0.306 test.test_lastexpressno
01-21T14:34:30 0 0 38 1 0 0.322 test.test_location
01-21T14:34:31 0 0 97 1 0 0.307 test.test_login_history
01-21T14:34:31 0 1 33 1 0 0.305 test.test_menu
01-21T14:34:31 0 0 11 1 0 0.060 test.test_pkd
01-21T14:34:31 0 0 1 1 0 0.055 test.test_promotion
01-21T14:34:31 0 0 1 1 0 0.057 test.test_promotion_condition
01-21T14:34:31 0 0 2 1 0 0.068 test.test_promotion_gift
01-21T14:34:31 0 0 2 1 0 0.058 test.test_promotion_rule
01-21T14:34:31 0 0 33 1 0 0.056 test.test_role
01-21T14:34:31 0 0 95 1 0 0.059 test.test_search_tab
01-21T14:34:32 0 0 25 1 0 0.303 test.test_send_template

执行后,可以发现差异,但是核查结果表percona.checksums,却发现有BUG,结果表显示无差异,结果表主要关注两列this_crc和master_crc
mysql> show tables ;
+——————-+
| Tables_in_percona |
+——————-+
| checksums |
+——————-+
1 row in set (0.00 sec)
mysql> select db,tbl,chunk,this_crc,this_cnt,master_crc,master_cnt,ts from checksums ;
+——-+—————————+——-+———-+———-+————+————+———————+
| db | tbl | chunk | this_crc | this_cnt | master_crc | master_cnt | ts |
+——-+—————————+——-+———-+———-+————+————+———————+
| test | test_inv_adjust_header | 1 | 2a7935e8 | 1 | 2a7935e8 | 1 | 2015-01-21 14:34:29 |
| test | test_inv_log | 1 | efc2d050 | 34 | efc2d050 | 34 | 2015-01-21 14:34:29 |
| test | test_lastexpressno | 1 | 96b2f90d | 2 | 96b2f90d | 2 | 2015-01-21 14:34:30 |
| test | test_location | 1 | d5582f63 | 38 | d5582f63 | 38 | 2015-01-21 14:34:30 |
| test | test_login_history | 1 | db189cf9 | 97 | db189cf9 | 97 | 2015-01-21 14:34:30 |
| test | test_menu | 1 | 6046676 | 33 | 6046676 | 33 | 2015-01-21 14:34:31 |
| test | test_pkd | 1 | ee959bc2 | 11 | ee959bc2 | 11 | 2015-01-21 14:34:31 |
| test | test_promotion | 1 | 2020c504 | 1 | 2020c504 | 1 | 2015-01-21 14:34:31 |
| test | test_promotion_condition | 1 | a71da967 | 1 | a71da967 | 1 | 2015-01-21 14:34:31 |
| test | test_promotion_gift | 1 | 5d6b31f | 2 | 5d6b31f | 2 | 2015-01-21 14:34:31 |
| test | test_promotion_rule | 1 | 76daf3aa | 2 | 76daf3aa | 2 | 2015-01-21 14:34:31 |

标红的部分显示,没有发现主从库的差异
因此,我推断核查差异的时候,以命令输出为准,检查结果表的数据不准确,后面的实验否定了这个结论

实验2:修改主库的一条数据,制造差异
停止从库的slave,在主库修改一条数据,和从库内容不一致
从库:
mysql> stop slave ;
Query OK, 0 rows affected (0.02 sec)

主库:
mysql> select * from test.test_menu where id=60 \G
*************************** 1. row ***************************
id: 60
createTime: 2015-01-21 13:58:18
updateTime: 2015-01-21 13:58:20
domain_id: -1
creator_id: NULL
updator_id: NULL
status_id: NULL
version: 0
menuName: MENU_TEST2
parent_id: 54
role_id: 57
sortIndex: 6
menuDescr: TEST2
remark: NULL
udf1: NULL
udf2: NULL
udf3: NULL
udf4: NULL
1 row in set (0.00 sec)

mysql> update test.test_menu set menuDescr=’TEST2222′ where id=60 ;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1 Changed: 1 Warnings: 0

mysql> commit ;
Query OK, 0 rows affected (0.00 sec)

用pt-table-check命令核查,竟然没发现差异,难道是因为slave没有开启的缘故吗,但是一旦开启从库的话,同步了binlog,差异又没了,因此打算重新搭建slave,跳过几个transacation,试试看

从库执行
mysql> change master to master_host=’192.168.2.195′,master_user=’repl’,master_password=’slave123′,master_log_file=’mysql-bin.000006′,master_log_pos=692149 ;

mysql> start slave ;

配合后,主备复制正常,主备差异存在
主库:
mysql> select * from test.test_menu where id=60 \G
*************************** 1. row ***************************
id: 60
createTime: 2015-01-21 13:58:18
updateTime: 2015-01-21 13:58:20
domain_id: -1
creator_id: NULL
updator_id: NULL
status_id: NULL
version: 0
menuName: MENU_TEST2
parent_id: 54
role_id: 57
sortIndex: 6
menuDescr: TEST2222
remark: NULL
udf1: NULL
udf2: NULL
udf3: NULL
udf4: NULL
1 row in set (0.00 sec)

从库:
mysql> select * from test.test_menu where id=60 \G
*************************** 1. row ***************************
id: 60
createTime: 2015-01-21 13:58:18
updateTime: 2015-01-21 13:58:20
domain_id: -1
creator_id: NULL
updator_id: NULL
status_id: NULL
version: 0
menuName: MENU_TEST2
parent_id: 54
role_id: 57
sortIndex: 6
menuDescr: TEST2
remark: NULL
udf1: NULL
udf2: NULL
udf3: NULL
udf4: NULL
1 row in set (0.00 sec)
标红的就是主从库的差异

在主库再次执行pt-table-checksum命令
-bash-4.1$ pt-table-checksum –nocheck-replication-filters –no-check-binlog-format –databases=test h=localhost,u=root,p=passwd,P=3306
TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE
01-21T15:17:41 0 0 4440 4 0 0.283 test.test_address
01-21T15:17:42 0 0 340 1 0 0.304 test.test_allocation
01-21T15:17:42 0 0 2 1 0 0.059 test.test_allocationexp
01-21T15:17:42 0 0 10 1 0 0.055 test.test_asn_detail
01-21T15:17:42 0 0 9 1 0 0.051 test.test_asn_header
01-21T15:17:42 0 0 8 1 0 0.051 test.test_biz_inventory
01-21T15:17:42 0 0 0 1 0 0.300 test.test_combination_map
01-21T15:17:42 0 0 20 1 0 0.055 test.test_common_property
01-21T15:17:42 0 0 103 1 0 0.055 test.test_doc_log
01-21T15:17:42 0 0 0 1 0 0.047 test.test_etrackno_used
01-21T15:17:42 0 0 20 1 0 0.051 test.test_express_company
01-21T15:17:42 0 0 2 1 0 0.051 test.test_inter_promotion_rule
01-21T15:17:42 0 0 1 1 0 0.062 test.test_inv_adjust_detail
01-21T15:17:43 0 0 1 1 0 0.056 test.test_inv_adjust_header
01-21T15:17:43 0 0 34 1 0 0.058 test.test_inv_log
01-21T15:17:43 0 0 9 1 0 0.054 test.test_inventory
01-21T15:17:43 0 0 2 1 0 0.058 test.test_lastexpressno
01-21T15:17:43 0 0 38 1 0 0.056 test.test_location
01-21T15:17:43 0 0 97 1 0 0.058 test.test_login_history
01-21T15:17:43 0 1 34 1 0 0.067 test.test_menu
01-21T15:17:43 0 0 11 1 0 0.161 test.test_pkd
01-21T15:17:44 0 0 1 1 0 0.347 test.test_promotion
01-21T15:17:44 0 0 1 1 0 0.409 test.test_promotion_condition
01-21T15:17:44 0 0 2 1 0 0.334 test.test_promotion_gift
01-21T15:17:45 0 0 2 1 0 0.163 test.test_promotion_rule
01-21T15:17:45 0 0 33 1 0 0.059 test.test_role
01-21T15:17:45 0 0 95 1 0 0.055 test.test_search_tab
01-21T15:17:45 0 0 25 1 0 0.309 test.test_send_template
01-21T15:17:45 0 0 4 1 0 0.057 test.test_sequence
01-21T15:17:45 0 0 30 1 0 0.300 test.test_sku
01-21T15:17:46 0 0 21 1 0 0.298 test.test_so_detail
01-21T15:17:46 0 0 15 1 0 0.069 test.test_so_header
01-21T15:17:46 0 0 2 1 0 0.056 test.test_soh_ee_relation
01-21T15:17:46 0 0 14 1 0 0.308 test.test_soh_eo_relation
01-21T15:17:46 0 0 12 1 0 0.060 test.test_store
01-21T15:17:46 0 0 23 1 0 0.059 test.test_sys_code
01-21T15:17:46 0 0 3 1 0 0.054 test.test_sys_config
01-21T15:17:47 0 0 32 1 0 0.055 test.test_sys_domain
01-21T15:17:47 0 0 19 1 0 0.061 test.test_taobao_so_detail
01-21T15:17:47 0 0 14 1 0 0.312 test.test_taobao_so_header
01-21T15:17:47 0 0 50 1 0 0.063 test.test_trackprint_template
01-21T15:17:47 0 0 32 1 0 0.057 test.test_user
01-21T15:17:47 0 0 31 1 0 0.053 test.test_user_role
01-21T15:17:47 0 0 31 1 0 0.060 test.test_warehouse
终于发现了差异

在主库检查结果表percona.checksums,没有显示差异,难道还是有BUG ?
mysql> select db,tbl,chunk,this_crc,this_cnt,master_crc,master_cnt,ts from checksums ;
+——-+—————————+——-+———-+———-+————+————+———————+
| db | tbl | chunk | this_crc | this_cnt | master_crc | master_cnt | ts |
+——-+—————————+——-+———-+———-+————+————+———————+
| test | test_lastexpressno | 1 | 96b2f90d | 2 | 96b2f90d | 2 | 2015-01-21 15:17:43 |
| test | test_location | 1 | d5582f63 | 38 | d5582f63 | 38 | 2015-01-21 15:17:43 |
| test | test_login_history | 1 | db189cf9 | 97 | db189cf9 | 97 | 2015-01-21 15:17:43 |
| test | test_menu | 1 | d9d69755 | 34 | d9d69755 | 34 | 2015-01-21 15:17:43 |
| test | test_pkd | 1 | ee959bc2 | 11 | ee959bc2 | 11 | 2015-01-21 15:17:43 |
| test | test_promotion | 1 | 2020c504 | 1 | 2020c504 | 1 | 2015-01-21 15:17:43 |
| test | test_promotion_condition | 1 | a71da967 | 1 | a71da967 | 1 | 2015-01-21 15:17:44 |

68 rows in set (0.00 sec)

在从库检查结果表percona.checksums
mysql> select db,tbl,chunk,this_crc,this_cnt,master_crc,master_cnt,ts from checksums ;
+——-+—————————+——-+———-+———-+————+————+———————+
| db | tbl | chunk | this_crc | this_cnt | master_crc | master_cnt | ts |
| test | test_lastexpressno | 1 | 96b2f90d | 2 | 96b2f90d | 2 | 2015-01-21 15:17:43 |
| test | test_location | 1 | d5582f63 | 38 | d5582f63 | 38 | 2015-01-21 15:17:43 |
| test | test_login_history | 1 | db189cf9 | 97 | db189cf9 | 97 | 2015-01-21 15:17:43 |
| test | test_menu | 1 | 631fa3bf | 34 | d9d69755 | 34 | 2015-01-21 15:17:43 |
| test | test_pkd | 1 | ee959bc2 | 11 | ee959bc2 | 11 | 2015-01-21 15:17:43 |
| test | test_promotion | 1 | 2020c504 | 1 | 2020c504 | 1 | 2015-01-21 15:17:43 |
| test | test_promotion_condition | 1 | a71da967 | 1 | a71da967 | 1 | 2015-01-21 15:17:44 |
| test | test_promotion_gift | 1 | 5d6b31f | 2 | 5d6b31f | 2 | 2015-01-21 15:17:44 |

+——-+—————————+——-+———-+———-+————+————+———————+
68 rows in set (0.01 sec)
终于显示了差异,原来查看结果表记录的差异,要在slave库查看才行
前面在从库多插入的记录的那个实验,在从库查看的话,应该也没有问题,是我查看选了不正确的库查看导致的
结论:查看差异结果表,应该在slave库上查询

下面的结果是我后来又重做的实验,在从库查询的
select db,tbl,chunk,this_crc,this_cnt,master_crc,master_cnt,ts from checksums where this_crc != master_crc ;
mysql> select db,tbl,chunk,this_crc,this_cnt,master_crc,master_cnt,ts from checksums where this_crc != master_crc ;
+—–+———-+——-+———-+———-+————+————+———————+
| db | tbl | chunk | this_crc | this_cnt | master_crc | master_cnt | ts |
+—–+———-+——-+———-+———-+————+————+———————+
| test | test_menu | 1 | 3be44ed2 | 35 | d9d69755 | 34 | 2015-01-21 16:41:25 |
+—–+———-+——-+———-+———-+————+————+———————+
1 row in set (0.00 sec)
三.用pt-table-sync修复不一致数据
-bash-4.1$ pt-table-sync –print –replicate=percona.checksums h=localhost,u=root,p=passwd h=192.168.2.196,u=root,p=passwd

REPLACE INTO `test`.`test_menu`(`id`, `createtime`, `updatetime`, `domain_id`, `creator_id`, `updator_id`, `status_id`, `version`, `menuname`, `parent_id`, `role_id`, `sortindex`, `menudescr`, `remark`, `udf1`, `udf2`, `udf3`, `udf4`) VALUES (’60’, ‘2015-01-21 13:58:18’, ‘2015-01-21 13:58:20’, ‘-1’, NULL, NULL, NULL, ‘0’, ‘MENU_TEST2′, ’54’, ’57’, ‘6’, ‘TEST2222′, NULL, NULL, NULL, NULL, NULL) /*percona-toolkit src_db:test src_tbl:test_menu src_dsn:h=localhost,p=…,u=root dst_db:test dst_tbl:test_menu dst_dsn:h=192.168.2.196,p=…,u=root lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:8016 user:mysql host:open_mysql1*/;
-bash-4.1$ pt-table-sync –print –sync-to-master h=192.168.2.196,u=root,p=passwd –databases=test –tables=test_menu
REPLACE INTO `test`.`test_menu`(`id`, `createtime`, `updatetime`, `domain_id`, `creator_id`, `updator_id`, `status_id`, `version`, `menuname`, `parent_id`, `role_id`, `sortindex`, `menudescr`, `remark`, `udf1`, `udf2`, `udf3`, `udf4`) VALUES (’60’, ‘2015-01-21 13:58:18’, ‘2015-01-21 13:58:20’, ‘-1’, NULL, NULL, NULL, ‘0’, ‘MENU_TEST2′, ’54’, ’57’, ‘6’, ‘TEST2222′, NULL, NULL, NULL, NULL, NULL) /*percona-toolkit src_db:test src_tbl:test_menu src_dsn:P=3306,h=192.168.2.195,p=…,u=root dst_db:test dst_tbl:test_menu dst_dsn:h=192.168.2.196,p=…,u=root lock:1 transaction:1 changing_src:1 replicate:0 bidirectional:0 pid:8022 user:mysql host:open_mysql1*/;

上面两个命令等价:
参数的意义:
–replicate= :指定通过pt-table-checksum得到的表,这2个工具差不多都会一直用。
–databases= : 指定执行同步的数据库,多个用逗号隔开。
–tables= :指定执行同步的表,多个用逗号隔开。
–sync-to-master :指定一个DSN,即从的IP,他会通过show processlist或show slave status 去自动的找主。
h=127.0.0.1 :服务器地址,命令里有2个ip,第一次出现的是M的地址,第2次是Slave的地址。
u=root :帐号。
p=123456 :密码。
–print :打印,但不执行命令。
–execute :执行命令。

修复方法:上面生成的语句直接在slave库执行

注意如果从库比主库记录多,得到的是delete语句,也在从库执行
-bash-4.1$ pt-table-sync –print –replicate=percona.checksums h=localhost,u=root,p=passwd h=192.168.2.196,u=root,p=passwd
DELETE FROM `test`.`test_menu` WHERE `id`=’61’ LIMIT 1 /*percona-toolkit src_db:test src_tbl:test_menu src_dsn:h=localhost,p=…,u=root dst_db:test dst_tbl:test_menu dst_dsn:h=192.168.2.196,p=…,u=root lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:8029 user:mysql host:open_mysql1*/;

ORACLE11.2.0.3数据库alter table drop column慢原因分析

中午某生产数据库发布补丁,开发人员反应有语句阻塞,无法执行下面语句
alter table pyramid.pd_loanProduct drop column customerType;
这是一个只有4行的小表

我建个测试表尝试了一下,果然很慢
create table pyramid.test_alter2(id number,name varchar2(10))  ;
alter table pyramid.test_alter2 drop column name;

Continue reading

迁移MySQL生产库的data目录,并搭建slave

大致思路:
1.通知开发停应用
2.停mysql库
3. 以read only方式打开数据库
4. 逻辑备份mysql数据库
5. 记录数据库当前的binlog位置
6. 新建目录/data/mysql_data
7. 拷贝数据库相关文件到mysql_data
8. 修改my.cnf中的mysql数据库文件路径
9. 以read only启动mysql数据库,查看binlog位置
10. 关闭mysql数据库
11. 将/data/mysql_data目录下的所有文件打包传输到备库的mysql数据库目录
12. 启动备库
13. 启动主库
14. 在备库设置主库的配置信息,启动slave
15. 过一天后删除/data目录下的mysql数据库文件
————————————————————————————
Continue reading

解决standby数据库standby_file_management为MANUAL引发的问题

standby数据库参数
SQL> show parameter standby

NAME TYPE VALUE
———————————— ———– ——————————
standby_archive_dest string ?/dbs/arch
standby_file_management string MANUAL

在主库新建表空间后,备库alert报错,并且mrp进程终止
主库执行的命令:
create tablespace tbs_test datafile ‘/u01/app/oracle/oradata/prod/tbs_test01.dbf’ size 10m;
备库alert:
Continue reading

修改standby的日志传输模式,从ASYNC到LGWR SYNC AFFIRM

修改之前是主备库相差一个日志文件,如果switchover ,有数据丢失的风险,修改后,主库的日志可以实时传输,并且实时应用到standby,有个AFFIRM的过程,主库commit之前会确认日止在备库已经应用。

从alert看,非ADG的alert的关键词是Media Recovery Log ,ADG的alert关键词是Recovery of Online Redo Log

版本:
SQL*Plus: Release 11.2.0.3.0 Production on Tue Sep 23 07:04:35 2014

Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

修改前参数
alter system set log_archive_dest_2=’SERVICE=dsedidg ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=dsedidg’ scope=both ;

修改方法:
–主库
alter system set log_archive_dest_2=’SERVICE=dsedidg LGWR SYNC AFFIRM VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=dsedidg’ scope=both ;

–备库
alter database recover managed standby database cancel;
alter database recover managed standby database using current logfile disconnect from session;
Continue reading

oracle 11gR2 standby重建online和standby log删除不掉最后一个log

因为上了SSD,SSD对小文件连续写会有性能问题,因此ORACLE推荐把redo和control迁移到SAS盘上,因此要重建redo log和standby_log,在备库上新建了redo和standby后,想删除旧的log
发现问题,备库上新建的online和standby log都无法被ORACLE使用,因此备库上旧的日志也无法彻底删除
17:25:12 sys@DSEDI>show parameter standby

NAME TYPE VALUE
———————————— ———– ——————————
standby_archive_dest string ?/dbs/arch
standby_file_management string AUTO
Continue reading

WAITEVENT: “log file sync” Reference Note (文档 ID 34592.1)

WAITEVENT: “log file sync” Reference Note (文档 ID 34592.1)

Versions:7.0 – 11.1 Documentation: 11g 10g
When a user session(foreground process) COMMITs (or rolls back), the session’s redo information needs to be flushed to the redo logfile. The user session will post the LGWR to write all redo required from the log buffer to the redo log file. When the LGWR has finished it will post the user session. The user session waits on this wait event while waiting for LGWR to post it back to confirm all redo changes are safely on disk.

This may be described further as the time user session/foreground process spends waiting for redo to be flushed to make the commit durable. Therefore, we may think of these waits as commit latency from the foreground process (or commit client generally).

Continue reading

Troubleshooting ‘Log File Sync’ Waits

I have been contacted by one of our customers to provide reference information on troubleshooting Oracle Log File Sync waits.

I think that this information worth short blog post.

Reasons:
Log File Sync waits occur when sessions wait for redo data to be written to disk
typically this is caused by slow writes
or committing too frequently in the application
CPU overburning(very high demand => LGWR on run queue)
improper Operating System configuration(check 169706.1)
BUGs in Oracle(especially with RAC option) and 3rd Party software(like ODM/DISM)

Recommendations:
tune LGWR process to get good throughput, especially when ‘log file parallel write‘ high too:
do not put redo logs on RAID 5 without good write cache
do not put redo logs on Solid State Disk (SSD)
It looks like last recommendatin was based on old experience working with SSD disk, which is obsolete now and even Oracle recommends using SSD disks for REDO logs(1566935.1 Implementing Oracle E-Business Suite 12.1 Databases on Oracle Database Appliance):

“Move REDO log files to +REDO diskgroup on Solid State Disks (SSDs).”

if CPUs are overburned(check runqueue with vmstat):
check for non-oracle system activity, like GZIP or BZIP2 running in business hours…
lower instance’s CPU usage(for example, tune SQL for LIOs)
increase LGWR priority(renice or _high_priority_processes),
decrease COMMITs count for applications with many short transactions
use COMMIT [BATCH] NOWAIT(10g+) when possible
do some processing with NOLOGGING(or may be even with _disable_logging=TRUE if just testing performance benchmark/impact), but think about database recoverability
lower system’s CPU usage or increase LGWR priority
check if there is some 3rd party software, or utilities like RMAN, activity on the same disks as redo logs placed, like trace/systemstate dump files, e.t.c
trace LGWR as the last option for troubleshooting OS/3rd party issues 😉

References:
34592.1 WAITEVENT: “log file sync” Reference Note
34583.1 WAITEVENT: “log file parallel write” Reference Note
1376916.1 Troubleshooting: log file sync’ Waits
223117.1 Troubleshooting I/O-related waits
857576.1 How to Minimise Waits for ‘Log File Sync’
1064487.1 Script to Collect Log File Sync Diagnostic Information (lfsdiag.sql)
1318709.1 AIX: Things To Check When Seeing Long Log File Sync Time in 11.2.
1205673.1 ‘Log File Sync’ problem on a Sun Server: A Typical Source for LOGFILE SYNC Performance Problems
1523164.1 SPARC: Reducing High Waits on ‘log file sync’ on Oracle Solaris SPARC by Increasing Priority of Log Writer
13551402.8 High “log file parallel write” and “log file sync” after upgrading 11.2 with Veritas/Symantec ODM
1278149.1 Intermittent Long ‘log file sync’ Waits, LGWR Posting Long Write Times, I/O Portion of Wait Minimal
1229104.1 LOG FILE SYNC WAITS SPIKES DURING RMAN ARCHIVELOG BACKUPS
1462942.1 Adaptive Switching Between Log Write Methods can Cause ‘log file sync’ Waits
Kevin Closson: “Manly Men Only Use Solid State Disk For Redo Logging. LGWR I/O is Simple, But Not LGWR Processing”
Jeremy Schneider: “Adaptive Log File Sync: Oracle, Please Don’t Do That Again”
Riyaj Shamsudee: “Tuning ‘log file sync’ wait events”
Gwen Shapira: “De-Confusing SSD (for Oracle Databases)”
Guy Harrison: “Using Solid State Disk to optimize Oracle databases”
SSD Performance Blog

© 2019 Init dba

Theme by Anders NorenUp ↑