Applying Multi-Stream Aggregation and JSON Full-Text Search

Source: Internet   Editor: 程序博客网   Time: 2024/05/16 01:56

Background

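The requirement is as follows: when data is written, the previous record serves as the base. The record being written is merged with the previous record, and the result is written as a new record.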

As a result, every record carries the content of all the records before it.

Note that this means a separate snapshot for each dimension, not a snapshot of every record in the table.

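For example, an e-commerce order may pass through several systems. Each system may produce different attributes, so together they amount to one large wide table; to keep the design simple, applications often choose JSON storage instead of a wide table. The order therefore generates several records, and on each write the expectation is to merge in the content of all earlier related records and write the resulting new value.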

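Keep in mind, however, that data for the same order may be written in parallel (unless the business can hash the order number so that a single thread handles each order, which also rules out spreading the work across multiple machines). When the same order is written concurrently, write-time merging breaks down, because concurrent writers cannot see each other's uncommitted records.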

Example:

Existing record in tbl: (0, 1, 'test0', now())

session A:
insert into tbl (pk, caseid, info, crt_time) values (1, 1, 'test1', now());

session B:
insert into tbl (pk, caseid, info, crt_time) values (2, 1, 'test2', now());

If sessions A and B start at the same time, the records written may end up as:
(1, 1, 'test0_test1', now());
(2, 1, 'test0_test2', now());

What is actually wanted, however, is probably these two:
(1, 1, 'test0_test1', now());
(2, 1, 'test0_test1_test2', now());
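The race can be made concrete with a sketch of what a write-time merge might look like in SQL. The source does not show the table definition or the merge statement, so both are assumptions made for illustration: tbl is given a guessed layout, and the merge reads the latest record for the same caseid and appends the new value. Under PostgreSQL's default READ COMMITTED isolation, each statement sees only rows committed before it started, so session B cannot see session A's uncommitted row.

```sql
-- Hypothetical table layout, inferred from the columns used in the example.
create table tbl (
  pk       int8 primary key,
  caseid   int,
  info     text,
  crt_time timestamp
);
insert into tbl (pk, caseid, info, crt_time) values (0, 1, 'test0', now());

-- Session A: merge the new value with the latest visible record for caseid = 1.
begin;
insert into tbl (pk, caseid, info, crt_time)
select 1, 1, prev.info || '_' || 'test1', now()
from (
  select info from tbl
  where caseid = 1
  order by crt_time desc, pk desc
  limit 1
) prev;
-- Session A has not committed yet.

-- Session B (a separate connection): A's row is invisible here,
-- so the latest visible record is still 'test0'.
begin;
insert into tbl (pk, caseid, info, crt_time)
select 2, 1, prev.info || '_' || 'test2', now()
from (
  select info from tbl
  where caseid = 1
  order by crt_time desc, pk desc
  limit 1
) prev;
commit;

-- Session A:
commit;

-- Either session, afterwards: pk 1 holds 'test0_test1' and pk 2 holds
-- 'test0_test2'; neither record contains both new values.
select pk, info from tbl where caseid = 1 order by pk;
```

Serializing all writers for one order would avoid this, but as noted above that limits each order to a single thread.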

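So we take a different approach to obtaining snapshots. The write path stays unchanged: the order records produced by each line of business are written separately into a single table, with JSON holding each business line's description of the order.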

JSON write performance

create table tbl_ord (
  ordid    int8,       -- order number
  appid    int,        -- application ID
  info     jsonb,      -- content
  crt_time timestamp   -- write time
);

create index idx_tbl_ord on tbl_ord(ordid, crt_time);
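With the records stored unmerged, a per-order snapshot can be assembled at read time instead. The query below is one possible sketch and is not taken from the source. It assumes every info value is a JSON object, that the merge is shallow (top-level keys only), and that when several records write the same key, the record with the latest crt_time should win. The order number 1 is a placeholder.

```sql
-- Read-time snapshot for one order: expand every record into key/value pairs,
-- keep the newest value per key, then fold the pairs back into one object.
select t.ordid,
       jsonb_object_agg(t.key, t.value) as snapshot
from (
  select distinct on (o.ordid, e.key)
         o.ordid, e.key, e.value
  from tbl_ord o
  cross join lateral jsonb_each(o.info) as e(key, value)
  where o.ordid = 1                              -- placeholder order number
  order by o.ordid, e.key, o.crt_time desc       -- newest value per key first
) t
group by t.ordid;
```

Adding a condition such as o.crt_time <= some timestamp to the inner query would give the snapshot as of that moment. The lookup by ordid can use the (ordid, crt_time) index defined above.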

Single-row insert benchmark

vi test.sql

\set ordid random(1,10000000)
\set appid random(1,10)
insert into tbl_ord (ordid,appid,info,crt_time) values (:ordid,:appid,jsonb '{"a" : 1, "b" : 2}',now());

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 40 -j 40 -t 2500000

The single-row insert benchmark reaches about 234,000 rows/s.

transaction type: ./test.sql
scaling factor: 1
query mode: prepared
number of clients: 40
number of threads: 40
number of transactions per client: 2500000
number of transactions actually processed: 100000000/100000000
latency average = 0.170 ms
latency stddev = 0.498 ms
tps = 234047.009786 (including connections establishing)
tps = 234060.902533 (excluding connections establishing)
script statistics:
 - statement latencies in milliseconds:
         0.001  \set ordid random(1,10000000)
         0.001  \set appid random(1,10)
         0.168  insert into tbl_ord (ordid,appid,info,crt_time) values (:ordid,:appid,jsonb '{"a" : 1, "b" : 2}',now());

With batch writes, throughput can exceed 1 million rows/s.
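The source does not show how the batch figure was measured. One common way to batch is to insert many rows per statement; the pgbench script below is a sketch under that assumption. The file name test_batch.sql and the batch size of 1000 are chosen purely for illustration.

```sql
-- test_batch.sql (hypothetical): each transaction inserts 1000 rows in one statement.
\set appid random(1,10)
insert into tbl_ord (ordid, appid, info, crt_time)
select 1 + floor(random() * 10000000)::int8,   -- random order number in 1..10000000
       :appid::int,                            -- one application ID per batch
       jsonb '{"a" : 1, "b" : 2}',
       now()
from generate_series(1, 1000);
```

It can be run with the same pgbench options as the single-row test, pointing -f at this file and lowering -t to match the larger transactions. Rows per second is then tps multiplied by the batch size.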
