测试工具

TPC-H 是美国交易处理效能委员会 TPC(Transaction Processing Performance Council)组织制定的用来模拟决策支持类应用的测试集。它包括一整套面向业务的 ad-hoc 查询和并发数据修改。

TPC-H 根据真实的生产运行环境来建模,模拟了一套销售系统的数据仓库。该测试共包含 8 张表,数据量可设定从 1 GB~3 TB不等。其基准测试共包含了 22 个查询,主要评价指标为各个查询的响应时间,即从提交查询到结果返回所需时间。

测试数据

数据总大小:64GB

数据总行数:519611624

平均大小:132 Bytes

表名行数
customer9000000
lineitem360011594
nation25
orders90000000
part12000000
partsupp48000000
region5
supplier600000

测试结果

导入

[root@centos221 tpch-poc-1.0]# time sh bin/stream_load.sh data_60
[INFO] 2024-12-03 10:47:24 db_table_operation.py[101] stream load from dir:/usr/local/src/tpch-poc-1.0/data_60
[INFO] 2024-12-03 10:47:24 config_util.py[43] concurrency load number for table: customer is not set, use 1 by default.
[INFO] 2024-12-03 10:47:24 db_table_operation.py[25] stream load start. table: customer, path: /usr/local/src/tpch-poc-1.0/data_60/customer.tbl
[INFO] 2024-12-03 10:47:24 db_table_operation.py[27] stream load command: {curl --location-trusted -u root: -T /usr/local/src/tpch-poc-1.0/data_60/customer.tbl -H "column_separator:|"  -H "columns:c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment,not_use_column"  http://192.168.0.241:8030/api/tpch/customer/_stream_load}
[INFO] 2024-12-03 10:47:39 db_table_operation.py[41] stream load success. table: customer, path: /usr/local/src/tpch-poc-1.0/data_60/customer.tbl
[INFO] 2024-12-03 10:47:39 config_util.py[41] concurrency load number for table: lineitem is 10.
[INFO] 2024-12-03 10:47:39 db_table_operation.py[25] stream load start. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.1
[INFO] 2024-12-03 10:47:39 db_table_operation.py[27] stream load command: {curl --location-trusted -u root: -T /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.1 -H "column_separator:|"  -H "columns:l_orderkey,l_partkey,l_suppkey,l_linenumber,l_quantity,l_extendedprice,l_discount,l_tax,l_returnflag,l_linestatus,l_shipdate,l_commitdate,l_receiptdate,l_shipinstruct,l_shipmode,l_comment,not_use_column"  http://192.168.0.241:8030/api/tpch/lineitem/_stream_load}
[INFO] 2024-12-03 10:47:39 db_table_operation.py[25] stream load start. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.2
[INFO] 2024-12-03 10:47:39 db_table_operation.py[27] stream load command: {curl --location-trusted -u root: -T /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.2 -H "column_separator:|"  -H "columns:l_orderkey,l_partkey,l_suppkey,l_linenumber,l_quantity,l_extendedprice,l_discount,l_tax,l_returnflag,l_linestatus,l_shipdate,l_commitdate,l_receiptdate,l_shipinstruct,l_shipmode,l_comment,not_use_column"  http://192.168.0.241:8030/api/tpch/lineitem/_stream_load}
[INFO] 2024-12-03 10:47:39 db_table_operation.py[25] stream load start. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.3
[INFO] 2024-12-03 10:47:39 db_table_operation.py[27] stream load command: {curl --location-trusted -u root: -T /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.3 -H "column_separator:|"  -H "columns:l_orderkey,l_partkey,l_suppkey,l_linenumber,l_quantity,l_extendedprice,l_discount,l_tax,l_returnflag,l_linestatus,l_shipdate,l_commitdate,l_receiptdate,l_shipinstruct,l_shipmode,l_comment,not_use_column"  http://192.168.0.241:8030/api/tpch/lineitem/_stream_load}
[INFO] 2024-12-03 10:47:39 db_table_operation.py[25] stream load start. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.4
[INFO] 2024-12-03 10:47:39 db_table_operation.py[27] stream load command: {curl --location-trusted -u root: -T /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.4 -H "column_separator:|"  -H "columns:l_orderkey,l_partkey,l_suppkey,l_linenumber,l_quantity,l_extendedprice,l_discount,l_tax,l_returnflag,l_linestatus,l_shipdate,l_commitdate,l_receiptdate,l_shipinstruct,l_shipmode,l_comment,not_use_column"  http://192.168.0.241:8030/api/tpch/lineitem/_stream_load}
[INFO] 2024-12-03 10:47:39 db_table_operation.py[25] stream load start. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.5
[INFO] 2024-12-03 10:47:39 db_table_operation.py[27] stream load command: {curl --location-trusted -u root: -T /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.5 -H "column_separator:|"  -H "columns:l_orderkey,l_partkey,l_suppkey,l_linenumber,l_quantity,l_extendedprice,l_discount,l_tax,l_returnflag,l_linestatus,l_shipdate,l_commitdate,l_receiptdate,l_shipinstruct,l_shipmode,l_comment,not_use_column"  http://192.168.0.241:8030/api/tpch/lineitem/_stream_load}
[INFO] 2024-12-03 10:47:39 db_table_operation.py[25] stream load start. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.6
[INFO] 2024-12-03 10:47:39 db_table_operation.py[27] stream load command: {curl --location-trusted -u root: -T /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.6 -H "column_separator:|"  -H "columns:l_orderkey,l_partkey,l_suppkey,l_linenumber,l_quantity,l_extendedprice,l_discount,l_tax,l_returnflag,l_linestatus,l_shipdate,l_commitdate,l_receiptdate,l_shipinstruct,l_shipmode,l_comment,not_use_column"  http://192.168.0.241:8030/api/tpch/lineitem/_stream_load}
[INFO] 2024-12-03 10:47:39 db_table_operation.py[25] stream load start. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.7
[INFO] 2024-12-03 10:47:39 db_table_operation.py[27] stream load command: {curl --location-trusted -u root: -T /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.7 -H "column_separator:|"  -H "columns:l_orderkey,l_partkey,l_suppkey,l_linenumber,l_quantity,l_extendedprice,l_discount,l_tax,l_returnflag,l_linestatus,l_shipdate,l_commitdate,l_receiptdate,l_shipinstruct,l_shipmode,l_comment,not_use_column"  http://192.168.0.241:8030/api/tpch/lineitem/_stream_load}
[INFO] 2024-12-03 10:47:39 db_table_operation.py[25] stream load start. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.8
[INFO] 2024-12-03 10:47:39 db_table_operation.py[27] stream load command: {curl --location-trusted -u root: -T /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.8 -H "column_separator:|"  -H "columns:l_orderkey,l_partkey,l_suppkey,l_linenumber,l_quantity,l_extendedprice,l_discount,l_tax,l_returnflag,l_linestatus,l_shipdate,l_commitdate,l_receiptdate,l_shipinstruct,l_shipmode,l_comment,not_use_column"  http://192.168.0.241:8030/api/tpch/lineitem/_stream_load}
[INFO] 2024-12-03 10:47:39 db_table_operation.py[25] stream load start. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.9
[INFO] 2024-12-03 10:47:39 db_table_operation.py[27] stream load command: {curl --location-trusted -u root: -T /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.9 -H "column_separator:|"  -H "columns:l_orderkey,l_partkey,l_suppkey,l_linenumber,l_quantity,l_extendedprice,l_discount,l_tax,l_returnflag,l_linestatus,l_shipdate,l_commitdate,l_receiptdate,l_shipinstruct,l_shipmode,l_comment,not_use_column"  http://192.168.0.241:8030/api/tpch/lineitem/_stream_load}
[INFO] 2024-12-03 10:47:39 db_table_operation.py[25] stream load start. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.10
[INFO] 2024-12-03 10:47:39 db_table_operation.py[27] stream load command: {curl --location-trusted -u root: -T /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.10 -H "column_separator:|"  -H "columns:l_orderkey,l_partkey,l_suppkey,l_linenumber,l_quantity,l_extendedprice,l_discount,l_tax,l_returnflag,l_linestatus,l_shipdate,l_commitdate,l_receiptdate,l_shipinstruct,l_shipmode,l_comment,not_use_column"  http://192.168.0.241:8030/api/tpch/lineitem/_stream_load}
[INFO] 2024-12-03 10:53:01 db_table_operation.py[41] stream load success. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.8
[INFO] 2024-12-03 10:53:04 db_table_operation.py[41] stream load success. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.9
[INFO] 2024-12-03 10:53:05 db_table_operation.py[41] stream load success. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.3
[INFO] 2024-12-03 10:53:09 db_table_operation.py[41] stream load success. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.1
[INFO] 2024-12-03 10:53:13 db_table_operation.py[41] stream load success. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.2
[INFO] 2024-12-03 10:53:14 db_table_operation.py[41] stream load success. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.10
[INFO] 2024-12-03 10:53:14 db_table_operation.py[41] stream load success. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.5
[INFO] 2024-12-03 10:53:15 db_table_operation.py[41] stream load success. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.4
[INFO] 2024-12-03 10:53:17 db_table_operation.py[41] stream load success. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.7
[INFO] 2024-12-03 10:53:19 db_table_operation.py[41] stream load success. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.6
[INFO] 2024-12-03 10:53:19 db_table_operation.py[25] stream load start. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.11
[INFO] 2024-12-03 10:53:19 db_table_operation.py[27] stream load command: {curl --location-trusted -u root: -T /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.11 -H "column_separator:|"  -H "columns:l_orderkey,l_partkey,l_suppkey,l_linenumber,l_quantity,l_extendedprice,l_discount,l_tax,l_returnflag,l_linestatus,l_shipdate,l_commitdate,l_receiptdate,l_shipinstruct,l_shipmode,l_comment,not_use_column"  http://192.168.0.241:8030/api/tpch/lineitem/_stream_load}
[INFO] 2024-12-03 10:53:19 db_table_operation.py[25] stream load start. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.12
[INFO] 2024-12-03 10:53:19 db_table_operation.py[27] stream load command: {curl --location-trusted -u root: -T /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.12 -H "column_separator:|"  -H "columns:l_orderkey,l_partkey,l_suppkey,l_linenumber,l_quantity,l_extendedprice,l_discount,l_tax,l_returnflag,l_linestatus,l_shipdate,l_commitdate,l_receiptdate,l_shipinstruct,l_shipmode,l_comment,not_use_column"  http://192.168.0.241:8030/api/tpch/lineitem/_stream_load}
[INFO] 2024-12-03 10:54:27 db_table_operation.py[41] stream load success. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.12
[INFO] 2024-12-03 10:54:32 db_table_operation.py[41] stream load success. table: lineitem, path: /usr/local/src/tpch-poc-1.0/data_60/lineitem.tbl.11
[INFO] 2024-12-03 10:54:32 config_util.py[43] concurrency load number for table: nation is not set, use 1 by default.
[INFO] 2024-12-03 10:54:32 db_table_operation.py[25] stream load start. table: nation, path: /usr/local/src/tpch-poc-1.0/data_60/nation.tbl
[INFO] 2024-12-03 10:54:32 db_table_operation.py[27] stream load command: {curl --location-trusted -u root: -T /usr/local/src/tpch-poc-1.0/data_60/nation.tbl -H "column_separator:|"  -H "columns:n_nationkey,n_name,n_regionkey,n_comment,not_use_column"  http://192.168.0.241:8030/api/tpch/nation/_stream_load}
[INFO] 2024-12-03 10:54:32 db_table_operation.py[41] stream load success. table: nation, path: /usr/local/src/tpch-poc-1.0/data_60/nation.tbl
[INFO] 2024-12-03 10:54:32 config_util.py[41] concurrency load number for table: orders is 5.
[INFO] 2024-12-03 10:54:32 db_table_operation.py[25] stream load start. table: orders, path: /usr/local/src/tpch-poc-1.0/data_60/orders.tbl.1
[INFO] 2024-12-03 10:54:32 db_table_operation.py[27] stream load command: {curl --location-trusted -u root: -T /usr/local/src/tpch-poc-1.0/data_60/orders.tbl.1 -H "column_separator:|"  -H "columns:o_orderkey,o_custkey,o_orderstatus,o_totalprice,o_orderdate,o_orderpriority,o_clerk,o_shippriority,o_comment,not_use_column"  http://192.168.0.241:8030/api/tpch/orders/_stream_load}
[INFO] 2024-12-03 10:54:32 db_table_operation.py[25] stream load start. table: orders, path: /usr/local/src/tpch-poc-1.0/data_60/orders.tbl.2
[INFO] 2024-12-03 10:54:32 db_table_operation.py[27] stream load command: {curl --location-trusted -u root: -T /usr/local/src/tpch-poc-1.0/data_60/orders.tbl.2 -H "column_separator:|"  -H "columns:o_orderkey,o_custkey,o_orderstatus,o_totalprice,o_orderdate,o_orderpriority,o_clerk,o_shippriority,o_comment,not_use_column"  http://192.168.0.241:8030/api/tpch/orders/_stream_load}
[INFO] 2024-12-03 10:56:11 db_table_operation.py[41] stream load success. table: orders, path: /usr/local/src/tpch-poc-1.0/data_60/orders.tbl.1
[INFO] 2024-12-03 10:56:12 db_table_operation.py[41] stream load success. table: orders, path: /usr/local/src/tpch-poc-1.0/data_60/orders.tbl.2
[INFO] 2024-12-03 10:56:12 config_util.py[43] concurrency load number for table: part is not set, use 1 by default.
[INFO] 2024-12-03 10:56:12 db_table_operation.py[25] stream load start. table: part, path: /usr/local/src/tpch-poc-1.0/data_60/part.tbl
[INFO] 2024-12-03 10:56:12 db_table_operation.py[27] stream load command: {curl --location-trusted -u root: -T /usr/local/src/tpch-poc-1.0/data_60/part.tbl -H "column_separator:|"  -H "columns:p_partkey,p_name,p_mfgr,p_brand,p_type,p_size,p_container,p_retailprice,p_comment,not_use_column"  http://192.168.0.241:8030/api/tpch/part/_stream_load}
[INFO] 2024-12-03 10:56:29 db_table_operation.py[41] stream load success. table: part, path: /usr/local/src/tpch-poc-1.0/data_60/part.tbl
[INFO] 2024-12-03 10:56:29 config_util.py[43] concurrency load number for table: partsupp is not set, use 1 by default.
[INFO] 2024-12-03 10:56:29 db_table_operation.py[25] stream load start. table: partsupp, path: /usr/local/src/tpch-poc-1.0/data_60/partsupp.tbl.1
[INFO] 2024-12-03 10:56:29 db_table_operation.py[27] stream load command: {curl --location-trusted -u root: -T /usr/local/src/tpch-poc-1.0/data_60/partsupp.tbl.1 -H "column_separator:|"  -H "columns:ps_partkey,ps_suppkey,ps_availqty,ps_supplycost,ps_comment,not_use_column"  http://192.168.0.241:8030/api/tpch/partsupp/_stream_load}
[INFO] 2024-12-03 10:57:03 db_table_operation.py[41] stream load success. table: partsupp, path: /usr/local/src/tpch-poc-1.0/data_60/partsupp.tbl.1
[INFO] 2024-12-03 10:57:03 db_table_operation.py[25] stream load start. table: partsupp, path: /usr/local/src/tpch-poc-1.0/data_60/partsupp.tbl.2
[INFO] 2024-12-03 10:57:03 db_table_operation.py[27] stream load command: {curl --location-trusted -u root: -T /usr/local/src/tpch-poc-1.0/data_60/partsupp.tbl.2 -H "column_separator:|"  -H "columns:ps_partkey,ps_suppkey,ps_availqty,ps_supplycost,ps_comment,not_use_column"  http://192.168.0.241:8030/api/tpch/partsupp/_stream_load}
[INFO] 2024-12-03 10:57:39 db_table_operation.py[41] stream load success. table: partsupp, path: /usr/local/src/tpch-poc-1.0/data_60/partsupp.tbl.2
[INFO] 2024-12-03 10:57:39 config_util.py[43] concurrency load number for table: region is not set, use 1 by default.
[INFO] 2024-12-03 10:57:39 db_table_operation.py[25] stream load start. table: region, path: /usr/local/src/tpch-poc-1.0/data_60/region.tbl
[INFO] 2024-12-03 10:57:39 db_table_operation.py[27] stream load command: {curl --location-trusted -u root: -T /usr/local/src/tpch-poc-1.0/data_60/region.tbl -H "column_separator:|"  -H "columns:r_regionkey,r_name,r_comment,not_use_column"  http://192.168.0.241:8030/api/tpch/region/_stream_load}
[INFO] 2024-12-03 10:57:39 db_table_operation.py[41] stream load success. table: region, path: /usr/local/src/tpch-poc-1.0/data_60/region.tbl
[INFO] 2024-12-03 10:57:39 config_util.py[43] concurrency load number for table: supplier is not set, use 1 by default.
[INFO] 2024-12-03 10:57:39 db_table_operation.py[25] stream load start. table: supplier, path: /usr/local/src/tpch-poc-1.0/data_60/supplier.tbl
[INFO] 2024-12-03 10:57:39 db_table_operation.py[27] stream load command: {curl --location-trusted -u root: -T /usr/local/src/tpch-poc-1.0/data_60/supplier.tbl -H "column_separator:|"  -H "columns:s_suppkey,s_name,s_address,s_nationkey,s_phone,s_acctbal,s_comment,not_use_column"  http://192.168.0.241:8030/api/tpch/supplier/_stream_load}
[INFO] 2024-12-03 10:57:40 db_table_operation.py[41] stream load success. table: supplier, path: /usr/local/src/tpch-poc-1.0/data_60/supplier.tbl

real	10m15.401s
user	0m3.546s
sys	1m48.801s

导入时长:615.4s

平均导入条数:519611624/615.4=844347

查询

单位均为s

SQL第一轮第二轮第三轮第四轮排除第一轮预热后三轮平均值
Q16.0044.7065.3784.6644.916
Q20.380.2610.2680.2780.269
Q36.0023.3063.3923.3063.334
Q41.4751.2721.3561.4031.343
Q53.9574.0674.1574.0324.085
Q60.2290.2190.2060.2040.209
Q7内存不足
Q83.4612.6652.6792.5452.629
Q97.6937.6647.6267.7337.674
Q102.8782.9592.9783.0422.993
Q110.6250.4350.4350.420.43
Q120.9800.5760.650.5420.589
Q135.2715.2795.2825.2255.262
Q140.2910.2820.320.3310.311
Q150.6360.6290.6810.6620.657
Q160.5680.4070.5670.5890.521
Q171.4021.4581.4231.4171.432
Q185.0834.9655.0335.0415.013
Q191.6921.6481.6311.7021.66
Q200.6550.6580.6510.7420.683
Q213.0453.0563.0533.233.113
Q220.6190.6770.5910.6110.626

查询SQL

--Q1
select
  l_returnflag,
  l_linestatus,
  sum(l_quantity) as sum_qty,
  sum(l_extendedprice) as sum_base_price,
  sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
  sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
  avg(l_quantity) as avg_qty,
  avg(l_extendedprice) as avg_price,
  avg(l_discount) as avg_disc,
  count(*) as count_order
from
  lineitem
where
  l_shipdate <= date '1998-12-01' - interval '90' day
group by
  l_returnflag,
  l_linestatus
order by
  l_returnflag,
  l_linestatus;

--Q2
select
  s_acctbal,
  s_name,
  n_name,
  p_partkey,
  p_mfgr,
  s_address,
  s_phone,
  s_comment
from
  part,
  supplier,
  partsupp,
  nation,
  region
where
  p_partkey = ps_partkey
  and s_suppkey = ps_suppkey
  and p_size = 15
  and p_type like '%BRASS'
  and s_nationkey = n_nationkey
  and n_regionkey = r_regionkey
  and r_name = 'EUROPE'
  and ps_supplycost = ( 
    select
      min(ps_supplycost)
    from
      partsupp,
      supplier,
      nation,
      region
    where
      p_partkey = ps_partkey
      and s_suppkey = ps_suppkey
      and s_nationkey = n_nationkey
      and n_regionkey = r_regionkey
      and r_name = 'EUROPE'
  )
order by
  s_acctbal desc,
  n_name,
  s_name,
  p_partkey
limit 100;

--Q3
select
  l_orderkey,
  sum(l_extendedprice * (1 - l_discount)) as revenue,
  o_orderdate,
  o_shippriority
from
  customer,
  orders,
  lineitem
where
  c_mktsegment = 'BUILDING'
  and c_custkey = o_custkey
  and l_orderkey = o_orderkey
  and o_orderdate < date '1995-03-15'
  and l_shipdate > date '1995-03-15'
group by
  l_orderkey,
  o_orderdate,
  o_shippriority
order by
  revenue desc,
  o_orderdate
limit 10;

--Q4
select
  o_orderpriority,
  count(*) as order_count
from
  orders
where
  o_orderdate >= date '1993-07-01'
  and o_orderdate < date '1993-07-01' + interval '3' month
  and exists (
    select
      *   
    from
      lineitem
    where
      l_orderkey = o_orderkey
      and l_commitdate < l_receiptdate
  )
group by
  o_orderpriority
order by
  o_orderpriority;

--Q5
select
  n_name,
  sum(l_extendedprice * (1 - l_discount)) as revenue
from
  customer,
  orders,
  lineitem,
  supplier,
  nation,
  region
where
  c_custkey = o_custkey
  and l_orderkey = o_orderkey
  and l_suppkey = s_suppkey
  and c_nationkey = s_nationkey
  and s_nationkey = n_nationkey
  and n_regionkey = r_regionkey
  and r_name = 'ASIA'
  and o_orderdate >= date '1994-01-01'
  and o_orderdate < date '1994-01-01' + interval '1' year
group by
  n_name
order by
  revenue desc;

--Q6
select
  sum(l_extendedprice * l_discount) as revenue
from
  lineitem
where
  l_shipdate >= date '1994-01-01'
  and l_shipdate < date '1994-01-01' + interval '1' year
  and l_discount between .06 - 0.01 and .06 + 0.01
  and l_quantity < 24;

--Q7
select
  supp_nation,
  cust_nation,
  l_year,
  sum(volume) as revenue
from
  (
    select
      n1.n_name as supp_nation,
      n2.n_name as cust_nation,
      extract(year from l_shipdate) as l_year,
      l_extendedprice * (1 - l_discount) as volume
    from
      supplier,
      lineitem,
      orders,
      customer,
      nation n1, 
      nation n2
    where
      s_suppkey = l_suppkey
      and o_orderkey = l_orderkey
      and c_custkey = o_custkey
      and s_nationkey = n1.n_nationkey
      and c_nationkey = n2.n_nationkey
      and (
        (n1.n_name = 'FRANCE' and n2.n_name = 'GERMANY')
        or (n1.n_name = 'GERMANY' and n2.n_name = 'FRANCE')
      )   
      and l_shipdate between date '1995-01-01' and date '1996-12-31'
  ) as shipping
group by
  supp_nation,
  cust_nation,
  l_year
order by
  supp_nation,
  cust_nation,
  l_year;

--Q8
select
  o_year,
  sum(case
    when nation = 'BRAZIL' then volume
    else 0
  end) / sum(volume) as mkt_share
from
  (
    select
      extract(year from o_orderdate) as o_year,
      l_extendedprice * (1 - l_discount) as volume,
      n2.n_name as nation
    from
      part,
      supplier,
      lineitem,
      orders,
      customer,
      nation n1, 
      nation n2, 
      region
    where
      p_partkey = l_partkey
      and s_suppkey = l_suppkey
      and l_orderkey = o_orderkey
      and o_custkey = c_custkey
      and c_nationkey = n1.n_nationkey
      and n1.n_regionkey = r_regionkey
      and r_name = 'AMERICA'
      and s_nationkey = n2.n_nationkey
      and o_orderdate between date '1995-01-01' and date '1996-12-31'
      and p_type = 'ECONOMY ANODIZED STEEL'
  ) as all_nations
group by
  o_year
order by
  o_year;

--Q9
select
  nation,
  o_year,
  sum(amount) as sum_profit
from
  (
    select
      n_name as nation,
      extract(year from o_orderdate) as o_year,
      l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount
    from
      part,
      supplier,
      lineitem,
      partsupp,
      orders,
      nation
    where
      s_suppkey = l_suppkey
      and ps_suppkey = l_suppkey
      and ps_partkey = l_partkey
      and p_partkey = l_partkey
      and o_orderkey = l_orderkey
      and s_nationkey = n_nationkey
      and p_name like '%green%'
  ) as profit
group by
  nation,
  o_year
order by
  nation,
  o_year desc;

--Q10
select
      c_custkey,
      c_name,
      sum(l_extendedprice * (1 - l_discount)) as revenue,
      c_acctbal,
      n_name,
      c_address,
      c_phone,
      c_comment
    from
      customer,
      orders,
      lineitem,
      nation
    where
      c_custkey = o_custkey
      and l_orderkey = o_orderkey
      and o_orderdate >= date '1993-10-01'
      and o_orderdate < date '1993-10-01' + interval '3' month
      and l_returnflag = 'R' 
      and c_nationkey = n_nationkey
    group by
      c_custkey,
      c_name,
      c_acctbal,
      c_phone,
      n_name,
      c_address,
      c_comment
    order by
      revenue desc
limit 20;

--Q11
select
  ps_partkey,
  sum(ps_supplycost * ps_availqty) as value
from
  partsupp,
  supplier,
  nation
where
  ps_suppkey = s_suppkey
  and s_nationkey = n_nationkey
  and n_name = 'GERMANY'
group by
  ps_partkey having
    sum(ps_supplycost * ps_availqty) > ( 
      select
        sum(ps_supplycost * ps_availqty) * 0.000001
      from
        partsupp,
        supplier,
        nation
      where
        ps_suppkey = s_suppkey
        and s_nationkey = n_nationkey
        and n_name = 'GERMANY'
    )   
order by
  value desc;

--Q12
select
  l_shipmode,
  sum(case
    when o_orderpriority = '1-URGENT'
      or o_orderpriority = '2-HIGH'
      then 1
    else 0
  end) as high_line_count,
  sum(case
    when o_orderpriority <> '1-URGENT'
      and o_orderpriority <> '2-HIGH'
      then 1
    else 0
  end) as low_line_count
from
  orders,
  lineitem
where
  o_orderkey = l_orderkey
  and l_shipmode in ('MAIL', 'SHIP')
  and l_commitdate < l_receiptdate
  and l_shipdate < l_commitdate
  and l_receiptdate >= date '1994-01-01'
  and l_receiptdate < date '1994-01-01' + interval '1' year
group by
  l_shipmode
order by
  l_shipmode;

--Q13
select  c_count, count(*) as custdist
from (
    select 
        c_custkey,
        count(o_orderkey) as c_count
    from 
        customer left outer join orders on 
            c_custkey = o_custkey 
            and o_comment not like '%special%requests%'
    group by  
        c_custkey
    ) as c_orders
group by  
    c_count 
order by 
    custdist desc, 
    c_count desc;

--Q14
select
  100.00 * sum(case
    when p_type like 'PROMO%'
      then l_extendedprice * (1 - l_discount)
    else 0
  end) / sum(l_extendedprice * (1 - l_discount)) as promo_revenue
from
  lineitem,
  part
where
  l_partkey = p_partkey
  and l_shipdate >= date '1995-09-01'
  and l_shipdate < date '1995-09-01' + interval '1' month;

--Q15
select
  s_suppkey,
  s_name,
  s_address,
  s_phone,
  total_revenue
from
  supplier,
  revenue0
where
  s_suppkey = supplier_no
  and total_revenue = ( 
    select
      max(total_revenue)
    from
      revenue0
  )
order by
  s_suppkey;

--Q16
select
  p_brand,
  p_type,
  p_size,
  count(distinct ps_suppkey) as supplier_cnt
from
  partsupp,
  part
where
  p_partkey = ps_partkey
  and p_brand <> 'Brand#45'
  and p_type not like 'MEDIUM POLISHED%'
  and p_size in (49, 14, 23, 45, 19, 3, 36, 9)
  and ps_suppkey not in (
    select
      s_suppkey
    from
      supplier
    where
      s_comment like '%Customer%Complaints%'
  )
group by
  p_brand,
  p_type,
  p_size
order by
  supplier_cnt desc,
  p_brand,
  p_type,
  p_size;

--Q17
select
  sum(l_extendedprice) / 7.0 as avg_yearly
from
  lineitem,
  part
where
  p_partkey = l_partkey
  and p_brand = 'Brand#23'
  and p_container = 'MED BOX'
  and l_quantity < ( 
    select
      0.2 * avg(l_quantity)
    from
      lineitem
    where
      l_partkey = p_partkey
  );

--Q18
select
  c_name,
  c_custkey,
  o_orderkey,
  o_orderdate,
  o_totalprice,
  sum(l_quantity)
from
  customer,
  orders,
  lineitem
where
  o_orderkey in (
    select
      l_orderkey
    from
      lineitem
    group by
      l_orderkey having
        sum(l_quantity) > 300 
  )
  and c_custkey = o_custkey
  and o_orderkey = l_orderkey
group by
  c_name,
  c_custkey,
  o_orderkey,
  o_orderdate,
  o_totalprice
order by
  o_totalprice desc,
  o_orderdate
limit 100;

--Q19
select
  sum(l_extendedprice* (1 - l_discount)) as revenue
from
  lineitem,
  part
where
  (
    p_partkey = l_partkey
    and p_brand = 'Brand#12'
    and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
    and l_quantity >= 1 and l_quantity <= 1 + 10
    and p_size between 1 and 5
    and l_shipmode in ('AIR', 'AIR REG')
    and l_shipinstruct = 'DELIVER IN PERSON'
  )
  or  
  (
    p_partkey = l_partkey
    and p_brand = 'Brand#23'
    and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
    and l_quantity >= 10 and l_quantity <= 10 + 10
    and p_size between 1 and 10
    and l_shipmode in ('AIR', 'AIR REG')
    and l_shipinstruct = 'DELIVER IN PERSON'
  )
  or  
  (
    p_partkey = l_partkey
    and p_brand = 'Brand#34'
    and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
    and l_quantity >= 20 and l_quantity <= 20 + 10
    and p_size between 1 and 15
    and l_shipmode in ('AIR', 'AIR REG')
    and l_shipinstruct = 'DELIVER IN PERSON'
  );

--Q20
select
  s_name,
  s_address
from
  supplier,
  nation
where
  s_suppkey in (
    select
      ps_suppkey
    from
      partsupp
    where
      ps_partkey in (
        select
          p_partkey
        from
          part
        where
          p_name like 'forest%'
      )   
      and ps_availqty > ( 
        select
          0.5 * sum(l_quantity)
        from
          lineitem
        where
          l_partkey = ps_partkey
          and l_suppkey = ps_suppkey
          and l_shipdate >= date '1994-01-01'
          and l_shipdate < date '1994-01-01' + interval '1' year
      )   
  )
  and s_nationkey = n_nationkey
  and n_name = 'CANADA'
order by
  s_name;

--Q21
select
        s_name,
        count(*) as numwait
from
        supplier,
        lineitem l1,
        orders,
        nation
where
        s_suppkey = l1.l_suppkey
        and o_orderkey = l1.l_orderkey
        and o_orderstatus = 'F'
        and l1.l_receiptdate > l1.l_commitdate
        and exists (
                select
                        *
                from
                        lineitem l2
                where
                        l2.l_orderkey = l1.l_orderkey
                        and l2.l_suppkey <> l1.l_suppkey
        )
        and not exists (
                select
                        *
                from
                        lineitem l3
                where
                        l3.l_orderkey = l1.l_orderkey
                        and l3.l_suppkey <> l1.l_suppkey
                        and l3.l_receiptdate > l3.l_commitdate
        )
        and s_nationkey = n_nationkey
        and n_name = 'SAUDI ARABIA'
group by
        s_name
order by
        numwait desc,
        s_name
limit 100;


--Q22
select
  cntrycode,
  count(*) as numcust,
  sum(c_acctbal) as totacctbal
from
  (
    select
      substring(c_phone, 1, 2) as cntrycode,
      c_acctbal
    from
      customer
    where
      substring(c_phone, 1, 2) in
        ('13', '31', '23', '29', '30', '18', '17')
      and c_acctbal > ( 
        select
          avg(c_acctbal)
        from
          customer
        where
          c_acctbal > 0.00
          and substring(c_phone, 1, 2) in
            ('13', '31', '23', '29', '30', '18', '17')
      )   
      and not exists (
        select
          *
        from
          orders
        where
          o_custkey = c_custkey
      )   
  ) as custsale
group by
  cntrycode
order by
  cntrycode;

星霜荏苒 居诸不息