Your Web News in One Place

Help Webnuz

Referal links:

Sign up for GreenGeeks web hosting
July 27, 2022 02:40 am GMT

DistSQL Applications: Building a Dynamic Distributed Database

Background

Ever since the release of ShardingSphere 5.0.0, DistSQL has been providing strong dynamic management capabilities to the ShardingSphere ecosystem.

Thanks to DistSQL, users have been empowered to do the following:

  • Create logical databases online.
  • Dynamically configure rules (i.e. sharding, data encryption, read/write splitting, database discovery, shadow DB, and global rules).
  • Adjust storage resources in real-time.
  • Switch transaction types instantly.
  • Turn SQL logs on and off at any time.
  • Preview the SQL routing results.At the same time, in the context of increasingly diversified scenarios, more and more DistSQL features are being created and a variety of valuable syntaxes have been gaining popularity among users.

Overview

This post takes data sharding as an example to illustrate DistSQLs application scenarios and related sharding methods.

A series of DistSQL statements are concatenated through practical cases, to give you a complete set of practical DistSQL sharding management schemes, which create and maintain distributed databases through dynamic management. The following DistSQL will be used in this example:
Image description

Practical Case

Required scenarios

  • Create two sharding tables: t_order and t_order_item.
  • For both tables, database shards are carried out with the user_id field, and table shards with the order_id field.
  • The number of shards is 2 databases * 3 tables.As shown in the figure below:Image description

Setting up the environment

1.Prepare a MySQL database instance for access. Create two new databases: demo_ds_0 and demo_ds_1.

Here we take MySQL as an example, but you can also use PostgreSQL or openGauss databases.

2.Deploy Apache ShardingSphere-Proxy 5.1.2 and Apache ZooKeeper. ZooKeeper acts as a governance center and stores ShardingSphere metadata information.

3.Configure server.yaml in the Proxy conf directory as follows:

mode:  type: Cluster  repository:    type: ZooKeeper    props:      namespace: governance_ds      server-lists: localhost:2181  # ZooKeeper address      retryIntervalMilliseconds: 500      timeToLiveSeconds: 60      maxRetries: 3      operationTimeoutMilliseconds: 500  overwrite: falserules:  - !AUTHORITY    users:      - root@%:root

4.Start ShardingSphere-Proxy and connect it to Proxy using a client, for example:

mysql -h 127.0.0.1 -P 3307 -u root -p

Creating a distributed database

CREATE DATABASE sharding_db;USE sharding_db;

Adding storage resources
1.Add storage resources corresponding to the prepared MySQL database.

ADD RESOURCE ds_0 (    HOST=127.0.0.1,    PORT=3306,    DB=demo_ds_0,    USER=root,    PASSWORD=123456), ds_1(    HOST=127.0.0.1,    PORT=3306,    DB=demo_ds_1,    USER=root,    PASSWORD=123456);

2.View storage resources

mysql> SHOW DATABASE RESOURCES\G;*************************** 1. row ***************************                           name: ds_1                           type: MySQL                           host: 127.0.0.1                           port: 3306                             db: demo_ds_1                            -- Omit partial attributes*************************** 2. row ***************************                           name: ds_0                           type: MySQL                           host: 127.0.0.1                           port: 3306                             db: demo_ds_0                            -- Omit partial attributes

Adding \G to the query statement aims to make the output format more readable, and it is not a must.

Creating sharding rules
ShardingSpheres sharding rules support regular sharding and automatic sharding. Both sharding methods have the same effect. The difference is that the configuration of automatic sharding is more concise, while regular sharding is more flexible and independent.

Please refer to the following links for more details on automatic sharding:

Intro to DistSQL-An Open Source and More Powerful SQL

AutoTable: Your Butler-Like Sharding Configuration Tool

Next, well adopt regular sharding and use the INLINE expression algorithm to implement the sharding scenarios described in the requirements.

Primary key generator

The primary key generator can generate a secure and unique primary key for a data table in a distributed scenario. For details, refer to the document Distributed Primary Key.

1.Create the primary key generator.

CREATE SHARDING KEY GENERATOR snowflake_key_generator (TYPE(NAME=SNOWFLAKE));

2.Query primary key generator

mysql> SHOW SHARDING KEY GENERATORS;+-------------------------+-----------+-------+| name                    | type      | props |+-------------------------+-----------+-------+| snowflake_key_generator | snowflake | {}    |+-------------------------+-----------+-------+1 row in set (0.01 sec)

Sharding algorithm

1.Create a database sharding algorithm, used by t_order and t_order_item in common.

-- Modulo 2 based on user_id in database shardingCREATE SHARDING ALGORITHM database_inline (TYPE(NAME=INLINE,PROPERTIES("algorithm-expression"="ds_${user_id % 2}")));

2.Create different table shards algorithms for t_order and t_order_item.

-- Modulo 3 based on order_id in table shardingCREATE SHARDING ALGORITHM t_order_inline (TYPE(NAME=INLINE,PROPERTIES("algorithm-expression"="t_order_${order_id % 3}")));CREATE SHARDING ALGORITHM t_order_item_inline (TYPE(NAME=INLINE,PROPERTIES("algorithm-expression"="t_order_item_${order_id % 3}")));

3.Query sharding algorithm

mysql> SHOW SHARDING ALGORITHMS;+---------------------+--------+---------------------------------------------------+| name                | type   | props                                             |+---------------------+--------+---------------------------------------------------+| database_inline     | inline | algorithm-expression=ds_${user_id % 2}            || t_order_inline      | inline | algorithm-expression=t_order_${order_id % 3}      || t_order_item_inline | inline | algorithm-expression=t_order_item_${order_id % 3} |+---------------------+--------+---------------------------------------------------+3 rows in set (0.00 sec)

Default sharding strategy

Sharding strategy consists of sharding key and sharding algorithm. Please refer to Sharding Strategy for its concept.

Sharding strategy consists of databaseStrategy and tableStrategy.

Since t_order and t_order_item have the same database sharding field and sharding algorithm, we create a default strategy that will be used by all shard tables with no sharding strategy configured:

1.Create a default database sharding strategy

CREATE DEFAULT SHARDING DATABASE STRATEGY (TYPE=STANDARD,SHARDING_COLUMN=user_id,SHARDING_ALGORITHM=database_inline);

2.Query default strategy

mysql> SHOW DEFAULT SHARDING STRATEGY\G;*************************** 1. row ***************************                    name: TABLE                    type: NONE         sharding_column: sharding_algorithm_name: sharding_algorithm_type:sharding_algorithm_props:*************************** 2. row ***************************                    name: DATABASE                    type: STANDARD         sharding_column: user_id sharding_algorithm_name: database_inline sharding_algorithm_type: inlinesharding_algorithm_props: {algorithm-expression=ds_${user_id % 2}}2 rows in set (0.00 sec)

The default table sharding strategy is not configured, so the default strategy of TABLE is NONE.

Sharding rules

The primary key generator and sharding algorithm are both ready. Now create sharding rules.

1.t_order

CREATE SHARDING TABLE RULE t_order (DATANODES("ds_${0..1}.t_order_${0..2}"),TABLE_STRATEGY(TYPE=STANDARD,SHARDING_COLUMN=order_id,SHARDING_ALGORITHM=t_order_inline),KEY_GENERATE_STRATEGY(COLUMN=order_id,KEY_GENERATOR=snowflake_key_generator));

DATANODES specifies the data nodes of shard tables.
TABLE_STRATEGY specifies the table strategy, among which SHARDING_ALGORITHM uses created sharding algorithm t_order_inline;
KEY_GENERATE_STRATEGY specifies the primary key generation strategy of the table. Skip this configuration if primary key generation is not required.

2.t_order_item

CREATE SHARDING TABLE RULE t_order_item (DATANODES("ds_${0..1}.t_order_item_${0..2}"),TABLE_STRATEGY(TYPE=STANDARD,SHARDING_COLUMN=order_id,SHARDING_ALGORITHM=t_order_item_inline),KEY_GENERATE_STRATEGY(COLUMN=order_item_id,KEY_GENERATOR=snowflake_key_generator));

3.Query sharding rules

mysql> SHOW SHARDING TABLE RULES\G;*************************** 1. row ***************************                            table: t_order                actual_data_nodes: ds_${0..1}.t_order_${0..2}              actual_data_sources:           database_strategy_type: STANDARD         database_sharding_column: user_id database_sharding_algorithm_type: inlinedatabase_sharding_algorithm_props: algorithm-expression=ds_${user_id % 2}              table_strategy_type: STANDARD            table_sharding_column: order_id    table_sharding_algorithm_type: inline   table_sharding_algorithm_props: algorithm-expression=t_order_${order_id % 3}              key_generate_column: order_id               key_generator_type: snowflake              key_generator_props:*************************** 2. row ***************************                            table: t_order_item                actual_data_nodes: ds_${0..1}.t_order_item_${0..2}              actual_data_sources:           database_strategy_type: STANDARD         database_sharding_column: user_id database_sharding_algorithm_type: inlinedatabase_sharding_algorithm_props: algorithm-expression=ds_${user_id % 2}              table_strategy_type: STANDARD            table_sharding_column: order_id    table_sharding_algorithm_type: inline   table_sharding_algorithm_props: algorithm-expression=t_order_item_${order_id % 3}              key_generate_column: order_item_id               key_generator_type: snowflake              key_generator_props:2 rows in set (0.00 sec)

So far, the sharding rules for t_order and t_order_item have been configured.

A bit complicated? Well, you can also skip the steps of creating the primary key generator, sharding algorithm, and default strategy, and complete the sharding rules in one step. Let's see how to make it easier.

Syntax
Now, if we have to add a shard table t_order_detail, we can create sharding rules as follows:

CREATE SHARDING TABLE RULE t_order_detail (DATANODES("ds_${0..1}.t_order_detail_${0..1}"),DATABASE_STRATEGY(TYPE=STANDARD,SHARDING_COLUMN=user_id,SHARDING_ALGORITHM(TYPE(NAME=INLINE,PROPERTIES("algorithm-expression"="ds_${user_id % 2}")))),TABLE_STRATEGY(TYPE=STANDARD,SHARDING_COLUMN=order_id,SHARDING_ALGORITHM(TYPE(NAME=INLINE,PROPERTIES("algorithm-expression"="t_order_detail_${order_id % 3}")))),KEY_GENERATE_STRATEGY(COLUMN=detail_id,TYPE(NAME=snowflake)));

**Note: **The above statement specified database sharding strategy, table strategy, and primary key generation strategy, but it didnt use existing algorithms.

Therefore, the DistSQL engine automatically uses the input expression to create an algorithm for the sharding rules of t_order_detail. Now the primary key generator, sharding algorithm, and sharding rules are as follows:

1.Primary key generator

mysql> SHOW SHARDING KEY GENERATORS;+--------------------------+-----------+-------+| name                     | type      | props |+--------------------------+-----------+-------+| snowflake_key_generator  | snowflake | {}    || t_order_detail_snowflake | snowflake | {}    |+--------------------------+-----------+-------+2 rows in set (0.00 sec)

2.Sharding algorithm

mysql> SHOW SHARDING ALGORITHMS;+--------------------------------+--------+-----------------------------------------------------+| name                           | type   | props                                               |+--------------------------------+--------+-----------------------------------------------------+| database_inline                | inline | algorithm-expression=ds_${user_id % 2}              || t_order_inline                 | inline | algorithm-expression=t_order_${order_id % 3}        || t_order_item_inline            | inline | algorithm-expression=t_order_item_${order_id % 3}   || t_order_detail_database_inline | inline | algorithm-expression=ds_${user_id % 2}              || t_order_detail_table_inline    | inline | algorithm-expression=t_order_detail_${order_id % 3} |+--------------------------------+--------+-----------------------------------------------------+5 rows in set (0.00 sec)

3.Sharding rules

mysql> SHOW SHARDING TABLE RULES\G;*************************** 1. row ***************************                            table: t_order                actual_data_nodes: ds_${0..1}.t_order_${0..2}              actual_data_sources:           database_strategy_type: STANDARD         database_sharding_column: user_id database_sharding_algorithm_type: inlinedatabase_sharding_algorithm_props: algorithm-expression=ds_${user_id % 2}              table_strategy_type: STANDARD            table_sharding_column: order_id    table_sharding_algorithm_type: inline   table_sharding_algorithm_props: algorithm-expression=t_order_${order_id % 3}              key_generate_column: order_id               key_generator_type: snowflake              key_generator_props:*************************** 2. row ***************************                            table: t_order_item                actual_data_nodes: ds_${0..1}.t_order_item_${0..2}              actual_data_sources:           database_strategy_type: STANDARD         database_sharding_column: user_id database_sharding_algorithm_type: inlinedatabase_sharding_algorithm_props: algorithm-expression=ds_${user_id % 2}              table_strategy_type: STANDARD            table_sharding_column: order_id    table_sharding_algorithm_type: inline   table_sharding_algorithm_props: algorithm-expression=t_order_item_${order_id % 3}              key_generate_column: order_item_id               key_generator_type: snowflake              key_generator_props:*************************** 3. row ***************************                            table: t_order_detail                actual_data_nodes: ds_${0..1}.t_order_detail_${0..1}              actual_data_sources:           database_strategy_type: STANDARD         database_sharding_column: user_id database_sharding_algorithm_type: inlinedatabase_sharding_algorithm_props: algorithm-expression=ds_${user_id % 2}              table_strategy_type: STANDARD            table_sharding_column: order_id    table_sharding_algorithm_type: inline   table_sharding_algorithm_props: algorithm-expression=t_order_detail_${order_id % 3}              key_generate_column: detail_id               key_generator_type: snowflake              key_generator_props:3 rows in set (0.01 sec)

Note: In the CREATE SHARDING TABLE RULE statement, DATABASE_STRATEGY, TABLE_STRATEGY, and KEY_GENERATE_STRATEGY can all reuse existing algorithms.

Alternatively, they can be defined quickly through syntax. The difference is that additional algorithm objects are created. Users can use it flexibly based on scenarios.

After the configuration verification rules are created, you can verify them in the following ways:

Checking node distribution

DistSQL provides SHOW SHARDING TABLE NODES for checking node distribution and users can quickly learn the distribution of shard tables.
Image description

We can see the node distribution of the shard table is consistent with what is described in the requirement.

SQL Preview

Previewing SQL is also an easy way to verify configurations. Its syntax is PREVIEW SQL:

1.Query with no shard key with all routes
Image description

2.Specify user_id to query with a single database route
Image description

3.Specify user_id and order_id with a single table route
Image description

Single-table routes scan the least shard tables and offer the highest efficiency.

DistSQL auxiliary query

During the system maintenance, algorithms or storage resources that are no longer in use may need to be released, or resources that need to be released may have been referenced and cannot be deleted. The following DistSQL can solve these problems.

Query unused resources

1.Syntax: SHOW UNUSED RESOURCES

2.Sample:
Image description

Query unused primary key generator

1.Syntax: SHOW UNUSED SHARDING KEY GENERATORS

2.Sample:
Image description

Query unused sharding algorithm

1.Syntax: SHOW UNUSED SHARDING ALGORITHMS

2.Sample:
Image description

Query rules that use the target storage resources

1.Syntax: SHOW RULES USED RESOURCE

2.Sample:
Image description

All rules that use the resource can be queried, not limited to the Sharding Rule.

Query sharding rules that use the target primary key generator

1.Syntax: SHOW SHARDING TABLE RULES USED KEY GENERATOR

2.Sample:
Image description

Query sharding rules that use the target algorithm

1.Syntax: SHOW SHARDING TABLE RULES USED ALGORITHM

2.Sample:
Image description

Conclusion

This post takes the data sharding scenario as an example to introduce DistSQLs applications and methods.

DistSQL provides flexible syntax to help simplify operations. In addition to the INLINE algorithm, DistSQL supports standard sharding, compound sharding, Hint sharding, and custom sharding algorithms. More examples will be covered in the coming future.

If you have any questions or suggestions about Apache ShardingSphere, please feel free to post them on the GitHub Issue list.

Project Links:

ShardingSphere Github

ShardingSphere Twitter

ShardingSphere Slack

Contributor Guide

GitHub Issues

Contributor Guide

References

  1. Concept-DistSQL

  2. Concept-Distributed Primary Key

  3. Concept-Sharding Strategy

  4. Concept INLINE Expression

  5. Built-in Sharding Algorithm

  6. User Manual: DistSQL

Author

Jiang Longtao
SphereEx Middleware R&D Engineer & Apache ShardingSphere Committer.

Longtao focuses on the R&D of DistSQL and related features.


Original Link: https://dev.to/apache_shardingsphere/distsql-applications-building-a-dynamic-distributed-database-lbk

Share this article:    Share on Facebook
View Full Article

Dev To

An online community for sharing and discovering great ideas, having debates, and making friends

More About this Source Visit Dev To