Performance tuning in database

Performance tuning in database

Some time ago, I did some research on database performance tuning and I want to write about it here. In the following, each of the following topics will be briefly discussed.

  1. Definition of performance tuning
  2. An overview of database
  3. SQL Vs NoSQL
  4. Performance tuning from an overall perspective
  5. Performance tuning in SQL-Software level
  6. Performance tuning in SQL-Hardware level
  7. Performance tuning in NoSQL-Software level
  8. Performance tuning in NoSQL-Hardware level
    1. Replication and Sharding
  9. Performance tuning in network
  10. Is Performance tuning always done ?

Definition of performance tuning

Performance tuning is the process of analyzing and optimizing a system (database, network, or even automobile) with the goal of achieving the best possible performance under given constraints.

An overview of database

What is a database? A database is like a large filing cabinet, except it's electronic and stores digital information instead of paper documents. A database stores, manages, and retrieves information. A database is essentially software that runs on a computer called a server. A database can also be a collection of databases, servers, and the network that connects the servers. Now, let's look at the two main types of databases that we encounter.

SQL Vs NoSQL

SQL databases, also known as relational databases, store data in tables. Tables have rows and columns, and each table can have relationships with other tables.

relational databases

In NoSQL databases we have more flexibility, meaning we can store data in different formats such as:

  1. document
  2. Key-Value pairs
  3. graph
  4. wide column
Data Structure in NoSQL

Each of these two database models has its own advantages and disadvantages, but in many projects there is no need to prefer one over the other and the best thing to do is to use a combination of both.

AspectsSQLNoSQL
TypeRelationalNon-relational
DataStructuredStructured, semi-structured, and unstructured
SchemaStaticDynamic
ScalabilityVerticalHorizontal
TransactionsACIDEventual consistency
FlexibilityLessMore
LanguageStructured Query Language (SQL)Languages ​​specific to each NoSQL database
Use casesSuitable for complex queries and transactionsSuitable for rapid development and scalability
ExamplesMySQL, PostgreSQL, Oracle, SQLiteMongoDB, Cassandra, Redis, Elasticsearch

Using SQL and NoSQL together in a project is a powerful technique, often known as polyglot persistence, and is a design approach that leverages the strengths of each database type to manage different types of data and tasks. Of course, this method is for large projects and is not suitable for small projects.

Performance tuning from an overall perspective

Performance tuning is the process of optimizing a system for the highest level of performance. Take a BMW GTR M3 for example. This car is great in normal mode, but if our goal is to participate in competitions, we need to make some changes to the car. For example, putting tires that have less friction or increasing engine power. Of course, note that although these changes are great for racing, they are not good at all for normal conditions. For example, the tires must have a standard level of friction so that the car has the necessary safety and the engine noise should not bother the citizens, in addition, these changes have a much higher maintenance cost and this means spending on things that may never be used.

Image of a BMW GTR M3 before and after Performance tuning:

BMW GTR M3

Performance tuning in a database is a process that covers everything from query optimization to hardware resource management, and it is not a one-time task, but an ongoing process that involves monitoring, diagnosing, and modifying system behavior, just like we have in the case of car racing.

Monitoring Databases and Cars

In short, performance tuning in a database is important for the following reasons:

  1. Speed: Reduce query execution time
  2. Scalability: Manage users as workloads increase
  3. Resource efficiency: Optimize CPU, memory, and ... usage

Each of these can be achieved by changes in software or hardware architecture. I will give a very general explanation below, but if this topic is to be learned well and implemented in practice, you should go to reference sources, such as this book:

Database Performance Tuning Book

Performance tuning in SQL-Software level

Query Optimization:

  1. Query Rewriting: Write the names of the required columns instead of using SELECT *
  2. Automated Tools: Use tools to analyze how queries are executed and identify bottlenecks

Indexing Strategies:

  1. Create appropriate indexes: Index columns that are frequently used
  2. Periodic Index Checking: Periodically check indexes to keep the program optimized

Database Configuration:

  1. Parameter Setting: Modify configuration files to allocate appropriate memory, cache size, and disk I/O settings
  2. Connection pooling: A technique for optimally managing database access

Storage Mechanisms:

  1. Result Caching: Store the results of common queries
  2. Memory Storage: Store frequently used data on other databases, to reduce pressure on the main database

Performance tuning in SQL-Hardware level

Upgrading various components:

  1. Hard: Using SSD instead of HDD for faster data access
  2. RAM: For faster processing
  3. CPU: Multi-core or faster CPUs for increased speed and complex processing

Server configuration:

  1. Load balancing: Distributing the load of queries across multiple servers

Performance tuning in NoSQL-Software level

Scaling:

  1. Horizontal Scaling: Distributing data across multiple servers to manage load and optimize access speed
  2. Replication and Consistency: How fast data is distributed across other servers
  3. Shard keys: Choosing good shard keys to divide data evenly and support search patterns
  4. Caching: Just like SQL, storing frequently used data in a way that is optimized for access

Data Modeling and Schema Design:

  1. Choosing the right data model: We choose a data model according to the situation
  2. Flexible Schema: Organizing information in a way that makes it fast to find what we are looking for
  3. Denormalization: Based on user requests, lists are created (made of combinations of Several lists (which may contain duplicate information, but since they are widely used, the system speed increases.

Performance tuning in NoSQL-Hardware level

  1. Distributed Node Optimization: Each node in a cluster should be balanced
  2. Network performance: Since data in NoSQL environments is often distributed across multiple servers or even data centers, the network can have a significant impact
  3. Replication and Sharding: NoSQL databases often use Replication to distribute data across nodes and Sharding to partition and ensure data availability.

So far, I have briefly mentioned some aspects of database performance tuning. Next, I will mention some aspects of network performance tuning because, as I mentioned before, for example, in the case of Replication and Sharding, we need communication between different nodes, so the role of the network becomes more colorful here. But before I give examples of performance tuning about the network, I think it would be good to explain Replication and Sharding a little more. Just note that we can have this method in SQL or a combination of SQL and NoSQL in some way.

Replication and Sharding

Sharding and Replication are both fundamental strategies in NoSQL databases and serve distinct purposes in how data is managed and scaled.

Replication

Replication creates multiple copies of the same data across servers (nodes). The main purpose is to ensure high availability and fault tolerance. If one node fails, other nodes have the same data and can continue to serve queries. This method significantly increases the system's resilience in the face of hardware failures or network problems.

Consistency considerations: Operations such as data deletion must be synchronized between copies, either synchronously or asynchronously.

Sharding

Sharding divides a large data set into smaller, more manageable chunks called shards. This allows the system to spread the workload across multiple nodes and increase throughput.

Considerations for managing complex queries: Queries that target a specific shard are fast, but queries that span multiple shards may add complexity to the system.

Performance tuning in network

  1. Bandwidth Optimization: Adjusting the amount of data that can be sent over the network
  2. Latency Reduction: Configuring routing protocols to choose a better path, i.e. reducing the number of hops between the source and destination
  3. Hardware setup: Using high-quality network tools and configuring them properly

Is Performance tuning always done ?

Venn Diagram to Describe an Engineer

The image above is a Venn diagram to describe an engineer, any engineer. According to this diagram, engineers should do their work quickly and dirty. But this will cost a lot in the long run or even in the short run, so you should try to proceed with principles from the very beginning.

Performance tuning in database | Mohammad Seyedabadi