Amazon DynamoDB is a fully managed NoSQL database service that automatically scales tables to adjust capacity while maintaining high performance. Because AWS handles the infrastructure, there is no database software to install, operate, or maintain, and no servers to manage for scalability and availability, leaving the business free to focus on innovation.
While AWS does the heavy lifting to run DynamoDB, developers still need to be mindful when modeling data for the service, so today we are sharing best practices to maximize performance and minimize costs. Since DynamoDB is a NoSQL database, the first three tips focus on working efficiently with the database type itself.
1. Query and Scan
- Avoid full table scans – Structure your access patterns to use the Query operation as much as possible. A Query is restricted to a single partition key, while a Scan reads the entire table item by item.
- Use ‘Parallel Scan’ for big datasets – If you need to speed up scans over huge datasets, try a Parallel Scan, which divides the table into segments and processes them simultaneously. Be aware of its high capacity demand and the potential for throttling if misused.
2. Locality of Reference
Keep related items as close together as possible in your NoSQL database; doing so improves performance and decreases costs, and is one of the most important factors in reducing response times.
- Store hot and cold data in different tables – Let data-access frequency guide your table design. For example, consider separating time-series data into different tables, keeping frequently accessed data apart from old data. This builds on the principle of locality of reference and also helps you decide which tables need high read and write capacity, ensuring efficient use of provisioned capacity and minimizing the chances of throttling.
- Avoid hot keys – To scale across multiple partitions, DynamoDB distributes your data using the partition key, which is part of the primary key and determines where each item is stored. A hot key is a partition key that receives significantly more traffic than the other keys in your table; it can exceed the partition's I/O capacity and cause throttling. Design your keys so that traffic is distributed uniformly across partitions.
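The points above can be sketched in Python. This is a minimal sketch, not a definitive implementation: it assumes a hypothetical `Orders` table keyed on `customer_id`, and it only builds the request shapes used by DynamoDB's low-level API (`KeyConditionExpression` for Query, `Segment`/`TotalSegments` for Parallel Scan). The `sharded_partition_key` helper illustrates one common way to spread a hot key: deriving a shard suffix from another item attribute.

```python
import zlib

def build_query_params(table, customer_id):
    """Request parameters for a Query restricted to one partition key."""
    return {
        "TableName": table,
        "KeyConditionExpression": "customer_id = :cid",
        "ExpressionAttributeValues": {":cid": {"S": customer_id}},
    }

def build_scan_params(table, segment=None, total_segments=None):
    """Request parameters for a Scan; optionally a Parallel Scan segment."""
    params = {"TableName": table}
    if total_segments is not None:
        # Parallel Scan: each worker reads one segment concurrently.
        params["Segment"] = segment
        params["TotalSegments"] = total_segments
    return params

def sharded_partition_key(base_key, item_id, shards=10):
    """Spread writes for a hot key across N shards via a stable suffix
    derived from another attribute (here, a hypothetical item id)."""
    suffix = zlib.crc32(item_id.encode("utf-8")) % shards
    return f"{base_key}#{suffix}"
```

Note that once a key is sharded, reading "all items for `cust42`" requires issuing one Query per shard suffix, so sharding trades read fan-out for write distribution.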
3. Compress or Offload Large Attributes
Large attributes with long names can slow things down. When dealing with large attributes, either compress them using a common algorithm such as GZIP, or store them in an Amazon S3 bucket and keep the S3 object identifier in the DynamoDB item. Also, use abbreviations or shorter attribute names wherever possible.
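A sketch of that decision, under assumptions: the threshold, the helper name, and the `attrs/...` object-key layout are all hypothetical, and the actual S3 upload is elided; only the GZIP path is real stdlib code.

```python
import gzip
import uuid

# Hypothetical cutoff: leave headroom under DynamoDB's 400 KB item limit.
S3_OFFLOAD_THRESHOLD = 350_000  # bytes

def pack_attribute(value):
    """Compress a large text attribute, or flag it for S3 offload."""
    raw = value.encode("utf-8")
    if len(raw) > S3_OFFLOAD_THRESHOLD:
        # Too large for the item: upload `raw` to S3 (not shown) and
        # store only the object key in the DynamoDB item.
        return {"s3_key": f"attrs/{uuid.uuid4()}"}
    # Small enough: store the compressed bytes inline.
    return {"gzipped": gzip.compress(raw)}
```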
4. Identify Traffic Patterns using On-Demand Mode
DynamoDB has two modes for provisioning capacity. The first is provisioned mode, where you specify in advance the number of read and write capacity units available to your tables. The second is on-demand mode. If you do not yet know your traffic patterns, this is the mode to choose. Our suggestion is to start with on-demand, learn your real requirements, and then switch to provisioned mode once you know the numbers.
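The two modes map to different `create_table` parameters. A small sketch (the helper name and the unit numbers in the usage note are assumptions; `PAY_PER_REQUEST`, `PROVISIONED`, and `ProvisionedThroughput` are DynamoDB's actual API values):

```python
def billing_config(mode, read_units=None, write_units=None):
    """create_table kwargs for DynamoDB's two capacity modes."""
    if mode == "on_demand":
        # On-demand: pay per request, no capacity planning up front.
        return {"BillingMode": "PAY_PER_REQUEST"}
    # Provisioned: read/write capacity units declared in advance.
    return {
        "BillingMode": "PROVISIONED",
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }
```

For example, `billing_config("provisioned", 100, 50)` would be merged into the table definition once on-demand usage has revealed the real numbers.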
5. Use Cache for Read-Heavy Workloads
Read-heavy workloads or frequently accessed items can consume most of your capacity, which is why you should use caching. Caching can cut your DynamoDB costs by up to 80% by reducing the number of calls made to your table, and it improves your table's performance at the same time by reducing traffic.
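The mechanism can be illustrated with a toy read-through cache. This is a stand-in sketch, not a production cache (in practice you would reach for a managed layer such as DynamoDB Accelerator or ElastiCache): repeated reads of a hot item are served from memory and never consume table capacity until the entry's TTL expires.

```python
import time

class ReadThroughCache:
    """Toy read-through cache: only cache misses hit the table."""

    def __init__(self, fetch, ttl_seconds=60.0):
        self._fetch = fetch      # function that reads the item from DynamoDB
        self._ttl = ttl_seconds  # how long a cached entry stays fresh
        self._store = {}         # key -> (value, time cached)

    def get(self, key):
        hit = self._store.get(key)
        if hit is not None and time.monotonic() - hit[1] < self._ttl:
            return hit[0]        # cache hit: no table read consumed
        value = self._fetch(key) # cache miss: one real table read
        self._store[key] = (value, time.monotonic())
        return value
```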
6. Use IAM to Enforce Restrictions
IAM is a powerful tool that gives you control over nearly every aspect of security, costs, and performance. In practice, IAM policies let you control who is allowed to perform which operations. Restricting developers and services from running expensive Scan operations on your DynamoDB tables helps keep costs in check while ensuring resources are used efficiently.
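One way such a restriction might look, sketched as a policy document built in Python (the table ARN is a placeholder; `dynamodb:GetItem`, `dynamodb:Query`, and `dynamodb:Scan` are the real IAM action names):

```python
def no_scan_policy(table_arn):
    """IAM policy document: allow item reads and queries, deny Scan."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Permit the cheap, key-scoped read operations.
                "Effect": "Allow",
                "Action": ["dynamodb:GetItem", "dynamodb:Query"],
                "Resource": table_arn,
            },
            {
                # Explicitly deny full table scans on this table.
                "Effect": "Deny",
                "Action": "dynamodb:Scan",
                "Resource": table_arn,
            },
        ],
    }
```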
7. Use DynamoDB Global Tables for Low Latency Applications
OTT platforms need to deliver content at low latency to as many consumers as possible, and the most efficient way to do that is to bring the content as close to the user as possible. DynamoDB Global Tables work on the same principle: they automatically replicate data across multiple AWS Regions so it can be served at low latency.
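Adding replicas to an existing table is done through `update_table`. A sketch of the request shape (the helper name and the example Regions in the test are assumptions; `ReplicaUpdates` with `Create`/`RegionName` is the actual parameter shape for the current version of Global Tables):

```python
def add_replica_updates(regions):
    """update_table kwargs that add one Global Tables replica per Region."""
    return {
        "ReplicaUpdates": [
            {"Create": {"RegionName": region}} for region in regions
        ]
    }
```

Replicas are typically added one Region at a time; after each update, DynamoDB replicates writes to every replica automatically.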