April 3, 2022 07:18 pm GMT

The Single Biggest Beginner Mistake with DynamoDB

DynamoDB is an amazingly powerful and performant database, best known for its low latency and elastic scaling characteristics. But there is one trap it is super easy to fall into, especially if you have any background at all in more traditional relational database systems (think MySQL, Postgres, Oracle, and SQL Server).

That's Not Normal, Man!

In the relational database world, the dependable best practice is to normalize your data model. It's a fairly academic topic, but the short version is that every piece of data should have one home (i.e., one table which is its canonical location), and any reference to that data in another table takes the form of a foreign key, which is a pointer to its "true" location.

In this way of storing data, an entity can be assembled from all the different rows in all the different tables by asking for a JOIN operation.
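To make the JOIN idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customers and orders tables, their columns, and the sample rows are all hypothetical, invented just for illustration.

```python
import sqlite3

# Hypothetical normalized schema: each piece of data has exactly one home.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),  -- foreign key
        total REAL
    );
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO orders VALUES (100, 1, 25.0), (101, 1, 40.0);
""")


def orders_for_customer(customer_id):
    """Assemble the entity at read time by joining the two tables."""
    return conn.execute(
        """
        SELECT c.name, o.id, o.total
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        WHERE c.id = ?
        ORDER BY o.id
        """,
        (customer_id,),
    ).fetchall()
```

The customer's name is stored once, and every read pays for the join that stitches it back onto the orders.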

If you've ever done a data modeling exercise in this paradigm, you might have started with a table-per-entity. It might feel natural to follow this same process with DynamoDB. Stop!

Don't Be a Joiner

If you have taken DynamoDB for a spin, you may have noticed that there aren't any JOIN operations. This is a feature, not a bug! Let's talk a bit about why JOIN was invented in the first place. At the dawn of SQL databases (let's go back to 1979), storage was scarce and expensive. A database join saves storage at the expense of computation, a tradeoff which made sense for a long time but doesn't anymore. In the present day, the cost equation is completely flipped: storage is millions of times cheaper, and while computation has also improved (a lot), it has not improved by the same orders of magnitude. That means computation is now the bottleneck for cost and performance. DynamoDB achieves remarkable performance by not incurring the computation cost of doing all those joins.

If you make this common mistake (and I did, a bunch!) and continue modeling entities the old way, with one table per entity, you will end up doing all the joins yourself, in your application code. This is the worst of both worlds, because you're giving up the expressive flexibility of SQL while still paying the cost of joins.
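Here is a rough sketch of what that application-side join tends to look like. Plain dictionaries stand in for three separate DynamoDB tables, and a small hypothetical CountingTable wrapper counts lookups as a stand-in for network round trips. None of these names come from a real library.

```python
# In-memory stand-ins for three tables, one per entity (hypothetical data).
CUSTOMERS = {"c1": {"name": "Ada"}}
ORDERS = {"o100": {"customer_id": "c1", "item_ids": ["i1", "i2"]}}
ITEMS = {"i1": {"title": "Keyboard"}, "i2": {"title": "Mouse"}}


class CountingTable:
    """Wraps a dict and counts lookups, standing in for round trips."""

    def __init__(self, rows):
        self.rows = rows
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.rows[key]


def load_order_the_old_way(order_id, customers, orders, items):
    """Hand-rolled join: one read per table, plus one read per line item."""
    order = orders.get(order_id)
    customer = customers.get(order["customer_id"])
    titles = [items.get(item_id)["title"] for item_id in order["item_ids"]]
    return {"customer": customer["name"], "items": titles}
```

Loading this one order with two line items takes four reads, and the count grows with every item on the order.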

So Now What?

What do you do instead? This problem is solvable with a little planning up-front. The best way I know to describe the new way of working is to imagine the query result you want, and store the rows in that form, completely denormalized. There's an old piece of wisdom about performance: the less it has to do, the faster it can be. With denormalized rows in your DynamoDB table, the only work is fetching by your chosen partition key (and optionally a little bit extra to apply attribute constraints before returning).
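As a minimal sketch of that idea, the rows below are stored in the shape the query should return. A dictionary keyed by partition key stands in for the DynamoDB table. The key format, attribute names, and the fetch helper are assumptions made up for this example, not a DynamoDB API.

```python
# One in-memory "table": partition key -> items stored pre-assembled.
TABLE = {
    "CUSTOMER#c1": [
        {
            "order_id": "o100",
            "customer_name": "Ada",  # copied onto every order, on purpose
            "items": ["Keyboard", "Mouse"],
            "status": "SHIPPED",
        },
        {
            "order_id": "o101",
            "customer_name": "Ada",
            "items": ["Monitor"],
            "status": "PENDING",
        },
    ]
}


def fetch(partition_key, **attribute_filters):
    """Look up by partition key, then optionally filter on attributes."""
    items = TABLE.get(partition_key, [])
    return [
        item
        for item in items
        if all(item.get(name) == value for name, value in attribute_filters.items())
    ]
```

Calling fetch("CUSTOMER#c1", status="SHIPPED") does one key lookup plus a cheap filter, with nothing left to join. The price is that the customer name is duplicated on every order.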

The Next Level

When you start modeling in this way, the surprising result is that most applications can be implemented with a single database table. This is like waking up from living in the Matrix. When you're ready for more, start with this post by Alex DeBrie, and then read everything else he has written! Absorb all of this documentation. And when you're done with that and you're ready to level up again, go find everything by Rick Houlihan, like this talk from re:Invent entitled "Amazon DynamoDB advanced design patterns."
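For a taste of what a single table can look like, here is a small in-memory sketch in which different entity types share one table, distinguished by a generic partition key (pk) and sort key (sk). The key naming scheme and the query helper are assumptions for illustration. They only loosely imitate a key-plus-prefix lookup and are not the DynamoDB API.

```python
# One table holding several entity types (hypothetical data and key scheme).
TABLE = [
    {"pk": "CUSTOMER#c1", "sk": "ORDER#o100", "total": 25},
    {"pk": "CUSTOMER#c1", "sk": "ORDER#o101", "total": 40},
    {"pk": "CUSTOMER#c1", "sk": "PROFILE", "name": "Ada"},
    {"pk": "CUSTOMER#c2", "sk": "PROFILE", "name": "Grace"},
]


def query(pk, sk_prefix=""):
    """Return items under one partition key, optionally narrowed by sort key prefix."""
    matches = [
        item
        for item in TABLE
        if item["pk"] == pk and item["sk"].startswith(sk_prefix)
    ]
    return sorted(matches, key=lambda item: item["sk"])
```

Here query("CUSTOMER#c1") returns the profile and both orders together, while query("CUSTOMER#c1", "ORDER#") returns only the orders.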

Go With the Flow

When you start using DynamoDB the way it was designed, it will blow your mind. Have fun on the journey!


Original Link: https://dev.to/aws-builders/beginner-mistakes-with-dynamodb-2ofn
