Data Modeling in Riak (repost)

Source: Internet | Published: aes算法过程 | Editor: 程序博客网 | Time: 2024/06/06 06:56

Data modeling

It can be hard to think outside the table, but once you do, you may find interesting patterns to use in any database, even a relational one.[^sql-databases]

[^sql-databases]: Feel free to use a relational database when you're willing to sacrifice the scalability, performance, and availability of Riak...but why would you?

If you thoroughly absorbed the earlier content, some of this may feel redundant, but the implications of the key/value model are not always obvious.

Rules to live by

As with most such lists, these are guidelines rather than hard rules, but take them seriously.

(@keys) Know your keys.

The cardinal rule of any key/value datastore: the fastest way to get data is to know what to look for, which means knowing which key you want.

How do you pull that off? Well, that's the trick, isn't it?

The best way to always know the key you want is to be able to programmatically reproduce it based on information you already have. Need to know the sales data for one of your client's magazines in December 2013? Store it in a **sales** bucket and name the key after the client, magazine, and month/year combo. Guess what? Retrieving it will be much faster than running a SQL `SELECT *` statement in a relational database.

And if it turns out that the magazine didn't exist yet, and there are no sales figures for that month? No problem. A negative response, especially for immutable data, is among the fastest operations Riak offers.

Because keys are only unique within a bucket, the same unique identifier can be used in different buckets to represent different information about the same entity (e.g., a customer address might be in an `address` bucket with the customer id as its key, whereas the customer id as a key in a `contacts` bucket would presumably contain contact information).
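
A minimal sketch of the idea in Python. It uses plain dictionaries as a stand-in for Riak buckets rather than a real client, and the key format, the `sales_key` and `fetch` helpers, and the sample values are all assumptions made up for illustration.

```python
# In-memory stand-in for Riak buckets (an assumption for illustration only):
# each bucket is a dict, and keys are unique only within their own bucket.
buckets = {"sales": {}, "address": {}, "contacts": {}}


def sales_key(client, magazine, year, month):
    """Rebuild the key from facts the application already has."""
    return f"{client}_{magazine}_{year:04d}-{month:02d}"


def fetch(bucket, key):
    """Return the stored value, or None when the key does not exist."""
    return buckets[bucket].get(key)


# Store December 2013 sales under a key we can recompute later.
buckets["sales"][sales_key("acme", "gardening-weekly", 2013, 12)] = {"copies": 18250}

# The same customer id names different data in different buckets.
buckets["address"]["customer-42"] = {"city": "Portland"}
buckets["contacts"]["customer-42"] = {"email": "ops@example.com"}

# Both lookups are direct key fetches; the second one is simply a miss.
december = fetch("sales", sales_key("acme", "gardening-weekly", 2013, 12))
missing = fetch("sales", sales_key("acme", "gardening-weekly", 2012, 12))
```

Nothing here searches for data: the reader recomputes the key from the client, magazine, and date, then asks for exactly that object.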

(@namespace) Know your namespaces.

Riak has several levels of namespaces when storing data.

Historically, buckets have been what most thought of as Riak's virtual namespaces.

The newest level is provided by **bucket types**, introduced in Riak 2.0, which allow you to group buckets for configuration and security purposes.

Less obviously, keys are their own namespaces. If you want a hierarchy for your keys that looks like `sales/customer/month`, you don't need nested buckets: you just need to name your keys appropriately, as discussed in (@keys). `sales` can be your bucket, while each key is prepended with customer name and month.
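
The three levels can be seen in how an object is addressed over HTTP. The sketch below only builds a URL string and sends nothing. The host, port, the `default` bucket type, and the helper names `hierarchical_key` and `object_url` are assumptions for illustration; check the path layout against the HTTP API documentation for your Riak version.

```python
from urllib.parse import quote


def hierarchical_key(*levels):
    """Join key components so the hierarchy lives in the key itself."""
    return "/".join(levels)


def object_url(bucket_type, bucket, key, host="127.0.0.1", port=8098):
    """Build an object URL: bucket type, then bucket, then key."""
    # Percent-encode every component so a "/" inside the key is not
    # mistaken for a path separator.
    parts = [quote(p, safe="") for p in (bucket_type, bucket, key)]
    return "http://{}:{}/types/{}/buckets/{}/keys/{}".format(host, port, *parts)


# "sales" is the bucket; customer and month are carried by the key.
key = hierarchical_key("acme", "2013-12")
url = object_url("default", "sales", key)
```

Note that the slash in the key is just a character to Riak: the hierarchy is a naming convention the application maintains, not nested storage.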

(@views) Know your queries.

Writing data is cheap. Disk space is cheap. Dynamic queries in Riak are very, very expensive.

As your data flows into the system, generate the views you're going to want later. That magazine sales example from (@keys)? The December sales numbers are almost certainly aggregates of smaller values, but if you know in advance that monthly sales numbers are going to be requested frequently, when the last data arrives for that month the application can assemble the full month's statistics for later retrieval.

Yes, getting accurate business requirements is non-trivial, but many Riak applications are version 2 or 3 of a system, written once the business discovered that the scalability of MySQL, Postgres, or MongoDB simply wasn't up to the job of handling their growth.
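
As a rough sketch of building the view at write time, the Python below again uses a dictionary in place of a Riak bucket. The daily and monthly key formats, the function names, and the sales figures are hypothetical.

```python
store = {}  # stand-in for a Riak bucket: key -> value (illustration only)


def daily_key(magazine, year, month, day):
    return f"{magazine}_{year:04d}-{month:02d}-{day:02d}"


def monthly_key(magazine, year, month):
    return f"{magazine}_{year:04d}-{month:02d}"


def record_daily_sales(magazine, year, month, day, copies):
    """Write one day's figure under a key that can be recomputed later."""
    store[daily_key(magazine, year, month, day)] = copies


def close_month(magazine, year, month, days_in_month):
    """Once the last daily figure is in, write the monthly aggregate."""
    total = sum(
        store.get(daily_key(magazine, year, month, day), 0)
        for day in range(1, days_in_month + 1)
    )
    store[monthly_key(magazine, year, month)] = total
    return total


for day, copies in [(1, 120), (2, 95), (31, 210)]:
    record_daily_sales("gardening-weekly", 2013, 12, day, copies)

close_month("gardening-weekly", 2013, 12, days_in_month=31)

# Readers now fetch one precomputed object instead of querying 31.
december_total = store[monthly_key("gardening-weekly", 2013, 12)]
```

The aggregation cost is paid once, by the writer, at a moment the application controls; every later read is a single key fetch.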

(@small) Take small bites.

Remember your parents' advice over dinner? They were right.

When creating objects that will be updated, constrain their scope and keep the number of contained elements low to reduce the odds of multiple clients attempting to update the data concurrently.

(@indexes) Create your own indexes.

Riak offers metadata-driven secondary indexes (2i) and full-text indexes (Riak Search) for values, but these face scaling challenges: in order to identify all objects for a given index value, roughly a third of the cluster must be involved.

For many use cases, creating your own indexes is straightforward and much faster/more scalable, since you'll be managing and retrieving a single object.

See [Conflict Resolution](#conflict-resolution) for more discussion of this.
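
One way to picture a hand-built index is as a single object, stored under a predictable key, that lists the keys of the objects it covers. The sketch below uses dictionaries in place of Riak buckets; the `sales-by-client_` key prefix and the function names are invented for illustration, and it ignores the concurrent-update handling that a real index object would need.

```python
objects = {}  # stand-in for the data bucket (illustration only)
indexes = {}  # stand-in for a bucket of hand-maintained index objects


def index_key(client):
    """The index object's key is itself reproducible from known facts."""
    return f"sales-by-client_{client}"


def store_sale(key, client, value):
    """Write the object, then record its key in the client's index object."""
    objects[key] = value
    entries = indexes.setdefault(index_key(client), [])
    if key not in entries:
        entries.append(key)


def sales_for_client(client):
    """One fetch for the index object, then one fetch per listed key."""
    keys = indexes.get(index_key(client), [])
    return [objects[k] for k in keys]


store_sale("acme_2013-11", "acme", {"copies": 17900})
store_sale("acme_2013-12", "acme", {"copies": 18250})
store_sale("globex_2013-12", "globex", {"copies": 4100})

acme_sales = sales_for_client("acme")
```

Because the index is an ordinary object that several writers may update, it is exactly the kind of mutable value that needs a conflict resolution strategy in a real deployment.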

(@immutable) Embrace immutability.

As we discussed in [Mutability], immutable data offers a way out of some of the challenges of running a high-volume, high-velocity datastore.

If possible, segregate mutable from non-mutable data, ideally using different buckets for [request tuning][Request tuning].

[Datomic](http://www.datomic.com) is a unique data storage system that leverages immutability for all data, with Riak commonly used as a backend datastore. It treats any data item in its system as a "fact," to be potentially superseded by later facts but never updated.

(@hybrid) Don't fear hybrid solutions.

As much as we would all love to have a database that is an excellent solution for any problem space, we're a long way from that goal.

In the meantime, it's a perfectly reasonable (and very common) approach to mix and match databases for different needs. Riak is very fast and scalable for retrieving keys, but it's decidedly suboptimal at ad hoc queries. If you can't model your way out of that problem, don't be afraid to store keys alongside searchable metadata in a relational or other database that makes querying simpler, and once you have the keys you need, grab the values from Riak.

Just make sure that you consider failure scenarios when doing so; it would be unfortunate to compromise Riak's availability by rendering it useless when your other database is offline.
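
A small sketch of the hybrid pattern, including the failure case. It assumes SQLite (in memory) as the searchable metadata store and a dictionary as a stand-in for Riak; the `sales_meta` table, its columns, and the helper names are hypothetical.

```python
import sqlite3

# Stand-in for Riak: values are fetched by key only (illustration only).
kv_store = {
    "acme_2013-11": {"copies": 17900},
    "acme_2013-12": {"copies": 18250},
}

# Searchable metadata lives in a relational database next to the keys.
meta = sqlite3.connect(":memory:")
meta.execute(
    "CREATE TABLE sales_meta (riak_key TEXT PRIMARY KEY, client TEXT, year INTEGER)"
)
meta.executemany(
    "INSERT INTO sales_meta VALUES (?, ?, ?)",
    [("acme_2013-11", "acme", 2013), ("acme_2013-12", "acme", 2013)],
)


def find_keys(conn, client, year):
    """Ad hoc query against the metadata store; None means it is unavailable."""
    try:
        rows = conn.execute(
            "SELECT riak_key FROM sales_meta "
            "WHERE client = ? AND year = ? ORDER BY riak_key",
            (client, year),
        ).fetchall()
    except sqlite3.Error:
        return None
    return [row[0] for row in rows]


def fetch_by_key(key):
    """Direct key lookup that never touches the metadata database."""
    return kv_store.get(key)


keys = find_keys(meta, "acme", 2013)
values = [fetch_by_key(k) for k in keys]

# Simulate the metadata database going offline.
meta.close()
unavailable = find_keys(meta, "acme", 2013)
still_works = fetch_by_key("acme_2013-12")
```

The point of keeping `fetch_by_key` independent of the metadata connection is that any request which already knows its key keeps working while the search side is down.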

Reposted from: https://billo.gitbooks.io/lfe-little-riak-book/content/ch5/3.html#fn_sql-databases
Further reading: https://docs.basho.com/riak/kv/2.2.0/developing/data-modeling/
## This blog is for personal notes only ##
