Why does Quora use MySQL as the data store instead of NoSQLs such as Cassandra, MongoDB, CouchDB etc?
Source: Internet · Editor: 程序博客网 · Time: 2024/05/18 14:28
1. If you partition your data at the application level, MySQL scalability isn't an issue. In 2008, Facebook reported [1] running 1800 MySQL servers with just two DBAs. You can't do joins across partitions, but the NoSQL databases don't allow this anyway. Facebook hasn't confirmed using Cassandra as the primary store for any data, and it seems like inbox search might be their only use of it. [2]
2. Distributed databases like Cassandra, MongoDB, and CouchDB [3] aren't actually very scalable or stable. Twitter apparently has been trying to move from MySQL to Cassandra for over a year. When someone reports using one of these systems as their primary data store on over 1000 machines for over a year, I'll reconsider my opinion on this.
3. The primary online data store for an application is the worst place to take a risk with new technology. If you lose your database or there's corruption, it's a disaster that could be impossible to recover from. If you're not the developer of one of these new databases, and you're one of a very small number of companies using them at scale in production, you're at the mercy of the developer to fix bugs and handle scalability issues as they come up.
4. You can actually get pretty far on a single MySQL database and not even have to worry about partitioning at the application level. You can "scale up" to a machine with lots of cores and tons of RAM, plus a replica. If you put a layer of memcached servers (which are easy to scale out) in front of the database, then the database basically only has to worry about writes. You can also use S3 or some other distributed hash table to take the largest objects out of rows in the database. There's no need to burden yourself with making a system scale more than 10x further than it needs to, as long as you're confident that you'll be able to scale it as you grow.
5. Many of the problems created by manually partitioning the data over a large number of MySQL machines can be mitigated by creating a layer below the application and above MySQL that automatically distributes data. FriendFeed described a good example implementation of this [4].
6. Personally, I believe the relational data model is the "right" way to structure most of the data for an application like Quora (and for most user-generated content sites). Schemas allow the data to persist in a typed manner across lots of new versions of the application as it's developed, serve as documentation, and prevent a lot of bugs. And SQL lets you move the computation to the data as necessary, rather than having to fetch a ton of data and post-process it in the application everywhere. I think the "NoSQL" fad will end when someone finally implements a distributed relational database with relaxed semantics.
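
To make points 1 and 4 concrete, here is a minimal sketch of application-level partitioning with a cache in front of the databases. It is only an illustration under stated assumptions, not how Quora or Facebook actually do it: in-memory SQLite connections stand in for separate MySQL servers, a plain dict stands in for memcached, and the table layout and the names shard_for, add_answer, answers_by, and count_all_answers are all hypothetical.

```python
import sqlite3
import zlib

# Assumption: four in-memory SQLite databases stand in for four MySQL servers.
NUM_SHARDS = 4
shards = [sqlite3.connect(":memory:") for _ in range(NUM_SHARDS)]
for conn in shards:
    conn.execute(
        "CREATE TABLE answers ("
        "id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, body TEXT NOT NULL)"
    )

# Assumption: a dict stands in for the memcached layer.
cache = {}


def shard_for(user_id):
    """Pick the shard that owns this user's rows, using a stable hash of the key."""
    return shards[zlib.crc32(str(user_id).encode()) % NUM_SHARDS]


def add_answer(user_id, body):
    """Write to the owning shard, then invalidate the cached read for that user."""
    conn = shard_for(user_id)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO answers (user_id, body) VALUES (?, ?)", (user_id, body)
        )
    cache.pop(user_id, None)


def answers_by(user_id):
    """Serve reads from the cache; only a miss reaches the database."""
    if user_id in cache:
        return cache[user_id]
    rows = shard_for(user_id).execute(
        "SELECT body FROM answers WHERE user_id = ? ORDER BY id", (user_id,)
    ).fetchall()
    result = [row[0] for row in rows]
    cache[user_id] = result
    return result


def count_all_answers():
    """No cross-shard joins: the application queries each shard and combines."""
    return sum(
        conn.execute("SELECT COUNT(*) FROM answers").fetchone()[0]
        for conn in shards
    )
```

All rows for one user live on one shard, so per-user queries stay ordinary SQL on a single server. Anything that spans users, such as count_all_answers, has to be assembled in the application, which is the "no joins across partitions" trade-off from point 1.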
--------
[1] http://www.datacenterknowledge.c...
[2] What portions of Facebook use Cassandra today?
[3] How scalable is CouchDB in practice, not just in theory?
[4] http://bret.appspot.com/entry/ho...
Quora: http://www.quora.com/Quora-Infrastructure/Why-does-Quora-use-MySQL-as-the-data-store-instead-of-NoSQLs-such-as-Cassandra-MongoDB-CouchDB-etc#ans554476
Ref: http://www.quora.com/MongoDB/What-is-the-largest-production-deployment-of-MongoDB-for-online-use
http://bret.appspot.com/entry/how-friendfeed-uses-mysql