Chapter 2：Guidelines for Python Developers

来源：互联网发布：数据降维方法编辑：程序博客网时间：2024/06/04 19:36

Chapter 2 Guidelines for Python Developers

The following guidelines cover aspects of developing MySQL applications that might not be immediately obvious to developers coming from a Python background:

For security, do not hardcode the values needed to connect and log into the database in your main script. Python has the convention of aconfig.py module, where you can keep such values separate from the rest of your code.
Python scripts often build up and tear down large data structures in memory, up to the limits of available RAM. Because MySQL often deals with data sets that are many times larger than available memory, techniques that optimize storage space and disk I/O are especially important. For example, in MySQL tables, you typically use numeric IDs rather than string-based dictionary keys, so that the key values are compact and have a predictable length. This is especially important for columns that make up the primary key for anInnoDB table, because those column values are duplicated within eachsecondary index.
Any application that accepts input must expect to handle bad data.
The bad data might be accidental, such as out-of-range values or misformatted strings. The application can use server-side checks such asunique constraints andNOT NULL constraints, to keep the bad data from ever reaching the database. On the client side, use techniques such as exception handlers to report any problems and take corrective action.
The bad data might also be deliberate, representing an“SQL injection” attack. For example, input values might contain quotation marks, semicolons,% and _ wildcard characters and other characters significant in SQL statements. Validate input values to make sure they have only the expected characters. Escape any special characters that could change the intended behavior when substituted into a SQL statement. Never concatenate a user input value into a SQL statement without doing validation and escaping first. Even when accepting input generated by some other program, expect that the other program could also have been compromised and be sending you incorrect or malicious data.
Because the result sets from SQL queries can be very large, use the appropriate method to retrieve items from the result set as you loop through them.fetchone() retrieves a single item, when you know the result set contains a single row. fetchall() retrieves all the items, when you know the result set contains a limited number of rows that can fit comfortably into memory.fetchmany() is the general-purpose method when you cannot predict the size of the result set: you keep calling it and looping through the returned items, until there are no more results to process.
# fetchone() :检索一个项目，只有一个单行时。
# fetchall() :检索所有项目，有限行，能轻松存进内存。
# fetchmany() :通用的方法，遍历并返回所有项目。
Since Python already has convenient modules such aspickle andcPickle to read and write data structures on disk, data that you choose to store in MySQL instead is likely to have special characteristics:
# python有pickle,cPickle模块，用来在磁盘上写数据结构，使存储在MySQL的数据可能有特殊的特征：
- Too large to all fit in memory at one time. You useSELECT statements to query only the precise items you need, and aggregate functions to perform calculations across multiple items. You configure the innodb_buffer_pool_size option within the MySQL server to dedicate a certain amount of RAM for caching query results.
- Too complex to be represented by a single data structure. You divide the data between different SQL tables. You can recombine data from multiple tables by using ajoin query. You make sure that related data is kept in sync between different tables by setting upforeign key（外部键） relationships.
- Updated frequently, perhaps by multiple users simultaneously. The updates might only affect a small portion of the data, making it wasteful to write the whole structure each time. You use the SQL INSERT,UPDATE, andDELETE statements to update different items concurrently（并发地）, writing only the changed values to disk. You use InnoDB tables and transactions to keep write operations from conflicting with each other, and to return consistent query results even as the underlying data is being updated.
Using MySQL best practices for performance can help your application to scale without requiring major rewrites and architectural changes. SeeOptimization for best practices for MySQL performance. It offers guidelines and tips for SQL tuning, database design, and server configuration.
You can avoid reinventing the wheel by learning the MySQL SQL statements for common operations: operators to use in queries, techniques for bulk loading data, and so on. Some statements and clauses are extensions to the basic ones defined by the SQL standard. See Data Manipulation Statements,Data Definition Statements, andSELECT Syntax for the main classes of statements.
# SQL批量操作
Issuing SQL statements from Python typically involves declaring very long, possibly multi-line string literals. Because string literals within the SQL statements could be enclosed by single quotation, double quotation marks, or contain either of those characters, for simplicity you can use Python's triple-quoting mechanism to enclose the entire statement. For example:
```
'''It doesn't matter if this string contains 'single'or "double" quotes, as long as there aren't 3 in arow.'''
```
You can use either of the ' or" characters for triple-quoting multi-line string literals.
Many of the secrets to a fast, scalable MySQL application involve using the right syntax at the very start of your setup procedure, in theCREATE TABLE statements. For example, Oracle recommends theENGINE=INNODB clause for most tables, and makesInnoDB the default storage engine in MySQL 5.5 and up. UsingInnoDB tables enables transactional behavior that helps scalability of read-write workloads and offers automaticcrash recovery. Another recommendation is to declare a numeric primary key for each table, which offers the fastest way to look up values and can act as a pointer to associated values in other tables (aforeign key). Also within theCREATE TABLE statement, using the most compact column data types that meet your application requirements helps performance and scalability（可拓展） because that enables the database server to move less data back and forth between memory and disk.

摘自：http://dev.mysql.com/doc/connector-python/en/connector-python-coding.html

0 0