SQL基本语法&SQLite

来源:互联网 发布:java中锁的种类 编辑:程序博客网 时间:2024/04/29 15:19

Databases

数据库是一个数据存储区用于存储、查询和处理数据。数据库存储我们需要的数据并且开放一个和数据交互的接口。大多数科技公司使用数据库来组织数。数据库系统包括数据库管理软件与管理控制、安全和访问控制,语言与数据库接口这些内容。
首先,我们将关注SQL语言一个结构化查询语言。它是用来查询、更新和修改数据库中的数据。

SQL

SQL是最常见的一种数据库语言,在任何数据专业工具箱中都是一个重要的工具。虽然SQL是一种语言,但它完全不同于像Python或r 这样的语言, SQL是专门为了与数据库进行交互而建立的一种语言,不会像传统编程语言一样有很多的功能。由于SQL是一种声明式语言,用户需要关注的是表达他想要的东西而计算机则集中在如何执行计算。

Tables, Rows, & Columns

一个数据库就是tables的集合,一个表格包含一行行的数据,以及列信息。table和Pandas中的DataFrame很像。
本文的数据集是**recent-grads.csv**2010到2012的毕业生信息,它的属性如下:

  • Rank - Rank by median earnings.
  • Major_code - Major code.
  • Major - Major description.
  • Major_category - Category of major.
  • Total - Total number of people with major.
  • Sample_size - Sample size (unweighted) of full-time.
  • Men - Male graduates.
  • Women - Female graduates.
  • ShareWomen - Women as share of total.
  • Employed - Number employed.

Querying

  • 写一个查询语言是与数据库进行的基本交互,比如:
SELECT [columnA, columnB, ...]FROM tableName;SELECT Rank,MajorFROM recent_grads;
  • 其中分号 ; 是不能省略的,它表明这个查询语句的结尾。这可以允许我们在一行写很多个查询语句,只要不写 ; 就没有结束。

SQLite

SQLite是一个轻量级别的数据库,用来探索和学习SQL是再好不过了。

SELECT Rank,Major FROM recent_grads;'''[["Rank", "Major"], [1, "PETROLEUM ENGINEERING"], [2, "MINING AND MINERAL ENGINEERING"], [3, "METALLURGICAL ENGINEERING"], [4, "NAVAL ARCHITECTURE AND MARINE ENGINEERING"], . . .'''

Specifying Column Order

  • SQL可以指定select返回值中列的顺序,下面将Major放在前面:
SELECT Major,Rank FROM recent_grads;'''[["Major", "Rank"], ["PETROLEUM ENGINEERING", 1], ["MINING AND MINERAL ENGINEERING", 2], ["METALLURGICAL ENGINEERING", 3], . . .'''

Practice: Select

SELECT Rank,Major_code,Major,Major_category,Total FROM recent_grads;'''[["Rank", "Major_code", "Major", "Major_category", "Total"], [1, 2419, "PETROLEUM ENGINEERING", "Engineering", 2339], [2, 2416, "MINING AND MINERAL ENGINEERING", "Engineering", 756], [3, 2415, "METALLURGICAL ENGINEERING", "Engineering", 856], [4, 2417, "NAVAL ARCHITECTURE AND MARINE ENGINEERING", "Engineering", 1258],. . .'''

Where

我们用select返回的是整个列的数据,倘若我们想要返回某列满足一些条件的数据,则需要用到where语句,where语句是用来过滤的。

比如我们想获取哪些Major的女生要大于男生,使用where需要三个东西:

  • The column we want the database to filter on: ShareWomen
  • A comparison operator to specify how we want a value in a column to be compared: >
  • The comparison value we want the database to compare each value to: 0.5
SELECT Major,ShareWomenFROM recent_gradsWHERE ShareWomen > 0.5;'''[["Major", "ShareWomen"], ["ACTUARIAL SCIENCE", 0.535714286], ["COMPUTER SCIENCE", 0.578766338],["ENVIRONMENTAL ENGINEERING", 0.558548009],["NURSING", 0.896018988], . . .'''

where语句中可以使用的比较符:< <= > >= = !=

Practice: Where

SELECT Major,Employed FROM recent_grads WHERE Employed > 10000;'''[["Major", "Employed"], ["CHEMICAL ENGINEERING", 25694], ["MECHANICAL ENGINEERING", 76442], ["ELECTRICAL ENGINEERING", 61928], . . .'''

Limit

前面查询语句返回的数据都很多,有的时候这很麻烦,我们需要做一个限制。SQL中有一个LIMIT 语句可以实现这个功能,LIMIT 语句放在查询的最后。

  • 下面这条语句将返回结果的前五条:
SELECT Major FROM recent_grads LIMIT 5;'''[["PETROLEUM ENGINEERING"], ["MINING AND MINERAL ENGINEERING"], ["METALLURGICAL ENGINEERING"], ["NAVAL ARCHITECTURE AND MARINE ENGINEERING"], ["CHEMICAL ENGINEERING"]]'''
  • 这个例子将返回Employed>1000的前10条:
SELECT Major FROM recent_grads WHERE Employed>10000 LIMIT 10;'''[["Major"], ["CHEMICAL ENGINEERING"], ["MECHANICAL ENGINEERING"], ["ELECTRICAL ENGINEERING"], ["COMPUTER ENGINEERING"], ["AEROSPACE ENGINEERING"], ["BIOMEDICAL ENGINEERING"], ["INDUSTRIAL AND MANUFACTURING ENGINEERING"], ["GENERAL ENGINEERING"], ["COMPUTER SCIENCE"], ["MANAGEMENT INFORMATION SYSTEMS AND STATISTICS"]]'''

Logical Operators

where语句中有6个比较操作,逻辑操作(Logical operators)最常用的是OR 以及AND,使用逻辑操作可以连接多个比较操作,来进行更细致的过滤。

And Operator

  • AND的语法如下:
SELECT [column1, column2,...] FROM [table1]WHERE [condition1] AND [condition2]
  • 下面这个语句表达的是:从recent_grads 中挑选出ShareWomen>0.5**并且**Employed>10000的数据(Major,ShareWomen,Employed)的前10条
SELECT Major,ShareWomen,Employed FROM recent_grads WHERE ShareWomen>0.5 AND Employed>10000 LIMIT 10;

Or Operator

  • OR操作的语法如下:
SELECT [column1, column2,...] FROM [table1]WHERE [condition1] OR [condition2]
  • 下面这段代码表示的是:从recent_grads 中挑选出edian >= 10000 或者Unemployed <= 1000的数据(Major,Median,Unemployed)的前20条
SELECT Major,Median,Unemployed FROM recent_grads WHERE Median >= 10000 OR Unemployed <= 1000 LIMIT 20;

Grouping Operators

  • 如果我们想获得majors 是Engineering 且大部分是women或者unemployment rate小于5.1%,数据,查询语句如下,可以发现SQL语言是不群分大小写的:
select Major, Major_category, ShareWomen, Unemployment_ratefrom recent_gradswhere (Major_category = 'Engineering') and (ShareWomen > 0.5 or Unemployment_rate < 0.051);'''[["Major", "Major_category", "ShareWomen", "Unemployment_rate"], ["PETROLEUM ENGINEERING", "Engineering", 0.120564344, 0.018380527], ["METALLURGICAL ENGINEERING", "Engineering", 0.153037383, 0.024096386], ["NAVAL ARCHITECTURE AND MARINE ENGINEERING", "Engineering", 0.107313196, 0.050125313],'''
  • 另一个需要注意的是使用括号的问题,如果我们对于Major_category = ‘Engineering’这个句子不加括号,系统会认为我们想表达的意思是下面这样,那么结果会出错:
where (Major_category = 'Engineering' and ShareWomen > 0.5) or (Unemployment_rate < 0.051) 

Practice Grouping Operators

select Major, Major_category, Employed, Unemployment_ratefrom recent_gradswhere (Major_category = 'Business' or Major_category = 'Arts' or Major_category = 'Health') and (Employed > 20000 or Unemployment_rate < 0.051);

Order By

  • select语句返回的结果从上往下的顺序是按照数据库中数据原有的相对位置,比如上面的数据都是按照Rank排列的。但是有时候我们想要获取的结果按照我们的意愿排列,这个时候我们可以使用Order By语句,Order By语句的语法如下:
SELECT [column1, column2,...] FROM [table1]WHERE [conditions]..ORDER BY column1 [ASC or DESC]
  • SQL使用标准的按字母顺序排序,默认是ESC,从小到大。
select Majorfrom recent_gradsorder by Major desclimit 10;'''[["Major"], ["ZOOLOGY"], ["VISUAL AND PERFORMING ARTS"], ["UNITED STATES HISTORY"], ["TREATMENT THERAPY PROFESSIONS"], ["TRANSPORTATION SCIENCES AND TECHNOLOGIES"], ["THEOLOGY AND RELIGIOUS VOCATIONS"], ["TEACHER EDUCATION: MULTIPLE LEVELS"], ["STUDIO ARTS"], ["STATISTICS AND DECISION SCIENCE"], ["SPECIAL NEEDS EDUCATION"]]'''

Order Using Multiple Columns

  • SQL也可以使用多列进行排序,通常在进行人名排序是,先按Last Name排序,再按First Name排序,因为Last Name有很多相同的。
select [column1, column2..]from table_nameorder by column1 (asc or desc), column2 (asc or desc)

这里写图片描述

select Major_category, Median, Majorfrom recent_gradsorder by Major asc, Median desclimit 20;
0 0
原创粉丝点击