Dataquest Data Scientist Path: Study Notes (1)
Source: Internet. Editor: 程序博客网. Time: 2024/06/05 13:46
Notes summarizing the key points from the Data Scientist path on Dataquest.
Step 1: Introduction to Python
- Opening a file
f = open("crime_rates.csv", "r")
g = f.read()
- Splitting a string into a list
sample = "john,plastic,joe"
split_list = sample.split(",")
- Replacing characters in a string
text = "Howdy,my,name"
text = text.replace(",", "")
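- Function
# clean defaults to False; pass True when calling to strip the commas.
def clean_text(string_value, clean=False):
    if clean:
        cleaned_value = string_value.replace(",", "")
        return cleaned_value
    # Return the input unchanged; otherwise the function would return None.
    return string_value

sentence = "Howdy,james,bond!"
sentence = clean_text(sentence, True)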
- Converting a string to lowercase
words = "Michael JACKSON Thriller"
lower_words = words.lower()
- Module
Use the reader function from the csv module to read a CSV file.
import csv
f = open("my_data.csv")
csvreader = csv.reader(f)
my_data = list(csvreader)
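The same read can also be sketched with a context manager, which closes the file automatically. This is only an illustration: the rows below are made up (the notes do not show the contents of my_data.csv), and the file is first written to a temporary directory so the snippet runs on its own.

```python
import csv
import os
import tempfile

# Hypothetical sample data; not taken from the original my_data.csv.
sample_rows = [["name", "rate"], ["Albuquerque", "749"], ["Anaheim", "371"]]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "my_data.csv")

    # Write the sample file so there is something to read back.
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(sample_rows)

    # Read it back; the with-block closes the file even if an error occurs.
    with open(path, newline="") as f:
        my_data = list(csv.reader(f))

print(my_data)
```

Note that csv.reader returns every field as a string, so numeric columns still need converting with int() or float().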
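- Class
class Car():
    def __init__(self, name):
        self.name = name
        self.color = "black"
        self.make = "honda"
        self.model = "accord"

    def print_name(self):
        print(self.name)

# print_name must be called on an instance, not on the class itself.
# "accord" is a placeholder name chosen for illustration.
car = Car("accord")
car.print_name()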
- Set
Removing duplicates
unique_animals = set(["Dog", "Cat", "Hippo", "Dog", "Cat", "Dog", "Dog", "Cat"])
- Try/Except Blocks
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
for i in numbers:
    try:
        int('')
    except Exception:
        # To ignore the error silently, replace print with pass.
        print("There was an error")
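As a small variation on the block above, the sketch below catches only ValueError, the exception int() raises for a string that is not a valid integer, and records None for the bad entries. The input list is invented for illustration.

```python
# Hypothetical inputs: two valid integers and two strings int() cannot parse.
raw_values = ["1", "", "3", "abc"]

parsed = []
for value in raw_values:
    try:
        parsed.append(int(value))
    except ValueError:
        # Only conversion failures are handled here; other errors still surface.
        parsed.append(None)

print(parsed)
```

Catching a specific exception rather than Exception keeps unrelated bugs from being silently swallowed.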
- enumerate()
Append each element of door_count to the matching row of cars, i.e. add a column to a list of lists.
door_count = [4, 4]
cars = [
    ["black", "honda", "accord"],
    ["red", "toyota", "corolla"]
]
for i, car in enumerate(cars):
    car.append(door_count[i])
- List Comprehensions
animals = ["Dog", "Tiger", "SuperLion", "Cow", "Panda"]
animal_lengths = [len(animal) for animal in animals]
- items()
fruits = {"apple": 2, "orange": 5, "melon": 10}
for fruit, rating in fruits.items():
    print(rating)
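- Regular Expressions
".": wildcard; matches any single character.
"^": anchors the match at the start of the string, e.g. "^abc" matches strings that begin with "abc".
"$": anchors the match at the end of the string, e.g. "abc$" matches strings that end with "abc".
"|": alternation; matches either the pattern before it or the pattern after it, e.g. "cat|dog" matches "catalog" (contains "cat") and "hotdog" (contains "dog").
"[]": matches any one of the characters inside the brackets, e.g. "[bcr]at" matches "bat", "cat" and "rat".
"[-]": a range inside brackets; "[0-9]" matches any one digit from 0 to 9, and "[h-y]" matches any one lowercase letter from h to y.
"{}": repetition count; "{4}" means the preceding item repeats exactly 4 times, so "[0-9]{4}" can match a four-digit year.
"\": escape character, e.g. "\." matches a literal "." character.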
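import re
# regex, string and years_string are sample values added so the calls run.
regex = "[0-9]{4}"
string = "Released in 1982"
years_string = "1982 and 2010"
# Search "string" for the pattern "regex"; returns a match object if found, otherwise None.
re.search(regex, string)
# Replace "yo" in "yo world" with "hello"; useful for normalizing strings.
re.sub("yo", "hello", "yo world")
# Find every substring of years_string from "1000" to "2999".
re.findall("[1-2][0-9]{3}", years_string)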
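To make the special characters listed above concrete, here is a minimal sketch that pairs each pattern with one string it should match and one it should not. All sample strings are illustrative choices, not taken from the Dataquest material.

```python
import re

# Each entry: (pattern, text that should match, text that should not).
cases = [
    (r"b.t", "bat", "bt"),                   # "." stands for any one character
    (r"^abc", "abcdef", "xabc"),             # "^" anchors at the start
    (r"abc$", "xyzabc", "abcx"),             # "$" anchors at the end
    (r"cat|dog", "hotdog", "bird"),          # "|" means either alternative
    (r"[bcr]at", "rat", "hat"),              # "[]" lists the allowed characters
    (r"[0-9]{4}", "year 1987", "year 87"),   # range plus repetition count
    (r"\.", "a.b", "ab"),                    # "\" escapes a special character
]

results = []
for pattern, should_match, should_not_match in cases:
    hit = re.search(pattern, should_match) is not None
    miss = re.search(pattern, should_not_match) is None
    results.append(hit and miss)
    print(pattern, hit, miss)
```

The patterns are written as raw strings (the r prefix) so that Python passes backslashes through to the regex engine unchanged.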
- time.time()
Unix timestamp: the number of seconds elapsed since the epoch (January 1, 1970, 00:00:00 UTC).
import time
current_time = time.time()  # the current Unix timestamp
- time.gmtime()
Converts a timestamp into a more readable form (a struct_time in UTC). Some of its attributes:
tm_year: The year of the timestamp
tm_mon: The month of the timestamp (1-12)
tm_mday: The day in the month of the timestamp (1-31)
tm_hour: The hour of the timestamp (0-23)
tm_min: The minute of the timestamp (0-59)
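A short sketch of reading these attributes. A fixed timestamp of 0 (the epoch itself) is used here as an assumption so that the result is reproducible; calling gmtime() with no argument uses the current time instead.

```python
import time

# Timestamp 0 is the epoch, so the fields are known in advance.
epoch_struct = time.gmtime(0)
print(epoch_struct.tm_year, epoch_struct.tm_mon, epoch_struct.tm_mday)
print(epoch_struct.tm_hour, epoch_struct.tm_min)

# With no argument, gmtime() converts the current time.
now_struct = time.gmtime()
print(now_struct.tm_year)
```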
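- UTC
Coordinated Universal Time (UTC), often equated with Greenwich Mean Time. The datetime module represents a point in time with the following attributes: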
year
month
day
hour
minute
second
microsecond
import datetime
# Note: now() returns the local time; pass datetime.timezone.utc as its argument for UTC.
current_datetime = datetime.datetime.now()
current_year = current_datetime.year
current_month = current_datetime.month
- timedelta
A class in the datetime module that represents a span of time. It accepts the following parameters:
weeks
days
hours
minutes
seconds
milliseconds
microseconds
diff = datetime.timedelta(weeks = 3, days = 2)
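A timedelta becomes useful once it is added to or subtracted from a datetime. The following is a minimal sketch; the start date is an arbitrary example, and the names later, earlier and gap are introduced here only for illustration.

```python
import datetime

start = datetime.datetime(year=2010, month=3, day=3)  # arbitrary example date
diff = datetime.timedelta(weeks=3, days=2)            # 23 days in total

later = start + diff     # shift forward by the span
earlier = start - diff   # shift backward by the span
gap = later - earlier    # subtracting two datetimes yields a timedelta

print(later, earlier, gap.days)
```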
- datetime.strftime()
A method of datetime objects that formats the time as a readable string in whatever layout you specify.
import datetime
march3 = datetime.datetime(year=2010, month=3, day=3)
pretty_march3 = march3.strftime("%b %d, %Y")
- datetime.datetime.strptime()
A function that parses a string representing a time into a datetime instance.
march3 = datetime.datetime.strptime("Mar 03, 2010", "%b %d, %Y")
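strftime and strptime are inverses when given the same format string. The sketch below shows the round trip; the variable names fmt, text and parsed are illustrative, and the month abbreviation produced by %b depends on the active locale.

```python
import datetime

fmt = "%b %d, %Y"  # abbreviated month name, zero-padded day, four-digit year
march3 = datetime.datetime(year=2010, month=3, day=3)

text = march3.strftime(fmt)                     # datetime -> string
parsed = datetime.datetime.strptime(text, fmt)  # string -> datetime

print(text, parsed == march3)
```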
- datetime.datetime.fromtimestamp()
A function that converts a Unix timestamp into a datetime object (in local time by default).
datetime_object = datetime.datetime.fromtimestamp(1433213314.0)
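Because the plain call uses the machine's local time zone, its result varies from one computer to another. As a hedged sketch, passing the optional tz argument gives a time-zone-aware result that is the same everywhere; the name utc_object is introduced here for illustration.

```python
import datetime

# Same timestamp as above, interpreted explicitly as UTC.
utc_object = datetime.datetime.fromtimestamp(
    1433213314.0, tz=datetime.timezone.utc
)

print(utc_object.year, utc_object.month, utc_object.day)
print(utc_object.isoformat())
```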