Checking library loans and sending email reminders

Source: Internet  Editor: 程序博客网  Time: 2024/06/13 21:19

Reposted from https://segmentfault.com/a/1190000006106478
The script mainly uses BeautifulSoup, MySQL, and pymysql, along with a few other libraries.

1. Import the required libraries

import re
import smtplib
import sys
import time
from email import encoders
from email.header import Header
from email.mime.text import MIMEText
from email.utils import parseaddr, formataddr

import pymysql
import requests
from bs4 import BeautifulSoup

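2. Set up the login session

(1) Python requests manages cookies automatically. The session keeps the connection alive and is finished once the data has been scraped.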

session = requests.Session()
# A Session object persists certain parameters across requests; it also keeps
# cookies across all requests made from the same Session instance
session.headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36',
    'Cookie': 'PHPSESSID=n1grlir20m1snkuhtke3otck76'
}

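(2) To obtain the User-Agent and Cookie values, first log in to the school library website in a browser, then copy them from the request the browser sends.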
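As a small illustration of step (2), the sketch below turns a Cookie header value copied from the browser into a dictionary, using only the standard library. The helper name cookie_header_to_dict and the extra lang=zh pair are made up for the example; only the PHPSESSID value comes from the code above.

```python
from http.cookies import SimpleCookie


def cookie_header_to_dict(header_value):
    """Turn a copied 'Cookie' header value into a {name: value} dict."""
    jar = SimpleCookie()
    jar.load(header_value)
    return {name: morsel.value for name, morsel in jar.items()}


# Hypothetical header value as copied from the browser's developer tools
copied = 'PHPSESSID=n1grlir20m1snkuhtke3otck76; lang=zh'
cookies = cookie_header_to_dict(copied)
print(cookies)
```

A dictionary like this could then be handed to the session (for example with session.cookies.update(cookies)) instead of hard-coding the whole header string.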

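3. Create the database

(1) Install MySQL; download it from the official website and run the installer.
(2) Create the book database and the table "book_list" used below. The table can be created from code, but creating it in Workbench is more visual if that is convenient.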

mysql> create database book;

def get_mysql():
    # user is the MySQL user name and password is that user's password.
    # Setting the charset to utf8 avoids encoding problems when storing data.
    conn = pymysql.connect(host='localhost', port=3306, user='root',
                           password='888888', database='mysql', charset='utf8')
    cur = conn.cursor()      # get a cursor
    cur.execute('use book')  # switch to the book database
    return (cur, conn)

(3) Statements such as 'use book' are MySQL commands; cur.execute sends them to MySQL.
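The article does not show the definition of book_list, so the following is only a sketch of what the table might look like. It uses the standard-library sqlite3 module as a stand-in for MySQL so that it runs without a server. The id column and the TEXT types are assumptions; the seven column names are the ones used in the insert statement of the next step.

```python
import sqlite3

# Assumed layout: an auto-increment id followed by the seven scraped fields.
CREATE_SQL = """
create table if not exists book_list (
    id integer primary key autoincrement,  -- assumed surrogate key
    `条形码` text unique,                   -- barcode
    `题名和作者` text,                      -- title and author
    `借阅日期` text,                        -- borrow date
    `应还日期` text,                        -- due date
    `续借量` text,                          -- number of renewals
    `馆藏地` text,                          -- shelving location
    `附件` text                             -- attachments
)
"""

# sqlite3 uses ? placeholders; pymysql uses %s for the same purpose.
INSERT_SQL = (
    "insert into book_list "
    "(`条形码`, `题名和作者`, `借阅日期`, `应还日期`, `续借量`, `馆藏地`, `附件`) "
    "values (?, ?, ?, ?, ?, ?, ?)"
)

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(CREATE_SQL)

# Made-up sample record
book = ["A0001", "Example Title / Example Author", "2016-07-01",
        "2016-08-01", "0", "Main stacks", "none"]
cur.execute(INSERT_SQL, book)
conn.commit()

cur.execute("select * from book_list")
rows = cur.fetchall()
print(rows[0][1])  # the barcode sits at index 1 because of the id column
conn.close()
```

With a leading id column, row[1] is the barcode and row[2] the title, which is worth checking against the real table before relying on the numeric indexes used later.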

4. Get the list of borrowed books

(1) Note the SQL statement 'select * from book.book_list' used here.

def get_book_name(book_url):
    html = session.get(book_url, cookies=session.cookies, headers=session.headers).content.decode('utf-8')
    soup = BeautifulSoup(html, 'lxml')
    cur, conn = get_mysql()
    cur.execute('select * from book.book_list')
    rows = cur.fetchall()  # returns all rows
    # Barcodes already stored, used to decide whether a book is already in the database.
    # The index 1 depends on the column order of book_list.
    book_bar = [row[1] for row in rows]
    book_list = []   # used while testing: holds the info list of every book
    book_every = []  # all fields of a single book
    # The column names are wrapped in backticks (the key above Tab), not single quotes;
    # with the wrong quotes the rows cannot be inserted.
    # Columns: barcode, title and author, borrow date, due date, renewals, location, attachments.
    sql = ('insert into `book`.`book_list` '
           '(`条形码`, `题名和作者`, `借阅日期`, `应还日期`, `续借量`, `馆藏地`, `附件`) '
           'values (%s, %s, %s, %s, %s, %s, %s)')
    for book_time in soup.find_all('td', class_='whitetext'):
        print(book_time.get_text().strip())  # strip leading and trailing whitespace
        content = re.sub(r'\s', '', book_time.get_text())  # remove every whitespace character
        if content != '':
            book_every.append(content)
            if len(book_every) == 7:
                book_list.append(book_every)
                if book_every[0] not in book_bar:
                    try:
                        # Let the driver quote the values instead of concatenating strings
                        cur.execute(sql, book_every)
                        conn.commit()
                    except pymysql.MySQLError:
                        conn.rollback()
                book_every = []
    print(book_list)
    cur.close()
    conn.close()
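The grouping logic above (collect the text of each whitetext cell, drop empty ones, and cut the result into records of seven fields) can be tried without the library site. The sketch below swaps BeautifulSoup for the standard-library html.parser so it runs with no extra packages. The class WhiteTextCells, the function group_books, and the sample table are all made up for illustration.

```python
import re
from html.parser import HTMLParser


class WhiteTextCells(HTMLParser):
    """Collect the whitespace-stripped text of every <td class="whitetext">."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_cell = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "td" and "whitetext" in classes:
            self._in_cell = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_cell:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._in_cell:
            self._in_cell = False
            text = re.sub(r"\s", "", "".join(self._buffer))
            if text:  # skip cells that contain only whitespace
                self.cells.append(text)


def group_books(cells, size=7):
    """Cut the flat cell list into complete records of `size` fields."""
    return [cells[i:i + size] for i in range(0, len(cells) - size + 1, size)]


# Made-up sample: seven real fields plus one blank cell
values = ["A0001", " Example Title ", "2016-07-01", "2016-08-01",
          "0", "Main stacks", "none", "  "]
sample = "<table><tr>%s</tr></table>" % "".join(
    '<td class="whitetext">%s</td>' % v for v in values)

parser = WhiteTextCells()
parser.feed(sample)
books = group_books(parser.cells)
print(books)
```

Because every whitespace character is removed, spaces inside a title disappear as well ("Main stacks" becomes "Mainstacks"); that matches the behaviour of the re.sub call in get_book_name.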

5. Compare the dates and send the email

from datetime import datetime  # used below to compute the number of overdue days

def send_message():
    while True:
        cur, conn = get_mysql()
        cur.execute('select * from book_list;')
        rows = cur.fetchall()
        cur.close()
        conn.close()
        today = datetime.now().date()  # current date
        number = 0                     # overdue books found in this pass
        for i in rows:
            print(i[3])
            # i[3] is expected to hold the due date as 'YYYY-MM-DD'
            due = datetime.strptime(i[3], '%Y-%m-%d').date()
            # Date subtraction handles month lengths, leap years and year changes
            day = (today - due).days
            if day > 0:
                print('Already %d days overdue' % day)
                number += 1
                send_email(day, number, i[2])
            else:
                print('This book is not overdue yet')
        time.sleep(3600 * 24)  # check again in one day

def send_email(day, number, title):
    from_addr = 'lol@163.com'
    password = 'lol'
    to_addr = 'dota@qq.com'
    smtp_server = 'smtp.163.com'
    text = ('Hello, this is the library administrator with a notice for you: your book "%s" is overdue, '
            'it is already %d days overdue, and you have %d overdue books in total!!!'
            % (title, day, number))
    msg = MIMEText(text, 'plain', 'utf-8')
    msg['From'] = _format_addr('Library notice <%s>' % from_addr)
    msg['To'] = _format_addr('Administrator <%s>' % to_addr)
    msg['Subject'] = Header('Greetings from ***......', 'utf-8').encode()
    server = smtplib.SMTP(smtp_server, 25)
    server.set_debuglevel(1)
    server.login(from_addr, password)
    server.sendmail(from_addr, [to_addr], msg.as_string())
    server.quit()

def _format_addr(s):
    name, addr = parseaddr(s)
    return formataddr((Header(name, 'utf-8').encode(), addr))
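The date comparison and the message construction can both be checked offline before anything is sent. The following is a minimal sketch that assumes due dates are stored as 'YYYY-MM-DD' strings; the helper names overdue_days and build_message, the message wording, and the sample dates are illustrative and not part of the original script.

```python
from datetime import date, datetime
from email.header import Header
from email.mime.text import MIMEText
from email.utils import formataddr


def overdue_days(due_date, today):
    """Days past the due date, or 0 if the book is not overdue yet."""
    due = datetime.strptime(due_date, "%Y-%m-%d").date()
    return max((today - due).days, 0)


def build_message(title, days, from_addr, to_addr):
    """Build the reminder without sending it."""
    text = 'The book "%s" is %d day(s) overdue.' % (title, days)
    msg = MIMEText(text, "plain", "utf-8")
    msg["From"] = formataddr(("Library notice", from_addr))
    msg["To"] = formataddr(("Reader", to_addr))
    msg["Subject"] = Header("Overdue book reminder", "utf-8").encode()
    return msg


# Due on 25 Feb 2016, checked on 2 Mar 2016; 2016 is a leap year
days = overdue_days("2016-02-25", date(2016, 3, 2))
print(days)  # 6

msg = build_message("Example Title", days, "lol@163.com", "dota@qq.com")
print(msg.as_string())
```

Subtracting date objects avoids hand-written month-length tables, and it also gives the right answer when the due date and the current date fall in different years.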

6. Run

get_book_name('URL of the page that lists your borrowed books')
send_message()