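{{{#!sidebar
'''Python resources on this site'''
 * [[Python/Cookbook|Python Cookbook]]: recipes for solving all kinds of problems with Python.
 * [[Python/第一次用就上手|Getting started with Python on your first try]].
 * [[Python/能做什麼|What Python can do]] <- to be tidied up
 * [[The Zen Of Python|The Zen of Python]]: the guiding principles for writing "Pythonic" programs.
 * [[CategoryApplications:Python|Applications]].
------
'''Where can I download Python?'''
 * [[http://www.Python.org/download/|Official Python download area|target="_blank"]].
 * [[http://www.activestate.com/products/activepython/|ActivePython|target="_blank"]] is a Python distribution for Windows. It includes the Python language core, the zlib and bzip2 compression modules, the SQLite and Berkeley DB (bsddb) access modules, the Tix GUI widgets, and more.
 * [[http://pypi.python.org/pypi|PyPI: Python Package Index|target="_blank"]] lists more than thirteen thousand Python programs and modules.
}}}

= What is Python? =

Python is a general-purpose, dynamic, object-oriented programming language. Created by [[http://www.Python.org/~guido/|Guido van Rossum]] (often called GvR or the [[http://en.wikipedia.org/wiki/Guido_van_Rossum|BDFL]]) in the early 1990s, it has been under development for well over a decade and performs well in system administration, network administration, network transfer programs, web development, numerical analysis, and graphical user interface applications.

 * The [[/History]] page gives a brief history of Python.
 * [[http://zh.wikipedia.org/wiki/Python|The Python entry on the Chinese Wikipedia]]

== Convenient Python ==

Python's standard library is rich and powerful: it comes with "batteries included".

Python is used for many purposes, and its users come from every field. In [[http://pypi.python.org/pypi|PyPI: Python Package Index]] you can find packages and modules for almost any need.

Python does not have the commercial-scale marketing that Java enjoys, but it lets you write more concise and clearer code, which makes programmers more productive and improves a software project's chances of success. The article [[http://www.ferg.org/projects/python_java_side-by-side.html|Python & Java: A Side-by-Side Comparison]] argues that Python is typically at least five times as productive as Java.

The [[/IDE]] page gives a short list of Python IDEs.

== Fast Python ==

Among the commonly used dynamic languages (PHP, Perl, Ruby, etc.), Python is one of the fastest at run time.

== Cross-platform Python ==

Python runs on Windows, Mac OS X, Linux, and other common operating systems, as well as on less widely used ones. It also runs in Java and .Net environments.

Besides the widespread Windows CE PDAs, Nokia S60 series phones can run Python too.

== Flexible Python ==

Python is known for how well it binds components together and is often called a "glue language" (Python as a glue). It has worked well alongside C/C++ for many years. The online game EVE combines it with C++ and is a commercial success story. The well-known strategy game Act of War also uses Python for its online multiplayer interface.

In addition, Python works well with Java through Jython, and with .Net through IronPython. The author of IronPython, who now works at Microsoft, is also the original author of Jython.

= Python applications =

== Web development with Python ==

Python has many web development tools. From all kinds of templating engines to full frameworks such as [[Django]], the [[http://pylonsproject.org/|Pylons Project]], [[http://bottlepy.org/|Bottle]], and [[http://flask.pocoo.org|Flask]], it offers excellent support for web development.

If your web development needs go beyond what a framework can do, consider [[Zope]] (Z Object Publishing Environment), a general-purpose web application server. You will probably also be interested in [[Plone]], the powerful web-based content management system built on Python and Zope.

Python supports a wide range of databases: SQLite, MySQL, PostgreSQL, Oracle, MSSQL, Firebird, and others all work.

Python also has an excellent SQL toolkit, SQLAlchemy, which lets us access databases in an object-oriented way.

== Well-known software written in Python ==

 * [[Trac]], an increasingly popular project management and issue tracking system for source code projects, is written in Python.
 * The widely used wiki engine MoinMoin, the powerful application server [[Zope]], and [[http://www.gnu.org/software/mailman/|Mailman]], the most commonly used mailing list software, are also written in Python.
 * The distributed version control systems [[Mercurial]] and [[Bazaar]] are developed in Python as well.
 * The original [[http://www.bittorrent.org/|BitTorrent (BT)]] mainline client was written in Python in every release before version 6.0.

== Well-known services built with Python ==

Did you know?

 * [[http://www.NASA.gov|NASA]] uses Python to compute satellite orbits.
 * The popular [[http://www.YouTube.com|YouTube]] site is written largely in Python.
 * [[http://www.Google.com.tw|Google]] uses Python for its web crawler and many other services; Guido van Rossum joined Google in 2005.

== Other systems written in Python ==

 * [[http://matplotlib.SourceForge.net/|Matplotlib]]: free engineering computation and plotting software similar to Matlab.
 * [[http://salstat.sourceforge.net/|SalStat Statistics]]: free statistics software similar to SPSS.
 * [[http://bibus-biblio.sourceforge.net/wiki/index.php/Main_Page|Bibus Bibliographic software]]: a bibliographic database that, like Endnote, is a great help when writing papers.
 * [[http://www.Gnome.org/projects/straw/|Straw]]: a handy RSS reader.
 * [[http://www.tortall.net/mu/wiki/Cankiri|Cankiri]]: screen recording software for Linux.
 * [[http://www.pitivi.org/wiki/Main_Page|PiTiVi]]: a non-linear video editor.

== What else can Python do? ==

If NASA uses Python to compute satellite orbits, can flying the Space Shuttle with it be far off?

See the official [[http://www.Python.org/about/quotes/|description of the fields where Python is used]].

And don't forget that it can also [[http://xkcd.com/353/|defy gravity]].

= What Python enthusiasts say =

 * [[Thinker]]: My favorite language!
 * [[timchen119]]: An elegant language that is easy to learn and use, and that encourages its users to write readable code.
 * [[yungyuc]]: {OK}
 * [[marr]]: As moving as a first kiss.
 * [[gasolin]]: Learning Python has made the code I write in other languages clearer.
 * DrakeGuan: Seeing my colleagues start using a program I wrote with wxPython put me in a really great mood.
 * [[keitheis]]: Craft and delight. With its simple yet powerful language features and built-in libraries, it is useful, good-looking, and tasty too (no mistake).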
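
= A small example =

The "batteries included" standard library and the SQLite support mentioned above can be illustrated with a short sketch. It is written for Python 3 and uses only the standard `sqlite3` module; the function name `top_languages`, the table layout, and the `SAMPLE` rows are made up for this illustration and do not come from any project named on this page.

```python
import sqlite3


def top_languages(rows, limit=2):
    """Load (name, first_release) rows into an in-memory SQLite database
    and return the names of the `limit` earliest releases."""
    conn = sqlite3.connect(":memory:")  # nothing is written to disk
    try:
        conn.execute("CREATE TABLE languages (name TEXT, first_release INTEGER)")
        # Parameterized statements keep data separate from the SQL text.
        conn.executemany("INSERT INTO languages VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT name FROM languages ORDER BY first_release, name LIMIT ?",
            (limit,),
        )
        return [name for (name,) in cursor]
    finally:
        conn.close()


# Hypothetical sample data, used only for this illustration.
SAMPLE = [("Python", 1991), ("Perl", 1987), ("Ruby", 1995), ("PHP", 1995)]

if __name__ == "__main__":
    print(top_languages(SAMPLE))
```

No third-party package is needed here; for larger applications an object-oriented layer such as SQLAlchemy would sit on top of the same kind of database connection.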
<!--
#!/usr/bin/python
# -*- coding: big5 -*-
# Downloads thesis records from two university ETD-db sites and exports them as CSV.
# Written for Python 2 (httplib, print statement, raw_input).
import httplib, urllib, sys, traceback


def findSubBetween(body, startstr, endstr, start=0):
    # Return the text between startstr (searched from `start`) and the next endstr.
    s = body.find(startstr, start)
    e = body.find(endstr, s + len(startstr))
    return str(body[s + len(startstr):e])


class Detail(object):
    # Plain record for one thesis: author, year, subject, pages, advisor, urn.
    pass


class ETDBean(object):
    # Interface that each site-specific scraper implements.
    def setDownLoadETD(self, detd):
        self.detd = detd
    def gettitle(self): pass
    def getweb(self): pass
    def getlistparams(self): pass
    def haslist(self, rs): pass
    def findlistcounts(self, rs): pass
    def hasnextlistrs(self): pass
    def nextlistrs(self): pass
    def findurnlist(self, rs): pass
    def getdetailbean(self, urn): pass


class DownLoadETD(object):
    def __init__(self, etdbean):
        self.etdbean = etdbean
        self.etdbean.setDownLoadETD(self)
        self.headers = {"User-Agent": "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"}
        self.conn = httplib.HTTPConnection(self.etdbean.getweb())

    def getresponse(self, body, params=None):
        # POST the form parameters to the path `body` and return the response text.
        self.conn.request("POST", body, urllib.urlencode(params or {}), self.headers)
        return self.conn.getresponse().read()

    def getqueryurnlist(self):
        # Walk every result page and collect the thesis URNs.
        urnlist = []
        while self.etdbean.hasnextlistrs():
            rs = self.etdbean.nextlistrs()
            for u in self.etdbean.findurnlist(rs):
                urnlist.append(u)
        return urnlist

    def loaddeatail(self, urnlist):
        # Fetch the detail page of each URN, reporting progress on stderr.
        total = len(urnlist)
        deataillist = []
        for urn in urnlist:
            deataillist.append(self.etdbean.getdetailbean(urn))
            sys.stderr.write("load detail %d/%d done \r" % (len(deataillist), total))
        print '\nloaddeatail done'
        deataillist.sort(key=lambda x: x.year)
        return deataillist

    def exportCSV(self, deataillist):
        if not len(deataillist):
            return
        fname = '%s-歷屆論文.csv' % self.etdbean.gettitle()
        outf = open(fname, 'w')
        # Header: year, thesis, pages, author, advisor, thesis URN, URL
        outf.write('年度,論文,頁數,作者,指導教授,論文識別碼,網址\n')
        for d in deataillist:
            outf.write(','.join([str(d.year), d.subject, str(d.pages), d.author, d.advisor, d.urn,
                                 'http://%s/ETD-db/ETD-search-c/view_etd?URN=%s' % (self.etdbean.getweb(), d.urn)]))
            outf.write('\n')
        outf.close()
        print 'export %s done' % fname

    def download(self):
        """
        1.getqueryurnlist
        2.getdetail from getqueryurn
        3.loaddeatail
        4.exportcsv
        """
        print self.etdbean.gettitle()
        urnlist = self.getqueryurnlist()
        print len(urnlist), '\n', '\n'.join(urnlist)
        deataillist = self.loaddeatail(urnlist)
        self.exportCSV(deataillist)


class MCUETDBean(ETDBean):
    def __init__(self, year1=1, year2=999, department='資訊管理學系碩士在職專班'):
        self.year1 = year1
        self.year2 = year2
        self.department = department
        self.place = 0      # offset of the next result page
        self.maxplace = 0   # total number of records, set after the first query
        self.nextrs = None

    def gettitle(self):
        return "MCU-%s-%s-%s" % (self.department, self.year1, self.year2)

    def getweb(self):
        return "ethesys.lib.mcu.edu.tw"

    def getlistparams(self):
        return {'field6': 'year', 'queryy6': self.year1, 'query6': self.year2, 'boolean6': 'AND',
                'field7': 'department_c', 'query7': self.department, 'boolean7': 'AND',
                'num_terms': '9', 'place': 0, 'field1': 'name_c', 'query1': ''}

    def haslist(self, rs):
        # The site prints this "no records" message when a search is empty.
        return rs.find("沒有記錄") == -1

    def findlistcounts(self, rs):
        return int(findSubBetween(rs, '檢索結果共<font color="red"><b>', '</b>').strip())

    def hasnextlistrs(self):
        if self.place > self.maxplace:
            return False
        params = self.getlistparams()
        params['place'] = self.place
        rs = self.detd.getresponse("/ETD-db/ETD-search-c/search", params)
        if not self.haslist(rs):
            return False
        if not self.maxplace:
            self.maxplace = self.findlistcounts(rs)
        self.nextrs = rs
        self.place = self.place + 12  # the result list advances 12 records per page
        return True

    def nextlistrs(self):
        return self.nextrs

    def findurnlist(self, rs):
        urnlist = []
        ssrt = '<a href="view_etd?URN='
        start = rs.find(ssrt)
        while start > -1:
            urnlist.append(findSubBetween(rs, ssrt, '">', start).strip())
            start = rs.find(ssrt, start + len(ssrt))
        return urnlist

    def getdetailbean(self, urn):
        d = Detail()
        rs = self.detd.getresponse("/ETD-db/ETD-search-c/view_etd", {'URN': urn}).replace('\n', '', 9999999)
        d.author = findSubBetween(rs, '<tr><td align="left" valign="top">中文姓名</td><td align="left" valign="top">', '</td>')
        d.year = findSubBetween(rs, '<tr><td align="left" valign="top">學年度</td><td align="left" valign="top">', '</td>')
        d.subject = findSubBetween(rs, '<tr><td align="left" valign="top">論文名稱(中)</td><td align="left" valign="top">', '</td>')
        d.pages = findSubBetween(rs, '<tr><td align="left" valign="top">頁數</td><td align="left" valign="top">', '</td>')
        d.advisor = findSubBetween(rs, '<tr><td align="left" valign="top">口試委員</td><td align="left" valign="top">', '- 指導教授').split('<li>')[-1].strip(' \n').replace('教授', '')
        d.urn = urn
        return d


class SCUETDBean(ETDBean):
    def __init__(self, year1=1, year2=999, department='法律學系'):
        self.year1 = year1
        self.year2 = year2
        self.department = department
        self.pg = 1
        self.maxpg = 1
        self.pgrecordlimit = 999999999999  # ask for all records on one page
        self.nextrs = None

    def gettitle(self):
        return "SCU-%s-%s-%s" % (self.department, self.year1, self.year2)

    def getweb(self):
        return "etd.library.scu.edu.tw"

    def getlistparams(self):
        return {'field3': 'year', 'query3': self.year2, 'queryy3': self.year1, 'boolean3': 'AND',
                'field2': 'department_c', 'query2': self.department, 'boolean2': 'AND',
                'num_terms': '6', 'sep_num': self.pgrecordlimit, 'field1': 'name_c', 'query1': ''}

    def haslist(self, rs):
        # The site prints this "no data found" message when a search is empty.
        return rs.find("查無任何資料") == -1

    def findlistcounts(self, rs):
        return int(findSubBetween(rs, '共 <b><font color=red>', '</font></b> 筆資料').strip())

    def hasnextlistrs(self):
        if self.pg > self.maxpg:
            return False
        rs = None
        if self.pg == 1:
            rs = self.detd.getresponse("/ETD-db/ETD-search-c/search", self.getlistparams())
            if not self.haslist(rs):
                return False
            sumrecords = self.findlistcounts(rs)
            # Number of pages, rounded up.
            self.maxpg = (sumrecords // self.pgrecordlimit) + min(1, sumrecords % self.pgrecordlimit)
        else:
            params = self.getlistparams()
            params['pg'] = self.pg
            rs = self.detd.getresponse("/ETD-db/ETD-search-c/search", params)
            if not self.haslist(rs):
                return False
        self.pg = self.pg + 1
        self.nextrs = rs
        return True

    def nextlistrs(self):
        return self.nextrs

    def findurnlist(self, rs):
        urnlist = []
        ssrt = '<input type="checkbox" name="flag" value="'
        start = rs.find(ssrt)
        while start > -1:
            urnlist.append(findSubBetween(rs, ssrt, '">', start).strip())
            start = rs.find(ssrt, start + len(ssrt))
        return urnlist

    def getdetailbean(self, urn):
        d = Detail()
        rs = self.detd.getresponse("/ETD-db/ETD-search-c/view_etd", {'URN': urn})
        d.author = findSubBetween(rs, '<td class="data_col_a">姓名</td><td class="data_col_b data_col_bgw">', '(')
        d.year = findSubBetween(rs, '學期</td><td class="data_col_bgw">', '學年度第')
        d.subject = findSubBetween(rs, '<td class="data_col_a">論文名稱</td><td colspan="3" class="data_col_bgw">', '</td>').replace(' ', '')
        d.pages = findSubBetween(rs, '中文', ' 頁')[4:]
        d.advisor = findSubBetween(rs, '依職稱與姓名排序</font> <li>', '- 指導教授').replace('指導教授', '').replace('教授', '').replace('博士', '').strip()
        d.urn = urn
        return d


try:
    for detd in [DownLoadETD(MCUETDBean()), DownLoadETD(SCUETDBean())]:
        detd.download()
    # To download from a single site only:
    #detd = DownLoadETD(MCUETDBean())
    #detd.download()
except Exception:
    traceback.print_exc()
PressKey = raw_input("\n\n\nPress Any key to exit...")
//-->