Downloading Audiobooks with Multithreaded Python

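Seasoned veterans (the unmarried ones, anyway) rent a place near the office, which spares them the commute and saves a great deal of time. Others, for one reason or another, have to shuttle between home and office under the stars every day, and unfortunately I am one of them. My home is far from the office, so the time I spend on the bus amounts to a quarter of my working hours. On top of that, Hangzhou has long been called China's Las Vegas (for its traffic jams, not its casinos), and whenever traffic locks up I can picture myself turning into a Transformer. No programmer can stand wasting that much time, but since I cannot change my situation anytime soon, I might as well put the time to good use.

So I bought a large-screen Note II for reading PDFs. My ears should not sit idle either, though I listen to novels rather than English lessons. Back in school I loved listening to the radio, especially storytelling and crosstalk, so I need a lot of audiobooks. There are plenty of them online, but downloading them is a real hassle: to earn more traffic and ad clicks, these sites make you open at least two pages before you reach the real download link. To cut the overall download time I wrote this small program, so that I and everyone else can download audiobooks easily (and, of course, any other kind of resource).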

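First, to be clear: I am not out to crawl large amounts of data. This is only for fun and learning, so the program does not aimlessly crawl every link on a site. Instead it works on one given novel. For example, to download the novel 《童年》 (Childhood), I find its home page on 5tps.com (我聽評書網) and have the program download all of its mp3 files. The details are in the code below, all of which lives in the module crawler5tps: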

1. Set the start URL and the directory to save files in

    #-*-coding:GBK-*-
    import urllib,urllib2
    import re,threading,os

    baseurl = 'http://www.5tps.com' #base url
    down2path = 'E:/enovel/'        #saving path
    save2path = ''                  #saving file name (full path)

2. Parse the download-page URLs from the start URL

    def parseUrl(starturl):
        '''
        parse out download page from start url.
        eg. we can get 'http://www.5tps.com/down/8297_52_1_1.html' from 'http://www.5tps.com/html/8297.html'
        '''
        global save2path
        # NOTE: the pattern on the next line is cut off in the source text.
        # It has to capture the '/down/....html' path as group 1.
        rDownloadUrl = re.compile(".*?
        #rTitle = re.compile(".{4}\s{1}(.*)\s{1}.*")
        # sample title line: 有聲小說 悶騷1 播音:劉濤 全集
        f = urllib2.urlopen(starturl)
        totalLine =  f.readlines()

        # create the name of saving file
        title = totalLine[3].split(" ")[1]
        if not os.path.exists(down2path+title):
            os.mkdir(down2path+title)
        # set the save path even when the directory already exists
        save2path = down2path+title+"/"

        downUrlLine = [ line for line in totalLine if rDownloadUrl.match(line)]
        downLoadUrl = []
        for dl in downUrlLine:
            # one line may hold several links: match one, strip it, repeat
            while True:
                m = rDownloadUrl.match(dl)
                if not m:
                    break
                downUrl = m.group(1)
                downLoadUrl.append(downUrl.strip())
                dl = dl.replace(downUrl,'')
        return downLoadUrl
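
As a rough sketch of the idea in step 2, the snippet below pulls every download-page path out of a chunk of HTML with `re.findall`. It is written for Python 3 rather than the Python 2 used in this post. The sample markup, the pattern, and the name `find_download_pages` are all assumptions made for illustration; the real 5tps.com markup may well differ.

```python
import re

# Hypothetical markup: the real page structure is not shown in this post.
SAMPLE_HTML = """
<li><a href='/down/8297_52_1_1.html'>001</a></li>
<li><a href='/down/8297_52_1_2.html'>002</a></li>
<li><a href='/html/8297.html'>back</a></li>
"""

# Assumed pattern: capture any href that points under /down/ and ends in .html.
DOWN_PAGE_RE = re.compile(r"""href=['"](/down/[\w-]+\.html)['"]""")


def find_download_pages(html):
    """Return the /down/... paths found in html, in document order."""
    return DOWN_PAGE_RE.findall(html)


if __name__ == "__main__":
    for path in find_download_pages(SAMPLE_HTML):
        print(path)
```

Using `findall` on the whole page avoids the match, strip, and repeat loop, since it returns every non-overlapping match in a single call.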

3. Parse the real download link from each download page

    def getDownlaodLink(starturl):
        '''
        find out the real download link from download page.
        eg. we can get the download link 'http://180j-d.ysts8.com:8000/人物紀實/童年/001.mp3?\
        1251746750178x1356330062x1251747362932-3492f04cf54428055a110a176297d95a' from \
        'http://www.5tps.com/down/8297_52_1_1.html'
        '''
        downUrl = []
        gbk_ClickWord = '點此下載'   # link text meaning "click here to download"
        downloadUrl = parseUrl(starturl)
        # NOTE: the markup around the link text appears to have been stripped
        # from this pattern in the source text. As written it has no capture
        # group, which m.group(1) below requires.
        rDownUrl = re.compile(''+gbk_ClickWord+'.*') #find the real download link
        for url in downloadUrl:
            realurl = baseurl+url
            print realurl
            for line in urllib2.urlopen(realurl).readlines():
                m = rDownUrl.match(line)
                if m:
                    downUrl.append(m.group(1))

        return downUrl
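
To illustrate step 3, here is a minimal Python 3 sketch that finds the link whose anchor text is the download phrase and returns its `href`. The sample line, the placeholder URL, the pattern, and the name `find_real_link` are assumptions for illustration only, not the site's actual markup.

```python
import re

CLICK_WORD = "點此下載"  # the link text used in this post

# Hypothetical line of markup; the URL is a placeholder, not a real link.
SAMPLE_LINE = '<a href="http://example.com/audio/001.mp3?token=abc">點此下載</a>'

# Assumed pattern: the href of the anchor whose text starts with CLICK_WORD.
REAL_LINK_RE = re.compile(
    r'<a[^>]*href="([^"]+)"[^>]*>\s*' + re.escape(CLICK_WORD)
)


def find_real_link(line):
    """Return the href of the download anchor in line, or None if absent."""
    m = REAL_LINK_RE.search(line)
    return m.group(1) if m else None


if __name__ == "__main__":
    print(find_real_link(SAMPLE_LINE))
```

The capture group around the `href` value is what makes `m.group(1)` meaningful. `search` is used instead of `match` so the anchor may appear anywhere in the line.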

4. Define the download function

    def download(url,filename):
        ''' download mp3 file '''
        print url
        urllib.urlretrieve(url, filename)
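
For readers on Python 3, where `urllib.urlretrieve` has moved to `urllib.request.urlretrieve`, a roughly equivalent function might look like the sketch below. The demo copies a local file through a `file://` URL so it runs without network access; the file names are made up for the example.

```python
import os
import tempfile
import urllib.request
from pathlib import Path


def download(url, filename):
    """Fetch url, store it at filename, and return the saved path."""
    saved_path, _headers = urllib.request.urlretrieve(url, filename)
    return saved_path


if __name__ == "__main__":
    # Demo without network access: fetch a local file via a file:// URL.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp, "source.mp3")
        src.write_bytes(b"fake mp3 bytes")
        dst = os.path.join(tmp, "001.mp3")
        download(src.as_uri(), dst)
        print(os.path.getsize(dst))
```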

5. Create the thread class used to download files

    class DownloadThread(threading.Thread):
        ''' download thread class '''
        def __init__(self,url,savePath):
            threading.Thread.__init__(self)
            self.url = url              # link to download
            self.savePath = savePath    # full path of the target file

        def run(self):
            download(self.url,self.savePath)
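
Subclassing `threading.Thread` is one way to do this. A lighter alternative, sketched here in Python 3, is to pass the function to `Thread` through its `target` and `args` parameters. The helper name `start_worker` and the fake download function are hypothetical and exist only for this example.

```python
import threading


def start_worker(func, *args):
    """Run func(*args) on a new thread and return the started thread."""
    worker = threading.Thread(target=func, args=args)
    worker.start()
    return worker


if __name__ == "__main__":
    saved = []

    def fake_download(url, filename):
        # Stand-in for a real download: just record what would be saved.
        saved.append((url, filename))

    t = start_worker(fake_download, "http://example.com/001.mp3", "1.mp3")
    t.join()  # wait for the worker to finish
    print(saved)
```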

6. Start downloading

    if __name__ == '__main__':
        starturl = 'http://www.5tps.com/html/8297.html'
        downUrl = getDownlaodLink(starturl)
        aliveThreadDict = {}        # alive thread
        downloadingUrlDict = {}     # downloading link

        i = 0
        # keep going until every link is queued and every thread has finished
        while i < len(downUrl) or aliveThreadDict:
            # Note: 5tps.com allows at most three threads to download the same
            # novel at once, but network conditions sometimes interfere. To make
            # sure we get real mp3 files, the thread count is set to 2 here.
            while len(downloadingUrlDict) < 2 and i < len(downUrl):
                downloadingUrlDict[i]=i
                i += 1
            for urlIndex in downloadingUrlDict.values():
                #argsTuple = (downUrl[urlIndex],save2path+str(urlIndex+1)+'.mp3')
                if urlIndex not in aliveThreadDict.values():
                    t = DownloadThread(downUrl[urlIndex],save2path+str(urlIndex+1)+'.mp3')
                    t.start()
                    aliveThreadDict[t]=urlIndex
            for (th,urlIndex) in aliveThreadDict.items():
                th.join(0.1)    # wait briefly instead of spinning
                if not th.isAlive():
                    del aliveThreadDict[th] # delete the thread slot
                    del downloadingUrlDict[urlIndex] # delete the url from url list needed to download

        print 'Completed Download Work'
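
The two-slot bookkeeping in step 6 can also be handed to a thread pool. The sketch below is a Python 3 alternative, not the approach used in this post: it relies on `concurrent.futures.ThreadPoolExecutor` with `max_workers=2` to enforce the same limit. `download_all`, `fake_fetch`, and the sample URLs are hypothetical names for the example, and the fake fetch only sleeps briefly while recording how many calls ran at once.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 2  # the same limit this post settles on


def download_all(urls, fetch, max_workers=MAX_WORKERS):
    """Call fetch(index, url) for each url, at most max_workers at a time.

    Results are returned in the same order as urls.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fetch, i, url)
                   for i, url in enumerate(urls, start=1)]
        return [f.result() for f in futures]


# Shared counters so the demo can report the peak concurrency it observed.
stats = {"active": 0, "peak": 0}
_stats_lock = threading.Lock()


def fake_fetch(index, url):
    """Stand-in for a real download; returns the file name it would write."""
    with _stats_lock:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
    time.sleep(0.01)  # pretend to transfer data
    with _stats_lock:
        stats["active"] -= 1
    return "%d.mp3" % index


if __name__ == "__main__":
    names = download_all(["url-a", "url-b", "url-c"], fake_fetch)
    print(names, "peak concurrency:", stats["peak"])
```

The `with` block waits for all submitted work before returning, so a message such as 'Completed Download Work' printed afterwards would be accurate.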

That's it. Let it download to its heart's content; I still have other projects to code, sigh >>>

After work, copy the files to the Note and you can listen to a novel while reading your documents. The source code is attached.

Original link: http://www.cnblogs.com/wuren/archive/2012/12/24/2831100.html
