get bbs data

Method

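Implementation: fetch posts from the 鼠洞 BBS and forward them to a Google Group:
1. Download all existing posts on the board first, and set up forwarding.
In practice, to work around the Big5 encoding problem, I first convert each post to UTF-8, then apply a regular expression (RE) to pull out the subject line, and mail the post to the Google Group.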

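2. Fetch posts from the newsgroup server:
First, on group.nctu.edu.tw, register an xxx.twbbs.org hostname that points to your own IP.
That makes it possible to connect to group.nctu.edu.tw, after which nntplib can be used to connect to the server and fetch articles.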
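
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the script used here: the helper names (`big5_to_utf8`, `fetch_articles`, `fetch_from_nctu`) are made up for the example, and it assumes that articles on the news server are Big5-encoded like the BBS posts. Note that nntplib ships with Python only up to 3.12; it was removed from the standard library in 3.13.

```python
try:
    import nntplib  # in the standard library up to Python 3.12
except ImportError:  # removed in Python 3.13; must be installed separately there
    nntplib = None

# Errors meaning "this article number no longer exists" (4xx replies).
MISSING_ARTICLE = (nntplib.NNTPTemporaryError,) if nntplib else ()


def big5_to_utf8(raw):
    """Re-encode Big5 bytes as UTF-8, replacing undecodable bytes."""
    return raw.decode("big5", errors="replace").encode("utf-8")


def fetch_articles(server, group_name, limit=10):
    """Return up to `limit` of the newest articles in a group as text.

    `server` is expected to behave like nntplib.NNTP: group() returns
    (response, count, first, last, name) and article() returns
    (response, info), where info.lines is a list of byte strings.
    """
    _resp, _count, first, last, _name = server.group(group_name)
    start = max(first, last - limit + 1)
    texts = []
    for number in range(start, last + 1):
        try:
            _resp, info = server.article(str(number))
        except MISSING_ARTICLE:
            continue  # cancelled or expired article; skip the gap
        # Assumption: the article bytes are Big5, as on the BBS.
        texts.append(b"\n".join(info.lines).decode("big5", errors="replace"))
    return texts


def fetch_from_nctu(group_name, host="group.nctu.edu.tw"):
    """Connect with nntplib (default port 119) and fetch recent articles."""
    if nntplib is None:
        raise RuntimeError("nntplib is not available in this Python version")
    server = nntplib.NNTP(host)
    try:
        return fetch_articles(server, group_name)
    finally:
        server.quit()
```

Decoding before applying any regular expression matters because Big5 is a multi-byte encoding whose second byte can fall in the ASCII range, so byte-level patterns may match in the middle of a character.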

Resource

ADSL address redirection (dynamic DNS):
http://redhat.ecenter.idv.tw/bbs/showthread.php?s=&threadid=12891
http://www.adsl.org/
http://www.5402.idv.tw/is/iptoip/no-ip/no-ip.htm
http://linux.vbird.org/linux_server/0270dynamic_dns.php#need_dynamic_noip

Applying for a twbbs.org hostname:
http://twbbs.org/

Tips

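If a BBS board has forwarding to the NCTU newsgroup server enabled, register an XXXXX.twbbs.org hostname on group.nctu.edu.tw that points to your own IP.

You can then run telnet group.nctu.edu.tw 119 directly to reach the NCTU news server.

A new group named group.XXXXX.account will appear there, and the articles can be fetched from it.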

The basic command is:

Switch group: group group.XXXXX.account
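
For reference, the same command can be issued from a script instead of telnet. The sketch below is only an illustration with hypothetical helper names; it relies on the standard NNTP reply to `GROUP`, which on success is a single line of the form `211 count first last name`.

```python
import socket


def group_command(group_name):
    """Build the raw GROUP command line, CRLF-terminated as NNTP requires."""
    return ("GROUP %s\r\n" % group_name).encode("ascii")


def parse_group_response(line):
    """Parse a '211 count first last name' reply; raise on any other reply."""
    parts = line.strip().split()
    if len(parts) < 5 or parts[0] != "211":
        raise ValueError("GROUP failed: %r" % line)
    return {
        "count": int(parts[1]),  # estimated number of articles
        "first": int(parts[2]),  # lowest article number
        "last": int(parts[3]),   # highest article number
        "name": parts[4],
    }


def select_group(group_name, host="group.nctu.edu.tw", port=119):
    """Open a connection, send GROUP, and return the parsed reply."""
    with socket.create_connection((host, port), timeout=30) as sock:
        stream = sock.makefile("rwb")
        stream.readline()  # server greeting line
        stream.write(group_command(group_name))
        stream.flush()
        reply = stream.readline().decode("ascii", errors="replace")
        stream.write(b"QUIT\r\n")
        stream.flush()
    return parse_group_response(reply)
```

The `first` and `last` numbers from the reply give the range of article numbers that can then be requested one by one.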

Source

import os
import re
import smtplib
import time
from email.message import EmailMessage

# Number of oldest articles to skip before mailing starts.
SKIP_COUNT = 300
# Seconds to wait between messages.
SEND_DELAY = 40
# BBS articles carry their title in a "標題: ..." header line.
SUBJECT_RE = re.compile("標題: (.*)")


def send_mail(from_addr, to_addrs, context, subject):
    # Build a UTF-8 message; EmailMessage inserts the blank line between
    # headers and body and encodes the non-ASCII subject.
    msg = EmailMessage()
    msg["From"] = from_addr
    msg["To"] = ", ".join(to_addrs)
    msg["Subject"] = subject
    msg.set_content(context)

    print("Message length is %d" % len(msg.as_bytes()))

    server = smtplib.SMTP("localhost")
    # server.set_debuglevel(1)
    # Recipients are passed as a list so each address gets its own RCPT.
    server.send_message(msg, from_addr, to_addrs)
    server.quit()


def main():
    from_addr = "Robot"
    to_addrs = ["room_joke@googlegroups.com", "shenyute@gmail.com"]
    # Article files end in ".A"; process them oldest first.
    f_list = [name for name in os.listdir(".") if name.endswith(".A")]
    f_list.sort(key=os.path.getmtime)
    for counter, file_name in enumerate(f_list, start=1):
        if counter <= SKIP_COUNT:
            continue
        with open(file_name, "rb") as f:
            # Articles are stored as Big5; decode them to text first.
            context = f.read().decode("big5", errors="replace")
        subject = ""
        result = SUBJECT_RE.search(context)
        if result is not None:
            subject = result.group(1).strip()
            print(subject)
        send_mail(from_addr, to_addrs, context, subject)
        time.sleep(SEND_DELAY)


if __name__ == "__main__":
    main()