How Good Are Inflatable Dolls? Let Python Tell You!!

The source code has been uploaded, so let's learn from it together!

import json
import os
import random
import time

import jieba
import requests
from imageio import imread
from wordcloud import WordCloud

comment_file_path = 'jd_comments.txt'


def get_spider_comments(page=0):
    # Scrape one page of JD product comments
    url = 'https://sclub.jd.com/comment/productPageComments.action?callback=fetchJSON_comment98vv7990&productId=1070129528&score=0&sortType=5&page=%s&pageSize=10&isShadowSku=0&rid=0&fold=1' % page
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36',
        'referer': 'https://item.jd.com/1070129528.html'
    }
    try:
        response = requests.get(url, headers=headers)
    except requests.RequestException:
        print('something wrong!')
        # Without a response there is nothing to parse, so skip this page
        return
    # Get the JSON payload: drop the 26-character JSONP prefix
    # "fetchJSON_comment98vv7990(" and the trailing ");"
    comments_json = response.text[26:-2]
    # Convert the JSON text into a Python object
    comments_json_obj = json.loads(comments_json)
    # Get everything under the "comments" key
    comments_all = comments_json_obj['comments']
    # Append the "content" field of each comment to the file, one per line
    with open(comment_file_path, 'a+', encoding='utf-8') as fin:
        for comment in comments_all:
            fin.write(comment['content'] + '\n')
            print(comment['content'])


def batch_spider_comments():
    # Clear the file before each run
    if os.path.exists(comment_file_path):
        os.remove(comment_file_path)
    for i in range(100):
        print('Scraping page ' + str(i + 1) + ' ...')
        get_spider_comments(i)
        # Sleep for a random 0 to 5 seconds between requests
        time.sleep(random.random() * 5)


def cut_word():
    # Segment the comment text with jieba and join the tokens with spaces
    with open(comment_file_path, encoding='utf-8') as file:
        comment_text = file.read()
        wordlist = jieba.lcut_for_search(comment_text)
        new_wordlist = ' '.join(wordlist)
    return new_wordlist


def create_word_cloud():
    # Use ball.jpg as the mask shape and a font that can render Chinese
    mask = imread('ball.jpg')
    wordcloud = WordCloud(font_path='msyh.ttc', mask=mask).generate(cut_word())
    wordcloud.to_file('picture.png')


if __name__ == '__main__':
    # jd_comments.txt must already exist; call batch_spider_comments() first to create it
    create_word_cloud()
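The script unwraps the JSONP response with the fixed slice `response.text[26:-2]`, which only works while the callback name stays exactly `fetchJSON_comment98vv7990`. As a rough sketch of a more tolerant approach, the helper below strips any `callback(...)` wrapper with a regular expression. The name `strip_jsonp` and the sample string are made up for illustration and are not part of the original source.

```python
import json
import re

# Matches "callbackName( ... )" with an optional trailing semicolon.
_JSONP_RE = re.compile(r'^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$', re.DOTALL)


def strip_jsonp(text):
    """Parse the JSON payload out of a JSONP response body (hypothetical helper)."""
    match = _JSONP_RE.match(text)
    if match is None:
        raise ValueError('response is not in callback(...) form')
    return json.loads(match.group(1))


# Made-up sample in the same shape as the response the script expects.
sample = 'fetchJSON_comment98vv7990({"comments": [{"content": "ok"}]});'
data = strip_jsonp(sample)
print(data['comments'][0]['content'])
```

Inside `get_spider_comments`, this could stand in for the slice plus `json.loads`, for example `comments_json_obj = strip_jsonp(response.text)`, assuming the endpoint still returns a JSONP body.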

Thanks for your support, everyone. I'll keep working hard ✊!!

This article is a reader submission and does not represent the views of 程序员编程网. If you repost it, please credit the source: http://www.cxybcw.com/202455.html
