If she can't give birth, she's a mule.
If she gives birth to one or two, she's a donkey.
If she gives birth to a whole litter, she's a rabbit.
Repeat the trial a few times to rule out chance.
From the Bilibili comment section.
It goes roughly like this:
Marry her.
#&*, [18+ content redacted], ■■■ and then ○□☆ afterwards
If she turns out to be infertile, she's a mule.
If she gives birth to one or two per pregnancy, she's a donkey.
If she drops a whole litter at once, she's a rabbit. (Huh, the verb changed.)
The procedure can be repeated to rule out random factors.
An airtight plan.
When my roommate heard this: Σ(°Д°; Could you please act like a decent human being??!!
Once upon a time there was a 2D game character named Amiya who had a pair of donkey ears. To keep anyone from finding out, she always wore a big hat. But her hair kept growing, and to get her bangs trimmed she eventually had to call in a barber. An illustrator seized the chance to threaten the barber: "If you tell anyone, I'll kill you."
The barber promised not to tell. But keeping the secret bottled up weighed on him, and day by day he grew haggard. His wife said: "If there's something you don't want anyone to know, go up the mountain, dig a hole, and shout it into the hole."
The barber thought this an excellent idea. So he went up the mountain, dug a hole, and shouted into it: "Amiya has donkey ears! Amiya has donkey ears!" Having shouted it out, he felt much better and walked away content.
Winter passed and spring came. In the hole the barber had dug, a seed sprouted and grew into a tree. A shepherd boy made a pipe from one of its branches, and everything it played came out as "Amiya has donkey ears! Amiya has donkey ears!"
Now the whole kingdom knows that Amiya has a pair of donkey ears.
The moment donkeys come up, single-core inevitably comes to mind.
I'm still only watching your game from the sidelines, but I suspect you people discriminate against donkeys.
Thanks for the invite. Tired of reading papers and with time to kill, I put together a small dataset and trained a classifier to settle this question. Following the well-known Kaggle competition/project Dogs vs. Cats (https://www.kaggle.com/c/dogs-vs-cats), let's run a Donkeys vs. Rabbits binary classification on Amiya.
Conclusion first: fed Amiya's official art with the background removed (to reduce noise), the classifier concludes that Amiya is 99.9% likely a donkey. As for why the network reaches this verdict, my guess is that the feature contributing most to the result is Amiya's color (donkeys are almost uniformly grey or dark, while rabbits are roughly half white and half grey-black); ear length and shape presumably also play a part.
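For the curious, the inference step amounts to roughly the sketch below. To be clear, this is my illustration rather than the exact script used: the checkpoint name donkey_vs_rabbit.h5 and the image filename are placeholders, and which end of the sigmoid means "donkey" depends on the class-folder order at training time (with alphabetically indexed folders, donkey would be class 0).

import numpy as np
from keras.models import load_model
from keras.preprocessing import image

model = load_model('donkey_vs_rabbit.h5')  # placeholder checkpoint name
# load_img converts to RGB; using background-free art keeps noise low
img = image.load_img('amiya_official_art.png', target_size=(128, 128))
x = np.expand_dims(image.img_to_array(img) / 255.0, axis=0)  # shape (1, 128, 128, 3)
p = model.predict(x)[0][0]  # sigmoid output in [0, 1]
# assuming alphabetical class folders (donkey = 0, rabbit = 1):
print('P(donkey) = {:.3f}'.format(1.0 - p))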
Now for the data preparation and the network. For the data I used an open-source crawler, GoogleImagesDownloader (https://github.com/WuLC/GoogleImagesDownloader), which exploits the fact that you can swap the keyword in the Google Images search URL. I crawled 2,000 donkey images and 2,000 rabbit images (a small volume, but it is a tiny dataset after all), skipping any dead links and failed downloads:
import os
import json
import time
import logging
import urllib.request
import urllib.error
from urllib.parse import urlparse
from multiprocessing import Pool
from user_agent import generate_user_agent
from selenium import webdriver
from selenium.webdriver.common.keys import Keys


def get_image_links(main_keyword, supplemented_keywords, link_file_path, num_requested=1000):
    """Scrape Google Images for the keyword and save the image URLs to a file."""
    number_of_scrolls = int(num_requested / 400) + 1  # number_of_scrolls * 400 images will be opened in the browser
    img_urls = set()
    driver = webdriver.Firefox()
    for i in range(len(supplemented_keywords)):
        search_query = main_keyword + ' ' + supplemented_keywords[i]
        url = "https://www.google.com/search?q=" + search_query + "&source=lnms&tbm=isch"
        driver.get(url)
        for _ in range(number_of_scrolls):
            for __ in range(10):  # multiple scrolls needed to show all 400 images
                driver.execute_script("window.scrollBy(0, 1000000)")
                time.sleep(2)  # wait for the next batch of images to load
            time.sleep(5)
            try:
                driver.find_element_by_xpath("//input[@value='Show more results']").click()
            except Exception:
                print("Process-{0} reached the end of page or got the maximum number of requested images".format(main_keyword))
                break
        # imges = driver.find_elements_by_xpath('//div[@class="rg_meta"]')  # not working anymore
        imges = driver.find_elements_by_xpath('//div[contains(@class,"rg_meta")]')
        for img in imges:
            img_url = json.loads(img.get_attribute('innerHTML'))["ou"]
            # img_type = json.loads(img.get_attribute('innerHTML'))["ity"]
            img_urls.add(img_url)
        print('Process-{0} added keyword {1}, got {2} image urls so far'.format(main_keyword, supplemented_keywords[i], len(img_urls)))
    print('Process-{0} got {1} images in total'.format(main_keyword, len(img_urls)))
    driver.quit()
    with open(link_file_path, 'w') as wf:
        for url in img_urls:
            wf.write(url + '\n')
    print('Stored all the links in file {0}'.format(link_file_path))


def download_images(link_file_path, download_dir, log_dir):
    """Download every URL in the link file, skipping dead links and logging failures."""
    print('Start downloading with link file {0}..........'.format(link_file_path))
    if not os.path.exists(log_dir):
        os.makedirs(log_dir)
    main_keyword = link_file_path.split('/')[-1]
    log_file = log_dir + 'download_selenium_{0}.log'.format(main_keyword)
    logging.basicConfig(level=logging.DEBUG, filename=log_file, filemode="a+",
                        format="%(asctime)-15s %(levelname)-8s %(message)s")
    img_dir = download_dir + main_keyword + '/'
    count = 0
    headers = {}
    if not os.path.exists(img_dir):
        os.makedirs(img_dir)
    # start to download images
    with open(link_file_path, 'r') as rf:
        for link in rf:
            try:
                o = urlparse(link.strip())
                ref = o.scheme + '://' + o.hostname
                # spoof a browser user agent so fewer hosts reject the request
                ua = generate_user_agent()
                headers['User-Agent'] = ua
                headers['referer'] = ref
                print('\n{0}\n{1}\n{2}'.format(link.strip(), ref, ua))
                req = urllib.request.Request(link.strip(), headers=headers)
                response = urllib.request.urlopen(req, timeout=30)
                data = response.read()
                file_path = img_dir + '{0}.jpg'.format(count)
                with open(file_path, 'wb') as wf:
                    wf.write(data)
                print('Process-{0} downloaded image {1}/{2}.jpg'.format(main_keyword, main_keyword, count))
                count += 1
                if count % 10 == 0:
                    print('Process-{0} is sleeping'.format(main_keyword))
                    time.sleep(5)
            # HTTPError is a subclass of URLError, so it must be caught first
            except urllib.error.HTTPError as e:
                print('HTTPError')
                logging.error('HTTPError while downloading image {0}, http code {1}, reason: {2}'.format(link, e.code, e.reason))
                continue
            except urllib.error.URLError as e:
                print('URLError')
                logging.error('URLError while downloading image {0}, reason: {1}'.format(link, e.reason))
                continue
            except Exception as e:
                print('Unexpected Error')
                logging.error('Unexpected error while downloading image {0}, error type: {1}, args: {2}'.format(link, type(e), e.args))
                continue


if __name__ == "__main__":
    main_keywords = ['donkey', 'rabbit']  # crawl both classes
    supplemented_keywords = ['']
    download_dir = './data/'
    link_files_dir = './link_files/'
    log_dir = './logs/'
    max_pic_num = 2000  # images per keyword

    # scrape the image URLs first
    p = Pool(1)  # default pool size is the number of CPU cores; change it yourself
    for keyword in main_keywords:
        p.apply_async(get_image_links, args=(keyword, supplemented_keywords, link_files_dir + keyword, max_pic_num))
    p.close()
    p.join()
    print('Finished getting all image links')

    # then download them
    p = Pool()
    for keyword in main_keywords:
        p.apply_async(download_images, args=(link_files_dir + keyword, download_dir, log_dir))
    p.close()
    p.join()
    print('Finished downloading all images')
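One extra step I'd suggest before training (my own addition, not part of the crawler): some "successful" downloads are still truncated files or not actually images, so a quick integrity pass with Pillow keeps them from crashing the dataloader later.

import os
from PIL import Image

def remove_broken_images(img_dir):
    # verify() is a cheap integrity check that does not fully decode the image
    for name in os.listdir(img_dir):
        path = os.path.join(img_dir, name)
        try:
            with Image.open(path) as img:
                img.verify()
        except Exception:
            print('Removing broken file {0}'.format(path))
            os.remove(path)

for keyword in ('donkey', 'rabbit'):
    remove_broken_images('./data/' + keyword + '/')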
I was too lazy to build the network myself, so I took an off-the-shelf one from Kaggle [2] and only modified the dataloader. The architecture is three convolutional layers plus one fully connected layer, with batch normalization and dropout in between. I won't paste that code; you can download it from Kaggle yourself, and the model summary is attached below. Since the dataset is small, 100 epochs trained quickly, the train and validation losses both looked fine, and the inference result is the one given at the top.
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_1 (Conv2D)            (None, 126, 126, 32)      896
_________________________________________________________________
batch_normalization_1 (Batch (None, 126, 126, 32)      128
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 63, 63, 32)        0
_________________________________________________________________
dropout_1 (Dropout)          (None, 63, 63, 32)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 61, 61, 64)        18496
_________________________________________________________________
batch_normalization_2 (Batch (None, 61, 61, 64)        256
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 30, 30, 64)        0
_________________________________________________________________
dropout_2 (Dropout)          (None, 30, 30, 64)        0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 28, 28, 128)       73856
_________________________________________________________________
batch_normalization_3 (Batch (None, 28, 28, 128)       512
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 14, 14, 128)       0
_________________________________________________________________
dropout_3 (Dropout)          (None, 14, 14, 128)       0
_________________________________________________________________
flatten_1 (Flatten)          (None, 25088)             0
_________________________________________________________________
dense_1 (Dense)              (None, 512)               12845568
_________________________________________________________________
batch_normalization_4 (Batch (None, 512)               2048
_________________________________________________________________
dropout_4 (Dropout)          (None, 512)               0
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 513
=================================================================
Total params: 12,942,273
Trainable params: 12,940,801
Non-trainable params: 1,472
_________________________________________________________________
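The summary corresponds to something like the following Keras sketch. Note this is my reconstruction inferred from the summary above, not the kernel's exact code: the 3x3 kernels, 2x2 pools, and 128x128 RGB input follow from the output shapes and parameter counts, while the dropout rates, activations, optimizer, and dataloader settings are my assumptions.

from keras.models import Sequential
from keras.layers import (Conv2D, BatchNormalization, MaxPooling2D,
                          Dropout, Flatten, Dense)
from keras.preprocessing.image import ImageDataGenerator

# three conv blocks + one FC layer, matching the summary's shapes and param counts
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(128, 128, 3)))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))  # dropout rates here are assumed, not confirmed
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))  # single unit: donkey vs. rabbit
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

# dataloader: read straight from the ./data/<keyword>/ folders the crawler wrote;
# with alphabetical class indexing, donkey -> 0 and rabbit -> 1
datagen = ImageDataGenerator(rescale=1.0 / 255, validation_split=0.2)
train_gen = datagen.flow_from_directory('./data/', target_size=(128, 128),
                                        batch_size=32, class_mode='binary',
                                        subset='training')
val_gen = datagen.flow_from_directory('./data/', target_size=(128, 128),
                                      batch_size=32, class_mode='binary',
                                      subset='validation')
model.fit_generator(train_gen, epochs=100, validation_data=val_gen)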
References:
[1] GoogleImagesDownloader, https://github.com/WuLC/GoogleImagesDownloader
[2] keras-cnn-dog-or-cat-classification, https://www.kaggle.com/uysimty/keras-cnn-dog-or-cat-classification