Conversation
@sourcery-ai sourcery-ai bot commented Oct 26, 2023

Pull Request #17 refactored by Sourcery.

If you're happy with these changes, merge this Pull Request using the Squash and merge strategy.

NOTE: As code is pushed to the original Pull Request, Sourcery will
re-run and update (force-push) this Pull Request with new refactorings as
necessary. If Sourcery finds no refactorings at any point, this Pull Request
will be closed automatically.

See our documentation here.

Run Sourcery locally

Reduce the feedback loop during development by using the Sourcery editor plugin.

Review changes via command line

To manually merge these changes, make sure you're on the dev branch, then run:

```sh
git fetch origin sourcery/dev
git merge --ff-only FETCH_HEAD
git reset HEAD^
```

Help us improve this pull request!

@sourcery-ai sourcery-ai bot left a comment

Due to GitHub API limits, only the first 60 comments can be shown.


```diff
 def greet(name):
-    print('Hello {}'.format(name))
+    print(f'Hello {name}')
```

Function greet refactored with the following changes:
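The three formatting styles Sourcery is migrating between are equivalent for simple substitution; a minimal standalone illustration (not code from this PR):

```python
name = "world"
a = 'Hello %s' % name        # printf-style formatting
b = 'Hello {}'.format(name)  # str.format
c = f'Hello {name}'          # f-string (Python 3.6+)
assert a == b == c == 'Hello world'
```

f-strings are generally preferred in modern code since the expression sits inline with the text.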

```diff
 """
 def job1():
-    print("I'm running on threads %s" % threading.current_thread())
+    print(f"I'm running on threads {threading.current_thread()}")
```

Function job1 refactored with the following changes:

```diff
     print(f"I'm running on threads {threading.current_thread()}")
 def job2():
-    print("I'm running on threads %s" % threading.current_thread())
+    print(f"I'm running on threads {threading.current_thread()}")
```

Function job2 refactored with the following changes:

```diff
     print(f"I'm running on threads {threading.current_thread()}")
 def job3():
-    print("I'm running on threads %s" % threading.current_thread())
+    print(f"I'm running on threads {threading.current_thread()}")
```

Function job3 refactored with the following changes:

Comment on lines -13 to +15

```diff
-CHROME_DRIVER_MAPPING_FILE = r"{}\mapping.json".format(CHROME_DRIVER_FOLDER)
-CHROME_DRIVER_EXE = r"{}\chromedriver.exe".format(CHROME_DRIVER_FOLDER)
-CHROME_DRIVER_ZIP = r"{}\chromedriver_win32.zip".format(CHROME_DRIVER_FOLDER)
+CHROME_DRIVER_MAPPING_FILE = f"{CHROME_DRIVER_FOLDER}\mapping.json"
+CHROME_DRIVER_EXE = f"{CHROME_DRIVER_FOLDER}\chromedriver.exe"
+CHROME_DRIVER_ZIP = f"{CHROME_DRIVER_FOLDER}\chromedriver_win32.zip"
```

Lines 13-15 refactored with the following changes:
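A caveat on this hunk: the originals were raw strings (`r"..."`), but the refactored f-strings drop the `r` prefix, so `\m` and `\c` become unrecognized escape sequences — Python leaves them unchanged today, but has emitted a DeprecationWarning for them since 3.6 and a SyntaxWarning since 3.12. A safer sketch avoids backslashes entirely with `os.path.join` (the folder value here is assumed purely for illustration):

```python
import os

CHROME_DRIVER_FOLDER = r"C:\drivers"  # assumed value, for illustration only

# os.path.join sidesteps backslash-escape pitfalls and is platform-neutral
CHROME_DRIVER_MAPPING_FILE = os.path.join(CHROME_DRIVER_FOLDER, "mapping.json")
CHROME_DRIVER_EXE = os.path.join(CHROME_DRIVER_FOLDER, "chromedriver.exe")
CHROME_DRIVER_ZIP = os.path.join(CHROME_DRIVER_FOLDER, "chromedriver_win32.zip")
```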

Comment on lines -386 to +371

```diff
-    ret = []
-    for i in range(1, page):
-        ret.append('http://www.samair.ru/proxy/proxy-%(num)02d.htm' % {'num': i})
-    return ret
+    return [
+        'http://www.samair.ru/proxy/proxy-%(num)02d.htm' % {'num': i}
+        for i in range(1, page)
+    ]
```

Function build_list_urls_3 refactored with the following changes:
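This accumulate-and-return pattern recurs throughout the PR; the loop and the comprehension it becomes are behaviorally identical. A minimal sketch of the general transformation (function name and URL template are hypothetical, not from the PR):

```python
def build_urls(page, template='http://example.com/proxy-%02d.htm'):
    """Hypothetical helper mirroring the refactored pattern."""
    # comprehension replaces: ret = []; for i in ...: ret.append(...); return ret
    return [template % i for i in range(1, page)]

assert build_urls(3) == [
    'http://example.com/proxy-01.htm',
    'http://example.com/proxy-02.htm',
]
```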

Comment on lines -406 to +388

```diff
-    ip = match[0] + "." + match[1] + match[2]
+    ip = f"{match[0]}.{match[1]}{match[2]}"
```

Function parse_page_3 refactored with the following changes:

Comment on lines -432 to +417

```diff
-    ret = []
-    for i in range(1, page):
-        ret.append('http://www.pass-e.com/proxy/index.php?page=%(n)01d' % {'n': i})
-    return ret
+    return [
+        'http://www.pass-e.com/proxy/index.php?page=%(n)01d' % {'n': i}
+        for i in range(1, page)
+    ]
```

Function build_list_urls_4 refactored with the following changes:

Comment on lines -476 to +461

```diff
-    ret = []
-    for i in range(1, page):
-        ret.append('http://www.ipfree.cn/index2.asp?page=%(num)01d' % {'num': i})
-    return ret
+    return [
+        'http://www.ipfree.cn/index2.asp?page=%(num)01d' % {'num': i}
+        for i in range(1, page)
+    ]
```

Function build_list_urls_5 refactored with the following changes:

Comment on lines -508 to +493

```diff
-    ret = []
-    for i in range(1, page):
-        ret.append('http://www.cnproxy.com/proxy%(num)01d.html' % {'num': i})
-    return ret
+    return [
+        'http://www.cnproxy.com/proxy%(num)01d.html' % {'num': i}
+        for i in range(1, page)
+    ]
```

Function build_list_urls_6 refactored with the following changes:

Comment on lines +507 to -528

```diff
+    type = -1  # this site does not provide the proxy server type
     for match in matches:
         ip = match[0]
         port = match[1]
-        type = -1  # this site does not provide the proxy server type
```

Function parse_page_6 refactored with the following changes:

Comment on lines -595 to +580

```diff
-    ret = []
-    for i in range(0, page):
-        ret.append('http://proxylist.sakura.ne.jp/index.htm?pages=%(n)01d' % {'n': i})
-    return ret
+    return [
+        'http://proxylist.sakura.ne.jp/index.htm?pages=%(n)01d' % {'n': i}
+        for i in range(0, page)
+    ]
```

Function build_list_urls_9 refactored with the following changes:

Comment on lines -616 to +598

```diff
-    if (type == 'Anonymous'):
-        type = 1
-    else:
-        type = -1
+    type = 1 if (type == 'Anonymous') else -1
```

Function parse_page_9 refactored with the following changes:
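The if/else-to-conditional-expression rewrite preserves semantics exactly; both branches assign to the same name. A self-contained sketch of the idiom (the function is hypothetical, mirroring the hunk):

```python
def classify(kind):
    # conditional expression replaces the four-line if/else assignment
    return 1 if kind == 'Anonymous' else -1

assert classify('Anonymous') == 1
assert classify('Transparent') == -1
```

Note that shadowing the builtin name `type`, as the original code does, still works but is usually avoided.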

Comment on lines -634 to +616

```diff
-    ret = []
-    for i in range(1, page):
-        ret.append('http://www.publicproxyservers.com/page%(n)01d.html' % {'n': i})
-    return ret
+    return [
+        'http://www.publicproxyservers.com/page%(n)01d.html' % {'n': i}
+        for i in range(1, page)
+    ]
```

Function build_list_urls_10 refactored with the following changes:

Comment on lines -678 to +667

```diff
-    ret = []
-    for i in range(1, page):
-        ret.append('http://www.my-proxy.com/list/proxy.php?list=%(n)01d' % {'n': i})
-
-    ret.append('http://www.my-proxy.com/list/proxy.php?list=s1')
-    ret.append('http://www.my-proxy.com/list/proxy.php?list=s2')
-    ret.append('http://www.my-proxy.com/list/proxy.php?list=s3')
+    ret = [
+        'http://www.my-proxy.com/list/proxy.php?list=%(n)01d' % {'n': i}
+        for i in range(1, page)
+    ]
+    ret.extend(
+        (
+            'http://www.my-proxy.com/list/proxy.php?list=s1',
+            'http://www.my-proxy.com/list/proxy.php?list=s2',
+            'http://www.my-proxy.com/list/proxy.php?list=s3',
+        )
+    )
```

Function build_list_urls_11 refactored with the following changes:
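This hunk combines the comprehension rewrite with a single `extend` call for the trailing static URLs — `list.extend` accepts any iterable, so a tuple of literals replaces three separate `append` calls. A runnable sketch of the combined pattern:

```python
page = 3
ret = [f'http://www.my-proxy.com/list/proxy.php?list={n}' for n in range(1, page)]
ret.extend((
    'http://www.my-proxy.com/list/proxy.php?list=s1',
    'http://www.my-proxy.com/list/proxy.php?list=s2',
))
assert len(ret) == 4
assert ret[-1].endswith('list=s2')
```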

```diff
-    urls = []
-    for page in range(beg, end):
-        urls.append('http://www.baidu.com?&page=%d' % page)
+    urls = ['http://www.baidu.com?&page=%d' % page for page in range(beg, end)]
```

Function run_spider refactored with the following changes:

Comment on lines -92 to +90

```diff
-    res = []
-    for i in range(len(s) - 1):
-        res.append((s[i], s[i + 1]))
+    res = [(s[i], s[i + 1]) for i in range(len(s) - 1)]
```

Function main refactored with the following changes:

Comment on lines -64 to -68

```diff
-    proxies = {
-        'http': '{proxy_type}://{ip}:{port}'.format(proxy_type=proxy_type, ip=ip, port=port),
-        'https': '{proxy_type}://{ip}:{port}'.format(proxy_type=proxy_type, ip=ip, port=port),
+    return {
+        'http': '{proxy_type}://{ip}:{port}'.format(
+            proxy_type=proxy_type, ip=ip, port=port
+        ),
+        'https': '{proxy_type}://{ip}:{port}'.format(
+            proxy_type=proxy_type, ip=ip, port=port
+        ),
     }
-    return proxies
```

Function get_proxy_dict refactored with the following changes:
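Returning the dict literal directly removes the temporary variable. The same function could go a step further and build the URL once with an f-string — a sketch of that variant, not the code Sourcery produced:

```python
def get_proxy_dict(proxy_type, ip, port):
    # build the URL once and reuse it for both schemes
    url = f'{proxy_type}://{ip}:{port}'
    return {'http': url, 'https': url}

assert get_proxy_dict('socks5', '127.0.0.1', 1080) == {
    'http': 'socks5://127.0.0.1:1080',
    'https': 'socks5://127.0.0.1:1080',
}
```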

```diff
-    urls = []
-    for page in range(1, 1000):
-        urls.append('http://www.jb51.net/article/%s.htm' % page)
+    urls = [f'http://www.jb51.net/article/{page}.htm' for page in range(1, 1000)]
```

Function main refactored with the following changes:

```diff
 class TestSyncSpider(SyncSpider):
     def handle_html(self, url, html):
-        print(html)
+        pass
```

Function TestSyncSpider.handle_html refactored with the following changes:

```diff
 class TestAsyncSpider(AsyncSpider):
     def handle_html(self, url, html):
-        print(html)
+        pass
```

Function TestAsyncSpider.handle_html refactored with the following changes:

Comment on lines -24 to +25

```diff
-    urls = []
-    for page in range(1, 1000):
-        #urls.append('http://www.jb51.net/article/%s.htm' % page)
-        urls.append('http://www.imooc.com/data/check_f.php?page=%d'%page)
+    urls = [
+        'http://www.imooc.com/data/check_f.php?page=%d' % page
+        for page in range(1, 1000)
+    ]
```

Lines 24-27 refactored with the following changes:

This removes the following comments (why?):

`#urls.append('http://www.jb51.net/article/%s.htm' % page)`

Comment on lines -64 to +69

```diff
-    func_name = 'get_' + field
+    func_name = f'get_{field}'
     xpath_str = self.xpath_dict.get(field)
     if hasattr(self, func_name):
         return getattr(self, func_name)(xpath_str)
-    else:
-        self.logger.debug(field, self.url)
-        return self.parser.xpath(xpath_str)[0].strip() if xpath_str else ''
+    self.logger.debug(field, self.url)
+    return self.parser.xpath(xpath_str)[0].strip() if xpath_str else ''
```

Function XpathCrawler.get_field refactored with the following changes:
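`get_field` uses the `hasattr`/`getattr` idiom to dispatch to a per-field getter (`get_<field>`) when one is defined, falling back to a generic lookup otherwise; the refactor also drops the redundant `else` after the early `return`. A self-contained sketch of that dispatch idiom (the class and methods here are hypothetical stand-ins, not the PR's `XpathCrawler`):

```python
class FieldReader:
    """Hypothetical class illustrating get_<field> dispatch."""

    def get_title(self, raw):
        # field-specific getter, found dynamically by name
        return raw.strip().title()

    def get_field(self, field, raw):
        func_name = f'get_{field}'
        if hasattr(self, func_name):
            return getattr(self, func_name)(raw)
        # generic fallback for fields without a dedicated getter
        return raw.strip() if raw else ''

r = FieldReader()
assert r.get_field('title', '  hello world ') == 'Hello World'
assert r.get_field('author', '  jane ') == 'jane'
```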

Comment on lines -73 to +75

```diff
-    xpath_result = {}
-    for field, xpath_string in self.xpath_dict.items():
-        xpath_result[field] = makes(self.get_field(field))  # to utf8
-    return xpath_result
+    return {
+        field: makes(self.get_field(field))
+        for field, xpath_string in self.xpath_dict.items()
+    }
```

Function XpathCrawler.get_result refactored with the following changes:

This removes the following comments (why?):

`# to utf8`

Comment on lines -93 to +92

```diff
-    category_urls = []
-    for href in category_hrefs:
-        category_urls.append(urljoin(self.domain, href))
+    category_urls = [urljoin(self.domain, href) for href in category_hrefs]
```

Function FantasyhairbuySite.generate_category_urls refactored with the following changes:

Comment on lines -11 to +14

```diff
-    backup_name = os.path.basename(file_path) + '_' + datetime.datetime.now().strftime('%Y%m%d%H%M%S')
+    backup_name = (
+        f'{os.path.basename(file_path)}_'
+        + datetime.datetime.now().strftime('%Y%m%d%H%M%S')
+    )
```

Function backup_file refactored with the following changes:

Comment on lines -44 to +47

```diff
-    backup_list = glob.glob(os.path.join(path, pattern + '_*'))
+    backup_list = glob.glob(os.path.join(path, f'{pattern}_*'))
```

Function main refactored with the following changes:


```diff
-    # if it's anything else, return it in its original form
-    return data
+    return data.encode('utf-8') if str(type(data)) == "<type 'unicode'>" else data
```

Function _byteify refactored with the following changes:

This removes the following comments (why?):

`# if it's anything else, return it in its original form`

```diff
     try:
-        res = query.find()
-        return res
+        return query.find()
```

Function LeanCloudApi.get_skip_obj_list refactored with the following changes:

Comment on lines -49 to +48

```diff
-    img_info_url = img_url + '?imageInfo'
+    img_info_url = f'{img_url}?imageInfo'
```

Function LeanCloudApi.add_img_info refactored with the following changes:
