Open
Labels
bug (Something isn't working), enhancement (New feature or request)
Description
Lately, price scraping jobs have been failing to complete because of various connection errors, most notably HTTP 502 responses.
Feb 26 09:21:35 vps-c0ce24d7 start.sh[14631]: ERROR:canadiantracker.triangle:Got status code 502 on try 3
Feb 26 09:21:40 vps-c0ce24d7 start.sh[14631]: DEBUG:canadiantracker.triangle:requested 50 product infos
Feb 26 09:21:40 vps-c0ce24d7 start.sh[14631]: DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): apim.canadiantire.ca:443
Feb 26 09:21:41 vps-c0ce24d7 start.sh[14631]: DEBUG:urllib3.connectionpool:https://apim.canadiantire.ca:443 "POST /v1/product/api/v1/product/sku/PriceAvailability/?lang=en_CA&storeId=64 HTTP/1.1" 502 375
Feb 26 09:21:41 vps-c0ce24d7 start.sh[14631]: ERROR:canadiantracker.triangle:Got status code 502 on try 4
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: Traceback with variables (most recent call last):
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: File "<string>", line 1, in <module>
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: ...skipped... 9 vars
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: File "/home/scraper/.cache/pypoetry/virtualenvs/canadiantracker-lR2ht7gH-py3.10/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: return self.main(*args, **kwargs)
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: self = <Group cli>
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: args = ()
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: kwargs = {}
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: File "/home/scraper/.cache/pypoetry/virtualenvs/canadiantracker-lR2ht7gH-py3.10/lib/python3.10/site-packages/click/core.py", line 1055, in main
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: rv = self.invoke(ctx)
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: self = <Group cli>
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: args = ['scrape-prices', '--db-path', '/home/scraper/db.tmp.gh6iBe/inventory.db']
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: prog_name = 'ctscraper'
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: complete_var = None
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: standalone_mode = True
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: windows_expand_args = True
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: extra = {}
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: ctx = <click.core.Context object at 0x7fb4cd39b820>
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: File "/home/scraper/.cache/pypoetry/virtualenvs/canadiantracker-lR2ht7gH-py3.10/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: return _process_result(sub_ctx.command.invoke(sub_ctx))
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: _process_result = <function MultiCommand.invoke.<locals>._process_result at 0x7fb4cd4fbf40>
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: args = []
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: cmd_name = 'scrape-prices'
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: cmd = <Command scrape-prices>
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: sub_ctx = <click.core.Context object at 0x7fb4cba35ba0>
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: ctx = <click.core.Context object at 0x7fb4cd39b820>
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: self = <Group cli>
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: __class__ = <class 'click.core.MultiCommand'>
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: File "/home/scraper/.cache/pypoetry/virtualenvs/canadiantracker-lR2ht7gH-py3.10/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: return ctx.invoke(self.callback, **ctx.params)
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: self = <Command scrape-prices>
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: ctx = <click.core.Context object at 0x7fb4cba35ba0>
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: File "/home/scraper/.cache/pypoetry/virtualenvs/canadiantracker-lR2ht7gH-py3.10/lib/python3.10/site-packages/click/core.py", line 760, in invoke
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: return __callback(*args, **kwargs)
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: _Context__self = <click.core.Context object at 0x7fb4cba35ba0>
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: _Context__callback = <function scrape_prices at 0x7fb4cb3fe3b0>
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: args = ()
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: kwargs = {'db_path': '/home/scraper/db.tmp.gh6iBe/inventory.db', 'older_than': 1, 'discard_equal': True}
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: ctx = <click.core.Context object at 0x7fb4cba35ba0>
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: File "/home/scraper/CanadianTracker/src/canadiantracker/scraper.py", line 246, in scrape_prices
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: repository.add_product_price_samples(ledger, discard_equal)
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: db_path = '/home/scraper/db.tmp.gh6iBe/inventory.db'
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: older_than = 1
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: discard_equal = True
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: repository = <canadiantracker.storage.ProductRepository object at 0x7fb4cb5d5c00>
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: progress_bar_settings = {'label': 'Scraping prices', 'show_pos': True, 'item_show_func': <function scrape_prices.<locals>.<lambda> at 0x7fb4cb4c4790>, 'bar_template': ''}
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: skus = <click._termui_impl.ProgressBar object at 0x7fb4cb433370>
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: ledger = <canadiantracker.triangle.ProductLedger object at 0x7fb4cb4324d0>
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: File "/home/scraper/CanadianTracker/src/canadiantracker/storage.py", line 308, in add_product_price_samples
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: for info in product_infos:
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: self = <canadiantracker.storage.ProductRepository object at 0x7fb4cb5d5c00>
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: product_infos = <canadiantracker.triangle.ProductLedger object at 0x7fb4cb4324d0>
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: discard_equal = True
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: info = {'_raw_payload': {'code': '4084018', 'active': True, 'sellable': True, 'orderable': False, 'originalPrice': None, 'currentPrice': {'value': Decimal('292.99')}, 'displayWasLabel': False, 'badges': [], 'storeShelfLocation': None, 'fulfillment': {'availability': {'Corporate': {'MinOrderQty': 1, 'bopisETA': {'MinETA': '2023-02-27T00:00:00.000Z', 'MaxETA': '2023-03-03T00:00:00.000Z'}, 'sthETA': {'MinETA': '2023-03-01T00:00:00.000Z', 'MaxETA': '2023-03-06T00:00:00.000Z'}}, 'quantity': 0}, 'storePickUp': {'etaEarliest': None, 'enabled': True}, 'shipToHome': {'etaEarliest': None, 'etaLatest': None, 'enabled': False}, 'expressDelivery': {'enabled': False, 'orderIn': None, 'etaEarliest': None}}, 'partNumber': '171149010', 'feeValue': 3, 'priceMessage': [{'label': None, 'tooltip': None}], 'rebate': None, 'priceValidUntil': None, 'warrantyMessage': 'Passenger and light truck tires purchased, installed and balanced at a Canadian Tire Associate Store are covered by a pro-rated Road Hazard Damage and...
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: price = Decimal('292.99')
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: sku = _StorageSku(code=4084018, formatted_code=408-4018-8, product_index=2)
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: last_sample = _StorageProductSample(index=44312685, sample_time=2023-02-24 08:17:15.719178, sku_index=44, price_cents=29299)
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: equal = True
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: new_sample = _StorageProductSample(index=None, sample_time=2023-02-26 09:21:16.883392, sku_index=None, price_cents=29299)
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: File "/home/scraper/CanadianTracker/src/canadiantracker/triangle.py", line 338, in __iter__
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: for product_info in self._get_product_infos(batch):
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: self = <canadiantracker.triangle.ProductLedger object at 0x7fb4cb4324d0>
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: batch = [_StorageSku(code=4084010, formatted_code=408-4010-4, product_index=2), _StorageSku(code=0081170, formatted_code=008-1170-2, product_index=2), _StorageSku(code=0072349, formatted_code=007-2349-6, product_index=2), _StorageSku(code=0062186, formatted_code=006-2186-4, product_index=2), _StorageSku(code=4084008, formatted_code=408-4008-2, product_index=2), _StorageSku(code=4084009, formatted_code=408-4009-0, product_index=2), _StorageSku(code=4084024, formatted_code=408-4024-2, product_index=2), _StorageSku(code=4084025, formatted_code=408-4025-0, product_index=2), _StorageSku(code=4084022, formatted_code=408-4022-6, product_index=2), _StorageSku(code=4084023, formatted_code=408-4023-4, product_index=2), _StorageSku(code=4089357, formatted_code=408-9357-0, product_index=2), _StorageSku(code=4084028, formatted_code=408-4028-4, product_index=2), _StorageSku(code=4084029, formatted_code=408-4029-2, product_index=2), _StorageSku(code=0081167, formatted_code=008-1167-2, product_index=2), _Stor...
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: product_info = {'_raw_payload': {'code': '4084018', 'active': True, 'sellable': True, 'orderable': False, 'originalPrice': None, 'currentPrice': {'value': Decimal('292.99')}, 'displayWasLabel': False, 'badges': [], 'storeShelfLocation': None, 'fulfillment': {'availability': {'Corporate': {'MinOrderQty': 1, 'bopisETA': {'MinETA': '2023-02-27T00:00:00.000Z', 'MaxETA': '2023-03-03T00:00:00.000Z'}, 'sthETA': {'MinETA': '2023-03-01T00:00:00.000Z', 'MaxETA': '2023-03-06T00:00:00.000Z'}}, 'quantity': 0}, 'storePickUp': {'etaEarliest': None, 'enabled': True}, 'shipToHome': {'etaEarliest': None, 'etaLatest': None, 'enabled': False}, 'expressDelivery': {'enabled': False, 'orderIn': None, 'etaEarliest': None}}, 'partNumber': '171149010', 'feeValue': 3, 'priceMessage': [{'label': None, 'tooltip': None}], 'rebate': None, 'priceValidUntil': None, 'warrantyMessage': 'Passenger and light truck tires purchased, installed and balanced at a Canadian Tire Associate Store are covered by a pro-rated Road Hazard Damage and...
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: File "/home/scraper/CanadianTracker/src/canadiantracker/triangle.py", line 333, in _get_product_infos
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: raise RuntimeError("Failed to get product info")
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: skus = [_StorageSku(code=4084010, formatted_code=408-4010-4, product_index=2), _StorageSku(code=0081170, formatted_code=008-1170-2, product_index=2), _StorageSku(code=0072349, formatted_code=007-2349-6, product_index=2), _StorageSku(code=0062186, formatted_code=006-2186-4, product_index=2), _StorageSku(code=4084008, formatted_code=408-4008-2, product_index=2), _StorageSku(code=4084009, formatted_code=408-4009-0, product_index=2), _StorageSku(code=4084024, formatted_code=408-4024-2, product_index=2), _StorageSku(code=4084025, formatted_code=408-4025-0, product_index=2), _StorageSku(code=4084022, formatted_code=408-4022-6, product_index=2), _StorageSku(code=4084023, formatted_code=408-4023-4, product_index=2), _StorageSku(code=4089357, formatted_code=408-9357-0, product_index=2), _StorageSku(code=4084028, formatted_code=408-4028-4, product_index=2), _StorageSku(code=4084029, formatted_code=408-4029-2, product_index=2), _StorageSku(code=0081167, formatted_code=008-1167-2, product_index=2), _Stor...
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: ntry = 4
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: url = 'https://apim.canadiantire.ca/v1/product/api/v1/product/sku/PriceAvailability/?lang=en_CA&storeId=64'
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: headers = {'authority': 'apim.canadiantire.ca', 'accept': 'application/json, text/plain, */*', 'accept-language': 'en-US,en;q=0.9', 'bannerid': 'CTR', 'basesiteid': 'CTR', 'ocp-apim-subscription-key': 'c01ef3612328420c9f5cd9277e815a0e', 'origin': 'https://www.canadiantire.ca', 'referer': 'https://www.canadiantire.ca/', 'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="102", "Google Chrome";v="102"', 'sec-ch-ua-mobile': '?0', 'sec-ch-ua-platform': '"Windows"', 'sec-fetch-dest': 'empty', 'sec-fetch-mode': 'cors', 'sec-fetch-site': 'same-site', 'service-client': 'ctr/web', 'service-version': 'ctc-dev2', 'user-agent': 'Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:110.0) Gecko/20100101 Firefox/110.0', 'x-web-host': 'www.canadiantire.ca', 'cache-control': 'no-cache', 'pragma': 'no-cache', 'content-type': 'application/json'}
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: body = {'skus': [{'code': '4084010', 'lowStockThreshold': 0}, {'code': '0081170', 'lowStockThreshold': 0}, {'code': '0072349', 'lowStockThreshold': 0}, {'code': '0062186', 'lowStockThreshold': 0}, {'code': '4084008', 'lowStockThreshold': 0}, {'code': '4084009', 'lowStockThreshold': 0}, {'code': '4084024', 'lowStockThreshold': 0}, {'code': '4084025', 'lowStockThreshold': 0}, {'code': '4084022', 'lowStockThreshold': 0}, {'code': '4084023', 'lowStockThreshold': 0}, {'code': '4089357', 'lowStockThreshold': 0}, {'code': '4084028', 'lowStockThreshold': 0}, {'code': '4084029', 'lowStockThreshold': 0}, {'code': '0081167', 'lowStockThreshold': 0}, {'code': '0062806', 'lowStockThreshold': 0}, {'code': '4084026', 'lowStockThreshold': 0}, {'code': '4084027', 'lowStockThreshold': 0}, {'code': '0073667', 'lowStockThreshold': 0}, {'code': '4089356', 'lowStockThreshold': 0}, {'code': '4084020', 'lowStockThreshold': 0}, {'code': '4086683', 'lowStockThreshold': 0}, {'code': '4084021', 'lowStockThreshold': 0}, ...
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: response = <Response [502]>
Feb 26 09:21:46 vps-c0ce24d7 start.sh[14631]: builtins.RuntimeError: Failed to get product info
Feb 26 09:21:47 vps-c0ce24d7 start.sh[14631]: Failed to run scrape-prices, aborting job.
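SKU scraping also fails for similar reasons. In this run the request never reached the server: name resolution failed ("Name or service not known").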
Feb 26 09:20:56 vps-c0ce24d7 start.sh[14631]: DEBUG:canadiantracker.storage: SKU 3999158 is already present
Feb 26 09:20:56 vps-c0ce24d7 start.sh[14631]: DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): apim.canadiantire.ca:443
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: Traceback with variables (most recent call last):
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: File "<string>", line 1, in <module>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: ...skipped... 9 vars
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: File "/home/scraper/.cache/pypoetry/virtualenvs/canadiantracker-lR2ht7gH-py3.10/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: return self.main(*args, **kwargs)
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: self = <Group cli>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: args = ()
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: kwargs = {}
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: File "/home/scraper/.cache/pypoetry/virtualenvs/canadiantracker-lR2ht7gH-py3.10/lib/python3.10/site-packages/click/core.py", line 1055, in main
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: rv = self.invoke(ctx)
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: self = <Group cli>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: args = ['scrape-skus', '--db-path', '/home/scraper/db.tmp.gh6iBe/inventory.db']
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: prog_name = 'ctscraper'
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: complete_var = None
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: standalone_mode = True
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: windows_expand_args = True
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: extra = {}
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: ctx = <click.core.Context object at 0x7ff572ef7820>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: File "/home/scraper/.cache/pypoetry/virtualenvs/canadiantracker-lR2ht7gH-py3.10/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: return _process_result(sub_ctx.command.invoke(sub_ctx))
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: _process_result = <function MultiCommand.invoke.<locals>._process_result at 0x7ff573057f40>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: args = []
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: cmd_name = 'scrape-skus'
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: cmd = <Command scrape-skus>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: sub_ctx = <click.core.Context object at 0x7ff5715a1ba0>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: ctx = <click.core.Context object at 0x7ff572ef7820>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: self = <Group cli>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: __class__ = <class 'click.core.MultiCommand'>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: File "/home/scraper/.cache/pypoetry/virtualenvs/canadiantracker-lR2ht7gH-py3.10/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: return ctx.invoke(self.callback, **ctx.params)
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: self = <Command scrape-skus>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: ctx = <click.core.Context object at 0x7ff5715a1ba0>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: File "/home/scraper/.cache/pypoetry/virtualenvs/canadiantracker-lR2ht7gH-py3.10/lib/python3.10/site-packages/click/core.py", line 760, in invoke
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: return __callback(*args, **kwargs)
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: _Context__self = <click.core.Context object at 0x7ff5715a1ba0>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: _Context__callback = <function scrape_skus at 0x7ff570f62200>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: args = ()
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: kwargs = {'db_path': '/home/scraper/db.tmp.gh6iBe/inventory.db', 'products': None}
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: ctx = <click.core.Context object at 0x7ff5715a1ba0>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: File "/home/scraper/CanadianTracker/src/canadiantracker/scraper.py", line 188, in scrape_skus
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: for sku in triangle.SkusInventory(product):
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: db_path = '/home/scraper/db.tmp.gh6iBe/inventory.db'
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: products = None
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: repository = <canadiantracker.storage.ProductRepository object at 0x7ff57113dab0>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: products_list = None
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: progress_bar_settings = {'label': 'Scraping SKUs', 'show_pos': True, 'item_show_func': <function scrape_skus.<locals>.<lambda> at 0x7ff571028790>, 'length': 134849, 'bar_template': ''}
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: products_wrapper = <click._termui_impl.ProgressBar object at 0x7ff570f97190>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: i = 124788
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: product = <canadiantracker.storage._StorageProduct object at 0x7ff561006c20>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: sku = <canadiantracker.model.Sku object at 0x7ff5604539d0>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: File "/home/scraper/CanadianTracker/src/canadiantracker/triangle.py", line 256, in __iter__
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: resp = SkusInventory._request_page(self._product.code)
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: self = <canadiantracker.triangle.SkusInventory object at 0x7ff5604537f0>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: ntry = 0
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: File "/home/scraper/CanadianTracker/src/canadiantracker/triangle.py", line 249, in _request_page
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: return requests.get(
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: product_code = '7745620P'
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: headers = {'authority': 'apim.canadiantire.ca', 'accept': 'application/json, text/plain, */*', 'accept-language': 'en-US,en;q=0.9', 'bannerid': 'CTR', 'basesiteid': 'CTR', 'ocp-apim-subscription-key': 'c01ef3612328420c9f5cd9277e815a0e', 'origin': 'https://www.canadiantire.ca', 'referer': 'https://www.canadiantire.ca/', 'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="102", "Google Chrome";v="102"', 'sec-ch-ua-mobile': '?0', 'sec-ch-ua-platform': '"Windows"', 'sec-fetch-dest': 'empty', 'sec-fetch-mode': 'cors', 'sec-fetch-site': 'same-site', 'service-client': 'ctr/web', 'service-version': 'ctc-dev2', 'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36 Edg/110.0.1587.56', 'x-web-host': 'www.canadiantire.ca', 'cache-control': 'no-cache', 'pragma': 'no-cache'}
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: File "/home/scraper/.cache/pypoetry/virtualenvs/canadiantracker-lR2ht7gH-py3.10/lib/python3.10/site-packages/requests/api.py", line 73, in get
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: return request("get", url, params=params, **kwargs)
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: url = 'https://apim.canadiantire.ca/v1/product/api/v1/product/productFamily/7745620P?baseStoreId=CTR&lang=en_CA&storeId=64'
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: params = None
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: kwargs = {'headers': {'authority': 'apim.canadiantire.ca', 'accept': 'application/json, text/plain, */*', 'accept-language': 'en-US,en;q=0.9', 'bannerid': 'CTR', 'basesiteid': 'CTR', 'ocp-apim-subscription-key': 'c01ef3612328420c9f5cd9277e815a0e', 'origin': 'https://www.canadiantire.ca', 'referer': 'https://www.canadiantire.ca/', 'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="102", "Google Chrome";v="102"', 'sec-ch-ua-mobile': '?0', 'sec-ch-ua-platform': '"Windows"', 'sec-fetch-dest': 'empty', 'sec-fetch-mode': 'cors', 'sec-fetch-site': 'same-site', 'service-client': 'ctr/web', 'service-version': 'ctc-dev2', 'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36 Edg/110.0.1587.56', 'x-web-host': 'www.canadiantire.ca', 'cache-control': 'no-cache', 'pragma': 'no-cache'}}
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: File "/home/scraper/.cache/pypoetry/virtualenvs/canadiantracker-lR2ht7gH-py3.10/lib/python3.10/site-packages/requests/api.py", line 59, in request
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: return session.request(method=method, url=url, **kwargs)
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: method = 'get'
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: url = 'https://apim.canadiantire.ca/v1/product/api/v1/product/productFamily/7745620P?baseStoreId=CTR&lang=en_CA&storeId=64'
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: kwargs = {'params': None, 'headers': {'authority': 'apim.canadiantire.ca', 'accept': 'application/json, text/plain, */*', 'accept-language': 'en-US,en;q=0.9', 'bannerid': 'CTR', 'basesiteid': 'CTR', 'ocp-apim-subscription-key': 'c01ef3612328420c9f5cd9277e815a0e', 'origin': 'https://www.canadiantire.ca', 'referer': 'https://www.canadiantire.ca/', 'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="102", "Google Chrome";v="102"', 'sec-ch-ua-mobile': '?0', 'sec-ch-ua-platform': '"Windows"', 'sec-fetch-dest': 'empty', 'sec-fetch-mode': 'cors', 'sec-fetch-site': 'same-site', 'service-client': 'ctr/web', 'service-version': 'ctc-dev2', 'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36 Edg/110.0.1587.56', 'x-web-host': 'www.canadiantire.ca', 'cache-control': 'no-cache', 'pragma': 'no-cache'}}
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: session = <requests.sessions.Session object at 0x7ff560287700>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: File "/home/scraper/.cache/pypoetry/virtualenvs/canadiantracker-lR2ht7gH-py3.10/lib/python3.10/site-packages/requests/sessions.py", line 587, in request
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: resp = self.send(prep, **send_kwargs)
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: self = <requests.sessions.Session object at 0x7ff560287700>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: method = 'get'
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: url = 'https://apim.canadiantire.ca/v1/product/api/v1/product/productFamily/7745620P?baseStoreId=CTR&lang=en_CA&storeId=64'
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: params = None
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: data = None
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: headers = {'authority': 'apim.canadiantire.ca', 'accept': 'application/json, text/plain, */*', 'accept-language': 'en-US,en;q=0.9', 'bannerid': 'CTR', 'basesiteid': 'CTR', 'ocp-apim-subscription-key': 'c01ef3612328420c9f5cd9277e815a0e', 'origin': 'https://www.canadiantire.ca', 'referer': 'https://www.canadiantire.ca/', 'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="102", "Google Chrome";v="102"', 'sec-ch-ua-mobile': '?0', 'sec-ch-ua-platform': '"Windows"', 'sec-fetch-dest': 'empty', 'sec-fetch-mode': 'cors', 'sec-fetch-site': 'same-site', 'service-client': 'ctr/web', 'service-version': 'ctc-dev2', 'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36 Edg/110.0.1587.56', 'x-web-host': 'www.canadiantire.ca', 'cache-control': 'no-cache', 'pragma': 'no-cache'}
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: cookies = None
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: files = None
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: auth = None
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: timeout = None
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: allow_redirects = True
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: proxies = {}
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: hooks = None
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: stream = None
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: verify = None
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: cert = None
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: json = None
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: req = <Request [GET]>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: prep = <PreparedRequest [GET]>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: settings = {'proxies': OrderedDict(), 'stream': False, 'verify': True, 'cert': None}
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: send_kwargs = {'timeout': None, 'allow_redirects': True, 'proxies': OrderedDict(), 'stream': False, 'verify': True, 'cert': None}
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: File "/home/scraper/.cache/pypoetry/virtualenvs/canadiantracker-lR2ht7gH-py3.10/lib/python3.10/site-packages/requests/sessions.py", line 701, in send
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: r = adapter.send(request, **kwargs)
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: self = <requests.sessions.Session object at 0x7ff560287700>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: request = <PreparedRequest [GET]>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: kwargs = {'timeout': None, 'proxies': OrderedDict(), 'stream': False, 'verify': True, 'cert': None}
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: allow_redirects = True
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: stream = False
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: hooks = {'response': []}
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: adapter = <requests.adapters.HTTPAdapter object at 0x7ff560286f20>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: start = 1677403256.396566
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: File "/home/scraper/.cache/pypoetry/virtualenvs/canadiantracker-lR2ht7gH-py3.10/lib/python3.10/site-packages/requests/adapters.py", line 565, in send
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: raise ConnectionError(e, request=request)
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: self = <requests.adapters.HTTPAdapter object at 0x7ff560286f20>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: request = <PreparedRequest [GET]>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: stream = False
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: timeout = Timeout(connect=None, read=None, total=None)
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: verify = True
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: cert = None
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: proxies = OrderedDict()
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: conn = <urllib3.connectionpool.HTTPSConnectionPool object at 0x7ff5602874c0>
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: url = '/v1/product/api/v1/product/productFamily/7745620P?baseStoreId=CTR&lang=en_CA&storeId=64'
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: chunked = False
Feb 26 09:21:06 vps-c0ce24d7 start.sh[14631]: requests.exceptions.ConnectionError: HTTPSConnectionPool(host='apim.canadiantire.ca', port=443): Max retries exceeded with url: /v1/product/api/v1/product/productFamily/7745620P?baseStoreId=CTR&lang=en_CA&storeId=64 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7ff5602874f0>: Failed to establish a new connection: [Errno -2] Name or service not known'))
Feb 26 09:21:07 vps-c0ce24d7 start.sh[14631]: Failed to run scrape-skus, continuing...
I'm wondering whether we want to make scraping runs "resumable". Either we save enough context to pick up where we left off, or we only "refresh" the SKUs/categories that haven't been scraped for a long while.
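To make the "refresh" variant concrete, here is a minimal sketch of selecting only the SKUs that are due. It is not taken from the project: `stale_skus`, its parameters and the SKU codes in the usage note are hypothetical, and it assumes we can get the last sample time per SKU from the repository (with `None` for SKUs that were never sampled). The existing `older_than` option seen in the traceback may already work along these lines.

```python
from datetime import datetime, timedelta
from typing import Optional


def stale_skus(
    last_sampled: dict[str, Optional[datetime]],
    now: datetime,
    older_than: timedelta,
) -> list[str]:
    """Return the SKU codes due for a refresh, most overdue first.

    `last_sampled` maps a SKU code to the time of its latest price sample,
    or to None if it has never been sampled (hypothetical input shape).
    """
    cutoff = now - older_than
    due = [
        (sampled_at, code)
        for code, sampled_at in last_sampled.items()
        if sampled_at is None or sampled_at < cutoff
    ]
    # Never-sampled SKUs sort first, then the oldest samples.
    due.sort(key=lambda item: (item[0] is not None, item[0] or datetime.min))
    return [code for _, code in due]
```

With made-up codes, `stale_skus({"A": two_days_ago, "B": an_hour_ago, "C": None}, now, timedelta(days=1))` would return `["C", "A"]`. If a run aborts halfway, rerunning it only visits what is still stale, because the SKUs that were already refreshed drop out of the selection. That gives resumability without saving any extra context.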
Otherwise, we could do something simpler and just retry with exponential back-off until the service is back online.
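A rough sketch of what the back-off option could look like, assuming the request code is wrapped in a callable. Everything here is hypothetical and not existing project code: `retry_with_backoff`, `TransientError` and the default delays are placeholders. The wait doubles on every failed try, is capped at `max_delay`, and is randomized ("full jitter") so that retries don't all land at the same instant.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (HTTP 502, DNS errors, ...)."""


def retry_with_backoff(
    operation: Callable[[], T],
    max_tries: int = 8,
    base_delay: float = 1.0,
    max_delay: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation` until it succeeds or `max_tries` is reached."""
    if max_tries < 1:
        raise ValueError("max_tries must be at least 1")
    for ntry in range(max_tries - 1):
        try:
            return operation()
        except TransientError:
            # Upper bound doubles each try: 1 s, 2 s, 4 s, ... capped at max_delay.
            cap = min(max_delay, base_delay * 2**ntry)
            # Full jitter: wait a random duration between 0 and the bound.
            sleep(random.uniform(0, cap))
    # Last try: let any error propagate to the caller.
    return operation()
```

The wrapped callable would be responsible for translating failures into `TransientError`, for example when it catches `requests.exceptions.ConnectionError` or sees a 5xx status code. Note that the SKU traceback above shows `timeout = None`, so passing a `timeout` to `requests.get` is probably worth doing at the same time. Otherwise a hung connection would never reach the retry logic.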