milahu/aiohttp_chromium

aiohttp_chromium

aiohttp-like interface to chromium

based on selenium_driverless to bypass cloudflare

status

working prototype

usage

aiohttp_chromium is a drop-in replacement for aiohttp

import asyncio

#import aiohttp
import aiohttp_chromium as aiohttp

async def main():
    async with aiohttp.ClientSession() as session:
        async with session.get('http://httpbin.org/get') as resp:
            print(resp.status)
            print(await resp.text())

asyncio.run(main())

see also

why

handling file downloads with selenium is too verbose, and adding this feature to selenium itself would be too complex, so this is a wrapper around selenium

i wanted a "stupid http client", so it has the same interface as aiohttp.client. handling web pages has lower priority, so the selenium interface is hidden in response._driver

known issues

chromium window is stealing focus

when creating new tabs, or when switching between tabs, the chromium window grabs focus

this is an issue with the window manager

with the default settings (_headless=False, _prevent_focus_stealing=True) this is already fixed for the KDE plasma desktop by _enable_prevent_focus_stealing_kde

manual fix on the KDE plasma desktop: KWin focus stealing prevention: window titlebar → rightclick → more actions → special settings for this window → add property → prevent unwanted activation → level: extreme → apply

example KDE config file ~/.config/kwinrulesrc

[General]
count=1
rules=28ef9b3f-6f35-4fbf-b2c6-b06d0fd959e3

[28ef9b3f-6f35-4fbf-b2c6-b06d0fd959e3]
Description=aiohttp_chromium: prevent focus stealing
clientmachine=localhost
fsplevel=4
fsplevelrule=2
types=1
windowrole=browser
windowrolematch=1
wmclass=chromium-browser \(/run/user/1000/fetch\-subs\-[0-9]{8}T[0-9]{6}\.[0-9]+Z/chromium\-user\-data\) Chromium-browser
wmclasscomplete=true
wmclassmatch=3

/run/user/1000/fetch\-subs\-[0-9]{8}T[0-9]{6}\.[0-9]+Z/chromium\-user\-data is a regex for the chromium user-data directory path which is the tempdir argument for ClientSession
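to check what the wmclass regex above matches, here is a minimal sketch using python's re module. the pattern is copied from the config example. the sample window class strings and the helper name matches_rule are made up for illustration, and python's regex engine is only an approximation of the matcher that KWin uses

```python
import re

# pattern copied from the wmclass line of the kwinrulesrc example
WMCLASS_PATTERN = (
    r"chromium-browser \(/run/user/1000/fetch\-subs\-[0-9]{8}T[0-9]{6}\.[0-9]+Z"
    r"/chromium\-user\-data\) Chromium-browser"
)

def matches_rule(wmclass: str) -> bool:
    # hypothetical helper: True if the whole window class matches the pattern
    return re.fullmatch(WMCLASS_PATTERN, wmclass) is not None

# hypothetical window class of a chromium instance started with a tempdir
# like /run/user/1000/fetch-subs-20240131T120000.123456Z
sample = (
    "chromium-browser "
    "(/run/user/1000/fetch-subs-20240131T120000.123456Z/chromium-user-data) "
    "Chromium-browser"
)

# hypothetical window class of a regular chromium instance
other = "chromium-browser (/home/user/.config/chromium) Chromium-browser"

print(matches_rule(sample))
print(matches_rule(other))
```

the rule only applies to chromium windows whose user-data directory follows the timestamped tempdir naming, so other chromium windows keep their normal focus behavior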

chromium seems to have no command line switch to disable this focus-grabbing

possible solutions

  • run chromium in a LD_PRELOAD wrapper
  • binary patching of the chromium executable
  • configure the window manager
    • done for KDE plasma

todo

keywords

  • web scraper
  • chromium
  • aiohttp
  • web scraping
  • asyncio
  • bypass cloudflare
  • headful scraper
  • headful web scraper
  • headful chromium
  • gui scripting
  • headful webscraper
  • selenium driverless

similar projects
