The workers:
- clone a list of companies, stored as a git repository: github/joblistcity/companies
- fetch all available job positions from every company by talking to the job-board providers' APIs
  - the job-board providers implemented are: greenhouse, recruitee, smartrecruiters... (see https://gitlab.com/joblist/job-board-providers)
- upload the normalized data to populate an algolia search index
- jobs data is searchable on joblist.today, etc.
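
The fetch-and-normalize steps above can be sketched roughly as follows. This is only an illustration, not the workers' actual code: `normalizeJob`, `collectJobs`, the `fetchers` map, `company.id` and the normalized field names are all assumptions. Only `company.job_board_provider` comes from this README, and `objectID` is the unique record key that algolia expects.

```javascript
// Hypothetical normalized job shape; field names are assumptions,
// except `objectID`, the unique key algolia uses for each record.
function normalizeJob(company, rawJob) {
  return {
    objectID: `${company.job_board_provider}-${company.id}-${rawJob.id}`,
    name: rawJob.title,
    url: rawJob.url,
    company_id: company.id,
  };
}

// `fetchers` maps a provider name to an async function returning raw jobs.
// These functions stand in for the real job-board-providers API.
async function collectJobs(companies, fetchers) {
  const jobs = [];
  for (const company of companies) {
    const fetchJobs = fetchers[company.job_board_provider];
    if (typeof fetchJobs !== "function") continue; // unknown provider: skip
    const rawJobs = await fetchJobs(company);
    jobs.push(...rawJobs.map((raw) => normalizeJob(company, raw)));
  }
  return jobs;
}
```

The resulting array is what would then be uploaded to the search index; the upload itself is left out here because it depends on the client library and credentials in use.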
- install dependencies: `npm install`
- run `npm run` to find all available scripts
- check the file `.env.example` and create the `.env` file with the correct values
The default `NODE_ENV` value should not be `production`; it can be anything else (or left empty). All scripts should work with development databases by default, so we don't break anything in production when developing.
When not running for development purposes, set `process.env.NODE_ENV` to `production`.
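
A minimal sketch of how a script might apply this rule. The helper names and the index names (`jobs-production`, `jobs-development`) are placeholders for illustration, not the project's real values.

```javascript
// True only when NODE_ENV is exactly "production";
// any other value (or an unset variable) means development.
function isProduction(env = process.env) {
  return env.NODE_ENV === "production";
}

// Hypothetical helper: picks a target name so that development runs
// never touch production data. The names are assumptions.
function pickIndexName(env = process.env) {
  return isProduction(env) ? "jobs-production" : "jobs-development";
}
```

Defaulting to the development target means a forgotten or misspelled `NODE_ENV` fails safe.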
Jobs are defined in the `.gitlab-ci.yaml` file and triggered as schedules via the GitLab interface.
That way, once a day, jobs are fetched for all companies and an algolia index is populated.
The value of the key `company.job_board_provider` must be one of those known to https://gitlab.com/joblist/job-board-providers
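
As an illustration, a check like the following could flag companies whose provider is not recognized. The `KNOWN_PROVIDERS` set only contains the three names mentioned in this README and is an assumption; the authoritative list is whatever job-board-providers exports.

```javascript
// Illustrative subset only; the real list lives in job-board-providers.
const KNOWN_PROVIDERS = new Set(["greenhouse", "recruitee", "smartrecruiters"]);

// True when the company declares a provider present in `known`.
function hasKnownProvider(company, known = KNOWN_PROVIDERS) {
  return known.has(company.job_board_provider);
}

// Splits companies into those that can be fetched and those that cannot.
function splitByProvider(companies, known = KNOWN_PROVIDERS) {
  const valid = [];
  const invalid = [];
  for (const company of companies) {
    (hasKnownProvider(company, known) ? valid : invalid).push(company);
  }
  return { valid, invalid };
}
```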
Storage options tried so far: firebase, algolia, supabase, static files.
Currently, the data is edited in static files (markdown in the github data repository), then consolidated into a sqlite wasm database.
The database is stored as an artifact in the gitlab workers repo, and the client fetches the latest one.
- run `sqlite3 .db-sqlite/joblist.db` to create/open/use the database
- run `npm run save-companies` or `npm run save-jobs` to load the database with its data
- to write the companies to a JSON file, run `sqlite3 .db-sqlite/joblist.db '.mode json' '.once out.json' 'select * from companies'`
- to query a second database, run `sqlite3 .db-sqlite/joblist.db`, then in the sqlite3 shell:
ATTACH DATABASE '.db-sqlite/stripe.db' AS stripeDb;
select * from stripeDb.highlight_companies;