URL Extraction - Filtering #3

@jjackson37

Description

Once the URLs have been retrieved via the RegEx, they will most likely need to be filtered.
I think separating the filters out into their own classes would be the best approach here; I can then create instances of them in a collection and loop the URL collection through them all.

I can think of two filters that we need so far:

  1. Error/incorrect URLs (possibly this one needs to run before the incomplete-URL building?)
  2. URLs that link to different domains

  • Create filter data structure and interface
  • Create media file filter
  • Create duplicate address filter
  • Create domain URL filter
  • Create error URLs filter
  • Create 404 filter (Not sure about this one yet)
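A minimal sketch of the filter-classes-in-a-collection idea described above, in Java. All names here (UrlFilter, MediaFileFilter, DomainFilter, FilterChain) are illustrative assumptions, not taken from the actual codebase:

```java
import java.util.ArrayList;
import java.util.List;

// Common interface: each filter decides whether a URL should be kept.
interface UrlFilter {
    boolean accept(String url); // true = keep the URL
}

// Drops links to media files by extension (sample extensions only).
class MediaFileFilter implements UrlFilter {
    public boolean accept(String url) {
        String lower = url.toLowerCase();
        return !(lower.endsWith(".jpg") || lower.endsWith(".png")
                || lower.endsWith(".mp4"));
    }
}

// Keeps only URLs on the crawled domain.
class DomainFilter implements UrlFilter {
    private final String domain;
    DomainFilter(String domain) { this.domain = domain; }
    public boolean accept(String url) {
        return url.contains("://" + domain + "/")
                || url.endsWith("://" + domain);
    }
}

public class FilterChain {
    // Loop the URL collection through every filter; a URL survives
    // only if all filters accept it.
    public static List<String> apply(List<String> urls,
                                     List<UrlFilter> filters) {
        List<String> kept = new ArrayList<>();
        for (String url : urls) {
            boolean keep = true;
            for (UrlFilter f : filters) {
                if (!f.accept(url)) { keep = false; break; }
            }
            if (keep) kept.add(url);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<UrlFilter> filters = List.of(
                new MediaFileFilter(), new DomainFilter("example.com"));
        List<String> urls = List.of(
                "https://example.com/page",
                "https://example.com/photo.jpg",
                "https://other.org/page");
        // Only the first URL passes both filters.
        System.out.println(apply(urls, filters));
    }
}
```

One nice property of this shape: the ordering question raised in point 1 becomes just a matter of the order in which the filters are added to the collection, and the duplicate-address and 404 filters from the checklist would slot in as further UrlFilter implementations.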
