I need to scrape data from web pages and store it in Snowflake. This feature is offered by a Python library called "Scrapy", but my company forbids the use of open source libraries. Is there an equivalent Matillion component that offers this?
Great question! I hope you don't mind me giving our take; I am sure members of the community may have suggestions too :)
One option could be to build a custom connector for a web scraper API that offers similar functionality. If that is not an option, raising an idea on our ideas portal for this would be awesome.
Kind regards, Joe
Thanks @JoeCommunityManager! I'd almost certainly prefer a custom connector if that would work, but in this case the website we intend to scrape doesn't have any API endpoints: it is a basic, static HTML page refreshed periodically by a job. The developers are confident that Scrapy could perform this action, but with the security and support issues surrounding open source Python libraries, I need to find an alternative. I didn't want to request a feature only to find that it already exists and had been overlooked!
Fred
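For anyone reading with the same constraint, here is a minimal sketch of pulling rows out of a static HTML table using only Python's built-in `html.parser` module, with no third-party packages. It assumes the standard library is acceptable under the policy (Python itself is open source, so that is worth checking) and that the data sits in a plain `<table>`. The class name, function name, and sample markup are hypothetical and not taken from the real page.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_table_rows(html):
    """Return the table rows found in an HTML string as lists of cell text."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample standing in for the real static page
SAMPLE_HTML = """
<table>
  <tr><th>Item</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget</td><td>24.50</td></tr>
</table>
"""

rows = parse_table_rows(SAMPLE_HTML)
for row in rows:
    print(row)
```

In practice the HTML string would come from the page itself, for example via the standard library's `urllib.request.urlopen`, and the resulting rows would then be handed to whatever step loads them into Snowflake. That part is left out because it depends on how the job is set up.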