C# 2y ago
N4nn3rz

❔ Web scraping: could a newer framework be disrupting HtmlAgilityPack?

I am currently building a WPF app and I am having issues crawling/grabbing information from news articles. I am using HtmlAgilityPack, and I have even tried Selenium (it was a bit too much for the small tasks I need to do). I looked at my application's target framework and saw I was using .NET 7.0, so I have now switched down to 4.5. I will try out this framework and see whether I just need a more HTML-heavy news site to scrape from. Could the HtmlAgilityPack package be conflicting with .NET 7.0? Does anyone have the experience or time to help me with my scraping project? Most times I just need a "get gud", you know, haha.
12 Replies
HimmDawg 2y ago
HtmlAgilityPack is compiled against .NET Standard 2.0, which .NET 7 can consume, so that should not be the problem here
N4nn3rz (OP) 2y ago
Would trying to scrape from the Google News API with just HtmlAgilityPack be too naive? Should I choose smaller websites that are less dependent on JavaScript?
HimmDawg 2y ago
Scraping from an API? I'd assume you just make requests to an API. Ah, I searched around a little. That API has been deprecated for ages, it seems... In that case, AgilityPack should be sufficient. I haven't had problems with it so far.
SinFluxx 2y ago
HtmlAgilityPack 1.11.48
This is an agile HTML parser that builds a read/write DOM and supports plain XPATH or XSLT (you actually don't HAVE to understand XPATH nor XSLT to use it, don't worry...). It is a .NET code library that allows you to parse "out of the web" HTML files. The parser is very tolerant with "real world" malformed HTML. The object model is very similar...
N4nn3rz (OP) 2y ago
You know what, that's what I was looking for, thanks!
Pobiega 2y ago
... why did you go to 4.5 FX? In general, .NET 7 is about 30% faster at more or less everything than 4.5
N4nn3rz (OP) 2y ago
Because I keep getting HttpRequestException + IOException: "response ended prematurely". I could totally be parsing incorrectly though, for all I know.
Pobiega 2y ago
And your initial gut reaction is to downgrade to .NET 4.5? O_O I highly doubt there is any problem with HAP, but you could try AngleSharp
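For reference, a minimal sketch of fetching a page and parsing it with HtmlAgilityPack on a modern .NET target. It separates the network step (where an HttpRequestException like the one above would surface) from the parsing step, so each can be checked on its own. The class name `HeadlineScraper`, the `//h2` XPath, the User-Agent string, and the 15-second timeout are all assumptions for illustration, not anything from the thread; a real news site will need its own selector, and sending a browser-like User-Agent is only a guess at one possible cause of dropped responses.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

// Hypothetical helper; names and selector are illustrative only.
public static class HeadlineScraper
{
    // Reuse one HttpClient for the lifetime of the app.
    private static readonly HttpClient Client = CreateClient();

    private static HttpClient CreateClient()
    {
        var client = new HttpClient { Timeout = TimeSpan.FromSeconds(15) };
        // Assumption: some servers reject requests that carry no User-Agent.
        client.DefaultRequestHeaders.UserAgent.ParseAdd(
            "Mozilla/5.0 (compatible; ExampleScraper/1.0)");
        return client;
    }

    // Pure parsing step: no network access, so it can be tested with a literal string.
    public static List<string> ExtractHeadlines(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var results = new List<string>();
        // SelectNodes returns null (not an empty collection) when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes("//h2");
        if (nodes == null)
        {
            return results;
        }

        foreach (var node in nodes)
        {
            // Decode entities such as &amp; and drop surrounding whitespace.
            var text = HtmlEntity.DeEntitize(node.InnerText).Trim();
            if (text.Length > 0)
            {
                results.Add(text);
            }
        }
        return results;
    }

    // Network step: request failures are logged here instead of being
    // mistaken for parsing problems.
    public static async Task<List<string>> FetchHeadlinesAsync(string url)
    {
        try
        {
            var html = await Client.GetStringAsync(url);
            return ExtractHeadlines(html);
        }
        catch (HttpRequestException ex)
        {
            Console.Error.WriteLine($"Request to {url} failed: {ex.Message}");
            return new List<string>();
        }
    }
}
```

If `ExtractHeadlines` works on a saved copy of the page but `FetchHeadlinesAsync` still fails, the problem is in the request rather than in HtmlAgilityPack. If the saved page contains no article markup at all, the content is probably rendered by JavaScript and a plain HTML parser will not see it.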
N4nn3rz (OP) 2y ago
Hmm, I'll try it. Well, coming from Visual Basic, I've had to downgrade in the past to get access to certain features lol
Pobiega 2y ago
O_O I have not done VB since VB6, but still
N4nn3rz (OP) 2y ago
Yeah, the print form tools or access to certain packages are only available, or only work, in older frameworks for some reason
Accord 2y ago
Was this issue resolved? If so, run /close - otherwise I will mark this as stale and this post will be archived until there is new activity.