How to use C# to determine the end or next redirection point of a page
$details
When you ask a question, make sure you include as much detail as possible. Such as code, the issue you are facing, what you expect the result to be, what .NET version you are using and what platform/environment (if any) are relevant to your question. Upload code here https://paste.mod.gg/ (see $code for more information on how to paste your code)
What is a page? What is a redirection point? What is "end"? what is "next" (in this context)?
while (sheet.Cell(i, 0).ToString() != "")
{
    string t = sheet.Cell(i, 0).ToString();
    try
    {
        HTML = await client.GetStringAsync(sheet.Cell(i, 0).ToString());
    }
    catch (HttpRequestException e)
    {
        i++;
        continue;
    }
    finally
    {
        if (HTML != null)
        {
            if (HTML.Contains("googletagmanager.com") || HTML.Contains("google-analytics.com"))
            {
                sheet.Cell(i, Write - 1).Value = "1";
            }
            else
            {
                sheet.Cell(i, Write - 1).Value = "0";
            }
        }
        // here I need to save the end URL after redirects (if there were any)
        i++;
    }
}
That's the point, are we talking about a Web-Page, an Excel-Sheet, a WPF-Page, a physical page on the scanner?
so uh.. you're fetching all the html from a webpage and storing it in... an excel cell?
if i go to a site it may redirect me to another URL, and if there was a redirect, i need the URL it redirected to
Ah okay, now I understand
i have an excel file with URLs in it and i need to check them for some things
what kinda urls is this?
and what are you checking for?
any urls
just some sites
now i need to find the end URL
more info
"any urls, just some sites" makes me think you are doing something malicious
i am checking sites that link to my site to see whether they use google analytics
hm... seems weird, but okay.
I'll play ball for now.
https://learn.microsoft.com/en-us/dotnet/api/system.net.http.httpclienthandler.allowautoredirect?view=net-8.0
first, we look at this property
turns out HttpClient already follows redirection responses for you
it does not follow HTML meta refresh redirects or javascript redirects
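If you want the next redirection point rather than the end one, one option is to turn AllowAutoRedirect off and read the Location header of the 3xx response yourself. A rough sketch of that idea, not something tested against the sites in question; the RedirectProbe class and its method names are made up for illustration:

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class RedirectProbe
{
    // Returns the absolute redirect target if the response is a 3xx with a
    // Location header, otherwise null. Relative Location values are resolved
    // against the URL that was requested.
    public static Uri? GetNextHop(HttpResponseMessage response, Uri requestUri)
    {
        int status = (int)response.StatusCode;
        Uri? location = response.Headers.Location;
        if (status < 300 || status > 399 || location is null)
            return null;
        return location.IsAbsoluteUri ? location : new Uri(requestUri, location);
    }

    // Sends one GET with automatic redirects disabled, so the 3xx response
    // itself comes back instead of the page it points to.
    public static async Task<Uri?> GetNextHopAsync(string url)
    {
        using var handler = new HttpClientHandler { AllowAutoRedirect = false };
        using var client = new HttpClient(handler);
        var requestUri = new Uri(url);
        using var response = await client.GetAsync(
            requestUri, HttpCompletionOption.ResponseHeadersRead);
        return GetNextHop(response, requestUri);
    }
}
```

This only sees HTTP-level redirects. Meta refresh and javascript redirects would still need the HTML to be inspected.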
also, that code you run in finally is a bit weird
imagine you query the url for i == 2
it fails. you then increase i by 1, so it's now 3. then the finally block hits
HTML might still contain the valid html from i == 1
but i is now 3, so the code does the wrong thing
yeah, i see
finally will run after the try or the catch, but it always runs
you can't short circuit out of it
so remove the finally, clear HTML in your catch, and remove the i++; continue;
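A sketch of what the loop could look like with that advice applied. To keep it self-contained it walks a plain list of URLs instead of the spreadsheet, and the fetch delegate, the AnalyticsCheck class and the empty-string marker for failed requests are all assumptions made for the example, not part of the original code:

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;

public static class AnalyticsCheck
{
    // Same substring check as in the original loop.
    public static bool HasGoogleTag(string html) =>
        html.Contains("googletagmanager.com") || html.Contains("google-analytics.com");

    // Produces "1", "0", or "" (request failed) for each URL, in order.
    // fetch is injected so the loop can be exercised without network access.
    public static async Task<List<string>> CheckAllAsync(
        IReadOnlyList<string> urls, Func<string, Task<string>> fetch)
    {
        var results = new List<string>();
        foreach (string url in urls)
        {
            string? html = null; // reset each iteration so stale HTML cannot leak through
            try
            {
                html = await fetch(url);
            }
            catch (HttpRequestException)
            {
                // leave html as null; this row is recorded as failed
            }

            results.Add(html is null ? "" : HasGoogleTag(html) ? "1" : "0");
        }
        return results;
    }
}
```

With a real client this could be called as `await AnalyticsCheck.CheckAllAsync(urls, url => client.GetStringAsync(url))`. There is no finally block and no manual index bookkeeping, so a failed request can no longer write the previous page's result into the wrong row.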
so how can i see end url
probably can't when using GetStringAsync
how can i do it w/o using it?
i was tryin to do this for 30 mins already
give this a try?
hello would be your URL, obviously
so here we get the actual http response object instead of just the content
we then check the location header from the response to determine the url we landed on
this is untested, I just wrote it up quickly
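A minimal sketch of the "get the response object instead of just the content" idea (this is not the snippet that was originally posted). One thing worth knowing: when HttpClient follows redirects automatically, the final response normally has no Location header, because that header sits on the 3xx responses along the way. The URL you ended up on is usually available as response.RequestMessage.RequestUri instead. The FinalUrl class and its method names are made up for illustration:

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class FinalUrl
{
    // The request attached to the final response carries the URL that was
    // actually fetched after any automatic redirects.
    public static Uri? GetFinalUri(HttpResponseMessage response) =>
        response.RequestMessage?.RequestUri;

    // Fetches a page and returns both the landing URL and the HTML.
    // Throws HttpRequestException on a non-success status, like GetStringAsync.
    public static async Task<(Uri? FinalUri, string Html)> FetchAsync(
        HttpClient client, string url)
    {
        using HttpResponseMessage response = await client.GetAsync(url);
        response.EnsureSuccessStatusCode();
        string html = await response.Content.ReadAsStringAsync();
        return (GetFinalUri(response), html);
    }
}
```

In the spreadsheet loop, `FinalUri` could be compared with the URL from the cell to decide whether a redirect happened, and `Html` would feed the existing Google Analytics check.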