Monthly Archives: August 2009

Extracting HTML source from a URL website

I was thinking of trying something short and sweet, and settled on a snippet that extracts the HTML source from an entered URL.
The code follows. I have not declared the namespaces at the top but used fully qualified names directly in the code, to make it clearer which namespace each object comes from.

The code is self-explanatory, so I won't add any explanation here.

/// <summary>
/// Extracts the source from the url entered.
/// </summary>
/// <param name="url">url to fetch the source from.</param>
/// <returns>string: source for the url entered.</returns>
public static string GetHtmlPageSource(string url)
{
    System.IO.Stream st = null;
    System.IO.StreamReader sr = null;
    try
    {
        // make a Web request
        System.Net.WebRequest req = System.Net.WebRequest.Create(url);

        // get the response and read from the result stream
        System.Net.WebResponse resp = req.GetResponse();
        st = resp.GetResponseStream();
        sr = new System.IO.StreamReader(st);

        // read all the text in it
        return sr.ReadToEnd();
    }
    catch (System.Exception)
    {
        // any failure (bad url, network error) yields an empty string
        return string.Empty;
    }
    finally
    {
        // close the stream & reader objects; either may still be null
        // if the request failed before they were assigned.
        if (sr != null) sr.Close();
        if (st != null) st.Close();
    }
}
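As a variation on the snippet above, here is a minimal sketch that lets using blocks handle the cleanup instead of a finally block, so nothing has to be closed by hand. The class name HtmlSourceSketch and the helper ReadAllText are my own names for illustration, not part of the original snippet, and the choice of which exceptions to catch is an assumption. Note that newer .NET versions flag WebRequest as obsolete, so treat this as a sketch for the same API the post uses rather than a recommendation.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical class name, used only for this illustration.
public static class HtmlSourceSketch
{
    // Reads all text from the stream. Disposing the reader also
    // disposes the underlying stream, even if ReadToEnd throws.
    public static string ReadAllText(Stream stream)
    {
        using (var reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }

    // Same contract as the original snippet: the page source on success,
    // an empty string when the url is unusable or the request fails.
    public static string GetHtmlPageSource(string url)
    {
        if (string.IsNullOrEmpty(url))
        {
            return string.Empty;
        }

        try
        {
            WebRequest req = WebRequest.Create(url);
            using (WebResponse resp = req.GetResponse())
            {
                return ReadAllText(resp.GetResponseStream());
            }
        }
        catch (WebException) { return string.Empty; }          // network or HTTP failure
        catch (UriFormatException) { return string.Empty; }    // url is not a valid URI
        catch (NotSupportedException) { return string.Empty; } // unknown URI scheme
    }
}
```

Calling HtmlSourceSketch.GetHtmlPageSource with a page address returns the markup as a string; an empty string means the fetch did not succeed.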

UPDATE:

If you need to authenticate the request, add the following just before you get the response and read the source (username and password are extra string values passed into the method):

    // authenticate using the credentials passed for getting access to the page.
    if (username != null && password != null)
        req.Credentials = new System.Net.NetworkCredential(username, password);

    // get the response and read from the result stream
    . . .
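To show where the credential check sits in the whole method, here is a rough sketch of an authenticated version. The three-parameter signature, the class name AuthenticatedFetchSketch and the helper ApplyCredentials are assumptions of mine for illustration; the post itself only gives the if statement.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical class name, used only for this illustration.
public static class AuthenticatedFetchSketch
{
    // Attaches credentials only when both values are supplied,
    // mirroring the null check from the update above.
    public static void ApplyCredentials(WebRequest req, string username, string password)
    {
        if (username != null && password != null)
        {
            req.Credentials = new NetworkCredential(username, password);
        }
    }

    // Assumed signature: the url plus optional credentials.
    // Returns the page source, or an empty string on failure.
    public static string GetHtmlPageSource(string url, string username, string password)
    {
        try
        {
            WebRequest req = WebRequest.Create(url);

            // authenticate before asking for the response
            ApplyCredentials(req, username, password);

            using (WebResponse resp = req.GetResponse())
            using (var reader = new StreamReader(resp.GetResponseStream()))
            {
                return reader.ReadToEnd();
            }
        }
        catch (WebException) { return string.Empty; }          // network or HTTP failure
        catch (UriFormatException) { return string.Empty; }    // url is not a valid URI
        catch (NotSupportedException) { return string.Empty; } // unknown URI scheme
    }
}
```

Passing null for either username or password leaves the request anonymous, so the same method covers both the protected and the public case.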

Posted by Shaunak Pandit on August 16, 2009 in .NET, Code Snippets, Problem Solving

Tags: extract html source, reading html from url, reading html from website, webrequest and webresponse object

In the Line of Fire….Ramblings of the Incoherent mind
