Making Authenticated HTTP Requests from an ASP.NET Page
By Scott Mitchell
Introduction
Performing HTTP requests from a web page - a task commonly referred to as "screen scraping" - involves server-side code
issuing an HTTP request to some other Web site, retrieving the returned results, and processing these results in some manner.
For example, screen scraping is oftentimes used to grab data from another site, such as scraping the HTML from a Yahoo!
Finance page to grab the current stock price for a particular stock symbol.
Performing simple HTTP requests in ASP.NET requires just a few lines of code, thanks to the WebClient class.
This class, found in the System.Net namespace, provides a small number of properties and methods useful for
making simple HTTP requests. A previous 4Guys article of mine, Screen
Scrapes in ASP.NET, illustrates how to use the WebClient class from an ASP.NET page.
RssFeed, a custom, compiled ASP.NET server control I created
for displaying RSS feeds in an ASP.NET page, uses programmatic HTTP requests to grab the syndicated content from a specified
URL. Recently a user of RssFeed asked me if RssFeed provided support for RSS feeds that required authentication. That is,
this user wanted to display the contents of an RSS feed that was only accessible by providing authentication information,
such as a username and password. While RssFeed itself didn't provide this functionality, the underlying classes used by
RssFeed to access the remote RSS feed do, so I added this functionality. (For more information on displaying RSS content
in your ASP.NET website, be sure to read A Custom ASP.NET Server Control for Displaying RSS Feeds.)
In this article we will first discuss the common authentication protocols used by web servers and then look at how to make
programmatic HTTP requests to a resource that requires authentication. Read on to learn more!
A Look at Authentication Protocols
There are a number of standard techniques through which a web server can identify the user of an incoming request. The three
most commonly used authentication protocols are:
Basic authentication - when an unauthenticated request comes into the web server, the web server returns
an HTTP 401 response, prompting the client for its credentials. The client re-requests the same resource, passing the
username and password in a base-64 encoded HTTP header. (The base-64 encoding does not encrypt or protect the credentials;
it merely ensures that the characters sent over the wire are in a format that won't conflict with any reserved characters.)
Because the credentials are effectively sent over the wire in plain text, Basic authentication should only be used over SSL,
which ensures that the entire contents of the HTTP request are encrypted.
Digest authentication - like Basic authentication, when an unauthenticated request comes into the web server, the web server returns
an HTTP 401 response, prompting the client for its credentials. Along with this response, the web server also sends
back additional pieces of information, such as a nonce (a random string) and a sequence identifier. The client then
re-requests the resource including an HTTP header that has the username in plain-text and a hash of the password.
The hash is salted by the nonce, the sequence identifier, and other tidbits. (Hashing is the process of taking a plain-text
input and converting it into a form that cannot be converted back into the plain-text form. The web server receiving the hashed
password must know the user's plain-text password (or a stored hash derived from it); it computes the same hash and checks that it matches
the hashed version sent over the wire. The short of it is, a hashed password is far safer to send over an insecure channel than a plain-text one.
To learn more about the basics of hashing, see the Wikipedia hashing entry.)
To learn more about Basic and Digest authentication, refer to RFC 2617.
NTLM, or NT Challenge/Response, or Integrated Windows Authentication - NTLM avoids sending even a digest of
the password. Instead, the server and client correspond in a three-step authentication procedure where the client ends up
hashing a nonce with their password. The client's username and this hashed nonce are then sent back to the server and
verified. For more in-depth information on NTLM refer to Microsoft's
NTLM documentation.
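To make the Basic and Digest descriptions above concrete, here is a minimal sketch of what a client computes for each scheme. The module and function names are hypothetical, and the Digest function assumes the RFC 2617 form with qop=auth and the MD5 algorithm. In practice the .NET classes discussed later in this article build these headers for you, so this is only an illustration of the protocols.

```vb
Imports System
Imports System.Security.Cryptography
Imports System.Text

Module AuthHeaderSketch   ' hypothetical name, for illustration only

    ' Basic: the header value is "Basic " plus base-64 of "username:password".
    ' Anyone who sees the header can decode it, hence the advice to use SSL.
    Function BasicHeader(ByVal username As String, ByVal password As String) As String
        Dim raw As Byte() = Encoding.UTF8.GetBytes(username & ":" & password)
        Return "Basic " & Convert.ToBase64String(raw)
    End Function

    ' Returns the lowercase hexadecimal MD5 hash of a string.
    Function Md5Hex(ByVal text As String) As String
        Using hasher As MD5 = MD5.Create()
            Dim hash As Byte() = hasher.ComputeHash(Encoding.UTF8.GetBytes(text))
            Dim sb As New StringBuilder()
            For Each b As Byte In hash
                sb.Append(b.ToString("x2"))
            Next
            Return sb.ToString()
        End Using
    End Function

    ' Digest (assumed: RFC 2617, qop=auth, MD5):
    '   HA1      = MD5(username:realm:password)
    '   HA2      = MD5(method:uri)
    '   response = MD5(HA1:nonce:nc:cnonce:qop:HA2)
    ' nc is the request counter (the "sequence identifier" in the text above).
    Function DigestResponse(ByVal username As String, ByVal realm As String, _
                            ByVal password As String, ByVal method As String, _
                            ByVal uri As String, ByVal nonce As String, _
                            ByVal nc As String, ByVal cnonce As String) As String
        Dim ha1 As String = Md5Hex(username & ":" & realm & ":" & password)
        Dim ha2 As String = Md5Hex(method & ":" & uri)
        Return Md5Hex(ha1 & ":" & nonce & ":" & nc & ":" & cnonce & ":auth:" & ha2)
    End Function

    Sub Main()
        ' Sample credentials taken from RFC 2617.
        Console.WriteLine(BasicHeader("Aladdin", "open sesame"))
        ' prints: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==

        Console.WriteLine(DigestResponse("Mufasa", "testrealm@host.com", "Circle Of Life", _
            "GET", "/dir/index.html", "dcd98b7102dd2f0e8b11d0f600bfb0c093", _
            "00000001", "0a4f113b"))
    End Sub
End Module
```

Note that the password never appears in the Digest response, only a hash that depends on the server's nonce, so a captured response cannot simply be replayed against a fresh nonce.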
In order to protect a resource using one of these authentication schemes, the web server needs to be appropriately configured.
This topic is beyond the scope of this article - we'll just focus on how to programmatically make an HTTP request for a resource
protected by one of these authentication schemes. For more information on configuring your web server to support one or
more of these authentication schemes see How To Configure IIS
Web Site Authentication in Windows Server 2003.
As we just discussed, when a request comes in for a protected resource the web server sends back a message to the client - typically your browser.
This causes that familiar dialog box to pop up, which prompts you for your username and password. (With NTLM, if you are logged on
to the domain, this information is seamlessly sent to the web server, without requiring the end user to re-enter their credentials.)
When making an HTTP request programmatically, there's no dialog box. Rather, we must instruct the appropriate classes to use a particular
authentication scheme with particular credentials. We'll examine how to accomplish this shortly, after a quick look at the
basics of making programmatic HTTP requests in .NET.
A Quick Primer on Making HTTP Requests from an ASP.NET Page
The .NET Framework provides a couple of classes for making programmatic HTTP requests, both of which can be found in the
System.Net namespace. The first class, HttpWebRequest, provides a rich set of features for making
an HTTP request. Using this class you can perform very simple HTTP requests, or you can configure its properties to handle
more complex scenarios. For example, the HttpWebRequest provides properties to enable:
Tunneling the request through a proxy,
A timeout value - if the request does not return within a specified number of milliseconds, an exception is raised,
Asynchronous HTTP requests - start the HTTP request on a separate thread and receive notification when the request completes,
If-Modified-Since support, which enables the HTTP request to be smart enough to only download the
complete content if it has changed since the last request was made.
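As a rough sketch of how some of the properties listed above are set, the following builds a request with a proxy, a timeout, and an If-Modified-Since header without sending it. The proxy address, URL, timeout value, and the module and function names are placeholders invented for illustration.

```vb
Imports System
Imports System.Net

Module RequestOptionsSketch   ' hypothetical name, for illustration only

    ' Builds (but does not send) a request configured with a proxy,
    ' a timeout, and an If-Modified-Since header.
    Function BuildRequest(ByVal url As String, ByVal lastFetched As DateTime) As HttpWebRequest
        Dim req As HttpWebRequest = CType(WebRequest.Create(url), HttpWebRequest)

        ' Tunnel the request through a proxy (placeholder address).
        req.Proxy = New WebProxy("http://proxy.example.com:8080")

        ' Give up if no response arrives within 10 seconds (value is in milliseconds).
        req.Timeout = 10000

        ' Ask the server to send the body only if it changed after lastFetched.
        req.IfModifiedSince = lastFetched

        Return req
    End Function

    Sub Main()
        Dim req As HttpWebRequest = BuildRequest("http://www.example.com/feed.xml", _
                                                 DateTime.Now.AddHours(-1))
        Console.WriteLine(req.Timeout)   ' prints 10000
    End Sub
End Module
```

When the resource has not changed, the server typically answers with a 304 status, which GetResponse() reports as a WebException, so code that uses If-Modified-Since generally needs a Try/Catch around the call.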
If all you need to do is make a simple HTTP request without needing to tunnel through a proxy, specify timeout values, or
make asynchronous requests, the .NET Framework provides the WebClient class, which is designed to simplify the
HTTP request process. Using the WebClient class requires a few fewer lines of code than the HttpWebRequest
class and, in my opinion, the resulting code is more readable. (As you may have guessed, the WebClient class
uses the HttpWebRequest class internally.) For more information on making HTTP requests with the WebClient class
be sure to read Screen Scrapes in ASP.NET.
The following snippets of code show how to use both the WebClient and HttpWebRequest classes to make
a simple HTTP request. Both snippets result in saving the HTML of the requested web page into the string variable results.
' Using the WebClient class
Dim req As New WebClient()
Dim results As String
results = System.Text.Encoding.UTF8.GetString(req.DownloadData(URL))
' ... work with results ...
-----------------------------------------------------------
' Using the HttpWebRequest class
'Create the HttpWebRequest object (requires Imports System.Net and System.IO)
Dim req As HttpWebRequest = CType(WebRequest.Create(URL), HttpWebRequest)
'Get the data as an HttpWebResponse object
Dim resp As HttpWebResponse = CType(req.GetResponse(), HttpWebResponse)
'Convert the data into a string (assumes that you are requesting text)
Dim sr As New StreamReader(resp.GetResponseStream())
Dim results As String = sr.ReadToEnd()
sr.Close()
resp.Close()
' ... work with results ...
Making Authenticated HTTP Requests
Both the WebClient and HttpWebRequest classes make it easy to include authentication information
in the request through their Credentials properties. The Credentials property accepts an object
that implements ICredentials. The CredentialCache
class provides a store for credentials. You can add new NetworkCredential
instances to this store, or use the CredentialCache.DefaultCredentials
property to use the credentials of the currently logged on user. (If you are making an HTTP request from an ASP.NET page,
you likely will not want to use the DefaultCredentials property unless you are using
impersonation;
furthermore, the DefaultCredentials property can only be used when authenticating against NTLM or Kerberos-based
authentication schemes.)
The intent of the CredentialCache class is to store a set of credentials for the user. When a request is made
to a resource, the CredentialCache class can be interrogated and the appropriate credentials can be extracted
based on the resource being requested. That is, the CredentialCache class can be used to hold credentials for
various websites and, when a request is made, the appropriate credentials can be grabbed from the store based on the URL
request. This functionality may be useful in a desktop-based application, where the CredentialCache class object
persists for the duration of the program's execution, but with ASP.NET pages you'll typically want to create a new
CredentialCache object each time an authenticated HTTP request needs to be made.
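The lookup behavior described above can be sketched with the cache's GetCredential() method, which returns the stored credential whose URL prefix and authentication type match the request. The host names, usernames, and passwords below are made up for illustration.

```vb
Imports System
Imports System.Net

Module CredentialCacheSketch   ' hypothetical name, for illustration only

    Sub Main()
        Dim cache As New CredentialCache()

        ' Store credentials for two different (made-up) sites and schemes.
        cache.Add(New Uri("http://feeds.example.com/"), "Basic", _
                  New NetworkCredential("alice", "secret1"))
        cache.Add(New Uri("http://intranet.example.com/"), "Digest", _
                  New NetworkCredential("bob", "secret2"))

        ' A URL under a stored prefix, with a matching scheme, finds its credential.
        Dim match As NetworkCredential = _
            cache.GetCredential(New Uri("http://feeds.example.com/news/rss.xml"), "Basic")
        If match IsNot Nothing Then
            Console.WriteLine(match.UserName)    ' prints alice
        End If

        ' A URL with no stored prefix yields Nothing.
        Dim missing As NetworkCredential = _
            cache.GetCredential(New Uri("http://other.example.com/"), "Basic")
        Console.WriteLine(missing Is Nothing)    ' prints True
    End Sub
End Module
```

You rarely call GetCredential() yourself. The request classes do this lookup when the server challenges them, which is why assigning the whole cache to the Credentials property is enough.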
The following code shows how to use the CredentialCache class and the WebClient's Credentials
property to make a request to a URL that is protected via basic authentication:
Dim req as New WebClient()
Dim myCache As New CredentialCache()
myCache.Add(New Uri(URL), "Basic", _
New NetworkCredential(Username, Password))
req.Credentials = myCache
Dim results as String
results = System.Text.Encoding.UTF8.GetString(req.DownloadData(URL))
To authenticate using Digest, instead of using "Basic" as the second input parameter to the Add()
method, use "Digest".
To authenticate against an NTLM scheme using the current user's logged on credentials, use the following code:
Dim req as New WebClient()
req.Credentials = CredentialCache.DefaultCredentials
Dim results as String
results = System.Text.Encoding.UTF8.GetString(req.DownloadData(URL))
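If the ASP.NET page should authenticate as a specific Windows account rather than as the current user, one possible approach is to add an explicit credential to the cache with the "NTLM" authentication type. This is a sketch rather than code from RssFeed; the module and function names and the parameters are hypothetical.

```vb
Imports System
Imports System.Net
Imports System.Text

Module NtlmSketch   ' hypothetical name, for illustration only

    ' Downloads a URL protected by NTLM using an explicit Windows account.
    Function FetchWithNtlm(ByVal url As String, ByVal username As String, _
                           ByVal password As String, ByVal domain As String) As String
        Dim req As New WebClient()
        Dim myCache As New CredentialCache()

        ' The three-argument NetworkCredential constructor takes the domain last.
        myCache.Add(New Uri(url), "NTLM", _
                    New NetworkCredential(username, password, domain))
        req.Credentials = myCache

        ' Assumes the response body is UTF-8 text.
        Return Encoding.UTF8.GetString(req.DownloadData(url))
    End Function
End Module
```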
Supporting Protected RSS Feeds with RssFeed
The impetus for this article stemmed from a user requesting the ability to display protected RSS feeds through
RssFeed. Just like regular web pages, RSS feeds can also be
protected through any one of the common authentication schemes. Most desktop-based RSS aggregators provide support for protected
RSS feeds by allowing the user to specify a username and password in the feed's properties dialog box. When requesting an
RSS feed programmatically, however, we need to use the techniques discussed in this article.
Underneath the covers, RssFeed uses the HttpWebRequest class to programmatically access a remote RSS feed.
The RssFeed control then provides a Credentials property of type ICredentials. If this property
is set to an object, the internal HttpWebRequest class instance's Credentials property is assigned
this value. And that's all there is to it!
Here's a snippet of code showing how to use RssFeed to display an authenticated RSS feed:
Dim myCache As New CredentialCache()
myCache.Add(New Uri(URL), "Basic", _
New NetworkCredential(Username, Password))
RssFeedID.Credentials = myCache
RssFeedID.DataSource = URL_to_RSS_feed
RssFeedID.DataBind()