The first step in this process is to create a directory on your Web server that is writeable. The second is to determine how, and when, you’ll update your feeds. Then, we can write the code.
Regarding writeable directories on a Web server, the usual cautions apply:
- Where possible, you should grant write permissions on directories only to specific user accounts you impersonate, or to the ASPNET worker process account on the machine you are using. Allowing “Everyone” or anonymous users to write to directories on your server is dangerous, especially if the directory is in your public HTTP path.
- You should place writeable directories outside your HTTP path. So, for example, if your Web files go in a directory named D:\Inetpub\wwwroot-myserver, you might want to make your writeable directory something like D:\NWSXML. Of course, this assumes that you have a Web host that gives you the ability to place files outside a publicly accessible HTTP directory. Most do, but some do not, so check.
- An alternative, if you can’t move your XML directory outside your HTTP path or easily impersonate a user on your system, is to obfuscate the writeable directory’s name. Don’t name it something obvious, such as “upload”; name it something odd and hard to guess, such as “7zj32x5p6”. Then, don’t link to it from any of your Web pages, and don’t mention it in any site map or robots.txt file. Treat it like Area 51: Deny its existence, even as you use it for secret purposes.
Once you have a directory to which you can write, you’ll need to determine how you’re going to go about scraping the XML feeds from weather.gov. You have two options:
1. You can check for the last update time when you have a new visitor, and grab the new file after you serve the current request. For lower-traffic sites, this is an acceptable option.
2. You can automate the retrieval of the file by firing your code off every 60 minutes. This is the better course, but requires additional resources beyond your Web site.
   2.1. One method to automate the XML update is to write the code as a DLL and run it as a scheduled task on your Web server, something most shared hosting providers either won’t let you do, or want good money to let you do.
   2.2. Another option, if you also have LAMP-based hosting, is to write the code into an ASP.NET page on your Windows box, then set up a cron job on your Linux box that instructs Lynx (the text-only Web browser that comes with most *nix installs) to visit the ASP.NET page, and thus execute your code. Here’s a tutorial on how to set that up.
I’m going to provide code here that works per versions 1 and 2.2 above. For both these versions, we are going to need the System.IO and System.Net namespaces; either import them in your page with the @ Import directive (for example, <%@ Import Namespace="System.IO" %>) or add them to the namespaces section of your web.config file.
Version 1: Check With Each New Visitor
Again, checking with each new visitor is best for low-traffic pages: those that are going to see no more than one or two visitors per minute (60-120 visitors per hour). If you’re getting more traffic than that, you’re wasting resources, because we need to run an uncached file check against the XML file on your Web server for every visitor, and asking a Web server to pick up and parse a file is one of the most resource-intensive things we can demand.
We’re going to do this in two parts: A function that checks whether the copy of the NWS XML feed stored locally is older than our pickup period, and returns a Boolean (True when the file is due for a refresh); and a subroutine that gets the appropriate XML feed if it’s time for an update.
Additionally, we’re going to add two application keys to the web.config file’s appSettings section. Note that the local path ends with a backslash, because the code appends the file name directly to it:

    <configuration>
      <appSettings>
        <add key="NWSBaseURI" value="c:\path\to\local\directory\" />
        <add key="NWSBaseURL" value="http://www.weather.gov/data/current_obs/" />
      </appSettings>
    </configuration>

On to the function. It takes three parameters: A DateTime, which will be the current time; an Integer, which will be the number of minutes between pickups; and a String, which will be the file name of the feed we want to check.

    Function CheckLastUpdateTime(ByVal dtTime As DateTime, ByVal intPickup As Integer, ByVal strFile As String) As Boolean
        'set default value for function
        Dim blResult As Boolean = False

        'multiply pickup period by -1, "add" it to the check time
        Dim intUpdate As Integer = intPickup * -1
        Dim dtUpdate As DateTime = dtTime.AddMinutes(intUpdate)

        'get local file
        Dim strPath As String = ConfigurationManager.AppSettings("NWSBaseURI").ToString() & strFile
        Dim objFI As New FileInfo(strPath)

        'check that file exists; return false if it does not
        If objFI.Exists Then
            'compare last write time to update time;
            'if less than update time, return true;
            'else return false
            If objFI.LastWriteTime < dtUpdate Then
                blResult = True
            End If
        End If

        Return blResult
    End Function

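One consequence of the function above is that it returns False when the local file does not exist yet, so the first copy of each feed has to be put in place some other way. As a minimal sketch of a variant that treats a missing file as due for an update, the comparison can be pulled into its own function. The names IsStale and NeedsUpdate are hypothetical, not part of the code above, and this version takes a full path rather than reading web.config so it can run as a console program:

```vbnet
Imports System
Imports System.IO

Module FeedFreshness
    ' Hypothetical helper: True when the last write time is older than
    ' the pickup period, measured back from dtNow.
    Function IsStale(ByVal dtLastWrite As DateTime, ByVal dtNow As DateTime, ByVal intPickup As Integer) As Boolean
        Return dtLastWrite < dtNow.AddMinutes(-intPickup)
    End Function

    ' Hypothetical variant of CheckLastUpdateTime: a missing file counts
    ' as needing an update, so the first request fetches the feed.
    Function NeedsUpdate(ByVal strPath As String, ByVal dtNow As DateTime, ByVal intPickup As Integer) As Boolean
        Dim objFI As New FileInfo(strPath)
        If Not objFI.Exists Then
            Return True
        End If
        Return IsStale(objFI.LastWriteTime, dtNow, intPickup)
    End Function

    Sub Main()
        Dim dtNow As New DateTime(2024, 1, 1, 12, 0, 0)
        ' Written 90 minutes ago with a 60-minute pickup: due for refresh
        Console.WriteLine(IsStale(dtNow.AddMinutes(-90), dtNow, 60))
        ' Written 30 minutes ago with a 60-minute pickup: still current
        Console.WriteLine(IsStale(dtNow.AddMinutes(-30), dtNow, 60))
    End Sub
End Module
```

Whether a missing file should trigger a download is a design choice; the original behaviour is safer if you would rather not fetch feeds nobody has set up on purpose.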
With that function complete, we can now move on to the subroutine that does the updating. What we’ll specifically do is request the file via WebRequest, then save it via a StreamWriter.
Also, I know that I could use different objects to read and write these files, or work directly from the StreamReader to the StreamWriter, etc. I’m doing it this way.

    Sub UpdateNWSFeed(strFile As String)
        'set path strings
        Dim strLocalPath As String = ConfigurationManager.AppSettings("NWSBaseURI").ToString() & strFile
        Dim strRemotePath As String = ConfigurationManager.AppSettings("NWSBaseURL").ToString() & strFile

        'Create the WebRequest object;
        'give it 3 seconds to complete
        Dim objRequest As WebRequest = WebRequest.Create(strRemotePath)
        objRequest.Timeout = 3000

        Try
            'get the data as a WebResponse object
            'only overwrite old file if we have a good response
            Dim objResponse As WebResponse = objRequest.GetResponse()

            'convert the data into a string
            Dim objReader As New StreamReader(objResponse.GetResponseStream())
            Dim strResults As String = objReader.ReadToEnd()
            objReader.Close()

            'write results to file
            Dim objWriter As New StreamWriter(strLocalPath)
            objWriter.Write(strResults)
            objWriter.Close()
        Catch wex As WebException
            'You can add your own error-catching code here
            'Response.Write("Error getting " & strFile & ": " & wex.Message)
        End Try
    End Sub

By virtue of making a new file with the StreamWriter object, we’ll overwrite the old file in the same location. Also, I should give full thanks to 4guysfromrolla.com for the WebRequest code above, which I pirated and fixed to avoid some strong typing issues with Option Strict.
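For comparison, here is a rough sketch of the "work directly from one stream to the other" approach mentioned earlier. It is not the method used in this article: the name UpdateFeedStreamed and the ".tmp" suffix are hypothetical, it takes full paths instead of reading web.config, and it assumes .NET Framework 4 or later for Stream.CopyTo. It downloads to a temporary file first, so the existing feed is only replaced after a complete download:

```vbnet
Imports System
Imports System.IO
Imports System.Net

Module FeedDownload
    ' Hypothetical variant of UpdateNWSFeed: copy the response stream
    ' straight to disk, then swap the finished file into place.
    Sub UpdateFeedStreamed(ByVal strRemotePath As String, ByVal strLocalPath As String)
        ' Assumed naming convention for the scratch file
        Dim strTempPath As String = strLocalPath & ".tmp"

        Dim objRequest As WebRequest = WebRequest.Create(strRemotePath)
        objRequest.Timeout = 3000

        Try
            ' Using blocks close the response and both streams,
            ' even if the copy fails partway through
            Using objResponse As WebResponse = objRequest.GetResponse()
                Using objIn As Stream = objResponse.GetResponseStream()
                    Using objOut As New FileStream(strTempPath, FileMode.Create, FileAccess.Write)
                        objIn.CopyTo(objOut)
                    End Using
                End Using
            End Using

            ' Only reached after a complete download: overwrite the old feed
            File.Copy(strTempPath, strLocalPath, True)
            File.Delete(strTempPath)
        Catch wex As WebException
            ' Request failed or timed out; the old feed is left untouched
        Catch ioex As IOException
            ' Could not write locally; the old feed is left untouched
        End Try
    End Sub
End Module
```

A call would look like UpdateFeedStreamed("http://www.weather.gov/data/current_obs/KAUG.xml", "c:\path\to\local\directory\KAUG.xml"), using the same placeholder path as the web.config example.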
Use Page_LoadComplete To Invoke Version 1
We’ll invoke the Version 1 subroutine with the Page_LoadComplete event, which occurs after all the controls — including XmlDataSource — have been loaded onto the page. We don’t want to interrupt the loading onto the page of the XmlDataSource or the DataGrid / other controls that bind to it, so we wait for them to be on the page before we go off to look for new XML.

    Sub Page_LoadComplete(Sender As Object, E As EventArgs)
        If Not Page.IsPostBack Then
            If CheckLastUpdateTime(Now(), 60, "KAUG.xml") Then
                UpdateNWSFeed("KAUG.xml")
            End If
        End If
    End Sub

Version 2.2: Running The Code With Crontab / Lynx
This is the best solution because it requires the fewest resources: However often you set the job up to run, Lynx fires and updates your feeds, using about the same amount of resources that 2-3 visitors to your weather page would use under Version 1, but doing it in one fell swoop and in a manner that doesn’t impede any visitor’s experience.
To use it, just create an ASP.NET page that fires off the UpdateNWSFeed subroutine from Version 1, above, in a Page_Load subroutine. Then, set up your Lynx crontab job to access that page however often you want to update the feeds.
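As a rough illustration, the updater page might look something like the sketch below. The page name update-feeds.aspx, the example.com URL, and the hourly schedule in the comment are placeholders, not details from this setup; the page carries its own copy of UpdateNWSFeed so it stands alone, and it assumes the two appSettings keys shown earlier:

```vbnet
<%@ Page Language="VB" %>
<%@ Import Namespace="System.IO" %>
<%@ Import Namespace="System.Net" %>
<script runat="server">
    ' Example crontab entry on the Linux box (hypothetical URL),
    ' running at the top of every hour and discarding Lynx's output:
    ' 0 * * * * lynx -dump http://www.example.com/update-feeds.aspx > /dev/null 2>&1

    ' Same logic as the Version 1 subroutine
    Sub UpdateNWSFeed(strFile As String)
        Dim strLocalPath As String = ConfigurationManager.AppSettings("NWSBaseURI") & strFile
        Dim strRemotePath As String = ConfigurationManager.AppSettings("NWSBaseURL") & strFile

        Dim objRequest As WebRequest = WebRequest.Create(strRemotePath)
        objRequest.Timeout = 3000

        Try
            Dim objResponse As WebResponse = objRequest.GetResponse()
            Dim objReader As New StreamReader(objResponse.GetResponseStream())
            Dim strResults As String = objReader.ReadToEnd()
            objReader.Close()

            Dim objWriter As New StreamWriter(strLocalPath)
            objWriter.Write(strResults)
            objWriter.Close()
        Catch wex As WebException
            ' Leave the old file in place if the request fails
        End Try
    End Sub

    ' Runs on every request, so each Lynx visit attempts one update
    Sub Page_Load(Sender As Object, E As EventArgs)
        UpdateNWSFeed("KAUG.xml")
        Response.ContentType = "text/plain"
        Response.Write("Feed update attempted at " & Now().ToString())
    End Sub
</script>
```

Because anyone who finds this page can trigger a download, it is worth giving it the same hard-to-guess naming treatment as the writeable directory.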
I distribute code under the GNU GPL version 3.