
Entries in web (5)

Friday
Feb 22, 2013

Accessing the Internet from PowerShell: Download Files Part 2

So, last time we looked at downloading a file, and at what a CMDLet to do this should look like.

And here it is:

 

Most of the layout of this CMDLet should be familiar to you. There is the comment-based help, then the CMDLetBinding. We then have a list of parameters: URL is the only mandatory one, and the others are all optional. URL is also set to accept its value from the pipeline, which is the first step towards meeting the first requirement of our CMDLet. The filename and directory parameters specify the intended destination file's name and location. Finally, clobber is a switch, meaning it is either on or off. Clobber is how the user of the CMDLet tells it to overwrite files.

Next you will notice a Begin{} block of code. In here we create the WebClient object and set all of its options. A Begin{} block is executed only once, no matter how many items are passed in on the pipeline, and it runs before any input data is processed. End{}, which isn’t used here, is executed only once, after all the processing is completed.
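To make the once-versus-per-item behaviour concrete, here is a minimal sketch. The function name Measure-PipelineBlocks and its output properties are made up for illustration and are not part of the CMDLet discussed in this post.

```powershell
# Hypothetical demo function: counts how often each block runs.
function Measure-PipelineBlocks {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [string]$Item
    )
    begin {
        # Runs once, before the first pipeline item arrives.
        $beginRuns = 1
        $processed = @()
    }
    process {
        # Runs once for every item passed in on the pipeline.
        $processed += $Item
    }
    end {
        # Runs once, after the last item has been processed.
        [pscustomobject]@{
            BeginRuns   = $beginRuns
            ProcessRuns = $processed.Count
            Items       = $processed
        }
    }
}

# Three items on the pipeline: begin runs once, process runs three times.
$result = 'a', 'b', 'c' | Measure-PipelineBlocks
```

This is why expensive setup, such as creating a WebClient, belongs in Begin{} rather than Process{}.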

A Process{} block is next. Process{} is executed for every item passed in on the pipeline to the CMDLet. In this block we work out the destination filename and path, then test to see if it exists. If the file already exists, we either raise a warning that we are overwriting the file or throw an error, depending on the clobber parameter.

Finally we write the full URL and destination path to verbose output and download the file. Any errors encountered are thrown back to the calling code.
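The original listing is not reproduced on this page, so the following is only a sketch of a CMDLet matching the description above. The function name Get-WebFile and the exact parameter names are assumptions, and the help block is trimmed to keep it short.

```powershell
function Get-WebFile {
    <#
    .SYNOPSIS
    Downloads a file from a URL (sketch; name and parameters are assumptions).
    .PARAMETER Url
    Address to download from. Accepts pipeline input.
    .PARAMETER FileName
    Optional destination file name; defaults to the leaf of the URL.
    .PARAMETER Directory
    Optional destination directory; defaults to the current location.
    .PARAMETER Clobber
    Overwrite an existing destination file (a warning is written).
    #>
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [string]$Url,
        [string]$FileName,
        [string]$Directory,
        [switch]$Clobber
    )
    begin {
        # Created once and reused for every URL on the pipeline.
        $client = New-Object System.Net.WebClient
    }
    process {
        # Work out the destination directory and file name.
        $dir = if ($Directory) { $Directory } else { (Get-Location).Path }
        # Make the directory absolute; .NET does not follow PowerShell's current location.
        $dir = $PSCmdlet.GetUnresolvedProviderPathFromPSPath($dir)
        $name = if ($FileName) { $FileName } else {
            Split-Path -Path ([System.Uri]$Url).AbsolutePath -Leaf
        }
        $destination = Join-Path -Path $dir -ChildPath $name

        # Refuse to overwrite unless -Clobber was given; warn when we do.
        if (Test-Path -LiteralPath $destination) {
            if (-not $Clobber) { throw "Destination already exists: $destination" }
            Write-Warning "Overwriting $destination"
        }

        Write-Verbose "Downloading $Url to $destination"
        try {
            $client.DownloadFile($Url, $destination)
        }
        catch {
            # Hand the error back to the calling code.
            throw
        }
    }
    end {
        $client.Dispose()
    }
}
```

One caveat with this sketch: if -FileName is combined with several piped URLs, every download targets the same file, so it is best used with a single URL.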

Let’s look at how to use the cmdlet:

So, now you have it. You can download files.

Saturday
Feb 16, 2013

Accessing the Internet from PowerShell: Download Files Part 1

 

Today we are going to download files from the internet, and a few other places, as a continuation of the series on the WebClient class in the .Net framework. The WebClient class has a method called DownloadFile which, as the name says, allows us to download files from many sources, including HTTP, HTTPS, FTP and \\server\share locations. The aim is to have, by the end of this, a CMDLet that allows us to download files from these various locations.

So, how does the DownloadFile method work? Here is an example:
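The original example is not shown on this page. As a stand-in, here is a minimal sketch of calling DownloadFile; the wrapper name Save-UrlToFile, the example URL and the destination path are all placeholders.

```powershell
# Hypothetical wrapper around WebClient.DownloadFile(address, fileName).
function Save-UrlToFile {
    param(
        [Parameter(Mandatory = $true)][string]$Url,
        [Parameter(Mandatory = $true)][string]$Destination
    )
    $client = New-Object System.Net.WebClient
    try {
        # The second argument must be a file name, not just a folder.
        $client.DownloadFile($Url, $Destination)
    }
    finally {
        $client.Dispose()
    }
}

# Placeholder usage (URL and path are examples only):
# Save-UrlToFile -Url 'http://example.com/file.zip' -Destination 'C:\Temp\file.zip'
```

It is safest to pass a full path as the destination, because .NET resolves relative paths against the process working directory, which is not necessarily PowerShell's current location.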

The only issue with DownloadFile is that we need to specify a filename as the destination. This is something we will need to work around/with.

So if we wanted to design a CMDLet around this, what should we consider? Firstly, let’s just ignore headers and encoding types and all that junk and assume we will handle that. What else is there? Well, for a CMDLet that downloads files, we probably want to be able to use it in the pipeline; that is, I might want to pipe an array or list of URLs to that CMDLet and have it download them all. Next, I don't want to have to specify the destination file name and folder all the time, but that doesn't mean I might not want to specify one or the other down the track. Finally, I hate downloading a file twice, or accidentally overwriting a destination file. I want the CMDLet not to overwrite files unless I tell it to, and if it does overwrite one, to warn me about it.

So here are the requirements:

  • We want to be able to pipe URLs to the CMDLet so they can be downloaded
  • Destination filename optional. CMDLet to determine filename if none specified
  • Destination folder/directory optional. We might want to send a file to a different directory than the current working directory.
  • By default, do not overwrite files. If we tell the CMDLet to overwrite files, it should warn us

Well, supporting the pipeline when writing a CMDLet is easy: we simply use the begin{} process {} end {} syntax. If we do things in a clever way, we should be able to reduce the amount of work we are doing for a large list of downloads.

The second requirement, optionally specifying the filename to download the file to, can be a little more difficult! The trick is to have two parameters, filename and directory. Depending on what is or isn’t specified, we can do 4 different things.

  1. If both are specified -> join the two together
  2. If no filename or destination directory is specified -> the destination is the current directory (converted from .) joined with the "leaf" part of the URL
  3. If no filename is specified, but a directory is -> the destination is the specified directory joined with the "leaf" part of the URL
  4. If filename is specified but a directory is not -> the destination is the current directory (converted from .) joined with the specified filename
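The four cases collapse into two independent defaults, which can be sketched as follows. The function name Resolve-DownloadPath and its parameter names are hypothetical; the real CMDLet may structure this differently.

```powershell
# Hypothetical helper: works out the destination path from the URL and the
# optional file name / directory.
function Resolve-DownloadPath {
    param(
        [Parameter(Mandatory = $true)][string]$Url,
        [string]$FileName,
        [string]$Directory
    )
    # No directory given: fall back to the current location (the "." case).
    if ([string]::IsNullOrEmpty($Directory)) {
        $Directory = (Get-Location).Path
    }
    # No file name given: use the "leaf" (last path segment) of the URL.
    # AbsolutePath drops any query string, so "?x=1" does not end up in the name.
    if ([string]::IsNullOrEmpty($FileName)) {
        $FileName = Split-Path -Path ([System.Uri]$Url).AbsolutePath -Leaf
    }
    Join-Path -Path $Directory -ChildPath $FileName
}

# Placeholder URL, for illustration only.
$url = 'http://example.com/files/tool.zip'
$both      = Resolve-DownloadPath -Url $url -FileName 'x.zip' -Directory 'downloads'
$neither   = Resolve-DownloadPath -Url $url
$onlyDir   = Resolve-DownloadPath -Url $url -Directory 'downloads'
$onlyFile  = Resolve-DownloadPath -Url $url -FileName 'x.zip'
```

Treating the two parameters separately means there is no need for a four-way if/else: each missing value is simply filled in with its default before the join.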

The final requirement, not overwriting files by default, is also pretty simple. We can test to see if a file exists; if it doesn't, then we are OK to download the file to the destination. Otherwise we need a switch parameter, something like -clobber, to tell the CMDLet whether it's OK to overwrite the file. If it’s not OK, we will throw an error; if it is, we will use write-warning to make sure we know that a file is being overwritten.
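The overwrite check could look something like this. It is a sketch only, and Test-DownloadTarget is a made-up name used here to isolate the logic.

```powershell
# Hypothetical helper: throws if the file exists and -Clobber was not given,
# warns if it exists and -Clobber was given, and does nothing otherwise.
function Test-DownloadTarget {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true)][string]$Path,
        [switch]$Clobber
    )
    if (Test-Path -LiteralPath $Path) {
        if ($Clobber) {
            Write-Warning "Overwriting existing file: $Path"
        }
        else {
            throw "File already exists: $Path. Use -Clobber to overwrite."
        }
    }
}
```

Because the function uses CmdletBinding, callers can still silence the warning with -WarningAction if they really want to.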

We will look at the cmdlet next time.

 

Saturday
Feb 09, 2013

Note on PowerShell/Web Series

Just a quick note on the current series I am running on accessing the Internet from PowerShell. If you are looking at PowerShell V3, you will note the invoke-webrequest CMDLet. Whilst it is probably a million times more powerful than my humble set of web functions, it can be a little complicated. I am going to explore it over the coming weeks, and will report back my findings.

Friday
Feb 08, 2013

Accessing the Internet from PowerShell: Get-WebPage Part2

Welcome back.

So last time we looked at the get-webpage CMDLet. First, though, you may notice that this version of the CMDLet is different from the one I showed off at Infrastructure Saturday. You're right, it has been updated. The update makes supplying user-agent headers easier.

Let’s go back to the get-externalip function we defined a few months ago. Previously it was:

Now if we make use of the cmdlet we just finished:

Pretty neat? 

I will leave it here, explore the cmdlet on your own.

 

 

Friday
Feb 01, 2013

Accessing the Internet from PowerShell: Get-WebPage Part1

So last time we looked at accessing the web from PowerShell, we looked at a simple function which returned our external IP address; we then used that function to make a script to automatically alert us when our external IP address changed. This was a pretty simple function and script, but I want to spend some time discussing some of this further and build a full CMDLet.

In later posts I want to discuss how I believe you should design your PowerShell code, including CMDLets, functions, scripts and modules, but I want to touch on some of this today.

In the get-externalip function, we simply created the .Net framework webclient object, set up some headers, and then called the downloadstring method. This is fine for this simple script, but it isn't great design in the long term. What we need is a CMDLet that downloads a URL from the internet (or maybe a local source), and then to make use of that CMDLet. The CMDLet should give us a powerful framework where we can easily leverage all of the webclient class functionality, but in a refined and controlled PowerShell manner.

We should also ensure that we provide all the help and documentation with our CMDLets, so anyone can use them. For this we will use the PowerShell Comment Based Help Syntax, something that is an overlooked part of PowerShell. Seriously, Microsoft should be encouraging more use of this. Every language has something similar to this, so it’s not something special in PowerShell, but if every PowerShell developer used the comment based help syntax, the world would be a much better place.
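For anyone who has not seen the syntax, here is a deliberately tiny example. Get-Greeting is a made-up function used only to show the layout of the help block.

```powershell
function Get-Greeting {
    <#
    .SYNOPSIS
    Returns a greeting for the given name.
    .DESCRIPTION
    A tiny, hypothetical function used only to show the help layout.
    .PARAMETER Name
    The name to greet.
    .EXAMPLE
    Get-Greeting -Name 'World'
    #>
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true)]
        [string]$Name
    )
    "Hello, $Name"
}

# get-help reads the comment block above, for example:
# Get-Help Get-Greeting -Full
```

The keywords (.SYNOPSIS, .DESCRIPTION, .PARAMETER, .EXAMPLE) are what get-help uses to build its output, so the documentation lives right next to the code it describes.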

Let’s take a look at a CMDLet that will get the HTML representation of a page, this CMDLet will also allow us to set things like the proxy server, headers, credentials for the remote page and the user agent!

The CMDLet below is based upon my Infrastructure Saturday 2012 presentation on PowerShell.

Whoa, quite a bit here, around 100 lines of code! Most of which though is documentation. As you can see, we have the start of the function definition, then the comment based help syntax. This syntax is providing documentation for the get-help CMDLet, including descriptions, parameters, inputs, outputs and examples.

Next we have the CMDLetBinding. We are not specifying anything extra here, just telling PowerShell that this is a CMDLet.

After this we have the parameters. They all look pretty simple, and only one, URL, is mandatory. I have also specified types for all of the parameters. Whilst specifying types is optional, I do it wherever possible: it reduces errors and provides very useful input validation.

Now we have the body of the CMDLet. As expected, we are making a new instance of the WebClient class. Then we have some if statements. If these optional parameters have been specified, then we need to set the various parts of the WebClient object.

I hard set the encoding; it makes things simple and easy.

We then create an object to hold the result and, within a try {} catch{} block, call DownloadString. If we do catch any errors, we simply throw them to the calling code. We don’t particularly care about handling errors here, since the calling code should; however, it's crucial to handle errors in a controlled manner.

Finally, we return the resulting page.
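The original listing is not reproduced on this page. The following is a condensed sketch of a CMDLet with the shape described above, with the comment based help omitted for brevity. The parameter names and the choice of UTF-8 as the fixed encoding are assumptions.

```powershell
function Get-WebPage {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true)]
        [string]$Url,
        [string]$Proxy,
        [hashtable]$Headers,
        [System.Management.Automation.PSCredential]$Credential,
        [string]$UserAgent
    )
    $client = New-Object System.Net.WebClient

    # Only touch the WebClient settings that were actually specified.
    if ($Proxy) {
        $client.Proxy = New-Object System.Net.WebProxy($Proxy)
    }
    if ($Credential) {
        $client.Credentials = $Credential.GetNetworkCredential()
    }
    if ($Headers) {
        foreach ($key in $Headers.Keys) {
            $client.Headers.Add($key, $Headers[$key])
        }
    }
    if ($UserAgent) {
        $client.Headers.Add('User-Agent', $UserAgent)
    }

    # Hard-set encoding (UTF-8 is an assumption in this sketch).
    $client.Encoding = [System.Text.Encoding]::UTF8

    $page = $null
    try {
        $page = $client.DownloadString($Url)
    }
    catch {
        # Pass the error back to the calling code.
        throw
    }
    finally {
        $client.Dispose()
    }
    return $page
}

# Placeholder usage (URL and user agent are examples only):
# Get-WebPage -Url 'http://example.com/' -UserAgent 'MyScript/1.0'
```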

Next time we will look at some examples and some further discussion of the code.