Aperture Science

tagged

web in

Coding

Accessing the Internet from PowerShell: Download Files Part 1

Saturday, February 16, 2013 at 12:00PM

Today we are going to download files from the internet, and from a few other places, as a continuation of the series on the WebClient class in the .NET Framework. The WebClient class has a method called DownloadFile which, as the name says, allows us to download files from many sources, including HTTP, HTTPS, FTP and \\server\share locations. The aim is to have, by the end of this, a CMDLet that lets us download files from these various locations.

So, how does the DownloadFile method work? Here is an example:
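A minimal sketch of such a call is below. So that it runs without a network connection, it uses a local temporary file as the source; the file names are made up for illustration, and an HTTP, HTTPS or FTP address would go in the same first argument.

```powershell
# Create a small local file to stand in for the remote source (hypothetical names).
$tempDir     = [System.IO.Path]::GetTempPath()
$source      = Join-Path -Path $tempDir -ChildPath 'downloadfile-source.txt'
$destination = Join-Path -Path $tempDir -ChildPath 'downloadfile-copy.txt'
Set-Content -Path $source -Value 'hello from the source file'

$client = New-Object System.Net.WebClient
try {
    # First argument: the address to fetch. Second argument: the destination
    # file name, which DownloadFile always requires.
    $client.DownloadFile($source, $destination)
}
finally {
    $client.Dispose()
}
```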

The only issue with DownloadFile is that we need to specify a filename as the destination. This is something we will need to work around/with.

So if we wanted to design a CMDLet around this, what should we consider? Firstly, let’s ignore headers, encoding types and the like, and assume we will handle them. What else is there? A CMDLet that downloads files should work in the pipeline: I might want to pipe an array or list of URLs to it and have it download them all. Next, I don't want to have to specify the destination filename and folder every time, but I may still want to specify one or the other down the track. Finally, I hate downloading a file twice or accidentally overwriting a destination file. I want the CMDLet to not overwrite files unless I tell it to, and when it does overwrite one, to warn me about it.

So here are the requirements:

· We want to be able to pipe URLs to the CMDLet so they can be downloaded
· Destination filename optional. CMDLet to determine filename if none specified
· Destination folder/directory optional. We might want to send a file to a different directory than the current working directory.
· By default, do not overwrite files. If we tell the CMDLet to overwrite files, it should warn us

Supporting the pipeline when writing a CMDLet is easy: we simply use the begin {} process {} end {} syntax. If we do things in a clever way, we should be able to reduce the amount of work we are doing for a large list of downloads.
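As a rough sketch of that syntax, the function below accepts URLs from the pipeline and creates its WebClient only once, in begin {}. The name Show-DownloadPlan and the example.com URLs are hypothetical, and the function only reports what it would download rather than downloading anything.

```powershell
function Show-DownloadPlan {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [string]$Url
    )
    begin {
        # Runs once, before the first URL: set up shared objects here.
        $client = New-Object System.Net.WebClient
        $count  = 0
    }
    process {
        # Runs once for every URL that arrives on the pipeline.
        $count++
        $leaf = [System.IO.Path]::GetFileName(([System.Uri]$Url).AbsolutePath)
        '{0}: {1} -> {2}' -f $count, $Url, $leaf
    }
    end {
        # Runs once, after the last URL: clean up.
        $client.Dispose()
        Write-Verbose "Processed $count URL(s)."
    }
}

'http://example.com/files/a.zip', 'http://example.com/files/b.zip' | Show-DownloadPlan
```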

The second requirement, optionally specifying the filename to download the file to...that can be a little more difficult! The trick is to have two parameters, filename and directory. Depending on what is or isn’t specified, we can do 4 different things.

1. If both are specified -> join the two together
2. If no filename or destination directory is specified -> the destination is the current directory (converted from .) joined with the "leaf" part of the URL
3. If no filename is specified, but a directory is -> the destination is the specified directory joined with the "leaf" part of the URL
4. If filename is specified but a directory is not -> The destination is the current directory (converted from .) joined with the specified filename
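The four cases above collapse into two defaults followed by a join. The sketch below assumes a helper named Resolve-DownloadDestination with parameters -Url, -FileName and -Directory (all hypothetical names), uses Convert-Path to turn "." into a full path, and takes the "leaf" from the path portion of the URL.

```powershell
function Resolve-DownloadDestination {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true)]
        [string]$Url,
        [string]$FileName,
        [string]$Directory
    )
    # No directory given: fall back to the current directory, converted from "."
    if (-not $Directory) {
        $Directory = Convert-Path -Path '.'
    }
    # No file name given: fall back to the "leaf" part of the URL.
    if (-not $FileName) {
        $FileName = [System.IO.Path]::GetFileName(([System.Uri]$Url).AbsolutePath)
    }
    # Whatever was or was not specified, the destination is directory + file name.
    Join-Path -Path $Directory -ChildPath $FileName
}
```

Called with only -Url, this returns the current directory joined with the URL leaf (case 2); adding -FileName or -Directory overrides the corresponding half.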

The final requirement, not overwriting files by default, is also pretty simple. We can test whether the destination file exists; if it doesn't, we are OK to download the file to the destination. Otherwise we need a switch parameter, something like -clobber, to tell the CMDLet whether it's OK to overwrite the file. If it’s not OK, we will throw an error; if it is, we will use Write-Warning to make sure we know that a file is being overwritten.
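One way this check could look is sketched below. The function name Assert-DestinationWritable and the wording of the messages are assumptions for illustration; only the -Clobber idea comes from the description above.

```powershell
function Assert-DestinationWritable {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true)]
        [string]$Path,
        [switch]$Clobber
    )
    if (Test-Path -LiteralPath $Path) {
        if (-not $Clobber) {
            # The file exists and we were not told to overwrite it: stop here.
            throw "Destination '$Path' already exists. Use -Clobber to overwrite it."
        }
        # The file exists and overwriting was requested: warn, then carry on.
        Write-Warning "Overwriting existing file '$Path'."
    }
    # If the file does not exist there is nothing to do; the download can proceed.
}
```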

We will look at the cmdlet next time.

Note on PowerShell/Web Series

Saturday, February 9, 2013 at 12:00PM

Just a quick note on the current series I am running on accessing the Internet from PowerShell. If you are looking at PowerShell V3, you will notice the Invoke-WebRequest CMDLet. Whilst it is probably a million times more powerful than my humble set of web functions, it can be a little complicated. I am going to explore it over the coming weeks, and will report back my findings.

Accessing the Internet from PowerShell: Get-WebPage Part2

Friday, February 8, 2013 at 12:00PM

Welcome back.

Last time we looked at the get-webpage CMDLet. First, though, you may notice that this version of the CMDLet is different from the one I showed off at Infrastructure Saturday; you're right, it has been updated. The update makes supplying user-agent headers easier.

Let’s go back to the get-externalip function we defined a few months ago. Previously it was:

Now if we make use of the cmdlet we just finished:

Pretty neat?

I will leave it here; explore the cmdlet on your own.

Accessing the Internet from PowerShell: Get-WebPage Part1

Friday, February 1, 2013 at 12:00PM

The last time we looked at accessing the web from PowerShell, we looked at a simple function which returned our external IP address. We then used that function to make a script that automatically alerts us when our external IP address changes. This was a pretty simple function and script, but I want to spend some time discussing some of this further and build a full CMDLet.

In later posts I want to discuss how I believe you should design your PowerShell code, including CMDLets, functions, scripts and modules, but I want to touch on some of this today.

In the get-externalip function, we simply created the .NET Framework WebClient object, set up some headers, and then called the DownloadString method. This is fine for this simple script, but it isn't great design in the long term. What we need is a CMDLet that downloads a URL from the internet (or maybe a local source), and then to make use of that CMDLet. The CMDLet should give us a powerful framework where we can easily leverage all of the WebClient class functionality, but in a refined and controlled PowerShell manner.

We should also ensure that we provide all the help and documentation with our CMDLets, so anyone can use them. For this we will use the PowerShell comment-based help syntax, an overlooked part of PowerShell. Seriously, Microsoft should be encouraging more use of this. Every language has something similar, so it’s not something special to PowerShell, but if every PowerShell developer used the comment-based help syntax, the world would be a much better place.
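For readers who have not seen it, here is a small sketch of comment-based help on a made-up function, Get-Greeting (hypothetical, not part of this series). The help block sits at the top of the function body and uses keywords such as .SYNOPSIS, .PARAMETER and .EXAMPLE.

```powershell
function Get-Greeting {
    <#
    .SYNOPSIS
    Returns a greeting for the given name.

    .DESCRIPTION
    A toy function that exists only to show where comment-based help goes.

    .PARAMETER Name
    The name to greet.

    .INPUTS
    None. This function does not accept pipeline input.

    .OUTPUTS
    System.String

    .EXAMPLE
    Get-Greeting -Name 'World'
    #>
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true)]
        [string]$Name
    )
    "Hello, $Name"
}
```

Once the function is loaded, running Get-Help Get-Greeting -Full should display these sections in the same way as help for a built-in CMDLet.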

Let’s take a look at a CMDLet that will get the HTML representation of a page. This CMDLet will also allow us to set things like the proxy server, headers, credentials for the remote page and the user agent!

The CMDLet below is based upon my Infrastructure Saturday 2012 presentation on PowerShell.

Whoa, quite a bit here, around 100 lines of code! Most of which though is documentation. As you can see, we have the start of the function definition, then the comment based help syntax. This syntax is providing documentation for the get-help CMDLet, including descriptions, parameters, inputs, outputs and examples.

Next we have the CMDLetBinding. We are not specifying anything extra here, just telling PowerShell that this is a CMDLet.

After this we have the parameters. They all look pretty simple, and only one is mandatory: URL. I have also specified types for all of the parameters. Whilst the specification of types is optional, I do this wherever possible; it reduces errors and provides very useful input validation.

Now we have the body of the CMDLet. As expected, we are making a new instance of the WebClient class. Then we have some if statements. If these optional parameters have been specified, then we need to set the various parts of the WebClient object.

I hard-set the encoding; it makes things simple and easy.

We then create an object to hold the result and, within a try {} catch {} block, call DownloadString. If we do catch any errors, we simply throw them to the calling code. We don’t particularly care about handling errors here, as the calling code should; however, it's crucial to handle errors in a controlled manner.

Finally, we return the resulting page.
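The structure walked through above can be sketched in condensed form as follows. This is not the original listing: the function name Get-WebPageSketch, the parameter names, and the choice of UTF-8 as the fixed encoding are all assumptions for illustration, and the comment-based help block is left out for brevity.

```powershell
function Get-WebPageSketch {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true)]
        [string]$Url,
        [string]$UserAgent,
        [System.Collections.Hashtable]$Headers,
        [System.Net.ICredentials]$Credential,
        [System.Net.IWebProxy]$Proxy
    )
    $client = New-Object System.Net.WebClient

    # Only touch the WebClient where an optional parameter was supplied.
    if ($Proxy)      { $client.Proxy = $Proxy }
    if ($Credential) { $client.Credentials = $Credential }
    if ($Headers) {
        foreach ($key in $Headers.Keys) {
            $client.Headers.Add($key, $Headers[$key])
        }
    }
    if ($UserAgent)  { $client.Headers.Add('User-Agent', $UserAgent) }

    # Fixed encoding (UTF-8 is an assumption made for this sketch).
    $client.Encoding = [System.Text.Encoding]::UTF8

    $page = $null
    try {
        $page = $client.DownloadString($Url)
    }
    catch {
        # Rethrow so that the calling code decides how to handle the error.
        throw
    }
    finally {
        $client.Dispose()
    }
    return $page
}
```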

Next time we will look at some examples and some further discussion of the code.
