Get item step and repeat (from web)

MrBishop

Well-Known Member
I need to get multiple items from an archived web-site. In this case, images. The images are saved on the same site with date-based file names.

E.g.
http://www.otcentral.com/images/2001-01-01.gif
http://www.otcentral.com/images/2001-01-02.gif
http://www.otcentral.com/images/2001-01-03.gif
http://www.otcentral.com/images/2001-01-04.gif

Anyone know a way to program a step-repeat get program?

Huge time-saver for me...it beats changing the file addy, right click, save as, repeat. This is for 6 years worth of images and is VERY time consuming.

Thanks.
 
Python and wget should do it. How many MB are we talking?
I could code it and then send them to you in a zip file.
 
36-40k per file..they're all .gifs.
365 files/year
2002-2006

quick math
about 66-73MB altogether (584 is the figure in megabits, not megabytes)
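
For anyone checking the quick math, here is a rough sketch of the estimate. It assumes Python 3 (not the 2.x that ships with older Macs), decimal units (1 MB = 1000 KB), and exactly 365 files per year as stated in the post; the variable names are made up for illustration.

```python
# Back-of-the-envelope size estimate using the figures from the post.
# Assumptions: decimal units (1 MB = 1000 KB), 365 files per year (no leap days).
files_per_year = 365
years = 5                    # 2002 through 2006
kb_low, kb_high = 36, 40     # stated size range per .gif, in KB

total_files = files_per_year * years
mb_low = total_files * kb_low / 1000
mb_high = total_files * kb_high / 1000
megabits_high = mb_high * 8  # the same upper bound expressed in megabits

print(total_files, mb_low, mb_high, megabits_high)
```

So the whole set is on the order of 70 MB; a figure of 584 only comes out if the total is counted in megabits.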

Do wget or Python work on a Mac?
 
Tried finding Python..couldn't. I DLed wget and tried working it, but the directory source isn't exact. (or I'm not using it right). Instead of just grabbing the images (as if off of an ftp site) it was archiving the lot.

So, what I'd get is a whole whack of archived web pages instead of just the gifs. Back to square one. :p
 
What do you mean you couldn't find python? It comes built into the OS. Open terminal, type "python". You'll need textmate or summit for the scripts tho'.
 
I have no access to my uni puter right now, so no python for me, but if you post the way the images are organized I can make the python script and post it tomorrow.
 
Sorry...but I looked in Applications and did a Find and couldn't find anything. I did now, thanks to your instructions.

I just 'use' a MAC...I'm not a MAC-head yet. :)

OK... I've got Python 2.3.5
Now what?
 
I'll look into it.
Post it as I can...but it's a remote location. I don't have 'server access' per se, so I can't browse.
 
Copy the code below and save it as a file with a .py extension, then make it executable with chmod u+x file.py

Run it with
./file.py

Python:
#!/usr/bin/python
# Python 2 script: asks wget for one image per calendar date.
import commands

servername="http://servername.com/"
pathtoimages="path/to/images/"
filenameprefix="fileprefix"

days=range(1,32)
months=range(1,13)
years=range(2005,2008)

for y in years:
        for m in months:
                for d in days:
                        # Zero-pad month and day (2005-01-01, not 2005-1-1).
                        # Dates that don't exist (e.g. 02-31) just make wget report an error.
                        print commands.getoutput("wget %s%s%s-%d-%02d-%02d.jpg"%(servername,pathtoimages,filenameprefix,y,m,d))

Of course, you need to change the values of the variables to match what you want.
 
I'm guessing that *.* doesn't work here, eh :p

The file names are date based (2002-01-01) and all are .gif files.
No prefix
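
Since the names are plain zero-padded dates, the list of addresses can be generated by walking the calendar one day at a time. A minimal sketch, assuming Python 3 and assuming the images sit under the same http://www.otcentral.com/images/ path as the examples in the first post (the 2002-2006 files may live elsewhere); `image_urls` is a hypothetical helper name.

```python
from datetime import date, timedelta

# Assumed base URL, taken from the example links in the first post.
BASE_URL = "http://www.otcentral.com/images/"

def image_urls(start, end, base_url=BASE_URL, ext=".gif"):
    """Yield one URL per calendar day from start to end, inclusive."""
    day = start
    while day <= end:
        # date.isoformat() gives YYYY-MM-DD with zero-padded month and day
        yield base_url + day.isoformat() + ext
        day += timedelta(days=1)

urls = list(image_urls(date(2002, 1, 1), date(2006, 12, 31)))
print(len(urls))
print(urls[0])
print(urls[-1])
```

Stepping through real dates skips impossible ones such as 02-31 and picks up the leap day in 2004.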
 
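Python:
#!/usr/bin/python
# Python 2 script: asks wget for one .gif per calendar date, 2002 through 2006.
import commands

servername="http://servername.com/"
pathtoimages="path/to/images/"
filenameprefix=""

days=range(1,32)
months=range(1,13)
years=range(2002,2007)

for y in years:
        for m in months:
                for d in days:
                        # Zero-pad month and day so the name matches 2002-01-01.gif.
                        # Dates that don't exist (e.g. 02-31) just make wget report an error.
                        print commands.getoutput("wget %s%s%s%d-%02d-%02d.gif"%(servername,pathtoimages,filenameprefix,y,m,d))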

Done.
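
If wget isn't handy, the same loop can be done with the standard library alone. This is only a sketch under assumptions: it needs Python 3 (so not the 2.3.5 mentioned earlier in the thread), the server and path are the same placeholders as in the script, and `download_range` with its `fetch` parameter is a made-up helper, not part of any library.

```python
import os
from datetime import date, timedelta
from urllib.error import URLError
from urllib.request import urlretrieve

def download_range(base_url, start, end, dest_dir=".", ext=".gif", fetch=urlretrieve):
    """Fetch base_url + YYYY-MM-DD + ext for each day from start to end.

    Returns (saved, missing): file names that downloaded and ones that failed.
    """
    os.makedirs(dest_dir, exist_ok=True)
    saved, missing = [], []
    day = start
    while day <= end:
        name = day.isoformat() + ext          # e.g. 2002-01-01.gif
        target = os.path.join(dest_dir, name)
        try:
            fetch(base_url + name, target)
            saved.append(name)
        except URLError:
            # Covers HTTP errors such as 404 for a day with no image; keep going.
            missing.append(name)
        day += timedelta(days=1)
    return saved, missing

# Example call (placeholder server and path, change them to the real ones):
# saved, missing = download_range("http://servername.com/path/to/images/",
#                                 date(2002, 1, 1), date(2006, 12, 31), "gifs")
```

The `missing` list makes it easy to see afterwards which days had no image instead of scrolling through wget output.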
 