[Noisebridge-discuss] city of oakland internal emails dump on DocumentCloud

Flatline flatline at hackbloc.org
Tue Mar 6 02:03:11 UTC 2012


Also, in PDF form, easier for human eyes to parse:
http://s3.documentcloud.org/documents/320449/oakland-city-official-emails-10-11-2011-to-11-13.pdf
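
The page images, the text file, and the PDF linked in this thread all hang
off the same base URL, so the addresses can be generated rather than typed.
A minimal Python 3 sketch, assuming the URL layout seen in these links holds
for the whole document. The names BASE, DOC_ID, SLUG, PAGE_COUNT, page_url,
text_url and pdf_url are made up for illustration and are not part of any
DocumentCloud API:

```python
# Sketch only: builds URLs from the pattern seen in this thread.
# All constant and helper names here are illustrative assumptions.
BASE = "http://s3.documentcloud.org/documents"
DOC_ID = 320449
SLUG = "oakland-city-official-emails-10-11-2011-to-11-13"
PAGE_COUNT = 2183  # page total quoted in the thread


def page_url(page):
    """URL of the 'normal' size GIF for a 1-based page number."""
    if not 1 <= page <= PAGE_COUNT:
        raise ValueError("page must be between 1 and %d" % PAGE_COUNT)
    return "%s/%d/pages/%s-p%d-normal.gif" % (BASE, DOC_ID, SLUG, page)


def text_url():
    """URL of the single combined text file."""
    return "%s/%d/%s.txt" % (BASE, DOC_ID, SLUG)


def pdf_url():
    """URL of the single combined PDF."""
    return "%s/%d/%s.pdf" % (BASE, DOC_ID, SLUG)


if __name__ == "__main__":
    print(page_url(54))
    print(text_url())
    print(pdf_url())
```

Nothing here touches the network; it only formats strings, so it is safe to
run before deciding which of the three forms to fetch.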

Flatline
http://www.hackbloc.org

On 03/05/2012 05:58 PM, Flatline wrote:
> Or here it all is in one handy text file:
> http://s3.documentcloud.org/documents/320449/oakland-city-official-emails-10-11-2011-to-11-13.txt
> 
> Flatline
> http://www.hackbloc.org
> 
> On 03/05/2012 02:35 PM, Nicholas Granado wrote:
>> you can run the following python code to download them all into ./data:
>>
>> #!/usr/bin/python
>> # Python 2 script: fetch every page image into ./data
>> import os
>> import urllib2
>>
>> def download_image_url(url):
>>     request = urllib2.Request(url)
>>     opener = urllib2.build_opener(urllib2.HTTPRedirectHandler(),
>>                                   urllib2.HTTPHandler(debuglevel=0))
>>     handle = opener.open(request)
>>     payload = handle.read()
>>     # the last path component is the page's file name
>>     filename = url.split('/')[-1]
>>     image_filename = "./data/%s" % (filename)
>>     # GIFs are binary, so open the file in binary mode
>>     fh = open(image_filename, 'wb')
>>     fh.write(payload)
>>     fh.close()
>>     print "%s" % (filename)
>>
>> def main():
>>     # create the output directory if it is missing
>>     if not os.path.isdir("./data"):
>>         os.makedirs("./data")
>>     # pages are numbered 1 through 2183
>>     for i in range(1, 2184):
>>         url = "http://s3.documentcloud.org/documents/320449/pages/oakland-city-official-emails-10-11-2011-to-11-13-p%d-normal.gif" % (i)
>>         download_image_url(url)
>>
>> if __name__ == "__main__":
>>     main()
>>
>> nick
>>
>>
>>
>> On Mon, Mar 5, 2012 at 2:21 PM, Nicholas Granado <ngranado at gmail.com> wrote:
>>
>>     they are gif files. the url format, with # standing for the page number, is ....
>>
>>     http://s3.documentcloud.org/documents/320449/pages/oakland-city-official-emails-10-11-2011-to-11-13-p#-normal.gif
>>
>>     so for example if i wanted page 54
>>
>>     http://s3.documentcloud.org/documents/320449/pages/oakland-city-official-emails-10-11-2011-to-11-13-p54-normal.gif
>>
>>     cheers,
>>     nick
>>
>>
>>
>>
>>     On Mon, Mar 5, 2012 at 2:18 PM, Jake <jake at spaz.org> wrote:
>>
>>         does anyone know how to download the entire 2183 pages?
>>         I couldn't find a download button :)
>>
>>         http://www.mercurynews.com/documents/ci_20040081
>>         _______________________________________________
>>         Noisebridge-discuss mailing list
>>         Noisebridge-discuss at lists.noisebridge.net
>>         https://www.noisebridge.net/mailman/listinfo/noisebridge-discuss
>>
>>
>>
>>
>>
> 
> 
> 


