[Noisebridge-discuss] archiving a password protected website

Ryan Rawson ryanobjc at gmail.com
Mon May 14 06:42:53 UTC 2012


If the auth is done via cookies, use your browser (Chrome dev tools is
good for this) to copy all the cookie content into a 'cookie.txt' file,
and then use:
--load-cookies=cookie.txt

and skip the username/password flags.
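
Something like this, roughly (the cookie name, value, and expiry below are
placeholders; grab the real ones from dev tools, and the -l depth is arbitrary).
--load-cookies wants the Netscape cookies.txt format: one line per cookie with
domain, subdomain flag, path, secure flag, expiry, name, and value, separated
by real tabs (shown as spaces here):

# Netscape HTTP Cookie File
# domain        flag    path    secure  expiry      name       value
.23andme.com    TRUE    /       TRUE    1368600000  sessionid  PASTE_VALUE_FROM_DEV_TOOLS

wget --load-cookies=cookie.txt -r -np -l 3 <URL>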

On Sun, May 13, 2012 at 11:41 PM, Ryan Rawson <ryanobjc at gmail.com> wrote:
> wget --user=<username> --password=<password> -r -np <URL>
>
> be careful, sometimes auto-generated sites can produce endless loops
> of content that confuse wget.
>
> If things get hairy, put:
> -l <number>
>
> to limit how 'deep' wget should follow links.
>
> enjoy
>
> On Sun, May 13, 2012 at 11:33 PM, Lee Sonko <lee at lee.org> wrote:
>> I'm trying to make an archive of a website subscription I belong to,
>> my 23andme.com account. I can't find a tool that will download this website.
>> I'd rather not copy-and-paste 300 pages. I tried WinHTTrack. Maybe wget
>> excels at this, but it has a steep learning curve; it'd be nice if someone could
>> point me in the direction of a tool that could do it.
>>
>> I can see two obvious hurdles. Logging in might be designed to be an
>> interactive process (I tried dragging cookies around in WinHTTrack to no
>> avail). And on a modern website so much may depend on the server that it
>> might not be possible to save a usable page without one. What say the
>> Noisy-nerds?
>>
>> Lee
>>


