[Noisebridge-discuss] archiving a password protected website

Lee Sonko lee at lee.org
Sat Jun 2 00:07:11 UTC 2012


On Sun, May 13, 2012 at 11:41 PM, Ryan Rawson <ryanobjc at gmail.com> wrote:

> wget --username=<username> --password=<password> -r -np  <URL>
>
>
Isn't that username/password thing for Basic Authentication only? And it's
"--user", not "--username". I fiddled with wget for half an hour, getting
stuck on the username options and trying to log in to their HTTPS server.
No dice.
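
If it's a form login rather than Basic Auth (which is what I suspect), the
usual wget approach, as far as I understand it, is to POST the login form
once to pick up the session cookie, then crawl with that cookie. Roughly
like this (the login URL and form field names below are guesses on my part,
not the real 23andme ones):

  wget --save-cookies=cookies.txt --keep-session-cookies \
       --post-data='username=ME&password=SECRET' \
       -O /dev/null https://www.23andme.com/login/
  wget --load-cookies=cookies.txt -r -np -l 3 https://www.23andme.com/

Or skip the first step entirely and paste the session cookie from the
browser into a --header='Cookie: ...' argument.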

--------------------------------------------------

John Adams wrote about "wget -mk http://foo.com"
The wget manual has no mention of a "-mk" option. Can you tell me what else
I might look for?
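
Guessing here, but -mk might just be the short options -m and -k run
together rather than a separate option:

  wget -m -k http://foo.com
  # same as
  wget --mirror --convert-links http://foo.com

--mirror turns on recursion with infinite depth plus timestamping, and
--convert-links rewrites the links so the saved copy browses properly
offline.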

--------------------------------------------------

Rigel writes:
>23andme does not sequence your genome
...
>it is, IMHO (as a former biology bench-researcher), kind of a scam

Yes yes. You are very smart.

--------------------------------------------------


David Roxex wrote about http://www.charlesproxy.com/
It looks peachy; now I just want to tell Charles Proxy to crawl the site...
and then it'd be super if I could figure out how to, you know, use it. It's
very powerful. Very. I want to do one thing and I've become impatient in my
old age. :-(
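
As far as I can tell, Charles is a recording proxy rather than a crawler,
so the best plan I've come up with is to log in with the browser, fish the
session cookie out of Charles, and hand it to wget. Something along these
lines, assuming Charles is listening on its default localhost:8888 and with
a made-up cookie name:

  https_proxy=http://localhost:8888 wget --no-check-certificate \
       --header='Cookie: sessionid=PASTE-FROM-CHARLES' \
       -r -np -l 2 https://www.23andme.com/

Routing wget back through the proxy is optional; the Cookie header is the
part doing the real work, and --no-check-certificate is only there because
Charles re-signs SSL traffic with its own certificate.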

> be careful, sometimes auto-generated sites can produce endless loops
> of content that confuses wget.
>
> If things get hairy, put:
> -l <number>
>
> to limit how 'deep' the get should follow links.
>
> enjoy
>
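
Noted on the endless-loop problem. For anyone following along, a
depth-limited run would look something like this, reusing the cookies.txt
from the login sketch above (the depth of 3 and the 200m quota are numbers
I pulled out of a hat, nothing 23andme-specific):

  wget --load-cookies=cookies.txt -r -np -l 3 -Q 200m --wait=1 \
       --reject='*logout*' https://www.23andme.com/

-l 3 stops the recursion three links deep, -Q caps the total download,
--wait=1 is just politeness, and rejecting anything with "logout" in the
name is an attempt to keep the crawl from signing itself out halfway
through.
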
> On Sun, May 13, 2012 at 11:33 PM, Lee Sonko <lee at lee.org> wrote:
> > I'm trying to make an archive of a website subscription I belong to,
> > my 23andme.com account. I can't find a tool that will download this
> > website. I'd rather not copy-and-paste 300 pages. I tried WinHTTrack.
> > Maybe WGet excels at this but it's a steep learning curve; it'd be
> > nice if someone could point me in the direction of a tool that could
> > do it.
> >
> > I can see two obvious hurdles. Logging in might be designed to be an
> > interactive process (I tried dragging cookies around in WinHTTrack to
> > no avail). And maybe so much of a modern website depends on the server
> > that it might not be possible to have a web page without one. What say
> > the Noisy-nerds?
> >
> > Lee
> >
>