Re: [vox-tech] PHP / CURL
Thanks Dave...
I have a Red Hat Enterprise Linux 3 machine on which I installed:
  curl 7.15.5: with SSL
  php 4.4.4: with curl and command line interface
Unfortunately, the problem is a very basic one! I can't even read a regular HTML webpage from an HTTP server, let alone the HTTPS stuff. For example:
$LOGINURL = "www.yahoo.com"; //or any other http webpage
$agent = "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)";

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $LOGINURL);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec($ch);
curl_close($ch);
print $result;
I saved the above lines in a test.php file and then typed this on the Linux command line:
> php -f test.php
or
> php test.php
Both just print the contents of the .php file (the above lines!) instead of the HTML webpage, and I don't know what's wrong here; it should be a small bug or a user mistake or something like that.
Would you give me a sample PHP/curl script and the necessary steps for running it on a Linux command line? A simple one, something that does the same thing as "wget www.yahoo.com", for example.
Thanks!
On 9/4/06, Dave Margolis <dave@silogram.net> wrote:
On Sep 1, 2006, at 10:35 AM, serendipitu serendipitu wrote:
> I need to READ some data from that page without manually logging in
> every 24 hours.
PHP/curl makes this pretty easy (depending on how much energy the site developers have put into trying to prevent screen-scraping). Any language that has a curl implementation can also do this (Perl is one that comes to mind).
You need a pretty strong understanding of PHP and a basic understanding of how HTTP works. You'll need a webserver that runs PHP or a local machine with the PHP command line interface installed. Then you'll need a script. That script will take a series of steps that each represent a login, a link click, a form submission, or some kind of user interaction with a website.
The process basically works like this:
First you call curl_init() to get things started.
You need to call curl_setopt() any number of times to define what type of call you're going to make (in this case a series of HTTP transactions). These curl_setopt() calls are very similar to the command line switches you'd throw at the command line version of curl.
Then you finish up with a curl_exec() and a curl_close().
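The four steps above can be sketched as a small command-line script. This is only an illustration, not a tested recipe for any particular site: fetch_page() and the default user-agent string are made-up names for the example, and CURLOPT_FOLLOWLOCATION is an extra option added on the assumption that the target may redirect. Note the opening <?php tag; without it the PHP CLI treats the whole file as literal text and just prints it back.

```php
<?php
// Minimal sketch: fetch one page over HTTP, roughly like `wget -O - URL`.

// fetch_page() is a hypothetical helper name, not part of the curl extension.
function fetch_page($url, $agent = 'Mozilla/5.0 (compatible; example-fetcher)')
{
    $ch = curl_init();                               // step 1: start a session
    curl_setopt($ch, CURLOPT_URL, $url);             // like: curl URL
    curl_setopt($ch, CURLOPT_USERAGENT, $agent);     // like: curl -A "..."
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // like: curl -L
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the body instead of printing it
    $result = curl_exec($ch);                        // step 3: false on failure
    if ($result === false) {
        error_log('curl error: ' . curl_error($ch)); // goes to stderr under the CLI
    }
    curl_close($ch);                                 // step 4 (has no effect since PHP 8.0)
    return $result;
}

// Only fetch when a URL is given on the command line.
if (PHP_SAPI === 'cli' && isset($argv[1])) {
    $html = fetch_page($argv[1]);
    if ($html === false) {
        exit(1);
    }
    print $html;
}
```

Saved as, say, fetch.php, it would be run with "php fetch.php http://www.yahoo.com/". The URL should include the http:// scheme.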
It took me a lot of reading and trial and error to figure this all out. I'd start here: http://www.php.net/manual/en/ref.curl.php
Every site is different, and it's difficult to tell you what to do
without having a half.com account.
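As a rough idea of what a "login, then read a page" sequence could look like, here is a sketch that reuses one curl handle so that cookies from the login response are sent on later requests. Everything site-specific is an assumption: the example.com URLs, the form field names, and the helper names new_session(), post_form(), and get_page() are all hypothetical, and a real site may need extra headers, hidden form fields, or HTTPS options.

```php
<?php
// Sketch only: the URLs and form field names in the usage notes are made up.

// Build one handle and reuse it for every step of the session.
function new_session()
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return bodies as strings
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow redirects after login
    curl_setopt($ch, CURLOPT_COOKIEFILE, '');        // empty name: keep cookies in memory
    return $ch;
}

// Step type 1: submit a form (for example the login form) with POST.
function post_form($ch, $url, array $fields)
{
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($fields));
    return curl_exec($ch);                           // false on failure
}

// Step type 2: follow a link with GET.
function get_page($ch, $url)
{
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HTTPGET, true);         // switch back from POST
    return curl_exec($ch);                           // false on failure
}

// Usage (commented out because the site is hypothetical):
// $ch = new_session();
// post_form($ch, 'https://example.com/login', array('user' => 'me', 'pass' => 'secret'));
// print get_page($ch, 'https://example.com/account/orders');
// curl_close($ch);
```

Keeping a single handle avoids writing a cookie file to disk. If the session has to survive between runs, CURLOPT_COOKIEJAR with a file path is the usual alternative.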
Dave
_______________________________________________
vox-tech mailing list
vox-tech@lists.lugod.org
http://lists.lugod.org/mailman/listinfo/vox-tech