User's answers tagged Parsing
  • How do I extract data from HTML code using PHP?

    @OVK2015
    $testStr = '<tr>
    <td height="20"><a title="Замена кнопки Home" href="/V-iPhone-ne-rabotaet-knopka-Home">Замена кнопки HOME</a></td>
    <td align="center"><span id="id169-iphone-5-home">1200</span></td>
    <td align="center">от 30 минут</td>
    <td align="center">3 месяца</td>
    </tr>
    <tr>
    <td height="20"><a title="Ремонт кнопки блокировки" href="/Ne-rabotaet-knopka-vkliucheniia">Замена кнопки включения</a></td>
    <td align="center"><span id="id166-iphone-5-power">1400</span></td>
    <td align="center">от 30 минут</td>
    <td align="center">3 месяца</td>
    </tr>';
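    // Capture the part of each span id between "id and the first hyphen,
    // e.g. "169" from id="id169-iphone-5-home".
    $regExpWrapper = "#<span(?:.*?)\"id(.*?)\-#si";
    preg_match_all($regExpWrapper, $testStr, $matches);
    // $matches[0] holds the full matches, $matches[1] the captured values.
    print_r($matches);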
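
If the goal is the value inside each span rather than only the number in its id, the same approach can be extended with a second capture group. This is a minimal sketch under that assumption; the shortened `$html` sample, the `$pattern` and the `$prices` array are illustrative names that do not appear in the original answer.

```php
<?php
// Two cells shaped like the sample above (shortened, hypothetical copy).
$html = '<td align="center"><span id="id169-iphone-5-home">1200</span></td>
<td align="center"><span id="id166-iphone-5-power">1400</span></td>';

// Group 1: digits that follow "id" in the id attribute.
// Group 2: the text between <span ...> and </span>.
$pattern = '#<span[^>]*\bid="id(\d+)-[^"]*"[^>]*>(.*?)</span>#si';
preg_match_all($pattern, $html, $rows, PREG_SET_ORDER);

// Build a map from the numeric id to the span text.
$prices = [];
foreach ($rows as $row) {
    $prices[$row[1]] = trim($row[2]);
}
print_r($prices);
```

Regular expressions work here because the markup is small and regular; for arbitrary pages a DOM parser is usually more robust.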
  • How do I parse a block on a page?

    @OVK2015
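    <?php
    	$link = "http://www.teleguide.info/kanal100055_20160413.html";
    	$page = file_get_contents($link);
    	if ($page === false) {
    		die("Error fetching data from ".$link);
    	}
    	// Capture everything between the "programm" and "programm_up" divs.
    	$regExpWrapper = "#(?:<div id=\"programm\">)(.*?)(?:<div id=\"programm_up\">)#si";
    	if (preg_match_all($regExpWrapper, $page, $matches) > 0) {
    		// Convert the matched block from UTF-8 to CP1251 before printing.
    		echo iconv("UTF-8", "CP1251", $matches[1][0]);
    	}
    ?>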
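
The "everything between two markers" idea above can be wrapped in a small reusable function. The sketch below is one possible shape, not part of the original answer: `extractBetween` is a hypothetical helper name, and the `$page` string is a made-up stand-in for the downloaded page so the example runs without network access.

```php
<?php
// Hypothetical helper: returns the text between two literal markers,
// or null when the markers are not found in that order.
function extractBetween(string $page, string $start, string $end): ?string
{
    // preg_quote escapes regex metacharacters in the markers, including the "#" delimiter.
    $pattern = '#' . preg_quote($start, '#') . '(.*?)' . preg_quote($end, '#') . '#si';
    return preg_match($pattern, $page, $m) === 1 ? $m[1] : null;
}

// Made-up page fragment used instead of a real download.
$page = '<div id="header"></div><div id="programm"><p>08:00 News</p></div><div id="programm_up"></div>';

$block = extractBetween($page, '<div id="programm">', '<div id="programm_up">');
echo $block === null ? "Block not found\n" : $block . "\n";
```

Returning null for a missing block lets the caller decide how to react instead of reading an undefined array index.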
  • Parsing with Simple Html Dom: how to do it correctly?

    @OVK2015
    <?php	
    	function getRemoteData($url, $argsArray, $ifPostRequest)
    	{		
    		$userAgent = "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2414.0 Safari/537.36";
    		$cURLsession = curl_init();
    	
    		curl_setopt($cURLsession, CURLOPT_URL, $url);		
    		curl_setopt($cURLsession, CURLOPT_SSL_VERIFYPEER, false);
    		curl_setopt($cURLsession, CURLOPT_RETURNTRANSFER, true);			
    		curl_setopt($cURLsession, CURLOPT_USERAGENT, $userAgent);							
    		curl_setopt($cURLsession, CURLOPT_FOLLOWLOCATION, true);
    		curl_setopt($cURLsession, CURLOPT_CONNECTTIMEOUT, 30);
    		// curl_setopt($cURLsession, CURLOPT_REFERER, $url);
    		if($ifPostRequest)
    		{
    			curl_setopt($cURLsession, CURLOPT_POST, true);		
    			curl_setopt($cURLsession, CURLOPT_POSTFIELDS, $argsArray);
    			curl_setopt($cURLsession, CURLOPT_HTTPHEADER, 
    			array
    			(			
    				"X-Requested-With: XMLHttpRequest"		   
    			));			
    		}
    		if(($curlResult = curl_exec($cURLsession)) === false)		
    		{		
    			die("Error fetching data: ".curl_error($cURLsession)." from ".$url);
    		}
    		
    		curl_close($cURLsession);
    	
    		return $curlResult;
    	}		
    	
    	$url = "http://toto.fonsportsbet.com/list/ru/322/";
    	$content = getRemoteData($url, "", false);
    
    	// file_put_contents(__DIR__."\\footbal.html", $content);
    	// echo "Saved\n";
     
    	// $content = file_get_contents(__DIR__."\\footbal.html");
    
    	$regExpLigaWrapper = 
    		"#(?<=<td colspan=4 class=S2L>)(.*?)(<td class=bl>)".
    		"(.*?)((?:<td colspan=4 class=S2L>)|(?:</table>))#si";
    	$regExpPlayWrapper = 
    		"#<td>(\d{1,})<td>(.*?)<td class=S1L>(.*?)<td>".
    		"(.*?)<td(?:.*?)bl>(.*?)<td>(.*?)<(?:.*?)>(.*?)(?:<|$)#si";
    	preg_match_all($regExpLigaWrapper, $content, $ligaMatches, PREG_SET_ORDER);	
    	
    	foreach($ligaMatches as $ligaMatch) 
    	{
    		echo "Liga: ".$ligaMatch[1]."\n****************************\n";
    		preg_match_all($regExpPlayWrapper, $ligaMatch[3], $playMatches, PREG_SET_ORDER);		
    		foreach($playMatches as $playMatch) 
    		{
    			echo 
    			"id: ".$playMatch[1]."\n".
    			"Time: ".$playMatch[2]."\n".
    			"Name: ".$ligaMatch[1]."\t".$playMatch[3]."\n".
    			"Count: ".$playMatch[4]."\n".
    			"Class1: ".$playMatch[5]."\n".
    			"Class2: ".$playMatch[6]."\n".
    			"Class3: ".$playMatch[7]."\n".
    			"\n";			
    		}
    	}
    ?>
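
Both loops above rely on the `PREG_SET_ORDER` flag, which groups results by match rather than by capture group. The short sketch below contrasts the two layouts. The `$html` fragment is invented for illustration and is not the markup of the real page.

```php
<?php
// Invented fragment with two "rows" of unclosed <td> cells.
$html = '<td>1<td>Team A<td>2<td>Team B';
$pattern = '#<td>(\d+)<td>([^<]+)#';

// Default order (PREG_PATTERN_ORDER): one array per capture group.
preg_match_all($pattern, $html, $byGroup);

// PREG_SET_ORDER: one array per match, convenient for foreach.
preg_match_all($pattern, $html, $byMatch, PREG_SET_ORDER);

foreach ($byMatch as $match) {
    // $match[0] is the full match, $match[1] and $match[2] are the groups.
    echo "id: " . $match[1] . ", name: " . $match[2] . "\n";
}
```

With the default order the same data would be read as `$byGroup[1][$i]` and `$byGroup[2][$i]`, which is less convenient when each match is processed as a unit.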