Filter only duplicate URLs from a PHP array
If you want to modify the input array rather than generate a new filtered array, you can use strpos() to identify urls, a lookup array to identify duplicate urls, and unset() to modify the array.
- strpos($v,'http')===0 not only requires http to be in the string, it requires it to be the first four characters of the string. To be clear, this accommodates https as well.
- strstr() and substr() will always be less efficient than strpos() when simply checking the existence or position of a substring. (The second note @ PHP Manual's strstr() boasts of the benefits of using strpos() when merely checking the existence of a substring.)
- Using iterated in_array() calls to check the $lookup array is less efficient than storing the duplicate urls as keys in the lookup array. isset() will outperform in_array() every time. (Reference Link)
- The OP's sample input does not indicate that there are any monkey-wrenching values that start with http yet are not urls. For this reason, strpos() is a suitable and lightweight function call. If trouble-making values are possible, then sevavietl's url validation is a more reliable function call. (PHP Manual Link)
- From my online performance tests, my answer is the fastest method posted which provides the desired output array.
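To make the trade-off between the prefix check and url validation concrete, here is a minimal sketch that runs both checks on a few values. The sample strings are hypothetical (they are not taken from the OP's data); the validation call is filter_var() with FILTER_VALIDATE_URL, as used in sevavietl's answer.

```php
<?php
// Hypothetical sample values, chosen to show where the two checks disagree.
$samples = [
    'http://example.com/page',   // ordinary url
    'Will be launched shortly',  // plain text
    'httpfoo',                   // starts with "http" but is not a url
    'ftp://example.com/file',    // a url that does not start with "http"
];

$report = [];
foreach ($samples as $s) {
    $report[$s] = [
        // true when the string begins with "http" (also covers "https")
        'prefix'   => strpos($s, 'http') === 0,
        // true when PHP's url validation filter accepts the string
        'validate' => filter_var($s, FILTER_VALIDATE_URL) !== false,
    ];
}
var_export($report);
```

The two checks agree on the first two values and disagree on the last two, which is the situation the fourth point above describes.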
Code: (Demo)
$array=[
'EM Debt'=>'http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0616502026&culture=en-GB',
'EM Local Debt'=>'Will be launched shortly',
'EM Blended Debt'=>'Will be launched shortly',
'Frontier Markets'=>'http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0501220262',
'Absolute Return Debt and FX'=>'Will be launched shortly',
'Em Debt'=>'http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0501220262'
];
$lookup=[];                            // urls already seen, stored as keys
foreach($array as $k=>$v){
    if(isset($lookup[$v])){            // $v is a duplicate
        unset($array[$k]);             // remove it from $array
    }elseif(strpos($v,'http')===0){    // $v is a url (because starts with http or https)
        $lookup[$v]='';                // store $v in $lookup as a key to an empty string
    }
}
var_export($array);
Output:
array (
'EM Debt' => 'http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0616502026&culture=en-GB',
'EM Local Debt' => 'Will be launched shortly',
'EM Blended Debt' => 'Will be launched shortly',
'Frontier Markets' => 'http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0501220262',
'Absolute Return Debt and FX' => 'Will be launched shortly',
)
Just for fun, a functional/unorthodox/convoluted method can look like this (not recommended, purely a demonstration):
var_export(
    array_intersect_key(
        $array,                                          // use $array to preserve order
        array_merge(                                     // combine filtered urls and unfiltered non-urls
            array_unique(                                // remove duplicates
                array_filter($array,function($v){        // generate array of urls
                    return strpos($v,'http')===0;
                })
            ),
            array_filter($array,function($v){            // generate array of non-urls
                return strpos($v,'http')!==0;
            })
        )
    )
);
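As a sanity check, the loop and the functional version can be run side by side on the same data. This is only a sketch: the array is a shortened stand-in for the OP's input with placeholder example.com urls, and $isUrl, $looped and $functional are names introduced here for illustration.

```php
<?php
// Shortened stand-in for the OP's array (example.com urls are placeholders).
$array = [
    'EM Debt'          => 'http://example.com/a',
    'EM Local Debt'    => 'Will be launched shortly',
    'EM Blended Debt'  => 'Will be launched shortly',
    'Frontier Markets' => 'http://example.com/b',
    'Em Debt'          => 'http://example.com/b',
];

// Shared prefix check: true when the value starts with "http" or "https".
$isUrl = function ($v) { return strpos($v, 'http') === 0; };

// Loop approach, applied to a copy so $array stays intact for the second approach.
$looped = $array;
$lookup = [];
foreach ($looped as $k => $v) {
    if (isset($lookup[$v])) {
        unset($looped[$k]);       // duplicate url
    } elseif ($isUrl($v)) {
        $lookup[$v] = '';         // first sighting of this url
    }
}

// Functional approach.
$functional = array_intersect_key(
    $array,
    array_merge(
        array_unique(array_filter($array, $isUrl)),
        array_filter($array, function ($v) use ($isUrl) { return !$isUrl($v); })
    )
);

// === on arrays compares keys, values, order and types.
var_dump($looped === $functional); // bool(true)
```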
You can get the result by traversing the array once. While doing so, use an extra array to record which urls have already been saved in the result.
$saved_urls = []; // urls already copied to $result, stored as keys
$result = [];
foreach($array as $k => $v)
{
    if('http://' == substr(trim($v), 0, 7) || 'https://' == substr(trim($v), 0, 8))
    {
        if(!isset($saved_urls[$v])) // check whether the url has already been saved
        {
            $result[$k] = $v;
            $saved_urls[$v] = 1;
        }
    }
    else
    {
        $result[$k] = $v; // non-urls are always kept
    }
}
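One detail of the loop above: the url test runs on trim($v), but the lookup key is the untrimmed $v, so two copies of a url that differ only in surrounding whitespace would both be kept. If that matters for your data, a possible variation (a sketch with hypothetical input, not part of the original answer) is to use the trimmed value as the lookup key:

```php
<?php
// Hypothetical input: 'b' differs from 'a' only by surrounding spaces.
$array = [
    'a' => 'http://example.com/x',
    'b' => '  http://example.com/x ',
    'c' => 'Will be launched shortly',
    'd' => 'https://example.com/y',
];

$saved_urls = [];
$result = [];
foreach ($array as $k => $v) {
    $trimmed = trim($v);
    if (substr($trimmed, 0, 7) === 'http://' || substr($trimmed, 0, 8) === 'https://') {
        if (isset($saved_urls[$trimmed])) {
            continue;                 // same url (ignoring padding) was already kept
        }
        $saved_urls[$trimmed] = 1;    // remember the trimmed form
    }
    $result[$k] = $v;                 // keep the original value under its original key
}
var_export(array_keys($result));
```

With this input the entries under 'a', 'c' and 'd' are kept and 'b' is dropped as a duplicate of 'a'.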
Here is your answer:
<?php
// taking just example here, replace `$array` with yours
$array = ['http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0616502026&culture=en-GB', 'abc', 'abc', 'http://globalevolution.gws.fcnws.com/fs_Overview.html?isin=LU0616502026&culture=en-GB'];
$url_array = [];
$string_array = []; // initialise so array_merge() still works when there are no non-urls
foreach($array as $ele) {
    if(strpos($ele, 'http') === 0) { // starts with http or https
        $url_array[] = $ele;
    } else {
        $string_array[] = $ele;
    }
}
$url_array = array_unique($url_array); // drop duplicate urls only
print_r(array_merge($string_array, $url_array)); // note: non-urls come first and keys are re-indexed
?>
Well, you can use array_filter:
$filtered = array_filter($urls, function ($url) {
    static $used = [];
    if (filter_var($url, FILTER_VALIDATE_URL)) {
        return isset($used[$url]) ? false : $used[$url] = true;
    }
    return true;
});
Here is a demo.
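For reference, this is how the array_filter approach might be applied to data shaped like the question's array. The values below are placeholders (example.com urls), and the parentheses around the assignment are added only for readability.

```php
<?php
// Placeholder data shaped like the question's array.
$urls = [
    'EM Debt'          => 'http://example.com/a?isin=1&culture=en-GB',
    'EM Local Debt'    => 'Will be launched shortly',
    'EM Blended Debt'  => 'Will be launched shortly',
    'Frontier Markets' => 'http://example.com/b?isin=2',
    'Em Debt'          => 'http://example.com/b?isin=2',
];

$filtered = array_filter($urls, function ($url) {
    static $used = [];                        // urls seen so far, stored as keys
    if (filter_var($url, FILTER_VALIDATE_URL)) {
        // keep the first occurrence of each url, drop later ones
        return isset($used[$url]) ? false : ($used[$url] = true);
    }
    return true;                              // non-urls are always kept
});

var_export(array_keys($filtered));
```

array_filter() preserves the original keys, so the repeated non-url text stays under its own keys and only the 'Em Debt' entry is removed.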