I'm not very good at regular expressions at all.

I've relied on framework code so far, but I can't find a pattern that matches a URL like http://www.example.com/etcetc and also catches www.example.com/etcetc and example.com/etcetc.

Any help would be great. Thanks guys!

    • the first two options can be matched, but matching your last one example.com/etcetc is going to be virtually impossible. You'd need to basically just match anything with a dot in the middle.
    • I was answering questions like this until yesterday, but today I was asked to mark them as duplicates if such a question already existed; that's why I did it.

To match all kinds of URLs, the following code should work:

<?php
    $regex = "((https?|ftp)://)?"; // SCHEME
    $regex .= "([a-z0-9+!*(),;?&=\$_.-]+(:[a-z0-9+!*(),;?&=\$_.-]+)?@)?"; // User and Pass
    $regex .= "([a-z0-9\-\.]*)\.(([a-z]{2,4})|([0-9]{1,3}\.([0-9]{1,3})\.([0-9]{1,3})))"; // Host or IP
    $regex .= "(:[0-9]{2,5})?"; // Port
    $regex .= "(/([a-z0-9+\$_%-]\.?)+)*/?"; // Path
    $regex .= "(\?[a-z+&\$_.-][a-z0-9;:@&%=+/\$_.-]*)?"; // GET Query
    $regex .= "(#[a-z_.-][a-z0-9+\$%_.-]*)?"; // Anchor
?>

Then, the correct way to check against the regex is as follows:

<?php
   if(preg_match("~^$regex$~i", 'www.example.com/etcetc', $m))
      var_dump($m);

   if(preg_match("~^$regex$~i", 'http://www.example.com/etcetc', $m))
      var_dump($m);
?>

Courtesy: Comments made by splattermania on PHP manual: http://php.net/manual/en/function.preg-match.php

RegEx Demo in regex101

    • +1. A comment inside a method is usually a code smell, but commenting a regex or a complex SQL query is THE way to go.
    • Hi, I had to add A-Z next to every a-z because of YouTube-like links, but I think it is still excellent anyway.
    • I liked the way you broke it down with comments. It's kind of like a regular expression buffet, where you can pick and choose what you want to put on your plate.
    • If you say to try that, I know for sure that it will work, because you don't make mistakes :). Thanks anuba, it works now; that's why I asked you :). +1

This works for me in all the cases I tested:

$url_pattern = '/((http|https)\:\/\/)?[a-zA-Z0-9\.\/\?\:@\-_=#]+\.([a-zA-Z0-9\&\.\/\?\:@\-_=#])*/';

Tests:

http://test.test-75.1474.stackoverflow.com/
https://www.stackoverflow.com
https://www.stackoverflow.com/
http://wwww.stackoverflow.com/
http://wwww.stackoverflow.com


http://test.test-75.1474.stackoverflow.com/
http://www.stackoverflow.com
http://www.stackoverflow.com/
stackoverflow.com/
stackoverflow.com

http://www.example.com/etcetc
www.example.com/etcetc
example.com/etcetc
user:pass@example.com/etcetc

example.com/etcetc?query=aasd
example.com/etcetc?query=aasd&dest=asds

http://stackoverflow.com/questions/6427530/regular-expression-pattern-to-match-url-with-or-without-http-www
http://stackoverflow.com/questions/6427530/regular-expression-pattern-to-match-url-with-or-without-http-www/

Every valid internet URL has at least one dot, so the above pattern simply looks for at least two strings joined by a dot, made up of characters that a URL may contain.

    • I simplified this regex a bit: /^[a-z0-9./?:@-_=#]+.([a-z0-9./?:@-_=#])*$/i. Metacharacters don't need to be escaped within square brackets; I stripped the optional part at the front, which isn't required for validating the URL (I don't need the captured values in my use case); and I used the case-insensitive modifier instead of repeating everything within the character classes.
    • Another glitch: the above regex does not work for URLs containing parameters (and therefore an &). Encoded parameters (the % sign) are not supported either.
    • /(http|https)://+[a-zA-Z0-9./?:@-_=#]+.([a-zA-Z0-9&./?:@-_=#])*/ Please use + instead of ? after (http|https)://, because ? also accepts http:/, so http:/yahoo.com passes even though it is not valid. Adding the + sign fixes it.
    • From the original pattern, I only replaced the last * with a + so that strings like word. do not match the expression. Only strings like word.com should match.
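
The difference described in the last comment can be checked with a short sketch. It compares the answer's pattern with a variant whose final * is replaced by +; the variable names are illustrative only, and both patterns are left unanchored as in the answer.

```php
<?php
// Pattern from the answer above (final quantifier is *).
$original = '/((http|https)\:\/\/)?[a-zA-Z0-9\.\/\?\:@\-_=#]+\.([a-zA-Z0-9\&\.\/\?\:@\-_=#])*/';
// Variant from the comments: the final * is replaced by +.
$stricter = '/((http|https)\:\/\/)?[a-zA-Z0-9\.\/\?\:@\-_=#]+\.([a-zA-Z0-9\&\.\/\?\:@\-_=#])+/';

$inputs = ['word.', 'word.com', 'example.com/etcetc?query=aasd&dest=asds'];

foreach ($inputs as $input) {
    // preg_match returns 1 on a match and 0 otherwise.
    printf(
        "%-42s original=%d stricter=%d\n",
        $input,
        preg_match($original, $input),
        preg_match($stricter, $input)
    );
}
```

Because neither pattern is anchored with ^ and $, both match a URL-like substring anywhere in the input; for validating a whole string, anchors would probably be needed as well.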

Try this:

/^(https?:\/\/)?(www\.)?[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?$/

It behaves as the question asks: it accepts URLs with or without http://, https://, and www.


You can put a question mark after part of a regular expression to make that part optional, so you would want to use:

http:\/\/(www\.)?

That matches either http://www. or http:// (with no www.).

What you could do is just use a replace method to remove the above, thus getting you the domain. Depends on what you need the domain for.
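
A minimal sketch of that replace approach, assuming a hypothetical helper named strip_url_prefix. It also makes the scheme optional and accepts https://, which goes slightly beyond the pattern shown above.

```php
<?php
// Hypothetical helper: drop an optional scheme and "www." from the start of a URL.
function strip_url_prefix(string $url): string
{
    // (www\.)? makes the "www." part optional, as described above.
    // (https?://)? is an assumption added here so bare hosts and https:// also work.
    return preg_replace('~^(https?://)?(www\.)?~i', '', $url);
}

echo strip_url_prefix('http://www.example.com/etcetc'), "\n";
echo strip_url_prefix('www.example.com/etcetc'), "\n";
echo strip_url_prefix('example.com/etcetc'), "\n";
```

Note that the path is left in place; isolating just the domain would need a further step.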


I know this is an old post, but here is my solution, a combination of some of the answers I've found here on Stack Overflow.

/(https?:\/\/)?((?:(\w+-)*\w+)\.)+(?:[a-z]{2})(\/?\w?-?=?_?\??&?)+[\.]?([a-z0-9\?=&_\-%#])?/g

Matches something.com, with or without http(s):// or www. It does not match other [something]:// URLs, but for my purpose that isn't necessary.

The regex matches e.g.:

http://foo.co.uk/
www.regex.com/foo.html?q=bar$some=thi-ng,regex
regex.foo.com/blog

Try this

$url_reg = '/(ftp|https?):\/\/(\w+:?\w*@)?(\S+)(:[0-9]+)?(\/([\w#!:.?+=&%@!\/-])?)?/';

You can try this:

r"(https?:\/\/)?([\w-]+\.)+([a-z]{2,5})(\/+\w+)? "

How it matches:
1. May start with http:// or https:// (optional)
2. One or more words, each ending with a dot (.)
3. Followed by 2 to 5 characters from [a-z]
4. Followed by "/[word]" (optional)
5. Followed by a space


I ran into many issues getting the answer from @anubhava to work: PHP treats $ inside double-quoted strings as the start of a variable, so the preg_match call wasn't working.

Here is what I used:

// regex
$re = '/((https?|ftp):\/\/)?([a-z0-9+!*(),;?&=.-]+(:[a-z0-9+!*(),;?&=.-]+)?@)?([a-z0-9\-\.]*)\.(([a-z]{2,4})|([0-9]{1,3}\.([0-9]{1,3})\.([0-9]{1,3})))(:[0-9]{2,5})?(\/([a-z0-9+%-]\.?)+)*\/?(\?[a-z+&$_.-][a-z0-9;:@&%=+\/.-]*)?(#[a-z_.-][a-z0-9+$%_.-]*)?/i';
// match all ($blob is the input text to search; it is not defined in this snippet)
preg_match_all($re, $blob, $matches, PREG_SET_ORDER, 0);
// print all match results
var_dump($matches);
// with PREG_SET_ORDER, element 0 of each entry is the full match
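
A small sketch of the quoting issue behind this answer. The character class is an illustrative fragment, not the full pattern: in a double-quoted string the $ has to be escaped as \$, while a single-quoted string leaves it alone.

```php
<?php
// Double quotes: \$ keeps the dollar sign literal instead of starting a variable.
$double = "[a-z0-9\$_.-]";
// Single quotes: no variable interpolation happens at all.
$single = '[a-z0-9$_.-]';

// Both strings hold the same character class.
var_dump($double === $single);

// The class accepts a literal dollar sign and underscore.
var_dump(preg_match('~^' . $single . '+$~i', 'a_b$c'));
```

Writing the whole pattern in single quotes, as this answer does, avoids the escaping altogether.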

If it does not have to be a regex, you could use the validate filters built into PHP.

filter_var('http://example.com', FILTER_VALIDATE_URL);

filter_var(mixed $variable [, int $filter = FILTER_DEFAULT [, mixed $options ]]);

Types of Filters

Validate Filters
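
FILTER_VALIDATE_URL requires a scheme, so on its own it rejects inputs like www.example.com/etcetc. One possible workaround, sketched here with a hypothetical helper named is_url_like, is to prepend http:// when no scheme is present before validating.

```php
<?php
// Hypothetical helper: accept URLs with or without a scheme.
function is_url_like(string $candidate): bool
{
    // If the input has no "scheme://" prefix, assume http:// for validation only.
    if (!preg_match('~^[a-z][a-z0-9+.-]*://~i', $candidate)) {
        $candidate = 'http://' . $candidate;
    }
    // filter_var returns the URL string on success and false on failure.
    return filter_var($candidate, FILTER_VALIDATE_URL) !== false;
}

var_dump(is_url_like('http://www.example.com/etcetc'));
var_dump(is_url_like('www.example.com/etcetc'));
var_dump(is_url_like('example.com/etcetc'));
```

This is deliberately permissive: a bare host name without a dot would likely pass too, so an extra check for a dot may be worth adding depending on the use case.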

    • Validates value as URL (according to faqs.org/rfcs/rfc2396), optionally with required components. Beware a valid URL may not specify the HTTP protocol http:// so further validation may be required to determine the URL uses an expected protocol, e.g. ssh:// or mailto:. Note that the function will only find ASCII URLs to be valid; internationalized domain names (containing non-ASCII characters) will fail. -- However, as this is built into PHP, you can expect it to be upgraded and updated later on to be made more useful.

This article is reproduced from Stack Exchange / Stack Overflow.