I can fix the URL code if someone can give me a verbal description of what it should do -- and of course show me the current version. I thought it worked by finding http colon slash-slash followed by [any number of characters as long as they're not whitespace] but obviously that's not so or we wouldn't have this problem.
EDIT: by verbal description I mean "it should match http colon slash-slash followed by any character in the following list" or "it should match http colon slash-slash followed by any character not in the following list"
The problem as I understand it is that the URL is delimited by single quotes, so if the URL has an apostrophe, it is read as the end of the URL.
That's not it, I'm almost sure.
The website is called something like Martha's Place, and the apostrophe appears in the document name so it's something like "blah.com/martha'splace.html" -- I must say I was really surprised that apostrophes are legal, and it's really not likely to come up very often.
EDIT: ita, can you post the regex that does it? I bet it's a simple fix.
Here you go, John:
/(^|\\s)(https?:\\/\\/[^\\s]*)\\b(\\/?)/
Are spaces legal or no? How do we do on those? Also, was there a problem with _?
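For anyone who wants to poke at the pattern, here is a rough sketch in Python (the board's own language isn't shown in this thread, so this is only a stand-in). It assumes the doubled backslashes in the posted regex are just string-literal escaping; the helper name find_url and the sample sentences are made up for illustration.

```python
import re

# The posted pattern with the doubled backslashes collapsed.
# Assumption: the doubling was string escaping in the board's source.
URL_RE = re.compile(r"(^|\s)(https?://[^\s]*)\b(/?)")

def find_url(text):
    """Return the URL portion (group 2) of the first match, or None."""
    match = URL_RE.search(text)
    return match.group(2) if match else None

# An apostrophe is not whitespace, so it stays inside the match.
apostrophe = find_url("see http://blah.com/martha'splace.html for details")

# A space ends the match, so anything after it is dropped.
with_space = find_url("see http://blah.com/my page.html")

# An underscore is matched like any other non-whitespace character.
underscore = find_url("see http://blah.com/my_page.html")

print(apostrophe)
print(with_space)
print(underscore)
```

If this sketch is faithful to the real code, the regex itself picks up the whole apostrophe URL, the match stops at the first space, and underscores go through untouched.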
That's not it, I'm almost sure.
If you look at the html source code of the post as displayed, that's what's going on. What am I missing?
Sorry Jon, you're right, I'm wrong. So the problem could be fixed simply by changing the regex to wrap the URL in double quotes not single. Double quotes are definitely not allowed in URLs ... right?
I must admit I'm a bit shocked at the use of single quotes for HTML attribute values. Aren't there some browsers where that would be a big problem?
Double quotes are definitely not allowed in URLs ... right?
I hope not. Better look at the code though, to make sure that won't cause a different problem.
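On the "different problem" worry: the pattern accepts any non-whitespace character, so a pasted URL could still contain a double quote even though one doesn't belong in a valid URL. A hedged sketch of one way around that, in Python rather than whatever the board runs on, is to escape the URL before putting it into the attribute. The function names here are hypothetical, and link_single_quoted only imitates the behaviour described in this thread.

```python
import html

def link_single_quoted(url):
    # Imitates the behaviour described above: single-quoted attribute,
    # no escaping, so an apostrophe in the URL closes the attribute early.
    return "<a href='" + url + "'>" + url + "</a>"

def link_escaped(url):
    # html.escape with quote=True rewrites &, <, >, " and ' as entities,
    # so neither kind of quote can end the attribute value.
    safe = html.escape(url, quote=True)
    return '<a href="' + safe + '">' + safe + "</a>"

url = "http://blah.com/martha'splace.html"
broken = link_single_quoted(url)
fixed = link_escaped(url)

print(broken)
print(fixed)
```

With escaping in place the choice of quote character matters much less, since both quote styles come out as entities.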
If the regexp is working, we can URLENCODE the URL.
Returns a string in which all non-alphanumeric characters except -_. have been replaced with a percent (%) sign followed by two hex digits and spaces encoded as plus (+) signs
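One caution, sketched in Python as a stand-in: a function that behaves like the description above would also encode the colon and slashes if handed the whole URL, which is probably not what we want in a link. Python's quote_plus behaves similarly enough to show the effect (it also leaves ~ alone), and encode_path is a hypothetical helper that encodes only the path.

```python
from urllib.parse import quote, quote_plus, urlsplit, urlunsplit

def encode_path(url):
    """Percent-encode only the path, leaving scheme and host readable."""
    parts = urlsplit(url)
    # safe="/" keeps the path separators. Caveat: a path that is already
    # percent-encoded would have its % signs encoded a second time.
    encoded = quote(parts.path, safe="/")
    return urlunsplit(parts._replace(path=encoded))

url = "http://blah.com/martha'splace.html"

# Encoding the whole string mangles the scheme and the slashes.
whole = quote_plus(url)

# Encoding just the path turns the apostrophe into %27 and nothing else.
path_only = encode_path(url)

print(whole)
print(path_only)
```

So if we go the encoding route, it seems safer to apply it to the part after the host rather than to the full matched string.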
Nothing to add, just joining the throng of the shocked and appalled.
Test 1 2 3
Edit: Sorry, I had a message I thought I posted not show up, and I figured this was a better place to test than elsewhere. Carry on.