EDIT: Since this has gotten a lot of views, let me start by giving everybody what they Googled for:

#ALL THESE REQUIRE THE WHOLE STRING TO BE A NUMBER
#For numbers embedded in sentences, see discussion below
#Commas optional as long as they're consistent

Now that that's out of the way, most of the following is meant as commentary on how complex a regex can get if you try to be clever with it, and why you should seek alternatives.

This is a very common task, but all the answers I see here so far will accept inputs that don't match your number format, such as ,111, 9,9,9, or even. That's simple enough to fix, even if the numbers are embedded in other text. IMHO anything that fails to pull 1,234.- and only those numbers-out of abc22 1,234.56 9.9.9. is a wrong answer.

First of all, if you don't need to do this all in one regex, don't. A single regex for two different number formats is hard to maintain even when they aren't embedded in other text. What you should really do is split the whole thing on whitespace, then run two or three smaller regexes on the results. If that's not an option for you, keep reading.

Basic pattern

Considering the examples you've given, here's a simple regex that allows pretty much any integer or decimal in 0000 format and blocks everything else: ^\d*\.?\d+$

The numbers you're looking for will be in capture group 1.

Obviously, this is a massive, complicated, nigh-unreadable regex. I enjoyed the challenge, but you should consider whether you really want to use this in a production environment. Instead of trying to do everything in one step, you could do it in two: a regex to catch anything that might be a number, then another one to weed out whatever isn't a number. Or you could do some basic processing, then use your language's built-in number parsing functions.

A few days ago, I worked on the problem of removing trailing zeros from the string of a number. I find this problem interesting as a continuation of that one, because it widens it to numbers containing commas. I took the regex pattern I had written for that previous problem and improved it so that it also handles numbers with commas, as an answer to this question. I got carried away by my enthusiasm and my liking for regexes, so I don't know whether the result exactly fits the need expressed by Michael Prescott. I would be interested to know what is excessive or missing in my regex, so that I can correct it to suit you better.
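To make the advice in this thread a bit more concrete (split on whitespace, run small regexes on each token, then let the language parse the number), here is a minimal Python sketch. The `PLAIN` pattern is the basic one quoted in the answer. The `GROUPED` pattern, the function name `extract_numbers` and the sample string are my own assumptions for illustration, not anything from the original posts.

```python
import re

# Basic pattern quoted in the answer: integers or decimals without commas.
PLAIN = re.compile(r"^\d*\.?\d+$")

# Assumed pattern for comma-grouped numbers (not from the original answer):
# 1 to 3 leading digits, one or more ",ddd" groups, optional fraction.
GROUPED = re.compile(r"^\d{1,3}(?:,\d{3})+(?:\.\d+)?$")


def extract_numbers(text):
    """Return the whitespace-separated tokens of `text` that are valid
    numbers in either format, parsed as floats."""
    numbers = []
    for token in text.split():
        # Step 1: small, readable regexes decide whether the token is a number.
        if PLAIN.match(token) or GROUPED.match(token):
            # Step 2: strip the commas and use the built-in parser.
            numbers.append(float(token.replace(",", "")))
    return numbers


# Hypothetical sample input mixing valid and invalid tokens.
sample = "abc22 1,234.56 9.9.9 ,111 9,9,9 42 .5"
print(extract_numbers(sample))
```

Because each token is checked as a whole, a number directly followed by punctuation (for example a sentence-final "1,234.56.") would be rejected by this sketch. Whether to strip such punctuation first depends on your input.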