I need to remove the spaces from the beginning and end of a string without using the strip, join, or split methods.
I have searched many similar questions and found answers like the one below.
What I could not understand is this: the '|' operator is used to match either A or B in A|B, but here it seems to work as an 'and' operator.
What I want to learn is whether this is normal behaviour for the | operator or whether it has some other functionality here.
To make it a little clearer, I have replaced the whitespace with 'xxx':
>>> import re
>>> pattern = re.compile(r'^\s+|\s+$')
>>> mo = re.sub(pattern,'xxx',' life is beautiful ')
>>> mo
'xxxlife is beautifulxxx'
The key point is that the given pattern can match more than once in the input string. Each individual match satisfies only one of the two alternatives: either the leading whitespace (^\s+) or the trailing whitespace (\s+$). Because re.sub replaces every match, both the leading and the trailing whitespace end up being replaced.
This is probably where the confusion comes from.
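One way to see this is to list the individual matches. The following is only a minimal sketch using re.finditer on the same input; the variable names text and matches are made up for illustration.

```python
import re

pattern = re.compile(r'^\s+|\s+$')
text = ' life is beautiful '

# Collect (start, end, matched text) for every non-overlapping match.
matches = [(m.start(), m.end(), m.group()) for m in pattern.finditer(text)]

for start, end, matched in matches:
    print(start, end, repr(matched))
# 0 1 ' '
# 18 19 ' '
```

There are two separate matches, one at each end of the string. The spaces between the words are not matched, because they are neither at the start nor at the end.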
To clarify this, let's have a look at the documentation of the re.sub method:
re.sub(pattern, repl, string, count=0, flags=0)
The optional argument count is the maximum number of pattern occurrences to be replaced; count must be a non-negative integer. If omitted or zero, all occurrences will be replaced. Empty matches for the pattern are replaced only when not adjacent to a previous match, so sub('x*', '-', 'abc') returns '-a-b-c-'.
With count set to 1 it is easier to describe what is really going on inside the sub method. Take a look at the following snippet:
>>> pattern = re.compile(r'^\s+|\s+$')
>>> mo0 = ' life is beautiful '
>>> mo1 = re.sub(pattern, 'xxx', mo0, count=1)
>>> mo2 = re.sub(pattern, 'xxx', mo1, count=1)
>>> mo0
' life is beautiful '
>>> mo1
'xxxlife is beautiful '
>>> mo2
'xxxlife is beautifulxxx'
Here the sub method replaces only a single occurrence of the pattern per call. First mo0 is processed and the result is stored in mo1; the pattern is replaced exactly once, matching the leading whitespace. Then mo1 is processed in the same way and the result is stored in mo2; again the pattern is replaced exactly once, this time matching the trailing whitespace. In the end, mo2 is identical to mo from the opening example, so mo is the string in which both the leading and the trailing whitespace have been replaced. Within each single step, however, the choice of which part of the pattern to match is still a logical OR.
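To make the per-match OR visible, here is a small sketch that wraps each alternative in a named group and reports which one took part in each replacement. The group names lead and trail and the helper label are hypothetical and not part of the original pattern; the last line shows the actual stripping by replacing with an empty string.

```python
import re

# Same two alternatives as before, each wrapped in a named group.
# The names "lead" and "trail" are invented for this illustration.
pattern = re.compile(r'(?P<lead>^\s+)|(?P<trail>\s+$)')

def label(match):
    # Only one alternative participates in any single match;
    # match.lastgroup holds the name of that group.
    return '<' + match.lastgroup + '>'

labelled = pattern.sub(label, ' life is beautiful ')
stripped = pattern.sub('', ' life is beautiful ')

print(labelled)        # <lead>life is beautiful<trail>
print(repr(stripped))  # 'life is beautiful'
```

Each replacement comes from exactly one alternative, yet the overall result covers both ends of the string.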
I might have another clue as to why this is so confusing. Let's take a closer look at the And/Or Wikipedia article:
And/or (also and or) is a grammatical conjunction used to indicate that one or more of the cases it connects may occur. For example, the sentence "He will eat cake, pie, and/or brownies" indicates that although the person may eat any of the three listed desserts, the choices are not exclusive; the person may eat one, two, or all three of the choices.
So Wikipedia and my own experience with people lead me to the conclusion that the precise meaning of and/or is not always clear in informal communication. In the formal world of science, such as mathematics, it is pretty clear what an OR has to mean. Therefore Wikipedia states further:
It is used to describe the precise "or" in logic and mathematics, while an "or" in spoken language might indicate inclusive or or exclusive or.
Some authors of legal texts define best practices by abandoning those ambiguity drivers from legal texts (e.g. here).
However, Wikipedia further states:
And/or has been used in official, legal and business documents since the mid-19th century, and evidence of broader use appears in the 20th century.
This tells me that its use appears to be growing, even though it is discouraged in precise environments.
I guess what is unclear is the context of the statement. If one states explicitly that the OR applies within a single match, there is no room left for confusion.