Friday, December 28, 2018

Python Ninja Bootcamp 40 - Regular Expressions

Learn Python like a professional! Takes you from zero to hero.

In this Python lecture, we will learn about regular expressions.

"A regular expression is a sequence of characters that forms a search pattern.
Python has a built-in module called re, which can be used to work with regular expressions."

Important methods in Regex

1. match( )
Takes a pattern and a string as parameters. It applies the pattern at the beginning of the string and returns a match object if the pattern matches there, or None if it does not.

Let's see an example.

To work with regular expressions, we first need to import re:

import re

pattern='python'
string='python ninja bootcamp'
if re.match(pattern,string):
    print('Match is found')
else:
    print('No Match is found')
>>Match is found

pattern='python'
string='Welcome to python ninja bootcamp'
if re.match(pattern,string):
    print('Match is found')
else:
    print('No Match is found')
>>No Match is found
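
The match object is more than a yes/no answer. As a small sketch (the variable names here are only for illustration), group() gives back the matched text and span() the start and end positions:

```python
import re

# re.match only tries the pattern at position 0 of the string
m = re.match('python', 'python ninja bootcamp')
matched_text = m.group()   # the text that matched
matched_span = m.span()    # (start, end) indexes of the match

# When the string does not start with the pattern, the result is None
no_match = re.match('python', 'Welcome to python ninja bootcamp')

print(matched_text)   # python
print(matched_span)   # (0, 6)
print(no_match)       # None
```

Because a failed match returns None, check the result before calling group() on it.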

2. search( )

Takes a pattern and a string as parameters. It looks for the pattern anywhere in the string and returns a match object for the first place it matches, or None if it does not match at all.

pattern='python'
string='Welcome to python ninja bootcamp'
if re.search(pattern,string):
    print('Match is found')
else:
    print('No Match is found')
>>Match is found
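
To see where search( ) found the pattern, we can ask the match object for its position. This is a minimal sketch; the variable names are only for illustration:

```python
import re

string = 'Welcome to python ninja bootcamp'

# re.search scans the string and stops at the first match
m = re.search('python', string)
found_at = m.start()     # index where the match begins
found_span = m.span()    # (start, end) indexes of the match

print(found_at)                  # 11
print(string[m.start():m.end()]) # python
```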

3. findall( )


Takes a pattern and a string as parameters. It returns a list of all non-overlapping substrings that match the pattern.

pattern='python'
string='python111python222python333'
re.findall(pattern,string)
>> ['python', 'python', 'python']
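
Since findall( ) always returns a list, it is handy for counting matches. A short sketch (the search word 'java' is just a made-up example of a pattern that is not present):

```python
import re

string = 'python111python222python333'

matches = re.findall('python', string)
count = len(matches)   # number of occurrences

# findall returns an empty list, not None, when nothing matches
missing = re.findall('java', string)

print(count)     # 3
print(missing)   # []
```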

Metacharacters

Metacharacters are the building blocks of a regular expression: characters that have a special meaning in a pattern instead of matching themselves.

Important Metacharacters

1. Dot
. matches any single character except the newline \n.

pattern='b.t'
string='bat_bet_bot_but_b\nt'
re.findall(pattern,string)
>> ['bat', 'bet', 'bot', 'but']
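
If we want to match an actual dot, we can escape it with a backslash. The sketch below compares the two forms on a made-up string:

```python
import re

string = 'b.t_bat_b\nt'

# An unescaped dot matches any character except the newline
any_char = re.findall(r'b.t', string)

# An escaped dot matches only a literal '.'
literal_dot = re.findall(r'b\.t', string)

print(any_char)      # ['b.t', 'bat']
print(literal_dot)   # ['b.t']
```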


2. Asterisk
* matches zero or more occurrences of the preceding character.


pattern='ab*'
string='a_ab_abbb_abc_b'
re.findall(pattern,string)
>>['a', 'ab', 'abbb', 'ab']



3. Plus
+ matches one or more occurrences of the preceding character.


pattern='ab+'
string='a_ab_abbb_abc_b'
re.findall(pattern,string)
>> ['ab', 'abbb', 'ab']


4. Question Mark
? matches zero or one occurrence of the preceding character.


pattern='ab?'
string='a_ab_abbb_abc_b'
re.findall(pattern,string)
>>['a', 'ab', 'ab', 'ab']


5. Curly Brackets
{n} matches exactly n occurrences of the preceding character, where n is the number written inside the brackets.

pattern='ab{3}'
string='a_ab_abbb_abc_b'
re.findall(pattern,string)
>> ['abbb']
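
Curly brackets can also take a range. As a sketch on a made-up string, {2,3} means "two to three times" and {2,} means "two or more times":

```python
import re

string = 'a_ab_abb_abbb_abbbb'

# {2,3}: between two and three 'b' characters after 'a'
two_to_three = re.findall('ab{2,3}', string)

# {2,}: two or more 'b' characters after 'a'
two_or_more = re.findall('ab{2,}', string)

print(two_to_three)  # ['abb', 'abbb', 'abbb']
print(two_or_more)   # ['abb', 'abbb', 'abbbb']
```

Note that 'abbbb' only contributes 'abbb' to the first list, because {2,3} stops after three b characters.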


6. Caret
^ matches the pattern at the beginning of the string.


pattern='^ab'
string='abcd'
bool(re.search(pattern,string))
>> True

7. Dollar
$ matches the pattern at the end of the string.


pattern='cd$'
string='abcd'
bool(re.search(pattern,string))
>> True
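
Caret and dollar are often used together to anchor a pattern at both ends. A minimal sketch with made-up test strings:

```python
import re

pattern = '^abcd$'   # anchored at both the start and the end

exact = bool(re.search(pattern, 'abcd'))
extra_prefix = bool(re.search(pattern, 'xabcd'))
extra_suffix = bool(re.search(pattern, 'abcdx'))

print(exact, extra_prefix, extra_suffix)  # True False False
```

One caveat: $ also matches just before a single newline at the very end of the string, so re.fullmatch( ) is the stricter choice when that matters.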

Character Set in Regex

A character set matches exactly one character out of several candidates.
It is defined by putting the candidate characters inside [ ].

pattern='b[aeiou]t'
string='bat'
bool(re.search(pattern,string))
>> True

pattern='b[aeiou]t'
string='bot'
bool(re.search(pattern,string))
>> True

pattern='b[aeiou]t'
string='bxt'
bool(re.search(pattern,string))
>> False
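
The same character set can be run over several words at once with findall( ). A short sketch on a made-up word list:

```python
import re

string = 'bat bet bit bot but bxt b9t'

# Only words with exactly one vowel between 'b' and 't' are returned
vowel_words = re.findall('b[aeiou]t', string)

print(vowel_words)  # ['bat', 'bet', 'bit', 'bot', 'but']
```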


Caret & Character Set
A caret at the start of [ ] negates the set: it matches any one character except those listed inside.

pattern='b[^aeiou]t'
string='bxt'
bool(re.search(pattern,string))
>> True

pattern='b[^aeiou]t'
string='bat'
bool(re.search(pattern,string))
>> False
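
A negated set still has to match one character; it just cannot be one of the listed ones. A sketch on a made-up word list:

```python
import re

string = 'bat bxt b9t b_t bot'

# [^aeiou] accepts any single character that is not a lowercase vowel
non_vowel_words = re.findall('b[^aeiou]t', string)

print(non_vowel_words)  # ['bxt', 'b9t', 'b_t']
```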


Range & Character Set
A hyphen inside a character set describes a range of letters or digits.

pattern='[A-Z][0-9]'
string='A1'
bool(re.search(pattern,string))
>> True

pattern='[A-Z][0-9]'
string='A+'
bool(re.search(pattern,string))
>> False
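
Several ranges can sit inside one character set. The sketch below, on a made-up string, compares uppercase-only with both cases:

```python
import re

string = 'A1 b2 C3 D+ e5 Z9'

# One uppercase letter followed by one digit
upper_only = re.findall('[A-Z][0-9]', string)

# Two ranges in one set: any ASCII letter followed by one digit
any_letter = re.findall('[a-zA-Z][0-9]', string)

print(upper_only)  # ['A1', 'C3', 'Z9']
print(any_letter)  # ['A1', 'b2', 'C3', 'e5', 'Z9']
```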


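Some Special Sequences

Patterns that contain a backslash are best written as raw strings (with an r prefix), so that Python does not treat the backslash as the start of an escape sequence.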

\d : matches a digit

pattern=r'\d+'
string='Welcome to #Python Ninja Bootcamp 007. Lets get started'
re.findall(pattern,string)
>>['007']


\D : matches a non-digit

pattern=r'\D+'
string='Welcome to #Python Ninja Bootcamp 007. Lets get started'
re.findall(pattern,string)
>>['Welcome to #Python Ninja Bootcamp ', '. Lets get started']



\s : matches a whitespace character

pattern=r'\s+'
string='Welcome to #Python Ninja Bootcamp 007. Lets get started'
re.findall(pattern,string)
>>[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']



\S : matches a non-whitespace character

pattern=r'\S+'
string='Welcome to #Python Ninja Bootcamp 007. Lets get started'
re.findall(pattern,string)
>>['Welcome', 'to', '#Python', 'Ninja', 'Bootcamp', '007.', 'Lets', 'get', 'started']




\w : matches a word character (a letter, a digit, or the underscore)

pattern=r'\w+'
string='Welcome to #Python Ninja Bootcamp 007. Lets get started'
re.findall(pattern,string)
>>['Welcome', 'to', 'Python', 'Ninja', 'Bootcamp', '007', 'Lets', 'get', 'started']


\W : matches a non-word character (anything that \w does not match)

pattern=r'\W+'
string='Welcome to #Python Ninja Bootcamp 007. Lets get started'
re.findall(pattern,string)
>>[' ', ' #', ' ', ' ', ' ', '. ', ' ', ' ']
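
Special sequences can be combined with each other and with ordinary characters. The sketch below reuses the same string; the two patterns are only illustrations, written as raw strings (the r prefix) so that Python leaves the backslashes alone:

```python
import re

string = 'Welcome to #Python Ninja Bootcamp 007. Lets get started'

# A word, one whitespace character, then one or more digits
word_and_number = re.findall(r'\w+\s\d+', string)

# A '#' followed by word characters (a hashtag-style token)
hashtags = re.findall(r'#\w+', string)

print(word_and_number)  # ['Bootcamp 007']
print(hashtags)         # ['#Python']
```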




Edits are welcome!

In the next blog post, we will go through the seventh Q&A.

https://sngurukuls247.blogspot.com/2018/12/python-ninja-bootcamp-41-q-7th.html

......................................................................................................................................

Follow the link below to access free Python lectures:
https://www.youtube.com/channel/UCENc9qI7_r8KMf6-_1R1xnw

Instagram:
https://www.instagram.com/python.india/

Feel free to contact me at:
Email - sn.gurukul24.7uk@gmail.com
