How to select dataframe columns whose names end with a given suffix with pandas?

Active January 26, 2022


Examples of how to select (filter) dataframe columns whose names end with a given suffix with pandas:

Create a dataframe with pandas

Let's first create a dataframe with pandas:

import pandas as pd
import numpy as np

data = np.arange(1,33)
data = data.reshape(4,8)

df = pd.DataFrame(data=data, columns=['product_l1', 'product_l2', 'product_l3', 'product_l4', 'product_id',
                                      'product_01_name', 'product_02_name', 'product_source'])

print(df)

returns

     product_l1  product_l2  product_l3  product_l4  product_id  \
0           1           2           3           4           5   
1           9          10          11          12          13   
2          17          18          19          20          21   
3          25          26          27          28          29

     product_01_name  product_02_name  product_source  
0                6                7               8  
1               14               15              16  
2               22               23              24  
3               30               31              32

Select columns that end with "_name"

To select only the columns that end with "_name", a solution is to first create a list of the column names ending with "_name":

col_list = [col for col in df.columns if col.endswith('_name')]

col_list now contains

['product_01_name', 'product_02_name']

and then select those columns with

df[col_list]

or

df.loc[:, col_list]

which returns

     product_01_name  product_02_name
0                6                7
1               14               15
2               22               23
3               30               31

Note that

df.loc[:, df.columns.str.endswith('_name')]

also returns

     product_01_name  product_02_name
0                6                7
1               14               15
2               22               23
3               30               31
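
As a minimal sketch of what the mask-based selection does, the example below rebuilds the dataframe from this page, prints the boolean mask, and also uses its negation to keep the other columns. The variable names mask, selected and others are only illustrative.

```python
import numpy as np
import pandas as pd

data = np.arange(1, 33).reshape(4, 8)
columns = ['product_l1', 'product_l2', 'product_l3', 'product_l4', 'product_id',
           'product_01_name', 'product_02_name', 'product_source']
df = pd.DataFrame(data=data, columns=columns)

# One boolean per column: True where the column name ends with '_name'
mask = df.columns.str.endswith('_name')
print(mask)

# Keep the columns where the mask is True
selected = df.loc[:, mask]
print(list(selected.columns))

# Negate the mask to keep every other column instead
others = df.loc[:, ~mask]
print(list(others.columns))
```

The mask has one entry per column, so it can be combined with other boolean conditions (with & or |) before being passed to df.loc.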

Using filter

Another solution is to use filter with a regular expression, where $ anchors the match at the end of the column name:

df.filter(regex='_name$', axis=1)

returns

     product_01_name  product_02_name
0                6                7
1               14               15
2               22               23
3               30               31
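
To see why the $ anchor matters, here is a small sketch comparing the regex and like parameters of filter. The dataframe and its column 'product_name_code' are hypothetical, made up only to show a name that contains "_name" without ending with it.

```python
import pandas as pd

# Hypothetical dataframe: 'product_name_code' contains '_name' but does not end with it
df_demo = pd.DataFrame({'product_01_name': [1, 2],
                        'product_name_code': [3, 4],
                        'product_id': [5, 6]})

# like= keeps every column whose name contains the substring
by_like = df_demo.filter(like='_name', axis=1)
print(list(by_like.columns))

# regex= with a trailing $ keeps only the names that end with '_name'
by_regex = df_demo.filter(regex='_name$', axis=1)
print(list(by_regex.columns))
```

So like is a substring test, while the anchored regular expression is the one that actually selects by suffix.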

Select columns that end with '_l1', '_l2', '_l3' or '_l4'

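As a more complex example, let's select the columns that end with '_l1', '_l2', '_l3' or '_l4'. Here col[:-1] drops the last character of the name, so the test keeps any column whose name ends with '_l' followed by a single character:

col_list = [col for col in df.columns if col[:-1].endswith('_l')]

df.loc[:, col_list]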

returns

   product_l1  product_l2  product_l3  product_l4
0           1           2           3           4
1           9          10          11          12
2          17          18          19          20
3          25          26          27          28
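
If only those four suffixes should match (and not, say, a hypothetical 'product_lx'), two stricter variants are possible. The sketch below assumes the same dataframe as above; it relies on Python's str.endswith accepting a tuple of suffixes, and on a character class in the regular expression passed to filter.

```python
import numpy as np
import pandas as pd

data = np.arange(1, 33).reshape(4, 8)
columns = ['product_l1', 'product_l2', 'product_l3', 'product_l4', 'product_id',
           'product_01_name', 'product_02_name', 'product_source']
df = pd.DataFrame(data=data, columns=columns)

# Variant 1: str.endswith accepts a tuple and matches any of its suffixes
suffixes = ('_l1', '_l2', '_l3', '_l4')
col_list = [col for col in df.columns if col.endswith(suffixes)]
print(col_list)

# Variant 2: regular expression, '_l' followed by one digit from 1 to 4 at the end
by_regex = df.filter(regex=r'_l[1-4]$', axis=1)
print(list(by_regex.columns))
```

Both variants select the same four columns here; the tuple version is easier to extend with unrelated suffixes, while the regular expression is more compact when the suffixes follow a pattern.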

Daidalos
