python - Using pool.map to apply a function to a list of strings in parallel?


I have a large list of HTTP user agent strings (taken from a pandas DataFrame) that I am trying to parse using the Python implementation of ua-parser. I can parse the list fine when using a single thread, but based on preliminary speed testing, it'd take me on the order of 10 hours to run the whole dataset.

I am trying to use pool.map() to decrease the processing time, but I can't quite seem to figure out how to get it to work. I've read about a dozen 'tutorials' that I found online and have searched here (this is likely a duplicate of some sort, as there are a lot of similar questions), but none of my dozens of attempts have worked for one reason or another. I'm assuming/hoping it's an easy fix.

Here is what I have so far:

    from ua_parser import user_agent_parser

    http_str = df['user_agents'].tolist()

    def uaparse(http_str):
        for i, item in enumerate(http_str):
            return user_agent_parser.parse(http_str[i])

    pool = mp.Pool(processes=10)
    parsed = pool.map(uaparse, range(0,len(http_str)))

Right now I'm seeing the following error message:

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-25-701fbf58d263> in <module>()
          7
          8 pool = mp.Pool(processes=10)
    ----> 9 results = pool.map(uaparse, range(0,len(http_str)))

    /home/ubuntu/anaconda/lib/python2.7/multiprocessing/pool.pyc in map(self, func, iterable, chunksize)
        249         '''
        250         assert self._state == RUN
    --> 251         return self.map_async(func, iterable, chunksize).get()
        252
        253     def imap(self, func, iterable, chunksize=1):

    /home/ubuntu/anaconda/lib/python2.7/multiprocessing/pool.pyc in get(self, timeout)
        565             return self._value
        566         else:
    --> 567             raise self._value
        568
        569     def _set(self, i, obj):

    TypeError: 'int' object is not iterable

Thanks in advance for any assistance/direction you can provide.

It seems that all you need is:

    import multiprocessing as mp
    from ua_parser import user_agent_parser

    http_str = df['user_agents'].tolist()

    pool = mp.Pool(processes=10)
    parsed = pool.map(user_agent_parser.parse, http_str)
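
As for why the original attempt fails: pool.map calls the function once per element of the iterable, so with range(0, len(http_str)) each call to uaparse receives a single int, and enumerate() on an int raises the TypeError shown in the question. Here is a minimal sketch of that, written for Python 3. It assumes a hypothetical fake_parse stand-in for user_agent_parser.parse so it runs without ua-parser installed; fake_parse, broken_worker, and parse_all are illustrative names only, not part of any library.

```python
import multiprocessing as mp


def fake_parse(ua_string):
    """Hypothetical stand-in for user_agent_parser.parse: takes ONE string."""
    product = ua_string.split(" ", 1)[0]          # e.g. "Mozilla/5.0"
    family, _, version = product.partition("/")
    return {"family": family, "version": version}


def broken_worker(http_str):
    # Same shape as uaparse in the question: it expects the whole list,
    # but pool.map hands it one element of the iterable (an int there).
    for i, item in enumerate(http_str):
        return fake_parse(http_str[i])


def parse_all(ua_strings, processes=2):
    # The worker takes a single item; pool.map does the looping
    # and returns the results in the same order as the input.
    pool = mp.Pool(processes=processes)
    try:
        return pool.map(fake_parse, ua_strings)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    sample = ["Mozilla/5.0 (X11; Linux x86_64)", "curl/7.58.0"]
    try:
        broken_worker(0)   # what each pool.map call effectively did
    except TypeError as exc:
        print("broken worker:", exc)
    print(parse_all(sample))
```

The `if __name__ == "__main__":` guard probably isn't needed in a notebook on Linux, but it keeps the sketch safe on platforms where worker processes re-import the main module.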
