python - Access part of a row very quickly in Pandas -


I'm doing 20 billion calculations, and I have found that the slow step, by two orders of magnitude, is simply accessing the relevant rows of a pandas DataFrame:

    %timeit x = query_results.ix[i]
    10000 loops, best of 3: 155 μs per loop

How can I speed this up by one or two orders of magnitude?

There are 200,000 rows and 11 columns in the DataFrame, all strings of varying length. Converting everything to floats barely improves the speed; converting to fixed-length strings (which is not feasible for the actual use case) cuts the time roughly in half.

Edit for more context: using BrenBarn's suggestion of iloc instead of ix, here is an almost complete use case. Note that only two rows are used at a time, and that each row is compared against every other row, which gives a very large number of computations (200,000^2 / 2).

    import numpy as np
    import pandas as pd

    test = pd.DataFrame(index=np.arange(200000), columns=np.arange(11))
    test.ix[:, :] = 'asdfasdf'  # .ix was removed in pandas 1.0; use .iloc or .loc there
    i = 0
    j = 1

    %timeit x = set(test.iloc[i]).intersection(test.iloc[j])
    1000 loops, best of 3: 235 μs per loop

It would be great if this could be closer to 5 μs.
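To put the 235 μs and 5 μs figures in perspective, here is a rough back-of-the-envelope calculation of the total run time for the 200,000^2 / 2 row pairs mentioned above. It assumes the per-pair cost is constant and that nothing else contributes, so treat it as an estimate only; `total_seconds` is a hypothetical helper.

```python
# Rough total cost of comparing every pair of rows.
n_rows = 200_000
n_pairs = n_rows ** 2 // 2  # the 200,000^2 / 2 figure from the question


def total_seconds(per_pair_us):
    """Total seconds if each pair costs `per_pair_us` microseconds."""
    return n_pairs * per_pair_us * 1e-6


for per_pair_us in (235, 5):
    secs = total_seconds(per_pair_us)
    print(f"{per_pair_us} us/pair: {secs / 3600:.0f} hours ({secs / 86400:.1f} days)")
```

Under these assumptions the difference is roughly 54 days versus about 28 hours, which is why a few microseconds per access matter here.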

Side note, as an example of why every μs counts: not every cell actually holds data, so I still have to remove the missing values (nan), which costs more μs. Something like test.iloc[i].dropna() is very slow for these purposes.

For comparison, the same operation on a plain NumPy object array, tx, holding the same data:

    In [22]: tx
    Out[22]:
    array([['asdfasdf', 'asdfasdf', 'asdfasdf', ..., 'asdfasdf', 'asdfasdf', 'asdfasdf'],
           ['asdfasdf', 'asdfasdf', 'asdfasdf', ..., 'asdfasdf', 'asdfasdf', 'asdfasdf'],
           ['asdfasdf', 'asdfasdf', 'asdfasdf', ..., 'asdfasdf', 'asdfasdf', 'asdfasdf'],
           ...,
           ['asdfasdf', 'asdfasdf', 'asdfasdf', ..., 'asdfasdf', 'asdfasdf', 'asdfasdf'],
           ['asdfasdf', 'asdfasdf', 'asdfasdf', ..., 'asdfasdf', 'asdfasdf', 'asdfasdf'],
           ['asdfasdf', 'asdfasdf', 'asdfasdf', ..., 'asdfasdf', 'asdfasdf', 'asdfasdf']], dtype=object)

    In [23]: %timeit x = set(tx[i]).intersection(tx[j])
    100000 loops, best of 3: 1.99 μs per loop
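The tx timing above suggests doing the row access outside of pandas. The post does not show how tx was created, so the following is only a sketch under the assumption that it is the frame's underlying object array (obtainable with `DataFrame.to_numpy()`, or `.values` on older pandas). It also removes the missing values once per row up front rather than once per pair. The small frame and the `shared` helper are made up for illustration.

```python
import numpy as np
import pandas as pd

# Small stand-in for the 200,000 x 11 frame of strings with missing cells.
df = pd.DataFrame(
    [["a", "b", np.nan, "c"],
     ["b", "c", "d", np.nan],
     ["x", np.nan, np.nan, "a"]]
)

# Pull the values out of pandas once; indexing the array afterwards avoids
# building a new Series on every .iloc[i] access.
values = df.to_numpy(dtype=object)

# Drop missing cells once per row instead of once per pair of rows.
row_sets = [frozenset(v for v in row if not pd.isna(v)) for row in values]


def shared(i, j):
    """Values common to rows i and j (hypothetical helper)."""
    return row_sets[i] & row_sets[j]


print(shared(0, 1))
```

Building `row_sets` costs one pass over the data, after which each pair comparison is a plain set intersection with no pandas overhead and no repeated dropna() call. Whether this reaches the 5 μs target on the real data would need to be measured.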

